Re: nodetool connection refused

2012-09-14 Thread Manu Zhang
Should we update the wiki? On Fri, Sep 14, 2012 at 1:18 PM, aaron morton aa...@thelastpickle.com wrote: Yes. If your IDE is starting cassandra the settings from cassandra-env.sh will not be used. Cheers - Aaron Morton Freelance Developer @aaronmorton

Re: Schema consistently not propagating to a node.

2012-09-14 Thread aaron morton
Out of interest, how out of sync were they? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 14/09/2012, at 6:53 AM, Ben Frank b...@airlust.com wrote: Hi Sergey, That was exactly it, thank you! -Ben On Thu, Sep 13, 2012 at

Re: secondery indexes TTL - strange issues

2012-09-14 Thread aaron morton
INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 CompactionTask.java (line 221) Compacted to [/var/lib/cassandra/data/Eventstore/EventsByItem/Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,]. 78,623,000 to 373,348 (~0% of original) bytes for 83 keys at 0.000280MB/s.

Re: Composite Column Query Modeling

2012-09-14 Thread aaron morton
You _could_ use one wide row and do a multiget against the same row for different column slices. Would be less efficient than a single get against the row. But you could still do big contiguous column slices. You may get some benefit from the collections in CQL 3
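A rough in-memory illustration of the idea above, not Cassandra client code: one wide row is modelled as a sorted set of column names, and each contiguous slice is a start/finish range read from that same row. The class `WideRow`, the function `multi_slice`, and the sample column names are all hypothetical.

```python
from bisect import bisect_left, bisect_right


class WideRow:
    """Toy model of one wide row: columns kept sorted by name (hypothetical helper)."""

    def __init__(self, columns):
        # columns: dict of column name -> value
        self._names = sorted(columns)
        self._columns = dict(columns)

    def slice(self, start, finish):
        """Return the contiguous run of columns with start <= name <= finish."""
        lo = bisect_left(self._names, start)
        hi = bisect_right(self._names, finish)
        return [(name, self._columns[name]) for name in self._names[lo:hi]]


def multi_slice(row, ranges):
    """Read several column slices from the same row, one request per range."""
    return [row.slice(start, finish) for start, finish in ranges]


row = WideRow({"a:1": "v1", "a:2": "v2", "b:1": "v3", "c:1": "v4", "c:2": "v5"})
slices = multi_slice(row, [("a:", "a:~"), ("c:", "c:~")])
```

Each range costs its own lookup here, which mirrors the point that several slices against one row are less efficient than a single get, while each individual slice stays contiguous.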

Re: Data Model

2012-09-14 Thread aaron morton
Consider a course_students col family which gives a list of students for a course I would use two CF's: Course CF: * Each row is one course * Columns are the properties and values of the course CourseEnrolements CF * Each row is one course * Column name is the

Re: Changing bloom filter false positive ratio

2012-09-14 Thread aaron morton
I have a hunch that the SSTable selection based on the Min and Max keys in ColumnFamilyStore.markReferenced() means that a higher false positive has less of an impact. it's just a hunch, i've not tested it. Cheers - Aaron Morton Freelance Developer @aaronmorton

Re: hadoop inserts blow out heap

2012-09-14 Thread aaron morton
Hi Brian did you see my follow up questions here http://www.mail-archive.com/user@cassandra.apache.org/msg24840.html Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 12/09/2012, at 11:52 PM, Brian Jeltema brian.jelt...@digitalenvoy.net

Re: Reading column names only

2012-09-14 Thread aaron morton
It's not possible to read just the column names. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 14/09/2012, at 9:05 PM, Robin Verlangen ro...@us2.nl wrote: Hi there, Would it be possible to read only the column names, instead of the

Re: Reading column names only

2012-09-14 Thread Robin Verlangen
Hi Aaron, Is this something that's worth becoming a feature in the future? Or should I rework my data model? If so, do you have any suggestions? Best regards, Robin Verlangen *Software engineer* W http://www.robinverlangen.nl E ro...@us2.nl

Re: Many ParNew collections

2012-09-14 Thread Rene Kochen
Thanks Aaron, At another production site the exact same problems occur (also after ~6 months). Here I have a very small cluster of three nodes with replication factor = 3. One of the three nodes begins to have many long Parnews and high CPU load. I upgraded to Cassandra 1.0.11, but the GC problem

Query advice to prevent node overload

2012-09-14 Thread André Cruz
Hello. I have a schema that represents a filesystem and one example of a Super CF is: CF FilesPerDir: (DIRNAME - (FILENAME - (attribute1: value1, attribute2: value2))) And in cases of directory moves, I have to fetch all files of that directory and subdirectories. This implies one cassandra

Re: Data Model

2012-09-14 Thread Hiller, Dean
playOrm uses EXACTLY that pattern where @OneToMany becomes student.rowkeyStudent1 student.rowkeyStudent2 and the other fields are fixed. It is a common pattern in noSQL. Dean From: aaron morton aa...@thelastpickle.com Reply-To:

Cassandra node going down

2012-09-14 Thread rohit reddy
Hi, I'm facing a problem in Cassandra cluster deployed on EC2 where the node is going down under write load. I have configured a cluster of 4 Large EC2 nodes with RF of 2. All nodes are instance storage backed. DISK is RAID0 with 800GB I'm pumping in write requests at about 4000 writes/sec. One

Re: Cassandra node going down

2012-09-14 Thread Robin Verlangen
Hi Robbit, I think it's running out of disk space, please verify that (on Linux: df -h ). Best regards, Robin Verlangen *Software engineer* W http://www.robinverlangen.nl E ro...@us2.nl
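For scripts that cannot shell out to df, a minimal sketch of the same disk-space check using only the Python standard library. The helper name `disk_report` and the path "/" are placeholders; point it at whichever filesystem holds the Cassandra data directory.

```python
import shutil


def disk_report(path):
    """Return (total_gb, used_gb, free_gb) for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    gb = 1024 ** 3
    return usage.total / gb, usage.used / gb, usage.free / gb


# "/" is a placeholder; substitute the Cassandra data directory.
total_gb, used_gb, free_gb = disk_report("/")
print(f"total={total_gb:.1f}G used={used_gb:.1f}G free={free_gb:.1f}G")
```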

Re: Cassandra node going down

2012-09-14 Thread Robin Verlangen
Robbit = Rohit of course, excuse me. Best regards, Robin Verlangen *Software engineer* W http://www.robinverlangen.nl E ro...@us2.nl

Re: Cassandra node going down

2012-09-14 Thread rohit reddy
Hi Robin, I had checked that. Our disk size is about 800GB, and the total data size is not more than 40GB. Even if all the data is stored in one node, this won't happen. I'll try to see if the disk failed. Does this have anything to do with VM memory? I ask because the logs suggest that... Heap is

Is it possible to create a schema before a Cassandra node starts up ?

2012-09-14 Thread Xu, Zaili
Guys I am pretty new to Cassandra. I have a script that needs to set up a schema first before starting up the cassandra node. Is this possible ? Can I create the schema directly on cassandra storage and then when the node starts up it will pick up the schema ? Zaili From: rohit reddy

Re: Cassandra node going down

2012-09-14 Thread Robin Verlangen
Cassandra writes to memtables, that will get flushed to disk when it's time. That might be because of running out of memory (the log message you just posted), on a shutdown, or at other times. That's why you're using memory while writing. You seem to be running on AWS, are you sure your data

cassandra does not close the stdout console on startup

2012-09-14 Thread Xu, Zaili
Hi, Another newbie question. My script needs to start up a cassandra node. However, cassandra doesn't close the stdout console and therefore never returns: cassandra -p c.pid Is there any way to have cassandra close stdout? Zaili

Re: Composite Column Query Modeling

2012-09-14 Thread Adam Holmberg
I think what you're describing might give me what I'm after, but I don't see how I can pass different column slices in a multiget call. I may be missing something, but it looks like you pass multiple keys but only a singular SlicePredicate. Please let me know if that's not what you meant. I'm

Re: Cassandra, AWS and EBS Optimized Instances/Provisioned IOPs

2012-09-14 Thread Chris Dodge
Michael Theroux mtheroux2 at yahoo.com writes: Hello, A number of weeks ago, Amazon announced the availability of EBS Optimized instances and Provisioned IOPs for Amazon EC2.  Historically, I've read EBS is not recommended for Cassandra due to the network contention that can quickly result

Re: Composite Column Query Modeling

2012-09-14 Thread Hiller, Dean
There is another trick here. On the playOrm open source project, we need to do a sparse query for a join and so we send out 100 async requests and cache up the java Future objects and return the first needed result back without waiting for the others. With the S-SQL in playOrm, we have the IN
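A minimal sketch of the fan-out pattern described above, written with Python's concurrent.futures rather than Java futures and without any playOrm or Cassandra calls. The function `fetch` is a hypothetical stand-in for a remote lookup, and the pool size of 8 is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(key):
    # Stand-in for a remote lookup; a real client call would go here.
    return f"row-{key}"


def start_requests(keys, executor):
    """Submit every request up front and keep the Future objects by key."""
    return {key: executor.submit(fetch, key) for key in keys}


executor = ThreadPoolExecutor(max_workers=8)
futures = start_requests(range(100), executor)

# Block only on the first result the caller needs; the other 99 requests
# keep running in the background and are read later, if at all.
first_needed = futures[0].result()

executor.shutdown(wait=True)
```

Keeping the futures in a dict keyed by request lets later code pick up any other result on demand without resubmitting it.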

Astyanax - build

2012-09-14 Thread A J
Hi, I am new to Java and trying to get the Astyanax client running for Cassandra. Downloaded astyanax from https://github.com/Netflix/astyanax. How do I compile the source code from there in a very simple fashion from the Linux command line? Thanks.

Re: Changing bloom filter false positive ratio

2012-09-14 Thread Peter Schuller
I have a hunch that the SSTable selection based on the Min and Max keys in ColumnFamilyStore.markReferenced() means that a higher false positive has less of an impact. it's just a hunch, i've not tested it. For leveled compaction, yes. For non-leveled, I can't see how it would since each

Re: Astyanax - build

2012-09-14 Thread Philip O'Toole
On Fri, Sep 14, 2012 at 12:28:08PM -0400, A J wrote: Hi, I am new to Java and trying to get the Astyanax client running for Cassandra. Downloaded astyanax from https://github.com/Netflix/astyanax. How do I compile the source code from there in a very simple fashion from the Linux command line

Re: Astyanax - build

2012-09-14 Thread Hiller, Dean
I didn't need to compile it. It is up in the maven repositories as we http://mvnrepository.com/artifact/com.netflix.astyanax/astyanax Or are you trying to see how it works? (We use the same client on playORM open source project... it works like a charm). Dean On 9/14/12 10:28 AM, A J

Re: Astyanax - build

2012-09-14 Thread Philip O'Toole
On Fri, Sep 14, 2012 at 10:49 AM, Hiller, Dean dean.hil...@nrel.gov wrote: I didn't need to compile it. It is up in the maven repositories as we http://mvnrepository.com/artifact/com.netflix.astyanax/astyanax Actually, yeah, that's what I ended up doing with my ghetto set up too, but I did

nodetool cfstats and compression

2012-09-14 Thread Jim Ancona
Do the row size stats reported by 'nodetool cfstats' include the effect of compression? Thanks, Jim

minor compaction and delete expired column-tombstones

2012-09-14 Thread Rene Kochen
Hi all, Does minor compaction delete expired column-tombstones when the row is also present in another table which is not subject to the minor compaction? Example: Say there are 5 SStables: - Customers_0 (10 MB) - Customers_1 (10 MB) - Customers_2 (10 MB) - Customers_3 (10 MB) - Customers_4

cassandra/hadoop BulkOutputFormat failures

2012-09-14 Thread Brian Jeltema
I'm trying to do a bulk load from a Cassandra/Hadoop job using the BulkOutputFormat class. It appears that the reducers are generating the SSTables, but are failing to load them into the cluster: 12/09/14 14:08:13 INFO mapred.JobClient: Task Id : attempt_201208201337_0184_r_04_0, Status :

Disk configuration in new cluster node

2012-09-14 Thread Casey Deccio
I'm building a new cluster (to replace the broken setup I've written about in previous posts) that will consist of only two nodes. I understand that I'll be sacrificing high availability of writes if one of the nodes goes down, and I'm okay with that. I'm more interested in maintaining high

Re: Cassandra node going down

2012-09-14 Thread rohit reddy
Thanks for the inputs. The disk on the EC2 node failed. This led to the problem. Now I have created a new cassandra node and added it to the cluster. Do I need to do anything to delete the old node from the cluster, or will the cluster balance itself? Asking this since in Datastax ops center its

Re: Cassandra node going down

2012-09-14 Thread Tyler Hobbs
You will need to run nodetool removetoken with the old node's token to permanently remove it from the cluster. On Fri, Sep 14, 2012 at 3:06 PM, rohit reddy rohit.kommare...@gmail.com wrote: Thanks for the inputs. The disk on the EC2 node failed. This led to the problem. Now i have created a

Differences in row iteration behavior

2012-09-14 Thread Todd Fast
Hi-- We are iterating rows in a column family two different ways and are seeing radically different row counts. We are using 1.0.8 and RandomPartitioner on a 3-node cluster. In the first case, we have a trivial Hadoop job that counts 29M rows using the standard MR pattern for counting

Re: Differences in row iteration behavior

2012-09-14 Thread Jeremy Hanna
Are there any deletions in your data? The Hadoop support doesn't filter out tombstones, though you may not be filtering them out in your code either. I've used the hadoop support for doing a lot of data validation in the past and as long as you're sure that the code is sound, I'm pretty

Re: cassandra/hadoop BulkOutputFormat failures

2012-09-14 Thread Jeremy Hanna
A couple of guesses: - are you mixing versions of Cassandra? Streaming differences between versions might throw this error. That is, are you bulk loading with one version of Cassandra into a cluster that's a different version? - (shot in the dark) is your cluster overwhelmed for some reason?