Re: how large can a cluster over the WAN be?

2011-03-08 Thread Robert Coli
On Mon, Mar 7, 2011 at 11:32 AM, John Lewis lewili...@gmail.com wrote: When you say decent latency and throughput what numbers do you consider decent? I know throughput would be highly dependent on the quantity of kb shoved through the pipe so I would expect throughput needs would be highly

setting consistency level

2011-03-08 Thread Sagar Kohli
Hi, Can we define the consistency level in the yaml file (or at the time of designing the Cassandra data model)? My question may sound stupid since I'm still in the process of understanding Cassandra :)... Thanks and regards sagar

Re: Splitting the data of a single blog into 2 CFs (to implement effective caching) according to views.

2011-03-08 Thread Norman Maurer
Yeah, this makes sense as far as I can tell. Bye, Norman 2011/3/8 Aditya Narayan ady...@gmail.com My application displays a list of several blogs' overview data (like blogTitle/ nameOfBlogger/ shortDescrption for each blog) on the 1st page (much like Digg's newsfeed) and

Re: Error when bringing up nodes during failure testing

2011-03-08 Thread aaron morton
It looks like the node is sending out its application state and waiting the required time, after which it expects to know about all other nodes in the cluster. INFO [main] 2011-03-07 17:04:06,660 StorageService.java (line 399) Joining: sleeping 3 ms for pending range setup For some reason

auto_bootstrap setting after bootstrapping

2011-03-08 Thread Maki Watanabe
Hello, According to the Wiki/StorageConfiguration page, auto_bootstrap is described as below: auto_bootstrap Set to 'true' to make new [non-seed] nodes automatically migrate the right data to themselves. (If no InitialToken is specified, they will pick one such that they will get half the

Re: TException: Error: TSocket read 0 bytes

2011-03-08 Thread aaron morton
Just checking the version of Thrift: you said 0.7.2, but the latest stable is 0.6. Unfortunately, for cassandra 0.6 you need to match a specific SVN release for thrift, see http://wiki.apache.org/cassandra/InstallThrift For cassandra 0.6.12 it's r917130 Is there a reason you are using cassandra

Re: setting consistency level

2011-03-08 Thread aaron morton
Consistency is set by the client for each read or write request. You define the Replication Factor when creating the Keyspace, either in cassandra.yaml or as part of the create keyspace statement using the cassandra-cli. For background... Check the docs if any for the high level client you

Re: Splitting the data of a single blog into 2 CFs (to implement effective caching) according to views.

2011-03-08 Thread aaron morton
You could duplicate the data from CF1 in CF2 as well (use a batch_mutation through whatever client you have). So when serving the second page you only need to read one row from CF2. Aaron On 8/03/2011, at 8:13 PM, Norman Maurer wrote: Yeah this makes sense as far as I can tell. Bye,

Re: auto_bootstrap setting after bootstrapping

2011-03-08 Thread aaron morton
AFAIK yes. The node marks itself as bootstrapped whenever it starts, and will not re-bootstrap once that is set. More info here http://wiki.apache.org/cassandra/Operations#Bootstrap Hope that helps. Aaron On 8/03/2011, at 9:35 PM, Maki Watanabe wrote: Hello, According to the

Re: changing ip's ...

2011-03-08 Thread Sasha Dolgy
One of the issues with ec2 is that after a reboot the internal ip changes. This caused a big problem for me yesterday. On Mar 8, 2011 2:29 AM, aaron morton aa...@thelastpickle.com wrote: Not sure this fits your problem, but if you pass -Dcassandra.load_ring_state=false as a JVM option it will stop

Re: Splitting the data of a single blog into 2 CFs (to implement effective caching) according to views.

2011-03-08 Thread Aditya Narayan
Yes Aaron, I thought about that, but that doesn't seem to be just a small amount of data either (it contains text). But yes, we can consider doing so later as we find the need for it. Thank you both! On Tue, Mar 8, 2011 at 2:25 PM, aaron morton aa...@thelastpickle.com wrote: You could duplicate the

Re: when do snapshots go away?

2011-03-08 Thread Sylvain Lebresne
On Tue, Mar 8, 2011 at 1:53 AM, Jeffrey Wang jw...@palantir.com wrote: Hi all, When I drop a column family, it creates a snapshot. When does the snapshot go away and free up the disk space? I was able to run nodetool clearsnapshot to get rid of them, but will they go away themselves?

Re: What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

2011-03-08 Thread Jean-Christophe Sirot
On 03/07/2011 10:08 PM, Aaron Morton wrote: You can fill your boots. So long as your boots have a capacity of 2 billion. Background ... http://wiki.apache.org/cassandra/LargeDataSetConsiderations http://wiki.apache.org/cassandra/CassandraLimitations

Re: recommended way to grow a cluster?

2011-03-08 Thread aaron morton
I do not know of any articles I could send your way, and others may have some tales from running production systems. But here are a few thoughts, others please correct me if I am wrong: - the replication factor is not intended to be changed on a running system. It can be, but it will be a

Re: auto_bootstrap setting after bootstrapping

2011-03-08 Thread Maki Watanabe
Thx! 2011/3/8 aaron morton aa...@thelastpickle.com: AFAIK yes. The node marks itself as bootstrapped whenever it starts, and will not re-bootstrap once that is set. More info here http://wiki.apache.org/cassandra/Operations#Bootstrap Hope that helps. Aaron On 8/03/2011, at 9:35 PM, Maki

Re: Nodes frozen in GC

2011-03-08 Thread ruslan usifov
2011/3/8 Chris Goffinet c...@chrisgoffinet.com How large are your SSTables on disk? My thought was because you have so many on disk, we have to store the bloom filter + every 128 keys from index in memory. 0.5GB But as I understand store in memory happens only when read happens, i do only

London meetup on Hadoop Integration

2011-03-08 Thread Dave Gardner
Hi all, This month's London user group will be on the topic of Hadoop integration. If anyone is interested in sharing knowledge about how they use Hadoop with Cassandra then please get in touch, there are some speaker slots available. If you'd like to learn more then please come along!

Re: Nodes frozen in GC

2011-03-08 Thread David Boxenhorn
If RF=2 and CL=QUORUM, you're getting no benefit from replication. When a node is in GC it stops everything. Set RF=3, so when one node is busy the cluster will still work. On Tue, Mar 8, 2011 at 11:46 AM, ruslan usifov ruslan.usi...@gmail.com wrote: 2011/3/8 Chris Goffinet
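The arithmetic behind the RF=2 remark can be checked with a small sketch. It assumes the usual quorum definition of floor(RF / 2) + 1 replicas; the function names are illustrative only and do not come from any Cassandra client.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond at QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that may be unavailable while QUORUM can still be met."""
    return replication_factor - quorum(replication_factor)


for rf in (1, 2, 3, 5):
    # Print RF, the quorum size, and how many busy/down replicas it survives.
    print(rf, quorum(rf), tolerated_failures(rf))
```

With RF=2 the quorum is both replicas, so a single node paused in GC is enough to stall QUORUM requests; with RF=3 one replica can be busy.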

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
(1) I cannot stress this one enough: Run with -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps and collect the output. Actually, I wonder if it's worth someone getting this enabled by default, with the obvious problems associated with getting the log output placed appropriately and

Re: setting consistency level

2011-03-08 Thread Mayank Mishra
Sagar, Consistency level defines how your reads and writes should work. You can define it according to your needs; it expresses your expectations when you are reading/writing data. Hence, it is not static Keyspace/CF metadata. With regards, Mayank On 08-03-2011 13:15, Sagar Kohli

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
Also: * What is the frequency of the pauses? Are we talking every few seconds, minutes, hours, days? * If you, say, decrease the load down to 25%, are you seeing the same effect but at 1/4th the frequency, or does it remain unchanged, or does the problem go away completely? -- / Peter Schuller

Re: changing ip's ...

2011-03-08 Thread David McNelis
I've run into this issue as well when running a test instance on my laptop. In the office (where I set it up) I have no issues, go outside the office on a different network, different story. I'll try your suggestion, Aaron. On Tue, Mar 8, 2011 at 12:43 AM, Sasha Dolgy sdo...@gmail.com wrote:

Re: Nodes frozen in GC

2011-03-08 Thread ruslan usifov
2011/3/8 Peter Schuller peter.schul...@infidyne.com (1) I cannot stress this one enough: Run with -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps and collect the output. (2) Attach to your process with jconsole or some similar tool. (3) Observe the behavior of the heap over time.

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
Add:
JVM_OPTS="$JVM_OPTS -XX:+PrintGC"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCTimeStamps"
And you will see significantly more detail in the GC log. -- /

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
$client->batch_mutate($mutations, cassandra_ConsistencyLevel::QUORUM); Btw, what are the mutations? Are you doing something like inserting both very small values and very large ones? In any case: My main reason to butt back into this thread is that under normal circumstances you

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
Also, why is there so much garbage collection to begin with?  Memcache uses a slab allocator to reuse blocks to prevent allocation/deallocation of blocks from consuming all the cpu time.  Are there any plans to reuse blocks so the garbage collector doesn't have to work so hard? And to address

problem with bootstrap

2011-03-08 Thread Patrik Modesto
Hi, I have a small test cluster, 2 servers, both successfully running cassandra 0.7.3. I've three keyspaces, two with RF1, one with RF3. Now when I try to bootstrap a 3rd server (empty initial_token, auto_bootstrap: true), I get this exception on the new server. INFO 23:13:43,229 Joining: getting

nodetool repair hung in 0.7.3

2011-03-08 Thread Karl Hiramoto
I never saw this before upgrading to 0.7.3 but now I do nodetool repair and it sits there for hours. Previously it took about 20 minutes per node (about 10GB of data per node). I had some OOM crashes, but haven't seen them since I increased the heap size and decreased the key cache. In

Re: nodetool repair hung in 0.7.3

2011-03-08 Thread Sylvain Lebresne
I just saw repair hang here too, it's actually very easy to reproduce. I'm looking at it right now. -- Sylvain On Tue, Mar 8, 2011 at 4:30 PM, Karl Hiramoto k...@hiramoto.org wrote: I never saw this before upgrading to 0.7.3 but now I do nodetool repair and it sits there for hours.

Re: nodetool repair hung in 0.7.3

2011-03-08 Thread Karl Hiramoto
On 08/03/2011 16:34, Sylvain Lebresne wrote: I just saw repair hang here too, it's actually very easy to reproduce. I'm looking at it right now. -- Thanks. Should i bump GCGraceSeconds since i can no longer repair? I tried repair on 3 nodes of a 6 node cluster and they all hang.

Re: recommended way to grow a cluster?

2011-03-08 Thread Peter Schuller
- When adding nodes to a cluster it's more efficient if you can change the range of existing nodes to be a subset of what they were responsible for previously. So the node only has to stream out data, rather than stream out and stream in data. Say you have this contrived example (where values
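One way to picture the subset idea is to double the node count while keeping every existing token. The sketch below is an illustration under assumptions, not the contrived example from the message (which is cut off here): it assumes RandomPartitioner's 0 to 2**127 token space and evenly spaced tokens, and the helper name is hypothetical.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def balanced_tokens(node_count):
    """Evenly spaced initial tokens for node_count nodes."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


old_tokens = balanced_tokens(3)
new_tokens = balanced_tokens(6)

# Every old token is still present after doubling, so existing nodes keep
# their tokens and each new node takes over part of one existing range.
existing_nodes_unmoved = set(old_tokens) <= set(new_tokens)
added_tokens = sorted(set(new_tokens) - set(old_tokens))

print(existing_nodes_unmoved)
print(len(added_tokens))
```

Because no existing node moves, each one only hands off part of its old range to a new neighbour and never has to stream data in.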

Re: Error when bringing up nodes during failure testing

2011-03-08 Thread Jonathan Ellis
Is he trying to bootstrap? What does that have to do with failure recovery? Doesn't make sense to me. On Tue, Mar 8, 2011 at 2:33 AM, aaron morton aa...@thelastpickle.com wrote: It looks like the node is sending out its application state and waiting the required time after which it expects to

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Karl Hiramoto
On 08/03/2011 17:09, Jonathan Ellis wrote: No. What is the history of your cluster? It started out as 0.7.0 - RC3 And I've upgraded 0.7.0, 0.7.1, 0.7.2, 0.7.3 within a few days after each was released. I have 6 nodes about 10GB of data each RF=2. Only one CF every row/column has a

Several 'TimedOutException' in stress.py

2011-03-08 Thread A J
Trying out stress.py on AWS EC2 environment (4 Large instances. Each of 2-cores and 7.5GB RAM. All in the same region/zone.) python stress.py -o insert -d 10.253.203.224,10.220.203.48,10.220.17.84,10.124.89.81 -l 2 -e ALL -t 10 -n 500 -S 100 -k (I want to try with column size of about 1MB.

Re: when do snapshots go away?

2011-03-08 Thread Robert Coli
On Tue, Mar 8, 2011 at 1:25 AM, Sylvain Lebresne sylv...@datastax.com wrote: And it's far easier for you to know what to do with the snapshot (whether that is deleting it or archiving it somewhere) than for the application. Snapshots also have the neat property of not being the full size of

Re: how large can a cluster over the WAN be?

2011-03-08 Thread John Lewis
Thanks for the reply, I realize my question was rather nebulous as I consider this proposed deployment to be rather nebulous as well. Any bit of information and a direction on which sections of documentation are relevant helps this challenge become less nebulous over time. I will do some

Re: Error when bringing up nodes during failure testing

2011-03-08 Thread mcasandra
I turned auto_bootstrap off and it worked fine. I don't think it's a connectivity issue or network issue at all. I am very confused about what's going on here. Can you please let me know if this is a bug that I am facing? Also, what are the disadvantages of turning off auto bootstrap? Do I need to

Re: Error when bringing up nodes during failure testing

2011-03-08 Thread Peter Schuller
Also, what are the disadvantages of turning off auto bootstrap? Do I need to do anything after the fact? Inserting a new node into a ring without auto_bootstrap implies that it will join the ring, but will not contain any data for which it is supposedly responsible. A 'nodetool repair' should

Re: Error when bringing up nodes during failure testing

2011-03-08 Thread Peter Schuller
2) When I brought 2 nodes down (out of 3), I was able to start one node (with 66 % load below) even though auto_bootstrap is set to true. Shouldn't it have failed for the same reason? This is a good point/question. As far as I can tell, a node being bootstrapped would need to receive data from

Re: Alternative to repair

2011-03-08 Thread Daniel Doubleday
Thanks for the reply! Not really: - range scans do not perform read repair Ok I obviously overlooked that RangeSliceResponseResolver does not repair rows on nodes that never saw a write for a given key at all. But that's not a big problem for us since we are mainly interested in fixing

Re: Several 'TimedOutException' in stress.py

2011-03-08 Thread ruslan usifov
2011/3/8 A J s5a...@gmail.com Trying out stress.py on AWS EC2 environment (4 Large instances. Each of 2-cores and 7.5GB RAM. All in the same region/zone.) python stress.py -o insert -d 10.253.203.224,10.220.203.48,10.220.17.84,10.124.89.81 -l 2 -e ALL -t 10 -n 500 -S 100 -k (I want

Re: Nodes frozen in GC

2011-03-08 Thread Paul Pak
Hi Ruslan, Is it possible for you to tell us the details on what you have done which measurably helped your situation, so we can start a best practices doc on growing cassandra systems? So far, I see that under load, cassandra is rarely ready to take heavy load in its default configuration and

Re: Error when bringing up nodes during failure testing

2011-03-08 Thread mcasandra
I am as clear as mud with what is happening here :) But with some suggestions I can try to start my test from scratch and post results in that order.

Re: Nodes frozen in GC

2011-03-08 Thread ruslan usifov
2011/3/8 Paul Pak p...@yellowseo.com Hi Ruslan, Is it possible for you to tell us the details on what you have done which measurably helped your situation, so we can start a best practices doc on growing cassandra systems? So far, I see that under load, cassandra is rarely ready to take

RE: Cassandra Meetup in Austin, TX

2011-03-08 Thread Sanchez, Carlos
Anything in Dallas? From: Jake Luciani [mailto:jak...@gmail.com] Sent: Tuesday, March 08, 2011 12:53 PM To: user@cassandra.apache.org Subject: Re: Cassandra Meetup in Austin, TX There is also a newly formed NYC area Cassandra User Group http://www.meetup.com/NYC-Cassandra-User-Group On Tue,

Re: cassandra + zabbix

2011-03-08 Thread pob
Hello, that was the way I was thinking about, actually it's written: https://gist.github.com/744761 But any hint how to get that data from the httpserver into zabbix? Thanks 2011/3/8 ruslan usifov ruslan.usi...@gmail.com You can simply write your own java agent(this doesn't require change of

Re: nodetool repair hung in 0.7.3

2011-03-08 Thread Sylvain Lebresne
I suspect you are in the case of https://issues.apache.org/jira/browse/CASSANDRA-2290. That is, some neighbor node died or was unable to perform its part of the repair. You can always retry, making sure all nodes are and stay alive, to see if it is the former one. But seeing the other exception in

Re: Several 'TimedOutException' in stress.py

2011-03-08 Thread aaron morton
Is this a client side timeout or a server side one? What does the error stack look like? Also check the server side logs for errors. The thrift API will raise a timeout when fewer nodes than the CL requires return within rpc_timeout. Good luck Aaron On 9/03/2011, at 7:37 AM, ruslan usifov wrote:

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Sylvain Lebresne
Did you run scrub as soon as you updated to 0.7.3? And did you have problems/exceptions before running scrub? If yes, did you have problems with only 0.7.3 or also with 0.7.2? If the problems started with running scrub, since it takes a snapshot before running, can you try restarting a test

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Terje Marthinussen
I had similar errors in late 0.7.3 releases related to testing I did for the mails with subject Argh: Data Corruption (LOST DATA) (0.7.0). I do not see these corruptions or the above error anymore with 0.7.3 release as long as the dataset is created from scratch. The patch (2104) mentioned in the

Re: problem with bootstrap

2011-03-08 Thread mcasandra
I think this is not the right functionality, and it is really odd that you can't successfully bring it online without turning off bootstrap BUT you can bring it online by turning auto_bootstrap off and then running nodetool repair afterwards. Also, if that's the case then when one node goes down, say out

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Karl Hiramoto
On 03/08/11 21:45, Sylvain Lebresne wrote: Did you run scrub as soon as you updated to 0.7.3? Yes, within a few minutes of starting up 0.7.3 on the node And did you have problems/exceptions before running scrub? Not sure. If yes, did you have problems with only 0.7.3 or also with 0.7.2?

Re: Several 'TimedOutException' in stress.py

2011-03-08 Thread A J
Client side (it is just a 5th instance in the same EC2 zone, having stress.py installed on it) gives the following error: Process Inserter-4: Traceback (most recent call last): File /usr/lib64/python2.6/multiprocessing/process.py, line 232, in _bootstrap self.run() File stress.py, line

Re: Cassandra Meetup in Austin, TX

2011-03-08 Thread Sasha Dolgy
And there are three people here in Zurich if anyone else is lurking ... no organized beer + discussion yet. On Tue, Mar 8, 2011 at 7:52 PM, Jake Luciani jak...@gmail.com wrote: There is also a newly formed NYC area Cassandra User Group http://www.meetup.com/NYC-Cassandra-User-Group On Tue,

Re: Several 'TimedOutException' in stress.py

2011-03-08 Thread aaron morton
Cool, so it's a server side timeout because - in the client side stack the thrift code is raising the error - server side log has this DEBUG 22:29:10,318 ... timed out The TimedOutException is raised when the number of replicas required by your CL have not returned inside the timespan specified by
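The rule described above can be written as a minimal sketch. The function name, the latencies, and the 10000 ms timeout are hypothetical values chosen for illustration; the code is a model of the rule, not Cassandra's coordinator.

```python
def coordinator_times_out(replica_latencies_ms, required_replicas, rpc_timeout_ms):
    """Return True when fewer than required_replicas answered within the timeout."""
    answered_in_time = sum(
        1 for latency in replica_latencies_ms if latency <= rpc_timeout_ms
    )
    return answered_in_time < required_replicas


# Two replicas hold the row; one answers quickly, one is slow (made-up numbers).
latencies = [120, 11500]

# If both replicas must answer (for example CL ALL with two replicas), the
# slow one causes a timeout; if a single reply is enough, the request succeeds.
print(coordinator_times_out(latencies, required_replicas=2, rpc_timeout_ms=10000))
print(coordinator_times_out(latencies, required_replicas=1, rpc_timeout_ms=10000))
```

Under this model, a stress run at CL ALL is sensitive to the slowest replica for each key, so a single overloaded node is enough to surface timeouts on the client.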

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Jonathan Ellis
alienth on irc is reporting the same error. His path was 0.6.8 to 0.7.1 to 0.7.3. It's probably a bug in scrub. If we can get an sstable exhibiting the problem posted here or on Jira that would help troubleshoot. On Tue, Mar 8, 2011 at 10:31 AM, Karl Hiramoto k...@hiramoto.org wrote: On

Re: Cassandra Meetup in Austin, TX

2011-03-08 Thread Christopher St John
On Tue, Mar 8, 2011 at 1:56 PM, Sanchez, Carlos carlos.sanc...@msci.com wrote: Anything in Dallas? Funny you should ask, on March 22nd there's: http://dbdmh.eventbrite.com Informal get-together more than a real event, but Cassandra has come up as a topic and I suspect it would be a good

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Jonathan Ellis
Turn on debug logging and see if the output looks like what I posted to https://issues.apache.org/jira/browse/CASSANDRA-2296 It *may* be harmless depending on where those zero-length rows are coming from. I've added asserts to 0.7 branch that fire if we attempt to write a zero-length row, so if

Re: 0.7.3 nodetool scrub exceptions

2011-03-08 Thread Jonathan Ellis
Looks like it is harmless -- Scrub would write a zero-length row when tombstones expire and there is nothing left, instead of writing no row at all. Fix attached to the jira ticket. On Tue, Mar 8, 2011 at 8:58 PM, Jonathan Ellis jbel...@gmail.com wrote: It *may* be harmless depending on where

Does the memtable replace the old version of column with the new overwriting version or is it just a simple append ?

2011-03-08 Thread Aditya Narayan
Do the overwrites of newly written columns (that are present in memtable) *replace the old column* or is it just a simple append? I am trying to understand whether, if I update these columns very frequently (while they are in memtable), the read performance of these columns gets affected,

Re: Does the memtable replace the old version of column with the new overwriting version or is it just a simple append ?

2011-03-08 Thread Narendra Sharma
Multiple writes for the same key and column will result in the column being overwritten in a memtable. Basically multiple updates for the same (key, column) are reconciled based on the column's timestamp. This happens per memtable. So if a memtable is flushed to an sstable, this rule will be valid for the next
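A toy model of that reconciliation, assuming the only rule is "highest timestamp wins" (ties keep the existing value here, which is a simplification). The `Memtable` class below is hypothetical and only illustrates the idea; it is not Cassandra's implementation.

```python
class Memtable:
    """Toy in-memory table: one entry per (row key, column name)."""

    def __init__(self):
        self._columns = {}  # (row_key, column_name) -> (timestamp, value)

    def write(self, row_key, column_name, value, timestamp):
        """Overwrite in place when the new timestamp is strictly newer."""
        key = (row_key, column_name)
        current = self._columns.get(key)
        if current is None or timestamp > current[0]:
            self._columns[key] = (timestamp, value)

    def read(self, row_key, column_name):
        """Return the current value, or None if the column was never written."""
        entry = self._columns.get((row_key, column_name))
        return None if entry is None else entry[1]

    def column_count(self):
        return len(self._columns)


memtable = Memtable()
memtable.write("blog1", "title", "first draft", timestamp=1)
memtable.write("blog1", "title", "final", timestamp=3)
memtable.write("blog1", "title", "late arrival", timestamp=2)  # older, ignored

print(memtable.read("blog1", "title"))
print(memtable.column_count())
```

In this model frequent updates never grow the number of versions held in memory, so a read of a hot column still looks at a single entry per memtable.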

Re: Does the memtable replace the old version of column with the new overwriting version or is it just a simple append ?

2011-03-08 Thread Aditya Narayan
So this means that in the memtable only the most recent version of a column will reside!? For this implementation, while writing to the memtable Cassandra will see if there are other versions and will overwrite them (reconciliation while writing)!? I know that different SSTables may have different