RE: error using get_range_slice with random partitioner

2010-08-09 Thread Adam Crain
Hi Thomas, Can you share your client code for the iteration? It would probably help me catch my problem. Anyone know where in the cassandra source the integration tests are for this functionality on the random partitioner? Note that I posted a specific example where the iteration failed and I

Re: TokenRange contains endpoints without any port information?

2010-08-09 Thread Gary Dusbabek
On Sun, Aug 8, 2010 at 07:21, Carsten Krebs carsten.kr...@gmx.net wrote: I'm wondering why a TokenRange returned by describe_ring(keyspace) of the thrift API just returns endpoints consisting only of an address but omits any port information? My first thought was, this method could be used

Question on nodetool ring

2010-08-09 Thread Mark
I'm running a 2 node cluster and when I run nodetool ring I get the following output: Address Status State Load Token 160032583171087979418578389981025646900 127.0.0.1 Up Normal 42.28 MB

Re: Question on load balancing in a cluster

2010-08-09 Thread anand_s
Cool thanks, I think I will experiment with nodetool move. Can somebody confirm on the reason for decommissioning, instead of just splitting the token on the fly? Yes it does seem simpler to just decommission and bootstrap, but that does mean a lot of data has to be moved around to get a

Re: batch_mutate atomicity

2010-08-09 Thread Peter Schuller
I am using the familiar meanings from ACID: atomic means either the entire update will succeed or none of it. isolated means other threads will not see partial updates while it is being applied. A related concern is whether there is a write *ordering* guarantee for mutations within a row

COMMIT-LOG_WRITER Assertion Error

2010-08-09 Thread Arya Goudarzi
Just throwing this out there as it could be a concern. I had a cluster of 3 nodes running. Over the weekend I updated to trunk (Aug 9th @ 2pm). Today, I came to run my daily tests and my client kept giving me TSocket timeouts. Checking the error log of Cassandra servers, all 3 nodes had this

Re: Question on nodetool ring

2010-08-09 Thread S Ahmed
That's the token range, so node#1 is from 1600.. to 429.. node#2 is from 429... to 1600... Hopefully others can chime in to confirm. On Mon, Aug 9, 2010 at 12:30 PM, Mark static.void@gmail.com wrote: I'm running a 2 node cluster and when I run nodetool ring I get the following output
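For readers following the token-range question, here is a minimal sketch of how ranges on a ring can be derived from a list of node tokens. It assumes the usual convention that a node owns the range from the previous node's token (exclusive) up to its own token (inclusive), with the lowest token wrapping around to the highest. The function names and the small integer tokens are hypothetical and chosen for illustration only; they are not taken from the thread or from Cassandra's source.

```python
from bisect import bisect_left


def ring_ranges(tokens):
    """Map each node token to the (start_exclusive, end_inclusive) range it owns.

    Assumption: a node owns (previous_token, own_token]; the node with the
    lowest token wraps around and starts after the highest token.
    """
    ordered = sorted(tokens)
    # ordered[i - 1] with i == 0 is the last element, which gives the wrap-around.
    return {tok: (ordered[i - 1], tok) for i, tok in enumerate(ordered)}


def owner(tokens, key_token):
    """Return the token of the node that owns key_token under the same assumption."""
    ordered = sorted(tokens)
    # First node token >= key_token; past the end wraps to the first node.
    i = bisect_left(ordered, key_token)
    return ordered[i % len(ordered)]


if __name__ == "__main__":
    tokens = [10, 50]  # hypothetical tokens for a 2 node ring
    print(ring_ranges(tokens))
    print(owner(tokens, 60))
```

With two nodes, each range starts where the other ends, which is why the two boundaries in the last column of the ring output describe both nodes.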

Re: Question on nodetool ring

2010-08-09 Thread Mark
On 8/9/10 12:51 PM, S Ahmed wrote: that's the token range so node#1 is from 1600.. to 429.. node#2 is from 429... to 1600... hopefully others can chime in to confirm. On Mon, Aug 9, 2010 at 12:30 PM, Mark static.void@gmail.com wrote: I'm running a 2

Growing commit log directory.

2010-08-09 Thread Edward Capriolo
I have a 16 node 6.3 cluster and two nodes from my cluster are giving me major headaches. 10.71.71.56 Up 58.19 GB 10827166220211678382926910108067277; 10.71.71.61 Down 67.77 GB 123739042516704895804863493611552076888; 10.71.71.66 Up 43.51 GB

Re: TokenRange contains endpoints without any port information?

2010-08-09 Thread Carsten Krebs
On 08.08.2010, at 14:47 aaron morton wrote: What sort of client side load balancing were you thinking of? I just use round robin DNS to distribute clients around the cluster, and have them recycle their connections every so often. I was thinking about using this method to give the

Re: Question on nodetool ring

2010-08-09 Thread S Ahmed
b/c node#1 has a start and end range, so you can see the boundaries for each node by looking at the last column. On Mon, Aug 9, 2010 at 4:12 PM, Mark static.void@gmail.com wrote: On 8/9/10 12:51 PM, S Ahmed wrote: that's the token range so node#1 is from 1600.. to 429.. node#2 is

Re: Growing commit log directory.

2010-08-09 Thread S Ahmed
If your commit logs are not getting cleared, doesn't that indicate your load is more than your servers can handle? On Mon, Aug 9, 2010 at 4:50 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I have a 16 node 6.3 cluster and two nodes from my cluster are giving me major headaches.

Re: row cache during bootstrap

2010-08-09 Thread Artie Copeland
On Sun, Aug 8, 2010 at 5:24 AM, aaron morton aa...@thelastpickle.com wrote: Not sure how feasible it is or if it's planned. But it would probably require that the nodes are able to share the state of their row cache so as to know which parts to warm. Otherwise it sounds like you're assuming the

Re: backport of pre cache load

2010-08-09 Thread Artie Copeland
No, we aren't caching 100%. We cache over 20 - 30 million, which only starts to get a high hit rate over time, so getting a useful cache can take over a week of running. We would love to store the complete CF in memory but don't know of a server that can hold that much data in memory while still being

Re: TokenRange contains endpoints without any port information?

2010-08-09 Thread Aaron Morton
The FAQ lists Round-Robin as the recommended way to find a node to connect to... http://wiki.apache.org/cassandra/FAQ#node_clients_connect_to As you say, your clients need to retry anyway. I have them hold the connection for a while (on the scale of minutes), then hit the DNS again and acquire a new
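The approach described above (resolve a round-robin DNS name, hold the chosen node for a few minutes, then resolve again and move on) could be sketched roughly as follows. This is an illustration under stated assumptions, not the poster's client: `resolve_all`, `RecyclingPicker`, the `max_age_seconds` parameter and the injectable `clock` are all hypothetical names introduced here, and the sketch only picks an address; opening and retrying the actual connection is left to the client library.

```python
import itertools
import socket
import time


def resolve_all(hostname, port):
    """Return the sorted, de-duplicated addresses a DNS name resolves to."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


class RecyclingPicker:
    """Hand out one node address and rotate to the next after max_age_seconds.

    resolver is any zero-argument callable returning a list of addresses
    (hypothetical hook, so the DNS lookup can be swapped out or faked).
    """

    def __init__(self, resolver, max_age_seconds, clock=time.monotonic):
        self._resolver = resolver
        self._max_age = max_age_seconds
        self._clock = clock
        self._current = None
        self._picked_at = None
        self._counter = itertools.count()

    def node(self):
        now = self._clock()
        if self._current is None or now - self._picked_at >= self._max_age:
            # Hit the resolver again and take the next address in rotation.
            addresses = self._resolver()
            if not addresses:
                raise RuntimeError("resolver returned no addresses")
            self._current = addresses[next(self._counter) % len(addresses)]
            self._picked_at = now
        return self._current
```

A client might build the picker with something like `RecyclingPicker(lambda: resolve_all("cassandra.example.com", 9160), max_age_seconds=300)`, where the hostname is a placeholder and the port is whatever the cluster's Thrift interface listens on, then call `node()` before each (re)connect.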

Re: 2 nodes on one machine

2010-08-09 Thread Aaron Morton
http://www.onemanclapping.org/2010/03/running-multiple-cassandra-nodes-on.html Also some recent discussion on the users list. Aaron On 10 Aug, 2010, at 08:58 AM, Pavlo Baron p...@pbit.org wrote: Hello users, I'm a total Cassandra noob beside what I read about it, so please be patient :) I want to

Re: 2 nodes on one machine

2010-08-09 Thread Pavlo Baron
Cool, thank you Aaron, I'll check it out over the next few days and post the results. Pavlo Am 10.08.2010 00:11, schrieb Aaron Morton: http://www.onemanclapping.org/2010/03/running-multiple-cassandra-nodes-on.html Also some recent discussion on the users list. Aaron On 10 Aug, 2010, at 08:58

Using a separate commit log drive was 4x slower

2010-08-09 Thread Jeremy Davis
I have a weird one to share with the list. Using a separate commit log drive dropped my performance a lot more than I would expect... I'm doing perf tests on 3 identical machines but with 3 different drive sets. (SAS 15K, 10K, and SATA 7.5K) Each system has a single system disk (Same as the data

Re: error using get_range_slice with random partitioner

2010-08-09 Thread Thomas Heller
Sure, but it's in my Ruby client, which currently has close to no documentation. ;) Client is here: http://github.com/thheller/greek_architect Relevant Row Spec: http://bit.ly/9uS6Ba Row-based iteration: http://bit.ly/cRVSTc #each_slice Currently uses a hack since I wasn't able to produce

Re: Growing commit log directory.

2010-08-09 Thread Benjamin Black
What does the io load look like on those nodes? On Mon, Aug 9, 2010 at 1:50 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I have a 16 node 6.3 cluster and two nodes from my cluster are giving me major headaches. 10.71.71.56 Up 58.19 GB 10827166220211678382926910108067277

Re: COMMIT-LOG_WRITER Assertion Error

2010-08-09 Thread Jonathan Ellis
Sounds like you upgraded to trunk from 0.6 without draining your commitlog first? On Mon, Aug 9, 2010 at 3:30 PM, Arya Goudarzi agouda...@gaiaonline.com wrote: Just throwing this out there as it could be a concern. I had a cluster of 3 nodes running. Over the weekend I updated to trunk (Aug

Re: Growing commit log directory.

2010-08-09 Thread Jonathan Ellis
What does tpstats or other JMX monitoring of the o.a.c.concurrent stages show? On Mon, Aug 9, 2010 at 4:50 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I have a 16 node 6.3 cluster and two nodes from my cluster are giving me major headaches. 10.71.71.56 Up 58.19 GB

Re: COMMIT-LOG_WRITER Assertion Error

2010-08-09 Thread Arya Goudarzi
I've never run 0.6. I have been running off trunk with an automatic svn update and build every day at 2pm. One of my nodes got this error, which led to the same last error prior to build and restart today. Hope this helps better: java.lang.RuntimeException: java.util.concurrent.ExecutionException:

Re: Growing commit log directory.

2010-08-09 Thread Edward Capriolo
On Mon, Aug 9, 2010 at 8:20 PM, Jonathan Ellis jbel...@gmail.com wrote: what does tpstats or other JMX monitoring of the o.a.c.concurrent stages show? On Mon, Aug 9, 2010 at 4:50 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I have a 16 node 6.3 cluster and two nodes from my cluster are

explanation of generated files and ops

2010-08-09 Thread S Ahmed
In /var/lib/cassandra there is: /data/system LocationInfo-4-Data.db LocationInfo-4-Filter.db LocationInfo-4-Index.db .. .. /data/Keyspace1/ Standard2-2-Data.db Standard2-2-Filter.db Standard2-2-Index.db /commitlog CommitLog-timestamp.log /var/log/cassandra system.log Is this pretty much all

Re: How to migrate any relational database to Cassandra

2010-08-09 Thread Peng Guo
Maybe you could integrate with Hadoop. On Mon, Aug 9, 2010 at 1:15 PM, sonia gehlot sonia.geh...@gmail.com wrote: Hi Guys, Thanks for sharing your experiences and valuable links these are really helpful. But I want to do ETL and then wanted to load data in Cassandra. I have link 10-15