Re: A Simple scenario, Help needed

2011-04-01 Thread Prasanna Rajaperumal
Hi, I happened to figure out the problem. I had set replication_factor=1 in cassandra.yaml. Changing it to 2 made sure the entire keyspace is stored on each node (each has its own half and the other's half as well). For others looking at an explanation on Replication Factor and Consistency Level
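The effect described above can be illustrated with a toy ring model. This is only a sketch under stated assumptions: ranges are equal in size, replicas go to the owner plus the next nodes clockwise (as a simple placement strategy would do), and the node names and function names are hypothetical. It is not Cassandra's code.

```python
# Toy model of ring-based replica placement (illustrative assumption,
# not Cassandra's implementation).

def replicas_for(owner_index, nodes, replication_factor):
    """Return the nodes storing the range whose primary owner is nodes[owner_index]:
    the owner plus the next replication_factor - 1 nodes clockwise."""
    count = min(replication_factor, len(nodes))
    return [nodes[(owner_index + step) % len(nodes)] for step in range(count)]

def stored_fraction(nodes, replication_factor):
    """Fraction of the whole keyspace each node stores, assuming equal ranges."""
    held = {node: 0 for node in nodes}
    for owner_index in range(len(nodes)):
        for node in replicas_for(owner_index, nodes, replication_factor):
            held[node] += 1
    return {node: count / len(nodes) for node, count in held.items()}

nodes = ["node_a", "node_b"]      # hypothetical two-node cluster
print(stored_fraction(nodes, 1))  # each node stores only its own half
print(stored_fraction(nodes, 2))  # each node stores the entire keyspace
```

With two nodes, a replication factor of 2 means every range has both nodes as replicas, which is why each node ends up with its own half and the other's half.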

RE: Using RowMutations with super columns

2011-04-01 Thread George Ciubotaru
Hello, The exception from the previous email was caused by a mistake of mine, sorry for that. I've fixed it, so there are no more exceptions on the client (bulk loader) side, but I'm now getting an exception in Cassandra. My configuration is simple: I have a single Cassandra instance running and I launch

Re: changing replication strategy and effects on replica nodes

2011-04-01 Thread aaron morton
See the section on Replication here: http://wiki.apache.org/cassandra/Operations#Replication It talks about how to change the RF and then says you can do the same when changing the placement strategy. It can be done, but is a little messy. Depending on your setup it may also be possible to

Re: Endless minor compactions after heavy inserts

2011-04-01 Thread aaron morton
If you are doing some sort of bulk load you can disable minor compactions by setting the min_compaction_threshold and max_compaction_threshold to 0 . Then once your insert is complete run a major compaction via nodetool before turning the minor compaction back on. You can also reduce the

Re: changing replication strategy and effects on replica nodes

2011-04-01 Thread Jonathan Colby
Hi Aaron - Yes, I've read the part about changing the replication factor on a running cluster. I've even done it without a problem. The real point of my question was: do you now have unused replica data on the old replica nodes that you need to clean up manually? Any insight would be

Re: How to repair HintsColumnFamily?

2011-04-01 Thread Terje Marthinussen
Seeing similar errors on another system (0.7.4). Maybe something bogus with the hint columnfamilies. Terje On Mon, Mar 28, 2011 at 7:15 PM, Shotaro Kamio kamios...@gmail.com wrote: I see. Then, I'll remove the HintsColumnFamily. Because our cluster has a lot of data, running repair takes

Re: balance between concurrent_[reads|writes] and feeding/reading threads i clients

2011-04-01 Thread Terje Marthinussen
The reason I am asking is obviously that we saw a bunch of stability issues for a while. We had some periods with a lot of dropped messages, but also a bunch of dead/UP messages without drops (followed by hintedhandoffs) and loads of read repairs. This all seems to work a lot better after

Re: Using RowMutations with super columns

2011-04-01 Thread aaron morton
I've not used that binary memtable example before, but reading the contrib example (from 0.7.4) there is something odd. We build a CF in the reduce() function and then serialise it in the createMessage() function and hide it inside another Column. So that eventually Table.load() can use that

Re: changing replication strategy and effects on replica nodes

2011-04-01 Thread aaron morton
You may do, if a node is no longer a replica for a token range. Which would be similar to reducing the RF. nodetool cleanup is the thing to run after you have repaired to remove data a node should no longer have. Aaron On 1 Apr 2011, at 23:10, Jonathan Colby wrote: Hi Aaron - Yes, I've

nodetool cleanup - results in more disk use?

2011-04-01 Thread Jonathan Colby
I ran nodetool cleanup on a node in my cluster and discovered the disk usage went from 3.3 GB to 5.4 GB. Why is this? I thought cleanup just removed hinted handoff information. I read that *during* cleanup extra disk space will be used similar to a compaction. But I was expecting the disk

Abnormal memory consumption

2011-04-01 Thread openvictor Open
Hello everybody, I am quite new to Cassandra and I am worried about an Apache Cassandra server that is running on a small isolated server with only 2 GB of RAM. On this server there is very little data in Cassandra (~3 MB, only text in column values) but there are other servers such as: Solr,

Re: nodetool cleanup - results in more disk use?

2011-04-01 Thread Jonathan Colby
I discovered that a garbage collection cleans up the unused old SSTables. But I still wonder whether cleanup really does a full compaction. This would be undesirable if so. On Apr 1, 2011, at 4:08 PM, Jonathan Colby wrote: I ran nodetool cleanup on a node in my cluster and discovered the disk

RE: Ditching Cassandra

2011-04-01 Thread Jeremiah Jordan
Quick comment on libraries for different languages. The libraries for different languages should almost ALWAYS look different. They should look like what someone using that language expects an API to look like. If someone gave me a python API that used java's builder pattern instead of named

urgent

2011-04-01 Thread Anurag Gujral
Hi All, I have set up a Cassandra cluster with three data directories, but Cassandra is using only one of them and that disk is out of space. Why is Cassandra not using all three data directories? Please suggest. Thanks Anurag

RE: Ditching Cassandra

2011-04-01 Thread Eric Evans
On Fri, 2011-04-01 at 09:52 -0500, Jeremiah Jordan wrote: Quick comment on libraries for different languages. The libraries for different languages should almost ALWAYS look different. They should look like what someone using that language expects an API to look like. +1 The language APIs

Re: Ditching Cassandra

2011-04-01 Thread Jeremy Hanna
On Apr 1, 2011, at 10:13 AM, Eric Evans wrote: On Fri, 2011-04-01 at 09:52 -0500, Jeremiah Jordan wrote: Quick comment on libraries for different languages. The libraries for different languages should almost ALWAYS look different. They should look like what someone using that language

RE: Ditching Cassandra

2011-04-01 Thread mcasandra
Where can I read more about CQL? I am assuming it's similar to SQL and drivers like JDBC can be written on top of it. Is that right? -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Ditching-Cassandra-tp6221436p6231654.html Sent from the

Re: Endless minor compactions after heavy inserts

2011-04-01 Thread mcasandra
Is there a way to monitor the compactions using nodetool? I don't see it in tpstats. -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Endless-minor-compactions-after-heavy-inserts-tp6229633p6231672.html Sent from the

Re: Endless minor compactions after heavy inserts

2011-04-01 Thread Jake Luciani
nodetool compactionstats On Fri, Apr 1, 2011 at 12:14 PM, mcasandra mohitanch...@gmail.com wrote: Is there a way to monitor the compactions using nodetool? I don't see it in tpstats. -- View this message in context:

Re: Ditching Cassandra

2011-04-01 Thread Moaz Reyad
See: https://svn.apache.org/viewvc/cassandra/trunk/doc/cql/CQL.html?view=co On Fri, Apr 1, 2011 at 6:09 PM, mcasandra mohitanch...@gmail.com wrote: Where can I read more about CQL? I am assuming it's similar to SQL and drivers like JDBC can be written on top of it. Is that right? -- View

Question on conflict handling

2011-04-01 Thread Paolo Ragone
Hi all, I've seen that lately there's been a lot of talk about conflicts, split-brain problems, and related topics. My question is a lot simpler: do conflicts only happen at column level? Or is it at row level? I ask this because all the examples I've seen imply two clients writing to the same

Re: Node added, no performance boost -- are the tokens correct?

2011-04-01 Thread buddhasystem
On two different clusters, if I set the token to zero, on a node, its ownership drops to zero after migration. After I added the third one and moved tokens, I now have this: 33.33% 56713727820156410577229101238628035242 33.33% 113427455640312821154458202477256070484 33.33%
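The two non-zero tokens quoted above are consistent with evenly spaced tokens on a ring of size 2^127, which is the token space of RandomPartitioner. The following is a small sketch of that arithmetic; the use of RandomPartitioner is an assumption (the message does not name the partitioner), and the function name is hypothetical.

```python
# Evenly spaced initial tokens on a ring of size 2**127.
# Assumption: the cluster uses RandomPartitioner, whose token space is 0 .. 2**127.
RING_SIZE = 2 ** 127

def balanced_tokens(node_count):
    """Return one token per node, spaced by the integer step RING_SIZE // node_count."""
    step = RING_SIZE // node_count
    return [index * step for index in range(node_count)]

for token in balanced_tokens(3):
    print(token)
```

For three nodes this yields 0, 56713727820156410577229101238628035242 and 113427455640312821154458202477256070484, each owning one third of the ring.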

Re: Ditching Cassandra

2011-04-01 Thread Jeremy Hanna
Speaking of jdbc - there's already a jdbc driver that's been written :) http://svn.apache.org/repos/asf/cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/ On Apr 1, 2011, at 11:21 AM, Moaz Reyad wrote: See: https://svn.apache.org/viewvc/cassandra/trunk/doc/cql/CQL.html?view=co

Re: Question on conflict handling

2011-04-01 Thread Peter Schuller
My question is a lot simpler: do conflicts only happen at column level? Or is it at row level? Individual column level only. -- / Peter Schuller

Re: Question on conflict handling

2011-04-01 Thread Peter Schuller
My question is a lot simpler: do conflicts only happen at column level? Or is it at row level? Individual column level only. Well, with the special exception of whole-row deletes. -- / Peter Schuller
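As a rough illustration of the answer above, here is a minimal sketch of per-column reconciliation in which the newest timestamp wins for each column independently, plus a whole-row delete that hides every column written at or before the deletion time. The data layout, function name, and tie handling (ties keep the first version) are simplifying assumptions for illustration, not Cassandra's implementation.

```python
# Per-column "newest timestamp wins" merge of two versions of the same row.
# Each row maps column name -> (timestamp, value). Hypothetical layout.

def reconcile(row_a, row_b, row_deleted_at=None):
    """Merge two row versions column by column; optionally apply a row delete."""
    merged = dict(row_a)
    for name, (timestamp, value) in row_b.items():
        # Simplification: on equal timestamps the version from row_a is kept.
        if name not in merged or timestamp > merged[name][0]:
            merged[name] = (timestamp, value)
    if row_deleted_at is not None:
        # A whole-row delete hides every column written at or before it.
        merged = {name: col for name, col in merged.items() if col[0] > row_deleted_at}
    return merged

client_1 = {"email": (10, "a@example.com"), "name": (10, "Ann")}
client_2 = {"email": (12, "b@example.com"), "city": (11, "Oslo")}

print(reconcile(client_1, client_2))
print(reconcile(client_1, client_2, row_deleted_at=10))
```

Two clients writing different columns of the same row therefore do not conflict; only writes to the same column are compared.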

Re: Node added, no performance boost -- are the tokens correct?

2011-04-01 Thread Peter Schuller
Now, I moved the tokens. I still observe that read latency deteriorated with 3 machines vs original one. Replication factor is 1, Cassandra version 0.7.2 (didn't have time to upgrade as I need results by this weekend). Read *latency* is fully expected to increase if you just add a node.

Understanding cfhistogram output

2011-04-01 Thread Anurag Gujral
Hi All, I ran nodetool with cfhistograms and I don't fully understand the output. Can someone please shed some light on it? Thanks Anurag

Re: Node added, no performance boost -- are the tokens correct?

2011-04-01 Thread Edward Capriolo
On Fri, Apr 1, 2011 at 1:15 PM, Peter Schuller peter.schul...@infidyne.com wrote: Now, I moved the tokens. I still observe that read latency deteriorated with 3 machines vs original one. Replication factor is 1, Cassandra version 0.7.2 (didn't have time to upgrade as I need results by this

Re: Node added, no performance boost -- are the tokens correct?

2011-04-01 Thread Eric Gilmore
The DS docs go with "should" regarding setting the initial token to zero. It's not a must, but you get enough convenience out of never having to move tokens on that node that I'm not sure why you wouldn't do it. If anyone has a compelling reason not to do so, I'm happy to hear it :) On Fri, Apr 1,

Re: Understanding cfhistogram output

2011-04-01 Thread Narendra Sharma
There are 6 columns in the output. *- Offset* These are the buckets, the same as the values on the X-axis of a graph. The unit is determined by the other columns. *- SSTables* This represents the number of sstables accessed per read. For example, if a read operation involved accessing 3 sstables then you will

Re: ParNew (promotion failed)

2011-04-01 Thread ruslan usifov
Also, after all these messages in stdout.log I see the following: [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor3] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor2] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor1] [Unloading class

Re: Understanding cfhistogram output

2011-04-01 Thread mcasandra
I can't find it on the wiki. Do you have a link that gives detailed help? Also, is the latency in microseconds or milliseconds? How about latency in cfstats? Is it micro or milli? It says ms, which generally means milliseconds. -- View this message in context:

Re: Endless minor compactions after heavy inserts

2011-04-01 Thread Sheng Chen
Thank you very much. The major compaction will merge everything into one big file, which would be very large. Is there any way to control the number or size of files created by major compaction? Or, is there a recommended number or size of files for Cassandra to handle? Thanks. I see the

Bizarre side-effect of increasing read concurrency

2011-04-01 Thread Jason Harvey
After increasing read concurrency from 8 to 64, GC mark-and-sweep was suddenly able to reclaim much more memory than it previously did. Previously, mark-and-sweep would run around 5.5GB, and would cut heap usage to 4GB. Now, it still runs at 5.5GB, but it shrinks all the way down to 2GB used.

Re: Bizarre side-effect of increasing read concurrency

2011-04-01 Thread Jason Harvey
On further analysis, it looks like this behavior occurs when a node is simply restarted. Is that normal behavior? If mark-and-sweep becomes less and less effective over time, does that suggest an issue with GC, or an issue with memory use? On Apr 1, 8:21 pm, Jason Harvey alie...@gmail.com wrote:

Re: Bizarre side-effect of increasing read concurrency

2011-04-01 Thread Edward Capriolo
On Fri, Apr 1, 2011 at 11:27 PM, Jason Harvey alie...@gmail.com wrote: On further analysis, it looks like this behavior occurs when a node is simply restarted. Is that normal behavior? If mark-and-sweep becomes less and less effective over time, does that suggest an issue with GC, or an issue

Embedding Cassandra in Java code w/o using ports

2011-04-01 Thread Bob Futrelle
Connecting via CLI to local host with a port number has never been successful for me in Snow Leopard. No amount of reading suggestions and varying the approach has worked. So I'm going to talk to Cassandra via its API, from Java. But I noticed that in some code samples that call the API from

Re: Bizarre side-effect of increasing read concurrency

2011-04-01 Thread Jason Harvey
Ah, that would probably explain it. Thanks! On Apr 1, 8:49 pm, Edward Capriolo edlinuxg...@gmail.com wrote: On Fri, Apr 1, 2011 at 11:27 PM, Jason Harvey alie...@gmail.com wrote: On further analysis, it looks like this behavior occurs when a node is simply restarted. Is that normal behavior?