Re: Confused about consistency
- A Cassandra node (say 3) goes down (even with 24 GB of RAM, OOM errors are the bane of my existence) Following up on this bit: OOM should not be the status quo. Have you tweaked the JVM heap size to reflect your memtable sizes, etc.? http://wiki.apache.org/cassandra/MemtableThresholds -- / Peter Schuller
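For reference, the rule of thumb on that wiki page is roughly memtable_throughput_in_mb * 3 * (number of actively written CFs) plus about 1 GB of overhead; the same arithmetic shows up later in this digest in the Lucandra OOM thread. A minimal sketch of the calculation in Java, with assumed example numbers (not project code):

// Back-of-the-envelope heap estimate from the memtable settings (illustrative only).
public class HeapEstimate {
    public static void main(String[] args) {
        int memtableThroughputMb = 64; // assumed per-CF memtable threshold
        int hotColumnFamilies = 2;     // assumed number of actively written CFs
        int overheadMb = 1024;         // roughly 1 GB for internals, caches, etc.
        int estimateMb = memtableThroughputMb * 3 * hotColumnFamilies + overheadMb;
        System.out.println("Suggested minimum heap: ~" + estimateMb + " MB"); // 64*3*2 + 1024 = 1408
    }
}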
Re: Replacing nodes of the cluster in 0.7.0-RC1
As far as I remember, cassandra had been causing problems when there was an IP change back in version 0.6? I know you already proceeded, but FWIW I think the complications with IP addresses are limited to changing the address of an existing node. It sounded like you were going to add 4 nodes and then simply decommission the rest. I don't think there's an issue with that having to do with IP addresses (someone correct me if I'm wrong). -- / Peter Schuller
Re: Get CF with where condition supplied by cassandra-cli
thx for the answers! 2010/12/3 Jonathan Ellis jbel...@gmail.com: http://www.riptano.com/blog/whats-new-cassandra-07-secondary-indexes On Thu, Dec 2, 2010 at 9:34 AM, Yann Perchec, Novapost yann.perc...@novapost.fr wrote: Hello everybody, I've been playing for a couple of days with the cassandra-cli supplied with the Cassandra 0.7 RC. It seems that new requests are possible: I'm very interested in the get cf where column = value. Does it mean that requests with conditions on columns other than keys are now possible? Is it possible to have more information on that feature? Thank you very much! -- Yann PERCHEC Project Manager 32 rue de Paradis, 75010 Paris Tel: +33 (0)1 83 62 46 81 Mail: yann.perc...@novapost.fr silvere.dup...@novapost.fr Web: www.novapost.fr / www.novapost-rh.fr -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM)
In our insert-tests the average heap usage is slowly growing up to the 3 GB border (jconsole monitor over 50 min http://oi51.tinypic.com/k12gzd.jpg) and the CompactionManager queue is also constantly growing up to about 50 jobs pending. Since you're obviously bottlenecking on compaction: are you building up lots of memtable flushes that don't complete? (I don't remember the name of the stage off hand, but it should be visible in cfstats.) Also, if you simply stop writing suddenly and wait for the nodes to finish their background activities, does memory usage go down again? (You may want to force a full GC before/after in order to do a proper test that is not affected by GC scheduling.) I don't remember the switches for JRockit, but you can definitely enable GC logging there, which should tell you in more detail what's happening. IIRC, though possibly not for all GC modes, you should see periodic completions of concurrent GCs that should collect all garbage that existed at the beginning of the GC cycle. Assuming you're not under so much load that this takes a very long time, that should give you a pretty good idea of the actual live set (which is probably going to be the low dips in your graphs, but it doesn't hurt to confirm). -- / Peter Schuller
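If you want to script Peter's "stop writing, force a full GC, compare heap usage" check instead of clicking the jconsole button, something like the sketch below works against the standard JMX memory bean (run it in-process or adapt it to a remote JMX connection; MemoryMXBean.gc() is only a request to the JVM, like System.gc()):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class LiveSetCheck {
    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        MemoryUsage before = mem.getHeapMemoryUsage();
        mem.gc(); // request a full collection, same as jconsole's "Perform GC"
        MemoryUsage after = mem.getHeapMemoryUsage();
        System.out.println("Heap used before GC: " + before.getUsed() / (1024 * 1024) + " MB");
        System.out.println("Heap used after GC (approx. live set): " + after.getUsed() / (1024 * 1024) + " MB");
    }
}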
Re: Re: probability of node receiving (not be responsible for) the request
Thank you, but I mean the probability of a node receiving the request, not eventually processing it. At 2010-12-06 00:56:58, Brandon Williams dri...@gmail.com wrote: 2010/12/5 魏金仙 sei_...@126.com: If a particular client sends 5 requests to a 6-node cluster, then the probability of each node receiving (not being responsible for) the first request is 1/6. Assuming RF=1 and RandomPartitioner. Assume that node1 received the 1st request; will node1 receive the 2nd request, the 3rd one, the 4th one and the 5th one with high probability or 1/6? Thanks for your time. 1/6th with RandomPartitioner, something much higher with OrderPreservingPartitioner. -Brandon
Re: Re: index file and KeysCached
So when will index files be in memory? At 2010-12-06 00:54:48, Brandon Williams dri...@gmail.com wrote: 2010/12/5 魏金仙 sei_...@126.com: For each sstable, there is an index file, which is loaded in memory to locate a particular key's offset efficiently. Index files are not held in memory. And for each CF, KeysCached can be set to cache keys' locations. Could you please tell me the difference between the two? KeysCached are held in memory. I'm wondering whether it's necessary to set KeysCached for a CF or not. Not necessary, but advantageous. -Brandon
If one seed node crashes, how can I add one seed node?
After one seed node crashed, I want to add one node as a seed node. I set auto_bootstrap to true, but the new node doesn't migrate data from other nodes. How can I add one new seed node and have the node migrate data from other nodes? Thanks, LiuLei
Re: Re: index file and KeysCached
2010/12/6 魏金仙 sei_...@126.com: so when will index files be in the memory? The index files are never fully in memory (because they would quickly be too big). Hence, only a sample of each file is in memory (1 of every 128 entries by default). When Cassandra needs to know where a (row) key is on disk (for a given SSTable), it checks this sample (an offset in the index file to a block of key locations), reads the corresponding block of (128) entries from the index file on disk, and thus finds the actual offset in the sstable where the corresponding row starts. The goal of the KeyCache is to avoid this on a cache hit (thus basically avoiding a seek). At 2010-12-06 00:54:48, Brandon Williams dri...@gmail.com wrote: 2010/12/5 魏金仙 sei_...@126.com: for each sstable, there is an index file, which is loaded in memory to locate a particular key's offset efficiently. Index files are not held in memory. and for each CF, KeysCached can be set to cache keys' location. could you please tell me the difference between the two? KeysCached are held in memory. I'm wondering whether it's necessary to set KeysCached for a CF or not. Not necessary, but advantageous. -Brandon
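A rough sketch of the lookup path described above, in illustrative Java (this is not Cassandra's code; the names and types are invented for the example):

import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative only: one sampled index entry (every 128th key) is kept in memory;
// the rest of the index is read from disk to find the row's offset in the SSTable.
class IndexSampleLookup {
    // sampled row key -> offset of its index block in the on-disk index file
    private final NavigableMap<String, Long> sample = new TreeMap<>();

    long findRowOffset(String rowKey) {
        // 1. Nearest preceding sampled entry (in memory).
        Map.Entry<String, Long> entry = sample.floorEntry(rowKey);
        long blockOffset = (entry == null) ? 0L : entry.getValue();
        // 2. Read the block of ~128 (key, data-offset) entries from the index file (disk seek #1)
        //    and scan it for rowKey.
        long dataOffset = scanIndexBlockOnDisk(blockOffset, rowKey);
        // 3. Seek to dataOffset in the SSTable data file and read the row (disk seek #2).
        //    A key-cache hit would skip steps 1-2 and go straight to dataOffset.
        return dataOffset;
    }

    private long scanIndexBlockOnDisk(long blockOffset, String rowKey) {
        return -1L; // placeholder for the on-disk scan
    }
}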
0.7.0beta3 Cassandra frequently log GC for ConcurrentMarkSweep
I have a two-node cluster running Cassandra; each node's machine has 4 GB of memory. First I set the heap size to 2 GB and both ran normally. Then I set the heap size to 1 GB, and the client that inserts data into and reads data from Cassandra began throwing Read/Write Unavailable exceptions. One Cassandra node began logging GC for ConcurrentMarkSweep frequently; every ConcurrentMarkSweep's time is over 5000 ms. How does this happen? Why frequent ConcurrentMarkSweep GC, not ParNew GC? -- Best regards, Ivy Tang
Re: 0.7.0beta3 Cassandra frequently log GC for ConcurrentMarkSweep
And after checking the node that GCs with ConcurrentMarkSweep frequently, its OC (current old space capacity, KB) is 1006016.0 and its OU (old space utilization, KB) is also 1006016.0, i.e. almost all of the memory. Does this situation imply the heap size is set too low? On Mon, Dec 6, 2010 at 8:07 PM, Ying Tang ivytang0...@gmail.com wrote: I have a two-node cluster running Cassandra; each node's machine has 4 GB of memory. First I set the heap size to 2 GB and both ran normally. Then I set the heap size to 1 GB, and the client that inserts data into and reads data from Cassandra began throwing Read/Write Unavailable exceptions. One Cassandra node began logging GC for ConcurrentMarkSweep frequently; every ConcurrentMarkSweep's time is over 5000 ms. How does this happen? Why frequent ConcurrentMarkSweep GC, not ParNew GC? -- Best regards, Ivy Tang -- Best regards, Ivy Tang
Re: 0.7.0beta3 Cassandra frequently log GC for ConcurrentMarkSweep
If it's GCing frequently and each CMS is only collecting a small fraction of the old gen, then your heap is probably too small. (GCInspector only logs collections that take over 1s, which should never include ParNew.) On Mon, Dec 6, 2010 at 7:11 AM, Ying Tang ivytang0...@gmail.com wrote: And after checking node who gc concurrentMarkSweep frequently ,it's OC(Current old space capacity (KB)) is 1006016.0 ,it's OU(Old space utilization (KB).) is also 1006016.0 ,almost all memory. Dose this situation imply this heap size is set too low? On Mon, Dec 6, 2010 at 8:07 PM, Ying Tang ivytang0...@gmail.com wrote: I hava a two node , running cassandra ,both's memory is 4G. First i set heap size to 2G ,both run normal . The i set heap size to 1G , the client who insert data to and read data from cassandra began throw Read\Write Unavailable Exception . And one cassandra node began logging GC for ConcurrentMarkSweep frequently ,every ConcurrentMarkSweep 's time is over 5000ms. How this happen ? Why frequently ConcurrentMarkSweep GC ,not ParNew GC? -- Best regards, Ivy Tang -- Best regards, Ivy Tang -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Confused about consistency
You're right, they should be the same. Next time this happens, set the log level to debug (from StorageService jmx) on the surviving nodes and let a couple of queries fail, before restarting the 3rd (and setting the level back to info). On Sat, Dec 4, 2010 at 12:01 AM, Dan Hendry dan.hendry.j...@gmail.com wrote: Doesn't consistency level ALL=QUORUM at RF=2? I have not had a chance to test your fix but I don't THINK this is the issue. If it is the issue, how do consistency levels ALL and QUORUM differ at this replication factor? On Sat, Dec 4, 2010 at 12:03 AM, Jonathan Ellis jbel...@gmail.com wrote: I think you are running into https://issues.apache.org/jira/browse/CASSANDRA-1316, where when an inconsistency on QUORUM/ALL is discovered it always performed the repair at QUORUM instead of the original CL. Thus, reading at ALL you would see the correct answer on the 2nd read but you weren't guaranteed to see it on the first. This was fixed in 0.6.4 but apparently I botched the merge to the 0.7 branch. I corrected that just now, so when you update, you should be good to go. On Fri, Dec 3, 2010 at 9:19 PM, Dan Hendry dan.hendry.j...@gmail.com wrote: I am seeing fairly strange behavior in my Cassandra cluster. Setup - 3 nodes (let's call them nodes 1, 2 and 3) - RF=2 - A set of servers (producers) which write data to the cluster at consistency level ONE - A set of servers (consumers/processors) which read data from the cluster at consistency level ALL - Cassandra 0.7 (recent out of the svn branch, post beta 3) - Clients use the pelops library Situation: - Everything is humming along nicely - A Cassandra node (say 3) goes down (even with 24 GB of RAM, OOM errors are the bane of my existence) - Producers continue to happily write to the cluster but consumers start complaining by throwing TimeOutExceptions and UnavailableExceptions. - I stagger out of bed in the middle of the night and restart Cassandra on node 3. - The consumers stop complaining and get back to business but generate garbage data for the period node 3 was down. It's almost like half the data is missing half the time. (Again, I am reading at consistency level ALL.) - I force the consumers to reprocess data for the period node 3 was down. They generate accurate output which is different from the first time round. To be explicit, what seems to be happening is the first read at consistency ALL gives A,C,E (for example) and the second read at consistency level ALL gives A,B,C,D,E. Is this a Cassandra bug? Is my knowledge of consistency levels flawed? My understanding is that you could achieve strongly consistent behavior by writing at ONE and reading at ALL. After this experience, my theory (uneducated, untested, and under-researched) is that strong consistency applies only to column values, not the set of columns (or super-columns in this case) which make up a row. Any thoughts? -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
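For reference, the rule behind "write at ONE, read at ALL is strongly consistent" is just that the read and write replica sets must overlap: R + W > RF. A trivial illustration (not Cassandra API):

class ConsistencyCheck {
    // Reads are guaranteed to see the latest write when replica sets overlap.
    static boolean stronglyConsistent(int writeReplicas, int readReplicas, int replicationFactor) {
        return writeReplicas + readReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 2;
        System.out.println(stronglyConsistent(1, 2, rf)); // W=ONE, R=ALL -> true (1 + 2 > 2)
        System.out.println(stronglyConsistent(1, 1, rf)); // W=ONE, R=ONE -> false
        System.out.println(stronglyConsistent(2, 2, rf)); // W=QUORUM (= ALL at RF=2), R=QUORUM -> true
    }
}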
Re: If one seed node crashes, how can I add one seed node?
set it as a seed _after_ bootstrapping it into the cluster. On Mon, Dec 6, 2010 at 5:01 AM, lei liu liulei...@gmail.com wrote: After one seed node crashed, I want to add one node as a seed node. I set auto_bootstrap to true, but the new node doesn't migrate data from other nodes. How can I add one new seed node and have the node migrate data from other nodes? Thanks, LiuLei -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: If one seed node crashes, how can I add one seed node?
Thanks, Jonathan, for your reply. How can I bootstrap the node into the cluster? I know that if the node is a seed node, I can't set AutoBootstrap to true. 2010/12/6 Jonathan Ellis jbel...@gmail.com: set it as a seed _after_ bootstrapping it into the cluster. On Mon, Dec 6, 2010 at 5:01 AM, lei liu liulei...@gmail.com wrote: After one seed node crashed, I want to add one node as a seed node. I set auto_bootstrap to true, but the new node doesn't migrate data from other nodes. How can I add one new seed node and have the node migrate data from other nodes? Thanks, LiuLei -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Sorting problem on supercolumns names using OPP on 0.6.2
Hi, I have the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } <ColumnFamily ColumnType="Super" CompareWith="LongType" CompareSubcolumnsWith="BytesType" Name="EventsByUserDate" /> Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), and I'm using OrderPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) is not sorting correctly by supercolumn (the supercolumn names come out unsorted). This is a sample output for the previous query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
RE: Various exceptions on 0.7
bleeding edge code you are running (did you try rc1?) or you do have nodes on different versions All nodes are running code from https://svn.apache.org/repos/asf/cassandra/branches/cassandra-0.7 which I thought was essentially RC1 with fixes, but I will give the actual release a try. you have a hardware problem Hard to say. I don't think so; everything else seems to be working fine. I will try and run some diagnostics on the two nodes which seem to be acting up. Now for some new developments; the plot thickens. I am fairly sure there is a corrupt ColumnFamily/SSTable. After a restart, two adjacent nodes both show the following error, after which the CompactionManager pending task count never returns to zero. I am fairly sure this cf is not getting compacted but compaction for other column families seems to continue. In order to get rid of all these errors I have to perform a truncate operation using the cli, after which I get the same IndexOutOfBounds exception. Can I just shut down the node (draining first) and delete all data files related to this column family on the two problematic nodes? The data they contain is reasonably unimportant and I don't mind losing it. ERROR [CompactionExecutor:1] 2010-12-06 05:07:56,736 AbstractCassandraDaemon.java (line 90) Fatal exception in thread Thread[CompactionExecutor:1,1,main] java.lang.IndexOutOfBoundsException at java.nio.Buffer.checkIndex(Buffer.java:520) at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:340) at org.apache.cassandra.db.DeletedColumn.getLocalDeletionTime(DeletedColumn.java:57) at org.apache.cassandra.db.ColumnFamilyStore.removeDeletedSuper(ColumnFamilyStore.java:818) at org.apache.cassandra.db.ColumnFamilyStore.removeDeletedColumnsOnly(ColumnFamilyStore.java:781) at org.apache.cassandra.db.ColumnFamilyStore.removeDeleted(ColumnFamilyStore.java:774) at org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:93) at org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:138) at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:107) at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:42) at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131) at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183) at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94) at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:321) at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:124) at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:97) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Dan -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: December-04-10 22:45 To: user Subject: Re: Various exceptions on 0.7 At least one of your nodes is sending garbage to the others. Either there's a bug in the bleeding edge code you are running (did you try rc1?) 
or you do have nodes on different versions or you have a hardware problem. On Sat, Dec 4, 2010 at 5:51 PM, Dan Hendry dan.hendry.j...@gmail.com wrote: Here are two other errors which appear frequently: ERROR [MutationStage:29] 2010-12-04 17:47:46,931 RowMutationVerbHandler.java (line 83) Error in row mutation java.io.IOException: Invalid localDeleteTime read: 0 at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:355) at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:312) at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129) at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120) at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383) at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393) at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351) at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63) at
Re: If one seed node crashes, how can I add one seed node?
The node can be set as a seed node at any time. It does not need to be a seed node when it joins the cluster. You should remove it as a seed, set auto_bootstrap to true and let it join the cluster. Once it has joined the cluster you should add it as a seed node in the configuration for all of your nodes. On Mon, Dec 6, 2010 at 9:59 AM, lei liu liulei...@gmail.com wrote: Thanks, Jonathan, for your reply. How can I bootstrap the node into the cluster? I know that if the node is a seed node, I can't set AutoBootstrap to true. 2010/12/6 Jonathan Ellis jbel...@gmail.com: set it as a seed _after_ bootstrapping it into the cluster. On Mon, Dec 6, 2010 at 5:01 AM, lei liu liulei...@gmail.com wrote: After one seed node crashed, I want to add one node as a seed node. I set auto_bootstrap to true, but the new node doesn't migrate data from other nodes. How can I add one new seed node and have the node migrate data from other nodes? Thanks, LiuLei -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Sorting problem on supercolumns names using OPP on 0.6.2
What client are you using? Is it storing the results in a hash map or some other type of non-order preserving dictionary? - Tyler On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I've the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } ColumnFamily ColumnType=Super CompareWith=LongType CompareSubcolumnsWith=BytesType Name=EventsByUserDate / Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), I'm using OrdingPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) Is not sorting correctly by supercolumn (the supercolumn names come out unsorted), this is a sample output for the pervious query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
LA, Tokyo Cassandra trainings this week
There are a few seats open for each: LA training Wednesday: http://www.eventbrite.com/event/1002369113 Tokyo training Thursday: http://nosqlfd.eventbrite.com/ I will be teaching the LA class. Tokyo will be taught by Nate McCall (with pauseless translation to Japanese) and hosted by our friends at Gemini Mobile. See you there! -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Re: probability of node receiving (not be responsible for) the request
2010/12/6 魏金仙 sei_...@126.com: Thank you, but I mean the probability of a node receiving the request, not eventually processing it. I see. That depends on how the client-side load balancing is written. -Brandon
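To make the point concrete: which node acts as coordinator for a request is purely a client-side (or load-balancer) decision, independent of which nodes own the data. A toy illustration with invented names, not any particular client library:

import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

class NodeSelection {
    private final List<String> nodes;              // e.g. the 6 node addresses
    private final Random random = new Random();
    private final AtomicInteger counter = new AtomicInteger();

    NodeSelection(List<String> nodes) { this.nodes = nodes; }

    // Random policy: each request lands on any given node with probability 1/N.
    String pickRandom() { return nodes.get(random.nextInt(nodes.size())); }

    // Round-robin policy: the receiving node is determined by the request index, not 1/N per request.
    String pickRoundRobin() { return nodes.get(Math.floorMod(counter.getAndIncrement(), nodes.size())); }

    // A "sticky" policy (one pooled connection reused) would send every request to the same node.
}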
Re: LA, Tokyo Cassandra trainings this week
We've also got Jake Luciani (@tjake) giving a talk at Cassandra London this Wednesday - this is a great opportunity to meet with other Cassandra users. There will be some free beer and food available. http://www.meetup.com/Cassandra-London/calendar/15351291/ Dave On 6 December 2010 17:05, Jonathan Ellis jbel...@gmail.com wrote: There are a few seats open for each: LA training Wednesday: http://www.eventbrite.com/event/1002369113 Tokyo training Thursday: http://nosqlfd.eventbrite.com/ I will be teaching the LA class. Tokyo will be taught by Nate McCall (with pauseless translation to Japanese) and hosted by our friends at Gemini Mobile. See you there! -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Sorting problem on supercolumns names using OPP on 0.6.2
I'm using thrift in C++ and inserting the results in a vector of pairs, so client-side-mangling does not seem to be the problem. Also I'm using a test column where I insert the same value I'm using as super column name (in this case the same date converted to string) and when queried using cassandra cli is unsorted too: cassandra get Events.EventsByUserDate ['guille'] = (super_column=9088542550893002752, (column=4342323443303834363833383437454339364433324530324538413039373736, value=2010-12-06 17:43:36.000, timestamp=1291657416526732)) = (super_column=5990347482238812160, (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d, value=2010-12-06 17:46:08.000, timestamp=1291657568569039)) = (super_column=-3089190841516818432, (column=3634343644353236463830303437363542454245354630343845393533373337, value=2010-12-06 17:44:47.000, timestamp=1291657487450738)) = (super_column=-4026221038986592256, (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532, value=2010-12-06 17:39:50.000, timestamp=1291657190117981)) On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs ty...@riptano.com wrote: What client are you using? Is it storing the results in a hash map or some other type of non-order preserving dictionary? - Tyler On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I've the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } ColumnFamily ColumnType=Super CompareWith=LongType CompareSubcolumnsWith=BytesType Name=EventsByUserDate / Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), I'm using OrdingPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) Is not sorting correctly by supercolumn (the supercolumn names come out unsorted), this is a sample output for the pervious query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
Testathon at Twitter on December 13th
We're going to be hosting people at the Twitter offices the evening of December 13th to focus on testing 0.7. If you're interested please contact me offlist and I'll add you to the invite. Note that we're trying to keep the group small and focused. -ryan
Re: Sorting problem on supercolumns names using OPP on 0.6.2
How are you packing the longs into strings? The large negative numbers point to that being done incorrectly. Bitshifting and putting each byte of the long into a char[8] then stringifying the char[] is the best way to go. Cassandra expects big-ending longs, as well. - Tyler On Mon, Dec 6, 2010 at 11:55 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: I'm using thrift in C++ and inserting the results in a vector of pairs, so client-side-mangling does not seem to be the problem. Also I'm using a test column where I insert the same value I'm using as super column name (in this case the same date converted to string) and when queried using cassandra cli is unsorted too: cassandra get Events.EventsByUserDate ['guille'] = (super_column=9088542550893002752, (column=4342323443303834363833383437454339364433324530324538413039373736, value=2010-12-06 17:43:36.000, timestamp=1291657416526732)) = (super_column=5990347482238812160, (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d, value=2010-12-06 17:46:08.000, timestamp=1291657568569039)) = (super_column=-3089190841516818432, (column=3634343644353236463830303437363542454245354630343845393533373337, value=2010-12-06 17:44:47.000, timestamp=1291657487450738)) = (super_column=-4026221038986592256, (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532, value=2010-12-06 17:39:50.000, timestamp=1291657190117981)) On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs ty...@riptano.com wrote: What client are you using? Is it storing the results in a hash map or some other type of non-order preserving dictionary? - Tyler On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I've the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } ColumnFamily ColumnType=Super CompareWith=LongType CompareSubcolumnsWith=BytesType Name=EventsByUserDate / Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), I'm using OrdingPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) Is not sorting correctly by supercolumn (the supercolumn names come out unsorted), this is a sample output for the pervious query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
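(An aside for readers on Java rather than C++: java.nio.ByteBuffer is big-endian by default, so the equivalent packing is a one-liner. Illustrative only.)

import java.nio.ByteBuffer;

class LongKeyPacking {
    // Pack a long into the 8 big-endian bytes that the LongType comparator expects.
    static byte[] pack(long value) {
        return ByteBuffer.allocate(8).putLong(value).array(); // ByteBuffer defaults to big-endian
    }
}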
Re: Sorting problem on supercolumns names using OPP on 0.6.2
That should be big-endian. On Mon, Dec 6, 2010 at 12:29 PM, Tyler Hobbs ty...@riptano.com wrote: How are you packing the longs into strings? The large negative numbers point to that being done incorrectly. Bitshifting and putting each byte of the long into a char[8] then stringifying the char[] is the best way to go. Cassandra expects big-ending longs, as well. - Tyler On Mon, Dec 6, 2010 at 11:55 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: I'm using thrift in C++ and inserting the results in a vector of pairs, so client-side-mangling does not seem to be the problem. Also I'm using a test column where I insert the same value I'm using as super column name (in this case the same date converted to string) and when queried using cassandra cli is unsorted too: cassandra get Events.EventsByUserDate ['guille'] = (super_column=9088542550893002752, (column=4342323443303834363833383437454339364433324530324538413039373736, value=2010-12-06 17:43:36.000, timestamp=1291657416526732)) = (super_column=5990347482238812160, (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d, value=2010-12-06 17:46:08.000, timestamp=1291657568569039)) = (super_column=-3089190841516818432, (column=3634343644353236463830303437363542454245354630343845393533373337, value=2010-12-06 17:44:47.000, timestamp=1291657487450738)) = (super_column=-4026221038986592256, (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532, value=2010-12-06 17:39:50.000, timestamp=1291657190117981)) On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs ty...@riptano.com wrote: What client are you using? Is it storing the results in a hash map or some other type of non-order preserving dictionary? - Tyler On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I've the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } ColumnFamily ColumnType=Super CompareWith=LongType CompareSubcolumnsWith=BytesType Name=EventsByUserDate / Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), I'm using OrdingPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) Is not sorting correctly by supercolumn (the supercolumn names come out unsorted), this is a sample output for the pervious query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
Re: Sorting problem on supercolumns names using OPP on 0.6.2
Also, thought I should mention: When you make a std::string out of the char[], make sure to use the constructor with the size_t parameter (size 8). - Tyler On Mon, Dec 6, 2010 at 12:29 PM, Tyler Hobbs ty...@riptano.com wrote: That should be big-endian. On Mon, Dec 6, 2010 at 12:29 PM, Tyler Hobbs ty...@riptano.com wrote: How are you packing the longs into strings? The large negative numbers point to that being done incorrectly. Bitshifting and putting each byte of the long into a char[8] then stringifying the char[] is the best way to go. Cassandra expects big-ending longs, as well. - Tyler On Mon, Dec 6, 2010 at 11:55 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: I'm using thrift in C++ and inserting the results in a vector of pairs, so client-side-mangling does not seem to be the problem. Also I'm using a test column where I insert the same value I'm using as super column name (in this case the same date converted to string) and when queried using cassandra cli is unsorted too: cassandra get Events.EventsByUserDate ['guille'] = (super_column=9088542550893002752, (column=4342323443303834363833383437454339364433324530324538413039373736, value=2010-12-06 17:43:36.000, timestamp=1291657416526732)) = (super_column=5990347482238812160, (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d, value=2010-12-06 17:46:08.000, timestamp=1291657568569039)) = (super_column=-3089190841516818432, (column=3634343644353236463830303437363542454245354630343845393533373337, value=2010-12-06 17:44:47.000, timestamp=1291657487450738)) = (super_column=-4026221038986592256, (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532, value=2010-12-06 17:39:50.000, timestamp=1291657190117981)) On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs ty...@riptano.com wrote: What client are you using? Is it storing the results in a hash map or some other type of non-order preserving dictionary? - Tyler On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I've the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } ColumnFamily ColumnType=Super CompareWith=LongType CompareSubcolumnsWith=BytesType Name=EventsByUserDate / Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), I'm using OrdingPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) Is not sorting correctly by supercolumn (the supercolumn names come out unsorted), this is a sample output for the pervious query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
Re: Sorting problem on supercolumns names using OPP on 0.6.2
+1 I'm doing this in my C++ client so contact me offlist if you need code David Sent from my iPhone On Dec 6, 2010, at 1:33 PM, Tyler Hobbs ty...@riptano.com wrote: Also, thought I should mention: When you make a std::string out of the char[], make sure to use the constructor with the size_t parameter (size 8). - Tyler On Mon, Dec 6, 2010 at 12:29 PM, Tyler Hobbs ty...@riptano.com wrote: That should be big-endian. On Mon, Dec 6, 2010 at 12:29 PM, Tyler Hobbs ty...@riptano.com wrote: How are you packing the longs into strings? The large negative numbers point to that being done incorrectly. Bitshifting and putting each byte of the long into a char[8] then stringifying the char[] is the best way to go. Cassandra expects big-ending longs, as well. - Tyler On Mon, Dec 6, 2010 at 11:55 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: I'm using thrift in C++ and inserting the results in a vector of pairs, so client-side-mangling does not seem to be the problem. Also I'm using a test column where I insert the same value I'm using as super column name (in this case the same date converted to string) and when queried using cassandra cli is unsorted too: cassandra get Events.EventsByUserDate ['guille'] = (super_column=9088542550893002752, (column=4342323443303834363833383437454339364433324530324538413039373736, value=2010-12-06 17:43:36.000, timestamp=1291657416526732)) = (super_column=5990347482238812160, (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d, value=2010-12-06 17:46:08.000, timestamp=1291657568569039)) = (super_column=-3089190841516818432, (column=3634343644353236463830303437363542454245354630343845393533373337, value=2010-12-06 17:44:47.000, timestamp=1291657487450738)) = (super_column=-4026221038986592256, (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532, value=2010-12-06 17:39:50.000, timestamp=1291657190117981)) On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs ty...@riptano.com wrote: What client are you using? Is it storing the results in a hash map or some other type of non-order preserving dictionary? - Tyler On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I've the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } ColumnFamily ColumnType=Super CompareWith=LongType CompareSubcolumnsWith=BytesType Name=EventsByUserDate / Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), I'm using OrdingPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) Is not sorting correctly by supercolumn (the supercolumn names come out unsorted), this is a sample output for the pervious query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
Newbie question about connecting to a cassandra server from another server using Fauna
Hi, I'm trying to create a connection to a server running Cassandra by doing this: compass = Cassandra.new('Compas', servers="223.798.456.123:9160") But once I try to get some data I realize that there's no connection. Any ideas? Am I missing something? Thanks
Re: Sorting problem on supercolumns names using OPP on 0.6.2
Uh, ok, I was just copying :P string result; result.resize(sizeof(long long)); memcpy(&result[0], &l, sizeof(long long)); I'll try and let you know, many thanks! On Mon, Dec 6, 2010 at 4:29 PM, Tyler Hobbs ty...@riptano.com wrote: How are you packing the longs into strings? The large negative numbers point to that being done incorrectly. Bitshifting and putting each byte of the long into a char[8] then stringifying the char[] is the best way to go. Cassandra expects big-ending longs, as well. - Tyler On Mon, Dec 6, 2010 at 11:55 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: I'm using thrift in C++ and inserting the results in a vector of pairs, so client-side-mangling does not seem to be the problem. Also I'm using a test column where I insert the same value I'm using as super column name (in this case the same date converted to string) and when queried using cassandra cli is unsorted too: cassandra get Events.EventsByUserDate ['guille'] = (super_column=9088542550893002752, (column=4342323443303834363833383437454339364433324530324538413039373736, value=2010-12-06 17:43:36.000, timestamp=1291657416526732)) = (super_column=5990347482238812160, (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d, value=2010-12-06 17:46:08.000, timestamp=1291657568569039)) = (super_column=-3089190841516818432, (column=3634343644353236463830303437363542454245354630343845393533373337, value=2010-12-06 17:44:47.000, timestamp=1291657487450738)) = (super_column=-4026221038986592256, (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532, value=2010-12-06 17:39:50.000, timestamp=1291657190117981)) On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs ty...@riptano.com wrote: What client are you using? Is it storing the results in a hash map or some other type of non-order preserving dictionary? - Tyler On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I've the following schema defined: EventsByUserDate : { UserId : { epoch: { // SC IID, IID, IID, IID }, // and the other events in time epoch: { IID, IID, IID } } } <ColumnFamily ColumnType="Super" CompareWith="LongType" CompareSubcolumnsWith="BytesType" Name="EventsByUserDate" /> Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), I'm using OrdingPreservingPartitioner. But a call to: GetSuperRangeSlices(EventsByUserDate , --column family , --supercolumn userId, --startkey userId, --endkey { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20} }, 1 --total keys ) Is not sorting correctly by supercolumn (the supercolumn names come out unsorted), this is a sample output for the pervious query using thrift directly: SC 1291648883 SC 1291588465 SC 1291588453 SC 1291586385 SC 1291587408 SC 1291588174 SC 1291585331 SC 1291587116 SC 1291651116 SC 1291586332 SC 1291588548 SC 1291588036 SC 1291648703 SC 1291583651 SC 1291583650 SC 1291583649 SC 1291583648 SC 1291583647 SC 1291583646 SC 1291587485 Anything I'm missing regarding sorting schemes? Thanks, Guille
Re: Newbie question about connecting to a cassandra server from another server using Fauna
What function are you calling to get data and what is the error? Try calling a function like keyspaces(), it should return a list of the keyspaces in your cluster and is a good way to test things are connected. If there is still no joy, check you can connect to your cluster using the cassandra-cli command line app located in cassandra/bin. Aaron On 07 Dec, 2010, at 08:46 AM, Alberto Velandia betovelan...@gmail.com wrote: Hi I'm trying to create a connection to a server running cassandra doing this: compass = Cassandra.new('Compas', servers="223.798.456.123:9160") But once I try to get some data I realize that there's no connection, any ideas?? Am I missing something? Thanks
Re: [RELEASE] 0.7.0 rc1
On Thu, 2010-12-02 at 10:30 -0800, Clint Byrum wrote: On Wed, 2010-12-01 at 17:00 +0100, Olivier Rosello wrote: FYI, 0.7.0~rc1 debs are available in a new PPA for experimental releases: http://launchpad.net/~cassandra-ubuntu/+archive/experimental It seems there is a dependency on libjets3t-java. Is it really needed? This dependency cannot be resolved on Ubuntu Lucid :-( I'll be working on getting any of the packaged dependencies from natty added to the lucid PPA very soon. Please stay tuned! Ok, jets3t has been uploaded and cassandra copied to the lucid and maverick releases, so this PPA should work for users on 10.04 and later now.
Re: Newbie question about connecting to a cassandra server from another server using Fauna
I've tried the keyspaces() function and got this on return: compass.keyspaces() CassandraThrift::Cassandra::Client::TransportException: CassandraThrift::Cassandra::Client::TransportException from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:53:in `rescue in open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:36:in `open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/buffered_transport.rb:37:in `open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/connection/socket.rb:11:in `connect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:82:in `connect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:110:in `handled_proxy' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:57:in `get_string_property' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:302:in `all_nodes' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:285:in `reconnect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:280:in `client' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:86:in `keyspaces' from (irb):4 from /home/compass/.rvm/rubies/ruby-1.9.2-p0/bin/irb:16:in `main' about the cassandra-cli, should I run the command on the server from which I'm trying to connect? Thanks for the help On Dec 6, 2010, at 2:52 PM, Aaron Morton wrote: What function are you calling to get data and what is the error ? Try calling a function like keyspaces(), it should return a list of the keyspaces in your cluster and is a good way to test things are connected. If there is still no joy check you can connect to your cluster using the cassandra-cli command line app located in cassandra/bin Aaron On 07 Dec, 2010,at 08:46 AM, Alberto Velandia betovelan...@gmail.com wrote: Hi I'm trying to create a connection to a server running cassandra doing this: compass = Cassandra.new('Compas', servers=223.798.456.123:9160) But once I try to get some data I realize that there's no connection, any ideas?? I'm I missing something ? Thanks
Re: Newbie question about connecting to a cassandra server from another server using Fauna
You can run the cassandra-cli from any machine. If you run it from the same machine as your ruby code it's a reliable way to check you can connect to the cluster. Ok, next set of questions: - what version of cassandra are you using? Is it 0.7? - what require did you run? Was it require 'cassandra/0.7'? See the Usage section at https://github.com/fauna/cassandra for details. Cassandra 0.7 defaults to framed transport, 0.6 does not. Aaron On 07 Dec, 2010, at 09:07 AM, Alberto Velandia betovelan...@gmail.com wrote: I've tried the keyspaces() function and got this on return: compass.keyspaces() CassandraThrift::Cassandra::Client::TransportException: CassandraThrift::Cassandra::Client::TransportException from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:53:in `rescue in open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:36:in `open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/buffered_transport.rb:37:in `open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/connection/socket.rb:11:in `connect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:82:in `connect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:110:in `handled_proxy' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:57:in `get_string_property' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:302:in `all_nodes' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:285:in `reconnect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:280:in `client' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:86:in `keyspaces' from (irb):4 from /home/compass/.rvm/rubies/ruby-1.9.2-p0/bin/irb:16:in `main' about the cassandra-cli, should I run the command on the server from which I'm trying to connect? Thanks for the help On Dec 6, 2010, at 2:52 PM, Aaron Morton wrote: What function are you calling to get data and what is the error? Try calling a function like keyspaces(), it should return a list of the keyspaces in your cluster and is a good way to test things are connected. If there is still no joy check you can connect to your cluster using the cassandra-cli command line app located in cassandra/bin Aaron On 07 Dec, 2010, at 08:46 AM, Alberto Velandia betovelan...@gmail.com wrote: Hi I'm trying to create a connection to a server running cassandra doing this: compass = Cassandra.new('Compas', servers="223.798.456.123:9160") But once I try to get some data I realize that there's no connection, any ideas?? Am I missing something? Thanks
Re: Newbie question about connecting to a cassandra server from another server using Fauna
Hi I've successfully managed to connect to the server through the cassandra-cli command but still no luck on doing it from Fauna, I'm running cassandra 0.6.8 and I did the usual require 'cassandra' I've changed the ThriftAddress on the storage-conf.xml to the IP address of the server itself, do I need to change anything else? Thanks once again On Dec 6, 2010, at 3:15 PM, Aaron Morton wrote: You can run the cassandra-cli from any machine. If you run it from the same machine as your ruby code it's a reliable way to check you can connect to the cluster. ok, next set of questions - what version of cassandra are you using ? Is it 0.7? - what require did you run ? was it require 'cassandra/0.7' ? see the Usage section https://github.com/fauna/cassandra for details. Cassandra 0.7 defaults to frames transport, 0.6 does not. Aaron On 07 Dec, 2010,at 09:07 AM, Alberto Velandia betovelan...@gmail.com wrote: I've tried the keyspaces() function and got this on return: compass.keyspaces() CassandraThrift::Cassandra::Client::TransportException: CassandraThrift::Cassandra::Client::TransportException from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:53:in `rescue in open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:36:in `open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/buffered_transport.rb:37:in `open' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/connection/socket.rb:11:in `connect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:82:in `connect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:110:in `handled_proxy' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:57:in `get_string_property' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:302:in `all_nodes' from /home/compass/.rvm/gems/ruby-19.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:285:in `reconnect!' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:280:in `client' from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:86:in `keyspaces' from (irb):4 from /home/compass/.rvm/rubies/ruby-1.9.2-p0/bin/irb:16:in `main' about the cassandra-cli, should I run the command on the server from which I'm trying to connect? Thanks for the help On Dec 6, 2010, at 2:52 PM, Aaron Morton wrote: What function are you calling to get data and what is the error ? Try calling a function like keyspaces(), it should return a list of the keyspaces in your cluster and is a good way to test things are connected. If there is still no joy check you can connect to your cluster using the cassandra-cli command line app located in cassandra/bin Aaron On 07 Dec, 2010,at 08:46 AM, Alberto Velandia betovelan...@gmail.com wrote: Hi I'm trying to create a connection to a server running cassandra doing this: compass = Cassandra.new('Compas', servers=223.798.456.123:9160) But once I try to get some data I realize that there's no connection, any ideas?? I'm I missing something ? Thanks
Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM)
Jake or anyone else got experience bulk loading into Lucandra? Or does anyone have experience with JRockit? Max, are you sending one document at a time into lucene? Can you send them in batches (like solr), and if so does it reduce the amount of requests going to cassandra? Also, cassandra.bat is configured with -XX:+HeapDumpOnOutOfMemoryError so you should be able to take a look at where all the memory is going. The Riptano blog points to http://www.eclipse.org/mat/ also see http://www.oracle.com/technetwork/java/javase/memleaks-137499.html#gdyrr Hope that helps. Aaron On 07 Dec, 2010, at 09:17 AM, Aaron Morton aa...@thelastpickle.com wrote: Accidentally sent to me. Begin forwarded message: From: Max cassan...@ajowa.de Date: 07 December 2010 6:00:36 AM To: Aaron Morton aa...@thelastpickle.com Subject: Re: Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM) Thank you both for your answer! After several tests with different parameters we came to the conclusion that it must be a bug. It looks very similar to: https://issues.apache.org/jira/browse/CASSANDRA-1014 For both CFs we reduced thresholds: - memtable_flush_after_mins = 60 (both CFs are used permanently, therefore other thresholds should trigger first) - memtable_throughput_in_mb = 40 - memtable_operations_in_millions = 0.3 - keys_cached = 0 - rows_cached = 0 - in_memory_compaction_limit_in_mb = 64 First we disabled caching, later we disabled compacting and after that we set commitlog_sync: batch commitlog_sync_batch_window_in_ms: 1 But our problem still appears: during inserting files with Lucandra, memory usage is slowly growing until an OOM crash after about 50 min. @Peter: In our latest test we stopped writing suddenly but cassandra didn't relax and remains even after minutes at ~90% heap usage. http://oi54.tinypic.com/2dueeix.jpg With our heap calculation we should need: 64 MB * 2 * 3 + 1 GB = 1.4 GB All recent tests we ran with 3 GB. I think that should be ok for a test machine. Also consistency level is one. But Aaron is right, Lucandra produces even more than 200 inserts/s. My 200 documents per second are about 200 operations (writecount) on the first CF and about 3000 on the second CF. But even with about 120 documents/s cassandra crashes. Disk I/O monitored with Windows performance admin tools is moderate on both discs (commitlog is on a separate hard disc). Any ideas? If it's really a bug, in my opinion it's very critical. Aaron Morton aa...@thelastpickle.com wrote: I remember you have 2 CFs but what are the settings for: - memtable_flush_after_mins - memtable_throughput_in_mb - memtable_operations_in_millions - keys_cached - rows_cached - in_memory_compaction_limit_in_mb Can you do the JVM Heap Calculation here and see what it says: http://wiki.apache.org/cassandra/MemtableThresholds What Consistency Level are you writing at? (Checking it's not Zero.) When you talk about 200 inserts per second, is that storing 200 documents through lucandra or 200 requests to cassandra? If it's the first option I would assume that would generate a lot more actual requests into cassandra. Open up jconsole and take a look at the WriteCount settings for the CFs: http://wiki.apache.org/cassandra/MemtableThresholds You could also try setting the compaction thresholds to 0 to disable compaction while you are pushing this data in. Then use nodetool to compact and turn the settings back to normal. See cassandra.yaml for more info. I would have thought you could get the writes through with the setup you've described so far (even though a single 32bit node is unusual). 
The best advice is to turn all the settings down (e.g. caches off, memtable flush 64 MB, compaction disabled) and if it still fails try:
- checking your IO stats, not sure on windows but JConsole has some IO stats. If your IO cannot keep up then your server is not fast enough for your client load.
- reducing the client load
Hope that helps. Aaron

On 04 Dec, 2010, at 05:23 AM, Max cassan...@ajowa.de wrote: Hi, we increased heap space to 3 GB (with the JRockit VM under 32-bit Windows with 4 GB RAM) but under "heavy" inserts Cassandra is still crashing with an OutOfMemory error after a GC storm. It sounds very similar to https://issues.apache.org/jira/browse/CASSANDRA-1177 In our insert-tests the average heap usage is slowly growing up to the 3 GB border (jconsole monitor over 50 min http://oi51.tinypic.com/k12gzd.jpg) and the CompactionManager queue is also constantly growing, up to about 50 jobs pending. We tried to decrease the CF memtable thresholds but after about half a million inserts it's over.
- Cassandra 0.7.0 beta 3
- Single node
- about 200 inserts/s, ~500 byte - 1 kB each
Is there no other possibility besides slowing down the insert rate? What could be an indicator that a node works stably with this amount of inserts? Thank you for your answer, Max

Aaron Morton aa...@thelastpickle.com: Sounds like you need to increase the Heap size and/or reduce the
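For reference, the heap estimate Max quotes above (per-CF memtable threshold * 3 * number of hot CFs, plus roughly 1 GB for everything else) works out as below. This is only a minimal sketch of that arithmetic; the 64 MB threshold and 2 column families are the figures from this thread, not defaults.

  #include <cstdio>

  int main() {
      const double memtable_throughput_mb = 64.0;   // per-CF memtable threshold used in the test
      const int    hot_column_families    = 2;      // CFs being written to continuously
      const double base_overhead_mb       = 1024.0; // rough ~1 GB allowance for everything else

      // threshold * 3 * hot CFs + base overhead, as in the calculation above
      const double heap_estimate_mb =
          memtable_throughput_mb * 3 * hot_column_families + base_overhead_mb;

      std::printf("estimated heap: %.0f MB (~%.1f GB)\n",
                  heap_estimate_mb, heap_estimate_mb / 1024.0);
      return 0;
  }

With those inputs it prints 1408 MB (~1.4 GB), matching the number quoted in the thread.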
Re: Sorting problem on supercolumns names using OPP on 0.6.2
So now it's behaving :)

#define ntohll(x) (((__int64)(ntohl((int)((x << 32) >> 32))) << 32) | (unsigned int)ntohl(((int)(x >> 32))))

string result;
result.resize(sizeof(long long));
long long bigendian = htonll(l);
memcpy(&result[0], &bigendian, sizeof(long long));

= (super_column=1291668233,
     (column=35646130653133632d333766642d343231312d386138382d393936383966326462643364, value=2010-12-06 20:43:53.000, timestamp=1291668233034754)
     (column=61323432323262622d353734342d346133322d393530312d626238343365346363376335, value=2010-12-06 20:43:53.000, timestamp=1291668233169771)
     (column=66633136333166382d373733622d343734652d393265362d376162633364316564383964, value=2010-12-06 20:43:53.000, timestamp=1291668233302288))
= (super_column=1291668232,
     (column=61343765353432352d613066392d343334392d613761392d336635313631633261303161, value=2010-12-06 20:43:52.000, timestamp=1291668232563694)
     (column=64343635396433382d316166302d343732662d623737392d336634303931323961373364, value=2010-12-06 20:43:52.000, timestamp=1291668232889235))

Thanks again! Guille

On Mon, Dec 6, 2010 at 5:45 PM, Guillermo Winkler gwink...@inconcertcc.com wrote: uh, ok I was just copying :P

string result;
result.resize(sizeof(long long));
memcpy(&result[0], &l, sizeof(long long));

I'll try and let you know, many thanks!

On Mon, Dec 6, 2010 at 4:29 PM, Tyler Hobbs ty...@riptano.com wrote: How are you packing the longs into strings? The large negative numbers point to that being done incorrectly. Bitshifting and putting each byte of the long into a char[8], then stringifying the char[], is the best way to go. Cassandra expects big-endian longs, as well. - Tyler

On Mon, Dec 6, 2010 at 11:55 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: I'm using thrift in C++ and inserting the results in a vector of pairs, so client-side mangling does not seem to be the problem. Also I'm using a test column where I insert the same value I'm using as the super column name (in this case the same date converted to string), and when queried using the cassandra cli it is unsorted too:

cassandra> get Events.EventsByUserDate['guille']
= (super_column=9088542550893002752,
     (column=4342323443303834363833383437454339364433324530324538413039373736, value=2010-12-06 17:43:36.000, timestamp=1291657416526732))
= (super_column=5990347482238812160,
     (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d, value=2010-12-06 17:46:08.000, timestamp=1291657568569039))
= (super_column=-3089190841516818432,
     (column=3634343644353236463830303437363542454245354630343845393533373337, value=2010-12-06 17:44:47.000, timestamp=1291657487450738))
= (super_column=-4026221038986592256,
     (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532, value=2010-12-06 17:39:50.000, timestamp=1291657190117981))

On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs ty...@riptano.com wrote: What client are you using? Is it storing the results in a hash map or some other type of non-order-preserving dictionary? - Tyler

On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi, I have the following schema defined:

EventsByUserDate : {
    UserId : {
        epoch: { // SC
            IID, IID, IID, IID
        },
        // and the other events in time
        epoch: {
            IID, IID, IID
        }
    }
}

<ColumnFamily ColumnType="Super" CompareWith="LongType" CompareSubcolumnsWith="BytesType" Name="EventsByUserDate" />

Where I'm expecting to store all the event ids for a user ordered by date (it's seconds since epoch as long long), and I'm using OrderPreservingPartitioner.
But a call to:

GetSuperRangeSlices(
    EventsByUserDate,  --column family
    ,                  --supercolumn
    userId,            --startkey
    userId,            --endkey
    { column_names = {}, slice_range = { start = , finish = , reversed = true, count = 20 } },
    1                  --total keys
)

is not sorting correctly by supercolumn (the supercolumn names come out unsorted). This is a sample output for the previous query using thrift directly:

SC 1291648883
SC 1291588465
SC 1291588453
SC 1291586385
SC 1291587408
SC 1291588174
SC 1291585331
SC 1291587116
SC 1291651116
SC 1291586332
SC 1291588548
SC 1291588036
SC 1291648703
SC 1291583651
SC 1291583650
SC 1291583649
SC 1291583648
SC 1291583647
SC 1291583646
SC 1291587485

Anything I'm missing regarding sorting schemes? Thanks, Guille
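For reference, Tyler's suggestion (shift each byte of the long into an 8-byte buffer, most significant byte first, and stringify it) could look roughly like the sketch below. The function name pack_long_be is made up for illustration and is not part of any client library.

  #include <string>

  std::string pack_long_be(long long value) {
      // Most significant byte first, so the 8-byte string is big-endian
      // regardless of the host's byte order (what a LongType comparator expects).
      std::string packed(8, '\0');
      for (int i = 0; i < 8; ++i) {
          packed[i] = static_cast<char>(
              (static_cast<unsigned long long>(value) >> (56 - 8 * i)) & 0xFF);
      }
      return packed;
  }

  // e.g. use the packed string as the supercolumn name: pack_long_be(1291668233)

This avoids relying on ntohll/htonll being defined on the platform at all.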
questions about cassandra-1072/1546
Can we get an update? After reading through the comments on 1072, it looks like this is getting close to finished, but it's hard for someone not knee-deep in the project to tell. I'm primarily interested in the timeline you foresee for getting the increment support into trunk for 0.7, and some documentation around how counters will be supported from the user's perspective - chiefly what a Column and SuperColumn will look like with counters and what the thrift API will be. Some documentation about the remaining issues and concerns we should be aware of when using counters would be good, too, since it looks like there were some in the comments. Again, as someone not knee-deep in the project, it's hard to tell how severe they are or how or when they would apply in general use.
Re: Newbie question about connecting to a cassandra server from another server using Fauna
It would help if you gave us more context. The code snippet you've given us is incomplete and not very helpful. -ryan

On Mon, Dec 6, 2010 at 12:33 PM, Alberto Velandia betovelan...@gmail.com wrote: Hi, I've successfully managed to connect to the server through the cassandra-cli command but still no luck on doing it from Fauna. I'm running cassandra 0.6.8 and I did the usual require 'cassandra'. I've changed the ThriftAddress in the storage-conf.xml to the IP address of the server itself, do I need to change anything else? Thanks once again

On Dec 6, 2010, at 3:15 PM, Aaron Morton wrote: You can run the cassandra-cli from any machine. If you run it from the same machine as your ruby code it's a reliable way to check you can connect to the cluster. Ok, next set of questions:
- what version of cassandra are you using? Is it 0.7?
- what require did you run? Was it require 'cassandra/0.7'? See the Usage section https://github.com/fauna/cassandra for details. Cassandra 0.7 defaults to framed transport, 0.6 does not.
Aaron

On 07 Dec, 2010, at 09:07 AM, Alberto Velandia betovelan...@gmail.com wrote: I've tried the keyspaces() function and got this in return:

compass.keyspaces()
CassandraThrift::Cassandra::Client::TransportException: CassandraThrift::Cassandra::Client::TransportException
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:53:in `rescue in open'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/socket.rb:36:in `open'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift-0.2.0.4/lib/thrift/transport/buffered_transport.rb:37:in `open'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/connection/socket.rb:11:in `connect!'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:82:in `connect!'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:110:in `handled_proxy'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/thrift_client-0.5.0/lib/thrift_client/abstract_thrift_client.rb:57:in `get_string_property'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:302:in `all_nodes'
from /home/compass/.rvm/gems/ruby-19.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:285:in `reconnect!'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:280:in `client'
from /home/compass/.rvm/gems/ruby-1.9.2...@rails3/gems/cassandra-0.8.2/lib/cassandra/cassandra.rb:86:in `keyspaces'
from (irb):4
from /home/compass/.rvm/rubies/ruby-1.9.2-p0/bin/irb:16:in `main'

About the cassandra-cli, should I run the command on the server from which I'm trying to connect? Thanks for the help

On Dec 6, 2010, at 2:52 PM, Aaron Morton wrote: What function are you calling to get data and what is the error? Try calling a function like keyspaces(), it should return a list of the keyspaces in your cluster and is a good way to test things are connected.
If there is still no joy, check you can connect to your cluster using the cassandra-cli command line app located in cassandra/bin. Aaron

On 07 Dec, 2010, at 08:46 AM, Alberto Velandia betovelan...@gmail.com wrote: Hi, I'm trying to create a connection to a server running cassandra doing this:

compass = Cassandra.new('Compas', servers='223.798.456.123:9160')

But once I try to get some data I realize that there's no connection, any ideas? Am I missing something? Thanks
Re: Newbie question about connecting to a cassandra server from another server using Fauna
I've found the solution, thanks for the help. I needed to change the addresses in storage-conf.xml, both ListenAddress and ThriftAddress, to the address of the server itself. Sorry about the snippet being incomplete btw.

On Dec 6, 2010, at 4:18 PM, Ryan King wrote: It would help if you gave us more context. The code snippet you've given us is incomplete and not very helpful. -ryan
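For anyone landing on the same problem, the 0.6 storage-conf.xml elements Alberto mentions look roughly like this; 10.0.0.5 is just a placeholder for the server's own address, not a value from this thread:

  <!-- storage-conf.xml (0.6.x): bind gossip and Thrift to the server's address, not localhost -->
  <ListenAddress>10.0.0.5</ListenAddress>
  <ThriftAddress>10.0.0.5</ThriftAddress>
  <ThriftPort>9160</ThriftPort>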
Pagination
How is pagination accomplished when you don't know a start key? For example, how can I jump to page 10?
Re: Pagination
Short answer: that's a bad idea; don't do it. Long answer: you could count 10 pages of results and jump there manually, which is what offset 10 * page_size is doing for you under the hood, but that gets slow quickly as your offset grows. Which is why you shouldn't do it with a SQL db either. On Mon, Dec 6, 2010 at 3:35 PM, Mark static.void@gmail.com wrote: How is pagination accomplished when you don't know a start key? For example, how can I jump to page 10? -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
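If you do need to page, the usual pattern is forward-only: remember the last key of the page you just rendered and use it as the start key of the next range query. The sketch below only illustrates that loop; fetch_keys() is a hypothetical stand-in (here stubbed with an in-memory list) for whatever range query your client exposes, not a real API.

  #include <cstdio>
  #include <string>
  #include <vector>

  // Hypothetical stand-in for a real range query: returns up to 'count' row keys
  // starting at 'start_key' (inclusive). In practice this would be a range query
  // against the cluster rather than a local list.
  static std::vector<std::string> fetch_keys(const std::string& start_key, int count) {
      static const std::vector<std::string> all = {"a", "b", "c", "d", "e", "f", "g"};
      std::vector<std::string> out;
      for (std::size_t i = 0; i < all.size() && (int)out.size() < count; ++i) {
          if (all[i] >= start_key) out.push_back(all[i]);
      }
      return out;
  }

  int main() {
      const int page_size = 3;
      std::string start_key = "";            // empty key = start of the range
      while (true) {
          // Ask for one extra key so we know where the next page begins.
          std::vector<std::string> keys = fetch_keys(start_key, page_size + 1);
          std::size_t shown = keys.size() < (std::size_t)page_size
                                  ? keys.size() : (std::size_t)page_size;
          for (std::size_t i = 0; i < shown; ++i)
              std::printf("%s\n", keys[i].c_str());   // render this page
          if (keys.size() <= (std::size_t)page_size)
              break;                                  // short result: no further pages
          start_key = keys[page_size];                // first key of the next page
          std::printf("-- next page --\n");
      }
      return 0;
  }

Jumping straight to an arbitrary page number still isn't possible this way; you only ever move forward one page at a time, which is exactly the trade-off being described above.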
Node goes AWOL briefly; failed replication does not report error to client, though consistency=ALL
I'm running a big test -- ten nodes with 3T disk each. I'm using 0.7.0rc1. After some tuning help (thanks Tyler) lots of this is working as it should. However, a serious event occurred as well -- the server froze up -- and though mutations were dropped, no error was reported to the client. Here's what the log said on host X.19:

WARN [ScheduledTasks:1] 2010-12-06 14:04:11,125 MessagingService.java (line 527) Dropped 76 MUTATION messages in the last 5000ms

Meanwhile, on the OTHER nodes, gossip decided the node was not available for a while:

INFO [ScheduledTasks:1] 2010-12-06 14:04:02,396 Gossiper.java (line 195) InetAddress /X.19 is now dead.
INFO [GossipStage:1] 2010-12-06 14:04:06,127 Gossiper.java (line 569) InetAddress /X.19 is now UP

And despite the fact that I was writing with consistency=ALL, none of my clients reported any errors on their mutations. Tyler has this information but I would like to know if anyone has seen this before, and/or has a diagnosis.
Re: If one seed node crash, how can I add one seed node?
Thanks Nick. After I add the new node as a seed node in the configuration for all of my nodes, do I need to restart all of my nodes?

2010/12/7 Nick Bailey n...@riptano.com: The node can be set as a seed node at any time. It does not need to be a seed node when it joins the cluster. You should remove it as a seed node, set autobootstrap to true and let it join the cluster. Once it has joined the cluster you should add it as a seed node in the configuration for all of your nodes.

On Mon, Dec 6, 2010 at 9:59 AM, lei liu liulei...@gmail.com wrote: Thanks Jonathan for your reply. How can I bootstrap the node into the cluster? I know that if the node is a seed node, I can't set AutoBootstrap to true.

2010/12/6 Jonathan Ellis jbel...@gmail.com: set it as a seed _after_ bootstrapping it into the cluster.

On Mon, Dec 6, 2010 at 5:01 AM, lei liu liulei...@gmail.com wrote: After one seed node crashed, I want to add one node as a seed node. I set auto_bootstrap to true, but the new node doesn't migrate data from other nodes. How can I add one new seed node and let the node migrate data from other nodes? Thanks, LiuLei

-- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
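A rough sketch of the cassandra.yaml (0.7) settings involved in the two steps Nick describes; the addresses are placeholders, not values from this thread:

  # While the new node is joining: do NOT list it in its own seed list,
  # and let it bootstrap.
  auto_bootstrap: true
  seeds:
      - 10.0.0.1      # an existing, live seed

  # After it has joined: add it to 'seeds' on every node, e.g.
  #   seeds:
  #       - 10.0.0.1
  #       - 10.0.0.9  # the newly joined node
  # As far as I know the yaml is only read at startup, so each node picks the
  # change up on its next restart; a rolling restart should be enough rather
  # than bouncing them all at once.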