Re: deletedAt and localDeletion
Thanks Ryan. So the values in the log line below are exactly those sentinels: deletedAt=-9223372036854775808 is Long.MIN_VALUE and localDeletion=2147483647 is Integer.MAX_VALUE, meaning the row carries no row-level deletion at all.

2015-01-06 20:21 GMT+01:00 Ryan Svihla r...@foundev.pro:

If you look at the source there are some useful comments regarding those specifics:
https://github.com/apache/cassandra/blob/8d8fed52242c34b477d0384ba1d1ce3978efbbe8/src/java/org/apache/cassandra/db/DeletionTime.java

    /**
     * A timestamp (typically in microseconds since the unix epoch, although this is not enforced) after which
     * data should be considered deleted. If set to Long.MIN_VALUE, this implies that the data has not been marked
     * for deletion at all.
     */
    public final long markedForDeleteAt;

    /**
     * The local server timestamp, in seconds since the unix epoch, at which this tombstone was created. This is
     * only used for purposes of purging the tombstone after gc_grace_seconds have elapsed.
     */
    public final int localDeletionTime;

On Mon, Jan 5, 2015 at 6:13 AM, Kais Ahmed k...@neteck-fr.com wrote:

Hi all, Can anyone explain what deletedAt and localDeletion mean in the SliceQueryFilter log?

SliceQueryFilter.java (line 225) Read 6 live and 2688 tombstoned cells in ks.mytable (see tombstone_warn_threshold). 10 columns was requested, slices=[-], delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}

Thanks,

--
Thanks, Ryan Svihla
deletedAt and localDeletion
Hi all, Can anyone explain what deletedAt and localDeletion mean in the SliceQueryFilter log line below?

SliceQueryFilter.java (line 225) Read 6 live and 2688 tombstoned cells in ks.mytable (see tombstone_warn_threshold). 10 columns was requested, slices=[-], delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}

Thanks,
Re: Anyone know when DSE will support Cassandra 2.1?
Hi, For Cassandra 2.0, DSE 4.0 comes with Cassandra 2.0.5. For Cassandra 1.2, DSE 3.1 comes with Cassandra 1.2.6. For Cassandra 2.1, you should wait for at least version 2.1.5.

2014-10-14 21:20 GMT+02:00 Jason Lewis jle...@packetnexus.com:

I can't find any info related to dates anywhere. jas
Re: assertion error on joining
I found the problem. Jira ticket: https://issues.apache.org/jira/browse/CASSANDRA-8081

2014-10-06 18:45 GMT+02:00 Kais Ahmed k...@neteck-fr.com:

Hi all, I'm a bit stuck: I want to expand my cluster (C* 2.0.6) but I encountered an error on the new node.

ERROR [FlushWriter:2] 2014-10-06 16:15:35,147 CassandraDaemon.java (line 199) Exception in thread Thread[FlushWriter:2,5,main]
java.lang.AssertionError: 394920
    at org.apache.cassandra.utils.ByteBufferUtil.writeWithShortLength(ByteBufferUtil.java:342)
    at org.apache.cassandra.db.ColumnIndex$Builder.maybeWriteRowHeader(ColumnIndex.java:201)
    at org.apache.cassandra.db.ColumnIndex$Builder.add(ColumnIndex.java:188)
    at org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:133)
    at org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:202)
    at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:187)
    ...

This assertion is here:

public static void writeWithShortLength(ByteBuffer buffer, DataOutput out) throws IOException
{
    int length = buffer.remaining();
    assert 0 <= length && length <= FBUtilities.MAX_UNSIGNED_SHORT : length;   // <-- the failing assert
    out.writeShort(length);
    write(buffer, out); // writing data bytes to output source
}

But I don't know what I can do to complete the bootstrap. Thanks,
assertion error on joining
Hi all, I'm a bit stuck: I want to expand my cluster (C* 2.0.6) but I encountered an error on the new node.

ERROR [FlushWriter:2] 2014-10-06 16:15:35,147 CassandraDaemon.java (line 199) Exception in thread Thread[FlushWriter:2,5,main]
java.lang.AssertionError: 394920
    at org.apache.cassandra.utils.ByteBufferUtil.writeWithShortLength(ByteBufferUtil.java:342)
    at org.apache.cassandra.db.ColumnIndex$Builder.maybeWriteRowHeader(ColumnIndex.java:201)
    at org.apache.cassandra.db.ColumnIndex$Builder.add(ColumnIndex.java:188)
    at org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:133)
    at org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:202)
    at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:187)
    ...

This assertion is here:

public static void writeWithShortLength(ByteBuffer buffer, DataOutput out) throws IOException
{
    int length = buffer.remaining();
    assert 0 <= length && length <= FBUtilities.MAX_UNSIGNED_SHORT : length;   // <-- the failing assert
    out.writeShort(length);
    write(buffer, out); // writing data bytes to output source
}

But I don't know what I can do to complete the bootstrap. Thanks,
Re: Migration from Cassandra 1.2.5 to Cassandra 2.0.8 with changed partitioner settings
Hi tsi, You have to upgrade to 1.2.9 first. From NEWS.txt for 2.0.0:

Upgrading
---------
- Java 7 is now *required*!
- Upgrading is ONLY supported from Cassandra 1.2.9 or later. This goes for sstable compatibility as well as network. When upgrading from an earlier release, upgrade to 1.2.9 first and run upgradesstables before proceeding to 2.0.

2014-07-30 8:51 GMT+02:00 Paco Trujillo f.truji...@genetwister.nl:

Hi tsi, We faced a similar situation a few months ago. In the end, what we did was develop a service in our application that migrated the data from the old Cassandra 1.2 cluster to the new Cassandra 2.0 cluster using CQL (with normal SELECT and UPDATE commands); our application at that moment held two Cassandra sessions, one for 1.2 and another for 2.0. I know it is not the perfect solution, but this way we could migrate the data without considering anything about the partitioner.

Kind regards
Francisco Trujillo - BioInformatic Developer, Genetwister Technologies B.V.

From: tsi [thorsten.s...@t-systems.com]
Sent: Tuesday, July 29, 2014 2:53 PM
To: cassandra-u...@incubator.apache.org
Subject: Re: Migration from Cassandra 1.2.5 to Cassandra 2.0.8 with changed partitioner settings

We are moving our application to a platform with this partitioner setting, and as far as I know there is no way to specify a partitioner on the keyspace level.

--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Migration-from-Cassandra-1-2-5-to-Cassandra-2-0-8-with-changed-partitioner-settings-tp7596019p7596021.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
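A rough per-node version of that path, as a sketch only (package and service names and the exact 2.0.x target are assumptions, and nodes should be done one at a time):

$ nodetool drain                          # flush memtables; node stops accepting writes
$ sudo service cassandra stop
$ sudo apt-get install cassandra=1.2.9    # or unpack the 1.2.9 tarball over the install
$ sudo service cassandra start
$ nodetool upgradesstables                # rewrite sstables in the current format
$ # once every node is on 1.2.9, repeat the same sequence with the 2.0.x package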
Re: Migration from Cassandra 1.2.5 to Cassandra 2.0.8 with changed partitioner settings
Sorry, my advice is not good for you. If you are moving to another platform with a different partitioner, I think sstableloader is the right tool for you.
http://www.datastax.com/docs/1.1/references/bulkloader
http://www.datastax.com/dev/blog/bulk-loading

2014-07-30 10:51 GMT+02:00 Hao Cheng br...@critica.io:

No idea if this will help, but have you tried the sstable2json and json2sstable utilities to output JSON from your old cluster and import it into the new one?

On Wed, Jul 30, 2014 at 1:40 AM, Kais Ahmed k...@neteck-fr.com wrote:

Hi tsi, You have to upgrade to 1.2.9 first. From NEWS.txt for 2.0.0: Upgrading is ONLY supported from Cassandra 1.2.9 or later; when upgrading from an earlier release, upgrade to 1.2.9 first and run upgradesstables before proceeding to 2.0.
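For reference, sstableloader streams sstables into a live cluster and re-distributes the data according to the target cluster's partitioner, which is exactly what a partitioner change needs. A minimal invocation looks like this (host addresses and the keyspace/table path are placeholders; the source directory must follow the <keyspace>/<table>/ layout):

$ sstableloader -d 10.0.0.1,10.0.0.2 /var/lib/cassandra/data/mykeyspace/mytable/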
Re: MemtablePostFlusher and FlushWriter
Thanks Christian, I'll check on my side. Do you have any idea about the FlushWriter 'All time blocked' count? Thanks,

2014-07-16 16:23 GMT+02:00 horschi hors...@gmail.com:

Hi Ahmed, this exception is caused by you creating rows with a key length of more than 64kb. Your key is 394920 bytes long, it seems. Keys and column names are limited to 64kb; only values may be larger. I cannot say for sure if this is the cause of your high MemtablePostFlusher pending count, but I would say it is possible.

kind regards, Christian

PS: I still use good old thrift lingo.

On Wed, Jul 16, 2014 at 3:14 PM, Kais Ahmed k...@neteck-fr.com wrote:

Hi chris, christian, Thanks for the reply; I'm not using DSE. I have this error in the log files, and it appears two times:

ERROR [FlushWriter:3456] 2014-07-01 18:25:33,607 CassandraDaemon.java (line 196) Exception in thread Thread[FlushWriter:3456,5,main]
java.lang.AssertionError: 394920
    at org.apache.cassandra.utils.ByteBufferUtil.writeWithShortLength(ByteBufferUtil.java:342)
    at org.apache.cassandra.db.ColumnIndex$Builder.maybeWriteRowHeader(ColumnIndex.java:201)
    at org.apache.cassandra.db.ColumnIndex$Builder.add(ColumnIndex.java:188)
    at org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:133)
    at org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:202)
    at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:187)
    at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:365)
    at org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:318)
    at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

It's the same error as in this link http://mail-archives.apache.org/mod_mbox/cassandra-user/201305.mbox/%3cbay169-w52699dd7a1c0007783f8d8a8...@phx.gbl%3E , with the same configuration: 2 nodes, RF 2, with SimpleStrategy. Hope this helps. Thanks,

2014-07-16 1:49 GMT+02:00 Chris Lohfink clohf...@blackbirdit.com:

The MemtablePostFlusher is also used for flushing non-cf backed (solr) indexes. Are you using DSE and solr by chance?

Chris

On Jul 15, 2014, at 5:01 PM, horschi hors...@gmail.com wrote:

I have seen this behaviour when commitlog files got deleted (or permissions were set to read-only). MemtablePostFlusher is the stage that marks the commitlog as flushed. When they fail it usually means there is something wrong with the commitlog files. Check your logfiles for any commitlog-related errors.

regards, Christian

On Tue, Jul 15, 2014 at 7:03 PM, Kais Ahmed k...@neteck-fr.com wrote:

Hi all, I have a small cluster (2 nodes, RF 2) running C* 2.0.6 on I2 Extra Large (AWS) with SSD disks; nodetool tpstats shows many MemtablePostFlusher pending and FlushWriter All time blocked. The two nodes have the default configuration. All CFs use the size-tiered compaction strategy. There are 10 times more reads than writes (1300 reads/s and 150 writes/s).
ubuntu@node1:~$ nodetool tpstats
Pool Name               Active   Pending   Completed   Blocked   All time blocked
MemtablePostFlusher          1      1158      159590         0                  0
FlushWriter                  0         0       11568         0               1031
ubuntu@node1:~$ nodetool compactionstats
pending tasks: 90
Active compaction remaining time : n/a

ubuntu@node2:~$ nodetool tpstats
Pool Name               Active   Pending   Completed   Blocked   All time blocked
MemtablePostFlusher          1      1020       50987         0                  0
FlushWriter                  0         0        6672         0                948
ubuntu@node2:~$ nodetool compactionstats
pending tasks: 89
Active compaction remaining time : n/a

I think there is something wrong, thank you for your help.
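As a quick client-side guard against the 64kb limit Christian mentions above (a trivial sketch; the limit is 65535 bytes, FBUtilities.MAX_UNSIGNED_SHORT, and applies to the byte length of row keys and column names, not the character count):

$ key='some-row-key'
$ printf '%s' "$key" | wc -c    # must be <= 65535 for row keys and column names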
Re: MemtablePostFlusher and FlushWriter
Hi chris, christian, Thanks for the reply; I'm not using DSE. I have this error in the log files, and it appears two times:

ERROR [FlushWriter:3456] 2014-07-01 18:25:33,607 CassandraDaemon.java (line 196) Exception in thread Thread[FlushWriter:3456,5,main]
java.lang.AssertionError: 394920
    at org.apache.cassandra.utils.ByteBufferUtil.writeWithShortLength(ByteBufferUtil.java:342)
    at org.apache.cassandra.db.ColumnIndex$Builder.maybeWriteRowHeader(ColumnIndex.java:201)
    at org.apache.cassandra.db.ColumnIndex$Builder.add(ColumnIndex.java:188)
    at org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:133)
    at org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:202)
    at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:187)
    at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:365)
    at org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:318)
    at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

It's the same error as in this link http://mail-archives.apache.org/mod_mbox/cassandra-user/201305.mbox/%3cbay169-w52699dd7a1c0007783f8d8a8...@phx.gbl%3E , with the same configuration: 2 nodes, RF 2, with SimpleStrategy. Hope this helps. Thanks,

2014-07-16 1:49 GMT+02:00 Chris Lohfink clohf...@blackbirdit.com:

The MemtablePostFlusher is also used for flushing non-cf backed (solr) indexes. Are you using DSE and solr by chance?

Chris

On Jul 15, 2014, at 5:01 PM, horschi hors...@gmail.com wrote:

I have seen this behaviour when commitlog files got deleted (or permissions were set to read-only). MemtablePostFlusher is the stage that marks the commitlog as flushed. When they fail it usually means there is something wrong with the commitlog files. Check your logfiles for any commitlog-related errors.

regards, Christian

On Tue, Jul 15, 2014 at 7:03 PM, Kais Ahmed k...@neteck-fr.com wrote:

Hi all, I have a small cluster (2 nodes, RF 2) running C* 2.0.6 on I2 Extra Large (AWS) with SSD disks; nodetool tpstats shows many MemtablePostFlusher pending and FlushWriter All time blocked. The two nodes have the default configuration. All CFs use the size-tiered compaction strategy. There are 10 times more reads than writes (1300 reads/s and 150 writes/s).

ubuntu@node1:~$ nodetool tpstats
Pool Name               Active   Pending   Completed   Blocked   All time blocked
MemtablePostFlusher          1      1158      159590         0                  0
FlushWriter                  0         0       11568         0               1031
ubuntu@node1:~$ nodetool compactionstats
pending tasks: 90
Active compaction remaining time : n/a

ubuntu@node2:~$ nodetool tpstats
Pool Name               Active   Pending   Completed   Blocked   All time blocked
MemtablePostFlusher          1      1020       50987         0                  0
FlushWriter                  0         0        6672         0                948
ubuntu@node2:~$ nodetool compactionstats
pending tasks: 89
Active compaction remaining time : n/a

I think there is something wrong, thank you for your help.
MemtablePostFlusher and FlushWriter
Hi all, I have a small cluster (2 nodes, RF 2) running C* 2.0.6 on I2 Extra Large (AWS) with SSD disks; nodetool tpstats shows many MemtablePostFlusher pending and FlushWriter All time blocked. The two nodes have the default configuration. All CFs use the size-tiered compaction strategy. There are 10 times more reads than writes (1300 reads/s and 150 writes/s).

ubuntu@node1:~$ nodetool tpstats
Pool Name               Active   Pending   Completed   Blocked   All time blocked
MemtablePostFlusher          1      1158      159590         0                  0
FlushWriter                  0         0       11568         0               1031
ubuntu@node1:~$ nodetool compactionstats
pending tasks: 90
Active compaction remaining time : n/a

ubuntu@node2:~$ nodetool tpstats
Pool Name               Active   Pending   Completed   Blocked   All time blocked
MemtablePostFlusher          1      1020       50987         0                  0
FlushWriter                  0         0        6672         0                948
ubuntu@node2:~$ nodetool compactionstats
pending tasks: 89
Active compaction remaining time : n/a

I think there is something wrong, thank you for your help.
Re: Crash with TombstoneOverwhelmingException
This threshold is there to prevent bad performance; you can increase the value.

2013/12/27 Sanjeeth Kumar sanje...@exotel.in:

Thanks for the replies. I don't think this is just a warning, incorrectly logged as an error. Every time there is a crash, this is the exact traceback I see in the logs. I just browsed through the code: it throws a TombstoneOverwhelmingException in these situations, and I did not see this being caught and handled anywhere. I might be wrong though. But I would also like to understand why this threshold value is important, so that I can set the right threshold. - Sanjeeth

On Fri, Dec 27, 2013 at 11:33 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

I do not think the feature is supposed to crash the server. It could be that the message is in the logs and the crash is not related to this message. WARN might be a better logging level for any message, even though the first threshold is WARN and the second is FAIL. ERROR is usually something more dramatic.

On Wed, Dec 25, 2013 at 1:02 PM, Laing, Michael michael.la...@nytimes.com wrote:

It's a feature. In the stock cassandra.yaml file for 2.0.3 see:

# When executing a scan, within or across a partition, we need to keep the
# tombstones seen in memory so we can return them to the coordinator, which
# will use them to make sure other replicas also know about the deleted rows.
# With workloads that generate a lot of tombstones, this can cause performance
# problems and even exhaust the server heap.
# (http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets)
# Adjust the thresholds here if you understand the dangers and want to
# scan more tombstones anyway. These thresholds may also be adjusted at runtime
# using the StorageService mbean.
tombstone_warn_threshold: 1000
tombstone_failure_threshold: 100000

You are hitting the failure threshold.

ml

On Wed, Dec 25, 2013 at 12:17 PM, Rahul Menon ra...@apigee.com wrote:

Sanjeeth, Looks like the error is being populated from the hinted handoff; what is the size of your hints cf?

Thanks Rahul

On Wed, Dec 25, 2013 at 8:54 PM, Sanjeeth Kumar sanje...@exotel.in wrote:

Hi all, One of my cassandra nodes crashes with the following exception periodically:

ERROR [HintedHandoff:33] 2013-12-25 20:29:22,276 SliceQueryFilter.java (line 200) Scanned over 100000 tombstones; query aborted (see tombstone_fail_threshold)
ERROR [HintedHandoff:33] 2013-12-25 20:29:22,278 CassandraDaemon.java (line 187) Exception in thread Thread[HintedHandoff:33,1,main]
org.apache.cassandra.db.filter.TombstoneOverwhelmingException
    at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:201)
    at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1487)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1306)
    at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:351)
    at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:309)
    at org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:92)
    at org.apache.cassandra.db.HintedHandOffManager$4.run(HintedHandOffManager.java:530)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

Why does this happen? Does this relate to any incorrect config value? The Cassandra version I'm running is ReleaseVersion: 2.0.3

- Sanjeeth
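If, after understanding where the tombstones come from, you still want to raise the limit, it is a cassandra.yaml change plus a restart on each node. A sketch (the config path and the new value are assumptions):

$ sudo sed -i 's/^tombstone_failure_threshold:.*/tombstone_failure_threshold: 200000/' /etc/cassandra/cassandra.yaml
$ sudo service cassandra restart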
Re: Crash with TombstoneOverwhelmingException
You can read the comments about this new feature here: https://issues.apache.org/jira/browse/CASSANDRA-6117

2013/12/27 Kais Ahmed k...@neteck-fr.com:

This threshold is there to prevent bad performance; you can increase the value.

2013/12/27 Sanjeeth Kumar sanje...@exotel.in:

Thanks for the replies. I don't think this is just a warning, incorrectly logged as an error. Every time there is a crash, this is the exact traceback I see in the logs. I just browsed through the code: it throws a TombstoneOverwhelmingException in these situations, and I did not see this being caught and handled anywhere. I might be wrong though. But I would also like to understand why this threshold value is important, so that I can set the right threshold. - Sanjeeth

On Fri, Dec 27, 2013 at 11:33 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

I do not think the feature is supposed to crash the server. It could be that the message is in the logs and the crash is not related to this message. WARN might be a better logging level for any message, even though the first threshold is WARN and the second is FAIL. ERROR is usually something more dramatic.

On Wed, Dec 25, 2013 at 1:02 PM, Laing, Michael michael.la...@nytimes.com wrote:

It's a feature. In the stock cassandra.yaml file for 2.0.3 see:

# When executing a scan, within or across a partition, we need to keep the
# tombstones seen in memory so we can return them to the coordinator, which
# will use them to make sure other replicas also know about the deleted rows.
# With workloads that generate a lot of tombstones, this can cause performance
# problems and even exhaust the server heap.
# (http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets)
# Adjust the thresholds here if you understand the dangers and want to
# scan more tombstones anyway. These thresholds may also be adjusted at runtime
# using the StorageService mbean.
tombstone_warn_threshold: 1000
tombstone_failure_threshold: 100000

You are hitting the failure threshold.

ml

On Wed, Dec 25, 2013 at 12:17 PM, Rahul Menon ra...@apigee.com wrote:

Sanjeeth, Looks like the error is being populated from the hinted handoff; what is the size of your hints cf?

Thanks Rahul

On Wed, Dec 25, 2013 at 8:54 PM, Sanjeeth Kumar sanje...@exotel.in wrote:

Hi all, One of my cassandra nodes crashes with the following exception periodically:

ERROR [HintedHandoff:33] 2013-12-25 20:29:22,276 SliceQueryFilter.java (line 200) Scanned over 100000 tombstones; query aborted (see tombstone_fail_threshold)
ERROR [HintedHandoff:33] 2013-12-25 20:29:22,278 CassandraDaemon.java (line 187) Exception in thread Thread[HintedHandoff:33,1,main]
org.apache.cassandra.db.filter.TombstoneOverwhelmingException
    at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:201)
    at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1487)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1306)
    at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:351)
    at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:309)
    at org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:92)
    at org.apache.cassandra.db.HintedHandOffManager$4.run(HintedHandOffManager.java:530)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

Why does this happen? Does this relate to any incorrect config value? The Cassandra version I'm running is ReleaseVersion: 2.0.3

- Sanjeeth
Re: Configuration of multiple nodes in a single machine
You can try https://github.com/pcmanus/ccm

2013/12/2 Santosh Shet santosh.s...@vista-one-solutions.com:

Hi, I have a requirement to add multiple nodes to an existing single-node Cassandra cluster. I am new to Cassandra, and I don't know whether I need to set up multiple nodes on a single machine by copying the Cassandra configuration, or whether I require separate physical machines to create multiple nodes. But I have tried doing it on a single machine by following the steps below.

1. Download Apache Cassandra
2. Unzip the Cassandra file
3. Go inside the /cassandra folder
4. Make 2 copies of the /conf folder, named /conf2 and /conf3 (in the case of 3 nodes)
5. Go inside each conf folder and edit cassandra.yaml:
   - Name of the cluster
   - Specify the seed node
   - Change the listen address
   - Change rpc_address
   - Change the locations of data_file, commitlog and saved_caches
6. Go to cassandra-env.sh under /conf and change JMX_PORT
7. Go to log4j-server.properties under /conf and modify the property log4j.appender.R.File if needed
8. Make 2 copies of cassandra.in.sh under /bin and edit CASSANDRA_HOME
9. Make copies of the cassandra script under /bin, specifying which conf folder each has to use
10. Create aliases to localhost using the command below:

sudo ifconfig lo0 alias 127.0.0.2 up

I have followed all the above steps, but when I try to do step 10 it gives the following error. Please could you provide some inputs.

Santosh Shet
Software Engineer | VistaOne Solutions
Direct India: +91 80 30273829 | Mobile India: +91 8105720582
Skype: santushet
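For comparison, ccm automates all of those steps. A minimal session looks roughly like this (a sketch; the cluster name and Cassandra version are arbitrary, and on some systems you still need the loopback aliases from step 10 for 127.0.0.2 and 127.0.0.3):

$ ccm create test -v 2.0.11 -n 3 -s    # download 2.0.11, build a 3-node cluster on 127.0.0.1-3, start it
$ ccm status                           # confirm all three nodes are up
$ ccm node1 ring                       # inspect the ring from node1's point of view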
ccm and loading data
Hi all, I'm trying to do some tests using ccm (https://github.com/pcmanus/ccm). How can I import data into it? I copied some sstables from production, but ccm node1 refresh does not exist. Can anyone tell me which method I can use? Thanks,
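One way that should work, as a sketch (the ~/.ccm layout is ccm's default, and the cluster/keyspace/table names here are placeholders): stop the node, copy the sstables — all components, not just the Data.db file — into its data directory, and start it again so they are loaded on startup.

$ ccm node1 stop
$ cp /path/from/production/mykeyspace-mytable-* ~/.ccm/test/node1/data/mykeyspace/mytable/
$ ccm node1 start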
Re: MemtablePostFlusher pending
Hello aaron, I hope you had a nice flight.

Any information on how you are using cassandra, does the zero columns no row delete idea sound like something you are doing ?

I do not know what I could do, but we use an old version of phpcassa (0.8.a.2) that is not explicitly compatible with Cassandra 2.0, though it works fine for us.

Another thing that could help: when I created the keyspace I did:

CREATE KEYSPACE ks01 WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1b': 3 };
USE ks01 ;
DROP KEYSPACE ks01 ;
CREATE KEYSPACE ks01 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

In Jira I saw that creating a keyspace after a drop has already caused a bug in the past (CASSANDRA-4219).

Thanks, Kais

2013/10/23 Aaron Morton aa...@thelastpickle.com:

On a plane and cannot check jira but…

ERROR [FlushWriter:216] 2013-10-07 07:11:46,538 CassandraDaemon.java (line 186) Exception in thread Thread[FlushWriter:216,5,main]
java.lang.AssertionError
    at org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:198)

Happened because we tried to write a row to disk that had zero columns and was not a row level tombstone.

ERROR [ValidationExecutor:2] 2013-10-23 08:39:27,558 CassandraDaemon.java (line 185) Exception in thread Thread[ValidationExecutor:2,1,main]
java.lang.AssertionError
    at org.apache.cassandra.db.compaction.PrecompactedRow.update(PrecompactedRow.java:171)
    at org.apache.cassandra.repair.Validator.rowHash(Validator.java:198)
    at org.apache.cassandra.repair.Validator.add(Validator.java:151)

I *think* this is happening for similar reasons. (notes to self below)…

public PrecompactedRow(CompactionController controller, List<SSTableIdentityIterator> rows)
{
    this(rows.get(0).getKey(), removeDeletedAndOldShards(rows.get(0).getKey(), controller, merge(rows, controller)));
}

results in a call to this on CFS:

public static ColumnFamily removeDeletedCF(ColumnFamily cf, int gcBefore)
{
    cf.maybeResetDeletionTimes(gcBefore);
    return cf.getColumnCount() == 0 && !cf.isMarkedForDelete() ? null : cf;
}

If the CF has zero columns and is not marked for delete, the CF will be null, and the PrecompactedRow will be created with a null cf. This is the source of the assertion.

Any information on how you are using cassandra, does the zero columns no row delete idea sound like something you are doing ?

This may already be fixed. Will take a look later when on the ground.

Cheers - Aaron Morton, New Zealand, @aaronmorton
Co-Founder & Principal Consultant, Apache Cassandra Consulting, http://www.thelastpickle.com

On 23/10/2013, at 9:50 PM, Kais Ahmed k...@neteck-fr.com wrote:

Thanks robert. For info, if it helps to fix the bug: I'm starting the downgrade. I restarted all the nodes and ran a repair, and there are a lot of errors like this:

ERROR [ValidationExecutor:2] 2013-10-23 08:39:27,558 Validator.java (line 242) Failed creating a merkle tree for [repair #9f9b7fc0-3bbe-11e3-a220-b18f7c69b044 on ks01/messages, (8746393670077301406,8763948586274310360]], /172.31.38.135 (see log for details)
ERROR [ValidationExecutor:2] 2013-10-23 08:39:27,558 CassandraDaemon.java (line 185) Exception in thread Thread[ValidationExecutor:2,1,main]
java.lang.AssertionError
    at org.apache.cassandra.db.compaction.PrecompactedRow.update(PrecompactedRow.java:171)
    at org.apache.cassandra.repair.Validator.rowHash(Validator.java:198)
    at org.apache.cassandra.repair.Validator.add(Validator.java:151)
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:798)
    at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:60)
    at org.apache.cassandra.db.compaction.CompactionManager$8.call(CompactionManager.java:395)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

And the repair stops after this error:

ERROR [FlushWriter:9] 2013-10-23 08:39:32,979 CassandraDaemon.java (line 185) Exception in thread Thread[FlushWriter:9,5,main]
java.lang.AssertionError
    at org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:198)
    at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:186)
    at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:358)
    at org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:317)
    at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
    at org.apache.cassandra.utils.WrappedRunnable.run
Re: MemtablePostFlusher pending
Thanks robert. For info, if it helps to fix the bug: I'm starting the downgrade. I restarted all the nodes and ran a repair, and there are a lot of errors like this:

ERROR [ValidationExecutor:2] 2013-10-23 08:39:27,558 Validator.java (line 242) Failed creating a merkle tree for [repair #9f9b7fc0-3bbe-11e3-a220-b18f7c69b044 on ks01/messages, (8746393670077301406,8763948586274310360]], /172.31.38.135 (see log for details)
ERROR [ValidationExecutor:2] 2013-10-23 08:39:27,558 CassandraDaemon.java (line 185) Exception in thread Thread[ValidationExecutor:2,1,main]
java.lang.AssertionError
    at org.apache.cassandra.db.compaction.PrecompactedRow.update(PrecompactedRow.java:171)
    at org.apache.cassandra.repair.Validator.rowHash(Validator.java:198)
    at org.apache.cassandra.repair.Validator.add(Validator.java:151)
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:798)
    at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:60)
    at org.apache.cassandra.db.compaction.CompactionManager$8.call(CompactionManager.java:395)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

And the repair stops after this error:

ERROR [FlushWriter:9] 2013-10-23 08:39:32,979 CassandraDaemon.java (line 185) Exception in thread Thread[FlushWriter:9,5,main]
java.lang.AssertionError
    at org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:198)
    at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:186)
    at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:358)
    at org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:317)
    at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

2013/10/22 Robert Coli rc...@eventbrite.com:

On Mon, Oct 21, 2013 at 11:57 PM, Kais Ahmed k...@neteck-fr.com wrote: I will try to create a new 1.2 cluster and copy the data; can you tell me please the best practice to do this — do I have to use sstable2json/json2sstable or another method?

Unfortunately, to downgrade versions you are going to need to use a method like sstable2json/json2sstable. Other bulkload options, which mostly don't apply in the downgrade case, here: http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

=Rob
Re: MemtablePostFlusher pending
Thanks Robert, you are right: I upgraded to 2.0.1 and it didn't fix the problem; 2 of 3 nodes raise the same error after 30 minutes. I will try to create a new 1.2 cluster and copy the data; can you tell me please the best practice to do this — do I have to use sstable2json/json2sstable or another method? Thanks,

2013/10/21 Robert Coli rc...@eventbrite.com:

On Mon, Oct 21, 2013 at 2:17 AM, Kais Ahmed k...@neteck-fr.com wrote: We have recently put a new cluster into production, C* 2.0.0 with 3 nodes, RF 3.

https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/ What Version of Cassandra Should I Run in Production?

If I were you I would probably try to read my data into a 1.2.x cluster. Downgrading versions on the same cluster is unlikely to work. You could try going to 2.0.1, but there is no compelling reason to believe this will fix your problem.

=Rob
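For the record, the sstable2json/json2sstable round trip Rob mentions works per sstable, roughly like this (a sketch; file names and paths are placeholders, every sstable of the column family must be dumped, and json2sstable writes sstables in the format of the version it ships with):

$ sstable2json /var/lib/cassandra/data/ks01/messages/ks01-messages-jb-1-Data.db > messages-1.json
$ json2sstable -K ks01 -c messages messages-1.json /tmp/new/ks01-messages-jb-1-Data.db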
Re: How to determine which node(s) an insert would go to in C* 2.0 with vnodes?
hi, you can try: nodetool getendpoints <keyspace> <cf> <key> — print the endpoints that own the key.

2013/10/8 Christopher Wirt chris.w...@struq.com:

In CQL there is a token() function you can use to find the result of your partitioning scheme's hash function for any value.

e.g. select token(value) from column_family1 where partition_column = value;

You then need to find out which nodes are responsible for that value, using nodetool ring or by looking at the system.peers table for tokens.

Not that straightforward, esp. with 100 nodes and vnodes. Maybe someone has written a script or something to do this already?

Or I suppose you could turn on tracing and repeat the query until you've seen it hit three different end nodes? i.e. tracing on; select * from column_family1 where partition_column = value;

From: Sameer Farooqui [mailto:sam...@blueplastic.com]
Sent: 08 October 2013 10:20
To: user@cassandra.apache.org
Subject: How to determine which node(s) an insert would go to in C* 2.0 with vnodes?

Hi, When using C* 2.0 in a large 100-node cluster with the Murmur3 partitioner, vnodes and 256 tokens assigned to each node, is it possible to find out where a certain key is destined to go? If the keyspace has replication factor = 3, then a specific key like 'row-1' would be destined for 3 nodes, right? Is there a way I can pre-determine which 3 of the 100 nodes the insert of 'row-1' would go to? Or alternatively, after I've already written 'row-1', can I find out which 3 nodes it went to?
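For example (keyspace, table and key are hypothetical):

$ nodetool getendpoints mykeyspace mytable row-1    # prints the replica addresses for that key, one per line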
Re: Connecting to a remote cassandra node..
hello, you have to check listen_address in cassandra.yaml: change the localhost value to the IP of the machine and restart Cassandra.

2013/9/27 Krishna Chaitanya bnsk1990r...@gmail.com:

Hello, I am relatively new to Cassandra. I am using a library called libQtCassandra for accessing the Cassandra database from my C++ programs. When I try to connect to the localhost Cassandra, everything seems fine, but when I try to connect to a remote node on which Cassandra is up and running, it says connection refused. Any help would be of great value. Thank You...

--
Regards, BNSK
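A minimal sketch of that change, assuming a package install with the config at /etc/cassandra/cassandra.yaml and 10.0.0.5 standing in for the node's real address (for Thrift clients, rpc_address may need the same treatment):

$ sudo sed -i 's/^listen_address:.*/listen_address: 10.0.0.5/' /etc/cassandra/cassandra.yaml
$ sudo sed -i 's/^rpc_address:.*/rpc_address: 10.0.0.5/' /etc/cassandra/cassandra.yaml
$ sudo service cassandra restart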
Re: Installing specific version
Hi ben, You can get it from http://archive.apache.org/dist/cassandra/

2013/7/5 Ben Gambley ben.gamb...@intoscience.com:

Hi all, Can anyone point me in the right direction for installing a specific version from the DataStax repo? We need 1.2.4 to keep consistent with our QA environment. It's for a new prod cluster, on Debian 6. I thought it may be a value in /etc/apt/sources.list? The latest 1.2.6 does not appear compatible with our phpcassa thrift drivers. After many late nights my google ability seems to have evaporated! Cheers Ben
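If the repository still carries the old package, you can usually pin the exact version with apt instead of installing from the archive tarball (a sketch; the package name depends on which repository you use):

$ apt-cache showpkg cassandra              # list the versions the repo still offers
$ sudo apt-get install cassandra=1.2.4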
Re: Cassandra read repair
Thanks aaron

2013/5/28 aaron morton aa...@thelastpickle.com:

Start using QUORUM for reads and writes and then run a nodetool repair. That should get you back to the land of the consistent.

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 27/05/2013, at 10:01 PM, Kais Ahmed k...@neteck-fr.com wrote:

Hi aaron, I was sure that phpcassa uses QUORUM for W and R by default, but you're right, the default CL for R and W is ONE. We are in the W + R <= N configuration; what can I do to repair some keys that always return inconsistent data?

Thanks,

2013/5/24 Kais Ahmed k...@neteck-fr.com

Hi aaron, and thanks.

If you are reading and writing at CL QUORUM and getting inconsistent results that sounds like a bug. If you are mixing the CL levels such that R + W <= N then it's expected behaviour.

I think it's a bug; it concerns only some keys (~200 out of 120,000) on one column family. As I understand it, if R+W <= N it's expected behaviour at a given moment, but read repair will correct the inconsistent data (related to the read_repair_chance value) and the next query will return consistent data, right?

Here is an example of the results I get on one key (keyspace RF 3), 3 different replicas:

[default@prod] ASSUME contacts_timeordered KEYS AS ascii;
[default@prod] get contacts_timeordered['1425185-IGNORED'];
=> (column=1a927740-97ec-11e2-ab16-a1afd66a735e, value=363838353936, timestamp=1364505108842098)
=> (column=1a93d5b0-97ec-11e2-888c-2bf068e0f754, value=31373930303330, timestamp=1364505108851088)
=> (column=b5c559c0-9d0f-11e2-8682-f7ecd4112689, value=32343130303930, timestamp=1365070157421869)
=> (column=7ba22b90-b48b-11e2-a2c2-914573921d9f, value=32353031353039, timestamp=1367652194221857)
=> (column=63ef5d80-b7e8-11e2-abf8-593c289227cd, value=32383435323830, timestamp=1368021951146575)
=> (column=d6383fc0-b810-11e2-a880-bd2ecacbaee3, value=31363334363737, timestamp=1368039322753824)
=> (column=f47d8e60-bd3f-11e2-88f4-533a93fe9432, value=32373938313038, timestamp=1368609315699785)
=> (column=c5bfe060-bf8e-11e2-ab1f-07be407aff58, value=32333634353034, timestamp=1368863069848610)
=> (column=f07ae4b0-c42f-11e2-8064-9794e872eb2b, value=363838353936, timestamp=1369372095163129)
Returned 9 results. Elapsed time: 10 msec(s).

[default@prod] get contacts_timeordered['1425185-IGNORED'];
=> (column=b5c559c0-9d0f-11e2-8682-f7ecd4112689, value=32343130303930, timestamp=1365070157421869)
=> (column=7ba22b90-b48b-11e2-a2c2-914573921d9f, value=32353031353039, timestamp=1367652194221857)
=> (column=63ef5d80-b7e8-11e2-abf8-593c289227cd, value=32383435323830, timestamp=1368021951146575)
=> (column=d6383fc0-b810-11e2-a880-bd2ecacbaee3, value=31363334363737, timestamp=1368039322753824)
=> (column=f47d8e60-bd3f-11e2-88f4-533a93fe9432, value=32373938313038, timestamp=1368609315699785)
=> (column=c5bfe060-bf8e-11e2-ab1f-07be407aff58, value=32333634353034, timestamp=1368863069848610)
=> (column=f07ae4b0-c42f-11e2-8064-9794e872eb2b, value=363838353936, timestamp=1369372095163129)
Returned 7 results. Elapsed time: 7.49 msec(s).

[default@prod] get contacts_timeordered['1425185-IGNORED'];
=> (column=1a93d5b0-97ec-11e2-888c-2bf068e0f754, value=31373930303330, timestamp=1364505108851088)
=> (column=b5c559c0-9d0f-11e2-8682-f7ecd4112689, value=32343130303930, timestamp=1365070157421869)
=> (column=7ba22b90-b48b-11e2-a2c2-914573921d9f, value=32353031353039, timestamp=1367652194221857)
=> (column=63ef5d80-b7e8-11e2-abf8-593c289227cd, value=32383435323830, timestamp=1368021951146575)
=> (column=d6383fc0-b810-11e2-a880-bd2ecacbaee3, value=31363334363737, timestamp=1368039322753824)
=> (column=f47d8e60-bd3f-11e2-88f4-533a93fe9432, value=32373938313038, timestamp=1368609315699785)
=> (column=c5bfe060-bf8e-11e2-ab1f-07be407aff58, value=32333634353034, timestamp=1368863069848610)
=> (column=f07ae4b0-c42f-11e2-8064-9794e872eb2b, value=363838353936, timestamp=1369372095163129)
Returned 8 results. Elapsed time: 9.37 msec(s).

Do I have to change read_repair_chance to 1 to correct the inconsistency? nodetool repair doesn't solve it.

Thanks a lot,

2013/5/23 aaron morton aa...@thelastpickle.com

If you are reading and writing at CL QUORUM and getting inconsistent results that sounds like a bug. If you are mixing the CL levels such that R + W <= N then it's expected behaviour. Can you reproduce the issue outside of your app ?

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 21/05/2013, at 8:55 PM, Kais Ahmed k...@neteck-fr.com wrote:

Checking you do not mean the row key is corrupt and cannot be read.

Yes, I can read it, but all reads don't return the same result except at CL ALL.

By default in 1.X and beyond the default read repair chance is 0.1, so it's only enabled on 10% of requests. You
Re: Buggy JRE error
Hi chiddu, You have to configure your operating system to use the Oracle JRE instead of OpenJDK. http://www.datastax.com/docs/1.0/install/install_jre

2013/5/27 S Chidambaran chi...@gmail.com:

I get these errors frequently as Cassandra starts up. I'm using the official Java distribution from Ubuntu.

WARN 08:11:48,145 MemoryMeter uninitialized (jamm not specified as java agent); assuming liveRatio of 10.0. Usually this means cassandra-env.sh disabled jamm because you are using a buggy JRE; upgrade to the Sun JRE instead

java version "1.6.0_27"
OpenJDK Runtime Environment (IcedTea6 1.12.5) (6b27-1.12.5-0ubuntu0.12.04.1)
OpenJDK Server VM (build 20.0-b12, mixed mode)

Any idea on how to fix this?

Regards Chiddu
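On Ubuntu, once the Oracle JRE is installed, switching the system default is typically done with update-alternatives (a sketch; the Oracle install itself varies by release):

$ sudo update-alternatives --config java    # select the Oracle JRE from the menu
$ java -version                             # should no longer report OpenJDK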
Re: Cassandra read repair
Hi aaron, I was sure that phpcassa uses QUORUM for W and R by default, but you're right, the default CL for R and W is ONE. We are in the W + R <= N configuration; what can I do to repair some keys that always return inconsistent data?

Thanks,

2013/5/24 Kais Ahmed k...@neteck-fr.com

Hi aaron, and thanks.

If you are reading and writing at CL QUORUM and getting inconsistent results that sounds like a bug. If you are mixing the CL levels such that R + W <= N then it's expected behaviour.

I think it's a bug; it concerns only some keys (~200 out of 120,000) on one column family. As I understand it, if R+W <= N it's expected behaviour at a given moment, but read repair will correct the inconsistent data (related to the read_repair_chance value) and the next query will return consistent data, right?

Here is an example of the results I get on one key (keyspace RF 3), 3 different replicas:

[default@prod] ASSUME contacts_timeordered KEYS AS ascii;
[default@prod] get contacts_timeordered['1425185-IGNORED'];
=> (column=1a927740-97ec-11e2-ab16-a1afd66a735e, value=363838353936, timestamp=1364505108842098)
=> (column=1a93d5b0-97ec-11e2-888c-2bf068e0f754, value=31373930303330, timestamp=1364505108851088)
=> (column=b5c559c0-9d0f-11e2-8682-f7ecd4112689, value=32343130303930, timestamp=1365070157421869)
=> (column=7ba22b90-b48b-11e2-a2c2-914573921d9f, value=32353031353039, timestamp=1367652194221857)
=> (column=63ef5d80-b7e8-11e2-abf8-593c289227cd, value=32383435323830, timestamp=1368021951146575)
=> (column=d6383fc0-b810-11e2-a880-bd2ecacbaee3, value=31363334363737, timestamp=1368039322753824)
=> (column=f47d8e60-bd3f-11e2-88f4-533a93fe9432, value=32373938313038, timestamp=1368609315699785)
=> (column=c5bfe060-bf8e-11e2-ab1f-07be407aff58, value=32333634353034, timestamp=1368863069848610)
=> (column=f07ae4b0-c42f-11e2-8064-9794e872eb2b, value=363838353936, timestamp=1369372095163129)
Returned 9 results. Elapsed time: 10 msec(s).

[default@prod] get contacts_timeordered['1425185-IGNORED'];
=> (column=b5c559c0-9d0f-11e2-8682-f7ecd4112689, value=32343130303930, timestamp=1365070157421869)
=> (column=7ba22b90-b48b-11e2-a2c2-914573921d9f, value=32353031353039, timestamp=1367652194221857)
=> (column=63ef5d80-b7e8-11e2-abf8-593c289227cd, value=32383435323830, timestamp=1368021951146575)
=> (column=d6383fc0-b810-11e2-a880-bd2ecacbaee3, value=31363334363737, timestamp=1368039322753824)
=> (column=f47d8e60-bd3f-11e2-88f4-533a93fe9432, value=32373938313038, timestamp=1368609315699785)
=> (column=c5bfe060-bf8e-11e2-ab1f-07be407aff58, value=32333634353034, timestamp=1368863069848610)
=> (column=f07ae4b0-c42f-11e2-8064-9794e872eb2b, value=363838353936, timestamp=1369372095163129)
Returned 7 results. Elapsed time: 7.49 msec(s).

[default@prod] get contacts_timeordered['1425185-IGNORED'];
=> (column=1a93d5b0-97ec-11e2-888c-2bf068e0f754, value=31373930303330, timestamp=1364505108851088)
=> (column=b5c559c0-9d0f-11e2-8682-f7ecd4112689, value=32343130303930, timestamp=1365070157421869)
=> (column=7ba22b90-b48b-11e2-a2c2-914573921d9f, value=32353031353039, timestamp=1367652194221857)
=> (column=63ef5d80-b7e8-11e2-abf8-593c289227cd, value=32383435323830, timestamp=1368021951146575)
=> (column=d6383fc0-b810-11e2-a880-bd2ecacbaee3, value=31363334363737, timestamp=1368039322753824)
=> (column=f47d8e60-bd3f-11e2-88f4-533a93fe9432, value=32373938313038, timestamp=1368609315699785)
=> (column=c5bfe060-bf8e-11e2-ab1f-07be407aff58, value=32333634353034, timestamp=1368863069848610)
=> (column=f07ae4b0-c42f-11e2-8064-9794e872eb2b, value=363838353936, timestamp=1369372095163129)
Returned 8 results. Elapsed time: 9.37 msec(s).

Do I have to change read_repair_chance to 1 to correct the inconsistency? nodetool repair doesn't solve it.

Thanks a lot,

2013/5/23 aaron morton aa...@thelastpickle.com

If you are reading and writing at CL QUORUM and getting inconsistent results that sounds like a bug. If you are mixing the CL levels such that R + W <= N then it's expected behaviour. Can you reproduce the issue outside of your app ?

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 21/05/2013, at 8:55 PM, Kais Ahmed k...@neteck-fr.com wrote:

Checking you do not mean the row key is corrupt and cannot be read.

Yes, I can read it, but all reads don't return the same result except at CL ALL.

You are right, read repair chance is set to 0.1, but I launched a read repair which did not solve the problem. Any idea?

What CL are you writing at ?

All writes are at CL QUORUM. Thank you, aaron, for your answer.

2013/5/21 aaron morton aa...@thelastpickle.com

Only some keys of one CF are corrupt.

Checking you do not mean the row key is corrupt and cannot be read.

I thought using CL ALL, would
Re: Cassandra read repair
Hi aaron, and thanks.

If you are reading and writing at CL QUORUM and getting inconsistent results that sounds like a bug. If you are mixing the CL levels such that R + W <= N then it's expected behaviour.

I think it's a bug; it concerns only some keys (~200 out of 120,000) on one column family. As I understand it, if R+W <= N it's expected behaviour at a given moment, but read repair will correct the inconsistent data (related to the read_repair_chance value) and the next query will return consistent data, right?

Here is an example of the results I get on one key (keyspace RF 3), 3 different replicas:

[default@prod] ASSUME contacts_timeordered KEYS AS ascii;
[default@prod] get contacts_timeordered['1425185-IGNORED'];
=> (column=1a927740-97ec-11e2-ab16-a1afd66a735e, value=363838353936, timestamp=1364505108842098)
=> (column=1a93d5b0-97ec-11e2-888c-2bf068e0f754, value=31373930303330, timestamp=1364505108851088)
=> (column=b5c559c0-9d0f-11e2-8682-f7ecd4112689, value=32343130303930, timestamp=1365070157421869)
=> (column=7ba22b90-b48b-11e2-a2c2-914573921d9f, value=32353031353039, timestamp=1367652194221857)
=> (column=63ef5d80-b7e8-11e2-abf8-593c289227cd, value=32383435323830, timestamp=1368021951146575)
=> (column=d6383fc0-b810-11e2-a880-bd2ecacbaee3, value=31363334363737, timestamp=1368039322753824)
=> (column=f47d8e60-bd3f-11e2-88f4-533a93fe9432, value=32373938313038, timestamp=1368609315699785)
=> (column=c5bfe060-bf8e-11e2-ab1f-07be407aff58, value=32333634353034, timestamp=1368863069848610)
=> (column=f07ae4b0-c42f-11e2-8064-9794e872eb2b, value=363838353936, timestamp=1369372095163129)
Returned 9 results. Elapsed time: 10 msec(s).

[default@prod] get contacts_timeordered['1425185-IGNORED'];
=> (column=b5c559c0-9d0f-11e2-8682-f7ecd4112689, value=32343130303930, timestamp=1365070157421869)
=> (column=7ba22b90-b48b-11e2-a2c2-914573921d9f, value=32353031353039, timestamp=1367652194221857)
=> (column=63ef5d80-b7e8-11e2-abf8-593c289227cd, value=32383435323830, timestamp=1368021951146575)
=> (column=d6383fc0-b810-11e2-a880-bd2ecacbaee3, value=31363334363737, timestamp=1368039322753824)
=> (column=f47d8e60-bd3f-11e2-88f4-533a93fe9432, value=32373938313038, timestamp=1368609315699785)
=> (column=c5bfe060-bf8e-11e2-ab1f-07be407aff58, value=32333634353034, timestamp=1368863069848610)
=> (column=f07ae4b0-c42f-11e2-8064-9794e872eb2b, value=363838353936, timestamp=1369372095163129)
Returned 7 results. Elapsed time: 7.49 msec(s).

[default@prod] get contacts_timeordered['1425185-IGNORED'];
=> (column=1a93d5b0-97ec-11e2-888c-2bf068e0f754, value=31373930303330, timestamp=1364505108851088)
=> (column=b5c559c0-9d0f-11e2-8682-f7ecd4112689, value=32343130303930, timestamp=1365070157421869)
=> (column=7ba22b90-b48b-11e2-a2c2-914573921d9f, value=32353031353039, timestamp=1367652194221857)
=> (column=63ef5d80-b7e8-11e2-abf8-593c289227cd, value=32383435323830, timestamp=1368021951146575)
=> (column=d6383fc0-b810-11e2-a880-bd2ecacbaee3, value=31363334363737, timestamp=1368039322753824)
=> (column=f47d8e60-bd3f-11e2-88f4-533a93fe9432, value=32373938313038, timestamp=1368609315699785)
=> (column=c5bfe060-bf8e-11e2-ab1f-07be407aff58, value=32333634353034, timestamp=1368863069848610)
=> (column=f07ae4b0-c42f-11e2-8064-9794e872eb2b, value=363838353936, timestamp=1369372095163129)
Returned 8 results. Elapsed time: 9.37 msec(s).

Do I have to change read_repair_chance to 1 to correct the inconsistency? nodetool repair doesn't solve it.

Thanks a lot,

2013/5/23 aaron morton aa...@thelastpickle.com

If you are reading and writing at CL QUORUM and getting inconsistent results that sounds like a bug. If you are mixing the CL levels such that R + W <= N then it's expected behaviour. Can you reproduce the issue outside of your app ?

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 21/05/2013, at 8:55 PM, Kais Ahmed k...@neteck-fr.com wrote:

Checking you do not mean the row key is corrupt and cannot be read.

Yes, I can read it, but all reads don't return the same result except at CL ALL.

You are right, read repair chance is set to 0.1, but I launched a read repair which did not solve the problem. Any idea?

What CL are you writing at ?

All writes are at CL QUORUM. Thank you, aaron, for your answer.

2013/5/21 aaron morton aa...@thelastpickle.com

Only some keys of one CF are corrupt.

Checking you do not mean the row key is corrupt and cannot be read.

I thought using CL ALL would correct the problem with READ REPAIR, but on returning to CL QUORUM the problem persists.

By default in 1.X and beyond the default read repair chance is 0.1, so it's only enabled on 10% of requests. In the absence of further writes all reads (at any CL) should return the same value. What CL are you writing at ?

Cheers - Aaron Morton
Re: Cassandra read repair
Checking you do not mean the row key is corrupt and cannot be read.

Yes, I can read it, but all reads don't return the same result except at CL ALL.

By default in 1.X and beyond the default read repair chance is 0.1, so it's only enabled on 10% of requests.

You are right, read repair chance is set to 0.1, but I launched a read repair which did not solve the problem. Any idea?

What CL are you writing at ?

All writes are at CL QUORUM. Thank you, aaron, for your answer.

2013/5/21 aaron morton aa...@thelastpickle.com

Only some keys of one CF are corrupt.

Checking you do not mean the row key is corrupt and cannot be read.

I thought using CL ALL would correct the problem with READ REPAIR, but by returning to CL QUORUM, the problem persists.

By default in 1.X and beyond the default read repair chance is 0.1, so it's only enabled on 10% of requests. In the absence of further writes all reads (at any CL) should return the same value. What CL are you writing at ?

Cheers - Aaron Morton, Freelance Cassandra Consultant, New Zealand, @aaronmorton, http://www.thelastpickle.com

On 19/05/2013, at 1:28 AM, Kais Ahmed k...@neteck-fr.com wrote:

Hi all, I encountered a consistency problem on some keys using phpcassa and Cassandra 1.2.3 since a server crash. Only some keys of one CF are corrupt. I launched a nodetool repair that completed successfully but didn't correct the issue. When I try to get a corrupt key with: CL ONE, the result contains 7, 8 or 9 columns; CL QUORUM, the result contains 8 or 9 columns; CL ALL, the data is consistent and always returns 9 columns. I thought using CL ALL would correct the problem with READ REPAIR, but by returning to CL QUORUM, the problem persists. Thank you for your help
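To check a key at different consistency levels outside the application, cassandra-cli accepts a per-session consistency level; something like the following should work (a sketch reusing the keyspace and key from this thread; cassandra-cli reads commands from stdin):

$ cassandra-cli -h 127.0.0.1 <<'EOF'
use prod;
assume contacts_timeordered keys as ascii;
consistencylevel as QUORUM;
get contacts_timeordered['1425185-IGNORED'];
consistencylevel as ALL;
get contacts_timeordered['1425185-IGNORED'];
EOF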
Cassandra read repair
Hi all, I encountered a consistency problem on some keys using phpcassa and Cassandra 1.2.3 since a server crash. Only some keys of one CF are corrupt. I launched a nodetool repair that completed successfully but didn't correct the issue. When I try to get a corrupt key with: CL ONE, the result contains 7, 8 or 9 columns; CL QUORUM, the result contains 8 or 9 columns; CL ALL, the data is consistent and always returns 9 columns. I thought using CL ALL would correct the problem with READ REPAIR, but by returning to CL QUORUM, the problem persists. Thank you for your help
Re: Moving cluster
Yes Mike, this is what I meant when I spoke about the two solutions :) Thank you to all,

2013/4/21 Michael Theroux mthero...@yahoo.com:
I believe the two solutions being referred to are lift-and-shift vs. upgrading by replacing a node and letting it restore from the cluster. I don't think there are any more risks per se with upgrading by replacing, as long as you can make sure your new node is configured properly. One might choose lift-and-shift in order to have a node down for less time (depending on your individual situation), or to have less of an impact on the cluster, as replacing a node would result in other nodes streaming their data to the newly replaced node. Depending on your dataset, this could take quite some time. All this also assumes, of course, that you are replicating your data such that the new node can retrieve the information it is responsible for from the other nodes. Thanks, -Mike

On Apr 21, 2013, at 4:18 PM, aaron morton wrote:
Sorry, I do not understand your question. What are the two solutions?
Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 20/04/2013, at 3:43 AM, Kais Ahmed k...@neteck-fr.com wrote:
Hello and thank you for your answers. The first solution is much easier for me because I use vnodes. What is the risk of the first solution? Thank you,

2013/4/18 aaron morton aa...@thelastpickle.com:
This is roughly the lift-and-shift process I use. Note that disabling thrift and gossip does not stop an existing repair session. So I often drain and then shutdown, and copy the live data dir rather than a snapshot dir.
Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 19/04/2013, at 4:10 AM, Michael Theroux mthero...@yahoo.com wrote:
This should work. Another option is to follow a process similar to what we recently did. We recently and successfully upgraded 12 instances from large to xlarge instances in AWS. I chose not to replace nodes, as restoring data from the ring would have taken significant time and put the cluster under some additional load. I also wanted to eliminate the possibility that any issues on the new nodes could be blamed on new configuration/operating-system differences. Instead we followed the following procedure (removing some details that would likely be unique to our infrastructure). For a node being upgraded:
1) nodetool disablethrift
2) nodetool disablegossip
3) Snapshot the data (nodetool snapshot ...)
4) Back up the snapshot data to EBS (assuming you are on ephemeral storage)
5) Stop cassandra
6) Move the cassandra.yaml configuration file to cassandra.yaml.bak (to prevent any future restart of the instance from starting cassandra)
7) Shut down the instance
8) Take an AMI of the instance
9) Start a new instance from the AMI with the desired hardware
10) If you assign the new instance a new IP address, make sure any entries in /etc/hosts, or the broadcast_address in cassandra.yaml, are updated
11) Attach the volume holding your snapshot backup to the new instance and mount it
12) Restore the snapshot data
13) Restore the cassandra.yaml file
14) Restart cassandra
- I recommend practicing this on a test cluster first
- As you replace nodes with new IP addresses, eventually all your seeds will need to be updated. This is not a big deal until all your seed nodes have been replaced.
- Don't forget about NTP! Make sure it is running on all your new nodes.
Myself, to be extra careful, I actually deleted the ntp drift file and let NTP recalculate it because it's a new instance, and it took over an hour to restore our snapshot data... but that may have been overkill.
- If you have the opportunity, depending on your situation, increase max_hint_window_in_ms
- Your details may vary
Thanks, -Mike

On Apr 18, 2013, at 11:07 AM, Alain RODRIGUEZ wrote:
I would say add your 3 servers at the 3 tokens where you want them, let's say:
{ 0: { 0: 0, 1: 56713727820156410577229101238628035242, 2: 113427455640312821154458202477256070485 } }
or those tokens -1 or +1 if those exact tokens are already in use. And then just decommission the m1.xlarge nodes. You should be good to go.

2013/4/18 Kais Ahmed k...@neteck-fr.com:
Hi, What is the best practice to move from a cluster of 7 nodes (m1.xlarge) to 3 nodes (hi1.4xlarge)? Thanks,
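For reference, a condensed shell sketch of the per-node sequence Michael describes (the keyspace name, data path, backup mount point and service name are placeholders; details vary by setup):

    nodetool disablethrift            # 1) stop serving client requests
    nodetool disablegossip            # 2) leave the ring (does not stop a running repair)
    nodetool snapshot mykeyspace      # 3) hard-link a snapshot of the on-disk data
    cp -a /var/lib/cassandra/data/mykeyspace/*/snapshots/* /mnt/ebs-backup/   # 4) back up to EBS
    sudo service cassandra stop                                               # 5)
    sudo mv /etc/cassandra/cassandra.yaml /etc/cassandra/cassandra.yaml.bak   # 6) guard against accidental restarts
    # 7-14) take an AMI, launch the replacement instance with the desired hardware,
    # restore the snapshot data and cassandra.yaml, fix /etc/hosts or broadcast_address
    # if the IP changed, then restart cassandra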
Re: Moving cluster
Hello and thank you for your answers. The first solution is much easier for me because I use vnodes. What is the risk of the first solution? Thank you,

2013/4/18 aaron morton aa...@thelastpickle.com:
This is roughly the lift-and-shift process I use. Note that disabling thrift and gossip does not stop an existing repair session. So I often drain and then shutdown, and copy the live data dir rather than a snapshot dir.
Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 19/04/2013, at 4:10 AM, Michael Theroux mthero...@yahoo.com wrote:
This should work. Another option is to follow a process similar to what we recently did. We recently and successfully upgraded 12 instances from large to xlarge instances in AWS. I chose not to replace nodes, as restoring data from the ring would have taken significant time and put the cluster under some additional load. I also wanted to eliminate the possibility that any issues on the new nodes could be blamed on new configuration/operating-system differences. Instead we followed the following procedure (removing some details that would likely be unique to our infrastructure). For a node being upgraded:
1) nodetool disablethrift
2) nodetool disablegossip
3) Snapshot the data (nodetool snapshot ...)
4) Back up the snapshot data to EBS (assuming you are on ephemeral storage)
5) Stop cassandra
6) Move the cassandra.yaml configuration file to cassandra.yaml.bak (to prevent any future restart of the instance from starting cassandra)
7) Shut down the instance
8) Take an AMI of the instance
9) Start a new instance from the AMI with the desired hardware
10) If you assign the new instance a new IP address, make sure any entries in /etc/hosts, or the broadcast_address in cassandra.yaml, are updated
11) Attach the volume holding your snapshot backup to the new instance and mount it
12) Restore the snapshot data
13) Restore the cassandra.yaml file
14) Restart cassandra
- I recommend practicing this on a test cluster first
- As you replace nodes with new IP addresses, eventually all your seeds will need to be updated. This is not a big deal until all your seed nodes have been replaced.
- Don't forget about NTP! Make sure it is running on all your new nodes. Myself, to be extra careful, I actually deleted the ntp drift file and let NTP recalculate it because it's a new instance, and it took over an hour to restore our snapshot data... but that may have been overkill.
- If you have the opportunity, depending on your situation, increase max_hint_window_in_ms
- Your details may vary
Thanks, -Mike

On Apr 18, 2013, at 11:07 AM, Alain RODRIGUEZ wrote:
I would say add your 3 servers at the 3 tokens where you want them, let's say:
{ 0: { 0: 0, 1: 56713727820156410577229101238628035242, 2: 113427455640312821154458202477256070485 } }
or those tokens -1 or +1 if those exact tokens are already in use. And then just decommission the m1.xlarge nodes. You should be good to go.

2013/4/18 Kais Ahmed k...@neteck-fr.com:
Hi, What is the best practice to move from a cluster of 7 nodes (m1.xlarge) to 3 nodes (hi1.4xlarge)? Thanks,
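For reference, with vnodes the first solution (add the new nodes, then shrink) comes down to bootstrapping each hi1.4xlarge node normally and then decommissioning the old ones one at a time (a sketch; host names are placeholders):

    # On each new node: leave auto_bootstrap at its default (true), point seeds at
    # existing nodes, start cassandra, and wait for it to show as UN in nodetool status.
    # Then remove the old nodes one at a time:
    nodetool -h old-node-1 decommission   # streams this node's ranges to the remaining nodes
    nodetool status                       # confirm it has left before decommissioning the next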
Moving cluster
Hi, What is the best practice to move from a cluster of 7 nodes (m1.xlarge) to 3 nodes (hi1.4xlarge)? Thanks,
Re: Lost data after expanding cluster c* 1.2.3-1
Thanks aaron. It seems the index rebuild went well, but the result of my query (SELECT * FROM userdata WHERE login='kais';) is still empty.
INFO [Creating index: userdata.userdata_login_idx] 2013-03-30 01:16:33,110 SecondaryIndex.java (line 175) Submitting index build of userdata.userdata_login_idx
INFO [Creating index: userdata.userdata_login_idx] 2013-03-30 01:34:11,667 SecondaryIndex.java (line 202) Index build of userdata.userdata_login_idx complete
Thanks,

2013/4/9 aaron morton aa...@thelastpickle.com:
Look in the logs for messages from the SecondaryIndexManager; they start with "Submitting index build of" and end with "Index build of".
Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 7/04/2013, at 12:55 AM, Kais Ahmed k...@neteck-fr.com wrote:
hi aaron, nodetool compactionstats on all nodes returns 1 pending task:
ubuntu@app:~$ nodetool compactionstats host
pending tasks: 1
Active compaction remaining time : n/a
The command nodetool rebuild_index was launched several days ago.

2013/4/5 aaron morton aa...@thelastpickle.com:
> but nothing's happening, how can i monitor the progress? and how can i know when it's finished?
Check nodetool compactionstats.
Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 4/04/2013, at 2:51 PM, Kais Ahmed k...@neteck-fr.com wrote:
Hi aaron, I ran the command nodetool rebuild_index <host> <keyspace> <cf> on all the nodes; in the log I see:
INFO [RMI TCP Connection(5422)-10.34.139.xxx] 2013-04-04 08:31:53,641 ColumnFamilyStore.java (line 558) User Requested secondary index re-build for ...
but nothing's happening. How can I monitor the progress, and how can I know when it's finished? Thanks,

2013/4/2 aaron morton aa...@thelastpickle.com:
> The problem comes from the fact that I didn't set auto_bootstrap to true for the new nodes; it's not in this documentation (http://www.datastax.com/docs/1.2/install/expand_ami)
auto_bootstrap defaults to true if not specified in the yaml.
> can i do that at any time, or when the cluster is not loaded
Not sure what the question is. Both those operations are online operations you can do while the node is processing requests.
Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 1/04/2013, at 9:26 PM, Kais Ahmed k...@neteck-fr.com wrote:
> At this moment the errors started; we see that members and other data are gone. At this moment nodetool status returns (in red color the 3 new nodes)
> What errors?
The errors were on my side in the application, not cassandra errors.
> I put for each of them seeds = A's ip, and started each at two-minute intervals.
> When I'm making changes I tend to change a single node first, confirm everything is OK and then do a bulk change.
Thank you for that advice.
> I'm not sure what or why it went wrong, but that should get you to a stable place. If you have any problems keep an eye on the logs for errors or warnings.
The problem comes from the fact that I didn't set auto_bootstrap to true for the new nodes; it's not in this documentation (http://www.datastax.com/docs/1.2/install/expand_ami)
> if you are using secondary indexes use nodetool rebuild_index to rebuild those.
Can I do that at any time, or only when the cluster is not loaded?
Thanks aaron,

2013/4/1 aaron morton aa...@thelastpickle.com:
Please do not rely on colour in your emails; the best way to get your emails accepted by the Apache mail servers is to use plain text.
> At this moment the errors started; we see that members and other data are gone. At this moment nodetool status returns (in red color the 3 new nodes)
What errors?
> I put for each of them seeds = A's ip, and started each at two-minute intervals.
When I'm making changes I tend to change a single node first, confirm everything is OK and then do a bulk change.
> Now the cluster seems to work normally, but I can't use the secondary index for the moment; the query answers are random
Run nodetool repair -pr on each node; let it finish before starting the next one. If you are using secondary indexes, use nodetool rebuild_index to rebuild those. Add one new node to the cluster and confirm everything is OK, then add the remaining ones. I'm not sure what or why it went wrong, but that should get you to a stable place. If you have any problems keep an eye on the logs for errors or warnings.
Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 31/03/2013, at 10:01 PM, Kais Ahmed k...@neteck-fr.com wrote:
Hi aaron, thanks for the reply. I will try to explain what happened exactly. I had a 4-node C* cluster [A,B,C,D] (version 1.2.3-1) started with the ec2 AMI (https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2
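For reference, a sketch of how to watch a secondary index rebuild, based on the log messages quoted above (the log path is a placeholder):

    nodetool compactionstats    # a pending/active task should correspond to the index build
    grep 'Submitting index build' /var/log/cassandra/system.log            # rebuild started
    grep 'Index build of' /var/log/cassandra/system.log | grep complete    # rebuild finished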
Re: Lost data after expanding cluster c* 1.2.3-1
hi aaron, nodetool compactionstats on all nodes returns 1 pending task:
ubuntu@app:~$ nodetool compactionstats host
pending tasks: 1
Active compaction remaining time : n/a
The command nodetool rebuild_index was launched several days ago.

2013/4/5 aaron morton aa...@thelastpickle.com:
> but nothing's happening, how can i monitor the progress? and how can i know when it's finished?
Check nodetool compactionstats.
Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 4/04/2013, at 2:51 PM, Kais Ahmed k...@neteck-fr.com wrote:
Hi aaron, I ran the command nodetool rebuild_index <host> <keyspace> <cf> on all the nodes; in the log I see:
INFO [RMI TCP Connection(5422)-10.34.139.xxx] 2013-04-04 08:31:53,641 ColumnFamilyStore.java (line 558) User Requested secondary index re-build for ...
but nothing's happening. How can I monitor the progress, and how can I know when it's finished? Thanks,

2013/4/2 aaron morton aa...@thelastpickle.com:
> The problem comes from the fact that I didn't set auto_bootstrap to true for the new nodes; it's not in this documentation (http://www.datastax.com/docs/1.2/install/expand_ami)
auto_bootstrap defaults to true if not specified in the yaml.
> can i do that at any time, or when the cluster is not loaded
Not sure what the question is. Both those operations are online operations you can do while the node is processing requests.
Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 1/04/2013, at 9:26 PM, Kais Ahmed k...@neteck-fr.com wrote:
> At this moment the errors started; we see that members and other data are gone. At this moment nodetool status returns (in red color the 3 new nodes)
> What errors?
The errors were on my side in the application, not cassandra errors.
> I put for each of them seeds = A's ip, and started each at two-minute intervals.
> When I'm making changes I tend to change a single node first, confirm everything is OK and then do a bulk change.
Thank you for that advice.
> I'm not sure what or why it went wrong, but that should get you to a stable place. If you have any problems keep an eye on the logs for errors or warnings.
The problem comes from the fact that I didn't set auto_bootstrap to true for the new nodes; it's not in this documentation (http://www.datastax.com/docs/1.2/install/expand_ami)
> if you are using secondary indexes use nodetool rebuild_index to rebuild those.
Can I do that at any time, or only when the cluster is not loaded?
Thanks aaron,

2013/4/1 aaron morton aa...@thelastpickle.com:
Please do not rely on colour in your emails; the best way to get your emails accepted by the Apache mail servers is to use plain text.
> At this moment the errors started; we see that members and other data are gone. At this moment nodetool status returns (in red color the 3 new nodes)
What errors?
> I put for each of them seeds = A's ip, and started each at two-minute intervals.
When I'm making changes I tend to change a single node first, confirm everything is OK and then do a bulk change.
> Now the cluster seems to work normally, but I can't use the secondary index for the moment; the query answers are random
Run nodetool repair -pr on each node; let it finish before starting the next one. If you are using secondary indexes, use nodetool rebuild_index to rebuild those. Add one new node to the cluster and confirm everything is OK, then add the remaining ones. I'm not sure what or why it went wrong, but that should get you to a stable place.
If you have any problems keep an eye on the logs for errors or warnings.
Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 31/03/2013, at 10:01 PM, Kais Ahmed k...@neteck-fr.com wrote:
Hi aaron, thanks for the reply. I will try to explain what happened exactly. I had a 4-node C* cluster [A,B,C,D] (version 1.2.3-1) started with the ec2 AMI (https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2) with this config: --clustername myDSCcluster --totalnodes 4 --version community
Two days after this cluster went into production, I saw that the cluster was overloaded and wanted to extend it by adding 3 more nodes. I created a new cluster with 3 C* nodes [D,E,F] (https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2) and followed the documentation (http://www.datastax.com/docs/1.2/install/expand_ami) for adding them to the ring. I put for each of them seeds = A's ip, and started each at two-minute intervals. At this moment the errors started; we see that members and other data are gone. At this moment nodetool status returns (in red color the 3 new nodes):
Datacenter: eu-west
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load  Tokens  Owns  Host ID
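For reference, the cassandra.yaml fragment that matters when a new node joins an existing vnode cluster (a sketch; the IP is a placeholder; note that auto_bootstrap defaults to true when the line is simply absent):

    cluster_name: 'myDSCcluster'
    num_tokens: 256
    auto_bootstrap: true          # the default when omitted; must not be false on a joining node
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: "10.34.142.xxx"   # an existing node (A), never the new node itself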
Re: Lost data after expanding cluster c* 1.2.3-1
> At this moment the errors started; we see that members and other data are gone. At this moment nodetool status returns (in red color the 3 new nodes)
> What errors?
The errors were on my side in the application, not cassandra errors.
> I put for each of them seeds = A's ip, and started each at two-minute intervals.
> When I'm making changes I tend to change a single node first, confirm everything is OK and then do a bulk change.
Thank you for that advice.
> I'm not sure what or why it went wrong, but that should get you to a stable place. If you have any problems keep an eye on the logs for errors or warnings.
The problem comes from the fact that I didn't set auto_bootstrap to true for the new nodes; it's not in this documentation (http://www.datastax.com/docs/1.2/install/expand_ami)
> if you are using secondary indexes use nodetool rebuild_index to rebuild those.
Can I do that at any time, or only when the cluster is not loaded?
Thanks aaron,

2013/4/1 aaron morton aa...@thelastpickle.com:
Please do not rely on colour in your emails; the best way to get your emails accepted by the Apache mail servers is to use plain text.
> At this moment the errors started; we see that members and other data are gone. At this moment nodetool status returns (in red color the 3 new nodes)
What errors?
> I put for each of them seeds = A's ip, and started each at two-minute intervals.
When I'm making changes I tend to change a single node first, confirm everything is OK and then do a bulk change.
> Now the cluster seems to work normally, but I can't use the secondary index for the moment; the query answers are random
Run nodetool repair -pr on each node; let it finish before starting the next one. If you are using secondary indexes, use nodetool rebuild_index to rebuild those. Add one new node to the cluster and confirm everything is OK, then add the remaining ones. I'm not sure what or why it went wrong, but that should get you to a stable place. If you have any problems keep an eye on the logs for errors or warnings.
Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 31/03/2013, at 10:01 PM, Kais Ahmed k...@neteck-fr.com wrote:
Hi aaron, thanks for the reply. I will try to explain what happened exactly. I had a 4-node C* cluster [A,B,C,D] (version 1.2.3-1) started with the ec2 AMI (https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2) with this config: --clustername myDSCcluster --totalnodes 4 --version community
Two days after this cluster went into production, I saw that the cluster was overloaded and wanted to extend it by adding 3 more nodes. I created a new cluster with 3 C* nodes [D,E,F] (https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2) and followed the documentation (http://www.datastax.com/docs/1.2/install/expand_ami) for adding them to the ring. I put for each of them seeds = A's ip, and started each at two-minute intervals.
At this moment the errors started; we see that members and other data are gone. At this moment nodetool status returns (in red color the 3 new nodes):
Datacenter: eu-west
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns   Host ID                               Rack
UN  10.34.142.xxx  10.79 GB  256     15.4%  4e2e26b8-aa38-428c-a8f5-e86c13eb4442  1b
UN  10.32.49.xxx   1.48 MB   256     13.7%  e86f67b6-d7cb-4b47-b090-3824a5887145  1b
UN  10.33.206.xxx  2.19 MB   256     11.9%  92af17c3-954a-4511-bc90-29a9657623e4  1b
UN  10.32.27.xxx   1.95 MB   256     14.9%  862e6b39-b380-40b4-9d61-d83cb8dacf9e  1b
UN  10.34.139.xxx  11.67 GB  256     15.5%  0324e394-b65f-46c8-acb4-1e1f87600a2c  1b
UN  10.34.147.xxx  11.18 GB  256     13.9%  cfc09822-5446-4565-a5f0-d25c917e2ce8  1b
UN  10.33.193.xxx  10.83 GB  256     14.7%  59f440db-cd2d-4041-aab4-fc8e9518c954  1b
I saw that the 3 nodes had joined the ring but they had no data. I put the website in maintenance and launched a nodetool repair on the 3 new nodes; for 5 hours I could see in OpsCenter the data streaming to the new nodes (very nice :)). During this time, I wrote a script to check whether all members are present (relative to a copy of the members in mysql). Afterwards the data streaming seemed to be finished, but I'm not sure, because nodetool compactionstats showed a pending task while nodetool netstats seemed to be OK. I ran my script to check the data, but members were still missing. I decided to roll back by running nodetool decommission on nodes D, E, F. I re-ran my script and all seems to be OK, but the secondary index has strange behaviour: sometimes the row is returned, sometimes there is no result. The user kais can be retrieved using his key with cassandra-cli, but if I use cqlsh:
cqlsh:database> SELECT login FROM
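For reference, aaron's "run nodetool repair -pr on each node, let it finish before starting the next one" translates to something like this (a sketch; the host list is a placeholder):

    for h in nodeA nodeB nodeC nodeD; do
        nodetool -h "$h" repair -pr   # repairs only this node's primary ranges, so the full set covers the ring exactly once
    done                              # nodetool blocks, so each repair finishes before the next starts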
Re: Lost data after expanding cluster c* 1.2.3-1
Hi aaron, thanks for the reply. I will try to explain what happened exactly.
I had a 4-node C* cluster [A,B,C,D] (version 1.2.3-1) started with the ec2 AMI (https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2) with this config: --clustername myDSCcluster --totalnodes 4 --version community
Two days after this cluster went into production, I saw that the cluster was overloaded and wanted to extend it by adding 3 more nodes. I created a new cluster with 3 C* nodes [D,E,F] (https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2) and followed the documentation (http://www.datastax.com/docs/1.2/install/expand_ami) for adding them to the ring. I put for each of them seeds = A's ip, and started each at two-minute intervals. At this moment the errors started; we see that members and other data are gone. At this moment nodetool status returns (in red color the 3 new nodes):
Datacenter: eu-west
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns   Host ID                               Rack
UN  10.34.142.xxx  10.79 GB  256     15.4%  4e2e26b8-aa38-428c-a8f5-e86c13eb4442  1b
UN  10.32.49.xxx   1.48 MB   256     13.7%  e86f67b6-d7cb-4b47-b090-3824a5887145  1b
UN  10.33.206.xxx  2.19 MB   256     11.9%  92af17c3-954a-4511-bc90-29a9657623e4  1b
UN  10.32.27.xxx   1.95 MB   256     14.9%  862e6b39-b380-40b4-9d61-d83cb8dacf9e  1b
UN  10.34.139.xxx  11.67 GB  256     15.5%  0324e394-b65f-46c8-acb4-1e1f87600a2c  1b
UN  10.34.147.xxx  11.18 GB  256     13.9%  cfc09822-5446-4565-a5f0-d25c917e2ce8  1b
UN  10.33.193.xxx  10.83 GB  256     14.7%  59f440db-cd2d-4041-aab4-fc8e9518c954  1b
I saw that the 3 nodes had joined the ring but they had no data. I put the website in maintenance and launched a nodetool repair on the 3 new nodes; for 5 hours I could see in OpsCenter the data streaming to the new nodes (very nice :)). During this time, I wrote a script to check whether all members are present (relative to a copy of the members in mysql). Afterwards the data streaming seemed to be finished, but I'm not sure, because nodetool compactionstats showed a pending task while nodetool netstats seemed to be OK. I ran my script to check the data, but members were still missing. I decided to roll back by running nodetool decommission on nodes D, E, F. I re-ran my script and all seems to be OK, but the secondary index has strange behaviour: sometimes the row is returned, sometimes there is no result. The user kais can be retrieved using his key with cassandra-cli, but if I use cqlsh:
cqlsh:database> SELECT login FROM userdata where login='kais' ;
 login
 kais
cqlsh:database> SELECT login FROM userdata where login='kais' ;
// empty
cqlsh:database> SELECT login FROM userdata where login='kais' ;
 login
 kais
cqlsh:database> SELECT login FROM userdata where login='kais' ;
 login
 kais
cqlsh:database> SELECT login FROM userdata where login='kais' ;
// empty
cqlsh:database> SELECT login FROM userdata where login='kais' ;
 login
 kais
cqlsh:mydatabase> Tracing on;
When tracing is activated I get this error, but not every time:
cqlsh:mydatabase> SELECT * FROM userdata where login='kais' ;
unsupported operand type(s) for /: 'NoneType' and 'float'
NOTE: when the cluster contained 7 nodes, I saw that my table userdata (RF 3) on node D was replicated on E and F, which seems strange because these 3 nodes were not correctly filled.
Now the cluster seems to work normally, but I can't use the secondary index for the moment; the query answers are random.
Thanks a lot for any help, Kais

2013/3/31 aaron morton aa...@thelastpickle.com:
First thought is the new nodes were marked as seeds. Next thought is check the logs for errors.
You can always run a nodetool repair if you are concerned data is not where you think it should be.
Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 29/03/2013, at 8:01 PM, Kais Ahmed k...@neteck-fr.com wrote:
Hi all, I followed this tutorial to expand a 4-node C* cluster (production) and add 3 new nodes.
Datacenter: eu-west
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns   Host ID                               Rack
UN  10.34.142.xxx  10.79 GB  256     15.4%  4e2e26b8-aa38-428c-a8f5-e86c13eb4442  1b
UN  10.32.49.xxx   1.48 MB   256     13.7%  e86f67b6-d7cb-4b47-b090-3824a5887145  1b
UN  10.33.206.xxx  2.19 MB   256     11.9%  92af17c3-954a-4511-bc90-29a9657623e4  1b
UN  10.32.27.xxx   1.95 MB   256     14.9%  862e6b39-b380-40b4-9d61-d83cb8dacf9e  1b
UN  10.34.139.xxx  11.67 GB  256     15.5%  0324e394-b65f-46c8-acb4-1e1f87600a2c  1b
UN  10.34.147.xxx  11.18 GB  256     13.9%  cfc09822-5446-4565-a5f0-d25c917e2ce8  1b
UN  10.33.193.xxx  10.83 GB  256     14.7%  59f440db-cd2d-4041-aab4-fc8e9518c954  1b
The data were not streamed.
Lost data after expanding cluster c* 1.2.3-1
Hi all, I followed this tutorial to expand a 4-node C* cluster (production) and add 3 new nodes.
Datacenter: eu-west
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns   Host ID                               Rack
UN  10.34.142.xxx  10.79 GB  256     15.4%  4e2e26b8-aa38-428c-a8f5-e86c13eb4442  1b
UN  10.32.49.xxx   1.48 MB   256     13.7%  e86f67b6-d7cb-4b47-b090-3824a5887145  1b
UN  10.33.206.xxx  2.19 MB   256     11.9%  92af17c3-954a-4511-bc90-29a9657623e4  1b
UN  10.32.27.xxx   1.95 MB   256     14.9%  862e6b39-b380-40b4-9d61-d83cb8dacf9e  1b
UN  10.34.139.xxx  11.67 GB  256     15.5%  0324e394-b65f-46c8-acb4-1e1f87600a2c  1b
UN  10.34.147.xxx  11.18 GB  256     13.9%  cfc09822-5446-4565-a5f0-d25c917e2ce8  1b
UN  10.33.193.xxx  10.83 GB  256     14.7%  59f440db-cd2d-4041-aab4-fc8e9518c954  1b
The data were not streamed. Can anyone help me? Our web site is down. Thanks a lot,
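For reference, a sketch of the checks that reveal the symptom above, where new nodes own token ranges but hold almost no data (the host and log path are placeholders):

    nodetool status                       # a Load of a few MB next to 10+ GB peers means the node never streamed data
    nodetool -h 10.32.49.xxx netstats     # shows active and pending streams during bootstrap
    grep -i bootstrap /var/log/cassandra/system.log   # confirms whether the node actually bootstrapped or only joined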