upgrade from cassandra 1.2.3 - 1.2.13 + start using SSL
Hi all, I would appreciate advice on whether it is a good idea to upgrade from Cassandra 1.2.3 to 1.2.13, and how best to proceed. The particular cluster consists of 3 nodes (each one in a different DC, holding 1 replica) with relatively low traffic and a 10GB load per node.

I am specifically interested in whether it is possible to upgrade just one node and keep it running like that for some time, i.e. whether the gossip protocol is compatible in both directions. We are a bit afraid to upgrade all nodes to 1.2.13 at once in case we would need to roll back. I know that the sstable format changed in 1.2.5, so if we needed to roll back, the newly written data would have to be synchronized from the old servers.

Also, after the migration to 1.2.13, we would like to start using node-to-node encryption. I imagine that you need to configure it on all nodes at once, so it would require a small outage. Thank you in advance, Jiri Horky
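[For reference on the encryption part (a sketch, not from the thread): internode SSL in the 1.2 line is driven by server_encryption_options in cassandra.yaml. The keystore paths and passwords below are placeholders; every node needs a keystore and truststore set up before flipping the switch, and since internode_encryption applies to all peers, enabling it cluster-wide typically means a brief window of mixed settings or an outage, as the poster suspects.]

```yaml
# cassandra.yaml (sketch; paths and passwords are placeholders)
server_encryption_options:
    internode_encryption: all        # one of: none | all | dc | rack
    keystore: /etc/cassandra/conf/.keystore
    keystore_password: changeit
    truststore: /etc/cassandra/conf/.truststore
    truststore_password: changeit
```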
Re: Keyspaces on different volumes
I don't think Cassandra will complain if the cassandra/data/keyspace directory exists when you create the keyspace, so you can just create your symlinks first and move on; there is no need to do the start C*, create keyspace, stop C*, move directory dance. Other than that, I would probably mount my volumes directly at the cassandra/data/keyspace directories instead of using symlinks, but you're probably fine with symlinks if you really prefer them. -- Sylvain

On Tue, Jan 7, 2014 at 4:05 PM, Robert Wille rwi...@fold3.com wrote: The obvious (but painful) way to do that would be to create the keyspace, and then repeat the following for each node: shut down the node, move cassandra/data/keyspace to the other volume, create a symlink in its place, and restart the node. Is there a better way? Robert

From: Tupshin Harper tups...@tupshin.com Reply-To: user@cassandra.apache.org Date: Tuesday, January 7, 2014 at 6:07 AM To: user@cassandra.apache.org Subject: Re: Keyspaces on different volumes

That is a fine option and can make perfect sense if you have keyspaces with very different runtime characteristics. -Tupshin

On Jan 7, 2014 7:30 AM, Robert Wille rwi...@fold3.com wrote: I'd like to have my keyspaces on different volumes, so that some can be on SSD and others on spinning disk. Is such a thing possible or advisable?
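[A minimal sketch of the symlink approach discussed above. To keep it self-contained it uses a scratch directory; in real life DATA_DIR would be Cassandra's data directory (e.g. /var/lib/cassandra/data) and SSD_DIR a mount point on the SSD — all names here are placeholder examples.]

```shell
# Stand-in directories; substitute the real cassandra data dir and SSD mount.
BASE=$(mktemp -d)
DATA_DIR="$BASE/data"        # stands in for cassandra/data
SSD_DIR="$BASE/ssd"          # stands in for the SSD mount point
KEYSPACE=mykeyspace

mkdir -p "$DATA_DIR" "$SSD_DIR/$KEYSPACE"

# Create the symlink BEFORE creating the keyspace in Cassandra,
# so sstables land on the SSD from the start.
ln -s "$SSD_DIR/$KEYSPACE" "$DATA_DIR/$KEYSPACE"

# Confirm where the keyspace directory actually points.
readlink "$DATA_DIR/$KEYSPACE"
```

The direct-mount alternative Sylvain mentions is the same idea with `mount /dev/ssd-partition "$DATA_DIR/$KEYSPACE"` in place of the `ln -s`.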
Re: nodetool cleanup / TTL
Is there some other mechanism for forcing expired data to be removed without also compacting? (Major compaction has obvious problematic side effects, and user-defined compaction is significant work to script up.)

Online scrubs will, as a side effect, purge expired tombstones *when possible* (even expired data cannot be removed if it could overwrite older data in some sstable other than the one being scrubbed). Please don't take that as me saying that this is a guarantee of scrub: it is just a side effect of its current implementation, and it might very well change tomorrow. -- Sylvain
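[For completeness, an online scrub is kicked off per keyspace/table with nodetool; the names below are placeholders, and this needs a running node, so it is only a command sketch.]

```shell
# Scrub one table online; expired tombstones MAY be purged as a side
# effect, but as noted above this is not a guarantee of scrub.
nodetool -h 127.0.0.1 scrub my_keyspace my_table
```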
OOM after some days related to RunnableScheduledFuture and meter persistence
Hi, On Linux with Cassandra version 2.0.2 I had an OOM after a heavy load followed by some (15+) days of idle running (not exactly idle, but very, very low activity). Two out of a 4-machine cluster had this OOM. I checked the heap dump (9GB) and it tells me:

One instance of java.util.concurrent.ScheduledThreadPoolExecutor, loaded by the system class loader, occupies 8.927.175.368 (94,53%) bytes. The instance is referenced by org.apache.cassandra.io.sstable.SSTableReader @ 0x7fadf89e0, loaded by sun.misc.Launcher$AppClassLoader @ 0x683e6ad30. The memory is accumulated in one instance of java.util.concurrent.RunnableScheduledFuture[] loaded by the system class loader.

So I checked the SSTableReader instance and found that the 'ScheduledThreadPoolExecutor syncExecutor' object is holding about 600k ScheduledFutureTasks. According to the code in SSTableReader, these tasks must have been created by the line syncExecutor.scheduleAtFixedRate. That means that none of these tasks ever gets executed, because some (and only one) initial task is probably blocking.
But then again, the one thread that executes these tasks seems to be in a 'normal' state (at the time of the OOM) and is executing with the stack trace pasted below:

Thread 0x696777eb8
  at org.apache.cassandra.db.AtomicSortedColumns$1.create(Lorg/apache/cassandra/config/CFMetaData;Z)Lorg/apache/cassandra/db/AtomicSortedColumns; (AtomicSortedColumns.java:58)
  at org.apache.cassandra.db.AtomicSortedColumns$1.create(Lorg/apache/cassandra/config/CFMetaData;Z)Lorg/apache/cassandra/db/ColumnFamily; (AtomicSortedColumns.java:55)
  at org.apache.cassandra.db.ColumnFamily.cloneMeShallow(Lorg/apache/cassandra/db/ColumnFamily$Factory;Z)Lorg/apache/cassandra/db/ColumnFamily; (ColumnFamily.java:70)
  at org.apache.cassandra.db.Memtable.resolve(Lorg/apache/cassandra/db/DecoratedKey;Lorg/apache/cassandra/db/ColumnFamily;Lorg/apache/cassandra/db/index/SecondaryIndexManager$Updater;)V (Memtable.java:187)
  at org.apache.cassandra.db.Memtable.put(Lorg/apache/cassandra/db/DecoratedKey;Lorg/apache/cassandra/db/ColumnFamily;Lorg/apache/cassandra/db/index/SecondaryIndexManager$Updater;)V (Memtable.java:158)
  at org.apache.cassandra.db.ColumnFamilyStore.apply(Lorg/apache/cassandra/db/DecoratedKey;Lorg/apache/cassandra/db/ColumnFamily;Lorg/apache/cassandra/db/index/SecondaryIndexManager$Updater;)V (ColumnFamilyStore.java:840)
  at org.apache.cassandra.db.Keyspace.apply(Lorg/apache/cassandra/db/RowMutation;ZZ)V (Keyspace.java:373)
  at org.apache.cassandra.db.Keyspace.apply(Lorg/apache/cassandra/db/RowMutation;Z)V (Keyspace.java:338)
  at org.apache.cassandra.db.RowMutation.apply()V (RowMutation.java:201)
  at org.apache.cassandra.cql3.statements.ModificationStatement.executeInternal(Lorg/apache/cassandra/service/QueryState;)Lorg/apache/cassandra/transport/messages/ResultMessage; (ModificationStatement.java:477)
  at org.apache.cassandra.cql3.QueryProcessor.processInternal(Ljava/lang/String;)Lorg/apache/cassandra/cql3/UntypedResultSet; (QueryProcessor.java:178)
  at org.apache.cassandra.db.SystemKeyspace.persistSSTableReadMeter(Ljava/lang/String;Ljava/lang/String;ILorg/apache/cassandra/metrics/RestorableMeter;)V (SystemKeyspace.java:938)
  at org.apache.cassandra.io.sstable.SSTableReader$2.run()V (SSTableReader.java:342)
  at java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object; (Executors.java:471)
  at java.util.concurrent.FutureTask.runAndReset()Z (FutureTask.java:304)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Ljava/util/concurrent/ScheduledThreadPoolExecutor$ScheduledFutureTask;)Z (ScheduledThreadPoolExecutor.java:178)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V (ScheduledThreadPoolExecutor.java:293)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V (ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run()V (ThreadPoolExecutor.java:615)
  at java.lang.Thread.run()V (Thread.java:724)

Since each of these tasks is throttled by meterSyncThrottle.acquire(), I suspect that the RateLimiter is causing a delay. The RateLimiter instance attributes are:

Type   | Name                 | Value
long   | nextFreeTicketMicros | 3016022567383
double | maxPermits           | 100.0
double | storedPermits        | 99.0
long   | offsetNanos          | 334676357831746

I guess that these attributes will in practice result in blocking behavior, resulting in the OOM. Can anyone make sense of this? I hope this helps in finding out what the reason is, and maybe it can be avoided in the future. I still have the heap dump, so I can always provide more information if needed. Regards, Ignace Desimpel
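[A general aside, not from the original thread: when chasing an OOM like this, the stock JDK tools can capture the evidence the way the poster did. The pid and file names below are placeholders, and the commands need a live JVM, so this is a command sketch only.]

```shell
# Have the JVM write a heap dump automatically on OOM
# (JVM options, e.g. added via cassandra-env.sh):
#   -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/cassandra

# Or capture state from a still-running JVM (pid 12345 is a placeholder):
jstack 12345 > threads.txt                        # thread dump
jmap -dump:live,format=b,file=heap.hprof 12345    # heap dump, analyzable in Eclipse MAT
```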
[JOB] - Full time opportunity in San Francisco bay area
We have a full-time permanent opportunity with a reputable client in the San Francisco Bay Area. We are looking for good Cassandra and Java/J2EE skills. Should you be interested, please reply with your resume, and I will call to discuss more. Thanks, Gnani Balaraman | EthicalSoft, Inc. | 2570 N 1st St, Ste 200, San Jose CA 95131 | (408) 329-0351 | gn...@ethicalsoft.com gn...@intelliswift.com
nodetool repair stalled
Hi, I have two nodes with Cassandra 2.0.3, where repair sessions hang for an indefinite time. I'm running nodetool repair once a week on every node, on different days. Currently I have about 4 repair sessions running on each node, one of them for 3 weeks, and none has finished. Reading the logs I didn't find any exception; apparently one of the repair sessions got stuck at this step:

INFO [AntiEntropySessions:10] 2014-01-05 01:00:02,804 RepairJob.java (line 116) [repair #5385ea40-759c-11e3-93dc-a1357a0d9222] requesting merkle trees for events (to [/10.255.235.19, /10.255.235.18])

Does anybody have any suggestion on why a nodetool repair might be stuck, and how to debug it? Regards, Paolo Crosato
Re: nodetool repair stalled
On Wed, Jan 8, 2014 at 8:52 AM, Paolo Crosato paolo.cros...@targaubiest.com wrote: I have two nodes with Cassandra 2.0.3, where repair sessions hang for an indefinite time. I'm running nodetool repair once a week on every node, on different days. Currently I have about 4 repair sessions running on each node, one of them for 3 weeks, and none has finished. Reading the logs I didn't find any exception; apparently one of the repair sessions got stuck. Does anybody have any suggestion on why a nodetool repair might be stuck, and how to debug it?

Cassandra repair has never quite worked right. It got a wholesale re-write in 2.0.x and should be more robust, and at the very least it logs more than before. But unfortunately I have heard a few reports like yours, so it is probably not completely fixed. That said, the only option you have for failed repairs seems to be to restart the affected nodes. Your input as an operator of 2.0.x who would appreciate an alternative is welcome at: https://issues.apache.org/jira/browse/CASSANDRA-3486 =Rob
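[For what it's worth, a few stock nodetool commands can at least show whether a stuck repair is still doing anything; run them on the node that triggered the repair and on its partners (log path is a placeholder). This needs a running cluster, so it is a command sketch only.]

```shell
nodetool tpstats          # AntiEntropyStage / AntiEntropySessions: active or pending tasks?
nodetool netstats         # any merkle-tree or sstable streaming still in progress?
nodetool compactionstats  # validation compactions triggered by repair show up here
grep -i repair /var/log/cassandra/system.log | tail -50   # recent repair log lines
```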
Re: upgrade from cassandra 1.2.3 - 1.2.13 + start using SSL
On Wed, Jan 8, 2014 at 1:17 AM, Jiri Horky ho...@avast.com wrote: I am specifically interested in whether it is possible to upgrade just one node and keep it running like that for some time, i.e. whether the gossip protocol is compatible in both directions. We are a bit afraid to upgrade all nodes to 1.2.13 at once in case we would need to roll back.

This is not officially supported. It will probably work for these particular versions, but it is not recommended. The most serious potential issue is an inability to replace the new node if it fails. There's also the problem of not being able to repair until you're back on the same version everywhere. And other, similar, undocumented edge cases... =Rob
Re: OOM after some days related to RunnableScheduledFuture and meter persistence
I believe this is https://issues.apache.org/jira/browse/CASSANDRA-6358, which was fixed in 2.0.3.

On Wed, Jan 8, 2014 at 7:15 AM, Desimpel, Ignace ignace.desim...@nuance.com wrote: Hi, On Linux with Cassandra version 2.0.2 I had an OOM after a heavy load followed by some (15+) days of idle running (not exactly idle, but very, very low activity). Two out of a 4-machine cluster had this OOM. I checked the heap dump (9GB) and it tells me: One instance of java.util.concurrent.ScheduledThreadPoolExecutor, loaded by the system class loader, occupies 8.927.175.368 (94,53%) bytes. The instance is referenced by org.apache.cassandra.io.sstable.SSTableReader @ 0x7fadf89e0, loaded by sun.misc.Launcher$AppClassLoader @ 0x683e6ad30. The memory is accumulated in one instance of java.util.concurrent.RunnableScheduledFuture[] loaded by the system class loader. So I checked the SSTableReader instance and found that the 'ScheduledThreadPoolExecutor syncExecutor' object is holding about 600k ScheduledFutureTasks. According to the code in SSTableReader, these tasks must have been created by the line syncExecutor.scheduleAtFixedRate. That means that none of these tasks ever gets executed, because some (and only one) initial task is probably blocking.
Re: Gotchas when creating a lot of tombstones
On Wed, Jan 1, 2014 at 7:53 AM, Robert Wille rwi...@fold3.com wrote: Also, for this application, it would be quite reasonable to set gc grace seconds to 0 for these tables. Zombie data wouldn’t really be a problem. The background process that cleans up orphaned browse structures would simply re-delete any deleted data that reappeared. If you can set gc grace to 0, that will basically eliminate your tombstone concerns entirely, so I would suggest that. -- Tyler Hobbs DataStax http://datastax.com/
Re: cassandra monitoring
Install Errored: Failure installing agent on beta.jokefire.com. Error output: /var/lib/opscenter/ssl/agentKeyStore.pem: No such file or directory Exit code: 1

This indicates that there was a problem generating the SSL files when OpsCenter first started up. I would check the log around the first time you started OpsCenter for errors. Another option would be to disable SSL communication between OpsCenter and the agents: http://www.datastax.com/documentation/opscenter/4.0/webhelp/index.html#opsc/configure/opscConfigSSL_g.html

I was wondering where I could go from here. Also, I would like to password protect my OpsCenter installation (assuming I can ever get any useful data into it). Are there any docs on how I can do that?

http://www.datastax.com/documentation/opscenter/4.0/webhelp/index.html#opsc/configure/opscConfigureUserAccess_c.html#opscConfigureUserAccess
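[Per the linked SSL document, disabling OpsCenter-to-agent SSL is a small config change along these lines; this is a sketch from memory, so check the doc for the exact section name, and restart opscenterd and the agents afterwards.]

```
# /etc/opscenter/opscenterd.conf (sketch)
[agents]
use_ssl = false
```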
Re: Gotchas when creating a lot of tombstones
With leveled compaction, you will have some data that cannot be reclaimed even with gc_grace = 0, because it has not been compacted yet. For this you might want to look at tombstone_threshold. On Wed, Jan 8, 2014 at 10:31 AM, Tyler Hobbs ty...@datastax.com wrote: On Wed, Jan 1, 2014 at 7:53 AM, Robert Wille rwi...@fold3.com wrote: Also, for this application, it would be quite reasonable to set gc grace seconds to 0 for these tables. Zombie data wouldn't really be a problem. The background process that cleans up orphaned browse structures would simply re-delete any deleted data that reappeared. If you can set gc grace to 0, that will basically eliminate your tombstone concerns entirely, so I would suggest that. -- Tyler Hobbs DataStax http://datastax.com/
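[In cqlsh terms, the two knobs discussed above look roughly like this; the keyspace/table names are placeholders, and tombstone_threshold is a compaction subproperty (shown here for LeveledCompactionStrategy, with 0.2 being its usual default). This needs a running cluster, so it is a sketch only.]

```
-- Drop the tombstone grace period entirely (accepting possible zombie data):
ALTER TABLE my_keyspace.my_table WITH gc_grace_seconds = 0;

-- Make sstables with a high ratio of droppable tombstones eligible
-- for single-sstable tombstone compaction:
ALTER TABLE my_keyspace.my_table
  WITH compaction = {'class': 'LeveledCompactionStrategy',
                     'tombstone_threshold': 0.2};
```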
Latest Stable version of cassandra in production
Hi all, What is the latest stable version of Cassandra you have in production? We are migrating a large chunk of our MySQL database to Cassandra. I see a lot of discussions regarding the 1.* versions, but I could not find discussions about using 2.* versions in production. Any suggestions for a version, based on your experience? - Sanjeeth
Re: nodetool repair stalled
Hi, Can you attach the logs from around the repair? Please do that for the node which triggered it and for the nodes involved in the repair. I will try to find something useful. Thanks, Sankalp

On Wed, Jan 8, 2014 at 10:18 AM, Robert Coli rc...@eventbrite.com wrote: Cassandra repair has never quite worked right. It got a wholesale re-write in 2.0.x and should be more robust, and at the very least it logs more than before. [...] Your input as an operator of 2.0.x who would appreciate an alternative is welcome at: https://issues.apache.org/jira/browse/CASSANDRA-3486 =Rob