upgrade from cassandra 1.2.3 -> 1.2.13 + start using SSL

2014-01-08 Thread Jiri Horky
Hi all,

I would appreciate advice on whether it is a good idea to upgrade from
cassandra 1.2.3 to 1.2.13 and how best to proceed. The particular
cluster consists of 3 nodes (each one in a different DC holding 1
replica) with relatively low traffic and a 10GB load per node.

I am specifically interested in whether it is possible to upgrade just one
node and keep it running like that for some time, i.e. whether the gossip
protocol is compatible in both directions. We are a bit afraid to
upgrade all nodes to 1.2.13 at once in case we would need to roll back.

I know that the sstable format changed in 1.2.5, so in case we need to
roll back, the newly written data would need to be synchronized from the
old servers.

Also, after the migration to 1.2.13, we would like to start using
node-to-node encryption. I imagine that it needs to be configured on all
nodes at once, so it would require a small outage.
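For reference, internode encryption in 1.2.x is driven by the
server_encryption_options block of cassandra.yaml; a minimal sketch follows
(keystore/truststore paths and passwords are placeholders, not taken from
the thread):

```yaml
# Hedged sketch of node-to-node encryption in cassandra.yaml (1.2.x).
# All paths and passwords below are placeholders.
server_encryption_options:
    internode_encryption: all            # encrypt all node-to-node traffic
    keystore: /etc/cassandra/conf/.keystore
    keystore_password: changeit
    truststore: /etc/cassandra/conf/.truststore
    truststore_password: changeit
```

Because a node with internode_encryption: all speaks only TLS on the storage
port, a mixed encrypted/unencrypted cluster can partition, which is why
enabling it generally means restarting all nodes into the new config at once.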

Thank you in advance
Jiri Horky



Re: Keyspaces on different volumes

2014-01-08 Thread Sylvain Lebresne
I don't think Cassandra will complain if the cassandra/data/keyspace
directory exists when you create the keyspace, so you can just create your
symlinks first and move on. You don't have to do the start C*, create
keyspace, stop C*, move directory dance.

Other than that, I would probably just mount my volumes directly at the
cassandra/data/keyspace directories instead of using symlinks, but you're
probably fine with symlinks if you really prefer them.
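A minimal sketch of the symlink-first approach; all paths here are
illustrative stand-ins (substitute your real data_file_directories and
SSD mount point):

```shell
# Create the SSD-backed directory and link it into the data directory
# *before* issuing CREATE KEYSPACE; demo paths stand in for real ones.
DATA=/tmp/demo/cassandra/data      # stand-in for /var/lib/cassandra/data
SSD=/tmp/demo/ssd/fast_keyspace    # stand-in for the SSD-backed volume
mkdir -p "$DATA" "$SSD"
ln -sfn "$SSD" "$DATA/fast_keyspace"
ls -ld "$DATA/fast_keyspace"
```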

--
Sylvain




On Tue, Jan 7, 2014 at 4:05 PM, Robert Wille rwi...@fold3.com wrote:

 The obvious (but painful) way to do that would be to create the keyspace,
 and then repeat the following for each node: shut down the node, move
 cassandra/data/keyspace to the other volume, create a symlink in its
 place, restart the node.

 Is there a better way?

 Robert

 From: Tupshin Harper tups...@tupshin.com
 Reply-To: user@cassandra.apache.org
 Date: Tuesday, January 7, 2014 at 6:07 AM
 To: user@cassandra.apache.org
 Subject: Re: Keyspaces on different volumes

 That is a fine option and can make perfect sense if you have keyspaces
 with very different runtime characteristics.

 -Tupshin
 On Jan 7, 2014 7:30 AM, Robert Wille rwi...@fold3.com wrote:

 I’d like to have my keyspaces on different volumes, so that some can be
 on SSD and others on spinning disk. Is such a thing possible or advisable?




Re: nodetool cleanup / TTL

2014-01-08 Thread Sylvain Lebresne


 Is there some other mechanism for forcing expired data to be removed
 without also compacting? (major compaction having obvious problematic side
 effects, and user defined compaction being significant work to script up).


An online scrub will, as a side effect, purge expired tombstones *when
possible* (even expired data cannot be removed if it might overwrite
older data in some sstable other than the one being scrubbed). Please don't
take that as me saying that this is a guarantee of scrub: it is just a
side effect of the current implementation and it might very well change
tomorrow.

--
Sylvain


OOM after some days related to RunnableScheduledFuture and meter persistance

2014-01-08 Thread Desimpel, Ignace
Hi,

On linux and cassandra version 2.0.2 I had an OOM after a heavy load and then
some 15 days of idle running (not exactly idle, but very, very low activity).
Two out of a 4 machine cluster had this OOM.

I checked the heap dump (9GB) and it tells me:

One instance of java.util.concurrent.ScheduledThreadPoolExecutor loaded by 
system class loader occupies 8.927.175.368 (94,53%) bytes. The instance is 
referenced by org.apache.cassandra.io.sstable.SSTableReader @ 0x7fadf89e0 , 
loaded by sun.misc.Launcher$AppClassLoader @ 0x683e6ad30. The memory is 
accumulated in one instance of java.util.concurrent.RunnableScheduledFuture[] 
loaded by system class loader.

So I checked the SSTableReader instance and found that the
'ScheduledThreadPoolExecutor syncExecutor' object is holding about 600k
ScheduledFutureTasks.
According to the SSTableReader code, these tasks must have been created by
the syncExecutor.scheduleAtFixedRate call. That suggests that none of these
tasks ever gets executed, because some (and only one) initial task is
probably blocking.
But then again, the one thread that executes these tasks seems to be in a
'normal' state (at the time of the OOM) and is executing with the stack
trace pasted below:

Thread 0x696777eb8
  at 
org.apache.cassandra.db.AtomicSortedColumns$1.create(Lorg/apache/cassandra/config/CFMetaData;Z)Lorg/apache/cassandra/db/AtomicSortedColumns;
 (AtomicSortedColumns.java:58)
  at 
org.apache.cassandra.db.AtomicSortedColumns$1.create(Lorg/apache/cassandra/config/CFMetaData;Z)Lorg/apache/cassandra/db/ColumnFamily;
 (AtomicSortedColumns.java:55)
  at 
org.apache.cassandra.db.ColumnFamily.cloneMeShallow(Lorg/apache/cassandra/db/ColumnFamily$Factory;Z)Lorg/apache/cassandra/db/ColumnFamily;
 (ColumnFamily.java:70)
  at 
org.apache.cassandra.db.Memtable.resolve(Lorg/apache/cassandra/db/DecoratedKey;Lorg/apache/cassandra/db/ColumnFamily;Lorg/apache/cassandra/db/index/SecondaryIndexManager$Updater;)V
 (Memtable.java:187)
  at 
org.apache.cassandra.db.Memtable.put(Lorg/apache/cassandra/db/DecoratedKey;Lorg/apache/cassandra/db/ColumnFamily;Lorg/apache/cassandra/db/index/SecondaryIndexManager$Updater;)V
 (Memtable.java:158)
  at 
org.apache.cassandra.db.ColumnFamilyStore.apply(Lorg/apache/cassandra/db/DecoratedKey;Lorg/apache/cassandra/db/ColumnFamily;Lorg/apache/cassandra/db/index/SecondaryIndexManager$Updater;)V
 (ColumnFamilyStore.java:840)
  at 
org.apache.cassandra.db.Keyspace.apply(Lorg/apache/cassandra/db/RowMutation;ZZ)V
 (Keyspace.java:373)
  at 
org.apache.cassandra.db.Keyspace.apply(Lorg/apache/cassandra/db/RowMutation;Z)V 
(Keyspace.java:338)
  at org.apache.cassandra.db.RowMutation.apply()V (RowMutation.java:201)
  at 
org.apache.cassandra.cql3.statements.ModificationStatement.executeInternal(Lorg/apache/cassandra/service/QueryState;)Lorg/apache/cassandra/transport/messages/ResultMessage;
 (ModificationStatement.java:477)
  at 
org.apache.cassandra.cql3.QueryProcessor.processInternal(Ljava/lang/String;)Lorg/apache/cassandra/cql3/UntypedResultSet;
 (QueryProcessor.java:178)
  at 
org.apache.cassandra.db.SystemKeyspace.persistSSTableReadMeter(Ljava/lang/String;Ljava/lang/String;ILorg/apache/cassandra/metrics/RestorableMeter;)V
 (SystemKeyspace.java:938)
  at org.apache.cassandra.io.sstable.SSTableReader$2.run()V 
(SSTableReader.java:342)
  at java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object; 
(Executors.java:471)
  at java.util.concurrent.FutureTask.runAndReset()Z (FutureTask.java:304)
  at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Ljava/util/concurrent/ScheduledThreadPoolExecutor$ScheduledFutureTask;)Z
 (ScheduledThreadPoolExecutor.java:178)
  at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V 
(ScheduledThreadPoolExecutor.java:293)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V
 (ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run()V 
(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run()V (Thread.java:724)


Since each of these tasks is throttled by meterSyncThrottle.acquire(), I
suspect that the RateLimiter is causing a delay. The RateLimiter instance
attributes are:
Type|Name|Value
long|nextFreeTicketMicros|3016022567383
double|maxPermits|100.0
double|storedPermits|99.0
long|offsetNanos|334676357831746

I guess that these attributes practically result in blocking behavior,
resulting in the OOM ...

Can someone make sense of this?
I hope this helps in finding the reason, and maybe it can be avoided in the
future. I still have the heap dump, so I can always provide more information
if needed.

Regards,

Ignace Desimpel


[JOB] - Full time opportunity in San Francisco bay area

2014-01-08 Thread Gnani Balaraman
We have a full time perm opportunity with a reputable client in the San
Francisco bay area. Looking for good Cassandra and Java/ J2EE skills.



Should you be interested, please reply with your resume. Will call to
discuss more.



Thanks,


Gnani Balaraman | EthicalSoft, Inc.
2570 N 1st St, Ste 200, San Jose CA 95131 | (408) 329-0351
gn...@ethicalsoft.com | gn...@intelliswift.com


nodetool repair stalled

2014-01-08 Thread Paolo Crosato

Hi,

I have two nodes with Cassandra 2.0.3 where repair sessions hang for an
indefinite time. I'm running nodetool repair once a week on every node,
on different days. Currently I have about 4 repair sessions running on
each node, one of them for 3 weeks, and none has finished.
Reading the logs I didn't find any exception; apparently one of the
repair sessions got stuck after this log line:


 INFO [AntiEntropySessions:10] 2014-01-05 01:00:02,804 RepairJob.java 
(line 116) [repair #5385ea40-759c-11e3-93dc-a1357a0d9222] requesting 
merkle trees for events (to [/10.255.235.19, /10.255.235.18])


Has anybody any suggestion on why a nodetool repair might be stuck and 
how to debug it?


Regards,

Paolo Crosato



Re: nodetool repair stalled

2014-01-08 Thread Robert Coli
On Wed, Jan 8, 2014 at 8:52 AM, Paolo Crosato paolo.cros...@targaubiest.com
 wrote:

 I have two nodes with Cassandra 2.0.3 where repair sessions hang for an
 indefinite time. I'm running nodetool repair once a week on every node, on
 different days. Currently I have about 4 repair sessions running on each
 node, one of them for 3 weeks, and none has finished.
 Reading the logs I didn't find any exception; apparently one of the repair
 sessions got stuck after this log line:

 Has anybody any suggestion on why a nodetool repair might be stuck and how
 to debug it?


Cassandra repair has never quite worked right. It got a wholesale re-write
in 2.0.x and should be more robust, and at the very least log more than
before. But unfortunately I have heard a few reports like yours, so it is
probably not completely fixed.

That said, the only option you have for failed repairs seems to be to
restart the affected nodes. Your input as an operator of 2.0.x who would
appreciate an alternative is welcome at:

https://issues.apache.org/jira/browse/CASSANDRA-3486

=Rob


Re: upgrade from cassandra 1.2.3 -> 1.2.13 + start using SSL

2014-01-08 Thread Robert Coli
On Wed, Jan 8, 2014 at 1:17 AM, Jiri Horky ho...@avast.com wrote:

 I am specifically interested in whether it is possible to upgrade just one
 node and keep it running like that for some time, i.e. whether the gossip
 protocol is compatible in both directions. We are a bit afraid to
 upgrade all nodes to 1.2.13 at once in case we would need to roll back.


This is not officially supported. It will probably work for these
particular versions, but it is not recommended.

The most serious potential issue is an inability to replace the new node if
it fails. There's also the problem of not being able to repair until you're
back on the same versions. And other, similar, undocumented edge cases...

=Rob


Re: OOM after some days related to RunnableScheduledFuture and meter persistance

2014-01-08 Thread Tyler Hobbs
I believe this is https://issues.apache.org/jira/browse/CASSANDRA-6358,
which was fixed in 2.0.3.


On Wed, Jan 8, 2014 at 7:15 AM, Desimpel, Ignace ignace.desim...@nuance.com
 wrote:

  Hi,



 On linux and cassandra version 2.0.2 I had an OOM after a heavy load and
 then some 15 days of idle running (not exactly idle, but very, very low
 activity).

 Two out of a 4 machine cluster had this OOM.



 [...]

Re: Gotchas when creating a lot of tombstones

2014-01-08 Thread Tyler Hobbs
On Wed, Jan 1, 2014 at 7:53 AM, Robert Wille rwi...@fold3.com wrote:


 Also, for this application, it would be quite reasonable to set gc grace
 seconds to 0 for these tables. Zombie data wouldn’t really be a problem.
 The background process that cleans up orphaned browse structures would
 simply re-delete any deleted data that reappeared.


If you can set gc grace to 0, that will basically eliminate your tombstone
concerns entirely, so I would suggest that.
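For reference, gc grace is a per-table setting; an illustrative CQL statement
(the table name is made up for this sketch):

```sql
-- Illustrative sketch: disable the tombstone grace period for one table.
-- Safe only if resurrected ("zombie") data is acceptable, as discussed above.
ALTER TABLE browse.nodes WITH gc_grace_seconds = 0;
```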


-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: cassandra monitoring

2014-01-08 Thread Nick Bailey
 Install Errored: Failure installing agent on beta.jokefire.com. Error
 output: /var/lib/opscenter/ssl/agentKeyStore.pem: No such file or directory
 Exit code: 1


This indicates that there was a problem generating ssl files when OpsCenter
first started up. I would check the log around the first time you started
opscenter for errors. Another option would be to disable ssl communication
between OpsCenter and the agents.

http://www.datastax.com/documentation/opscenter/4.0/webhelp/index.html#opsc/configure/opscConfigSSL_g.html



 I was wondering where I could go from here. Also I would like to password
 protect my OpsCenter installation (assuming I can ever get any useful data
 into it). Are there any docs on how I can do that?


http://www.datastax.com/documentation/opscenter/4.0/webhelp/index.html#opsc/configure/opscConfigureUserAccess_c.html#opscConfigureUserAccess


Re: Gotchas when creating a lot of tombstones

2014-01-08 Thread sankalp kohli
With leveled compaction, you will have some data that cannot be reclaimed
even with gc grace = 0, because it has not been compacted yet. For this you
might want to look at the tombstone_threshold compaction subproperty.
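An illustrative CQL sketch of setting that subproperty (the table name and
ratio are made up; 0.2 is commonly cited as the default threshold):

```sql
-- Illustrative sketch: ask leveled compaction to single-sstable-compact any
-- sstable whose estimated droppable-tombstone ratio exceeds 20%.
ALTER TABLE browse.nodes WITH compaction = {
    'class': 'LeveledCompactionStrategy',
    'tombstone_threshold': '0.2'
};
```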


On Wed, Jan 8, 2014 at 10:31 AM, Tyler Hobbs ty...@datastax.com wrote:



 If you can set gc grace to 0, that will basically eliminate your tombstone
 concerns entirely, so I would suggest that.


 --
 Tyler Hobbs
 DataStax http://datastax.com/



Latest Stable version of cassandra in production

2014-01-08 Thread Sanjeeth Kumar
Hi all,
  What is the latest stable version of cassandra you have in production?
We are migrating a large chunk of our mysql database to cassandra. I see a
lot of discussion regarding 1.* versions, but I have not seen / could not
find discussions about using 2.* versions in production. Any
suggestions based on your experience?

- Sanjeeth


Re: nodetool repair stalled

2014-01-08 Thread sankalp kohli
Hi,
Can you attach the logs from around the repair? Please do that for the node
which triggered it and the nodes involved in the repair. I will try to find
something useful.
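A quick way to pull just the repair-related lines out of a node's system log
before attaching it; the demo below writes a stand-in log file so the
commands are self-contained (on a real node, point LOG at
/var/log/cassandra/system.log, and treat the grep patterns as a starting
point, not an exhaustive filter):

```shell
# Stand-in log file with one repair line and one unrelated line.
LOG=/tmp/demo_system.log
printf '%s\n' \
  'INFO [AntiEntropySessions:10] 2014-01-05 01:00:02,804 RepairJob.java (line 116) [repair #5385ea40-759c-11e3-93dc-a1357a0d9222] requesting merkle trees for events' \
  'INFO [GossipStage:1] 2014-01-05 01:00:03,001 Gossiper.java unrelated line' \
  > "$LOG"
# Keep only repair/anti-entropy traffic; this is the useful part to attach.
grep -E 'repair #|merkle|AntiEntropy' "$LOG"
```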

Thanks,
Sankalp


On Wed, Jan 8, 2014 at 10:18 AM, Robert Coli rc...@eventbrite.com wrote:

 On Wed, Jan 8, 2014 at 8:52 AM, Paolo Crosato 
 paolo.cros...@targaubiest.com wrote:

 I have two nodes with Cassandra 2.0.3 where repair sessions hang for an
 indefinite time. I'm running nodetool repair once a week on every node, on
 different days. Currently I have about 4 repair sessions running on each
 node, one of them for 3 weeks, and none has finished.
 Reading the logs I didn't find any exception; apparently one of the
 repair sessions got stuck after this log line:

 Has anybody any suggestion on why a nodetool repair might be stuck and
 how to debug it?


 [...]