Re: replaced node keeps returning in gossip

2012-10-19 Thread Thomas van Neerijnen
Hi

When I sent the mail I'd had the new node on for about an hour, the old
node died about an hour before that.
The weirdness in the log files stopped yesterday afternoon, about 4 or 5
hours after I replaced the node, so it seems to have resolved itself.
Seeing as there's no error to look at in the log files I'm not sure if you
still want the output of my gossipinfo, but I've pasted it below anyway.
Thanks!

/10.16.96.212
  LOAD:7.8018521345E10
  RPC_ADDRESS:0.0.0.0
  RELEASE_VERSION:1.0.11
  STATUS:NORMAL,113427455640312821154458202477256070484
  SCHEMA:9b152e00-fd90-11e1--2d22988ca597
/10.16.128.211
  LOAD:7.8416250275E10
  RPC_ADDRESS:0.0.0.0
  RELEASE_VERSION:1.0.11
  STATUS:NORMAL,85070591730234615865843651857942052863
  SCHEMA:9b152e00-fd90-11e1--2d22988ca597
/10.16.32.210
  LOAD:1.29054735121E11
  RPC_ADDRESS:0.0.0.0
  RELEASE_VERSION:1.0.11
  STATUS:NORMAL,56713727820156407428984779325531226112
  SCHEMA:9b152e00-fd90-11e1--2d22988ca597
/10.16.32.211
  LOAD:7.2937725831E10
  RPC_ADDRESS:0.0.0.0
  RELEASE_VERSION:1.0.11
  STATUS:NORMAL,141784319550391032739561396922763706368
  SCHEMA:9b152e00-fd90-11e1--2d22988ca597
ip-10-16-128-197.localdomain/10.16.128.197
  LOAD:6.5571879526E10
  RPC_ADDRESS:0.0.0.0
  RELEASE_VERSION:1.0.11
  STATUS:NORMAL,0
  SCHEMA:9b152e00-fd90-11e1--2d22988ca597
/10.16.96.211
  LOAD:1.0633383453E11
  RPC_ADDRESS:0.0.0.0
  RELEASE_VERSION:1.0.11
  STATUS:NORMAL,28356863910078203714492389662765613056
  SCHEMA:9b152e00-fd90-11e1--2d22988ca597
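
For reference, the LOAD figures are on-disk bytes in scientific notation;
for example the first node above is carrying roughly:

  7.8018521345E10 bytes ≈ 78.0 GB (decimal) ≈ 72.7 GiB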


On Fri, Oct 19, 2012 at 2:56 AM, aaron morton aa...@thelastpickle.com wrote:

 I replaced it with a new node, IP 10.16.128.197 and again token 0 with a
 -Dcassandra.replace_token=0 at startup

 Good Good.

 How long ago did you bring the new node on ? There is a fail safe to
 remove 128.210 after 3 days if it does not gossip to other nodes.

 I *thought* that remove_token would remove the old IP from the ring. Can
 you post the output from nodetool gossipinfo from the 128.197 node ?

 Thanks

 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 19/10/2012, at 2:44 AM, Thomas van Neerijnen t...@bossastudios.com
 wrote:

 Hi all

 I'm running Cassandra 1.0.11 on Ubuntu 11.10.

 I've got a ghost node which keeps showing up on my ring.

 A node living on IP 10.16.128.210 and token 0 died and had to be replaced.
 I replaced it with a new node, IP 10.16.128.197 and again token 0 with a
 -Dcassandra.replace_token=0 at startup. This all went well but now I'm
 seeing the following weirdness constantly reported in the log files around
 the ring:

  INFO [GossipTasks:1] 2012-10-18 13:39:22,441 Gossiper.java (line 632)
 FatClient /10.16.128.210 has been silent for 3ms, removing from gossip
  INFO [GossipStage:1] 2012-10-18 13:40:25,933 Gossiper.java (line 838)
 Node /10.16.128.210 is now part of the cluster
  INFO [GossipStage:1] 2012-10-18 13:40:25,934 Gossiper.java (line 804)
 InetAddress /10.16.128.210 is now UP
  INFO [GossipStage:1] 2012-10-18 13:40:25,937 StorageService.java (line
 1017) Nodes /10.16.128.210 and /10.16.128.197 have the same token 0.
 Ignoring /10.16.128.210
  INFO [GossipTasks:1] 2012-10-18 13:40:37,509 Gossiper.java (line 818)
 InetAddress /10.16.128.210 is now dead.
  INFO [GossipTasks:1] 2012-10-18 13:40:56,526 Gossiper.java (line 632)
 FatClient /10.16.128.210 has been silent for 3ms, removing from gossip
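
For reference, the replacement amounts to starting the new, empty node with
the dead node's token. A minimal sketch assuming the Debian packages and
default paths (and note the node must not be in its own seed list, as the
next thread shows):

# on the replacement node, before its first start:
echo 'JVM_OPTS="$JVM_OPTS -Dcassandra.replace_token=0"' >> /etc/cassandra/cassandra-env.sh
service cassandra start
# once the node has joined, remove the extra JVM_OPTS line again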





unexpected behaviour on seed nodes when using -Dcassandra.replace_token

2012-10-19 Thread Thomas van Neerijnen
Hi all

I recently tried to replace a dead node using
-Dcassandra.replace_token=token, which so far has been good to me.
However on one of my nodes this option was ignored and the node simply
picked a different token to live at and started up there.

It was a foolish mistake on my part because it was set as a seed node,
which results in this error in the log file:
INFO [main] 2012-10-19 12:41:00,886 StorageService.java (line 518) This
node will not auto bootstrap because it is configured to be
 a seed node.
but it seems a little scary that this means it'll just ignore the fact
that you want to replace a token and put itself somewhere else in the
cluster. Surely it should behave similarly to trying to replace a live node
and throw some kind of exception?
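
A cheap pre-flight check for the same trap, assuming the default config
path: a node listed in its own seeds will skip bootstrap and with it
replace_token, so verify before starting:

# does this node's own IP appear in the seed list?
grep -A 3 seed_provider /etc/cassandra/cassandra.yaml
# if it does, drop it from the seeds for the duration of the replacement
# and restore it once the node has joined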


replaced node keeps returning in gossip

2012-10-18 Thread Thomas van Neerijnen
Hi all

I'm running Cassandra 1.0.11 on Ubuntu 11.10.

I've got a ghost node which keeps showing up on my ring.

A node living on IP 10.16.128.210 and token 0 died and had to be replaced.
I replaced it with a new node, IP 10.16.128.197 and again token 0 with a
-Dcassandra.replace_token=0 at startup. This all went well but now I'm
seeing the following weirdness constantly reported in the log files around
the ring:

 INFO [GossipTasks:1] 2012-10-18 13:39:22,441 Gossiper.java (line 632)
FatClient /10.16.128.210 has been silent for 3ms, removing from gossip
 INFO [GossipStage:1] 2012-10-18 13:40:25,933 Gossiper.java (line 838) Node
/10.16.128.210 is now part of the cluster
 INFO [GossipStage:1] 2012-10-18 13:40:25,934 Gossiper.java (line 804)
InetAddress /10.16.128.210 is now UP
 INFO [GossipStage:1] 2012-10-18 13:40:25,937 StorageService.java (line
1017) Nodes /10.16.128.210 and /10.16.128.197 have the same token 0.
Ignoring /10.16.128.210
 INFO [GossipTasks:1] 2012-10-18 13:40:37,509 Gossiper.java (line 818)
InetAddress /10.16.128.210 is now dead.
 INFO [GossipTasks:1] 2012-10-18 13:40:56,526 Gossiper.java (line 632)
FatClient /10.16.128.210 has been silent for 3ms, removing from gossip


Re: java.lang.NoClassDefFoundError when trying to do anything on one CF on one node

2012-09-10 Thread Thomas van Neerijnen
Hi Aaron

Removing NodeIdInfo did the trick, thanks. I see the ticket is already
resolved, good news.

Thanks for the help.
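
For anyone hitting the same assertion, Aaron's workaround below boils down
to something like this (a sketch assuming the default data directory;
adjust the paths to your install):

$ nodetool -h localhost drain
$ service cassandra stop
$ rm /var/lib/cassandra/data/system/NodeIdInfo-*
$ service cassandra start
# if you also remove LocationInfo-*, the token is re-read from
# cassandra.yaml on restart, so double-check it first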

On Fri, Sep 7, 2012 at 12:26 AM, aaron morton aa...@thelastpickle.com wrote:

 This is a problem…

 [default@system] list NodeIdInfo ;
 Using default limit of 100
 ...

 ---
 RowKey: 43757272656e744c6f63616c
 = (column=01efa5d0-e133-11e1--51be601cd0ff, value=0a1020d2,
 timestamp=1344414498989)
 = (column=92109b80-ea0a-11e1--51be601cd0af, value=0a1020d2,
 timestamp=1345386691897)


 There is an assertion that we should only have one column for the
 CurrentLocal row. The exception is occurring in a static class member
 initialisation and so is getting lost.

 I'm not sure how two ended up there.

 Deleting the NodeIdInfo CF SSTables should fix it.

 I created https://issues.apache.org/jira/browse/CASSANDRA-4626 can you
 please add more information there if you can and/or watch the ticket in case
 there are other questions.

 Thanks


 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 5/09/2012, at 10:18 PM, Thomas van Neerijnen t...@bossastudios.com
 wrote:

 forgot to answer your first question. I see this:
 INFO 14:31:31,896 No saved local node id, using newly generated:
 92109b80-ea0a-11e1--51be601cd0af


 On Wed, Sep 5, 2012 at 8:41 AM, Thomas van Neerijnen t...@bossastudios.com
  wrote:

 Thanks for the help Aaron.
 I've checked NodeIdInfo and LocationInfo as below.
 What am I looking at? I'm guessing the first row in NodeIdInfo represents
 the ring with the node ids, but the second row perhaps dead nodes with old
 schemas? That's a total guess, I'd be very interested to know what it and
 the LocationInfo are.
 If there's anything else you'd like me to check let me know, otherwise
 I'll attempt your workaround later today.

 [default@system] list NodeIdInfo ;
 Using default limit of 100
 ---
 RowKey: 4c6f63616c
 = (column=b10552c0-ea0f-11e0--cb1f02ccbcff, value=0a1020d2,
 timestamp=1317241393645)
 = (column=e64fc8f0-595b-11e1--51be601cd0d7, value=0a1020d2,
 timestamp=1329478703871)
 = (column=732d4690-a596-11e1--51be601cd09f, value=0a1020d2,
 timestamp=1337860139385)
 = (column=bffd9d40-aa45-11e1--51be601cd0fe, value=0a1020d2,
 timestamp=1338375234836)
 = (column=01efa5d0-e133-11e1--51be601cd0ff, value=0a1020d2,
 timestamp=1344414498989)
 = (column=92109b80-ea0a-11e1--51be601cd0af, value=0a1020d2,
 timestamp=1345386691897)
 ---
 RowKey: 43757272656e744c6f63616c
 = (column=01efa5d0-e133-11e1--51be601cd0ff, value=0a1020d2,
 timestamp=1344414498989)
 = (column=92109b80-ea0a-11e1--51be601cd0af, value=0a1020d2,
 timestamp=1345386691897)

 2 Rows Returned.
 Elapsed time: 128 msec(s).
 [default@system] list LocationInfo ;
 Using default limit of 100
 ---
 RowKey: 52696e67
 = (column=00, value=0a1080d2, timestamp=134104900)
 = (column=04a7128b6c83505dcd618720f92028f4, value=0a1020b7,
 timestamp=1332360971660)
 = (column=09249249249249249249249249249249, value=0a1080cd,
 timestamp=1341136002862)
 = (column=12492492492492492492492492492492, value=0a1020d3,
 timestamp=1341135999465)
 = (column=1500, value=0a1060d3,
 timestamp=134104671)
 = (column=1555, value=0a1020d3,
 timestamp=1344530188382)
 = (column=1b6db6db6db6db6db6db6db6db6db6db, value=0a1020b1,
 timestamp=1341135997643)
 = (column=1c71c71c71c71bff, value=0a1080d2,
 timestamp=1317241889689)
 = (column=24924924924924924924924924924924, value=0a1060d3,
 timestamp=1341135996555)
 = (column=29ff, value=0a1020d3,
 timestamp=1317241534292)
 = (column=2aaa, value=0a1060d3,
 timestamp=1344530187539)
 = (column=38e38e38e38e37ff, value=0a1060d3,
 timestamp=1317241257569)
 = (column=38e38e38e38e38e38e38e38e38e38e38, value=0a1060d3,
 timestamp=1343136501647)
 = (column=393170e0207a17d8519f0c1bfe325d51, value=0a1020d3,
 timestamp=1345381375120)
 = (column=3fff, value=0a1080d3,
 timestamp=134104939)
 = (column=471c71c71c71c71c71c71c71c71c71c6, value=0a1080d3,
 timestamp=1343133153701)
 = (column=471c71c71c71c7ff, value=0a1080d3,
 timestamp=1317241786636)
 = (column=49249249249249249249249249249249, value=0a1080d3,
 timestamp=1341136002693)
 = (column=52492492492492492492492492492492, value=0a106010,
 timestamp=1341136002626)
 = (column=53ff, value=0a1020d4,
 timestamp=1328473688357)
 = (column=5554, value=0a1060d4,
 timestamp=134104910)
 = (column=5b6db6db6db6db6db6db6db6db6db6da, value=0a1060d4,
 timestamp=1332389784945)
 = (column=5b6db6db6db6db6db6db6db6db6db6db, value=0a1060d4,
 timestamp=1341136001027)
 = (column=638e38e38e38e38e38e38e38e38e38e2, value=0a1060d4,
 timestamp=1343125208462)
 = (column

Re: java.lang.NoClassDefFoundError when trying to do anything on one CF on one node

2012-09-05 Thread Thomas van Neerijnen
 of the NodeIdInfo
 and LocationInfo CF's from system ?

 I *think* a work around may be to:

 * stop the node
 * remove the LocationInfo and NodeIdInfo CFs.
 * restart

 Note this will read the token from the yaml file again, so make sure it's
 right.

  cheers

 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 4/09/2012, at 9:51 PM, Thomas van Neerijnen t...@bossastudios.com
 wrote:

 Hi

 I have a single node in a 6 node Cassandra 1.0.11 cluster that seems to
 have a single column family in a weird state.

 Repairs, upgradesstables, anything that touches this CF crashes.
 I've drained the node, removed every file for this CF from said node,
 removed the commit log, started it up and as soon as data is written to
 this CF on this node I'm in the same situation again. Anyone have any
 suggestions for how to fix this?
 I'm tempted to remove the node and re-add it but I was hoping for
 something a little less disruptive.

 $ nodetool -h localhost upgradesstables Player PlayerCounters
 Error occured while upgrading the sstables for keyspace Player
 java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError:
 Could not initialize class org.apache.cassandra.utils.NodeId$LocalIds
 at
 java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
 at java.util.concurrent.FutureTask.get(FutureTask.java:83)
 at
 org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:219)
 at
 org.apache.cassandra.db.compaction.CompactionManager.performSSTableRewrite(CompactionManager.java:235)
 at
 org.apache.cassandra.db.ColumnFamilyStore.sstablesRewrite(ColumnFamilyStore.java:999)
 at
 org.apache.cassandra.service.StorageService.upgradeSSTables(StorageService.java:1652)
 at sun.reflect.GeneratedMethodAccessor59.invoke(Unknown Source)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
 at
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
 at
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
 at
 com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
 at
 com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
 at
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
 at
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
 at
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
 at
 javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
 at
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
 at
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
 at
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
 at sun.reflect.GeneratedMethodAccessor58.invoke(Unknown Source)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at
 sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:303)
 at sun.rmi.transport.Transport$1.run(Transport.java:159)
 at java.security.AccessController.doPrivileged(Native Method)
 at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
 at
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
 at
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
 at
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.NoClassDefFoundError: Could not initialize class
 org.apache.cassandra.utils.NodeId$LocalIds
 at org.apache.cassandra.utils.NodeId.localIds(NodeId.java:49)
 at
 org.apache.cassandra.utils.NodeId.getOldLocalNodeIds(NodeId.java:79)
 at
 org.apache.cassandra.db.CounterColumn.computeOldShardMerger(CounterColumn.java:251)
 at
 org.apache.cassandra.db.CounterColumn.mergeAndRemoveOldShards(CounterColumn.java:297)
 at
 org.apache.cassandra.db.CounterColumn.mergeAndRemoveOldShards(CounterColumn.java:271)
 at
 org.apache.cassandra.db.compaction.PrecompactedRow.removeDeletedAndOldShards(PrecompactedRow.java:81

Re: java.lang.NoClassDefFoundError when trying to do anything on one CF on one node

2012-09-05 Thread Thomas van Neerijnen
forgot to answer your first question. I see this:
INFO 14:31:31,896 No saved local node id, using newly generated:
92109b80-ea0a-11e1--51be601cd0af


On Wed, Sep 5, 2012 at 8:41 AM, Thomas van Neerijnen
t...@bossastudios.com wrote:

 Thanks for the help Aaron.
 I've checked NodeIdInfo and LocationInfo as below.
 What am I looking at? I'm guessing the first row in NodeIdInfo represents
 the ring with the node ids, but the second row perhaps dead nodes with old
 schemas? That's a total guess, I'd be very interested to know what it and
 the LocationInfo are.
 If there's anything else you'd like me to check let me know, otherwise
 I'll attempt your workaround later today.

 [default@system] list NodeIdInfo ;
 Using default limit of 100
 ---
 RowKey: 4c6f63616c
 = (column=b10552c0-ea0f-11e0--cb1f02ccbcff, value=0a1020d2,
 timestamp=1317241393645)
 = (column=e64fc8f0-595b-11e1--51be601cd0d7, value=0a1020d2,
 timestamp=1329478703871)
 = (column=732d4690-a596-11e1--51be601cd09f, value=0a1020d2,
 timestamp=1337860139385)
 = (column=bffd9d40-aa45-11e1--51be601cd0fe, value=0a1020d2,
 timestamp=1338375234836)
 = (column=01efa5d0-e133-11e1--51be601cd0ff, value=0a1020d2,
 timestamp=1344414498989)
 = (column=92109b80-ea0a-11e1--51be601cd0af, value=0a1020d2,
 timestamp=1345386691897)
 ---
 RowKey: 43757272656e744c6f63616c
 = (column=01efa5d0-e133-11e1--51be601cd0ff, value=0a1020d2,
 timestamp=1344414498989)
 = (column=92109b80-ea0a-11e1--51be601cd0af, value=0a1020d2,
 timestamp=1345386691897)

 2 Rows Returned.
 Elapsed time: 128 msec(s).
 [default@system] list LocationInfo ;
 Using default limit of 100
 ---
 RowKey: 52696e67
 = (column=00, value=0a1080d2, timestamp=134104900)
 = (column=04a7128b6c83505dcd618720f92028f4, value=0a1020b7,
 timestamp=1332360971660)
 = (column=09249249249249249249249249249249, value=0a1080cd,
 timestamp=1341136002862)
 = (column=12492492492492492492492492492492, value=0a1020d3,
 timestamp=1341135999465)
 = (column=1500, value=0a1060d3,
 timestamp=134104671)
 = (column=1555, value=0a1020d3,
 timestamp=1344530188382)
 = (column=1b6db6db6db6db6db6db6db6db6db6db, value=0a1020b1,
 timestamp=1341135997643)
 = (column=1c71c71c71c71bff, value=0a1080d2,
 timestamp=1317241889689)
 = (column=24924924924924924924924924924924, value=0a1060d3,
 timestamp=1341135996555)
 = (column=29ff, value=0a1020d3,
 timestamp=1317241534292)
 = (column=2aaa, value=0a1060d3,
 timestamp=1344530187539)
 = (column=38e38e38e38e37ff, value=0a1060d3,
 timestamp=1317241257569)
 = (column=38e38e38e38e38e38e38e38e38e38e38, value=0a1060d3,
 timestamp=1343136501647)
 = (column=393170e0207a17d8519f0c1bfe325d51, value=0a1020d3,
 timestamp=1345381375120)
 = (column=3fff, value=0a1080d3,
 timestamp=134104939)
 = (column=471c71c71c71c71c71c71c71c71c71c6, value=0a1080d3,
 timestamp=1343133153701)
 = (column=471c71c71c71c7ff, value=0a1080d3,
 timestamp=1317241786636)
 = (column=49249249249249249249249249249249, value=0a1080d3,
 timestamp=1341136002693)
 = (column=52492492492492492492492492492492, value=0a106010,
 timestamp=1341136002626)
 = (column=53ff, value=0a1020d4,
 timestamp=1328473688357)
 = (column=5554, value=0a1060d4,
 timestamp=134104910)
 = (column=5b6db6db6db6db6db6db6db6db6db6da, value=0a1060d4,
 timestamp=1332389784945)
 = (column=5b6db6db6db6db6db6db6db6db6db6db, value=0a1060d4,
 timestamp=1341136001027)
 = (column=638e38e38e38e38e38e38e38e38e38e2, value=0a1060d4,
 timestamp=1343125208462)
 = (column=638e38e38e38e3ff, value=0a1060d4,
 timestamp=1317241257577)
 = (column=6c00, value=0a1020d3,
 timestamp=134104789)
 ---
 RowKey: 4c
 = (column=436c75737465724e616d65,
 value=4d6f6e737465724d696e642050726f6420436c7573746572,
 timestamp=1317241251097000)
 = (column=47656e65726174696f6e, value=50447e78,
 timestamp=134104152000)
 = (column=50617274696f6e6572,
 value=6f72672e6170616368652e63617373616e6472612e6468742e52616e646f6d506172746974696f6e6572,
 timestamp=1317241251097000)
 = (column=546f6b656e, value=2a00,
 timestamp=134104214)
 ---
 RowKey: 436f6f6b696573
 =
 (column=48696e7473207075726765642061732070617274206f6620757067726164696e672066726f6d20302e362e7820746f20302e37,
 value=6f68207965732c20697420746865792077657265207075726765642e,
 timestamp=1317241251249)
 = (column=5072652d312e302068696e747320707572676564,
 value=6f68207965732c2074686579207765726520707572676564,
 timestamp=1326274339337)
 ---
 RowKey: 426f6f747374726170
 = (column=42, value=01, timestamp=134104213)

 4 Rows Returned.
 Elapsed time: 34 msec(s).


 On Wed, Sep 5, 2012 at 2:42 AM
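
As an aside for anyone puzzling over the listings above: the system
keyspace row keys are hex-encoded ASCII and the 8-hex-digit values are
packed IPv4 addresses, so they decode with nothing fancier than xxd:

$ echo 4c6f63616c | xxd -r -p; echo
Local
$ echo 43757272656e744c6f63616c | xxd -r -p; echo
CurrentLocal
$ echo 52696e67 | xxd -r -p; echo
Ring
# and a value such as 0a1020d2 is 0a.10.20.d2, i.e. node 10.16.32.210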

java.lang.NoClassDefFoundError when trying to do anything on one CF on one node

2012-09-04 Thread Thomas van Neerijnen
Hi

I have a single node in a 6 node Cassandra 1.0.11 cluster that seems to
have a single column family in a weird state.

Repairs, upgradesstables, anything that touches this CF crashes.
I've drained the node, removed every file for this CF from said node,
removed the commit log, started it up and as soon as data is written to
this CF on this node I'm in the same situation again. Anyone have any
suggestions for how to fix this?
I'm tempted to remove the node and re-add it but I was hoping for something
a little less disruptive.

$ nodetool -h localhost upgradesstables Player PlayerCounters
Error occured while upgrading the sstables for keyspace Player
java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError:
Could not initialize class org.apache.cassandra.utils.NodeId$LocalIds
at
java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at
org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:219)
at
org.apache.cassandra.db.compaction.CompactionManager.performSSTableRewrite(CompactionManager.java:235)
at
org.apache.cassandra.db.ColumnFamilyStore.sstablesRewrite(ColumnFamilyStore.java:999)
at
org.apache.cassandra.service.StorageService.upgradeSSTables(StorageService.java:1652)
at sun.reflect.GeneratedMethodAccessor59.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
at
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
at
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
at
com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
at
com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
at
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
at
com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
at
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
at
javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
at
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
at
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
at
javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
at sun.reflect.GeneratedMethodAccessor58.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:303)
at sun.rmi.transport.Transport$1.run(Transport.java:159)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
at
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
at
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
at
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class
org.apache.cassandra.utils.NodeId$LocalIds
at org.apache.cassandra.utils.NodeId.localIds(NodeId.java:49)
at
org.apache.cassandra.utils.NodeId.getOldLocalNodeIds(NodeId.java:79)
at
org.apache.cassandra.db.CounterColumn.computeOldShardMerger(CounterColumn.java:251)
at
org.apache.cassandra.db.CounterColumn.mergeAndRemoveOldShards(CounterColumn.java:297)
at
org.apache.cassandra.db.CounterColumn.mergeAndRemoveOldShards(CounterColumn.java:271)
at
org.apache.cassandra.db.compaction.PrecompactedRow.removeDeletedAndOldShards(PrecompactedRow.java:81)
at
org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:97)
at
org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:137)
at
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:97)
at
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:82)
at
org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:207)
at

Why does a large compaction on one node affect the entire cluster?

2012-05-24 Thread Thomas van Neerijnen
Hi all

I am running Cassandra 1.0.10 installed from the Apache debs on Ubuntu
11.10 on a 7 node cluster.

I moved some tokens around my cluster and now have one node compacting a
large Leveled compaction column family. It has done about 5k out of 10k
outstanding compactions today. The other nodes have all finished.

The weird thing is when it hits a big-ish chunk to compact, for example:
pending tasks: 4555
          compaction type   keyspace   column family   bytes compacted   bytes total   progress
               Compaction     Player    PlayerDetail            213097    4286616517      0.00%
, I see heap usage on it AND all other nodes go insane.
Normal operation on all nodes is a leisurely saw-toothed climb to a CMS at
just below 3/4 heap size every 10 minutes or so.
During the big-ish compaction all nodes in the cluster CMS multiple times
in a minute, with the peaks getting close to heap size.

So my question is why does one node compacting put so much memory pressure
on all the other nodes in the cluster and ruin my day?


Cassandra CF merkle tree

2012-04-02 Thread Thomas van Neerijnen
Hi all

Is there a way I can easily retrieve a Merkle tree for a CF, like the one
created during a repair?
I didn't see anything about this in the Thrift API docs, I'm assuming this
is a data structure made available only to internal Cassandra functions.

I would like to explore using the Merkle trees as a method for data
integrity checks after config changes, version upgrades, and probably loads
of other scenarios I haven't even thought of that may result in data loss
going initially unnoticed.


Re: ReplicateOnWriteStage exception causes a backlog in MutationStage that never clears

2012-03-23 Thread Thomas van Neerijnen
The main issue turned out to be a bug in our code whereby we were writing a
lot of new columns to the same row key instead of a new row key, turning
what we expected to be a skinny rowed CF into a CF with one very, very wide
row. These writes on the single key were putting pressure on the 3 nodes
holding our replicas.
One of the replicas would eventually fail under the pressure and the rest
of the cluster would try holding hints for the bad key's writes, which would
cause the same problem on the rest of the cluster.
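
For anyone debugging something similar, one cheap way to spot an
accidentally wide row after the fact is the row-size summary in cfstats; a
sketch assuming the 1.0-era output field names:

$ nodetool -h localhost cfstats | egrep 'Column Family:|Compacted row maximum size'
# a maximum far above the mean on a CF you expected to be skinny is the tell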

On Thu, Mar 22, 2012 at 1:55 AM, Thomas van Neerijnen
t...@bossastudios.com wrote:

 Hi

 I'm going with yes to all three of your questions.

 I found a very heavily hit index which we have since reworked to remove
 the secondary index entirely.
 This fixed a large portion of the problem, but during the panic of the
 overloaded cluster we did the simple scaling out trick of doubling the
 cluster; however, in the rush two out of the 7 new nodes accidentally ended
 up on EC2 EBS volumes instead of the usual ephemeral RAID10.
 So, same error but this time all nodes reporting only the two EBS backed
 nodes as down instead of the whole cluster getting weird.
 I'm rsyncing the data off the EBS volume onto an ephemeral RAID10 array as
 I type so in the next hour or so I'll know if this fixed the issue.


 On Wed, Mar 21, 2012 at 5:24 PM, aaron morton aa...@thelastpickle.com wrote:

 The node is overloaded with hints.

 I'll just grab the comments from code…

 // avoid OOMing due to excess hints.  we need to do this
 check even for live nodes, since we can
 // still generate hints for those if it's overloaded or
 simply dead but not yet known-to-be-dead.
 // The idea is that if we have over maxHintsInProgress hints
 in flight, this is probably due to
 // a small number of nodes causing problems, so we should
 avoid shutting down writes completely to
 // healthy nodes.  Any node with no hintsInProgress is
 considered healthy.

 Are the nodes going up and down a lot ? Are they under GC pressure. The
 other possibility is that you have overloaded the cluster.

 Cheers


   -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 22/03/2012, at 3:20 AM, Thomas van Neerijnen wrote:

 Hi all

 I'm running into a weird error on Cassandra 1.0.7.
 As my cluster's load gets heavier many of the nodes seem to hit the same
 error around the same time, resulting in MutationStage backing up and never
 clearing down. The only way to recover the cluster is to kill all the nodes
 and start them up again. The error is as below and is repeated continuously
 until I kill the Cassandra process.

 ERROR [ReplicateOnWriteStage:57] 2012-03-21 14:02:05,099
 AbstractCassandraDaemon.java (line 139) Fatal exception in thread
 Thread[ReplicateOnWriteStage:57,5,main]
 java.lang.RuntimeException: java.util.concurrent.TimeoutException
 at
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1227)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.util.concurrent.TimeoutException
 at
 org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:301)
 at
 org.apache.cassandra.service.StorageProxy$7$1.runMayThrow(StorageProxy.java:544)
 at
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1223)
 ... 3 more






ReplicateOnWriteStage exception causes a backlog in MutationStage that never clears

2012-03-21 Thread Thomas van Neerijnen
Hi all

I'm running into a weird error on Cassandra 1.0.7.
As my cluster's load gets heavier many of the nodes seem to hit the same
error around the same time, resulting in MutationStage backing up and never
clearing down. The only way to recover the cluster is to kill all the nodes
and start them up again. The error is as below and is repeated continuously
until I kill the Cassandra process.

ERROR [ReplicateOnWriteStage:57] 2012-03-21 14:02:05,099
AbstractCassandraDaemon.java (line 139) Fatal exception in thread
Thread[ReplicateOnWriteStage:57,5,main]
java.lang.RuntimeException: java.util.concurrent.TimeoutException
at
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1227)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.util.concurrent.TimeoutException
at
org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:301)
at
org.apache.cassandra.service.StorageProxy$7$1.runMayThrow(StorageProxy.java:544)
at
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1223)
... 3 more


Re: ReplicateOnWriteStage exception causes a backlog in MutationStage that never clears

2012-03-21 Thread Thomas van Neerijnen
Hi

I'm going with yes to all three of your questions.

I found a very heavily hit index which we have since reworked to remove the
secondary index entirely.
This fixed a large portion of the problem, but during the panic of the
overloaded cluster we did the simple scaling out trick of doubling the
cluster; however, in the rush two out of the 7 new nodes accidentally ended
up on EC2 EBS volumes instead of the usual ephemeral RAID10.
So, same error but this time all nodes reporting only the two EBS backed
nodes as down instead of the whole cluster getting weird.
I'm rsyncing the data off the EBS volume onto an ephemeral RAID10 array as
I type so in the next hour or so I'll know if this fixed the issue.

On Wed, Mar 21, 2012 at 5:24 PM, aaron morton aa...@thelastpickle.com wrote:

 The node is overloaded with hints.

 I'll just grab the comments from code…

 // avoid OOMing due to excess hints.  we need to do this check
 even for live nodes, since we can
 // still generate hints for those if it's overloaded or simply
 dead but not yet known-to-be-dead.
 // The idea is that if we have over maxHintsInProgress hints
 in flight, this is probably due to
 // a small number of nodes causing problems, so we should
 avoid shutting down writes completely to
 // healthy nodes.  Any node with no hintsInProgress is
 considered healthy.

 Are the nodes going up and down a lot ? Are they under GC pressure. The
 other possibility is that you have overloaded the cluster.

 Cheers


 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 22/03/2012, at 3:20 AM, Thomas van Neerijnen wrote:

 Hi all

 I'm running into a weird error on Cassandra 1.0.7.
 As my cluster's load gets heavier many of the nodes seem to hit the same
 error around the same time, resulting in MutationStage backing up and never
 clearing down. The only way to recover the cluster is to kill all the nodes
 and start them up again. The error is as below and is repeated continuously
 until I kill the Cassandra process.

 ERROR [ReplicateOnWriteStage:57] 2012-03-21 14:02:05,099
 AbstractCassandraDaemon.java (line 139) Fatal exception in thread
 Thread[ReplicateOnWriteStage:57,5,main]
 java.lang.RuntimeException: java.util.concurrent.TimeoutException
 at
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1227)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.util.concurrent.TimeoutException
 at
 org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:301)
 at
 org.apache.cassandra.service.StorageProxy$7$1.runMayThrow(StorageProxy.java:544)
 at
 org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1223)
 ... 3 more





Re: Network, Compaction, Garbage collection and Cache monitoring in cassandra

2012-03-21 Thread Thomas van Neerijnen
Collectd with GenericJMX pushing data into Graphite is what we use.
You can monitor the Graphite graphs directly instead of having an extra JMX
interface on the Cassandra nodes for monitoring.
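
For the curious, the collectd side of that is the java plugin loading
GenericJMX and pointing it at Cassandra's JMX port. A trimmed sketch, with
the jar paths and MBean selection as assumptions to adapt:

LoadPlugin java
<Plugin "java">
  JVMArg "-Djava.class.path=/usr/share/collectd/java/collectd-api.jar:/usr/share/collectd/java/generic-jmx.jar"
  LoadPlugin "org.collectd.java.GenericJMX"
  <Plugin "GenericJMX">
    <MBean "heap">
      ObjectName "java.lang:type=Memory"
      <Value>
        Type "memory"
        Table true
        Attribute "HeapMemoryUsage"
      </Value>
    </MBean>
    <Connection>
      ServiceURL "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi"
      Collect "heap"
    </Connection>
  </Plugin>
</Plugin>

A Graphite writer (or a small bridge script) then ships the values on.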

On Wed, Mar 21, 2012 at 8:16 PM, Jeremiah Jordan 
jeremiah.jor...@morningstar.com wrote:

  You can also use any network/server monitoring tool which can talk to
 JMX.  We are currently using vFabric Hyperic's JMX plugin for this.

 IIRC there are some cacti and nagios scripts on github for getting the
 data into those.

 -Jeremiah


  --
 *From:* R. Verlangen [ro...@us2.nl]
 *Sent:* Wednesday, March 21, 2012 10:40 AM
 *To:* user@cassandra.apache.org
 *Subject:* Re: Network, Compaction, Garbage collection and Cache
 monitoring in cassandra

  Hi Rishabh,

  Please take a look at OpsCenter:
 http://www.datastax.com/products/opscenter

  It provides most of the details you request for.

  Good luck!

 2012/3/21 Rishabh Agrawal rishabh.agra...@impetus.co.in

  Hello,



 Can someone help me with how to proactively monitor  Network, Compaction,
 Garbage collection and Cache use in Cassandra.





 Regards

 Rishabh






Re: Single Node Cassandra Installation

2012-03-16 Thread Thomas van Neerijnen
You'll need to read or write at at least quorum to get consistent
data from the cluster, so you may as well do both.
Now that you mention it, I was wrong about downtime, with a two node
cluster reads or writes at quorum will mean both nodes need to be online.
Perhaps you could have an emergency switch in your application which flips
to consistency of 1 if one of your Cassandra servers goes down? Just make
sure it's set back to quorum when the second one returns or again you could
end up with inconsistent data.
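
For reference, the arithmetic behind that: quorum is floor(RF/2) + 1, so

  RF=2: quorum = 2 of 2 replicas (no node may be down)
  RF=3: quorum = 2 of 3 replicas (one node may be down)

which is why a two node cluster reading and writing at quorum has no
headroom, and why RF=3 is the usual starting point when availability
matters.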

On Fri, Mar 16, 2012 at 2:04 AM, Drew Kutcharian d...@venarc.com wrote:

 Thanks for the comments, I guess I will end up doing a 2 node cluster with
 replica count 2 and read consistency 1.

 -- Drew



 On Mar 15, 2012, at 4:20 PM, Thomas van Neerijnen wrote:

 So long as data loss and downtime are acceptable risks a one node cluster
 is fine.
 Personally this is usually only acceptable on my workstation, even my dev
 environment is redundant, because servers fail, usually when you least want
 them to, like for example when you've decided to save costs by waiting
 before implementing redundancy. Could a failure end up costing you more
 than you've saved? I'd rather get cheaper servers (maybe even used off
 ebay??) so I could have at least two of them.

 If you do go with a one node solution, although I haven't tried it myself
 Priam looks like a good place to start for backups, otherwise roll your own
 with incremental snapshotting turned on and a watch on the snapshot
 directory. Storage on something like S3 or Cloud Files is very cheap so
 there's no good excuse for no backups.

 On Thu, Mar 15, 2012 at 7:12 PM, R. Verlangen ro...@us2.nl wrote:

 Hi Drew,

 One other disadvantage is the lack of consistency level and
 replication. Both are part of the high availability / redundancy. So you
 would really need to backup your single-node-cluster to some other
 external location.

 Good luck!


 2012/3/15 Drew Kutcharian d...@venarc.com

 Hi,

 We are working on a project that initially is going to have very little
 data, but we would like to use Cassandra to ease the future scalability.
 Due to budget constraints, we were thinking to run a single node Cassandra
 for now and then add more nodes as required.

 I was wondering if it is recommended to run a single node cassandra in
 production? Are there any other issues besides lack of high availability?

 Thanks,

 Drew







Re: 1.0.8 with Leveled compaction - Possible issues

2012-03-15 Thread Thomas van Neerijnen
Heya

I'd suggest staying away from Leveled Compaction until 1.0.9.
For the why see this great explanation I got from Maki Watanabe on the
list:
http://mail-archives.apache.org/mod_mbox/cassandra-user/201203.mbox/%3CCALqbeQbQ=d-hORVhA-LHOo_a5j46fQrsZMm+OQgfkgR=4rr...@mail.gmail.com%3E
Keep an eye on that one because I'm busy testing one of his suggestions,
I'll post back with the results soon.

My understanding is after a change in compaction or compression, until you
run an upgradesstables on all the nodes the current sstables will have the
old schema settings, only new ones get the new format. Obviously this
compounds the issue I mentioned above tho.
Be warned, an upgradesstables can take a long time so maybe keep an eye on
the number of files around vs over 5MB to get an idea of progress. Maybe
someone else knows a better way?

You can change back and forth between compression and compaction options
quite safely, but again you need an upgradesstables to remove it from
current sstables.

In my experience I've safely applied compression and leveled compaction to
the same CF at the same time without issue so I guess it's ok:)
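
To make the sequence concrete, the switch plus rewrite looks something
like this (a sketch; CF names borrowed from earlier threads, data path
assumed to be the default):

[default@Player] UPDATE COLUMN FAMILY PlayerDetail WITH compaction_strategy=LeveledCompactionStrategy;
$ nodetool -h localhost upgradesstables Player PlayerDetail
# rough progress check: data files at the 5MB target vs still-large ones
$ find /var/lib/cassandra/data/Player -name '*-Data.db' -size -6M | wc -l
$ find /var/lib/cassandra/data/Player -name '*-Data.db' -size +6M | wc -l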

On Thu, Mar 15, 2012 at 10:05 PM, Johan Elmerfjord jelme...@adobe.com wrote:

 Hi, I'm testing the community-version of Cassandra 1.0.8.
 We are currently on 0.8.7 in our production-setup.

 We have 3 Column Families that each takes between 20 and 35 GB on disk per
 node. (8*2 nodes total)
 We would like to change to Leveled Compaction - and even try compression
 as well to reduce the space needed for compactions.
 We are running on SSD-drives as latency is a key-issue.

 As test I have imported one Column Family from 3 production-nodes to a 3
 node test-cluster.
 The data on the 3 nodes ranges from 19-33GB. (with at least one large
 SSTable (Tiered size - recently compacted)).

 After loading this data to the 3 test-nodes, and running scrub and repair,
 I took a backup of the data so I have good test-set of data to work on.
 Then I changed changed to leveled compaction, using the cassandra-cli:

 UPDATE COLUMN FAMILY TestCF1 WITH
 compaction_strategy=LeveledCompactionStrategy;
 I could see the change being written to the logfile on all nodes.

 Then I don't know for for sure if I need to run anything else to make the
 change happen - or if it's just to wait.
 My test-cluster does not receive new data.

 For this  KS  CF and on each of the nodes I have tried some or several
 of: upgradesstables, scrub, compact, cleanup and repair - each task taking
 between 40 minutes and 4 hours.
 With the exception of compact that returns almost immediately with no
 visible compactions made.

 On some node I ended up with over 3 files with the default 5MB size
 for leveled compaction, on another node it didn't look like anything has
 been done and I still have a 19GB SSTable.

 I then made another change.
 UPDATE COLUMN FAMILY TestCF1 WITH
 compaction_strategy=LeveledCompactionStrategy AND
 compaction_strategy_options=[{sstable_size_in_mb: 64}];
 WARNING: [{}] strategy_options syntax is deprecated, please use {}
 Which is probably wrong in the documentation - and should be:
 UPDATE COLUMN FAMILY TestCF1 WITH
 compaction_strategy=LeveledCompactionStrategy AND
 compaction_strategy_options={sstable_size_in_mb: 64};

 I think that we will be able to find the data in 3 searches with a 64MB
 size - and still only use around 700MB while doing compactions - and keep
 the number of files ~3000 per CF.

 A few days later it looks like I still have a mix between the original huge
 SSTables, 5MB ones - and some nodes have 64MB files as well.
 Do I need to do something special to clean this up?
 I have tried another scrub/upgradesstables/cleanup - but nothing seems to
 make any difference.

 Finally I have also tried to enable compression:
 UPDATE COLUMN FAMILY TestCF1 WITH
 compression_options=[{sstable_compression:SnappyCompressor,
 chunk_length_kb:64}];
 - which results in the same [{}] - warning.

 As you can see below - this created CompressionInfo.db - files on some
 nodes - but not on all.

 *Is there a way I can force Tiered sstables to be converted into Leveled
 ones - and then to compression as well?*
 *Why are the original files (Tiered sized SSTables) still present on
 testnode1 - when is it supposed to delete them?*

 *Can I change back and forth between compression (on/off - or chunksizes)
 - and between Leveled vs Size Tiered compaction?*
 *Is there a way to see if the node is done - or waiting for something?*
 *When is it safe to apply another setting - does it have to complete one
 reorg before moving on to the next?*

 *Any input or own experiences are warmly welcome.*

 Best regards, Johan


 Some lines of example directory-listings below.:

 Some files for testnode 3 (looks like it still has the original Size
 Tiered files around, and a mixture of compressed 64MB files and 5MB
 files?)

 total 19G
 drwxr-xr-x 3 cass cass 4.0K Mar 13 17:11 snapshots
 -rw-r--r-- 1 cass cass 6.0G Mar 13 

Re: Single Node Cassandra Installation

2012-03-15 Thread Thomas van Neerijnen
So long as data loss and downtime are acceptable risks a one node cluster
is fine.
Personally this is usually only acceptable on my workstation, even my dev
environment is redundant, because servers fail, usually when you least want
them to, like for example when you've decided to save costs by waiting
before implementing redundancy. Could a failure end up costing you more
than you've saved? I'd rather get cheaper servers (maybe even used off
ebay??) so I could have at least two of them.

If you do go with a one node solution, although I haven't tried it myself
Priam looks like a good place to start for backups, otherwise roll your own
with incremental snapshotting turned on and a watch on the snapshot
directory. Storage on something like S3 or Cloud Files is very cheap so
there's no good excuse for no backups.

On Thu, Mar 15, 2012 at 7:12 PM, R. Verlangen ro...@us2.nl wrote:

 Hi Drew,

 One other disadvantage is the lack of consistency level and
 replication. Both are part of the high availability / redundancy. So you
 would really need to backup your single-node-cluster to some other
 external location.

 Good luck!


 2012/3/15 Drew Kutcharian d...@venarc.com

 Hi,

 We are working on a project that initially is going to have very little
 data, but we would like to use Cassandra to ease the future scalability.
 Due to budget constraints, we were thinking to run a single node Cassandra
 for now and then add more nodes as required.

 I was wondering if it is recommended to run a single node cassandra in
 production? Are there any other issues besides lack of high availability?

 Thanks,

 Drew





cleanup crashing with java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 8

2012-03-14 Thread Thomas van Neerijnen
Hi all

I am trying to run a cleanup on a column family and am getting the
following error returned after about 15 seconds. A cleanup on a slightly
smaller column family completes in about 21 minutes. This is on the Apache
packaged version of Cassandra on Ubuntu 11.10, version 1.0.7.

~# nodetool -h localhost cleanup Player PlayerDetail
Error occured during cleanup
java.util.concurrent.ExecutionException:
java.lang.ArrayIndexOutOfBoundsException: 8
at
java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at
org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:203)
at
org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:237)
at
org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:984)
at
org.apache.cassandra.service.StorageService.forceTableCleanup(StorageService.java:1635)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
at
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
at
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
at
com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
at
com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
at
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
at
com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
at
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
at
javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
at
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
at
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
at
javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
at sun.rmi.transport.Transport$1.run(Transport.java:159)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
at
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
at
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
at
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 8
at
org.apache.cassandra.db.compaction.LeveledManifest.add(LeveledManifest.java:298)
at
org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:186)
at
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:141)
at
org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:494)
at
org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:234)
at
org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:1006)
at
org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:791)
at
org.apache.cassandra.db.compaction.CompactionManager.access$300(CompactionManager.java:63)
at
org.apache.cassandra.db.compaction.CompactionManager$5.perform(CompactionManager.java:241)
at
org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:182)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
... 3 more


Re: LeveledCompaction and/or SnappyCompressor causing memory pressure during repair

2012-03-14 Thread Thomas van Neerijnen
Thanks for the suggestions but I'd already removed the compression when
your message came through. That alleviated the problem but didn't solve it.
I'm still looking at a few other possible causes and I'll post back if I
work out what's going on; for now I am running rolling repairs to avoid
another outage.
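
The rolling repair is nothing fancier than running the repairs serially
rather than in parallel; a sketch with hypothetical host names:

# one node at a time, waiting for each to finish before starting the next
for h in cass1 cass2 cass3 cass4 cass5 cass6 cass7; do
    nodetool -h "$h" repair Player
done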

On Sun, Mar 11, 2012 at 6:32 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

 One thing you may want to look at is the meanRowSize from nodetool
 cfstats and your compression block size. In our case the mean
 compacted size is 560 bytes and 64KB block size caused CPU tickets and
  a lot of short lived memory. I have brought my block size down to 16K.
  The resulting tables are not noticeably larger and there is less memory
  pressure on the young gen. I might try going down to 4K next.

 On Sat, Mar 10, 2012 at 5:38 PM, Edward Capriolo edlinuxg...@gmail.com
 wrote:
  The only downside of compression is it does cause more memory
  pressure. I can imagine something like repair could confound this.
  Since it would seem like building the merkle tree would involve
  decompressing every block on disk.
 
  I have been attempting to determine if the block size being larger or
  smaller has any effect on memory pressure.
 
  On Sat, Mar 10, 2012 at 4:50 PM, Peter Schuller
  peter.schul...@infidyne.com wrote:
  However, when I run a repair my CMS usage graph no longer shows sudden
 drops
  but rather gradual slopes and only manages to clear around 300MB each
 GC.
  This seems to occur on 2 other nodes in my cluster around the same
 time, I
  assume this is because they're the replicas (we use 3 replicas). Parnew
  collections look about the same on my graphs with or without repair
 running
  so no trouble there so far as I can tell.
 
  I don't know why leveled/snappy would affect it, but disregarding
  that, I would have been suggesting that you are seeing additional heap
  usage because of long-running repairs retaining sstables and delaying
  their unload/removal (index sampling/bloom filters filling your heap).
  If it really only happens for leveled/snappy however, I don't know
  what that might be caused by.
 
  --
  / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
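
For completeness, Edward's block-size suggestion maps onto the same
compression_options syntax used elsewhere on this list; existing sstables
keep the old chunk size until they are rewritten (CF name is an example):

[default@Player] UPDATE COLUMN FAMILY PlayerDetail WITH compression_options={sstable_compression:SnappyCompressor, chunk_length_kb:16};
$ nodetool -h localhost upgradesstables Player PlayerDetail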



LeveledCompaction and/or SnappyCompressor causing memory pressure during repair

2012-03-08 Thread Thomas van Neerijnen
Hi all

Running Cassandra 1.0.7, I recently changed a few read heavy column
families from SizeTieredCompactionStrategy to LeveledCompactionStrategy and
added in SnappyCompressor, all with defaults so 5MB files and if memory
serves me correctly 64k chunk size for compression.
The results were amazingly good, my data size halved and my heap usage and
performance stabilised nicely, until it came time to run a repair.

When a repair isn't running I'm seeing a saw toothed pattern on my heap
graphs with CMS clearing out about 1.5GB each GC run. The CMS GC appears as
a sudden vertical drop on the Old Gen usage graph. In addition to what I
consider a healthy looking heap usage, my par new and CMS collections are
running far quicker than before I made the changes.

However, when I run a repair my CMS usage graph no longer shows sudden
drops but rather gradual slopes and only manages to clear around 300MB each
GC. This seems to occur on 2 other nodes in my cluster around the same
time, I assume this is because they're the replicas (we use 3 replicas).
Parnew collections look about the same on my graphs with or without repair
running so no trouble there so far as I can tell.
The symptom of the memory pressure during repair is that either the node
running the repair or one of the two replicas tends to perform badly, with
read stage backing up into the thousands at times.
If I run a repair on more than one or two nodes at the same time (it's a 7
node cluster) the memory pressure is so bad that half the cluster ends up
OOMing, and this happened during off-peak when it's doing about half the
reads we handle during peak so not particularly loaded.

The question I'm asking is has anyone run into this behaviour before, and
if so how was it dealt with?

Once I have nursed the cluster thru the repair it's currently running I
will be turning off compression on one of my larger CFs to see if it makes
a difference, I'll send the results of that test tomorrow.


Final buffer length 4690 to accomodate data size of 2347 for RowMutation error caused node death

2012-02-20 Thread Thomas van Neerijnen
Hi all

I am running the Apache packaged Cassandra 1.0.7 on Ubuntu 11.10.
It has been running fine for over a month however I encountered the below
error yesterday which almost immediately resulted in heap usage rising
quickly to almost 100% and client requests timing out on the affected node.
I gave up waiting for the init script to stop Cassandra and killed it
myself after about 3 minutes, restarted it and it has been fine since.
Anyone seen this before?

Here is the error in the output.log:

ERROR 10:51:44,282 Fatal exception in thread
Thread[COMMIT-LOG-WRITER,5,main]
java.lang.AssertionError: Final buffer length 4690 to accomodate data size
of 2347 (predicted 2344) for RowMutation(keyspace='Player',
key='36336138643338652d366162302d343334392d383466302d356166643863353133356465',
modifications=[ColumnFamily(PlayerCity [SuperColumn(owneditem_1019
[]),SuperColumn(owneditem_1024 []),SuperColumn(owneditem_1026
[]),SuperColumn(owneditem_1074 []),SuperColumn(owneditem_1077
[]),SuperColumn(owneditem_1084 []),SuperColumn(owneditem_1094
[]),SuperColumn(owneditem_1130 []),SuperColumn(owneditem_1136
[]),SuperColumn(owneditem_1141 []),SuperColumn(owneditem_1142
[]),SuperColumn(owneditem_1145 []),SuperColumn(owneditem_1218
[636f6e6e6563746564:false:5@1329648704269002
,63757272656e744865616c7468:false:3@1329648704269006
,656e64436f6e737472756374696f6e54696d65:false:13@1329648704269007
,6964:false:4@1329648704269000,6974656d4964:false:15@1329648704269001
,6c61737444657374726f79656454696d65:false:1@1329648704269008
,6c61737454696d65436f6c6c6563746564:false:13@1329648704269005
,736b696e4964:false:7@1329648704269009,78:false:4@1329648704269003
,79:false:3@1329648704269004,]),SuperColumn(owneditem_133
[]),SuperColumn(owneditem_134 []),SuperColumn(owneditem_135
[]),SuperColumn(owneditem_141 []),SuperColumn(owneditem_147
[]),SuperColumn(owneditem_154 []),SuperColumn(owneditem_159
[]),SuperColumn(owneditem_171 []),SuperColumn(owneditem_253
[]),SuperColumn(owneditem_422 []),SuperColumn(owneditem_438
[]),SuperColumn(owneditem_515 []),SuperColumn(owneditem_521
[]),SuperColumn(owneditem_523 []),SuperColumn(owneditem_525
[]),SuperColumn(owneditem_562 []),SuperColumn(owneditem_61
[]),SuperColumn(owneditem_634 []),SuperColumn(owneditem_636
[]),SuperColumn(owneditem_71 []),SuperColumn(owneditem_712
[]),SuperColumn(owneditem_720 []),SuperColumn(owneditem_728
[]),SuperColumn(owneditem_787 []),SuperColumn(owneditem_797
[]),SuperColumn(owneditem_798 []),SuperColumn(owneditem_838
[]),SuperColumn(owneditem_842 []),SuperColumn(owneditem_847
[]),SuperColumn(owneditem_849 []),SuperColumn(owneditem_851
[]),SuperColumn(owneditem_852 []),SuperColumn(owneditem_853
[]),SuperColumn(owneditem_854 []),SuperColumn(owneditem_857
[]),SuperColumn(owneditem_858 []),SuperColumn(owneditem_874
[]),SuperColumn(owneditem_884 []),SuperColumn(owneditem_886
[]),SuperColumn(owneditem_908 []),SuperColumn(owneditem_91
[]),SuperColumn(owneditem_911 []),SuperColumn(owneditem_930
[]),SuperColumn(owneditem_934 []),SuperColumn(owneditem_937
[]),SuperColumn(owneditem_944 []),SuperColumn(owneditem_945
[]),SuperColumn(owneditem_962 []),SuperColumn(owneditem_963
[]),SuperColumn(owneditem_964 []),])])
at
org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:682)
at
org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:279)
at
org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
at
org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:599)
at
org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at java.lang.Thread.run(Thread.java:662)
 WARN 10:51:54,302 Heap is 0.764063958911146 full.  You may need to reduce
memtable and/or cache sizes.  Cassandra will now flush up to the two
largest memtables to free up memory.  Adjust flush_largest_memtables_at
threshold in cassandra.yaml if you don't want Cassandra to do this
automatically
 WARN 10:51:54,303 Flushing CFS(Keyspace='Player',
ColumnFamily='PlayerDetail') to relieve memory pressure
 INFO 11:00:41,162 Started hinted handoff for token:
121529416757478022665490931225631504090 with IP: /10.16.96.212
 INFO 11:00:41,163 Finished hinted handoff of 0 rows to endpoint /
10.16.96.212
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor192]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor165]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor202]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor232]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor146]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor181]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor190]
[Unloading class