Re: decommissioning a cassandra node
As I can see, the state of 162.243.109.94 is UL (Up/Leaving), so maybe this is causing the problem.

On Sunday, October 26, 2014 11:57 PM, Tim Dunphy bluethu...@gmail.com wrote:

Hey all,

I'm trying to decommission a node. First I'm getting a status:

[root@beta-new:/usr/local] #nodetool status
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load     Tokens  Owns   Host ID                               Rack
UN  162.243.86.41   1.08 MB  1       0.1%   e945f3b5-2e3e-4a20-b1bd-e30c474a7634  rack1
UL  162.243.109.94  1.28 MB  256     99.9%  fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1

But when I try to decommission the node I get this message:

[root@beta-new:/usr/local] #nodetool -h 162.243.86.41 decommission
nodetool: Failed to connect to '162.243.86.41:7199' - NoSuchObjectException: 'no such object in table'.

Yet I can telnet to that host on that port just fine:

[root@beta-new:/usr/local] #telnet 162.243.86.41 7199
Trying 162.243.86.41...
Connected to 162.243.86.41.
Escape character is '^]'.

And I have verified that Cassandra is running and accessible via cqlsh on the other machine. What could be going wrong?

Thanks
Tim

--
GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
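A common cause of this exact symptom (the TCP port accepts connections, but nodetool fails with an RMI NoSuchObjectException) is the JMX RMI stub advertising a different address than the one the client connected to. A hedged sketch of the relevant cassandra-env.sh settings — the values are illustrative assumptions, not taken from the thread:

```shell
# cassandra-env.sh fragment (illustrative values -- adjust for your node).
JMX_PORT="7199"
# Make the RMI stub advertise the address clients actually use; a mismatch
# here can yield NoSuchObjectException even though the TCP port is open.
JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=162.243.86.41"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.ssl=false"
```

After changing these settings, Cassandra must be restarted on that node for nodetool to pick them up.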
Re: OOM(Java heap space) on start-up during commit log replaying
Graham,

Thanks for the reply. As I stated in my first mail, increasing the heap size fixes the problem, but I'm more interested in figuring out the right commitlog and memtable size settings when we need to keep the heap smaller. Also, I don't think we are seeing CASSANDRA-7546, as I applied your patch but the problem still persists. What more details do you need? I'll be happy to provide them.

On Wednesday, August 13, 2014 1:05 AM, graham sanderson gra...@vast.com wrote:

Agreed, we need more details; and just start by increasing the heap, because that may well solve the problem. I have just observed (which makes sense when you think about it), while testing a fix for https://issues.apache.org/jira/browse/CASSANDRA-7546, that if you are replaying a commit log which has a high level of updates for the same partition key, you can hit that issue - excess memory allocation under high contention for the same partition key. (This might not cause OOM, but it will certainly massively tax GC, and it sounds like you don't have a lot of, or any, headroom.)

On Aug 12, 2014, at 12:31 PM, Robert Coli rc...@eventbrite.com wrote:

On Tue, Aug 12, 2014 at 9:34 AM, jivko donev jivko_...@yahoo.com wrote:

We have a node with a commit log directory of ~4G. During start-up of the node, while the commit log is replaying, the used heap space grows constantly, ending with an OOM error. The heap size and new heap size properties are 1G and 256M. We are using the default settings for commitlog_sync, commitlog_sync_period_in_ms and commitlog_segment_size_in_mb.

What version of Cassandra? 1G is tiny for a Cassandra heap. There is a direct relationship between the data in the commitlog and memtables and in the heap. You almost certainly need more heap or less commitlog.

=Rob
OOM(Java heap space) on start-up during commit log replaying
Hi all,

We have a node with a commit log directory of ~4G. During start-up of the node, while the commit log is replaying, the used heap space grows constantly, ending with an OOM error. The heap size and new heap size properties are 1G and 256M. We are using the default settings for commitlog_sync, commitlog_sync_period_in_ms and commitlog_segment_size_in_mb.

The log shows that Cassandra is stuck on MutationStage:

Active  Pending  Completed  Blocked
16      385      196        0

The stack trace is:

ERROR [metrics-meter-tick-thread-1] 2014-08-12 19:15:10,181 CassandraDaemon.java (line 198) Exception in thread Thread[metrics-meter-tick-thread-1,5,main]
java.lang.OutOfMemoryError: Java heap space
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.addWaiter(Unknown Source)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(Unknown Source)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(Unknown Source)
	at java.util.concurrent.locks.ReentrantLock.lock(Unknown Source)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.offer(Unknown Source)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.add(Unknown Source)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.add(Unknown Source)
	at java.util.concurrent.ScheduledThreadPoolExecutor.reExecutePeriodic(Unknown Source)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
ERROR [MutationStage:8] 2014-08-12 19:15:10,181 CassandraDaemon.java (line 198) Exception in thread Thread[MutationStage:8,5,main]
java.lang.OutOfMemoryError: Java heap space
	at java.nio.HeapByteBuffer.duplicate(Unknown Source)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:62)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:72)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:99)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35)
	at org.apache.cassandra.db.RangeTombstoneList.addAll(RangeTombstoneList.java:188)
	at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:219)
	at org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:184)
	at org.apache.cassandra.db.Memtable.resolve(Memtable.java:226)
	at org.apache.cassandra.db.Memtable.put(Memtable.java:173)
	at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:893)
	at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:368)
	at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:333)
	at org.apache.cassandra.db.commitlog.CommitLogReplayer$1.runMayThrow(CommitLogReplayer.java:352)
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.util.concurrent.FutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
ERROR [MutationStage:8] 2014-08-12 19:15:12,080 CassandraDaemon.java (line 198) Exception in thread Thread[MutationStage:8,5,main]
java.lang.IllegalThreadStateException
	at java.lang.Thread.start(Unknown Source)
	at org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:204)
	at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.handleOrLog(DebuggableThreadPoolExecutor.java:220)
	at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.logExceptionsAfterExecute(DebuggableThreadPoolExecutor.java:203)
	at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:183)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)

Increasing the heap space to 2G solves the problem, but we want to know whether it could be solved without increasing the heap. Has anyone experienced a similar problem? If so, are there any tuning options in cassandra.yaml? Any help will be much appreciated. If you need more information, feel free to ask.

Thanks,
Jivko Donev
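The cassandra.yaml knobs that bound how much replay data ends up on the heap are the commit log and memtable space caps. A hedged sketch for a small-heap node — the values below are illustrative assumptions, not recommendations from the thread:

```yaml
# cassandra.yaml fragment -- illustrative values only.
# Capping total commit log space bounds how much data must be replayed
# (and therefore rebuilt in memtables) at start-up.
commitlog_total_space_in_mb: 1024

# In Cassandra 2.0 this defaults to one third of the heap; lowering it
# forces earlier flushes, trading more (smaller) SSTables for less heap
# pressure during replay.
memtable_total_space_in_mb: 256
```

Note the trade-off: a smaller commit log cap means more frequent flushes and more SSTables to compact, in exchange for shorter, lighter replays.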
Re: OOM(Java heap space) on start-up during commit log replaying
Hi Robert,

Thanks for your reply. The Cassandra version is 2.0.7. Is there a commonly used rule for determining the commitlog and memtable sizes depending on the heap size? What would be the main disadvantage of having a smaller commitlog?

On Tuesday, August 12, 2014 8:32 PM, Robert Coli rc...@eventbrite.com wrote:

On Tue, Aug 12, 2014 at 9:34 AM, jivko donev jivko_...@yahoo.com wrote:

We have a node with a commit log directory of ~4G. During start-up of the node, while the commit log is replaying, the used heap space grows constantly, ending with an OOM error. The heap size and new heap size properties are 1G and 256M. We are using the default settings for commitlog_sync, commitlog_sync_period_in_ms and commitlog_segment_size_in_mb.

What version of Cassandra? 1G is tiny for a Cassandra heap. There is a direct relationship between the data in the commitlog and memtables and in the heap. You almost certainly need more heap or less commitlog.

=Rob
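As a rough rule of thumb, Cassandra 2.0's own heap-derived defaults can be sketched as arithmetic. The constants below (memtables default to one third of the heap; the commit log is capped at 1024 MB on 64-bit JVMs) are assumptions taken from the 2.0 documentation and should be verified against your exact version:

```python
def default_spaces_mb(heap_mb: int) -> dict:
    """Rough Cassandra 2.0 defaults derived from heap size.

    Assumptions (verify against your version's cassandra.yaml docs):
    - memtable_total_space_in_mb defaults to 1/3 of the heap
    - commitlog_total_space_in_mb defaults to 1024 MB on 64-bit JVMs
    """
    return {
        "memtable_total_space_in_mb": heap_mb // 3,
        "commitlog_total_space_in_mb": 1024,
    }

# With a 1 GB heap (the poster's setting), memtables get roughly 341 MB,
# while the node is trying to replay a ~4 GB commit log directory --
# which suggests why replay alone can exhaust the heap.
print(default_spaces_mb(1024))
```

The mismatch between ~341 MB of memtable headroom and ~4 GB of commit log to replay is the shape of the problem, independent of the exact constants.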
Adding a node to cluster keeping 100% data replicated on all nodes
Hi,

Our environment will consist of clusters no bigger than 2 to 4 nodes per cluster (all located in the same DC). We want to ensure that every node in the cluster owns 100% of the data. The node adding (or removing) procedure will be automated, so we want to be sure we're making the right steps.

Let's say we have node 'A' up and running and want to add another node 'B' to make a cluster. Node A's configuration will be:

seed: IP of A
listen_address: IP of A
num_tokens: 256
rpc_address: 0.0.0.0

The keyspace uses SimpleStrategy with RF: 1. When adding node 'B' to the cluster we do the following:

1. Stop cassandra on B.
2. Update cassandra.yaml - change seed to point to the IP of A.
3. Update cassandra-topology.properties - add node A's IP to it and make it the default one.
4. rm -rf /var/lib/cassandra/*
5. Start cassandra on B.
6. Wait until nodetool status reports that node B is up.
7. Update the RF of the keyspace to 2.
8. Run nodetool repair on B and wait for it to finish.

Can we update the RF on A before starting Cassandra on B, in order to skip steps 7 and 8?

Now that the data is in sync on both nodes, we want to make node B a seed node:

9. Update the seed property on A and B to include the IP of node B.
10. Restart cassandra on both nodes.

If adding more nodes to the cluster, the steps will be the same except that the seed property will contain all existing nodes in the cluster. So are these steps everything we need to do? Is there anything more we need to do? Is there an easier way to do what we want, or are all the steps above mandatory?
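Steps 7-8 above can be sketched as concrete commands. The keyspace name "myks" and the <ip-of-B> placeholder are hypothetical, since the thread does not give them:

```shell
# Step 7: raise the replication factor so both nodes own every row.
# "myks" is a placeholder keyspace name.
cqlsh <ip-of-A> -e "ALTER KEYSPACE myks WITH replication = \
  {'class': 'SimpleStrategy', 'replication_factor': 2};"

# Step 8: stream the historical data onto B. Without this repair, B
# receives only writes made after the RF change, not existing data.
nodetool -h <ip-of-B> repair myks
```

The repair in step 8 is what makes the RF change retroactive; the ALTER KEYSPACE alone only affects where new writes are sent.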