Re: decommissioning a cassandra node

2014-10-27 Thread jivko donev
As I see it, the state of 162.243.109.94 is UL (Up/Leaving), so maybe that is
causing the problem.
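
For reference, the two-letter code at the start of each `nodetool status` row combines the Status and State legends printed above the table. A minimal Python sketch of decoding it follows; the function name and the sample text are illustrative, with the sample rows copied from the output quoted in this thread.

```python
# Sketch: decode the two-letter code at the start of each `nodetool status` row.
STATUS = {"U": "Up", "D": "Down"}
STATE = {"N": "Normal", "L": "Leaving", "J": "Joining", "M": "Moving"}


def parse_status_rows(output):
    """Return a list of (address, status, state) tuples, one per node row."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) < 2 or len(parts[0]) != 2:
            continue
        code = parts[0]
        # Node rows start with a code such as UN or UL; headers do not.
        if code[0] in STATUS and code[1] in STATE:
            rows.append((parts[1], STATUS[code[0]], STATE[code[1]]))
    return rows


SAMPLE = """\
--  Address         Load       Tokens  Owns    Host ID                               Rack
UN  162.243.86.41   1.08 MB    1       0.1%    e945f3b5-2e3e-4a20-b1bd-e30c474a7634  rack1
UL  162.243.109.94  1.28 MB    256     99.9%   fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1
"""

for address, status, state in parse_status_rows(SAMPLE):
    flag = "" if (status, state) == ("Up", "Normal") else "  <-- not Up/Normal"
    print(f"{address}: {status}/{state}{flag}")
```

Run against the sample, this flags 162.243.109.94 as Up/Leaving.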

 On Sunday, October 26, 2014 11:57 PM, Tim Dunphy bluethu...@gmail.com 
wrote:
   

 Hey all,
 I'm trying to decommission a node. 
 First I'm getting a status:
[root@beta-new:/usr/local] #nodetool status
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns    Host ID                               Rack
UN  162.243.86.41   1.08 MB    1       0.1%    e945f3b5-2e3e-4a20-b1bd-e30c474a7634  rack1
UL  162.243.109.94  1.28 MB    256     99.9%   fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1

But when I try to decommission the node I get this message:
[root@beta-new:/usr/local] #nodetool -h 162.243.86.41 decommission
nodetool: Failed to connect to '162.243.86.41:7199' - NoSuchObjectException: 'no such object in table'.
Yet I can telnet to that host on that port just fine:
[root@beta-new:/usr/local] #telnet 162.243.86.41 7199
Trying 162.243.86.41...
Connected to 162.243.86.41.
Escape character is '^]'.
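
The telnet test shows only that TCP port 7199 accepts connections; it does not exercise the JMX handshake that nodetool performs afterwards. The same reachability check can be scripted. This is a minimal Python sketch, and the function name is illustrative.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling `tcp_port_open("162.243.86.41", 7199)` would repeat the telnet check. A True result means only that the port is reachable, not that a JMX client can connect successfully.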

And I have verified that cassandra is running and accessible via cqlsh on the 
other machine. 
What could be going wrong? 
Thanks
Tim

-- 
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B



   

Re: OOM(Java heap space) on start-up during commit log replaying

2014-08-13 Thread jivko donev
Graham,
Thanks for the reply. As I stated in my first mail, increasing the heap size
fixes the problem, but I'm more interested in figuring out the right settings
for commitlog and memtable sizes when we need to keep the heap smaller.
Also, I don't think we are seeing CASSANDRA-7546, as I applied your patch and
the problem still persists.
What more details do you need? I'll be happy to provide them.


On Wednesday, August 13, 2014 1:05 AM, graham sanderson gra...@vast.com wrote:
 


Agreed, we need more details; and just start by increasing the heap, because
that may well solve the problem.

I have just observed (which makes sense when you think about it), while testing
the fix for https://issues.apache.org/jira/browse/CASSANDRA-7546, that if you
are replaying a commit log with a high level of updates for the same partition
key, you can hit that issue: excess memory allocation under high contention for
the same partition key. This might not cause an OOM, but it will certainly tax
GC massively, and it sounds like you don’t have much, if any, headroom.

On Aug 12, 2014, at 12:31 PM, Robert Coli rc...@eventbrite.com wrote:



On Tue, Aug 12, 2014 at 9:34 AM, jivko donev jivko_...@yahoo.com wrote:

We have a node with a commit log directory of ~4G. During start-up of the node,
while the commit log is being replayed, the used heap space grows constantly
and ends in an OOM error.



The heap size and new heap size properties are 1G and 256M, respectively. We
are using the default settings for commitlog_sync, commitlog_sync_period_in_ms
and commitlog_segment_size_in_mb.


What version of Cassandra?


1G is tiny for cassandra heap. There is a direct relationship between the data 
in the commitlog and memtables and in the heap. You almost certainly need more 
heap or less commitlog.
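
To put rough numbers on that relationship, here is a back-of-the-envelope Python sketch using the figures from this thread (~4G of commit log, 1G of heap). The 32 MB segment size is an assumption: it is believed to be the default for commitlog_segment_size_in_mb in this release line, and the thread says the default is in use. The function name is illustrative, and the result is not a model of actual heap use during replay.

```python
import math


def replay_estimate(commitlog_dir_mb, heap_mb, segment_mb=32):
    """Return (segments to replay, commit log size as a multiple of the heap)."""
    # segment_mb=32 is an assumed default, not a value stated in the thread.
    segments = math.ceil(commitlog_dir_mb / segment_mb)
    ratio = commitlog_dir_mb / heap_mb
    return segments, ratio


segments, ratio = replay_estimate(commitlog_dir_mb=4 * 1024, heap_mb=1024)
print(f"{segments} segments to replay, commit log is {ratio:.1f}x the heap")
```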


=Rob
  

OOM(Java heap space) on start-up during commit log replaying

2014-08-12 Thread jivko donev
Hi all, 

We have a node with a commit log directory of ~4G. During start-up of the node,
while the commit log is being replayed, the used heap space grows constantly
and ends in an OOM error.

The heap size and new heap size properties are 1G and 256M, respectively. We
are using the default settings for commitlog_sync, commitlog_sync_period_in_ms
and commitlog_segment_size_in_mb.
 
The log shows that cassandra is stuck on MutationStage:
Active   Pending   Completed   Blocked
16       385       196         0


The stack trace is:
ERROR [metrics-meter-tick-thread-1] 2014-08-12 19:15:10,181 CassandraDaemon.java (line 198) Exception in thread Thread[metrics-meter-tick-thread-1,5,main]
java.lang.OutOfMemoryError: Java heap space
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.addWaiter(Unknown Source)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(Unknown Source)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(Unknown Source)
        at java.util.concurrent.locks.ReentrantLock.lock(Unknown Source)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.offer(Unknown Source)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.add(Unknown Source)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.add(Unknown Source)
        at java.util.concurrent.ScheduledThreadPoolExecutor.reExecutePeriodic(Unknown Source)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
ERROR [MutationStage:8] 2014-08-12 19:15:10,181 CassandraDaemon.java (line 198) Exception in thread Thread[MutationStage:8,5,main]
java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.duplicate(Unknown Source)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:62)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:72)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:99)
        at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35)
        at org.apache.cassandra.db.RangeTombstoneList.addAll(RangeTombstoneList.java:188)
        at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:219)
        at org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:184)
        at org.apache.cassandra.db.Memtable.resolve(Memtable.java:226)
        at org.apache.cassandra.db.Memtable.put(Memtable.java:173)
        at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:893)
        at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:368)
        at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:333)
        at org.apache.cassandra.db.commitlog.CommitLogReplayer$1.runMayThrow(CommitLogReplayer.java:352)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
ERROR [MutationStage:8] 2014-08-12 19:15:12,080 CassandraDaemon.java (line 198) Exception in thread Thread[MutationStage:8,5,main]
java.lang.IllegalThreadStateException
        at java.lang.Thread.start(Unknown Source)
        at org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:204)
        at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.handleOrLog(DebuggableThreadPoolExecutor.java:220)
        at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.logExceptionsAfterExecute(DebuggableThreadPoolExecutor.java:203)
        at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:183)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
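
When a log contains many such entries, it can help to tally which threads hit which exception. The following is a minimal Python sketch, not part of any Cassandra tooling. The function name is illustrative, and the sample is an abbreviated copy of the trace above with one stack frame kept per entry.

```python
import re
from collections import Counter

# Header of an error entry, e.g. "ERROR [MutationStage:8] 2014-08-12 ...".
HEADER = re.compile(r"^ERROR \[(?P<thread>[^\]]+)\]")
# Unindented line naming a Java exception class, which opens a stack trace.
EXCEPTION = re.compile(r"^(?P<cls>[\w.$]+(?:Error|Exception))\b")


def summarize_errors(log_text):
    """Count (thread, exception class) pairs in a log excerpt."""
    counts = Counter()
    thread = None
    for line in log_text.splitlines():
        header = HEADER.match(line)
        if header:
            thread = header.group("thread")
            continue
        exc = EXCEPTION.match(line)
        if exc and thread is not None:
            counts[(thread, exc.group("cls"))] += 1
            thread = None  # count only the first exception after each header
    return counts


# Abbreviated sample: one frame per entry, copied from the trace in this mail.
SAMPLE = """\
ERROR [metrics-meter-tick-thread-1] 2014-08-12 19:15:10,181 CassandraDaemon.java (line 198) Exception in thread Thread[metrics-meter-tick-thread-1,5,main]
java.lang.OutOfMemoryError: Java heap space
        at java.lang.Thread.run(Unknown Source)
ERROR [MutationStage:8] 2014-08-12 19:15:10,181 CassandraDaemon.java (line 198) Exception in thread Thread[MutationStage:8,5,main]
java.lang.OutOfMemoryError: Java heap space
        at org.apache.cassandra.db.Memtable.put(Memtable.java:173)
ERROR [MutationStage:8] 2014-08-12 19:15:12,080 CassandraDaemon.java (line 198) Exception in thread Thread[MutationStage:8,5,main]
java.lang.IllegalThreadStateException
        at java.lang.Thread.start(Unknown Source)
"""

for (thread, cls), n in sorted(summarize_errors(SAMPLE).items()):
    print(f"{thread}: {cls} x{n}")
```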


Increasing the heap space to 2G solves the problem, but we want to know whether
it could be solved without increasing the heap space. Has anyone experienced a
similar problem? If so, are there any tuning options in cassandra.yaml?

Any help will be much appreciated. If you need more information, feel free to
ask.

Thanks,
Jivko Donev

Re: OOM(Java heap space) on start-up during commit log replaying

2014-08-12 Thread jivko donev
Hi Robert,

Thanks for your reply. The Cassandra version is 2.07. Is there a commonly used
rule for sizing the commitlog and memtables relative to the heap size? What
would be the main disadvantage of having a smaller commitlog?


On Tuesday, August 12, 2014 8:32 PM, Robert Coli rc...@eventbrite.com wrote:
 




On Tue, Aug 12, 2014 at 9:34 AM, jivko donev jivko_...@yahoo.com wrote:

We have a node with a commit log directory of ~4G. During start-up of the node,
while the commit log is being replayed, the used heap space grows constantly
and ends in an OOM error.

The heap size and new heap size properties are 1G and 256M, respectively. We
are using the default settings for commitlog_sync, commitlog_sync_period_in_ms
and commitlog_segment_size_in_mb.

What version of Cassandra?

1G is tiny for cassandra heap. There is a direct relationship between the data 
in the commitlog and memtables and in the heap. You almost certainly need more 
heap or less commitlog.

=Rob

Adding a node to cluster keeping 100% data replicated on all nodes

2014-02-07 Thread jivko donev
Hi,

Our environment will consist of clusters of no more than 2 to 4 nodes each (all
located in the same DC). We want every node in the cluster to own 100% of the
data. The procedure for adding (or removing) a node will be automated, so we
want to make sure we are taking the right steps. Let's say we have node 'A' up
and running and want to add another node 'B' to form a cluster. Node A's
configuration will be:
seed: IP of A
listen_address: IP of A
num_tokens: 256
rpc_address: 0.0.0.0
The keyspace uses SimpleStrategy with RF: 1.
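
For every node to hold 100% of the data, the replication factor has to equal the number of nodes. A small Python sketch of that arithmetic follows, assuming tokens are evenly balanced; the function name is illustrative.

```python
def effective_ownership(replication_factor, node_count):
    """Average fraction of the data each node holds, assuming balanced tokens."""
    if node_count < 1 or replication_factor < 1:
        raise ValueError("node_count and replication_factor must be >= 1")
    # A cluster cannot hold more replicas of a row than it has nodes.
    return min(replication_factor, node_count) / node_count


for nodes in (2, 3, 4):
    # RF 1 versus RF equal to the cluster size.
    print(nodes, effective_ownership(1, nodes), effective_ownership(nodes, nodes))
```

With RF 1 on two nodes, each node holds about half of the data; with RF 2 on two nodes, each holds all of it.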

To add node 'B' to the cluster, we do the following:
1. Stop cassandra on B.
2. Update cassandra.yaml - change seed to point to IP of A
3. Update cassandra-topology.properties - add node A ip to it and make it the 
default one.
4. rm -rf /var/lib/cassandra/*
5. Start cassandra on B.
6. Wait until nodetool status reports that node B is up.
7. Update the RF of the keyspace to 2.
8. Run nodetool repair on B and wait for it to finish.
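
Since the procedure is to be automated, step 6 can be sketched as a generic polling loop. This is only a Python sketch under assumptions: `check` is a hypothetical callable supplied by the caller (for example, one that runs `nodetool status` and looks for a `UN` row for node B), and the timeout and interval defaults are arbitrary.

```python
import time


def wait_until(check, timeout_s=600.0, interval_s=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or timeout_s elapses.

    Returns True if check() succeeded, False if the timeout was reached.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be tested without real waiting. An automation script would proceed to steps 7 and 8 only after this returns True.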

Can we update the RF on A before starting Cassandra on B in order to skip
steps 7 and 8?
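
Whichever point in the sequence the RF change happens at, it comes down to a single CQL statement. Here is a minimal Python sketch of building it for automation; the function name and the keyspace name `my_keyspace` are hypothetical.

```python
def alter_rf_statement(keyspace, replication_factor):
    """Build the CQL that sets SimpleStrategy replication for a keyspace."""
    # Accept only simple unquoted identifiers to keep the sketch safe.
    if not keyspace.replace("_", "").isalnum():
        raise ValueError("unexpected characters in keyspace name")
    if replication_factor < 1:
        raise ValueError("replication_factor must be >= 1")
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'SimpleStrategy', 'replication_factor': {int(replication_factor)}}};"
    )


print(alter_rf_statement("my_keyspace", 2))
```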


Now that the data is in sync on both nodes, we want to make node B a seed node.
9. Update the seed property on A and B to include the IP of node B.
10. Restart cassandra on both nodes.

When adding more nodes to the cluster, the steps will be the same, except that
the seed property will contain all existing nodes in the cluster.
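
In cassandra.yaml the seed list is a single comma-separated string, so building its value for each node could be sketched as below. The function name and the IP addresses are hypothetical.

```python
def seeds_value(existing_nodes, new_node=None):
    """Return the comma-separated string for the seeds parameter."""
    nodes = list(existing_nodes)
    # Append the new node once; keep the original order of existing nodes.
    if new_node is not None and new_node not in nodes:
        nodes.append(new_node)
    return ",".join(nodes)


# Hypothetical addresses standing in for "IP of A" and "IP of B".
print(seeds_value(["10.0.0.1"], "10.0.0.2"))
```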

So are these steps everything we need to do? 
Is there anything more we need to do?
Is there an easier way to do what we want, or are all the steps above mandatory?