me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:Cluster schema does not yet agree)

2011-05-04 Thread Dikang Gu
I got this exception when I was trying to create a new columnFamily using
hector api.

me.prettyprint.hector.api.exceptions.HInvalidRequestException:
InvalidRequestException(why:Cluster schema does not yet agree)

What does this mean and how to resolve this?

I have 3 nodes in the cassandra 0.7.4 cluster.

Thanks.

-- 
Dikang Gu

0086 - 18611140205


Re: me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:Cluster schema does not yet agree)

2011-05-05 Thread Dikang Gu
Is this fixed in cassandra-0.7.5 or cassandra-0.8 ?

On Thu, May 5, 2011 at 1:43 PM, Tyler Hobbs ty...@datastax.com wrote:

 The issue is quite possibly this:
 https://issues.apache.org/jira/browse/CASSANDRA-2536

 A person on the ticket commented that decommissioning and rejoining the node
 with the disagreeing schema solved the issue.
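
 Before decommissioning anything, it helps to confirm which node actually disagrees.
 One quick check, if your cassandra-cli build supports it (the 0.8-era CLI does; I have
 not verified 0.7.4), is to connect to any live node and compare the reported schema
 versions:

 bin/cassandra-cli -host <any-live-node> -port 9160
 [default@unknown] describe cluster;

 Nodes listed under a different schema version UUID are the ones to decommission and
 rejoin.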


 On Thu, May 5, 2011 at 12:40 AM, Dikang Gu dikan...@gmail.com wrote:

 I got this exception when I was trying to create a new columnFamily using
 hector api.

 me.prettyprint.hector.api.exceptions.HInvalidRequestException:
 InvalidRequestException(why:Cluster schema does not yet agree)

 What does this mean and how to resolve this?

 I have 3 nodes in the cassandra 0.7.4 cluster.

 Thanks.

 --
 Dikang Gu

  0086 - 18611140205




 --
 Tyler Hobbs
 Software Engineer, DataStax http://datastax.com/
 Maintainer of the pycassa http://github.com/pycassa/pycassa Cassandra
 Python client library




-- 
Dikang Gu

0086 - 18611140205


How to reduce the Read Latency.

2011-05-20 Thread Dikang Gu
Hi All,

I'm running three cassandra 0.7.4 nodes in a cluster, and I've given 2GB of memory to
each node.

Now, I get the cfstats here:

Keyspace: UserMap
Read Count: 38411
Read Latency: 123.54214613001484 ms.
Write Count: 44155
Write Latency: 0.02341093873853471 ms.
Pending Tasks: 0
Column Family: Map
SSTable count: 3
Space used (live): 32704387
Space used (total): 32704387
Memtable Columns Count: 49
Memtable Data Size: 3348
Memtable Switch Count: 56
Read Count: 38411
Read Latency: 123.542 ms.
Write Count: 44155
Write Latency: 0.023 ms.
Pending Tasks: 0
Key cache capacity: 20
Key cache size: 611
Key cache hit rate: 0.9294361241314483
Row cache: disabled
Compacted row minimum size: 125
Compacted row maximum size: 17436917
Compacted row mean size: 147647

You can see that the Read Latency is really high here, so what can I do to
reduce the latency? Give more memory to the three nodes? Any other options?

Thanks. 
-- 
Dikang Gu
0086 - 18611140205


Re: How to reduce the Read Latency.

2011-05-20 Thread Dikang Gu
I use the default consistency level in the hector client, so it should be 
QUORUM.

-- 
Dikang Gu
0086 - 18611140205
On Friday, May 20, 2011 at 4:25 PM, Jeffrey Kesselman wrote: 
 What consistency are you asking for?
 
 On Fri, May 20, 2011 at 7:42 AM, Dikang Gu dikan...@gmail.com wrote:
  Hi All,
  I'm running three cassandra 0.7.4 nodes in a cluster, and I give 2G memory
  to each node.
  Now, I get the cfstats here:
  Keyspace: UserMap
  Read Count: 38411
  Read Latency: 123.54214613001484 ms.
  Write Count: 44155
  Write Latency: 0.02341093873853471 ms.
  Pending Tasks: 0
  Column Family: Map
  SSTable count: 3
  Space used (live): 32704387
  Space used (total): 32704387
  Memtable Columns Count: 49
  Memtable Data Size: 3348
  Memtable Switch Count: 56
  Read Count: 38411
  Read Latency: 123.542 ms.
  Write Count: 44155
  Write Latency: 0.023 ms.
  Pending Tasks: 0
  Key cache capacity: 20
  Key cache size: 611
  Key cache hit rate: 0.9294361241314483
  Row cache: disabled
  Compacted row minimum size: 125
  Compacted row maximum size: 17436917
  Compacted row mean size: 147647
  You can find that the Read Latency is really high here, so what can I do to
  reduce the latency? Give more memory to the three nodes? Any other options?
  Thanks.
  --
  Dikang Gu
  0086 - 18611140205
 
 
 
 -- 
 It's always darkest just before you are eaten by a grue.
 


Re: How to reduce the Read Latency.

2011-05-22 Thread Dikang Gu
Thanks Aaron!

Using the commands top and iostat, I found the IO system is not overloaded yet,
so I will check the data model.

And how can I get the row size for a specific key? Is there an API for that yet?
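
As far as I know there is no single call in 0.7 that returns the size of one row;
nodetool cfstats only reports the min/mean/max compacted row sizes per column family.
A rough client-side approximation, sketched here with Hector (the keyspace, serializers,
row key and page size are placeholder assumptions), is to slice the row and sum the
serialized column sizes:

// Hedged sketch: approximate a row's size by paging over its columns and
// summing name + value byte lengths. "keyspace" is an existing Hector Keyspace;
// the CF name, row key and page size are placeholders.
String rowKey = "some-key";
SliceQuery<String, String, byte[]> q = HFactory.createSliceQuery(
        keyspace, StringSerializer.get(), StringSerializer.get(), BytesArraySerializer.get());
q.setColumnFamily("Map").setKey(rowKey);
q.setRange(null, null, false, 1000);  // first page only; loop with a moving start column for bigger rows

long approxBytes = 0;
for (HColumn<String, byte[]> c : q.execute().get().getColumns()) {
    approxBytes += c.getName().getBytes().length + c.getValue().length;
}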

Thanks.

-- 
Dikang Gu
0086 - 18611140205
On Sunday, May 22, 2011 at 6:15 PM, aaron morton wrote: 
 It's hard to say whether the latency is too high without knowing how many columns and 
 how many bytes you are asking for. It's also handy to know what the query 
 looks like, i.e. is it a slice or a get by name, and the CF level.
 
 Latency reported at the CF or KS level is for local read / write operations. 
 Do all nodes in the cluster show similar values or just this one? If you are 
 looking for it, the o.a.c.db.StorageProxy MBean has the latency trackers for 
 the cluster-wide requests. 
 
 If you are looking for a quick speed bump, consider enabling the row cache. It 
 looks like you have a small hot set: the key cache has only 611 entries, gets a 
 90%+ hit rate, and the average row is 144KB. You will need more memory 
 though. 
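 
 For reference, a hedged sketch of that change from cassandra-cli (the keyspace and 
 column family names are taken from the cfstats above; the cache size is just a 
 placeholder to tune):
 
 use UserMap;
 update column family Map with rows_cached = 20000;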
 
 If you want to dig deeper into it:
 - make sure the app is making the correct query and not wasting effort
 - watch TP stats to see if the server is keeping up with other things
 - check if the IO system is overloaded (google is your friend) 
 
 I'd start with the data model and the query.
 
 Hope that helps. 
 
 -
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 
 
 
 
 On 20 May 2011, at 20:32, Dikang Gu wrote:
  I use the default consistency level in the hector client, so it should be 
  QUORUM.
  
  -- 
  Dikang Gu
  0086 - 18611140205
  On Friday, May 20, 2011 at 4:25 PM, Jeffrey Kesselman wrote:
   What consistency are you asking for?
   
   On Fri, May 20, 2011 at 7:42 AM, Dikang Gu dikan...@gmail.com wrote:
Hi All,
I'm running three cassandra 0.7.4 nodes in a cluster, and I give 2G 
memory
to each node.
Now, I get the cfstats here:
Keyspace: UserMap
Read Count: 38411
Read Latency: 123.54214613001484 ms.
Write Count: 44155
Write Latency: 0.02341093873853471 ms.
Pending Tasks: 0
Column Family: Map
SSTable count: 3
Space used (live): 32704387
Space used (total): 32704387
Memtable Columns Count: 49
Memtable Data Size: 3348
Memtable Switch Count: 56
Read Count: 38411
Read Latency: 123.542 ms.
Write Count: 44155
Write Latency: 0.023 ms.
Pending Tasks: 0
Key cache capacity: 20
Key cache size: 611
Key cache hit rate: 0.9294361241314483
Row cache: disabled
Compacted row minimum size: 125
Compacted row maximum size: 17436917
Compacted row mean size: 147647
You can find that the Read Latency is really high here, so what can I 
do to
reduce the latency? Give more memory to the three nodes? Any other 
options?
Thanks.
--
Dikang Gu
0086 - 18611140205
   
   
   
   -- 
   It's always darkest just before you are eaten by a grue.
   
  
 


What's the valid name format of the column family in cassandra?

2011-05-22 Thread Dikang Gu
What's the naming convention of the column family in cassandra? I did not find 
this in the wiki yet...
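
For what it's worth, the rule in 0.7/0.8-era Cassandra is, as far as I can tell, letters,
digits and underscores only, and the name also ends up in the on-disk file names, so it
should stay reasonably short. A minimal client-side pre-check (the \w+ regex is my
assumption of that rule, not taken from the Cassandra source) might look like:

// Hypothetical pre-check mirroring the assumed naming rule for keyspace and
// column family names in 0.7/0.8-era Cassandra; the server remains the authority.
public static boolean looksLikeLegalCfName(String name) {
    return name != null && !name.isEmpty() && name.matches("\\w+");
}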

Thanks.

-- 
Dikang Gu
0086 - 18611140205


Re: How to programmatically index an existing column?

2011-05-26 Thread Dikang Gu
Hi Aaron,

Thank you for your reminder. I've found the solution myself, and I'll share it 
here:

KeyspaceDefinition keyspaceDefinition = cluster.describeKeyspace(KEYSPACE);
ColumnFamilyDefinition cdf = keyspaceDefinition.getCfDefs().get(0);

BasicColumnFamilyDefinition columnFamilyDefinition = new BasicColumnFamilyDefinition(cdf);

BasicColumnDefinition bcdf = new BasicColumnDefinition();
bcdf.setName(StringSerializer.get().toByteBuffer("birthyear"));
bcdf.setIndexName("birthyearidx");
bcdf.setIndexType(ColumnIndexType.KEYS);
bcdf.setValidationClass(ComparatorType.LONGTYPE.getClassName());

columnFamilyDefinition.addColumnDefinition(bcdf);

cluster.updateColumnFamily(new ThriftCfDef(columnFamilyDefinition)); 
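
As a follow-up, once the index has been applied, a query against it with Hector would
look roughly like this (the keyspace, column family name and the example value are
assumptions on my side, not from the original post):

// Hedged sketch: query the new secondary index via IndexedSlicesQuery.
IndexedSlicesQuery<String, String, Long> iq = HFactory.createIndexedSlicesQuery(
        keyspace, StringSerializer.get(), StringSerializer.get(), LongSerializer.get());
iq.setColumnFamily("Users");                 // placeholder CF name
iq.addEqualsExpression("birthyear", 1984L);  // indexed column = value
iq.setStartKey("");                          // start from the first matching key
iq.setRange(null, null, false, 100);         // columns to return per row
QueryResult<OrderedRows<String, String, Long>> rows = iq.execute();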


-- 
Dikang Gu
0086 - 18611140205
On Thursday, May 26, 2011 at 3:16 PM, aaron morton wrote: 
 Please post to one list at a time. Otherwise people may spend their time 
 helping you when someone already has. 
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 26 May 2011, at 17:35, Dikang Gu wrote:
 
  
   I want to build a secondary index on an existing column; how can I 
   programmatically do this using the hector API?
  
  Thanks.
  
  -- 
  Dikang Gu
  0086 - 18611140205
 


Re: [RELEASE] 0.8.0

2011-06-02 Thread Dikang Gu
 Great! Congratulations!

-- 
Dikang Gu
0086 - 18611140205
On Friday, June 3, 2011 at 10:06 AM, aaron morton wrote: 
 Big thanks to all the contributors and committers :)
 
 A
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 
 
 
 
 On 3 Jun 2011, at 11:48, Joseph Stein wrote:
  Awesome!
  
  On Thu, Jun 2, 2011 at 7:36 PM, Eric Evans eev...@rackspace.com wrote:
   
I am very pleased to announce the official release of Cassandra 0.8.0.
   
If you haven't been paying attention to this release, this is your last
chance, because by this time tomorrow all your friends are going to be
raving, and you don't want to look silly.
   
So why am I resorting to hyperbole? Well, for one because this is the
release that debuts the Cassandra Query Language (CQL). In one fell
swoop Cassandra has become more than NoSQL, it's MoSQL.
   
Cassandra also has distributed counters now. With counters, you can
count stuff, and counting stuff rocks.
   
A kickass use-case for Cassandra is spanning data-centers for
fault-tolerance and locality, but doing so has always meant sending data
in the clear, or tunneling over a VPN.  New for 0.8.0, encryption of
intranode traffic.
   
If you're not motivated to go upgrade your clusters right now, you're
either not easily impressed, or you're very lazy. If it's the latter,
would it help knowing that rolling upgrades between releases is now
supported? Yeah. You can upgrade your 0.7 cluster to 0.8 without
shutting it down.
   
You see what I mean? Then go read the release notes[1] to learn about
the full range of awesomeness, then grab a copy[2] and become a
(fashionably) early adopter.
   
Drivers for CQL are available in Python[3], Java[3], and Node.js[4].
   
As usual, a Debian package is available from the project's APT
repository[5].
   
Enjoy!
   
   
[1]: http://goo.gl/CrJqJ (NEWS.txt)
[2]: http://cassandra.debian.org/download
[3]: http://www.apache.org/dist/cassandra/drivers
[4]: https://github.com/racker/node-cassandra-client
[5]: http://wiki.apache.org/cassandra/DebianPackaging
   
--
Eric Evans
   eev...@rackspace.com
   
   
  
  
  -- 
  
  /*
  Joe Stein
  http://www.linkedin.com/in/charmalloc
  Twitter: @allthingshadoop
  */
  
 


Cannot recover SSTable with version f (current version g).

2011-06-03 Thread Dikang Gu
Hi guys,

I upgraded the 4-node cassandra 0.7.4 cluster to 0.8.0. Then I ran 
bin/nodetool decommission on one node; the decommission hangs there and I got 
the following exceptions on the other nodes.

ERROR [Thread-55] 2011-06-03 18:02:03,500 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-55,5,main]
java.lang.RuntimeException: Cannot recover SSTable with version f (current version g).
at org.apache.cassandra.io.sstable.SSTableWriter.createBuilder(SSTableWriter.java:240)
at org.apache.cassandra.db.CompactionManager.submitSSTableBuild(CompactionManager.java:1088)
at org.apache.cassandra.streaming.StreamInSession.finished(StreamInSession.java:108)
at org.apache.cassandra.streaming.IncomingStreamReader.readFile(IncomingStreamReader.java:104)
at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:61)
at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:155)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:93)
ERROR [Thread-56] 2011-06-03 18:02:04,285 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-56,5,main]
java.lang.RuntimeException: Cannot recover SSTable with version f (current version g).
at org.apache.cassandra.io.sstable.SSTableWriter.createBuilder(SSTableWriter.java:240)
at org.apache.cassandra.db.CompactionManager.submitSSTableBuild(CompactionManager.java:1088)
at org.apache.cassandra.streaming.StreamInSession.finished(StreamInSession.java:108)
at org.apache.cassandra.streaming.IncomingStreamReader.readFile(IncomingStreamReader.java:104)
at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:61)
at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:155)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:93)

Do you have any ideas about this?

Thanks.

-- 
Dikang Gu
0086 - 18611140205


Re: Cannot recover SSTable with version f (current version g).

2011-06-03 Thread Dikang Gu
Hi Aaron,

Thank you for your reply. I've reported a bug on JIRA: 
https://issues.apache.org/jira/browse/CASSANDRA-2739

Since I'm not in the office now, I will come up with more information and 
try your methods when I'm back in the office, maybe next Monday.

Thanks.

-- 
Dikang Gu
0086 - 18611140205
On Friday, June 3, 2011 at 7:00 PM, aaron morton wrote: 
 Could you please create a bug report for this in Jira 
 https://issues.apache.org/jira/browse/CASSANDRA
 
 Please check the data directory on the node that encountered the error and 
 include any file names that have tmp in them. 
 
 Can you also check for log messages on the node you decommissioned, at the 
 INFO level, that start with "Stream context metadata".
 
 It looks like the file version from the old files on the decommissioning node 
 is included in the stream header and used to create the new temp file 
 on the node that got this error. So when it tries to build the other 
 SSTable files from the new data it thinks it's an old data file, even though 
 it's a v0.8 file. v0.8 can read v0.7 data files, but it would not expect to 
 see a new v0.7 data file. 
 
 If this is the case, the only workaround I can think of would be to run nodetool 
 scrub on the node before starting the decommission, as this would write new data 
 files. It's a bit late here so I may have missed something. 
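 
 For reference, the workaround described above would look something like this on the 
 node about to be decommissioned (the host and JMX port here are placeholders):
 
 bin/nodetool -h 192.168.1.x -p 8090 scrub
 # once scrub has rewritten the sstables in the current format:
 bin/nodetool -h 192.168.1.x -p 8090 decommission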
 
 Thanks for reporting it. 
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 
 
 
 
 On 3 Jun 2011, at 22:06, Dikang Gu wrote:
  Hi guys,
  
  I upgrade the 4-nodes cassandra 0.7.4 cluster to 0.8.0. Then, I do the 
  bin/nodetool decommission on one node, the decommission hangs there and I 
  got the following exceptions on others nodes.
  
  ERROR [Thread-55] 2011-06-03 18:02:03,500 AbstractCassandraDaemon.java 
  (line 113) Fatal exception in thread Thread[Thread-55,5,main]
  java.lang.RuntimeException: Cannot recover SSTable with version f (current 
  version g).
  at 
  org.apache.cassandra.io.sstable.SSTableWriter.createBuilder(SSTableWriter.java:240)
  at 
  org.apache.cassandra.db.CompactionManager.submitSSTableBuild(CompactionManager.java:1088)
  at 
  org.apache.cassandra.streaming.StreamInSession.finished(StreamInSession.java:108)
  at 
  org.apache.cassandra.streaming.IncomingStreamReader.readFile(IncomingStreamReader.java:104)
  at 
  org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:61)
  at 
  org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:155)
  at 
  org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:93)
  ERROR [Thread-56] 2011-06-03 18:02:04,285 AbstractCassandraDaemon.java 
  (line 113) Fatal exception in thread Thread[Thread-56,5,main]
  java.lang.RuntimeException: Cannot recover SSTable with version f (current 
  version g).
  at 
  org.apache.cassandra.io.sstable.SSTableWriter.createBuilder(SSTableWriter.java:240)
  at 
  org.apache.cassandra.db.CompactionManager.submitSSTableBuild(CompactionManager.java:1088)
  at 
  org.apache.cassandra.streaming.StreamInSession.finished(StreamInSession.java:108)
  at 
  org.apache.cassandra.streaming.IncomingStreamReader.readFile(IncomingStreamReader.java:104)
  at 
  org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:61)
  at 
  org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:155)
  at 
  org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:93)
  
  Do you have any ideas about this?
  
  Thanks.
  
  -- 
  Dikang Gu
  0086 - 18611140205
  
 


Re: Schema Disagreement

2011-08-01 Thread Dikang Gu
I thought the schema disagreement problem was already solved in 0.8.1...

One possible solution is to decommission the disagreeing node and rejoin it.


On Tue, Aug 2, 2011 at 8:01 AM, Yi Yang yy...@me.com wrote:

 Dear all,

 I keep running into schema disagreement problems while trying to create
 a column family like this, using cassandra-cli:

 create column family sd
with column_type = 'Super'
and key_validation_class = 'UUIDType'
and comparator = 'LongType'
and subcomparator = 'UTF8Type'
and column_metadata = [
{
column_name: 'time',
validation_class : 'LongType'
},{
column_name: 'open',
validation_class : 'FloatType'
},{
column_name: 'high',
validation_class : 'FloatType'
},{
column_name: 'low',
validation_class : 'FloatType'
},{
column_name: 'close',
validation_class : 'FloatType'
},{
column_name: 'volumn',
validation_class : 'LongType'
},{
column_name: 'splitopen',
validation_class : 'FloatType'
},{
column_name: 'splithigh',
validation_class : 'FloatType'
},{
column_name: 'splitlow',
validation_class : 'FloatType'
},{
column_name: 'splitclose',
validation_class : 'FloatType'
},{
column_name: 'splitvolume',
validation_class : 'LongType'
},{
column_name: 'splitclose',
validation_class : 'FloatType'
}
]
 ;

 I've tried to erase everything and restart Cassandra but this still 
 happens. But when I remove the column_metadata section there is no more 
 disagreement error. Do you have any idea why this happens?

 Environment: 2 VMs, using the same hard drive, Cassandra 0.8.1, Ubuntu 10.04.
 This is for testing only.   We'll move to dedicated servers later.

 Best regards,
 Yi




-- 
Dikang Gu

0086 - 18611140205


HCassandraInternalException when RangeSlicesQuery using hector.

2011-08-02 Thread Dikang Gu
I'm using the hector 0.8.0-2 against cassandra 0.8.1.

When I executed the code:

rangeSlicesQuery.setColumnFamily(columnFamily).setKeys(key, key)
 .setRange(startColumn, null, reversed, count);

I got the following errors:

Caused by: me.prettyprint.hector.api.exceptions.HCassandraInternalException: 
Cassandra encountered an internal error processing this request: 
TApplicationError type: 6 message:Internal error processing get_range_slices
at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:29)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl$3.execute(KeyspaceServiceImpl.java:163)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl$3.execute(KeyspaceServiceImpl.java:145)
at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:232)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getRangeSlices(KeyspaceServiceImpl.java:167)
at me.prettyprint.cassandra.model.thrift.ThriftRangeSlicesQuery$1.doInKeyspace(ThriftRangeSlicesQuery.java:67)
at me.prettyprint.cassandra.model.thrift.ThriftRangeSlicesQuery$1.doInKeyspace(ThriftRangeSlicesQuery.java:63)
at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
at me.prettyprint.cassandra.model.thrift.ThriftRangeSlicesQuery.execute(ThriftRangeSlicesQuery.java:62)


I only got this error when the startColumn is greater than the 'biggest' column 
in the row.
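
For context, the full query construction is roughly the following (the keyspace,
serializers and the surrounding variables are placeholder assumptions; only the
setKeys/setRange chain above is verbatim):

// Hedged reconstruction of the failing call with hector 0.8.0-2.
RangeSlicesQuery<String, String, byte[]> rangeSlicesQuery = HFactory.createRangeSlicesQuery(
        keyspace, StringSerializer.get(), StringSerializer.get(), BytesArraySerializer.get());
rangeSlicesQuery.setColumnFamily(columnFamily)
        .setKeys(key, key)                              // single-key "range"
        .setRange(startColumn, null, reversed, count);  // a startColumn past the last column triggers the error
QueryResult<OrderedRows<String, String, byte[]>> result = rangeSlicesQuery.execute();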

Is this a bug? Any ideas? 

Thanks.

-- 
Dikang Gu
0086 - 18611140205



Re: Schema Disagreement

2011-08-02 Thread Dikang Gu
I also encountered the schema disagreement in my 0.8.1 cluster today…

The disagreement occurred when I created a column family using the hector api, and 
I found the following errors in my cassandra/system.log:

ERROR [pool-2-thread-99] 2011-08-03 11:21:18,051 Cassandra.java (line 3378) Internal error processing remove
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
at org.apache.cassandra.thrift.CassandraServer.internal_remove(CassandraServer.java:539)
at org.apache.cassandra.thrift.CassandraServer.remove(CassandraServer.java:547)
at org.apache.cassandra.thrift.Cassandra$Processor$remove.process(Cassandra.java:3370)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)

And when I try to decommission, I got this:

ERROR [pool-2-thread-90] 2011-08-03 11:24:35,611 Cassandra.java (line 3462) Internal error processing batch_mutate
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
at org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:511)
at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:519)
at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3454)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)

What does this mean? 

Thanks.

-- 
Dikang Gu
0086 - 18611140205
On Tuesday, August 2, 2011 at 6:04 PM, aaron morton wrote: 
 Hang on, using brain now. 
 
  That is triggering a small bug in the code, see 
  https://issues.apache.org/jira/browse/CASSANDRA-2984
  
  For now, just remove the column meta data. 
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 
 
 
 
 On 2 Aug 2011, at 21:19, aaron morton wrote:
   What do you see when you run describe cluster; in the cassandra-cli? What's 
   the exact error you get and is there anything in the server side logs?
  
  Have you added other CF's before adding this one ? Did the schema agree 
  before starting this statement?
  
  I ran the statement below on the current trunk and it worked. 
  
  Cheers
  
  -
  Aaron Morton
  Freelance Cassandra Developer
  @aaronmorton
  http://www.thelastpickle.com
  
  
  
  
  
  On 2 Aug 2011, at 12:08, Dikang Gu wrote:
   I thought the schema disagree problem was already solved in 0.8.1...
   
   On possible solution is to decommission the disagree node and rejoin it.
   
   
   On Tue, Aug 2, 2011 at 8:01 AM, Yi Yang yy...@me.com wrote:
Dear all,

 I'm always meeting mp with schema disagree problems while trying to 
create

Re: Schema Disagreement

2011-08-02 Thread Dikang Gu
I followed the instructions in the FAQ, but got the following when running describe 
cluster;

Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
dd73c740-bd84-11e0--98dab94442fb: [192.168.1.28, 192.168.1.9, 192.168.1.27]
UNREACHABLE: [192.168.1.25]


What's the UNREACHABLE?

Thanks.

-- 
Dikang Gu
0086 - 18611140205
On Wednesday, August 3, 2011 at 11:28 AM, Jonathan Ellis wrote: 
 Have you seen http://wiki.apache.org/cassandra/FAQ#schema_disagreement ?
 
 On Tue, Aug 2, 2011 at 10:25 PM, Dikang Gu dikan...@gmail.com wrote:
  I also encounter the schema disagreement in my 0.8.1 cluster today…
  
  The disagreement occurs when I create a column family using the hector api,
  and I found the following errors in my cassandra/system.log
  ERROR [pool-2-thread-99] 2011-08-03 11:21:18,051 Cassandra.java (line 3378)
  Internal error processing remove
  java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut
  down
  at
  org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
  at
  java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
  at
  java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
  at
  org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
  at
  org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
  at
  org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
  at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
  at
  org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
  at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
  at
  org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
  at
  org.apache.cassandra.thrift.CassandraServer.internal_remove(CassandraServer.java:539)
  at
  org.apache.cassandra.thrift.CassandraServer.remove(CassandraServer.java:547)
  at
  org.apache.cassandra.thrift.Cassandra$Processor$remove.process(Cassandra.java:3370)
  at
  org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
  at
  org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
  at
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at
  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:636)
  And when I try to decommission, I got this:
  ERROR [pool-2-thread-90] 2011-08-03 11:24:35,611 Cassandra.java (line 3462)
  Internal error processing batch_mutate
  java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut
  down
  at
  org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
  at
  java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
  at
  java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
  at
  org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
  at
  org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
  at
  org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
  at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
  at
  org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
  at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
  at
  org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
  at
  org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:511)
  at
  org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:519)
  at
  org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3454)
  at
  org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
  at
  org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
  at
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at
  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:636)
  What does this mean?
  Thanks.
  --
  Dikang Gu
  0086 - 18611140205
  
  On Tuesday, August 2, 2011 at 6:04 PM, aaron morton wrote:
  
  Hang on, using brain now.
  That is triggering a small bug in the code
  see https://issues.apache.org/jira/browse/CASSANDRA-2984
   For now, just remove the column meta data.
  Cheers
  -
  Aaron Morton
  Freelance Cassandra Developer
  @aaronmorton
  http://www.thelastpickle.com
  On 2 Aug 2011, at 21:19, aaron morton wrote:
  
  What do you see when you run describe cluster; in the cassandra-cli ? Whats
  the exact error you get

Cassandra encountered an internal error processing this request: TApplicationError type: 6 message:Internal error

2011-08-02 Thread Dikang Gu
I got this error when processing a lot of operations…

2011-08-03 11:26:35,786 ERROR [com.iw.nebula.dao.simpledb.SimpleDBAdapter] - 
Cassandra encountered an internal error processing this request: 
TApplicationError type: 6 message:Internal error processing batch_mutate

2011-08-03 11:48:21,998 ERROR [com.iw.nebula.dao.simpledb.SimpleDBAdapter] - 
Cassandra encountered an internal error processing this request: 
TApplicationError type: 6 message:Internal error processing get_range_slices

I did not see anything wrong in the cassandra/system.log

What are your suggestions?

-- 
Dikang Gu
0086 - 18611140205


Do we have any restrictions on the number of column families in a keyspace?

2011-08-02 Thread Dikang Gu
As per subject.

Thanks.
-- 
Dikang Gu
0086 - 18611140205


If I always send the schema change requests to one particular node in the cassandra cluster.

2011-08-03 Thread Dikang Gu
Hi,

Can the schema disagreement problem be avoided?

Thanks.
-- 
Dikang Gu
0086 - 18611140205


Re: Cassandra encountered an internal error processing this request: TApplicationError type: 6 message:Internal error

2011-08-04 Thread Dikang Gu
Sure, I can find the stack trace for some exceptions:

ERROR [pool-2-thread-132] 2011-07-23 13:29:04,869 Cassandra.java (line 3210) Internal error processing get_range_slices
java.lang.NullPointerException
at org.apache.cassandra.db.ColumnFamily.diff(ColumnFamily.java:298)
at org.apache.cassandra.db.ColumnFamily.diff(ColumnFamily.java:406)
at org.apache.cassandra.service.RowRepairResolver.maybeScheduleRepairs(RowRepairResolver.java:103)
at org.apache.cassandra.service.RangeSliceResponseResolver$2.getReduced(RangeSliceResponseResolver.java:120)
at org.apache.cassandra.service.RangeSliceResponseResolver$2.getReduced(RangeSliceResponseResolver.java:85)
at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:74)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:715)
at org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:617)
at org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:3202)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
 INFO [NonPeriodicTasks:1] 2011-07-23 13:38:23,284 ColumnFamilyStore.java (line 1013) Enqueuing flush of Memtable-MessageKey@2036597133(5020/62750 serialized/live bytes, 61 ops)

But I cannot find one for some others:

ERROR [pool-2-thread-181] 2011-07-27 11:20:39,550 Cassandra.java (line 3210) Internal error processing get_range_slices
java.lang.NullPointerException
 INFO [NonPeriodicTasks:1] 2011-07-27 11:22:43,561 ColumnFamilyStore.java (line 1013) Enqueuing flush of Memtable-MessageKey@1288355086(74715/933937 serialized/live bytes, 773 ops)

Why does this happen?

Thanks.

On Fri, Aug 5, 2011 at 6:26 AM, aaron morton aa...@thelastpickle.com wrote:

 The error log will contain a call stack, we need that.

 e.g.

 Failed with exception java.io.IOException:java.lang.NullPointerException
 ERROR 15:22:33,528 Failed with exception
 java.io.IOException:java.lang.NullPointerException
 java.io.IOException: java.lang.NullPointerException
 at
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:341)
 at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:133)
 at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1114)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 Caused by: java.lang.NullPointerException
 at
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getCurrentKey(ColumnFamilyRecordReader.java:82)
 at
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getCurrentKey(ColumnFamilyRecordReader.java:53)
 at
 org.apache.hadoop.hive.cassandra.input.HiveCassandraStandardColumnInputFormat$2.next(HiveCassandraStandardColumnInputFormat.java:164)
 at
 org.apache.hadoop.hive.cassandra.input.HiveCassandraStandardColumnInputFormat$2.next(HiveCassandraStandardColumnInputFormat.java:111)
 at
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:326)
 ... 10 more

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 4 Aug 2011, at 15:26, Dikang Gu wrote:

  Yes, I do find the error log!

 ERROR [pool-2-thread-63] 2011-08-04 13:23:54,138 Cassandra.java (line 3210)
 Internal error processing get_range_slices
 java.lang.NullPointerException

 I'm using the cassandra-0.8.1, is this a known bug?

 Thanks.

 --
 Dikang Gu
 0086 - 18611140205

 On Wednesday, August 3, 2011 at 7:53 PM, aaron morton wrote:

 There really should be something logged at the ERROR level in the server 
 side log; that error is raised when an unhandled exception bubbles out to the 
 thrift layer on the server.

 Double check the logging is configured correctly.
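 
 For reference, server-side logging in 0.8 is configured in conf/log4j-server.properties; 
 with something like the default settings below (the exact path is an assumption about a 
 standard install), unhandled exceptions should show up as ERROR lines in system.log:
 
 # conf/log4j-server.properties (typical 0.8-era defaults)
 log4j.rootLogger=INFO,stdout,R
 log4j.appender.R.File=/var/log/cassandra/system.log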

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 3 Aug 2011, at 14:19, Dikang Gu wrote:

 I got this error when processing a lot

How to solve this kind of schema disagreement...

2011-08-05 Thread Dikang Gu
[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
743fe590-bf48-11e0--4d205df954a7: [192.168.1.28]
75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
06da9aa0-bda8-11e0--9510c23fceff: [192.168.1.27]


 three different schema versions in the cluster...

-- 
Dikang Gu

0086 - 18611140205


Re: How to solve this kind of schema disagreement...

2011-08-06 Thread Dikang Gu
I have tried this, but the schema still does not agree in the cluster:

[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
UNREACHABLE: [192.168.1.28]
75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]

Any other suggestions to solve this?

Because I have some production data saved in the cassandra cluster, I cannot
afford data loss...

Thanks.

On Fri, Aug 5, 2011 at 8:55 PM, Benoit Perroud ben...@noisette.ch wrote:

 Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement,
 75eece10-bf48-11e0--4d205df954a7 owns the majority, so shut down and
 remove the schema* and migration* sstables from both 192.168.1.28 and
 192.168.1.27
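
 In concrete terms, the FAQ procedure looks roughly like this on each of the two
 disagreeing nodes (the data directory path is an assumption about a default install;
 back the files up rather than deleting them):

 # stop cassandra on the node, then move the schema/migration sstables aside
 mv /var/lib/cassandra/data/system/Schema-*     /tmp/schema-backup/
 mv /var/lib/cassandra/data/system/Migrations-* /tmp/schema-backup/
 # restart; the node should pull the schema from the rest of the cluster
 bin/cassandra
 # verify afterwards with cassandra-cli:  describe cluster;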


 2011/8/5 Dikang Gu dikan...@gmail.com:
  [default@unknown] describe cluster;
  Cluster Information:
 Snitch: org.apache.cassandra.locator.SimpleSnitch
 Partitioner: org.apache.cassandra.dht.RandomPartitioner
 Schema versions:
  743fe590-bf48-11e0--4d205df954a7: [192.168.1.28]
  75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
  06da9aa0-bda8-11e0--9510c23fceff: [192.168.1.27]
 
   three different schema versions in the cluster...
  --
  Dikang Gu
  0086 - 18611140205
 




-- 
Dikang Gu

0086 - 18611140205


Re: How to solve this kind of schema disagreement...

2011-08-06 Thread Dikang Gu
I restarted both nodes, deleted the schema* and migration* sstables, and restarted
them.

The current cluster looks like this:
[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9,
192.168.1.25]
5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]

The 1.28 looks good, but the 1.27 node still cannot reach schema agreement...

I have tried several times, even deleted all the data on 1.27 and rejoined it
as a new node, but it is still unhappy.

And the ring looks like this:

Address       DC           Rack   Status  State    Load     Owns    Token
                                                                    127605887595351923798765477786913079296
192.168.1.28  datacenter1  rack1  Up      Normal   8.38 GB  25.00%  1
192.168.1.25  datacenter1  rack1  Up      Normal   8.55 GB  34.01%  57856537434773737201679995572503935972
192.168.1.27  datacenter1  rack1  Up      Joining  1.81 GB  24.28%  99165710459060760249270263771474737125
192.168.1.9   datacenter1  rack1  Up      Normal   8.75 GB  16.72%  127605887595351923798765477786913079296

The 1.27 node seems unable to join the cluster; it just hangs there...

Any suggestions?

Thanks.


On Sun, Aug 7, 2011 at 10:01 AM, aaron morton aa...@thelastpickle.com wrote:

 After the restart, what was in the logs for the 1.27 machine from
 the Migration.java logger? Some of the messages will start with "Applying
 migration".

 You should have shut down both of the nodes, then deleted the schema* and
 migration* system sstables, then restarted one of them and watched to see if
 it got to schema agreement.

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 6 Aug 2011, at 22:56, Dikang Gu wrote:

 I have tried this, but the schema still does not agree in the cluster:

 [default@unknown] describe cluster;
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
 UNREACHABLE: [192.168.1.28]
 75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
  5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]

 Any other suggestions to solve this?

 Because I have some production data saved in the cassandra cluster, so I
 can not afford data lost...

 Thanks.

 On Fri, Aug 5, 2011 at 8:55 PM, Benoit Perroud ben...@noisette.ch wrote:

 Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement,
 75eece10-bf48-11e0--4d205df954a7 own the majority, so shutdown and
 remove the schema* and migration* sstables from both 192.168.1.28 and
 192.168.1.27


 2011/8/5 Dikang Gu dikan...@gmail.com:
  [default@unknown] describe cluster;
  Cluster Information:
 Snitch: org.apache.cassandra.locator.SimpleSnitch
 Partitioner: org.apache.cassandra.dht.RandomPartitioner
 Schema versions:
  743fe590-bf48-11e0--4d205df954a7: [192.168.1.28]
  75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
  06da9aa0-bda8-11e0--9510c23fceff: [192.168.1.27]
 
   three different schema versions in the cluster...
  --
  Dikang Gu
  0086 - 18611140205
 




 --
 Dikang Gu

 0086 - 18611140205





-- 
Dikang Gu

0086 - 18611140205


Re: move one node for load re-balancing then it status stuck at Leaving

2011-08-07 Thread Dikang Gu
Yes, I think you are right.

The nodetool move will move the keys on the node to the other two nodes,
but the required replication factor is 3 and you will only have 2 live nodes after
the move, so you get the exception.


On Sun, Aug 7, 2011 at 2:03 PM, Yan Chunlu springri...@gmail.com wrote:

 is it possible that the implementation of cassandra only counts live
 nodes?

 for example:
 nodetool move on node3 causes node3 to be Leaving, then cassandra iterates over the
 endpoints and finds node1 and node2. So the endpoint count is 2, but RF=3, and the
 Exception is raised.

 is that true?



 On Fri, Aug 5, 2011 at 3:20 PM, Yan Chunlu springri...@gmail.com wrote:

 nothing...

 nodetool -h node3 netstats
 Mode: Normal
 Not sending any streams.
  Nothing streaming from /10.28.53.11
 Pool NameActive   Pending  Completed
 Commandsn/a 0  186669475
 Responses   n/a 0  117986130


 nodetool -h node3 compactionstats
 compaction type: n/a
 column family: n/a
 bytes compacted: n/a
 bytes total in progress: n/a
 pending tasks: 0



 On Fri, Aug 5, 2011 at 1:47 PM, mcasandra mohitanch...@gmail.com wrote:
  Check things like netstats, disk space etc to see why it's in Leaving
 state.
  Anything in the logs that shows Leaving?
 
  --
  View this message in context:
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/move-one-node-for-load-re-balancing-then-it-status-stuck-at-Leaving-tp6655168p6655326.html
  Sent from the cassandra-u...@incubator.apache.org mailing list archive
 at Nabble.com.
 





-- 
Dikang Gu

0086 - 18611140205


Re: Cassandra encountered an internal error processing this request: TApplicationError type: 6 message:Internal error

2011-08-07 Thread Dikang Gu
That's great!

Thanks Aaron.

On Sun, Aug 7, 2011 at 2:21 PM, aaron morton aa...@thelastpickle.com wrote:

 The NPE is fixed in 0.8.2 see
 https://github.com/apache/cassandra/blob/cassandra-0.8.2/CHANGES.txt#L13

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 5 Aug 2011, at 12:46, Dikang Gu wrote:

 Sure, I can find the stack trace for some exceptions:

 ERROR [pool-2-thread-132] 2011-07-23 13:29:04,869 Cassandra.java (line
 3210) Internal error processing get_range_slices
 java.lang.NullPointerException
 at org.apache.cassandra.db.ColumnFamily.diff(ColumnFamily.java:298)
 at org.apache.cassandra.db.ColumnFamily.diff(ColumnFamily.java:406)
 at
 org.apache.cassandra.service.RowRepairResolver.maybeScheduleRepairs(RowRepairResolver.java:103)
 at
 org.apache.cassandra.service.RangeSliceResponseResolver$2.getReduced(RangeSliceResponseResolver.java:120)
 at
 org.apache.cassandra.service.RangeSliceResponseResolver$2.getReduced(RangeSliceResponseResolver.java:85)
 at
 org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:74)
 at
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
 at
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
 at
 org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:715)
 at
 org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:617)
 at
 org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:3202)
 at
 org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
 at
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:636)
  INFO [NonPeriodicTasks:1] 2011-07-23 13:38:23,284 ColumnFamilyStore.java
 (line 1013) Enqueuing flush of Memtable-MessageKey@2036597133(5020/62750
 serialized/live bytes, 61 ops)

 But can no for some others:

 ERROR [pool-2-thread-181] 2011-07-27 11:20:39,550 Cassandra.java (line
 3210) Internal error processing get_range_slices
 java.lang.NullPointerException
  INFO [NonPeriodicTasks:1] 2011-07-27 11:22:43,561 ColumnFamilyStore.java
 (line 1013) Enqueuing flush of Memtable-MessageKey@1288355086(74715/933937
 serialized/live bytes, 773 ops)

 Why does this happen?

 Thanks.

  On Fri, Aug 5, 2011 at 6:26 AM, aaron morton aa...@thelastpickle.com wrote:

 The error log will contain a call stack, we need that.

 e.g.

 Failed with exception java.io.IOException:java.lang.NullPointerException
 ERROR 15:22:33,528 Failed with exception
 java.io.IOException:java.lang.NullPointerException
 java.io.IOException: java.lang.NullPointerException
  at
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:341)
 at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:133)
  at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1114)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 Caused by: java.lang.NullPointerException
 at
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getCurrentKey(ColumnFamilyRecordReader.java:82)
  at
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader.getCurrentKey(ColumnFamilyRecordReader.java:53)
 at
 org.apache.hadoop.hive.cassandra.input.HiveCassandraStandardColumnInputFormat$2.next(HiveCassandraStandardColumnInputFormat.java:164)
  at
 org.apache.hadoop.hive.cassandra.input.HiveCassandraStandardColumnInputFormat$2.next(HiveCassandraStandardColumnInputFormat.java:111)
 at
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:326)
  ... 10 more

 Cheers

  -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 4 Aug 2011, at 15:26, Dikang Gu wrote:

  Yes, I do find the error log!

 ERROR [pool-2-thread-63] 2011-08-04 13:23:54,138 Cassandra.java (line
 3210) Internal error processing get_range_slices
 java.lang.NullPointerException

 I'm using the cassandra-0.8.1, is this a known bug?

 Thanks.

 --
 Dikang Gu
 0086 - 18611140205

 On Wednesday, August 3, 2011 at 7:53 PM, aaron morton wrote

Re: How to solve this kind of schema disagreement...

2011-08-07 Thread Dikang Gu
Hi Aaron, 

I repeated the whole procedure:

1. kill the cassandra instance on 1.27.
2. rm the data/system/Migrations-g-*
3. rm the data/system/Schema-g-*
4. bin/cassandra to start the cassandra.

Now, the migration seems to stop, and I do not find any errors in the system.log yet.

The ring looks good:
[root@yun-phy2 apache-cassandra-0.8.1]# bin/nodetool -h192.168.1.27 -p8090 ring
Address       DC           Rack   Status  State   Load     Owns    Token
                                                                   127605887595351923798765477786913079296
192.168.1.28  datacenter1  rack1  Up      Normal  8.38 GB  25.00%  1
192.168.1.25  datacenter1  rack1  Up      Normal  8.54 GB  34.01%  57856537434773737201679995572503935972
192.168.1.27  datacenter1  rack1  Up      Normal  1.78 GB  24.28%  99165710459060760249270263771474737125
192.168.1.9   datacenter1  rack1  Up      Normal  8.75 GB  16.72%  127605887595351923798765477786913079296


But the schema is still not correct:
Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 192.168.1.25]
5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]


The 5a54ebd0-bd90-11e0--9510c23fceff is the same as last time…

And in the log, the last Migration.java log is:
INFO [MigrationStage:1] 2011-08-08 11:41:30,293 Migration.java (line 116) 
Applying migration 5a54ebd0-bd90-11e0--9510c23fceff Add keyspace: 
SimpleDB_4E38DAA64894A9146105rep 
strategy:SimpleStrategy{}durable_writes: true

Could you explain this? 

If I change the token given to 1.27 to another one, will it help?

Thanks.
-- 
Dikang Gu
0086 - 18611140205
On Sunday, August 7, 2011 at 4:14 PM, aaron morton wrote: 
 did you check the logs in 1.27 for errors ? 
 
 Could you be seeing this ? 
 https://issues.apache.org/jira/browse/CASSANDRA-2867
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 
 
 
 
 On 7 Aug 2011, at 16:24, Dikang Gu wrote:
   I restart both nodes, and deleted the schema* and migration* and restarted 
  them.
  
  The current cluster looks like this:
  [default@unknown] describe cluster; 
  Cluster Information:
  Snitch: org.apache.cassandra.locator.SimpleSnitch
  Partitioner: org.apache.cassandra.dht.RandomPartitioner
  Schema versions: 
  75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 
  192.168.1.25]
  5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]
  
  
  the 1.28 looks good, and the 1.27 still can not get the schema agreement...
  
  I have tried several times, even delete all the data on 1.27, and rejoin it 
  as a new node, but it is still unhappy. 
  
  And the ring looks like this: 
  
  Address  DC Rack Status State  Load Owns Token 
  127605887595351923798765477786913079296 
  192.168.1.28 datacenter1 rack1  Up  Normal 8.38 GB  25.00% 1 
  192.168.1.25 datacenter1 rack1  Up  Normal 8.55 GB  34.01% 
  57856537434773737201679995572503935972 
  192.168.1.27 datacenter1 rack1  Up Joining 1.81 GB  24.28% 
  99165710459060760249270263771474737125 
   192.168.1.9  datacenter1 rack1  Up  Normal 8.75 GB  16.72% 
  127605887595351923798765477786913079296 
  
  
  The 1.27 seems can not join the cluster, and it just hangs there... 
  
  Any suggestions?
  
  Thanks.
  
  
  On Sun, Aug 7, 2011 at 10:01 AM, aaron morton aa...@thelastpickle.com 
  wrote:
   After there restart you what was in the logs for the 1.27 machine from 
   the Migration.java logger ? Some of the messages will start with 
   Applying migration 
   
   You should have shut down both of the nodes, then deleted the schema* and 
   migration* system sstables, then restarted one of them and watched to see 
   if it got to schema agreement. 
   
Cheers
   
   -
   Aaron Morton
   Freelance Cassandra Developer
   @aaronmorton
   http://www.thelastpickle.com
   
   
   
   
   
   On 6 Aug 2011, at 22:56, Dikang Gu wrote:
I have tried this, but the schema still does not agree in the cluster:

[default@unknown] describe cluster; 
Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
UNREACHABLE: [192.168.1.28]
75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25]
5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]

Any other suggestions to solve this?

Because I have some production data saved in the cassandra cluster, so 
I can not afford data lost... 

Thanks.
On Fri, Aug 5, 2011 at 8:55 PM, Benoit Perroud ben...@noisette.ch 
wrote:
  Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement,
  75eece10-bf48-11e0--4d205df954a7 own the majority, so shutdown 
 and
  remove the schema* and migration* sstables from both 192.168.1.28 and
  192.168.1.27
 
 
  2011/8/5 Dikang Gu dikan...@gmail.com:
  [default@unknown] describe cluster

Re: How to solve this kind of schema disagreement...

2011-08-09 Thread Dikang Gu
Hi Aaron,

I set the log level to be DEBUG, and find a lot of forceFlush debug info in the 
log:

DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) 
forceFlush requested but everything is clean
DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) 
forceFlush requested but everything is clean
DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) 
forceFlush requested but everything is clean
DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 725) 
forceFlush requested but everything is clean

What does this mean?

Thanks.


-- 
Dikang Gu
0086 - 18611140205
On Wednesday, August 10, 2011 at 6:42 AM, aaron morton wrote: 
 um. There has got to be something stopping the migration from completing. 
 
 Turn the logging up to DEBUG before starting and look for messages from 
 MigrationManager.java
 
 Provide all the log messages from Migration.java on the 1.27 node
 
 Cheers
 
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 
 
 
 
 On 8 Aug 2011, at 15:52, Dikang Gu wrote:
  Hi Aaron, 
  
  I repeat the whole procedure:
  
  1. kill the cassandra instance on 1.27.
  2. rm the data/system/Migrations-g-*
  3. rm the data/system/Schema-g-*
  4. bin/cassandra to start the cassandra.
  
  Now, the migration seems stop and I do not find any error in the system.log 
  yet.
  
  The ring looks good:
  [root@yun-phy2 apache-cassandra-0.8.1]# bin/nodetool -h192.168.1.27 -p8090 
  ring
  Address  DC Rack Status State  Load Owns Token 
  127605887595351923798765477786913079296 
  192.168.1.28 datacenter1 rack1  Up  Normal 8.38 GB  25.00% 1 
  192.168.1.25 datacenter1 rack1  Up  Normal 8.54 GB  34.01% 
  57856537434773737201679995572503935972 
  192.168.1.27 datacenter1 rack1  Up Normal 1.78 GB  24.28% 
  99165710459060760249270263771474737125 
  192.168.1.9  datacenter1 rack1  Up  Normal 8.75 GB  16.72% 
  127605887595351923798765477786913079296 
  
  
  But the schema still does not correct:
  Cluster Information:
  Snitch: org.apache.cassandra.locator.SimpleSnitch
  Partitioner: org.apache.cassandra.dht.RandomPartitioner
  Schema versions: 
  75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 
  192.168.1.25]
  5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]
  
  
  The 5a54ebd0-bd90-11e0--9510c23fceff is same as last time…
  
  And in the log, the last Migration.java log is:
  INFO [MigrationStage:1] 2011-08-08 11:41:30,293 Migration.java (line 116) 
  Applying migration 5a54ebd0-bd90-11e0--9510c23fceff Add keyspace: 
  SimpleDB_4E38DAA64894A9146105rep 
  strategy:SimpleStrategy{}durable_writes: true
  
  Could you explain this? 
  
  If I change the token given to 1.27 to another one, will it help?
  
  Thanks.
  -- 
  Dikang Gu
  0086 - 18611140205
  On Sunday, August 7, 2011 at 4:14 PM, aaron morton wrote:
   did you check the logs in 1.27 for errors ? 
   
   Could you be seeing this ? 
   https://issues.apache.org/jira/browse/CASSANDRA-2867
   
   Cheers
   
   -
   Aaron Morton
   Freelance Cassandra Developer
   @aaronmorton
   http://www.thelastpickle.com
   
   
   
   
   
   On 7 Aug 2011, at 16:24, Dikang Gu wrote:
I restart both nodes, and deleted the schema* and migration* and 
restarted them.

The current cluster looks like this:
[default@unknown] describe cluster; 
Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 
192.168.1.25]
5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]


the 1.28 looks good, and the 1.27 still cannot reach schema 
agreement...

I have tried several times, even deleted all the data on 1.27 and 
rejoined it as a new node, but it is still unhappy. 

And the ring looks like this: 

Address  DC Rack Status State  Load Owns Token 
127605887595351923798765477786913079296 
192.168.1.28 datacenter1 rack1  Up  Normal 8.38 GB  25.00% 1 
192.168.1.25 datacenter1 rack1  Up  Normal 8.55 GB  34.01% 
57856537434773737201679995572503935972 
192.168.1.27 datacenter1 rack1  Up Joining 1.81 GB  24.28% 
99165710459060760249270263771474737125 
 192.168.1.9  datacenter1 rack1  Up  Normal 8.75 GB  16.72% 
127605887595351923798765477786913079296 


The 1.27 seems unable to join the cluster, and it just hangs there... 

Any suggestions?

Thanks.


On Sun, Aug 7, 2011 at 10:01 AM, aaron morton aa...@thelastpickle.com 
wrote:
 After the restart, what was in the logs for the 1.27 machine 
 from the Migration.java logger? Some of the messages will start with 
 Applying migration 
 
 You should have shut down both of the nodes

Re: How to solve this kind of schema disagreement...

2011-08-09 Thread Dikang Gu
And a lot of "Migration not applied" logs.

DEBUG [MigrationStage:1] 2011-08-10 11:36:29,376 
DefinitionsUpdateVerbHandler.java (line 70) Applying AddColumnFamily from 
/192.168.1.9
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,376 
DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous 
version mismatch. cannot apply.
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,379 
DefinitionsUpdateVerbHandler.java (line 70) Applying AddColumnFamily from 
/192.168.1.9
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,379 
DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous 
version mismatch. cannot apply.
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,382 
DefinitionsUpdateVerbHandler.java (line 70) Applying AddColumnFamily from 
/192.168.1.9
DEBUG [MigrationStage:1] 2011-08-10 11:36:29,382 
DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous 
version mismatch. cannot apply.


-- 
Dikang Gu
0086 - 18611140205
On Wednesday, August 10, 2011 at 11:35 AM, Dikang Gu wrote: 
 Hi Aaron,
 
 I set the log level to be DEBUG, and find a lot of forceFlush debug info in 
 the log:
 
 DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 
 725) forceFlush requested but everything is clean
 DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 
 725) forceFlush requested but everything is clean
 DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 
 725) forceFlush requested but everything is clean
 DEBUG [StreamStage:1] 2011-08-10 11:31:56,345 ColumnFamilyStore.java (line 
 725) forceFlush requested but everything is clean
 
 What does this mean?
 
 Thanks.
 
 
 -- 
 Dikang Gu
 0086 - 18611140205
 On Wednesday, August 10, 2011 at 6:42 AM, aaron morton wrote:
  um. There has got to be something stopping the migration from completing. 
  
  Turn the logging up to DEBUG before starting and look for messages from 
  MigrationManager.java
  
  Provide all the log messages from Migration.java on the 1.27 node
  
  Cheers
  
  
  -
  Aaron Morton
  Freelance Cassandra Developer
  @aaronmorton
  http://www.thelastpickle.com
  
  
  
  
  
  On 8 Aug 2011, at 15:52, Dikang Gu wrote:
   Hi Aaron, 
   
   I repeat the whole procedure:
   
   1. kill the cassandra instance on 1.27.
   2. rm the data/system/Migrations-g-*
   3. rm the data/system/Schema-g-*
   4. bin/cassandra to start the cassandra.
   
   Now, the migration seems stop and I do not find any error in the 
   system.log yet.
   
   The ring looks good:
   [root@yun-phy2 apache-cassandra-0.8.1]# bin/nodetool -h192.168.1.27 
   -p8090 ring
   Address  DC Rack Status State  Load Owns Token 
   127605887595351923798765477786913079296 
   192.168.1.28 datacenter1 rack1  Up  Normal 8.38 GB  25.00% 1 
   192.168.1.25 datacenter1 rack1  Up  Normal 8.54 GB  34.01% 
   57856537434773737201679995572503935972 
   192.168.1.27 datacenter1 rack1  Up Normal 1.78 GB  24.28% 
   99165710459060760249270263771474737125 
   192.168.1.9  datacenter1 rack1  Up  Normal 8.75 GB  16.72% 
   127605887595351923798765477786913079296 
   
   
   But the schema still does not correct:
   Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions: 
   75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 
   192.168.1.25]
   5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]
   
   
   The 5a54ebd0-bd90-11e0--9510c23fceff is same as last time…
   
   And in the log, the last Migration.java log is:
   INFO [MigrationStage:1] 2011-08-08 11:41:30,293 Migration.java (line 116) 
   Applying migration 5a54ebd0-bd90-11e0--9510c23fceff Add keyspace: 
   SimpleDB_4E38DAA64894A9146105rep 
   strategy:SimpleStrategy{}durable_writes: true
   
   Could you explain this? 
   
   If I change the token given to 1.27 to another one, will it help?
   
   Thanks.
   -- 
   Dikang Gu
   0086 - 18611140205
   On Sunday, August 7, 2011 at 4:14 PM, aaron morton wrote:
did you check the logs in 1.27 for errors ? 

Could you be seeing this ? 
https://issues.apache.org/jira/browse/CASSANDRA-2867

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com





On 7 Aug 2011, at 16:24, Dikang Gu wrote:
 I shut down both nodes, deleted the schema* and migration* files, and 
 restarted them.
 
 The current cluster looks like this:
 [default@unknown] describe cluster; 
 Cluster Information:
 Snitch: org.apache.cassandra.locator.SimpleSnitch
 Partitioner: org.apache.cassandra.dht.RandomPartitioner
 Schema versions: 
 75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 
 192.168.1.25]
 5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]
 
 
 the 1.28 looks good, and the 1.27 still can not get the schema

Exception encountered during startup.

2011-09-08 Thread Dikang Gu
I have 4 Cassandra 0.8.1 nodes in the cluster; one node crashed and I'm
trying to restart it.

But I encountered the following errors during startup. Is this a known
bug?


DEBUG [main] 2011-09-08 20:26:17,959 Table.java (line 305) Initializing
system.NodeIdInfo
DEBUG [main] 2011-09-08 20:26:17,963 ColumnFamilyStore.java (line 264)
Starting CFS NodeIdInfo
DEBUG [main] 2011-09-08 20:26:17,967 AutoSavingCache.java (line 175)
KeyCache capacity for NodeIdInfo is 1
ERROR [main] 2011-09-08 20:26:17,969 AbstractCassandraDaemon.java (line 332)
Exception encountered during startup.
java.lang.RuntimeException: javax.management.InstanceAlreadyExistsException:
org.apache.cassandra.db:type=ColumnFamilies,keyspace=system,columnfamily=NodeIdInfo
at
org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:315)
at
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:455)
at
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:436)
at org.apache.cassandra.db.Table.initCf(Table.java:369)
at org.apache.cassandra.db.Table.init(Table.java:306)
at org.apache.cassandra.db.Table.open(Table.java:111)
at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:212)
at
org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:128)
at
org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:315)
at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:80)
Caused by: javax.management.InstanceAlreadyExistsException:
org.apache.cassandra.db:type=ColumnFamilies,keyspace=system,columnfamily=NodeIdInfo
at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:453)
at
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1484)
at
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:963)
at
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:917)
at
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:312)
at
com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:482)
at
org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:311)
... 9 more

Thanks.

-- 
Dikang Gu

0086 - 18611140205


Re: Ignorning message. showing in the log while upgrade to 0.8

2011-09-16 Thread Dikang Gu
You might need to do the nodetool scrub on the nodes to rebuild the sstables
for the different protocols.

On Fri, Sep 16, 2011 at 10:50 PM, Yan Chunlu springri...@gmail.com wrote:

 and also the load is unusual (node1 had 80M of data before the upgrade):

 bash-3.2$ bin/nodetool -h localhost ring
 Address DC  RackStatus State   LoadOwns
Token

93798607613553124915572813490354413064
 node2   datacenter1 rack1   Up Normal  86.03 MB46.81%
  3303745385038694806791595159000401786
 node3   datacenter1 rack1   Up Normal  67.68 MB26.65%
  48642301133762927375044585593194981764
 node1   datacenter1 rack1   Up Normal  114.81 KB   26.54%
  93798607613553124915572813490354413064



 On Fri, Sep 16, 2011 at 10:48 PM, Yan Chunlu springri...@gmail.comwrote:

 after kill node1 and start it again, node 3 has the same problems with
 node2...


 On Fri, Sep 16, 2011 at 10:42 PM, Yan Chunlu springri...@gmail.comwrote:

 I am running local tests on upgrading Cassandra from 0.7.4 to 0.8.5.
 After upgrading node1, two problems happened:

 1,  node2 keep saying:

 Received connection from newer protocol version. Ignorning message.

 is that normal behaviour?

 2, while running describe cluster on node1, it shows node2 unreachable:
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
 UNREACHABLE: [node2]
 05f1ee3b-e063-11e0-97d5-63c2fb3f0ca8: [node1, node3]

 node3 seems act normal.


 I saw the JMX port has changed since 0.8; is that the reason the node was
 unreachable?


 thanks!







-- 
Dikang Gu

0086 - 18611140205


Could not reach schema agreement when adding a new node.

2011-09-24 Thread Dikang Gu
I found this in the system.log when adding a new node to the cluster.

Anyone familiar with this?

ERROR [HintedHandoff:2] 2011-09-24 18:01:30,498 AbstractCassandraDaemon.java
(line 113) Fatal exception in thread Thread[HintedHandoff:2,1,main]
java.lang.RuntimeException: java.lang.RuntimeException: Could not reach
schema agreement with /192.168.1.9 in 6ms
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.RuntimeException: Could not reach schema agreement with
/192.168.1.9 in 6ms
at
org.apache.cassandra.db.HintedHandOffManager.waitForSchemaAgreement(HintedHandOffManager.java:290)
at
org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:301)
at
org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:89)
at
org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:394)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more

Thanks.

-- 
Dikang Gu

0086 - 18611140205


Re: [RELEASE] Apache Cassandra 1.0 released

2011-10-18 Thread Dikang Gu
Congrats!

In 0.8, the schema disagreement occurs sometimes when I create
keyspaces/column families dynamically, is this also fixed?

Regards.

On Tue, Oct 18, 2011 at 9:20 PM, Jonathan Ellis jbel...@gmail.com wrote:

 Short version: yes, 1.0 addresses the known repair problems.

 On Tue, Oct 18, 2011 at 7:29 AM, Maxim Potekhin potek...@bnl.gov wrote:
  There was a problem in early 0.8 where the repair was taking
  forever -- am I right to assume this was fixed in 1.0?
 
  Many thanks to you guys,
 
  Maxim
 
 
  On 10/18/2011 2:25 PM, Thibaut Britz wrote:
 
  Great news!
 
  Especially the improved read performance and compactions are great!
 
  Thanks,
  Thibaut
 
 
  On Tue, Oct 18, 2011 at 2:11 PM, Jonathan Ellisjbel...@gmail.com
  wrote:
 
  Thanks for the help, everyone!  This is a great milestone for
 Cassandra.
 
  On Tue, Oct 18, 2011 at 7:01 AM, Sylvain Lebresnesylv...@datastax.com
 
   wrote:
 
  The Cassandra team is very pleased to announce the release of Apache
  Cassandra
  version 1.0.0. Cassandra 1.0.0 is a new major release that build upon
  the
  awesomeness of previous versions and adds numerous improvements[1,2],
  amongst
  which:
   - Compression of on-disk data files (SSTables), with checksummed
 blocks
  to
 protect against bitrot[4].
   - Improvements to memory management through off-heap caches, arena
 allocation and automatic self-tuning, for less GC pauses and more
 predictable performances[5].
   - Better disk-space management: better control of the space taken by
  commit
 logs and immediate deletion of obsolete data files.
   - New optional leveled compaction strategy with more predictable
  performance
 and fixed sstable size[6].
   - Improved hinted handoffs, leading to less need for read repair for
 better read performances.
   - Lots of improvements to performance[7], CQL, repair, easier
  operation,
 etc[8]...
 
  And as is the rule for some time now, rolling upgrades from previous
  versions
  are supported, so there is nothing stopping you to get all those
 goodies
  right
  now!
 
  Both source and binary distributions of Cassandra 1.0.0 can be
  downloaded at:
 
   http://cassandra.apache.org/download/
 
  Or you can use the debian package available from the project APT
  repository[3]
  (you will need to use the 10x series).
 
  The download page also link to the CQL drivers that, from this release
  on, are
  maintained out of tree[9].
 
 
  That's all folks!
 
  [1]: http://goo.gl/t3qpw (CHANGES.txt)
  [2]: http://goo.gl/6t0qN (NEWS.txt)
  [3]: http://wiki.apache.org/cassandra/DebianPackaging
  [4]:
 
 http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression
  [5]:
 
 http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management
  [6]:
 
 http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra
  [7]:
 
 http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-performance
  [8]:
 
 http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-windows-service-new-cql-clients-and-more
  [9]: http://acunu.com/blogs/eric-evans/cassandra-drivers-released/
 
 
 
  --
  Jonathan Ellis
  Project Chair, Apache Cassandra
  co-founder of DataStax, the source for professional Cassandra support
  http://www.datastax.com
 
 
 



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com




-- 
Dikang Gu

0086 - 18611140205


Questions about the nodetool ring.

2011-04-12 Thread Dikang Gu
I have 3 cassandra 0.7.4 nodes in a cluster, and I get the ring stats:

[root@yun-phy2 apache-cassandra-0.7.4]# bin/nodetool -h 192.168.1.28 -p 8090
ring
Address         Status  State   Load        Owns    Token
                                                    109028275973926493413574716008500203721
192.168.1.25    Up      Normal  157.25 MB   69.92%  57856537434773737201679995572503935972
192.168.1.27    Up      Normal  201.71 MB   24.28%  99165710459060760249270263771474737125
192.168.1.28    Up      Normal  68.12 MB    5.80%   109028275973926493413574716008500203721

The load and owns vary on each node; is this normal? And is there a way to
balance the three nodes?

Thanks.

-- 
Dikang Gu

0086 - 18611140205


Re: Questions about the nodetool ring.

2011-04-12 Thread Dikang Gu
The 3 nodes were added to the cluster at the same time, so I'm not sure why
the data vary.

I calculate the tokens and get:
node 0: 0
node 1: 56713727820156410577229101238628035242
node 2: 113427455640312821154458202477256070485
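
For reference, these values are just i * 2^127 / N for the RandomPartitioner,
whose token space runs from 0 to 2^127. A minimal sketch of the calculation:

import java.math.BigInteger;

public class InitialTokens {
    public static void main(String[] args) {
        int nodes = 3;
        BigInteger ringSize = BigInteger.valueOf(2).pow(127); // RandomPartitioner token space
        for (int i = 0; i < nodes; i++) {
            // multiply first, then divide, to match the usual token generators
            BigInteger token = ringSize.multiply(BigInteger.valueOf(i))
                                       .divide(BigInteger.valueOf(nodes));
            System.out.println("node " + i + ": " + token);
        }
    }
}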

So I should set these tokens to the three nodes?

And while I execute the nodetool move commands, can the Cassandra servers
still serve front-end requests at the same time? Is the data safe?

Thanks.

On Tue, Apr 12, 2011 at 5:15 PM, Jonathan Colby jonathan.co...@gmail.comwrote:

 This is normal when you just add single nodes.   When no token is
 assigned, the new node takes a portion of the ring from the most heavily
 loaded node.As a consequence of this, the nodes will be out of balance.

 In other words, when you double the number of nodes you would not have this
 problem.

 The best way to rebalance the cluster is to generate new tokens and use the
 nodetool move new-token command to rebalance the nodes, one at a time.

 After rebalancing you can run cleanup so the nodes get rid of data they
 no longer are responsible for.

 links:

 http://wiki.apache.org/cassandra/Operations#Range_changes

 http://wiki.apache.org/cassandra/Operations#Moving_or_Removing_nodes

 http://www.datastax.com/docs/0.7/operations/clustering#adding-capacity



 On Apr 12, 2011, at 11:00 AM, Dikang Gu wrote:

  I have 3 cassandra 0.7.4 nodes in a cluster, and I get the ring stats:
 
  [root@yun-phy2 apache-cassandra-0.7.4]# bin/nodetool -h 192.168.1.28 -p
 8090 ring
  Address Status State   LoadOwnsToken
 
  109028275973926493413574716008500203721
  192.168.1.25Up Normal  157.25 MB   69.92%
  57856537434773737201679995572503935972
  192.168.1.27Up Normal  201.71 MB   24.28%
  99165710459060760249270263771474737125
  192.168.1.28Up Normal  68.12 MB5.80%
 109028275973926493413574716008500203721
 
  The load and owns vary on each node, is this normal?  And is there a way
 to balance the three nodes?
 
  Thanks.
 
  --
  Dikang Gu
 
  0086 - 18611140205
 




-- 
Dikang Gu

0086 - 18611140205


Re: Questions about the nodetool ring.

2011-04-12 Thread Dikang Gu
After the nodetool move, I got this:

[root@server3 apache-cassandra-0.7.4]# bin/nodetool -h 10.18.101.213 ring
Address         Status  State   Load        Owns    Token
                                                    113427455640312821154458202477256070485
10.18.101.211   ?       Normal  82.31 MB    33.33%  0
10.18.101.212   ?       Normal  84.24 MB    33.33%  56713727820156410577229101238628035242
10.18.101.213   Up      Normal  54.44 MB    33.33%  113427455640312821154458202477256070485

Is this correct? Why is the status shown as "?"?

Thanks.

On Tue, Apr 12, 2011 at 5:43 PM, Dikang Gu dikan...@gmail.com wrote:

 The 3 nodes were added to the cluster at the same time, so I'm not sure
 whey the data vary.

 I calculate the tokens and get:
 node 0: 0
 node 1: 56713727820156410577229101238628035242
 node 2: 113427455640312821154458202477256070485

 So I should set these tokens to the three nodes?

 And during the time I execute the nodetool move commands, can the cassandra
 servers serve the front end requests at the same time? Is the data safe?

 Thanks.

 On Tue, Apr 12, 2011 at 5:15 PM, Jonathan Colby 
 jonathan.co...@gmail.comwrote:

 This is normal when you just add single nodes.   When no token is
 assigned, the new node takes a portion of the ring from the most heavily
 loaded node.As a consequence of this, the nodes will be out of balance.

 In other words, when you double the amount nodes you would not have this
 problem.

 The best way to rebalance the cluster is to generate new tokens and use
 the nodetool move new-token command to rebalance the nodes, one at a time.

 After rebalancing you can run cleanup so the nodes get rid of data they
 no longer are responsible for.

 links:

 http://wiki.apache.org/cassandra/Operations#Range_changes

 http://wiki.apache.org/cassandra/Operations#Moving_or_Removing_nodes

 http://www.datastax.com/docs/0.7/operations/clustering#adding-capacity



 On Apr 12, 2011, at 11:00 AM, Dikang Gu wrote:

  I have 3 cassandra 0.7.4 nodes in a cluster, and I get the ring stats:
 
  [root@yun-phy2 apache-cassandra-0.7.4]# bin/nodetool -h 192.168.1.28 -p
 8090 ring
  Address Status State   LoadOwnsToken
 
  109028275973926493413574716008500203721
  192.168.1.25Up Normal  157.25 MB   69.92%
  57856537434773737201679995572503935972
  192.168.1.27Up Normal  201.71 MB   24.28%
  99165710459060760249270263771474737125
  192.168.1.28Up Normal  68.12 MB5.80%
 109028275973926493413574716008500203721
 
  The load and owns vary on each node, is this normal?  And is there a way
 to balance the three nodes?
 
  Thanks.
 
  --
  Dikang Gu
 
  0086 - 18611140205
 




 --
 Dikang Gu

 0086 - 18611140205




-- 
Dikang Gu

0086 - 18611140205


Bootstrap performance.

2015-04-20 Thread Dikang Gu
Hi guys,

We have a 100+ node cluster, each node has about 400G of data, and it is
running on flash disk. We are running 2.1.2.

When I bring a new node into the cluster, it introduces significant load
to the cluster. On the new node, the CPU usage is 100%, but disk write IO
is only around 50MB/s, even though we have a 10G network.

Does it sound normal to you?

Here are some iostat and vmstat metrics:
 iostat 
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          88.52    3.99    4.11    0.00     0.00    3.38

Device:tpsMB_read/sMB_wrtn/sMB_readMB_wrtn
sda   1.00 0.00 0.04  0  0
sdb             156.50         0.00        55.62          0          1

 vmstat =
138  0  0 86781912 438780 10152336800 0 31893 264496 247316
95  4  1  0  0  2015-04-21 01:04:01 UTC
147  0  0 86562400 438780 10160724800 0 90510 456635 245849
91  5  4  0  0  2015-04-21 01:04:03 UTC
143  0  0 86341168 438780 10169222400 0 32392 284495 273656
92  4  4  0  0  2015-04-21 01:04:05 UTC

Thanks.
-- 
Dikang


Re: Bootstrap performance.

2015-04-20 Thread Dikang Gu
Hi Rob,

Why do you say streaming is single threaded? I see a lot of background
streaming threads running, for example:

STREAM-IN-/10.210.165.49 daemon prio=10 tid=0x7f81fc001000
nid=0x107075 runnable [0x7f836b256000]
STREAM-IN-/10.213.51.57 daemon prio=10 tid=0x7f81f0002000
nid=0x107073 runnable [0x7f836b1d4000]
STREAM-IN-/10.213.51.61 daemon prio=10 tid=0x7f81e8001000
nid=0x107070 runnable [0x7f836b11]
STREAM-IN-/10.213.51.63 daemon prio=10 tid=0x7f81dc001800
nid=0x10706f runnable [0x7f836b0cf000]

Thanks
Dikang.

On Mon, Apr 20, 2015 at 6:48 PM, Robert Coli rc...@eventbrite.com wrote:

 On Mon, Apr 20, 2015 at 6:08 PM, Dikang Gu dikan...@gmail.com wrote:

 When I bring in a new node into the cluster, it introduces significant
 load to the cluster. For the new node, the cpu usage is 100%, but disk
 write io is only around 50MB/s, while we have 10G network.

 Does it sound normal to you?


 Have you unthrottled both compaction and streaming via JMX/nodetool?

 Streaming is single threaded and can (?) be CPU bound, I would not be
 surprised if JIRA contains a ticket on the upper bounds of streaming
 performance in current implementation.

 =Rob
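
As an aside, the "unthrottle via JMX/nodetool" suggestion above corresponds to
nodetool setstreamthroughput 0 and nodetool setcompactionthroughput 0. A hedged
sketch of doing the same thing programmatically, assuming the 2.1
StorageServiceMBean attribute names and the default JMX port 7199 (the host
below is hypothetical):

import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

JMXServiceURL url =
    new JMXServiceURL("service:jmx:rmi:///jndi/rmi://10.0.0.1:7199/jmxrmi");
try (JMXConnector jmxc = JMXConnectorFactory.connect(url)) {
    MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
    ObjectName ss = new ObjectName("org.apache.cassandra.db:type=StorageService");
    // 0 means unthrottled for both settings
    mbs.setAttribute(ss, new Attribute("StreamThroughputMbPerSec", 0));
    mbs.setAttribute(ss, new Attribute("CompactionThroughputMbPerSec", 0));
}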







-- 
Dikang


Is 2.1.5 ready for upgrade?

2015-04-21 Thread Dikang Gu
Hi guys,

We have some issues with streaming in 2.1.2. We find that there are a lot
of patches in 2.1.5. Is it ready for upgrade?

Thanks.
-- 
Dikang


Re: Does datastax java driver works with ipv6 address?

2015-11-04 Thread Dikang Gu
Thanks Michael,

Actually I find the problem is with the server setup. I put "rpc_address:
0.0.0.0" in the config, and I find the server binds to the addresses like this:

tcp0  0 :::9160 :::*
 LISTEN  2411582/java
tcp0  0 :::0.0.0.0:9042 :::*
 LISTEN  2411582/java

So using the server IP "2401:db00:11:60ed:face:0:31:0", I can connect to the
thrift port 9160, but not the native port 9042. Do you know the reason for
this?
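
For what it's worth, a minimal sketch of the underlying socket behaviour: in
Java, binding a listener to the IPv4 wildcard "0.0.0.0" is not the same as
binding to the IPv6 wildcard "::" (the latter is what netstat shows as ":::*").
Whether a given Cassandra version lets you configure the native transport with
an IPv6 wildcard is a separate question; the ports below are arbitrary.

import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class BindDemo {
    public static void main(String[] args) throws Exception {
        ServerSocket v4 = new ServerSocket();
        v4.bind(new InetSocketAddress(InetAddress.getByName("0.0.0.0"), 19042));
        ServerSocket v6 = new ServerSocket();
        v6.bind(new InetSocketAddress(InetAddress.getByName("::"), 19160));
        System.out.println("IPv4 wildcard: " + v4.getLocalSocketAddress());
        System.out.println("IPv6 wildcard: " + v6.getLocalSocketAddress());
        v4.close();
        v6.close();
    }
}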

Thanks
Dikang.


On Wed, Nov 4, 2015 at 12:29 PM, Michael Shuler <mich...@pbandjelly.org>
wrote:

> On 11/04/2015 11:17 AM, Dikang Gu wrote:
>
>> I have ipv6 only cassandra cluster, and I'm trying to connect to it
>> using java driver, like:
>>
>> Inet6Address inet6 = (Inet6Address)
>> InetAddress.getByName("2401:db00:0011:60ed:face::0031:");
>> cluster = Cluster.builder().addContactPointsWithPorts(Arrays.asList(new
>> InetSocketAddress(inet6,9042))).build();
>> session =cluster.connect(CASSANDRA_KEYSPACE);
>>
>> But it failed to connect to the cassandra, looks like the java driver
>> does not parse the ipv6 address correctly, exceptions are:
>>
>> 
>
> Open a JIRA bug report for the java driver at:
>
>   https://datastax-oss.atlassian.net/browse/JAVA
>
> As for IPv6 testing for Cassandra in general, it has been brought up, but
> little testing is done at this time. If you have some contributions to be
> made in this area, I'm sure they would be greatly appreciated. You are in a
> relatively unique position with an IPv6-only cluster, so your input is
> valuable.
>
>
>
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20text%20~%20ipv6%20AND%20status%20!%3D%20Resolved
>
> --
> Kind regards,
> Michael
>
>


-- 
Dikang


Question for datastax java Driver

2015-11-04 Thread Dikang Gu
Hi there,

Right now, it seems if I add a contact point like this:

cluster = Cluster.builder().addContactPoint().build();

When the client connects to the cluster, it will fetch the addresses
of all the nodes in the cluster, and try to connect to them.

I'm wondering whether I can disable this behavior? I mean I just want each
client to connect to one or several contact points, not to all of the nodes;
am I able to do this?
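
A minimal sketch of one way to do this, assuming the 2.x/3.x driver's
WhiteListPolicy (the driver still discovers every peer from system.peers, but
only opens connections to the hosts in the white list; the addresses below are
hypothetical):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.RoundRobinPolicy;
import com.datastax.driver.core.policies.WhiteListPolicy;
import java.net.InetSocketAddress;
import java.util.Arrays;

Cluster cluster = Cluster.builder()
        .addContactPoint("10.0.0.1")
        .withLoadBalancingPolicy(new WhiteListPolicy(
                new RoundRobinPolicy(),
                Arrays.asList(new InetSocketAddress("10.0.0.1", 9042),
                              new InetSocketAddress("10.0.0.2", 9042))))
        .build();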

Thanks.
-- 
Dikang


Do I have to use the cql in the datastax java driver?

2015-11-06 Thread Dikang Gu
Hi there,

In the datastax java driver, do I have to use CQL to talk to the cassandra
cluster?

Can I still use the thrift interface to talk to cassandra? Any reason that we
should not use thrift anymore?

Thanks.
-- 
Dikang


Does datastax java driver works with ipv6 address?

2015-11-04 Thread Dikang Gu
Hi there,

I have an IPv6-only cassandra cluster, and I'm trying to connect to it using
the java driver, like:

Inet6Address inet6 =
    (Inet6Address) InetAddress.getByName("2401:db00:0011:60ed:face::0031:");
cluster = Cluster.builder()
    .addContactPointsWithPorts(Arrays.asList(new InetSocketAddress(inet6, 9042)))
    .build();
session = cluster.connect(CASSANDRA_KEYSPACE);

But it failed to connect to cassandra; it looks like the java driver does
not parse the IPv6 address correctly. Exceptions are:

337 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection  -
Connection[/2401:db00:11:60ed:face:0:31:0:9042-1, inFlight=0, closed=true]
closing connection
339 [main] DEBUG com.datastax.driver.core.ControlConnection  - [Control
connection] error on /2401:db00:11:60ed:face:0:31:0:9042 connection, no
more host to try
com.datastax.driver.core.TransportException:
[/2401:db00:11:60ed:face:0:31:0:9042] Cannot connect
at
com.datastax.driver.core.Connection$1.operationComplete(Connection.java:156)
at
com.datastax.driver.core.Connection$1.operationComplete(Connection.java:139)
at
io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at
io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
at
io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
at
io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
at
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:268)
at
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:284)
at
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
at
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.ConnectException: Connection refused:
/2401:db00:11:60ed:face:0:31:0:9042
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
at
io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
at
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:281)
... 6 more
342 [main] DEBUG com.datastax.driver.core.AbstractReconnectionHandler  -
First reconnection scheduled in 1000ms
342 [main] DEBUG com.datastax.driver.core.AbstractReconnectionHandler  -
Becoming the active handler
342 [main] DEBUG com.datastax.driver.core.Cluster  - Shutting down
Exception in thread "main"
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)
tried for query failed (tried: /2401:db00:11:60ed:face:0:31:0:9042
(com.datastax.driver.core.TransportException:
[/2401:db00:11:60ed:face:0:31:0:9042] Cannot connect))
at
com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:223)
at
com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:78)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1272)
at com.datastax.driver.core.Cluster.init(Cluster.java:158)
at com.datastax.driver.core.Cluster.connect(Cluster.java:248)
at com.datastax.driver.core.Cluster.connect(Cluster.java:281)

-- 
Dikang


Re: Unable to remove dead node from cluster.

2015-09-22 Thread Dikang Gu
ping.

On Mon, Sep 21, 2015 at 11:51 AM, Dikang Gu <dikan...@gmail.com> wrote:

> I have tried all of them, neither of them worked.
> 1. decommission: the host had hardware issue, and I can not connect to it.
> 2. remove, there is not HostID, so the removenode did not work.
> 3. unsafeAssassinateEndpoint, it will throw NPE as I pasted before, can we
> fix it?
>
> Thanks
> Dikang.
>
> On Mon, Sep 21, 2015 at 11:11 AM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> Order is decommission, remove, assassinate.
>>
>> Which have you tried?
>> On Sep 21, 2015 10:47 AM, "Dikang Gu" <dikan...@gmail.com> wrote:
>>
>>> Hi there,
>>>
>>> I have a dead node in our cluster, which is a wired state right now, and
>>> can not be removed from cluster.
>>>
>>> The nodestatus shows:
>>> Datacenter: DC1
>>> ===
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> --  Address  Load   Tokens  OwnsHost ID
>>>   Rack
>>> DN  10.210.165.55?  256 ?   null
>>>  r1
>>>
>>> I tried the unsafeAssassinateEndpoint, but got exception like:
>>> 2015-09-18_23:21:40.79760 INFO  23:21:40 InetAddress /10.210.165.55 is
>>> now DOWN
>>> 2015-09-18_23:21:40.80667 ERROR 23:21:40 Exception in thread
>>> Thread[GossipStage:1,5,main]
>>> 2015-09-18_23:21:40.80668 java.lang.NullPointerException: null
>>> 2015-09-18_23:21:40.80669   at
>>> org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1584)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80669   at
>>> org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1592)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80670   at
>>> org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1822)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80671   at
>>> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1495)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80671   at
>>> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2121)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80672   at
>>> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1009)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80673   at
>>> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1113)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80673   at
>>> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80673   at
>>> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80674   at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> ~[na:1.7.0_45]
>>> 2015-09-18_23:21:40.80674   at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> ~[na:1.7.0_45]
>>> 2015-09-18_23:21:40.80674   at java.lang.Thread.run(Thread.java:744)
>>> ~[na:1.7.0_45]
>>> 2015-09-18_23:21:40.85812 WARN  23:21:40 Not marking nodes down due to
>>> local pause of 10852378435 > 50
>>>
>>> Any suggestions about how to remove it?
>>> Thanks.
>>>
>>> --
>>> Dikang
>>>
>>>
>
>
> --
> Dikang
>
>


-- 
Dikang


Re: Unable to remove dead node from cluster.

2015-09-25 Thread Dikang Gu
The NPE is thrown when the node tries to handleStateLeft, because it can not find
the tokens associated with the node. Can we just ignore the NPE and
continue to remove the endpoint from the ring?

On Fri, Sep 25, 2015 at 10:52 AM, Dikang Gu <dikan...@gmail.com> wrote:

> @Jeff, yeah, I run the nodetool grep, and in my case, some nodes return
> "301", and some nodes return "300". And 300 is the correct number of nodes
> in my cluster.
>
> So it does look like an inconsistent issue, can you open a jira for this?
> Also, I'm looking for a quick fix/patch for this.
>
> On Fri, Sep 25, 2015 at 7:43 AM, Nate McCall <n...@thelastpickle.com>
> wrote:
>
>> A few other folks have reported issues with lingering dead nodes on large
>> clusters - Jason Brown *just* gave an excellent gossip presentation at the
>> summit regarding gossip optimizations for large clusters.
>>
>> Gossip is in the process of being refactored (here's at least one of the
>> issues: https://issues.apache.org/jira/browse/CASSANDRA-9667), but it
>> would be worth opening an issue with as much information as you can provide
>> to, at the very least, have information available for others.
>>
>> On Fri, Sep 25, 2015 at 7:08 AM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>> wrote:
>>
>>> The stack trace is one similar to one I recall seeing recently, but
>>> don’t have in front of me. This is an outside chance that is not at all
>>> certain to be the case.
>>>
>>> For EACH of the hundreds of nodes in your cluster, I suggest you run
>>>
>>> nodetool status | egrep “(^UN|^DN)" | wc -l
>>>
>>> and count to see if every node really has every other node in its ring
>>> properly.
>>>
>>> I suspect, but am not at all sure, that you have inconsistencies you’re
>>> not yet aware of (for example, if you expect that you have 100 nodes in the
>>> cluster, I’m betting that the query above returns 99 on at least one of the
>>> nodes).  If this is the case, please reply so that you and I can submit a
>>> Jira and compare our stack traces and we can find the underlying root cause
>>> of this together.
>>>
>>> - Jeff
>>>
>>> From: Dikang Gu
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Thursday, September 24, 2015 at 9:10 PM
>>> To: cassandra
>>>
>>> Subject: Re: Unable to remove dead node from cluster.
>>>
>>> @Jeff, I just use jmx connect to one node, run the
>>> unsafeAssainateEndpoint, and pass in the "10.210.165.55" ip address.
>>>
>>> Yes, we have hundreds of other nodes in the nodetool status output as
>>> well.
>>>
>>> On Tue, Sep 22, 2015 at 11:31 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com
>>> > wrote:
>>>
>>>> When you run unsafeAssassinateEndpoint, to which host are you
>>>> connected, and what argument are you passing?
>>>>
>>>> Are there other nodes in the ring that you’re not including in the
>>>> ‘nodetool status’ output?
>>>>
>>>>
>>>> From: Dikang Gu
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Tuesday, September 22, 2015 at 10:09 PM
>>>> To: cassandra
>>>> Cc: "d...@cassandra.apache.org"
>>>> Subject: Re: Unable to remove dead node from cluster.
>>>>
>>>> ping.
>>>>
>>>> On Mon, Sep 21, 2015 at 11:51 AM, Dikang Gu <dikan...@gmail.com> wrote:
>>>>
>>>>> I have tried all of them, neither of them worked.
>>>>> 1. decommission: the host had hardware issue, and I can not connect to
>>>>> it.
>>>>> 2. remove, there is not HostID, so the removenode did not work.
>>>>> 3. unsafeAssassinateEndpoint, it will throw NPE as I pasted before,
>>>>> can we fix it?
>>>>>
>>>>> Thanks
>>>>> Dikang.
>>>>>
>>>>> On Mon, Sep 21, 2015 at 11:11 AM, Sebastian Estevez <
>>>>> sebastian.este...@datastax.com> wrote:
>>>>>
>>>>>> Order is decommission, remove, assassinate.
>>>>>>
>>>>>> Which have you tried?
>>>>>> On Sep 21, 2015 10:47 AM, "Dikang Gu" <dikan...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi there,
>>>>>>>
>>>>>>> I have a dead node in 

Re: Unable to remove dead node from cluster.

2015-09-25 Thread Dikang Gu
@Jeff, yeah, I ran the nodetool grep, and in my case, some nodes return
"301", and some nodes return "300". And 300 is the correct number of nodes
in my cluster.

So it does look like an inconsistency issue; can you open a jira for this?
Also, I'm looking for a quick fix/patch for this.

On Fri, Sep 25, 2015 at 7:43 AM, Nate McCall <n...@thelastpickle.com> wrote:

> A few other folks have reported issues with lingering dead nodes on large
> clusters - Jason Brown *just* gave an excellent gossip presentation at the
> summit regarding gossip optimizations for large clusters.
>
> Gossip is in the process of being refactored (here's at least one of the
> issues: https://issues.apache.org/jira/browse/CASSANDRA-9667), but it
> would be worth opening an issue with as much information as you can provide
> to, at the very least, have information available for others.
>
> On Fri, Sep 25, 2015 at 7:08 AM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
> wrote:
>
>> The stack trace is one similar to one I recall seeing recently, but don’t
>> have in front of me. This is an outside chance that is not at all certain
>> to be the case.
>>
>> For EACH of the hundreds of nodes in your cluster, I suggest you run
>>
>> nodetool status | egrep “(^UN|^DN)" | wc -l
>>
>> and count to see if every node really has every other node in its ring
>> properly.
>>
>> I suspect, but am not at all sure, that you have inconsistencies you’re
>> not yet aware of (for example, if you expect that you have 100 nodes in the
>> cluster, I’m betting that the query above returns 99 on at least one of the
>> nodes).  If this is the case, please reply so that you and I can submit a
>> Jira and compare our stack traces and we can find the underlying root cause
>> of this together.
>>
>> - Jeff
>>
>> From: Dikang Gu
>> Reply-To: "user@cassandra.apache.org"
>> Date: Thursday, September 24, 2015 at 9:10 PM
>> To: cassandra
>>
>> Subject: Re: Unable to remove dead node from cluster.
>>
>> @Jeff, I just use jmx connect to one node, run the
>> unsafeAssainateEndpoint, and pass in the "10.210.165.55" ip address.
>>
>> Yes, we have hundreds of other nodes in the nodetool status output as
>> well.
>>
>> On Tue, Sep 22, 2015 at 11:31 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>> wrote:
>>
>>> When you run unsafeAssassinateEndpoint, to which host are you connected,
>>> and what argument are you passing?
>>>
>>> Are there other nodes in the ring that you’re not including in the
>>> ‘nodetool status’ output?
>>>
>>>
>>> From: Dikang Gu
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Tuesday, September 22, 2015 at 10:09 PM
>>> To: cassandra
>>> Cc: "d...@cassandra.apache.org"
>>> Subject: Re: Unable to remove dead node from cluster.
>>>
>>> ping.
>>>
>>> On Mon, Sep 21, 2015 at 11:51 AM, Dikang Gu <dikan...@gmail.com> wrote:
>>>
>>>> I have tried all of them, neither of them worked.
>>>> 1. decommission: the host had hardware issue, and I can not connect to
>>>> it.
>>>> 2. remove, there is not HostID, so the removenode did not work.
>>>> 3. unsafeAssassinateEndpoint, it will throw NPE as I pasted before, can
>>>> we fix it?
>>>>
>>>> Thanks
>>>> Dikang.
>>>>
>>>> On Mon, Sep 21, 2015 at 11:11 AM, Sebastian Estevez <
>>>> sebastian.este...@datastax.com> wrote:
>>>>
>>>>> Order is decommission, remove, assassinate.
>>>>>
>>>>> Which have you tried?
>>>>> On Sep 21, 2015 10:47 AM, "Dikang Gu" <dikan...@gmail.com> wrote:
>>>>>
>>>>>> Hi there,
>>>>>>
>>>>>> I have a dead node in our cluster, which is a wired state right now,
>>>>>> and can not be removed from cluster.
>>>>>>
>>>>>> The nodestatus shows:
>>>>>> Datacenter: DC1
>>>>>> ===
>>>>>> Status=Up/Down
>>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>>> --  Address  Load   Tokens  OwnsHost
>>>>>> ID   Rack
>>>>>> DN  10.210.165.55?  256 ?   null
>>>>>>  r1
>>>>>>
>

Unable to remove dead node from cluster.

2015-09-21 Thread Dikang Gu
Hi there,

I have a dead node in our cluster, which is in a weird state right now, and
can not be removed from the cluster.

The nodetool status shows:
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load  Tokens  Owns  Host ID  Rack
DN  10.210.165.55  ?     256     ?     null     r1

I tried the unsafeAssassinateEndpoint, but got exception like:
2015-09-18_23:21:40.79760 INFO  23:21:40 InetAddress /10.210.165.55 is now
DOWN
2015-09-18_23:21:40.80667 ERROR 23:21:40 Exception in thread
Thread[GossipStage:1,5,main]
2015-09-18_23:21:40.80668 java.lang.NullPointerException: null
2015-09-18_23:21:40.80669   at
org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1584)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80669   at
org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1592)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80670   at
org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1822)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80671   at
org.apache.cassandra.service.StorageService.onChange(StorageService.java:1495)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80671   at
org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2121)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80672   at
org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1009)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80673   at
org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1113)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80673   at
org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80673   at
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80674   at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
~[na:1.7.0_45]
2015-09-18_23:21:40.80674   at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
~[na:1.7.0_45]
2015-09-18_23:21:40.80674   at java.lang.Thread.run(Thread.java:744)
~[na:1.7.0_45]
2015-09-18_23:21:40.85812 WARN  23:21:40 Not marking nodes down due to
local pause of 10852378435 > 50

Any suggestions about how to remove it?
Thanks.

-- 
Dikang


Re: Unable to remove dead node from cluster.

2015-09-21 Thread Dikang Gu
I have tried all of them; none of them worked.
1. decommission: the host had a hardware issue, and I can not connect to it.
2. remove: there is no Host ID, so removenode did not work.
3. unsafeAssassinateEndpoint: it throws the NPE I pasted before; can we
fix it?

Thanks
Dikang.

On Mon, Sep 21, 2015 at 11:11 AM, Sebastian Estevez <
sebastian.este...@datastax.com> wrote:

> Order is decommission, remove, assassinate.
>
> Which have you tried?
> On Sep 21, 2015 10:47 AM, "Dikang Gu" <dikan...@gmail.com> wrote:
>
>> Hi there,
>>
>> I have a dead node in our cluster, which is a wired state right now, and
>> can not be removed from cluster.
>>
>> The nodestatus shows:
>> Datacenter: DC1
>> ===
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address  Load   Tokens  OwnsHost ID
>> Rack
>> DN  10.210.165.55?  256 ?   null
>>  r1
>>
>> I tried the unsafeAssassinateEndpoint, but got exception like:
>> 2015-09-18_23:21:40.79760 INFO  23:21:40 InetAddress /10.210.165.55 is
>> now DOWN
>> 2015-09-18_23:21:40.80667 ERROR 23:21:40 Exception in thread
>> Thread[GossipStage:1,5,main]
>> 2015-09-18_23:21:40.80668 java.lang.NullPointerException: null
>> 2015-09-18_23:21:40.80669   at
>> org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1584)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80669   at
>> org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1592)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80670   at
>> org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1822)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80671   at
>> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1495)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80671   at
>> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2121)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80672   at
>> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1009)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80673   at
>> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1113)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80673   at
>> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80673   at
>> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80674   at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> ~[na:1.7.0_45]
>> 2015-09-18_23:21:40.80674   at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> ~[na:1.7.0_45]
>> 2015-09-18_23:21:40.80674   at java.lang.Thread.run(Thread.java:744)
>> ~[na:1.7.0_45]
>> 2015-09-18_23:21:40.85812 WARN  23:21:40 Not marking nodes down due to
>> local pause of 10852378435 > 50
>>
>> Any suggestions about how to remove it?
>> Thanks.
>>
>> --
>> Dikang
>>
>>


-- 
Dikang


Re: Unable to remove dead node from cluster.

2015-09-24 Thread Dikang Gu
@Jeff, I just use JMX to connect to one node, run unsafeAssassinateEndpoint,
and pass in the "10.210.165.55" ip address.
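
For clarity, a hedged sketch of what that invocation looks like from code,
assuming the Gossiper MBean is registered as
"org.apache.cassandra.net:type=Gossiper" and JMX listens on the default port
7199 (the live host below is hypothetical):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class Assassinate {
    public static void main(String[] args) throws Exception {
        String liveHost = "10.0.0.1";          // any reachable node in the ring
        String deadIp   = "10.210.165.55";     // the endpoint to assassinate
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://" + liveHost + ":7199/jmxrmi");
        try (JMXConnector jmxc = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            ObjectName gossiper = new ObjectName("org.apache.cassandra.net:type=Gossiper");
            mbs.invoke(gossiper, "unsafeAssassinateEndpoint",
                       new Object[]{deadIp}, new String[]{"java.lang.String"});
        }
    }
}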

Yes, we have hundreds of other nodes in the nodetool status output as well.

On Tue, Sep 22, 2015 at 11:31 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
wrote:

> When you run unsafeAssassinateEndpoint, to which host are you connected,
> and what argument are you passing?
>
> Are there other nodes in the ring that you’re not including in the
> ‘nodetool status’ output?
>
>
> From: Dikang Gu
> Reply-To: "user@cassandra.apache.org"
> Date: Tuesday, September 22, 2015 at 10:09 PM
> To: cassandra
> Cc: "d...@cassandra.apache.org"
> Subject: Re: Unable to remove dead node from cluster.
>
> ping.
>
> On Mon, Sep 21, 2015 at 11:51 AM, Dikang Gu <dikan...@gmail.com> wrote:
>
>> I have tried all of them, neither of them worked.
>> 1. decommission: the host had hardware issue, and I can not connect to it.
>> 2. remove, there is not HostID, so the removenode did not work.
>> 3. unsafeAssassinateEndpoint, it will throw NPE as I pasted before, can
>> we fix it?
>>
>> Thanks
>> Dikang.
>>
>> On Mon, Sep 21, 2015 at 11:11 AM, Sebastian Estevez <
>> sebastian.este...@datastax.com> wrote:
>>
>>> Order is decommission, remove, assassinate.
>>>
>>> Which have you tried?
>>> On Sep 21, 2015 10:47 AM, "Dikang Gu" <dikan...@gmail.com> wrote:
>>>
>>>> Hi there,
>>>>
>>>> I have a dead node in our cluster, which is a wired state right now,
>>>> and can not be removed from cluster.
>>>>
>>>> The nodestatus shows:
>>>> Datacenter: DC1
>>>> ===
>>>> Status=Up/Down
>>>> |/ State=Normal/Leaving/Joining/Moving
>>>> --  Address  Load   Tokens  OwnsHost ID
>>>>   Rack
>>>> DN  10.210.165.55?  256 ?   null
>>>>r1
>>>>
>>>> I tried the unsafeAssassinateEndpoint, but got exception like:
>>>> 2015-09-18_23:21:40.79760 INFO  23:21:40 InetAddress /10.210.165.55 is
>>>> now DOWN
>>>> 2015-09-18_23:21:40.80667 ERROR 23:21:40 Exception in thread
>>>> Thread[GossipStage:1,5,main]
>>>> 2015-09-18_23:21:40.80668 java.lang.NullPointerException: null
>>>> 2015-09-18_23:21:40.80669   at
>>>> org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1584)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80669   at
>>>> org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1592)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80670   at
>>>> org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1822)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80671   at
>>>> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1495)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80671   at
>>>> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2121)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80672   at
>>>> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1009)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80673   at
>>>> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1113)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80673   at
>>>> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80673   at
>>>> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80674   at
>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>> ~[na:1.7.0_45]
>>>> 2015-09-18_23:21:40.80674   at
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>> ~[na:1.7.0_45]
>>>> 2015-09-18_23:21:40.80674   at
>>>> java.lang.Thread.run(Thread.java:744) ~[na:1.7.0_45]
>>>> 2015-09-18_23:21:40.85812 WARN  23:21:40 Not marking nodes down due to
>>>> local pause of 10852378435 > 50
>>>>
>>>> Any suggestions about how to remove it?
>>>> Thanks.
>>>>
>>>> --
>>>> Dikang
>>>>
>>>>
>>
>>
>> --
>> Dikang
>>
>>
>
>
> --
> Dikang
>
>


-- 
Dikang


Questions about Counter updates.

2016-02-05 Thread Dikang Gu
Hi there,

I have a cluster which has a lot of counter updates. My question is that
when I run `nodetool tpstats`, I see a lot of MutationStage activity but
no CounterMutationStage stats. I'm wondering whether this is normal, or
something I should worry about?

I'm using Cassandra 2.1.8 and the C driver.

Pool Name               Active   Pending   Completed   Blocked  All time blocked
CounterMutationStage         0         0           0         0                 0
ReadStage                    0         0          25         0                 0
RequestResponseStage         0         0          21         0                 0
MutationStage                0         0    19284070         0                 0

Thanks

-- 
Dikang


How to measure the write amplification of C*?

2016-03-09 Thread Dikang Gu
Hello there,

I'm wondering whether there is a good way to measure the write amplification of
Cassandra?

I'm thinking it could be calculated as (number of bytes written to the
disk)/(size of mutations written to the node).
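
A minimal sketch of the ratio, assuming both counters are sampled over the same
time window (disk bytes go in the numerator, so a value of 3.0 means every
mutation byte is rewritten roughly three times by flush and compaction):

public final class WriteAmplification {
    static double of(long bytesWrittenToDisk, long mutationBytesReceived) {
        // e.g. 300 GB flushed + compacted for 100 GB of mutations => 3.0
        return (double) bytesWrittenToDisk / (double) mutationBytesReceived;
    }
}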

Do we already have the metrics of "size of mutations written to the node"?
I did not find it in jmx metrics.

Thanks

-- 
Dikang


Re: How to measure the write amplification of C*?

2016-03-10 Thread Dikang Gu
> >
>>> > I am not sure about what you call "amplification", but as sizes highly
>>> > depends on the structure I think I would probably give it a try using
>>> CCM (
>>> > https://github.com/pcmanus/ccm) or some test cluster with 'production
>>> > like'
>>> > setting and schema. You can write a row, flush it and see how big is
>>> the
>>> > data cluster-wide / per node.
>>> >
>>> > Hope this will be of some help.
>>> >
>>> > C*heers,
>>> > ---
>>> > Alain Rodriguez - al...@thelastpickle.com
>>> > France
>>> >
>>> > The Last Pickle - Apache Cassandra Consulting
>>> > http://www.thelastpickle.com
>>> >
>>> > 2016-03-10 7:18 GMT+01:00 Dikang Gu <dikan...@gmail.com>:
>>> >
>>> > > Hello there,
>>> > >
>>> > > I'm wondering is there a good way to measure the write amplification
>>> of
>>> > > Cassandra?
>>> > >
>>> > > I'm thinking it could be calculated by (size of mutations written to
>>> the
>>> > > node)/(number of bytes written to the disk).
>>> > >
>>> > > Do we already have the metrics of "size of mutations written to the
>>> > node"?
>>> > > I did not find it in jmx metrics.
>>> > >
>>> > > Thanks
>>> > >
>>> > > --
>>> > > Dikang
>>> > >
>>> > >
>>> >
>>>
>>
>>
>


-- 
Dikang


Compaction Filter in Cassandra

2016-03-11 Thread Dikang Gu
Hello there,

RocksDB has a feature called "Compaction Filter" that allows the application to
modify/delete a key-value pair during background compaction.
https://github.com/facebook/rocksdb/blob/v4.1/include/rocksdb/options.h#L201-L226

I'm wondering whether there is a plan/value in adding this to C* as well? Or is
there already a similar thing in C*?
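
To make the idea concrete, a purely hypothetical sketch (this is not an existing
Cassandra or RocksDB-Java API) of the kind of hook such a filter would expose:
the application decides, cell by cell, whether data survives a background
compaction.

import java.nio.ByteBuffer;

public interface CompactionFilter {
    enum Decision { KEEP, REMOVE }

    // Called for each cell as it is rewritten during compaction.
    Decision filter(ByteBuffer partitionKey,
                    ByteBuffer cellName,
                    ByteBuffer cellValue,
                    long writeTimestampMicros);
}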

Thanks

-- 
Dikang


Re: Compaction Filter in Cassandra

2016-03-19 Thread Dikang Gu
Hi Eric,

Thanks for sharing the information!

We also mainly want to use it for trimming data, either by time or by the
number of columns in a row. We haven't started the work yet; do you mind
sharing some patches? We'd love to try it and test it in our environment.

Thanks.

On Tue, Mar 15, 2016 at 9:36 PM, Eric Stevens <migh...@gmail.com> wrote:

> We have been working on filtering compaction for a month or so (though we
> call it deleting compaction, its implementation is as a filtering
> compaction strategy).  The feature is nearing completion, and we have used
> it successfully in a limited production capacity against DSE 4.8 series.
>
> Our use case is that our records are written anywhere between a month, up
> to several years before they are scheduled for deletion.  Tombstones are
> too expensive, as we have tables with hundreds of billions of rows.  In
> addition, traditional TTLs don't work for us because our customers are
> permitted to change their retention policy such that already-written
> records should not be deleted if they increase their retention after the
> record was written (or vice versa).
>
> We can clean up data more cheaply and more quickly with filtered
> compaction than with tombstones and traditional compaction.  Our
> implementation is a wrapper compaction strategy for another underlying
> strategy, so that you can have the characteristics of whichever strategy
> makes sense in terms of managing your SSTables, while interceding and
> removing records during compaction (including cleaning up secondary
> indexes) that otherwise would have survived into the new SSTable.
>
> We are hoping to contribute it back to the community, so if you'd be
> interested in helping test it out, I'd love to hear from you.
>
> On Sat, Mar 12, 2016 at 5:12 AM Marcus Eriksson <krum...@gmail.com> wrote:
>
>> We don't have anything like that, do you have a specific use case in mind?
>>
>> Could you create a JIRA ticket and we can discuss there?
>>
>> /Marcus
>>
>> On Sat, Mar 12, 2016 at 7:05 AM, Dikang Gu <dikan...@gmail.com> wrote:
>>
>>> Hello there,
>>>
>>> RocksDB has the feature called "Compaction Filter" to allow application
>>> to modify/delete a key-value during the background compaction.
>>> https://github.com/facebook/rocksdb/blob/v4.1/include/rocksdb/options.h#L201-L226
>>>
>>> I'm wondering is there a plan/value to add this into C* as well? Or is
>>> there already a similar thing in C*?
>>>
>>> Thanks
>>>
>>> --
>>> Dikang
>>>
>>>
>>


-- 
Dikang


Re: Compaction Filter in Cassandra

2016-03-19 Thread Dikang Gu
FYI, this is the jira:
https://issues.apache.org/jira/browse/CASSANDRA-11348

We can move the discussion to the jira if you want.

On Thu, Mar 17, 2016 at 11:46 AM, Dikang Gu <dikan...@gmail.com> wrote:

> Hi Eric,
>
> Thanks for sharing the information!
>
> We also mainly want to use it for trimming data, either by the time or the
> number of columns in a row. We haven't started the work yet, do you mind to
> share some patches? We'd love to try it and test it in our environment.
>
> Thanks.
>
> On Tue, Mar 15, 2016 at 9:36 PM, Eric Stevens <migh...@gmail.com> wrote:
>
>> We have been working on filtering compaction for a month or so (though we
>> call it deleting compaction, its implementation is as a filtering
>> compaction strategy).  The feature is nearing completion, and we have used
>> it successfully in a limited production capacity against DSE 4.8 series.
>>
>> Our use case is that our records are written anywhere between a month, up
>> to several years before they are scheduled for deletion.  Tombstones are
>> too expensive, as we have tables with hundreds of billions of rows.  In
>> addition, traditional TTLs don't work for us because our customers are
>> permitted to change their retention policy such that already-written
>> records should not be deleted if they increase their retention after the
>> record was written (or vice versa).
>>
>> We can clean up data more cheaply and more quickly with filtered
>> compaction than with tombstones and traditional compaction.  Our
>> implementation is a wrapper compaction strategy for another underlying
>> strategy, so that you can have the characteristics of whichever strategy
>> makes sense in terms of managing your SSTables, while interceding and
>> removing records during compaction (including cleaning up secondary
>> indexes) that otherwise would have survived into the new SSTable.
>>
>> We are hoping to contribute it back to the community, so if you'd be
>> interested in helping test it out, I'd love to hear from you.
>>
>> On Sat, Mar 12, 2016 at 5:12 AM Marcus Eriksson <krum...@gmail.com>
>> wrote:
>>
>>> We don't have anything like that, do you have a specific use case in
>>> mind?
>>>
>>> Could you create a JIRA ticket and we can discuss there?
>>>
>>> /Marcus
>>>
>>> On Sat, Mar 12, 2016 at 7:05 AM, Dikang Gu <dikan...@gmail.com> wrote:
>>>
>>>> Hello there,
>>>>
>>>> RocksDB has the feature called "Compaction Filter" to allow application
>>>> to modify/delete a key-value during the background compaction.
>>>> https://github.com/facebook/rocksdb/blob/v4.1/include/rocksdb/options.h#L201-L226
>>>>
>>>> I'm wondering is there a plan/value to add this into C* as well? Or is
>>>> there already a similar thing in C*?
>>>>
>>>> Thanks
>>>>
>>>> --
>>>> Dikang
>>>>
>>>>
>>>
>
>
> --
> Dikang
>
>


-- 
Dikang


Re: How to measure the write amplification of C*?

2016-03-23 Thread Dikang Gu
As a follow-up, I'm going to write a simple patch to expose the number of
flushed bytes from memtable to JMX, so that we can easily monitor it.

Here is the jira: https://issues.apache.org/jira/browse/CASSANDRA-11420
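
Once such a metric is exposed, write amplification can be estimated as the bytes
the drive physically wrote divided by the bytes flushed from memtables. A rough
sketch using the standard JMX client API follows; the MBean and metric names are
placeholders for whatever the patch ends up exposing, and the flash-level byte
count has to come from the drive itself (e.g. SMART counters):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class WriteAmplificationEstimate {
    public static void main(String[] args) throws Exception {
        // Bytes the drive reports it physically wrote (e.g. from SMART), supplied by the caller.
        long bytesWrittenToFlash = Long.parseLong(args[0]);

        JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // Placeholder MBean/metric name -- substitute whatever name the patch actually exposes.
            ObjectName flushedBytes = new ObjectName(
                    "org.apache.cassandra.metrics:type=Storage,name=BytesFlushed");
            long bytesFlushedFromMemtables = ((Number) mbs.getAttribute(flushedBytes, "Count")).longValue();

            double writeAmplification = (double) bytesWrittenToFlash / bytesFlushedFromMemtables;
            System.out.printf("write amplification ~= %.2f%n", writeAmplification);
        }
    }
}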

On Thu, Mar 10, 2016 at 12:55 PM, Jack Krupansky <jack.krupan...@gmail.com>
wrote:

> The doc does say this:
>
> "A log-structured engine that avoids overwrites and uses sequential IO to
> update data is essential for writing to solid-state disks (SSD) and hard
> disks (HDD) On HDD, writing randomly involves a higher number of seek
> operations than sequential writing. The seek penalty incurred can be
> substantial. Using sequential IO (thereby avoiding write amplification
> <http://en.wikipedia.org/wiki/Write_amplification> and disk failure),
> Cassandra accommodates inexpensive, consumer SSDs extremely well."
>
> I presume that write amplification argues for placing the commit log on a
> separate SSD device. That should probably be mentioned.
>
> -- Jack Krupansky
>
> On Thu, Mar 10, 2016 at 12:52 PM, Matt Kennedy <matt.kenn...@datastax.com>
> wrote:
>
>> It isn't really the data written by the host that you're concerned with,
>> it's the data written by your application. I'd start by instrumenting your
>> application tier to tally up the size of the values that it writes to C*.
>>
>> However, it may not be extremely useful to have this value. You can't do
>> much with the information it provides. It is probably a better idea to
>> track the bytes written to flash for each drive so that you know the
>> physical endurance of that type of drive given your workload. Unfortunately
>> the TBW endurance rated for the drive may not be extremely useful given the
>> difference between the synthetic workload used to create those ratings and
>> the workload that Cassandra is producing for your particular case. You can
>> find out more about those here:
>> https://www.jedec.org/standards-documents/docs/jesd219a
>>
>>
>> Matt Kennedy
>>
>> Sr. Product Manager, DSE Core
>>
>> matt.kenn...@datastax.com | Public Calendar <http://goo.gl/4Ui04Z>
>>
>> *DataStax Enterprise - the database for cloud applications.*
>>
>> On Thu, Mar 10, 2016 at 11:44 AM, Dikang Gu <dikan...@gmail.com> wrote:
>>
>>> Hi Matt,
>>>
>>> Thanks for the detailed explanation! Yes, this is exactly what I'm
>>> looking for, "write amplification = data written to flash/data written
>>> by the host".
>>>
>>> We are heavily using the LCS in production, so I'd like to figure out
>>> the amplification caused by that and see what we can do to optimize it. I
>>> have the metrics of "data written to flash", and I'm wondering is there
>>> an easy way to get the "data written by the host" on each C* node?
>>>
>>> Thanks
>>>
>>> On Thu, Mar 10, 2016 at 8:48 AM, Matt Kennedy <mkenn...@datastax.com>
>>> wrote:
>>>
>>>> TL;DR - Cassandra actually causes a ton of write amplification but it
>>>> doesn't freaking matter any more. Read on for details...
>>>>
>>>> That slide deck does have a lot of very good information on it, but
>>>> unfortunately I think it has led to a fundamental misunderstanding about
>>>> Cassandra and write amplification. In particular, slide 51 vastly
>>>> oversimplifies the situation.
>>>>
>>>> The wikipedia definition of write amplification looks at this from the
>>>> perspective of the SSD controller:
>>>> https://en.wikipedia.org/wiki/Write_amplification#Calculating_the_value
>>>>
>>>> In short, write amplification = data written to flash/data written by
>>>> the host
>>>>
>>>> So, if I write 1MB in my application, but the SSD has to write my 1MB,
>>>> plus rearrange another 1MB of data in order to make room for it, then I've
>>>> written a total of 2MB and my write amplification is 2x.
>>>>
>>>> In other words, it is measuring how much extra the SSD controller has
>>>> to write in order to do its own housekeeping.
>>>>
>>>> However, the wikipedia definition is a bit more constrained than how
>>>> the term is used in the storage industry. The whole point of looking at
>>>> write amplification is to understand the impact that a particular workload
>>>> is going to have on the underlying NAND by virtue of the data written. So a
>>>> definition of write amplification that is a little more r

Re: Counter values become under-counted when running repair.

2016-03-24 Thread Dikang Gu
@Aleksey, sure, here is the jira:
https://issues.apache.org/jira/browse/CASSANDRA-11432

Thanks!

On Thu, Mar 24, 2016 at 5:32 PM, Aleksey Yeschenko <alek...@apache.org>
wrote:

> Best open a JIRA ticket and I’ll have a look at what could be the reason.
>
> --
> AY
>
> On 24 March 2016 at 23:20:55, Dikang Gu (dikan...@gmail.com) wrote:
>
> @Aleksey, we are writing to cluster with CL = 2, and reading with CL = 1.
> And overall we have 6 copies across 3 different regions. Do you have
> comments about our setup?
>
> During the repair, the counter value become inaccurate, we are still
> playing with the repair, will keep you update with more experiments. But
> do
> you have any theory around that?
>
> Thanks a lot!
> Dikang.
>
> On Thu, Mar 24, 2016 at 11:02 AM, Aleksey Yeschenko <alek...@apache.org>
> wrote:
>
> > After repair is over, does the value settle? What CLs do you write to
> your
> > counters with? What CLs are you reading with?
> >
> > --
> > AY
> >
> > On 24 March 2016 at 06:17:27, Dikang Gu (dikan...@gmail.com) wrote:
> >
> > Hello there,
> >
> > We are experimenting Counters in Cassandra 2.2.5. Our setup is that we
> > have
> > 6 nodes, across three different regions, and in each region, the
> > replication factor is 2. Basically, each nodes holds a full copy of the
> > data.
> >
> > When are doing 30k/s counter increment/decrement per node, and at the
> > meanwhile, we are double writing to our mysql tier, so that we can
> measure
> > the accuracy of C* counter, compared to mysql.
> >
> > The experiment result was great at the beginning, the counter value in
> C*
> > and mysql are very close. The difference is less than 0.1%.
> >
> > But when we start to run the repair on one node, the counter value in C*
> > become much less than the value in mysql, the difference becomes larger
> > than 1%.
> >
> > My question is that is it a known problem that the counter value will
> > become under-counted if repair is running? Should we avoid running
> repair
> > for counter tables?
> >
> > Thanks.
> >
> > --
> > Dikang
> >
> >
>
>
> --
> Dikang
>
>


-- 
Dikang


Re: Counter values become under-counted when running repair.

2016-03-24 Thread Dikang Gu
@Jack, we write to 2 and read from 1.

I do not understand why RF=2 matters here; will it have an impact on the
repair? Can you please explain more?

I selected RF=2 in each region because:
1. both writes are sent to the local region, so we do not need to wait for
responses across regions.
2. if one node has a problem in the local region, reads can still hit the
other node in the local region.

However, I can change the RF if it is really the cause of the under-counting.

Thanks
Dikang.
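
For illustration, the write/read consistency setup described above looks roughly
like this with the DataStax Java driver (3.x API assumed); the keyspace, table,
and counter column names are made-up examples:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class CounterConsistencyExample {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("demo")) {

            // Increment acknowledged by 2 replicas (both in the local region, given RF=2 per DC).
            SimpleStatement inc = new SimpleStatement(
                    "UPDATE page_views SET views = views + 1 WHERE page_id = ?", "home");
            inc.setConsistencyLevel(ConsistencyLevel.TWO);
            session.execute(inc);

            // Read back from a single replica.
            SimpleStatement read = new SimpleStatement(
                    "SELECT views FROM page_views WHERE page_id = ?", "home");
            read.setConsistencyLevel(ConsistencyLevel.ONE);
            Row row = session.execute(read).one();
            System.out.println("views = " + row.getLong("views"));
        }
    }
}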


On Thu, Mar 24, 2016 at 7:17 AM, Jack Krupansky <jack.krupan...@gmail.com>
wrote:

> What CL do you read and write with?
>
> Normally, RF=2 is not recommended since it doesn't give you HA within a
> data center - there is no way to achieve quorum in the data center if a
> node goes down.
>
> I suppose you can achieve a quorum if your request is spread across all
> three data centers, but normally apps try to issue requests to a local data
> center for performance. Having to ping all data centers on all requests to
> achieve a quorum seems a bit excessive.
>
> Can you advise us on your thinking when you selected RF=2?
>
>
> -- Jack Krupansky
>
> On Thu, Mar 24, 2016 at 2:17 AM, Dikang Gu <dikan...@gmail.com> wrote:
>
>> Hello there,
>>
>> We are experimenting Counters in Cassandra 2.2.5. Our setup is that we
>> have 6 nodes, across three different regions, and in each region, the
>> replication factor is 2. Basically, each nodes holds a full copy of the
>> data.
>>
>> When are doing 30k/s counter increment/decrement per node, and at the
>> meanwhile, we are double writing to our mysql tier, so that we can measure
>> the accuracy of C* counter, compared to mysql.
>>
>> The experiment result was great at the beginning, the counter value in C*
>> and mysql are very close. The difference is less than 0.1%.
>>
>> But when we start to run the repair on one node, the counter value in C*
>> become much less than the value in mysql,  the difference becomes larger
>> than 1%.
>>
>> My question is that is it a known problem that the counter value will
>> become under-counted if repair is running? Should we avoid running repair
>> for counter tables?
>>
>> Thanks.
>>
>> --
>> Dikang
>>
>>
>


-- 
Dikang


Re: Counter values become under-counted when running repair.

2016-03-24 Thread Dikang Gu
@Aleksey, we are writing to the cluster with CL = 2 and reading with CL = 1.
Overall we have 6 copies across 3 different regions. Do you have any
comments about our setup?

During the repair, the counter value becomes inaccurate. We are still
playing with the repair and will keep you updated with more experiments. But
do you have any theory about that?

Thanks a lot!
Dikang.

On Thu, Mar 24, 2016 at 11:02 AM, Aleksey Yeschenko <alek...@apache.org>
wrote:

> After repair is over, does the value settle? What CLs do you write to your
> counters with? What CLs are you reading with?
>
> --
> AY
>
> On 24 March 2016 at 06:17:27, Dikang Gu (dikan...@gmail.com) wrote:
>
> Hello there,
>
> We are experimenting Counters in Cassandra 2.2.5. Our setup is that we
> have
> 6 nodes, across three different regions, and in each region, the
> replication factor is 2. Basically, each nodes holds a full copy of the
> data.
>
> When are doing 30k/s counter increment/decrement per node, and at the
> meanwhile, we are double writing to our mysql tier, so that we can measure
> the accuracy of C* counter, compared to mysql.
>
> The experiment result was great at the beginning, the counter value in C*
> and mysql are very close. The difference is less than 0.1%.
>
> But when we start to run the repair on one node, the counter value in C*
> become much less than the value in mysql, the difference becomes larger
> than 1%.
>
> My question is that is it a known problem that the counter value will
> become under-counted if repair is running? Should we avoid running repair
> for counter tables?
>
> Thanks.
>
> --
> Dikang
>
>


-- 
Dikang


Re: Counter values become under-counted when running repair.

2016-03-28 Thread Dikang Gu
Hi Aleksey, do you get a chance to take a look?

Thanks
Dikang.

On Thu, Mar 24, 2016 at 10:30 PM, Dikang Gu <dikan...@gmail.com> wrote:

> @Aleksey, sure, here is the jira:
> https://issues.apache.org/jira/browse/CASSANDRA-11432
>
> Thanks!
>
> On Thu, Mar 24, 2016 at 5:32 PM, Aleksey Yeschenko <alek...@apache.org>
> wrote:
>
>> Best open a JIRA ticket and I’ll have a look at what could be the reason.
>>
>> --
>> AY
>>
>> On 24 March 2016 at 23:20:55, Dikang Gu (dikan...@gmail.com) wrote:
>>
>> @Aleksey, we are writing to cluster with CL = 2, and reading with CL = 1.
>> And overall we have 6 copies across 3 different regions. Do you have
>> comments about our setup?
>>
>> During the repair, the counter value become inaccurate, we are still
>> playing with the repair, will keep you update with more experiments. But
>> do
>> you have any theory around that?
>>
>> Thanks a lot!
>> Dikang.
>>
>> On Thu, Mar 24, 2016 at 11:02 AM, Aleksey Yeschenko <alek...@apache.org>
>> wrote:
>>
>> > After repair is over, does the value settle? What CLs do you write to
>> your
>> > counters with? What CLs are you reading with?
>> >
>> > --
>> > AY
>> >
>> > On 24 March 2016 at 06:17:27, Dikang Gu (dikan...@gmail.com) wrote:
>> >
>> > Hello there,
>> >
>> > We are experimenting Counters in Cassandra 2.2.5. Our setup is that we
>> > have
>> > 6 nodes, across three different regions, and in each region, the
>> > replication factor is 2. Basically, each nodes holds a full copy of the
>> > data.
>> >
>> > When are doing 30k/s counter increment/decrement per node, and at the
>> > meanwhile, we are double writing to our mysql tier, so that we can
>> measure
>> > the accuracy of C* counter, compared to mysql.
>> >
>> > The experiment result was great at the beginning, the counter value in
>> C*
>> > and mysql are very close. The difference is less than 0.1%.
>> >
>> > But when we start to run the repair on one node, the counter value in
>> C*
>> > become much less than the value in mysql, the difference becomes larger
>> > than 1%.
>> >
>> > My question is that is it a known problem that the counter value will
>> > become under-counted if repair is running? Should we avoid running
>> repair
>> > for counter tables?
>> >
>> > Thanks.
>> >
>> > --
>> > Dikang
>> >
>> >
>>
>>
>> --
>> Dikang
>>
>>
>
>
> --
> Dikang
>
>


-- 
Dikang


Re: Dropped messages on random nodes.

2017-01-22 Thread Dikang Gu
Btw, the C* version is 2.2.5, with several backported patches.

On Sun, Jan 22, 2017 at 10:36 PM, Dikang Gu <dikan...@gmail.com> wrote:

> Hello there,
>
> We have a 100 nodes ish cluster, I find that there are dropped messages on
> random nodes in the cluster, which caused error spikes and P99 latency
> spikes as well.
>
> I tried to figure out the cause. I do not see any obvious bottleneck in
> the cluster, the C* nodes still have plenty of cpu idle/disk io. But I do
> see some suspicious gossip events around that time, not sure if it's
> related.
>
> 2017-01-21_16:43:56.71033 WARN  16:43:56 [GossipTasks:1]: Not marking
> nodes down due to local pause of 13079498815 > 50
> 2017-01-21_16:43:56.85532 INFO  16:43:56 [ScheduledTasks:1]: MUTATION
> messages were dropped in last 5000 ms: 65 for internal timeout and 10895
> for cross node timeout
> 2017-01-21_16:43:56.85533 INFO  16:43:56 [ScheduledTasks:1]: READ messages
> were dropped in last 5000 ms: 33 for internal timeout and 7867 for cross
> node timeout
> 2017-01-21_16:43:56.85534 INFO  16:43:56 [ScheduledTasks:1]: Pool Name
>Active   Pending  Completed   Blocked  All Time Blocked
> 2017-01-21_16:43:56.85534 INFO  16:43:56 [ScheduledTasks:1]: MutationStage
>   128 47794 1015525068 0 0
> 2017-01-21_16:43:56.85535
> 2017-01-21_16:43:56.85535 INFO  16:43:56 [ScheduledTasks:1]: ReadStage
>64 20202  450508940 0 0
>
> Any suggestions?
>
> Thanks!
>
> --
> Dikang
>
>


-- 
Dikang


Dropped messages on random nodes.

2017-01-22 Thread Dikang Gu
Hello there,

We have a cluster of roughly 100 nodes, and I find that there are dropped
messages on random nodes in the cluster, which cause error spikes and P99
latency spikes as well.

I tried to figure out the cause. I do not see any obvious bottleneck in the
cluster; the C* nodes still have plenty of idle CPU and disk IO headroom. But I
do see some suspicious gossip events around that time, not sure if they are
related.

2017-01-21_16:43:56.71033 WARN  16:43:56 [GossipTasks:1]: Not marking nodes
down due to local pause of 13079498815 > 50
2017-01-21_16:43:56.85532 INFO  16:43:56 [ScheduledTasks:1]: MUTATION
messages were dropped in last 5000 ms: 65 for internal timeout and 10895
for cross node timeout
2017-01-21_16:43:56.85533 INFO  16:43:56 [ScheduledTasks:1]: READ messages
were dropped in last 5000 ms: 33 for internal timeout and 7867 for cross
node timeout
2017-01-21_16:43:56.85534 INFO  16:43:56 [ScheduledTasks:1]: Pool Name
   Active   Pending  Completed   Blocked  All Time Blocked
2017-01-21_16:43:56.85534 INFO  16:43:56 [ScheduledTasks:1]: MutationStage
  128 47794 1015525068 0 0
2017-01-21_16:43:56.85535
2017-01-21_16:43:56.85535 INFO  16:43:56 [ScheduledTasks:1]: ReadStage
   64 20202  450508940 0 0

Any suggestions?

Thanks!

-- 
Dikang


Re: [Multi DC] Old Data Not syncing from Existing cluster to new Cluster

2017-01-27 Thread Dikang Gu
Have you run 'nodetool rebuild dc_india' on the new nodes?
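
For context, the usual preparation before the rebuild is to make sure the
distributed system keyspaces also replicate to both DCs, since the "Unable to
find sufficient sources ... in keyspace system_distributed" error quoted later
in this thread typically comes from that. A rough sketch, using the DataStax
Java driver (3.x assumed), an example contact point, and the DC names and
replication factors from this thread:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class PrepareNewDatacenter {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("172.29.0.1").build();
             Session session = cluster.connect()) {
            // System keyspaces that use replication should cover both DCs before rebuilding,
            // otherwise "nodetool rebuild" can fail to find sources for their ranges.
            for (String ks : new String[] {"system_distributed", "system_auth", "system_traces"}) {
                session.execute("ALTER KEYSPACE " + ks
                        + " WITH replication = {'class': 'NetworkTopologyStrategy',"
                        + " 'DRPOCcluster': '2', 'dc_india': '2'}");
            }
        }
        // Then, on each node in the new DC, run the rebuild with the old DC as the source,
        // e.g.: nodetool rebuild -- dc_india
    }
}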

On Tue, Jan 24, 2017 at 7:51 AM, Benjamin Roth 
wrote:

> Have you also altered RF of system_distributed as stated in the tutorial?
>
> 2017-01-24 16:45 GMT+01:00 Abhishek Kumar Maheshwari  timesinternet.in>:
>
>> My Mistake,
>>
>>
>>
>> Both clusters are up and running.
>>
>>
>>
>> Datacenter: DRPOCcluster
>>
>> 
>>
>> Status=Up/Down
>>
>> |/ State=Normal/Leaving/Joining/Moving
>>
>> --  AddressLoad   Tokens   OwnsHost
>> ID   Rack
>>
>> UN  172.29.XX.XX  1.65 GB   256  ?
>> badf985b-37da-4735-b468-8d3a058d4b60  01
>>
>> UN  172.29.XX.XX  1.64 GB   256  ?
>> 317061b2-c19f-44ba-a776-bcd91c70bbdd  03
>>
>> UN  172.29.XX.XX  1.64 GB   256  ?
>> 9bf0d1dc-6826-4f3b-9c56-cec0c9ce3b6c  02
>>
>> Datacenter: dc_india
>>
>> 
>>
>> Status=Up/Down
>>
>> |/ State=Normal/Leaving/Joining/Moving
>>
>> --  AddressLoad   Tokens   OwnsHost
>> ID   Rack
>>
>> UN  172.26.XX.XX   79.90 GB   256  ?
>> 3e8133ed-98b5-418d-96b5-690a1450cd30  RACK1
>>
>> UN  172.26.XX.XX   80.21 GB   256  ?
>> 7d3f5b25-88f9-4be7-b0f5-746619153543  RACK2
>>
>>
>>
>> *Thanks & Regards,*
>> *Abhishek Kumar Maheshwari*
>> *+91- 805591 <+91%208%2005591> (Mobile)*
>>
>> Times Internet Ltd. | A Times of India Group Company
>>
>> FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
>>
>> *P** Please do not print this email unless it is absolutely necessary.
>> Spread environmental awareness.*
>>
>>
>>
>> *From:* Benjamin Roth [mailto:benjamin.r...@jaumo.com]
>> *Sent:* Tuesday, January 24, 2017 9:11 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: [Multi DC] Old Data Not syncing from Existing cluster to
>> new Cluster
>>
>>
>>
>> I am not an expert in bootstrapping new DCs but shouldn't the OLD nodes
>> appear as UP to be used as a streaming source in rebuild?
>>
>>
>>
>> 2017-01-24 16:32 GMT+01:00 Abhishek Kumar Maheshwari <
>> abhishek.maheshw...@timesinternet.in>:
>>
>> Yes, I take all steps. While I am inserting new data is replicating on
>> both DC. But only old data is not replication in new cluster.
>>
>>
>>
>> *Thanks & Regards,*
>> *Abhishek Kumar Maheshwari*
>> *+91- 805591 <+91%208%2005591> (Mobile)*
>>
>> Times Internet Ltd. | A Times of India Group Company
>>
>> FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
>>
>> *P** Please do not print this email unless it is absolutely necessary.
>> Spread environmental awareness.*
>>
>>
>>
>> *From:* Benjamin Roth [mailto:benjamin.r...@jaumo.com]
>> *Sent:* Tuesday, January 24, 2017 8:55 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: [Multi DC] Old Data Not syncing from Existing cluster to
>> new Cluster
>>
>>
>>
>> There is much more to it than just changing the RF in the keyspace!
>>
>>
>>
>> See here: https://docs.datastax.com/en/cassandra/3.0/cassandra/
>> operations/opsAddDCToCluster.html
>>
>>
>>
>> 2017-01-24 16:18 GMT+01:00 Abhishek Kumar Maheshwari <
>> abhishek.maheshw...@timesinternet.in>:
>>
>> Hi All,
>>
>>
>>
>> I have Cassandra stack with 2 Dc
>>
>>
>>
>> Datacenter: DRPOCcluster
>>
>> 
>>
>> Status=Up/Down
>>
>> |/ State=Normal/Leaving/Joining/Moving
>>
>> --  AddressLoad   Tokens   OwnsHost
>> ID   Rack
>>
>> UN  172.29.xx.xxx  256  MB   256  ?
>> b6b8cbb9-1fed-471f-aea9-6a657e7ac80a  01
>>
>> UN  172.29.xx.xxx  240 MB   256  ?
>> 604abbf5-8639-4104-8f60-fd6573fb2e17  03
>>
>> UN  172.29. xx.xxx  240 MB   256  ?
>> 32fa79ee-93c6-4e5b-a910-f27a1e9d66c1  02
>>
>> Datacenter: dc_india
>>
>> 
>>
>> Status=Up/Down
>>
>> |/ State=Normal/Leaving/Joining/Moving
>>
>> --  AddressLoad   Tokens   OwnsHost
>> ID   Rack
>>
>> DN  172.26. .xx.xxx  78.97 GB   256  ?
>> 3e8133ed-98b5-418d-96b5-690a1450cd30  RACK1
>>
>> DN  172.26. .xx.xxx  79.18 GB   256  ?
>> 7d3f5b25-88f9-4be7-b0f5-746619153543  RACK2
>>
>>
>>
>> dc_india is old Dc which contains all data.
>>
>> I update keyspace as per below:
>>
>>
>>
>> alter KEYSPACE wls WITH replication = {'class':
>> 'NetworkTopologyStrategy', 'DRPOCcluster': '2','dc_india':'2'}  AND
>> durable_writes = true;
>>
>>
>>
>> but old data is not updating in DRPOCcluster(which is new). Also, while
>> running nodetool rebuild getting below exception:
>>
>> Cammand: ./nodetool rebuild -dc dc_india
>>
>>
>>
>> Exception : nodetool: Unable to find sufficient sources for streaming
>> range (-875697427424852,-8755484427030035332] in keyspace
>> system_distributed
>>
>>
>>
>> Cassandra version : 3.0.9
>>
>>
>>
>>
>>
>> *Thanks & Regards,*
>> *Abhishek Kumar Maheshwari*
>> *+91- 805591 <+91%208%2005591> (Mobile)*
>>
>> Times Internet Ltd. | A Times of 

Re: Slow performance after upgrading from 2.0.9 to 2.1.11

2016-10-28 Thread Dikang Gu
We are seeing a huge CPU regression when upgrading one of our 2.0.16 clusters
to 2.1.14 as well. The 2.1.14 node is not able to handle the same amount of
read traffic as the 2.0.16 node; in fact, it handles less than 50% of it.

And in the perf results, the first line below can go as high as 50% as we turn
up the read traffic, which never appeared in 2.0.16.

Any thoughts?
Thanks


Samples: 952K of event 'cycles', Event count (approx.): 229681774560
Overhead  Shared Object  Symbol
   6.52%  perf-196410.map[.]
Lorg/apache/cassandra/db/marshal/IntegerType;.compare in
Lorg/apache/cassandra/db/composites/AbstractSimpleCellNameType;.compare
   4.84%  libzip.so  [.] adler32
   2.88%  perf-196410.map[.]
Ljava/nio/HeapByteBuffer;.get in
Lorg/apache/cassandra/db/marshal/IntegerType;.compare
   2.39%  perf-196410.map[.]
Ljava/nio/Buffer;.checkIndex in
Lorg/apache/cassandra/db/marshal/IntegerType;.findMostSignificantByte
   2.03%  perf-196410.map[.]
Ljava/math/BigInteger;.compareTo in
Lorg/apache/cassandra/db/DecoratedKey;.compareTo
   1.65%  perf-196410.map[.] vtable chunks
   1.44%  perf-196410.map[.]
Lorg/apache/cassandra/db/DecoratedKey;.compareTo in
Ljava/util/concurrent/ConcurrentSkipListMap;.findNode
   1.02%  perf-196410.map[.]
Lorg/apache/cassandra/db/composites/AbstractSimpleCellNameType;.compare
   1.00%  snappy-1.0.5.2-libsnappyjava.so[.] 0x3804
   0.87%  perf-196410.map[.]
Ljava/io/DataInputStream;.readFully in
Lorg/apache/cassandra/db/AbstractCell$1;.computeNext
   0.82%  snappy-1.0.5.2-libsnappyjava.so[.] 0x36dc
   0.79%  [kernel]   [k]
copy_user_generic_string
   0.73%  perf-196410.map[.] vtable chunks
   0.71%  perf-196410.map[.]
Lorg/apache/cassandra/db/OnDiskAtom$Serializer;.deserializeFromSSTable in
Lorg/apache/cassandra/db/AbstractCell$1;.computeNext
   0.70%  [kernel]   [k] find_busiest_group
   0.69%  perf-196410.map[.] <80>H3^?
   0.68%  perf-196410.map[.]
Lorg/apache/cassandra/db/DecoratedKey;.compareTo
   0.65%  perf-196410.map[.]
jbyte_disjoint_arraycopy
   0.64%  [kernel]   [k] _raw_spin_lock
   0.63%  [kernel]   [k] __schedule
   0.45%  snappy-1.0.5.2-libsnappyjava.so[.] 0x36df

On Fri, Jan 29, 2016 at 2:11 PM, Corry Opdenakker  wrote:

> @JC, Get the pid of your target java process (something like "ps -ef |
> grep -i cassandra") .
> Then do a kill -3  (at unix/linux)
> Check the stdout logfile of the process.
>  it should contain the threaddump.
> If you found it, then great!
> Let that kill -3 loop for about 2 or 3 minutes.
> Herafter copy paste and load the stdout file into one if the mentioned
> tools.
> If you are not familiar with the java internals, then those threaddumps
> will learn you a lot:)
>
>
>
>
> On Friday, 29 January 2016, Jean Carlo 
> wrote the following:
>
>> I am having the same issue after upgrade cassandra 2.1.12 from 2.0.10. I
>> am not good on jvm so I would like to know how to do what @
>> CorryOpdenakker  propose with cassandra.
>>
>> :)
>>
>> I check concurrent_compactors
>>
>>
>> Saludos
>>
>> Jean Carlo
>>
>> "The best way to predict the future is to invent it" Alan Kay
>>
>> On Fri, Jan 29, 2016 at 9:24 PM, Corry Opdenakker 
>> wrote:
>>
>>> Hi guys,
>>> Cassandra is still new for me, but I have a lot of java tuning
>>> experience.
>>>
>>> For root cause detection of performance degradations its always good to
>>> start with collecting a series of java thread dumps. Take at problem
>>> occurrence using a loopscript for example 60 thread dumps with an interval
>>> of 1 or 2 seconds.
>>> Then load those dumps into IBM thread dump analyzer or in "eclipse mat"
>>> or any similar tool and see which methods appear to be most active or
>>> blocking others.
>>>
>>> Its really very useful
>>>
>>> Same can be be done in a normal situation to compare the difference.
>>>
>>> That should give more insights.
>>>
>>> Cheers, Corry
>>>
>>>
>>> On Friday, 29 January 2016, Peddi, Praveen  wrote the
>>> following:
>>>
 Hello,
 We have another update on performance on 2.1.11. compression_chunk_size
  didn’t really help much but We changed concurrent_compactors from default
 to 64 in 2.1.11 and read latencies improved significantly. However, 2.1.11
 read latencies are still 1.5 slower than 2.0.9. One thing we noticed in JMX
 metric that could affect read latencies is that 2.1.11 is running
 ReadRepairedBackground and ReadRepairedBlocking too 

Re: Slow performance after upgrading from 2.0.9 to 2.1.11

2016-11-08 Thread Dikang Gu
Michael, thanks for the info. It sounds to me like a very serious performance
regression. :(

On Tue, Nov 8, 2016 at 11:39 AM, Michael Kjellman <
mkjell...@internalcircle.com> wrote:

> Yes, We hit this as well. We have a internal patch that I wrote to mostly
> revert the behavior back to ByteBuffers with as small amount of code change
> as possible. Performance of our build is now even with 2.0.x and we've also
> forward ported it to 3.x (although the 3.x patch was even more complicated
> due to Bounds, RangeTombstoneBound, ClusteringPrefix which actually
> increases the number of allocations to somewhere between 11 and 13
> depending on how I count it per indexed block -- making it even worse than
> what you're observing in 2.1.
>
> We haven't upstreamed it as 2.1 is obviously not taking any changes at
> this point and the longer term solution is https://issues.apache.org/
> jira/browse/CASSANDRA-9754 (which also includes the changes to go back to
> ByteBuffers and remove as much of the Composites from the storage engine as
> possible.) Also, the solution is a bit of a hack -- although it was a
> blocker from us deploying 2.1 -- so i'm not sure how "hacky" it is if it
> works..
>
> best,
> kjellman
>
>
> On Nov 8, 2016, at 11:31 AM, Dikang Gu <dikan...@gmail.com<mailto:dik
> an...@gmail.com>> wrote:
>
> This is very expensive:
>
> "MessagingService-Incoming-/2401:db00:21:1029:face:0:9:0" prio=10
> tid=0x7f2fd57e1800 nid=0x1cc510 runnable [0x7f2b971b]
>java.lang.Thread.State: RUNNABLE
> at org.apache.cassandra.db.marshal.IntegerType.compare(
> IntegerType.java:29)
> at org.apache.cassandra.db.composites.AbstractSimpleCellNameType.
> compare(AbstractSimpleCellNameType.java:98)
> at org.apache.cassandra.db.composites.AbstractSimpleCellNameType.
> compare(AbstractSimpleCellNameType.java:31)
> at java.util.TreeMap.put(TreeMap.java:545)
> at java.util.TreeSet.add(TreeSet.java:255)
> at org.apache.cassandra.db.filter.NamesQueryFilter$
> Serializer.deserialize(NamesQueryFilter.java:254)
> at org.apache.cassandra.db.filter.NamesQueryFilter$
> Serializer.deserialize(NamesQueryFilter.java:228)
> at org.apache.cassandra.db.SliceByNamesReadCommandSeriali
> zer.deserialize(SliceByNamesReadCommand.java:104)
> at org.apache.cassandra.db.ReadCommandSerializer.
> deserialize(ReadCommand.java:156)
> at org.apache.cassandra.db.ReadCommandSerializer.
> deserialize(ReadCommand.java:132)
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:99)
> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(
> IncomingTcpConnection.java:195)
> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(
> IncomingTcpConnection.java:172)
> at org.apache.cassandra.net.IncomingTcpConnection.run(
> IncomingTcpConnection.java:88)
>
>
> Checked the git history, it comes from this jira:
> https://issues.apache.org/jira/browse/CASSANDRA-5417
>
> Any thoughts?
> ​
>
> On Fri, Oct 28, 2016 at 10:32 AM, Paulo Motta <pauloricard...@gmail.com<
> mailto:pauloricard...@gmail.com>> wrote:
> Haven't seen this before, but perhaps it's related to CASSANDRA-10433?
> This is just a wild guess as it's in a related codepath, but maybe worth
> trying out the patch available to see if it helps anything...
>
> 2016-10-28 15:03 GMT-02:00 Dikang Gu <dikan...@gmail.com<mailto:dik
> an...@gmail.com>>:
> We are seeing huge cpu regression when upgrading one of our 2.0.16 cluster
> to 2.1.14 as well. The 2.1.14 node is not able to handle the same amount of
> read traffic as the 2.0.16 node, actually, it's less than 50%.
>
> And in the perf results, the first line could go as high as 50%, as we
> turn up the read traffic, which never appeared in 2.0.16.
>
> Any thoughts?
> Thanks
>
>
> Samples: 952K of event 'cycles', Event count (approx.): 229681774560
> Overhead  Shared Object  Symbol
>6.52%  perf-196410.map[.]
> Lorg/apache/cassandra/db/marshal/IntegerType;.compare in
> Lorg/apache/cassandra/db/composites/AbstractSimpleCellNameType;.compare
>4.84%  libzip.so  [.] adler32
>2.88%  perf-196410.map[.]
> Ljava/nio/HeapByteBuffer;.get in Lorg/apache/cassandra/db/
> marshal/IntegerType;.compare
>2.39%  perf-196410.map[.]
> Ljava/nio/Buffer;.checkIndex in Lorg/apache/cassandra/db/
> marshal/IntegerType;.findMostSignificantByte
>2.03%  perf-196410.map[.]
> Ljava/math/BigInteger;.compareTo in Lorg/apache/cassandra/db/
> D

Re: A difficult data model with C*

2016-11-07 Thread Dikang Gu
Agreed, changing last_time to a descending clustering order will help. You can
also TTL the data so that the old records are purged by Cassandra automatically.

--Dikang.
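
As a rough sketch of both suggestions combined (descending clustering order plus
a TTL): note that last_time has to be part of the clustering key for the
ordering to apply, so this is a variant of the original table, and the keyspace
name, TTL value, and driver setup (DataStax Java driver 3.x) are illustrative
assumptions:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class RecentVideosSketch {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // last_time becomes a clustering column so the DESC ordering applies,
            // and a 30-day default TTL lets Cassandra expire old rows on its own.
            session.execute("CREATE TABLE IF NOT EXISTS demo.recent ("
                    + "user_name text, last_time timestamp, vedio_id text, position int, "
                    + "PRIMARY KEY (user_name, last_time, vedio_id)) "
                    + "WITH CLUSTERING ORDER BY (last_time DESC, vedio_id ASC) "
                    + "AND default_time_to_live = 2592000");

            // The latest 10 entries for a user come back newest-first, no ORDER BY needed.
            session.execute("SELECT vedio_id, position, last_time FROM demo.recent "
                    + "WHERE user_name = 'ben' LIMIT 10");
        }
    }
}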

On Mon, Nov 7, 2016 at 10:39 PM, Alain Rastoul 
wrote:

> On 11/08/2016 03:54 AM, ben ben wrote:
>
>> Hi guys,
>>CREATE TABLE recent (
>>  user_name text,
>>  vedio_id text,
>>  position int,
>>  last_time timestamp,
>>  PRIMARY KEY (user_name, vedio_id)
>> )
>>
>>
> Hi Ben,
>
> May be a clustering columns order would help
> CREATE TABLE recent (
> ...
> ) WITH CLUSTERING ORDER BY (last_time DESC);
> So you can query only the last 10 records
> SELECT * FROM recent WHERE vedio_id = xxx  LIMIT 10
>
> See here http://www.datastax.com/dev/blog/we-shall-have-order
> --
> best,
> Alain
>



-- 
Dikang


Re: operation and maintenance tools

2016-11-07 Thread Dikang Gu
Hi Simon,

For a 10-node cluster, Cassandra's nodetool should be enough for most C*
operations and maintenance, unless you have some special requirements.

For memory, you can check your JVM settings, and look at the GC log for
JVM usage.

--Dikang.

On Mon, Nov 7, 2016 at 7:25 PM, wxn...@zjqunshuo.com 
wrote:

> Hi All,
>
> I need to do maintenance work for a C* cluster with about 10 nodes. Please
> recommend a C* operation and maintenance tools you are using.
> I also noticed my C* deamon using large memory while doing nothing. Is
> there any convenent tool to deeply analysize the C* node memory?
>
> Cheers,
> Simon
>



-- 
Dikang


Re: Node replacement failed in 2.2

2016-11-18 Thread Dikang Gu
Paulo, the tokens field for 2401:db00:2130:4091:face:0:13:0 shows "TOKENS:
not present" on all live nodes. That means the tokens are missing, right? What
would cause this?

Thanks.
Dikang.

On Fri, Nov 18, 2016 at 11:15 AM, Paulo Motta <pauloricard...@gmail.com>
wrote:

> What does nodetool gossipinfo shows for endpoint /2401:db00:2130:4091:
> face:0:13:0 ? Does it contain the TOKENS attribute? If it's missing, is
> it only missing on this node or other nodes as well?
>
> 2016-11-18 17:02 GMT-02:00 Dikang Gu <dikan...@gmail.com>:
>
>> Hi, I encountered couple times that I could not replace a down node due
>> to error:
>>
>> 2016-11-17_19:33:58.70075 Exception (java.lang.RuntimeException)
>> encountered during startup: Could not find tokens for
>> /2401:db00:2130:4091:face:0:13:0 to replace
>> 2016-11-17_19:33:58.70489 ERROR 19:33:58 [main]: Exception encountered
>> during startup
>> 2016-11-17_19:33:58.70491 java.lang.RuntimeException: Could not find
>> tokens for /2401:db00:2130:4091:face:0:13:0 to replace
>> 2016-11-17_19:33:58.70491   at org.apache.cassandra.service.S
>> torageService.prepareReplacementInfo(StorageService.java:525)
>> ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20
>> 160315.c29948b]
>> 2016-11-17_19:33:58.70492   at org.apache.cassandra.service.S
>> torageService.prepareToJoin(StorageService.java:760)
>> ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20
>> 160315.c29948b]
>> 2016-11-17_19:33:58.70492   at org.apache.cassandra.service.S
>> torageService.initServer(StorageService.java:693)
>> ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20
>> 160315.c29948b]
>> 2016-11-17_19:33:58.70492   at org.apache.cassandra.service.S
>> torageService.initServer(StorageService.java:585)
>> ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20
>> 160315.c29948b]
>> 2016-11-17_19:33:58.70492   at org.apache.cassandra.service.C
>> assandraDaemon.setup(CassandraDaemon.java:300)
>> [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git201
>> 60315.c29948b]
>> 2016-11-17_19:33:58.70493   at org.apache.cassandra.service.C
>> assandraDaemon.activate(CassandraDaemon.java:516)
>> [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git201
>> 60315.c29948b]
>> 2016-11-17_19:33:58.70493   at org.apache.cassandra.service.C
>> assandraDaemon.main(CassandraDaemon.java:625)
>> [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git201
>> 60315.c29948b]
>> 2016-11-17_19:33:58.70649 INFO  19:33:58 [StorageServiceShutdownHook]:
>> Announcing shutdown
>> 2016-11-17_19:34:00.70967 INFO  19:34:00 [StorageServiceShutdownHook]:
>> Waiting for messaging service to quiesce
>> 2016-11-17_19:34:00.71066 INFO  19:34:00 
>> [ACCEPT-/2401:db00:2130:4091:face:0:13:0]:
>> MessagingService has terminated the accept() thread
>>
>> Did not find a relevant ticket for this, is anyone aware of this?
>>
>> Thanks!
>>
>> --
>> Dikang
>>
>>
>


-- 
Dikang


Node replacement failed in 2.2

2016-11-18 Thread Dikang Gu
Hi, I have encountered a couple of times that I could not replace a down node,
due to this error:

2016-11-17_19:33:58.70075 Exception (java.lang.RuntimeException)
encountered during startup: Could not find tokens for
/2401:db00:2130:4091:face:0:13:0 to replace
2016-11-17_19:33:58.70489 ERROR 19:33:58 [main]: Exception encountered
during startup
2016-11-17_19:33:58.70491 java.lang.RuntimeException: Could not find tokens
for /2401:db00:2130:4091:face:0:13:0 to replace
2016-11-17_19:33:58.70491   at
org.apache.cassandra.service.StorageService.prepareReplacementInfo(StorageService.java:525)
~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
2016-11-17_19:33:58.70492   at
org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:760)
~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
2016-11-17_19:33:58.70492   at
org.apache.cassandra.service.StorageService.initServer(StorageService.java:693)
~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
2016-11-17_19:33:58.70492   at
org.apache.cassandra.service.StorageService.initServer(StorageService.java:585)
~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
2016-11-17_19:33:58.70492   at
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300)
[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
2016-11-17_19:33:58.70493   at
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516)
[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
2016-11-17_19:33:58.70493   at
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:625)
[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
2016-11-17_19:33:58.70649 INFO  19:33:58 [StorageServiceShutdownHook]:
Announcing shutdown
2016-11-17_19:34:00.70967 INFO  19:34:00 [StorageServiceShutdownHook]:
Waiting for messaging service to quiesce
2016-11-17_19:34:00.71066 INFO  19:34:00
[ACCEPT-/2401:db00:2130:4091:face:0:13:0]: MessagingService has terminated
the accept() thread

Did not find a relevant ticket for this, is anyone aware of this?

Thanks!

-- 
Dikang


Re: Node replacement failed in 2.2

2016-11-21 Thread Dikang Gu
Hmm, I don't think we used join_ring=false or write_survey=true for that
node. I already ran remove_node to take the bad node out of the ring, and will
try to capture more debug logs next time.

Thanks.

On Sun, Nov 20, 2016 at 2:31 PM, Paulo Motta <pauloricard...@gmail.com>
wrote:

> Is there any chance the replaced node recently resumed bootstrap, joined
> with join_ring=false or write_survey=true? If so, perhaps this could be
> related to CASSANDRA-12935.
>
> Otherwise gossip tokens being empty is definitely unexpected behavior and
> you should probably file another ticket with more details/context (such as
> gossip debug logs of replacement and other nodes, and if the replacement
> node had the same or different ip as the original node since they are
> slightly different code paths after #8523).
>
> 2016-11-18 19:07 GMT-02:00 Dikang Gu <dikan...@gmail.com>:
>
>> Paulo, the tokens field for 2401:db00:2130:4091:face:0:13:0 shows
>> "TOKENS: not present", on all live nodes. It means tokens are missing,
>> right? What would cause this?
>>
>> Thanks.
>> Dikang.
>>
>> On Fri, Nov 18, 2016 at 11:15 AM, Paulo Motta <pauloricard...@gmail.com>
>> wrote:
>>
>>> What does nodetool gossipinfo shows for endpoint /2401:db00:2130:4091:
>>> face:0:13:0 ? Does it contain the TOKENS attribute? If it's missing, is
>>> it only missing on this node or other nodes as well?
>>>
>>> 2016-11-18 17:02 GMT-02:00 Dikang Gu <dikan...@gmail.com>:
>>>
>>>> Hi, I encountered couple times that I could not replace a down node due
>>>> to error:
>>>>
>>>> 2016-11-17_19:33:58.70075 Exception (java.lang.RuntimeException)
>>>> encountered during startup: Could not find tokens for
>>>> /2401:db00:2130:4091:face:0:13:0 to replace
>>>> 2016-11-17_19:33:58.70489 ERROR 19:33:58 [main]: Exception encountered
>>>> during startup
>>>> 2016-11-17_19:33:58.70491 java.lang.RuntimeException: Could not find
>>>> tokens for /2401:db00:2130:4091:face:0:13:0 to replace
>>>> 2016-11-17_19:33:58.70491   at org.apache.cassandra.service.S
>>>> torageService.prepareReplacementInfo(StorageService.java:525)
>>>> ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20
>>>> 160315.c29948b]
>>>> 2016-11-17_19:33:58.70492   at org.apache.cassandra.service.S
>>>> torageService.prepareToJoin(StorageService.java:760)
>>>> ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20
>>>> 160315.c29948b]
>>>> 2016-11-17_19:33:58.70492   at org.apache.cassandra.service.S
>>>> torageService.initServer(StorageService.java:693)
>>>> ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20
>>>> 160315.c29948b]
>>>> 2016-11-17_19:33:58.70492   at org.apache.cassandra.service.S
>>>> torageService.initServer(StorageService.java:585)
>>>> ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20
>>>> 160315.c29948b]
>>>> 2016-11-17_19:33:58.70492   at org.apache.cassandra.service.C
>>>> assandraDaemon.setup(CassandraDaemon.java:300)
>>>> [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git201
>>>> 60315.c29948b]
>>>> 2016-11-17_19:33:58.70493   at org.apache.cassandra.service.C
>>>> assandraDaemon.activate(CassandraDaemon.java:516)
>>>> [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git201
>>>> 60315.c29948b]
>>>> 2016-11-17_19:33:58.70493   at org.apache.cassandra.service.C
>>>> assandraDaemon.main(CassandraDaemon.java:625)
>>>> [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git201
>>>> 60315.c29948b]
>>>> 2016-11-17_19:33:58.70649 INFO  19:33:58 [StorageServiceShutdownHook]:
>>>> Announcing shutdown
>>>> 2016-11-17_19:34:00.70967 INFO  19:34:00 [StorageServiceShutdownHook]:
>>>> Waiting for messaging service to quiesce
>>>> 2016-11-17_19:34:00.71066 INFO  19:34:00 
>>>> [ACCEPT-/2401:db00:2130:4091:face:0:13:0]:
>>>> MessagingService has terminated the accept() thread
>>>>
>>>> Did not find a relevant ticket for this, is anyone aware of this?
>>>>
>>>> Thanks!
>>>>
>>>> --
>>>> Dikang
>>>>
>>>>
>>>
>>
>>
>> --
>> Dikang
>>
>>
>


-- 
Dikang


Re: Exceptions when upgrade from 2.1.14 to 2.2.5

2017-04-19 Thread Dikang Gu
Thanks Jeff & Hannu,

Yeah, that's my guess too. I worked around this by setting
-Dcassandra.commitlog.ignorereplayerrors=true.

Before the upgrade, we did run `nodetool drain`, but it seems the 2.1 commit
logs were not cleared and still got replayed by 2.2.

Thanks
Dikang.



On Tue, Apr 18, 2017 at 11:09 PM, Jeff Jirsa <jji...@apache.org> wrote:

>
>
> On 2017-04-18 18:57 (-0700), Dikang Gu <dikan...@gmail.com> wrote:
> > Hello there,
> >
> > We are upgrading one of our cluster from 2.1.14 to 2.2.5, but cassandra
> had
> > problems replaying the commit logs...
> >
> > Here is the exception, does anyone experience similar before?
> >
> > 2017-04-19_00:22:22.26911 ERROR 00:22:22 [main]: Exiting due to error
> while
> > processing commit log during initialization.
> > 2017-04-19_00:22:22.26912
> > org.apache.cassandra.db.commitlog.CommitLogReplayer$
> CommitLogReplayException:
> > Unexpected end of segment
>
>
> A lot of the commitlog replay code changed from 2.1 -> 2.2 to be more
> strict with "unexpected" errors on replay. Just glancing at the message but
> not tracing line numbers, I'm guessing it's either
> https://issues.apache.org/jira/browse/CASSANDRA-13282 (went into 2.2.10)
> or https://issues.apache.org/jira/browse/CASSANDRA-11995 (PA but not yet
> reviewed, would love a reviewer).
>



-- 
Dikang


Exceptions when upgrade from 2.1.14 to 2.2.5

2017-04-18 Thread Dikang Gu
Hello there,

We are upgrading one of our clusters from 2.1.14 to 2.2.5, but Cassandra had
problems replaying the commit logs.

Here is the exception; has anyone experienced something similar before?

2017-04-19_00:22:21.69943 DEBUG 00:22:21 [main]: Finished reading
/data/cassandra/commitlog/CommitLog-4-1487900877734.log
2017-04-19_00:22:21.69960 DEBUG 00:22:21 [main]: Replaying
/data/cassandra/commitlog/CommitLog-4-1487900877735.log (CL version 4,
messaging version 8, compression null)
2017-04-19_00:22:22.26911 ERROR 00:22:22 [main]: Exiting due to error while
processing commit log during initialization.
2017-04-19_00:22:22.26912
org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException:
Unexpected end of segment
2017-04-19_00:22:22.26912 at
org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:623)
[apache-cassandra-2.2.5+git20170404.5f2187b.jar:2.2.5+git20170
2017-04-19_00:22:22.26913 at
org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:484)
[apache-cassandra-2.2.5+git20170404.5f2187b.jar:2.2.5+git20170
2017-04-19_00:22:22.26913 at
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:389)
[apache-cassandra-2.2.5+git20170404.5f2187b.jar:2.2.5+git20170404.5f2187
2017-04-19_00:22:22.26913 at
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:147)
[apache-cassandra-2.2.5+git20170404.5f2187b.jar:2.2.5+git20170404.5f2187
2017-04-19_00:22:22.26913 at
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:189)
[apache-cassandra-2.2.5+git20170404.5f2187b.jar:2.2.5+git20170404.5f2187b]
2017-04-19_00:22:22.26913 at
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:169)
[apache-cassandra-2.2.5+git20170404.5f2187b.jar:2.2.5+git20170404.5f2187b]
2017-04-19_00:22:22.26914 at
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:302)
[apache-cassandra-2.2.5+git20170404.5f2187b.jar:2.2.5+git20170404.5f2187b]
2017-04-19_00:22:22.26914 at
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:544)
[apache-cassandra-2.2.5+git20170404.5f2187b.jar:2.2.5+git20170404.5f2187b]
2017-04-19_00:22:22.26914 at
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:607)
[apache-cassandra-2.2.5+git20170404.5f2187b.jar:2.2.5+git20170404.5f2187b]

Thanks

-- 
Dikang


Re: Cassandra 2.1.13: Using JOIN_RING=False

2017-05-11 Thread Dikang Gu
1. The coordinators still store system data and hints, but they should not
store any user data since they are not part of the ring.
2. We are using coordinators for thrift clients. CQL-based drivers need to
talk to nodes in the ring, so I think coordinator mode won't work for them.

-Dikang
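
For reference, the "whitelisting" approach being discussed looks roughly like
this with the DataStax Java driver (3.x assumed), pinning the driver to the
coordinator addresses seen in the logs above; whether those hosts survive in the
driver's metadata is exactly the open question in this thread:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.RoundRobinPolicy;
import com.datastax.driver.core.policies.WhiteListPolicy;

import java.net.InetSocketAddress;
import java.util.Arrays;
import java.util.List;

public class CoordinatorOnlyClient {
    public static void main(String[] args) {
        // Example coordinator addresses, matching the layout described in the thread.
        List<InetSocketAddress> coordinators = Arrays.asList(
                new InetSocketAddress("10.80.10.125", 9042),
                new InetSocketAddress("10.80.10.126", 9042));

        try (Cluster cluster = Cluster.builder()
                .addContactPoints("10.80.10.125", "10.80.10.126")
                .withLoadBalancingPolicy(new WhiteListPolicy(new RoundRobinPolicy(), coordinators))
                .build();
             Session session = cluster.connect()) {
            session.execute("SELECT release_version FROM system.local");
        }
    }
}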

On Tue, May 9, 2017 at 2:01 PM, Anubhav Kale <
anubhav.k...@microsoft.com.invalid> wrote:

> Hello,
>
>
>
> With some inspiration from the Cassandra Summit talk from last year, we
> are trying to setup a cluster with coordinator-only nodes. We setup
> join_ring=false in env.sh, disabled auth in YAML and the nodes are able to
> start just fine. However, we’re running into a few problems
>
>
>
> 1] The nodes marked with join_ring=false continue to store data. Why ?
>
> 2] We tried Python driver’s whitelistedpolicy. But we notice message like
> below, so we are not able to send queries to all nodes marked as
> coordinators. We also changed the Scala driver to support whitelisting, but
> see the same thing. What are we missing ?
>
> 3] Is there any way to concretely tell that only coordinator nodes are
> getting requests from clients ? We don’t have OpsCenter.
>
>
>
> Thanks !
>
>
>
> 2017-05-09 20:45:25,060 [DEBUG] cassandra.cluster: [control connection]
> Removing host not found in peers metadata: 
>
> 2017-05-09 20:45:25,060 [INFO] cassandra.cluster: Cassandra host
> 10.80.10.128 removed
>
> 2017-05-09 20:45:25,060 [DEBUG] cassandra.cluster: Removing host
> 10.80.10.128
>
> 2017-05-09 20:45:25,060 [DEBUG] cassandra.cluster: [control connection]
> Removing host not found in peers metadata: 
>
> 2017-05-09 20:45:25,060 [INFO] cassandra.cluster: Cassandra host
> 10.80.10.127 removed
>
> 2017-05-09 20:45:25,060 [DEBUG] cassandra.cluster: Removing host
> 10.80.10.127
>
> 2017-05-09 20:45:25,060 [DEBUG] cassandra.cluster: [control connection]
> Removing host not found in peers metadata: 
>
> 2017-05-09 20:45:25,060 [INFO] cassandra.cluster: Cassandra host
> 10.80.10.129 removed
>
> 2017-05-09 20:45:25,060 [DEBUG] cassandra.cluster: Removing host
> 10.80.10.129
>
> 2017-05-09 20:45:25,060 [DEBUG] cassandra.cluster: [control connection]
> Finished fetching ring info
>
> 2017-05-09 20:45:25,060 [DEBUG] cassandra.cluster: [control connection]
> Rebuilding token map due to topology changes
>
> 2017-05-09 20:45:25,081 [DEBUG] cassandra.metadata: user functions table
> not found
>
> 2017-05-09 20:45:25,081 [DEBUG] cassandra.metadata: user aggregates table
> not found
>
> 2017-05-09 20:45:25,098 [DEBUG] cassandra.cluster: Control connection
> created
>
> 2017-05-09 20:45:25,099 [DEBUG] cassandra.pool: Initializing connection
> for host 10.80.10.125
>
> 2017-05-09 20:45:25,099 [DEBUG] cassandra.pool: Initializing connection
> for host 10.80.10.126
>
>
>



-- 
Dikang


Definition of QUORUM consistency level

2017-06-08 Thread Dikang Gu
Hello there,

We have some use cases that do consistent read/write requests, and we
have 4 replicas in that cluster, according to our setup.

What's interesting to me is that both read and write quorum requests
block for 4/2+1 = 3 replicas, so we are accessing 3 (for the write)
+ 3 (for the read) = 6 replicas per quorum read/write pair, which is 2
replicas more than the 4 we have.

I think it is not necessary to have 2 overlapping nodes in the
even-replication-factor case.

I suggest changing the `quorumFor(keyspace)` code to separate the cases for
read and write requests, so that we can drop one replica request from the read
path.

Any concerns?

Thanks!


-- 
Dikang


Re: Definition of QUORUM consistency level

2017-06-08 Thread Dikang Gu
To me, CL.TWO and CL.THREE are more like workarounds for the problem; for
example, they do not work if the number of replicas goes to 8, which is
possible in our environment (2 replicas in each of 4 DCs).

What people want from quorum is a strong consistency guarantee. As long as
R+W > N, there are three options: a) R=W=(n/2+1); b) R=(n/2), W=(n/2+1); c)
R=(n/2+1), W=(n/2). What Cassandra is doing right now is option a), which
is the most expensive one.

I cannot think of a reason people would want the quorum read not for strong
consistency, but just to read from (n/2+1) nodes. If they want strong
consistency, then the read only needs (n/2) nodes; we are purely wasting the
one extra request, and it hurts read latency as well.

Thanks
Dikang.
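
A small worked example of the arithmetic behind option b). The helper names are
made up; the point is that keeping the write at (n/2)+1 while shrinking the read
to ceil(n/2) still satisfies R+W > N, and it only saves a replica when the
replication factor is even:

public class QuorumOverlapMath {
    // What Cassandra uses today for QUORUM reads and writes: floor(N/2) + 1.
    static int quorumFor(int replicas) {
        return replicas / 2 + 1;
    }

    // Smallest read count that still overlaps a floor(N/2)+1 write: ceil(N/2).
    static int minReadForOverlap(int replicas) {
        return (replicas + 1) / 2;
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 8; n++) {
            int w = quorumFor(n);
            int r = minReadForOverlap(n);
            assert r + w > n : "read and write sets must intersect";
            System.out.printf("N=%d  today r=w=%d  proposed w=%d, r=%d%n", n, quorumFor(n), w, r);
        }
        // N=4: today touches 3+3=6 replicas per read/write pair; the proposal touches 3+2=5.
        // For odd N the two formulas coincide, so nothing is saved there.
    }
}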

On Thu, Jun 8, 2017 at 8:20 PM, Nate McCall  wrote:

>
> We have CL.TWO.
>>
>>
>>
> This was actually the original motivation for CL.TWO and CL.THREE if
> memory serves:
> https://issues.apache.org/jira/browse/CASSANDRA-2013
>



-- 
Dikang


Re: Definition of QUORUM consistency level

2017-06-08 Thread Dikang Gu
Justin, what I suggest is that for the QUORUM consistency level, the block for
writes should be (num_replica/2)+1, the same as today, but for read
requests we only need to access (num_replica/2) nodes, which should still
provide strong consistency.

Dikang.

On Thu, Jun 8, 2017 at 7:38 PM, Justin Cameron <jus...@instaclustr.com>
wrote:

> 2/4 for write and 2/4 for read would not be sufficient to achieve strong
> consistency, as there is no overlap.
>
> In your particular case you could potentially use QUORUM for write and TWO
> for read (or vice-versa) and still achieve strong consistency. If you add
> additional nodes in the future this would obviously no longer work. Also
> the benefit of this is dubious, since 3/4 nodes still need to be accessible
> to perform writes. I'd also guess that it's unlikely to provide any
> significant performance increase.
>
> Justin
>
> On Fri, 9 Jun 2017 at 12:29 Dikang Gu <dikan...@gmail.com> wrote:
>
>> Hello there,
>>
>> We have some use cases are doing consistent read/write requests, and we
>> have 4 replicas in that cluster, according to our setup.
>>
>> What's interesting to me is that, for both read and write quorum
>> requests, they are blocked for 4/2+1 = 3 replicas, so we are accessing 3
>> (for write) + 3 (for reads) = 6 replicas in quorum requests, which is 2
>> replicas more than 4.
>>
>> I think it's not necessary to have 2 overlap nodes in even replication
>> factor case.
>>
>> I suggest to change the `quorumFor(keyspace)` code, separate the case for
>> read and write requests, so that we can reduce one replica request in read
>> path.
>>
>> Any concerns?
>>
>> Thanks!
>>
>>
>> --
>> Dikang
>>
>> --
>
>
> *Justin Cameron*Senior Software Engineer
>
>
> <https://www.instaclustr.com/>
>
>
> This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
> and Instaclustr Inc (USA).
>
> This email and any attachments may contain confidential and legally
> privileged information.  If you are not the intended recipient, do not copy
> or disclose its content, but please reply to this email immediately and
> highlight the error to the sender and then immediately delete the message.
>



-- 
Dikang


Re: Definition of QUORUM consistency level

2017-06-08 Thread Dikang Gu
So, for the quorum, what we really want is one overlapping node between the
write path and the read path. It was actually my assumption for a long time
that we need (N/2 + 1) for writes and just (N/2) for reads, because that is
enough to provide strong consistency.

On Thu, Jun 8, 2017 at 7:47 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> It would be a little weird to change the definition of QUORUM, which means
> majority, to mean something other than majority for a single use case.
> Sounds like you want to introduce a new CL, HALF.
> On Thu, Jun 8, 2017 at 7:43 PM Dikang Gu <dikan...@gmail.com> wrote:
>
>> Justin, what I suggest is that for QUORUM consistent level, the block for
>> write should be (num_replica/2)+1, this is same as today, but for read
>> request, we just need to access (num_replica/2) nodes, which should provide
>> enough strong consistency.
>>
>> Dikang.
>>
>> On Thu, Jun 8, 2017 at 7:38 PM, Justin Cameron <jus...@instaclustr.com>
>> wrote:
>>
>>> 2/4 for write and 2/4 for read would not be sufficient to achieve strong
>>> consistency, as there is no overlap.
>>>
>>> In your particular case you could potentially use QUORUM for write and
>>> TWO for read (or vice-versa) and still achieve strong consistency. If you
>>> add additional nodes in the future this would obviously no longer work.
>>> Also the benefit of this is dubious, since 3/4 nodes still need to be
>>> accessible to perform writes. I'd also guess that it's unlikely to provide
>>> any significant performance increase.
>>>
>>> Justin
>>>
>>> On Fri, 9 Jun 2017 at 12:29 Dikang Gu <dikan...@gmail.com> wrote:
>>>
>>>> Hello there,
>>>>
>>>> We have some use cases are doing consistent read/write requests, and we
>>>> have 4 replicas in that cluster, according to our setup.
>>>>
>>>> What's interesting to me is that, for both read and write quorum
>>>> requests, they are blocked for 4/2+1 = 3 replicas, so we are accessing 3
>>>> (for write) + 3 (for reads) = 6 replicas in quorum requests, which is 2
>>>> replicas more than 4.
>>>>
>>>> I think it's not necessary to have 2 overlap nodes in even replication
>>>> factor case.
>>>>
>>>> I suggest to change the `quorumFor(keyspace)` code, separate the case
>>>> for read and write requests, so that we can reduce one replica request in
>>>> read path.
>>>>
>>>> Any concerns?
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> --
>>>> Dikang
>>>>
>>>> --
>>>
>>>
>>> *Justin Cameron*Senior Software Engineer
>>>
>>>
>>> <https://www.instaclustr.com/>
>>>
>>>
>>> This email has been sent on behalf of Instaclustr Pty. Limited
>>> (Australia) and Instaclustr Inc (USA).
>>>
>>> This email and any attachments may contain confidential and legally
>>> privileged information.  If you are not the intended recipient, do not copy
>>> or disclose its content, but please reply to this email immediately and
>>> highlight the error to the sender and then immediately delete the message.
>>>
>>
>>
>>
>> --
>> Dikang
>>
>>


-- 
Dikang


Commitlog without header

2017-09-19 Thread Dikang Gu
Hello,

In our production cluster, we have had multiple cases where, after an *unclean*
shutdown, the Cassandra server cannot start due to commit log exceptions:

2017-09-17_06:06:32.49830 ERROR 06:06:32 [main]: Exiting due to error while
processing commit log during initialization.
2017-09-17_06:06:32.49831
org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException:
Could not read commit log descriptor in file
/data/cassandra/commitlog/CommitLog-5-1503088780367.log
2017-09-17_06:06:32.49831 at
org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:634)
[apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
2017-09-17_06:06:32.49831 at
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:303)
[apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
2017-09-17_06:06:32.49831 at
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:147)
[apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
2017-09-17_06:06:32.49832 at
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:189)
[apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
2017-09-17_06:06:32.49832 at
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:169)
[apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
2017-09-17_06:06:32.49832 at
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:302)
[apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
2017-09-17_06:06:32.49832 at
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:544)
[apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
2017-09-17_06:06:32.49832 at
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:607)
[apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]

I added some logging to CommitLogDescriptor.readHeader() and found that the
header is empty in the failure case. By empty, I mean all the fields in the
header are 0:

2017-09-19_22:43:02.22112 INFO  22:43:02 [main]: Dikang: crc: 0, checkcrc:
2077607535
2017-09-19_22:43:02.22130 INFO  22:43:02 [main]: Dikang: version: 0, id: 0,
parametersLength: 0

As a result, it did not pass the CRC check, and the commit log replay failed.
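
For illustration only (this is not the actual CommitLogDescriptor code, and
the field layout below is only assumed from the log line above), a zeroed-out
header can never pass this check: the CRC32 computed over the zeroed fields is
non-zero, while the CRC stored in an all-zero header is 0.

import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class ZeroHeaderCrcSketch
{
    public static void main(String[] args)
    {
        // Assumed layout mirroring the logged fields: version (int), id (long), parametersLength (short).
        ByteBuffer header = ByteBuffer.allocate(4 + 8 + 2);   // deliberately left all zeroes

        CRC32 crc = new CRC32();
        crc.update(header.array(), 0, header.capacity());

        long computed = crc.getValue();   // non-zero for this run of zero bytes
        long stored = 0L;                 // an all-zero header also stores a CRC of 0

        // The mismatch is what surfaces as "Could not read commit log descriptor" during replay.
        System.out.printf("computed=%d stored=%d match=%b%n", computed, stored, computed == stored);
    }
}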

My question is: is it a known issue that some race condition can cause an
empty header in the commit log? If so, it should be safe to just skip the last
commit log segment with an empty header, right?

As you can see, we are using Cassandra 2.2.5.

Thanks
Dikang.


Re: Row Cache hit issue

2017-09-19 Thread Dikang Gu
Hi Peng,

C* periodically saves the cache to disk, to solve the cold start problem. If
row_cache_save_period=0, C* does not save the cache to disk. But the cache
still works if it's enabled in the table schema; it will just be empty after a
restart.

--Dikang.

On Tue, Sep 19, 2017 at 8:27 PM, Peng Xiao <2535...@qq.com> wrote:

> And we are using C* 2.1.18.
>
>
> -- Original --
> *From: * "我自己的邮箱";<2535...@qq.com>;
> *Date: * Wed, Sep 20, 2017 11:27 AM
> *To: * "user";
> *Subject: * Row Cache hit issue
>
> Dear All,
>
> The default is row_cache_save_period=0, so it looks like the row cache
> should not work in this situation?
> But we can still see row cache hits.
>
> Row Cache  : entries 202787, size 100 MB, capacity 100 MB,
> 3095293 hits, 6796801 requests, 0.455 recent hit rate, 0 save period in
> seconds
>
> Could anyone please explain this?
>
> Thanks,
> Peng Xiao
>



-- 
Dikang


Re: Commitlog without header

2017-09-22 Thread Dikang Gu
I will try the fixes, thanks Benjamin & Jeff.

On Thu, Sep 21, 2017 at 8:55 PM, Jeff Jirsa <jji...@gmail.com> wrote:

> https://issues.apache.org/jira/plugins/servlet/mobile#
> issue/CASSANDRA-11995
>
>
>
> --
> Jeff Jirsa
>
>
> On Sep 19, 2017, at 4:36 PM, Dikang Gu <dikan...@gmail.com> wrote:
>
> Hello,
>
> In our production cluster, we had multiple times that after a *unclean*
> shutdown, cassandra sever can not start due to commit log exceptions:
>
> 2017-09-17_06:06:32.49830 ERROR 06:06:32 [main]: Exiting due to error while
> processing commit log during initialization.
> 2017-09-17_06:06:32.49831
> org.apache.cassandra.db.commitlog.CommitLogReplayer$
> CommitLogReplayException:
> Could not read commit log descriptor in file
> /data/cassandra/commitlog/CommitLog-5-1503088780367.log
> 2017-09-17_06:06:32.49831 at
> org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(
> CommitLogReplayer.java:634)
> [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
> 2017-09-17_06:06:32.49831 at
> org.apache.cassandra.db.commitlog.CommitLogReplayer.
> recover(CommitLogReplayer.java:303)
> [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
> 2017-09-17_06:06:32.49831 at
> org.apache.cassandra.db.commitlog.CommitLogReplayer.
> recover(CommitLogReplayer.java:147)
> [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
> 2017-09-17_06:06:32.49832 at
> org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:189)
> [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
> 2017-09-17_06:06:32.49832 at
> org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:169)
> [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
> 2017-09-17_06:06:32.49832 at
> org.apache.cassandra.service.CassandraDaemon.setup(
> CassandraDaemon.java:302)
> [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
> 2017-09-17_06:06:32.49832 at
> org.apache.cassandra.service.CassandraDaemon.activate(
> CassandraDaemon.java:544)
> [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
> 2017-09-17_06:06:32.49832 at
> org.apache.cassandra.service.CassandraDaemon.main(
> CassandraDaemon.java:607)
> [apache-cassandra-2.2.5+git20170612.e1857fa.jar:2.2.5+git20170612.e1857fa]
>
> I add some logging to the CommitLogDescriptor.readHeader(), and find the
> header is empty in the failure case. By empty, I mean all the fields in the
> header are 0:
>
> 2017-09-19_22:43:02.22112 INFO  22:43:02 [main]: Dikang: crc: 0, checkcrc:
> 2077607535
> 2017-09-19_22:43:02.22130 INFO  22:43:02 [main]: Dikang: version: 0, id: 0,
> parametersLength: 0
>
> As a result, it did not pass the crc check, and failed the commit log
> replay.
>
> My question is: is it a known issue that some race condition can cause
> empty header in commit log? If so, it should be safe just skip last commit
> log with empty header, right?
>
> As you can see, we are using Cassandra 2.2.5.
>
> Thanks
> Dikang.
>
>


-- 
Dikang


Re: Cassandra 3.11.0 compaction attempting impossible to complete compactions

2017-10-13 Thread Dikang Gu
What compaction strategy are you using? Leveled compaction or size-tiered
compaction?

On Fri, Oct 13, 2017 at 4:31 PM, Bruce Tietjen <
bruce.tiet...@imatsolutions.com> wrote:

> I hadn't noticed that it is now attempting two impossible compactions:
>
>
> id   compaction type keyspace  table
> completed totalunit  progress
> a7d1b130-b04c-11e7-bfc8-79870a3c4039 Compaction  perfectsearch cxml
> 1.73 TiB  5.04 TiB bytes 34.36%
> b7b98890-b063-11e7-bfc8-79870a3c4039 Compaction  perfectsearch cxml
> 867.4 GiB 6.83 TiB bytes 12.40%
> Active compaction remaining time :n/a
>
>
> On Fri, Oct 13, 2017 at 5:27 PM, Jon Haddad  wrote:
>
>> Can you paste the output of nodetool compactionstats?
>>
>> What you’re describing should not happen.  There’s a check that drops
>> sstables out of a compaction task if there isn’t enough available disk
>> space, see https://issues.apache.org/jira/browse/CASSANDRA-12979 for
>> some details.
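
A simplified sketch of the idea behind that check (this is not the actual
Cassandra code from CASSANDRA-12979, just an illustration): the largest
sstables are dropped from the task until the remaining input fits in the
available disk space.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class CompactionDiskCheckSketch
{
    // Shrink a compaction task by removing its largest inputs until the task
    // (approximated here as the sum of input sizes) fits in the free space.
    static List<Long> reduceScope(List<Long> sstableSizesGiB, long availableGiB)
    {
        List<Long> remaining = new ArrayList<>(sstableSizesGiB);
        remaining.sort(Comparator.reverseOrder());                  // largest first
        long total = remaining.stream().mapToLong(Long::longValue).sum();
        while (total > availableGiB && remaining.size() > 1)
            total -= remaining.remove(0);                           // drop the largest sstable
        return remaining;
    }

    public static void main(String[] args)
    {
        // Roughly the situation above: a ~6.8 TiB task but only ~2 TiB free on one disk.
        List<Long> kept = reduceScope(Arrays.asList(3000L, 2000L, 1500L, 330L), 2000L);
        System.out.println(kept);   // [1500, 330] -- only the inputs that still fit get compacted
    }
}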
>>
>>
>> On Oct 13, 2017, at 4:24 PM, Bruce Tietjen wrote:
>>
>>
>> We are new to Cassandra and have built a test cluster and loaded some
>> data into the cluster.
>>
>> We are seeing compaction behavior that seems to violate what we read
>> about its behavior.
>>
>> Our cluster is configured with JBOD with 3 3.6T disks. Those disks
>> currently respectively have the following used/available space:
>> Disk  Used  Available
>> sdb1  1.8T  1.7T
>> sdc1  1.8T  1.6T
>> sdd1  1.5T  2.0T
>>
>> nodetool compactionstats -H reports that the compaction system is
>> attempting to do a compaction that has a total of 6.83T
>>
>> The system hasn't had that much free space since sometime after we
>> started loading data and there has never been that much free space on a
>> single disk, so why would it ever attempt such a compaction?
>>
>> What have we done wrong, or am I reading this wrong?
>>
>> We have seen the same behavior on most of our 8 nodes.
>>
>> Can anyone tell us what is happening or what we have done wrong?
>>
>> Thanks
>>
>>
>>
>


-- 
Dikang


Re: run Cassandra on physical machine

2017-12-07 Thread Dikang Gu
@Peng, how many network interfaces do you have on your machine? If you just
have one NIC, you probably need to wait for this storage port patch:
https://issues.apache.org/jira/browse/CASSANDRA-7544 .

On Thu, Dec 7, 2017 at 7:01 AM, Oliver Ruebenacker  wrote:

>
>  Hello,
>
>   Yes, you can.
>
>  Best, Oliver
>
> On Thu, Dec 7, 2017 at 7:12 AM, Peng Xiao <2535...@qq.com> wrote:
>
>> Dear All,
>>
>> Can we run Cassandra on a physical machine directly?
>> We all know that a VM can reduce performance. For instance, we have a
>> machine with 56 cores and 8 SSD disks.
>> Can we run 8 Cassandra instances on the same machine, within one rack, on
>> different ports?
>>
>> Could anyone please advise?
>>
>> Thanks,
>> Peng Xiao
>>
>
>
>
> --
> Oliver Ruebenacker
> Senior Software Engineer, Diabetes Portal, Broad Institute
>
>


-- 
Dikang


Re: NVMe SSD benchmarking with Cassandra

2018-01-05 Thread Dikang Gu
Do you have any detailed benchmark metrics, like QPS, average read/write
latency, and P95/P99 read/write latency?

On Fri, Jan 5, 2018 at 5:57 PM, Justin Sanciangco 
wrote:

> I am benchmarking with the YCSB tool doing 1k writes.
>
>
>
> Below are my server specs
>
> 2 sockets
>
> 12 core hyperthreaded processor
>
> 64GB memory
>
>
>
> Cassandra settings
>
> 32GB heap
>
> Concurrent_reads: 128
>
> Concurrent_writes:256
>
>
>
> From what we are seeing, it looks like the kernel writing to the disk
> causes the degraded performance.
>
>
>
>
>
> Please let me know
>
>
>
>
>
> *From:* Jeff Jirsa [mailto:jji...@gmail.com]
> *Sent:* Friday, January 5, 2018 5:50 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: NVMe SSD benchmarking with Cassandra
>
>
>
> Second the note about compression chunk size in particular.
>
> --
>
> Jeff Jirsa
>
>
>
>
> On Jan 5, 2018, at 5:48 PM, Jon Haddad  wrote:
>
> Generally speaking, disable readahead.  After that it's very likely the
> issue isn’t in the disk settings you’re using, but is actually in your
> Cassandra config or the data model.  How are you measuring things?  Are you
> saturating your disks?  What resource is your bottleneck?
>
>
>
> *Every* single time I’ve handled a question like this, without exception,
> it ends up being a mix of incorrect compression settings (use 4K at most),
> some crazy readahead setting like 1MB, and terrible JVM settings that are
> the bulk of the problem.
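
A back-of-the-envelope illustration of why those two settings dominate (the
1 KB row size is just an assumption for the example; the 1 MB readahead and
4 KB chunk figures come from the advice above):

public class ReadAmplificationSketch
{
    public static void main(String[] args)
    {
        long rowBytes = 1L << 10;      // ~1 KB actually needed by a point read (assumed)
        long chunk4K = 4L << 10;       // recommended maximum compression chunk size
        long chunk64K = 64L << 10;     // a commonly seen larger chunk size
        long readahead1M = 1L << 20;   // the "crazy" 1 MB readahead setting

        // Bytes pulled from the device per byte the query actually needs.
        System.out.printf("4 KB chunk            : %dx%n", chunk4K / rowBytes);
        System.out.printf("64 KB chunk           : %dx%n", chunk64K / rowBytes);
        System.out.printf("1 MB readahead per I/O: %dx%n", readahead1M / rowBytes);
    }
}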
>
>
>
> Without knowing how you are testing things or *any* metrics whatsoever
> whether it be C* or OS it’s going to be hard to help you out.
>
>
>
> Jon
>
>
>
>
>
> On Jan 5, 2018, at 5:41 PM, Justin Sanciangco 
> wrote:
>
>
>
> Hello,
>
>
>
> I am currently benchmarking NVMe SSDs with Cassandra and am getting very
> bad performance when my workload exceeds the memory size. What mount
> settings for NVMe should be used? Right now the SSD is formatted as XFS
> using noop scheduler. Are there any additional mount options that should be
> used? Any specific kernel parameters that should be set in order to make best
> use of the PCIe NVMe SSD? Your insight would be well appreciated.
>
>
>
> Thank you,
>
> Justin Sanciangco
>
>
>
>


-- 
Dikang


Re: New token allocation and adding a new DC

2018-01-24 Thread Dikang Gu
I fixed the new allocation algorithm in the non-bootstrap case,
https://issues.apache.org/jira/browse/CASSANDRA-13080?filter=-2; the fix is
in 3.12+, but not in 3.0.


On Wed, Jan 24, 2018 at 9:32 AM, Oleksandr Shulgin <
oleksandr.shul...@zalando.de> wrote:

> On Thu, Jan 18, 2018 at 5:19 AM, kurt greaves 
> wrote:
>
>> Didn't know that about auto_bootstrap and the algorithm. We should
>> probably fix that. Can you create a JIRA for that issue?
>>
>
> Will do.
>
>
>> Workaround for #2 would be to truncate system.available_ranges after
>> "bootstrap".
>>
>
> Thanks, that seems to help.
>
> Initially we cannot login with cqlsh to run the truncate command on such a
> "bootstrapped" node.  But with the help of yet another workaround, namely
> pulling in the roles data by means of repairing system_auth keyspace only,
> it seems to be possible.  At least we see that netstats reports the ongoing
> streaming operations this time.
>
> --
> Alex
>
>


-- 
Dikang


Re: Nodes show different number of tokens than initially

2018-01-30 Thread Dikang Gu
Which partitioner do you use? We have logic to prevent duplicate tokens:

private static Collection<Token> adjustForCrossDatacenterClashes(final TokenMetadata tokenMetadata,
                                                                  StrategyAdapter strategy, Collection<Token> tokens)
{
    List<Token> filtered = Lists.newArrayListWithCapacity(tokens.size());

    for (Token t : tokens)
    {
        while (tokenMetadata.getEndpoint(t) != null)
        {
            InetAddress other = tokenMetadata.getEndpoint(t);
            if (strategy.inAllocationRing(other))
                throw new ConfigurationException(String.format("Allocated token %s already assigned to node %s. Is another node also allocating tokens?", t, other));
            t = t.increaseSlightly();
        }
        filtered.add(t);
    }
    return filtered;
}



On Tue, Jan 30, 2018 at 8:44 AM, Jeff Jirsa  wrote:

> All DCs in a cluster use the same token space in the DHT, so token
> conflicts across datacenters are invalid config
>
>
> --
> Jeff Jirsa
>
>
> On Jan 29, 2018, at 11:50 PM, Oleksandr Shulgin <
> oleksandr.shul...@zalando.de> wrote:
>
> On Tue, Jan 30, 2018 at 5:13 AM, kurt greaves 
> wrote:
>
>> Shouldn't happen. Can you send through nodetool ring output from one of
>> those nodes? Also, did the logs have anything to say about tokens when you
>> started the 3 seed nodes?​
>>
>
> Hi Kurt,
>
> I cannot run nodetool ring anymore, since these test nodes are long gone.
> However I've grepped the logs and this is what I've found:
>
> Jan 25 08:57:18 ip-172-31-128-41 docker/cf3ea463915a[854]: INFO 08:57:18
> Nodes /172.31.128.31 and /172.31.128.41 have the same token
> -9223372036854775808. Ignoring /172.31.128.31
> Jan 25 08:57:18 ip-172-31-128-41 docker/cf3ea463915a[854]: INFO 08:57:18
> Nodes /172.31.144.32 and /172.31.128.41 have the same token
> -8454757700450211158. Ignoring /172.31.144.32
> Jan 25 08:58:30 ip-172-31-144-41 docker/48fba443d99f[852]: INFO 08:58:30
> Nodes /172.31.128.41 and /172.31.128.31 have the same token
> -9223372036854775808. /172.31.128.41 is the new owner
> Jan 25 08:58:30 ip-172-31-144-41 docker/48fba443d99f[852]: INFO 08:58:30
> Nodes /172.31.144.32 and /172.31.128.41 have the same token
> -8454757700450211158. Ignoring /172.31.144.32
> Jan 25 08:59:45 ip-172-31-160-41 docker/cced70e132f2[849]: INFO 08:59:45
> Nodes /172.31.128.41 and /172.31.128.31 have the same token
> -9223372036854775808. /172.31.128.41 is the new owner
> Jan 25 08:59:45 ip-172-31-160-41 docker/cced70e132f2[849]: INFO 08:59:45
> Nodes /172.31.144.32 and /172.31.128.41 have the same token
> -8454757700450211158. Ignoring /172.31.144.32
>
> Since we are allocating the tokens for seed nodes manually, it appears
> that the first seed node in the new ring (172.31.128.41) gets the same
> first token (-9223372036854775808) as the node in the old ring
> (172.31.128.31).  The same goes for the 3rd token of the new seed node
> (-8454757700450211158).
>
> What is beyond me is why that would matter, and why token ownership would
> change at all, while these nodes are in *different virtual DCs*?  To me
> this sounds like a particularly nasty bug...
>
> --
> Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176
> 127-59-707
>
>


-- 
Dikang


Re: Rocksandra blog post

2018-03-06 Thread Dikang Gu
Thanks everyone!

@Kyrylo, in the Rocksandra world the storage engine is built on top of
RocksDB, which is another LSM-tree-based storage engine, so the immutable
sstables and compactions are managed by RocksDB instances. RocksDB supports
different compaction strategies, similar to STCS and LCS. The compactions run
in C++, so they do not put any pressure on the JVM.

@Romain, thanks, and Rocksandra is not limited to a key/value data model; we
support most of the Cassandra data types and the table data model.

Dikang.

On Tue, Mar 6, 2018 at 1:48 AM, Romain Hardouin <romainh...@yahoo.fr.invalid
> wrote:

> Rocksandra is very interesting for key/value data model. Let's hope it
> will land in C* upstream in the near future thanks to pluggable storage.
> Thanks Dikang!
>
>
>
> On Tuesday, March 6, 2018 at 10:06:16 UTC+1, Kyrylo Lebediev <
> kyrylo_lebed...@epam.com> wrote:
>
>
> Thanks for sharing, Dikang!
>
> Impressive results.
>
>
> As you plugged in a different storage engine, it's interesting how you're
> dealing with compactions in Rocksandra.
>
> Is there still the concept of immutable SSTables + compaction strategies,
> or was it changed somehow?
>
>
> Best,
>
> Kyrill
>
> --
> *From:* Dikang Gu <dikan...@gmail.com>
> *Sent:* Monday, March 5, 2018 8:26 PM
> *To:* d...@cassandra.apache.org; cassandra
> *Subject:* Rocksandra blog post
>
> As some of you already know, Instagram Cassandra team is working on the
> project to use RocksDB as Cassandra's storage engine.
>
> Today, we just published a blog post about the work we have done, and more
> excitingly, we published the benchmark metrics in AWS environment.
>
> Check it out here:
> https://engineering.instagram.com/open-sourcing-a-10x-
> reduction-in-apache-cassandra-tail-latency-d64f86b43589
>
> Thanks
> Dikang
>
>


-- 
Dikang


Rocksandra blog post

2018-03-05 Thread Dikang Gu
As some of you already know, the Instagram Cassandra team is working on a
project to use RocksDB as Cassandra's storage engine.

Today we published a blog post about the work we have done and, more
excitingly, the benchmark metrics from an AWS environment.

Check it out here:
https://engineering.instagram.com/open-sourcing-a-10x-reduction-in-apache-cassandra-tail-latency-d64f86b43589

Thanks
Dikang


Re: Secondary Index Cleanup

2018-03-02 Thread Dikang Gu
Which C* version do you use? It sounds like the secondary index is very
out of sync with the parent CF.

On Fri, Mar 2, 2018 at 6:23 AM, Malte Krüger 
wrote:

> Hi,
>
> We have a CF which is about 2 GB in size; it has a secondary index on one
> field (UUID).
>
> The index has a size on disk of about 10 GB. It only shrinks a little when
> forcing a compaction through JMX.
>
> If I use sstabledump I see a lot of these:
>
> "partition" : {
>   "key" : [ "123c50d1-1ceb-489d-8427-2f34065325f8" ],
>   "position" : 306166973
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 306167031,
> "clustering" : [ "f28f46930805495aa7d6cba291d92e87" ],
> "liveness_info" : { "tstamp" : "2017-10-30T16:49:37.160361Z" },
> "cells" : [ ]
>   },
>
> ...
>
> Normally I can find the key as an indexed field, but most of the keys in
> the dump no longer exist in the parent CF.
>
> These keys are sometimes months old. (We have gc_grace_seconds set to 30
> minutes.)
>
> If I use nodetool rebuild_index it does not help, but if I drop the index
> and recreate it, the size goes down to several hundred MB!
>
>
> What is the reason the cleanup does not work automatically, and how can I
> fix this?
>
> -Malte
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


-- 
Dikang


Re: Aborting a decommission

2019-02-20 Thread Dikang Gu
Tim, you can restart a node that is in decommission mode; after that, it will
start as a normal node. But it would be nice to have a tool to do that.

On Wed, Feb 20, 2019 at 2:32 PM Timothy Palpant  wrote:

> Hi!
>
> I was wondering if it is possible to abort/cancel a decommission that was
> initiated with `nodetool decommission`.
>
> The use case I am interested in is being able to stop a decommission that
> is in progress (returning the host to UN state) so that I can deal with
> another DN node first, without performing two token ring changes at the
> same time.
>
> Relatedly, does the reassignment of token ranges during a decommission
> happen at the start of the decommission, or at the end (after all data has
> been streamed to the new node)?
>
> Thanks!
> Tim
>
-- 
Dikang