Re: Exception in cassandra logs while processing the message

2014-02-17 Thread ankit tyagi
Hello,

Does anyone have any idea about this exception?

Regards,
Ankit Tyagi


On Fri, Feb 14, 2014 at 7:02 PM, ankit tyagi ankittyagi.mn...@gmail.com wrote:

 Hello,

 I am seeing the below exception in my Cassandra logs
 (/var/log/cassandra/system.log).

 INFO [ScheduledTasks:1] 2014-02-13 13:13:57,641 GCInspector.java (line 119) GC for ParNew: 273 ms for 1 collections, 2319121816 used; max is 4456448000
 INFO [ScheduledTasks:1] 2014-02-13 13:14:02,695 GCInspector.java (line 119) GC for ParNew: 214 ms for 1 collections, 2315368976 used; max is 4456448000
 INFO [OptionalTasks:1] 2014-02-13 13:14:08,093 MeteredFlusher.java (line 64) flushing high-traffic column family CFS(Keyspace='comsdb', ColumnFamily='product_update') (estimated 213624220 bytes)
 INFO [OptionalTasks:1] 2014-02-13 13:14:08,093 ColumnFamilyStore.java (line 626) Enqueuing flush of Memtable-product_update@1067619242(31239028/213625108 serialized/live bytes, 222393 ops)
 INFO [FlushWriter:94] 2014-02-13 13:14:08,127 Memtable.java (line 400) Writing Memtable-product_update@1067619242(31239028/213625108 serialized/live bytes, 222393 ops)
 INFO [ScheduledTasks:1] 2014-02-13 13:14:08,696 GCInspector.java (line 119) GC for ParNew: 214 ms for 1 collections, 2480175160 used; max is 4456448000
 INFO [FlushWriter:94] 2014-02-13 13:14:10,836 Memtable.java (line 438) Completed flushing /cassandra1/data/comsdb/product_update/comsdb-product_update-ic-416-Data.db (15707248 bytes) for commitlog position ReplayPosition(segmentId=1391568233618, position=13712751)
 ERROR [Thrift:13] 2014-02-13 13:15:45,694 CustomTThreadPoolServer.java (line 213) Thrift error occurred during processing of message.
 org.apache.thrift.TException: Negative length: -2147418111
         at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:388)
         at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
         at org.apache.cassandra.thrift.Cassandra$batch_mutate_args.read(Cassandra.java:20304)
         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:21)
         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
         at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
         at java.lang.Thread.run(Thread.java:679)
 ERROR [Thrift:103] 2014-02-13 13:21:25,719 CustomTThreadPoolServer.java (line 213) Thrift error occurred during processing of message.
 org.apache.thrift.TException: Negative length: -2147418111


 Below are the Cassandra and Hector client versions currently in use.

 Cassandra-version: 1.2.11
 Hector-client: 1.0-2

 Any lead would be appreciated. We are planning to move to Cassandra 2.0
 with the Java driver, but that may take some time; meanwhile we need to
 find the root cause and resolve this issue.


 Regards,
 Ankit Tyagi



Re: Exception in cassandra logs while processing the message

2014-02-17 Thread Vivek Mishra
Looks like a Thrift interoperability issue. It seems the column family or
data was created via CQL3 and is being read using a Thrift-based API.

Otherwise, recreate your schema and try again.

-Vivek


On Mon, Feb 17, 2014 at 1:50 PM, ankit tyagi ankittyagi.mn...@gmail.com wrote:

 Hello,

 Does anyone have any idea about this exception?

 Regards,
 Ankit Tyagi


 On Fri, Feb 14, 2014 at 7:02 PM, ankit tyagi ankittyagi.mn...@gmail.com
 wrote:

 [original message and logs snipped; see the first post in this thread]





Re: Exception in cassandra logs while processing the message

2014-02-17 Thread ankit tyagi
Hi,

I am using Hector-client 1.0-2 to insert the data. The problem is, I am
not seeing any exception in my application logs where I insert data
through Hector.

It looks like something internal to Cassandra.

Regards,
Ankit Tyagi


On Mon, Feb 17, 2014 at 1:56 PM, Vivek Mishra mishra.v...@gmail.com wrote:

 Looks like a Thrift interoperability issue. It seems the column family or
 data was created via CQL3 and is being read using a Thrift-based API.

 Otherwise, recreate your schema and try again.

 -Vivek


 On Mon, Feb 17, 2014 at 1:50 PM, ankit tyagi ankittyagi.mn...@gmail.com
 wrote:

 Hello,

 Does anyone have any idea about this exception?

 Regards,
 Ankit Tyagi


 On Fri, Feb 14, 2014 at 7:02 PM, ankit tyagi ankittyagi.mn...@gmail.com
 wrote:

 [original message and logs snipped; see the first post in this thread]





Re: Exception in cassandra logs while processing the message

2014-02-17 Thread Sylvain Lebresne
That looks like a Thrift error. My best bet would be that you have a
version incompatibility between the Thrift lib used by your Hector version
and the one used by your Cassandra version. All I can tell you is that
Cassandra 1.2 uses libthrift 0.7.0; I am not sure what Hector-client 1.0-2
uses, however.
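
For reference, the reported negative length is exactly the 32-bit header
that Thrift's strict binary protocol writes at the start of a message
(VERSION_1 | CALL), which is what a server would see if it read a message
header where it expected a length. A minimal Java check of that arithmetic
(the constants are from libthrift; the header-read-as-length interpretation
is an assumption, not something confirmed in this thread):

    public class NegativeLengthCheck {
        private static final int VERSION_1 = 0x80010000; // TBinaryProtocol strict-mode version bits
        private static final int T_CALL = 1;             // TMessageType.CALL

        public static void main(String[] args) {
            int reported = -2147418111;                        // value from the Cassandra log
            System.out.printf("0x%08X%n", reported);           // prints 0x80010001
            System.out.println(reported == (VERSION_1 | T_CALL)); // prints true
        }
    }

If that is what is happening, it would point at a framing or
protocol-version mismatch (e.g. the libthrift difference described above)
rather than at corrupt data on disk.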


On Mon, Feb 17, 2014 at 9:36 AM, ankit tyagi ankittyagi.mn...@gmail.com wrote:

 Hi,

 I am using Hector-client 1.0-2 to insert the data. The problem is, I am
 not seeing any exception in my application logs where I insert data
 through Hector.

 It looks like something internal to Cassandra.

 Regards,
 Ankit Tyagi


 On Mon, Feb 17, 2014 at 1:56 PM, Vivek Mishra mishra.v...@gmail.com wrote:

 Looks like a Thrift interoperability issue. It seems the column family or
 data was created via CQL3 and is being read using a Thrift-based API.

 Otherwise, recreate your schema and try again.

 -Vivek


 On Mon, Feb 17, 2014 at 1:50 PM, ankit tyagi ankittyagi.mn...@gmail.com
 wrote:

 [earlier quoted messages and logs snipped]




Re: Intermittent long application pauses on nodes

2014-02-17 Thread Ondřej Černoš
Hi all,

we are seeing the same kind of long pauses in Cassandra. We tried
switching from CMS to G1 without positive results. The stress test is read
heavy: 2 datacenters, 6 nodes, 400 reqs/sec on one datacenter. We see
spikes in latency at the 99.99th percentile and higher, caused by threads
being stopped in the JVM.

The GC in G1 looks like this:

{Heap before GC invocations=4073 (full 1):
 garbage-first heap   total 8388608K, used 3602914K [0x0005f5c0, 0x0007f5c0, 0x0007f5c0)
  region size 4096K, 142 young (581632K), 11 survivors (45056K)
 compacting perm gen  total 28672K, used 27428K [0x0007f5c0, 0x0007f780, 0x0008)
   the space 28672K,  95% used [0x0005f5c0, 0x0007f76c9108, 0x0007f76c9200, 0x0007f780)
No shared spaces configured.
2014-02-17T04:44:16.385+0100: 222346.218: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 37748736 bytes, new threshold 15 (max 15)
- age   1:   17213632 bytes,   17213632 total
- age   2:   19391208 bytes,   36604840 total
, 0.1664300 secs]
   [Parallel Time: 163.9 ms, GC Workers: 2]
      [GC Worker Start (ms): Min: 222346218.3, Avg: 222346218.3, Max: 222346218.3, Diff: 0.0]
      [Ext Root Scanning (ms): Min: 6.0, Avg: 6.9, Max: 7.7, Diff: 1.7, Sum: 13.7]
      [Update RS (ms): Min: 20.4, Avg: 21.3, Max: 22.1, Diff: 1.7, Sum: 42.6]
         [Processed Buffers: Min: 49, Avg: 60.0, Max: 71, Diff: 22, Sum: 120]
      [Scan RS (ms): Min: 23.2, Avg: 23.2, Max: 23.3, Diff: 0.1, Sum: 46.5]
      [Object Copy (ms): Min: 112.3, Avg: 112.3, Max: 112.4, Diff: 0.1, Sum: 224.6]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.1]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 163.8, Avg: 163.8, Max: 163.8, Diff: 0.0, Sum: 327.6]
      [GC Worker End (ms): Min: 222346382.1, Avg: 222346382.1, Max: 222346382.1, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.4 ms]
   [Other: 2.1 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 1.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.4 ms]
   [Eden: 524.0M(524.0M)->0.0B(476.0M) Survivors: 44.0M->68.0M Heap: 3518.5M(8192.0M)->3018.5M(8192.0M)]
Heap after GC invocations=4074 (full 1):
 garbage-first heap   total 8388608K, used 3090914K [0x0005f5c0, 0x0007f5c0, 0x0007f5c0)
  region size 4096K, 17 young (69632K), 17 survivors (69632K)
 compacting perm gen  total 28672K, used 27428K [0x0007f5c0, 0x0007f780, 0x0008)
   the space 28672K,  95% used [0x0005f5c0, 0x0007f76c9108, 0x0007f76c9200, 0x0007f780)
No shared spaces configured.
}
[Times: user=0.35 sys=0.00, real=27.58 secs]
222346.219: G1IncCollectionPause [ 111 0 0 ] [ 0 0 0 0 27586 ] 0

And the total time for which application threads were stopped is 27.58
seconds.

CMS behaves in a similar manner. We thought it might be GC waiting for
mmapped files being read from disk (a thread cannot reach a safepoint
during this operation), but that doesn't explain the huge time.

We'll try jHiccup to see if it provides any additional information. The
test was done in a mixed AWS/OpenStack environment with OpenJDK 1.7.0_45
and Cassandra 1.2.11. Upgrading to 2.0.x is not an option for us.

regards,

ondrej cernos


On Fri, Feb 14, 2014 at 8:53 PM, Frank Ng fnt...@gmail.com wrote:

 Sorry, I have not had a chance to file a JIRA ticket.  We have not been
 able to resolve the issue.  But since Joel mentioned that upgrading to
 Cassandra 2.0.X solved it for them, we may need to upgrade.  We are
 currently on Java 1.7 and Cassandra 1.2.8



 On Thu, Feb 13, 2014 at 12:40 PM, Keith Wright kwri...@nanigans.com wrote:

 You're running 2.0.* in production?  May I ask what C* version and OS?
  Any hardware details would be appreciated as well.  Thx!

 From: Joel Samuelsson samuelsson.j...@gmail.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Thursday, February 13, 2014 at 11:39 AM

 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: Re: Intermittent long application pauses on nodes

 We have had similar issues, and upgrading C* to 2.0.x and Java to 1.7
 seems to have helped.


 2014-02-13 Keith Wright kwri...@nanigans.com:

 Frank, did you ever file a ticket for this issue or find the root cause?
  I believe we are seeing the same issues when attempting to bootstrap.

 Thanks

 From: Robert Coli rc...@eventbrite.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Monday, February 3, 2014 at 6:10 PM
 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: Re: Intermittent long application pauses on nodes

 On Mon, Feb 3, 2014 at 8:52 AM, Benedict Elliott Smith 
 belliottsm...@datastax.com wrote:


 It's possible that this is a JVM issue, but if so there may be some
 remedial action we can take anyway. There are some more flags we should
 add, but we can discuss that once you open a 

Re: Intermittent long application pauses on nodes

2014-02-17 Thread Benedict Elliott Smith
Ondrej,

It seems like your issue is much less difficult to diagnose: your
collection times are long. At least, the pause you printed the time for is
all attributable to the G1 pause.

Note that G1 has not generally performed well with Cassandra in our
testing. There are a number of changes going in soon that may change that,
but for the time being it is advisable to stick with CMS. With tuning you
can no doubt bring your pauses down considerably.
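
For reference, a typical CMS starting point for Cassandra 1.2 is the stock
cassandra-env.sh GC configuration below; treat it as a baseline to tune
from (survivor sizing and the initiating occupancy are the usual knobs),
not as a recommendation for any specific workload:

    JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
    JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
    JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
    JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
    JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
    JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
    JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"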


On 17 February 2014 10:17, Ondřej Černoš cern...@gmail.com wrote:

 Hi all,

 we are seeing the same kind of long pauses in Cassandra. We tried
 switching from CMS to G1 without positive results. [...]

 [quoted GC log and earlier thread snipped]

Re: Exception in cassandra logs while processing the message

2014-02-17 Thread ankit tyagi
Hi Sylvain,

Hector core 1.0-2 uses libthrift 0.6.1, but this exception is not
consistent; we get it intermittently.

If there were a compatibility issue with the Thrift jar, then this error
should be consistent, right?

Regards,
Ankit Tyagi




On Mon, Feb 17, 2014 at 2:30 PM, Sylvain Lebresne sylv...@datastax.com wrote:

 That looks like a Thrift error. My best bet would be that you have a
 version incompatibility between the Thrift lib used by your Hector
 version and the one used by your Cassandra version. All I can tell you is
 that Cassandra 1.2 uses libthrift 0.7.0; I am not sure what Hector-client
 1.0-2 uses, however.


 On Mon, Feb 17, 2014 at 9:36 AM, ankit tyagi ankittyagi.mn...@gmail.com
 wrote:

 [earlier quoted messages and logs snipped]



Re: Intermittent long application pauses on nodes

2014-02-17 Thread Ondřej Černoš
Hi,

we tried to switch to G1 because we observed this behaviour on CMS too (a
27-second pause in G1 is quite a strong hint not to use it). Pauses with
CMS were not easily traceable: the JVM stopped even without a
stop-the-world pause scheduled (defragmentation, remarking). We thought the
time to reach a safepoint might have been involved (we saw "waiting for
safepoint resolution"), especially because access to mmapped files is not
preemptive, afaik, but that doesn't explain tens of seconds of waiting;
even slow IO should read our sstables into memory in much less time. We
switched to G1 out of desperation, and to try different code paths, not
because we thought it was a great idea. So I think we were hit by the
problem discussed in this thread; the G1 report just wasn't very clear,
sorry.

regards,
ondrej



On Mon, Feb 17, 2014 at 11:45 AM, Benedict Elliott Smith 
belliottsm...@datastax.com wrote:

 Ondrej,

 It seems like your issue is much less difficult to diagnose: your
 collection times are long. At least, the pause you printed the time for is
 all attributable to the G1 pause.

 Note that G1 has not generally performed well with Cassandra in our
 testing. There are a number of changes going in soon that may change that,
 but for the time being it is advisable to stick with CMS. With tuning you
 can no doubt bring your pauses down considerably.


 On 17 February 2014 10:17, Ondřej Černoš cern...@gmail.com wrote:

 Hi all,

 we are seeing the same kind of long pauses in Cassandra. [...]

 [quoted message, GC log and earlier thread snipped]

Re: Intermittent long application pauses on nodes

2014-02-17 Thread Benedict Elliott Smith
Hi Ondrej,

It's possible you were hit by the problems in this thread before, but it
looks like you may have other issues as well. It may be that on G1 you
have one issue and on CMS another, but 27s is extreme even for G1, so that
seems unlikely. If you're hitting these pause times on CMS and you get
some more output from the safepoint tracing, please do contribute it, as I
would love to get to the bottom of that. However, is it possible you're
experiencing paging activity? Have you made certain the VM memory is
locked (and preferably that paging is entirely disabled)? The bloom
filters and other memory won't be locked, although that shouldn't cause
pauses during GC.

Note that mmapped file accesses and other native work shouldn't in any way
inhibit GC activity or other safepoint pause times, unless there's a bug
in the VM. These threads simply enter a safepoint as they return to the VM
execution context, and are considered safe for the duration they are
outside it.
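
For anyone wanting to capture the safepoint output mentioned above, these
standard HotSpot flags (added to cassandra-env.sh) print per-safepoint
timings, including the time spent bringing threads to the safepoint:

    JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
    JVM_OPTS="$JVM_OPTS -XX:+PrintSafepointStatistics"
    JVM_OPTS="$JVM_OPTS -XX:PrintSafepointStatisticsCount=1"

The "222346.219: G1IncCollectionPause [...]" line in Ondřej's log looks
like exactly this output, with roughly 27586 ms attributed to the VM
operation itself.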




On 17 February 2014 12:30, Ondřej Černoš cern...@gmail.com wrote:

 Hi,

 we tried to switch to G1 because we observed this behaviour on CMS too (a
 27-second pause in G1 is quite a strong hint not to use it). [...]

 [rest of quoted thread snipped]

Re: Cass 1.2.11 : java.lang.AssertionError: originally calculated column size

2014-02-17 Thread Oleg Dulin

Bumping this up -- anything? Anyone?

On 2014-02-13 16:01:50 +0000, Oleg Dulin said:

I am getting these exceptions on one of the nodes, quite often, during
compactions:

java.lang.AssertionError: originally calculated column size of
84562492 but now it is 84562600

Usually this is on the same column family.

I believe this is preventing compactions from completing, and
subsequently causing other performance issues for me.

Is there a way to fix this? Would nodetool scrub take care of it?



--
Regards,
Oleg Dulin
http://www.olegdulin.com




Failed to decode value

2014-02-17 Thread PARASHAR, BHASKARJYA JAY

Hi,

When I do a CQL3 "select *" on TABLE1 (described below), I see rows
displayed but get this error at the end:

Failed to decode value '9131146' (for column 'accountId') as bigint: unpack
requires a string argument of length 8

Cassandra version: 1.2.5. accountId is of type text. Why would it be
looking for a bigint?

Thanks
Jay

CREATE TABLE TABLE1 (
  key text PRIMARY KEY,
  accountId text,
  balanceImpactCode text,
  billSequenceNumber text,
  creditCreationDate text,
  creditLevelCode text,
  creditNetAmount double,
  creditReason text,
  eventId text,
  productId text,
  productType text,
  reversalReason text,
  reversalTransId text,
  totalTaxAmount double,
  transactionId text,
  updateDate text
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.10 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

CREATE INDEX indx_accountId ON EnablerAccountCreditChargeInfo (accountId);
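
One way to check which validator the cluster has stored for that column (a
hedged sketch; substitute your own keyspace name for 'ks') is to query the
schema tables directly. If the validator comes back as LongType rather
than UTF8Type, the column metadata and the data on disk disagree, for
example because the column type was ALTERed after data was written:

    SELECT column_name, validator
    FROM system.schema_columns
    WHERE keyspace_name = 'ks' AND columnfamily_name = 'table1';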



Re: Expired column showing up

2014-02-17 Thread mahesh rajamani
Christian,

Yes. Is it a problem?  Can you explain what happens in this scenario?

Thanks
Mahesh


On Fri, Feb 14, 2014 at 3:07 PM, horschi hors...@gmail.com wrote:

 Hi Mahesh,

 is it possible you are creating columns with a long TTL, and then updating
 these columns with a smaller TTL?

 kind regards,
 Christian


 On Fri, Feb 14, 2014 at 3:45 PM, mahesh rajamani 
 rajamani.mah...@gmail.com wrote:

 Hi,

 I am using Cassandra version 2.0.2. On a wide row (approx. 1
 columns), I expire a few columns by setting the TTL to 1 second. At times
 these columns show up during slice queries.

 When I have this issue, running count and get commands for that row using
 cassandra-cli gives different column counts.

 But once I run flush and compact, the issue goes away and the expired
 columns don't show up.

 Can someone provide some help on this issue.

 --
 Regards,
 Mahesh Rajamani





-- 
Regards,
Mahesh Rajamani


Re: Expired column showing up

2014-02-17 Thread horschi
Hi Mahesh,

the problem is that every column is only tombstoned for as long as the
original column was valid.

So if the last update was only valid for 1 second, then the tombstone will
also be valid for only 1 second! If the previous value was valid for a
longer time, then that old value might reappear.

Maybe you can explain why you are doing this?

kind regards,
Christian
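
A minimal timeline of the effect described above (a hypothetical CQL
sketch; the table and values are invented):

    -- cell first written with a long TTL, then overwritten with TTL 1
    INSERT INTO locks (k, owner) VALUES ('r1', 'a') USING TTL 864000;
    UPDATE locks USING TTL 1 SET owner = 'b' WHERE k = 'r1';
    -- ~1s later the TTL-1 cell expires; per the explanation above, its
    -- tombstone only lives as long as that 1-second TTL, so once it is
    -- gone the older long-TTL cell (still live in another sstable or
    -- memtable) can reappear in reads until compaction merges the two.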



On Mon, Feb 17, 2014 at 6:18 PM, mahesh rajamani rajamani.mah...@gmail.com wrote:

 Christian,

 Yes. Is it a problem?  Can you explain what happens in this scenario?

 Thanks
 Mahesh


 On Fri, Feb 14, 2014 at 3:07 PM, horschi hors...@gmail.com wrote:

 [earlier quoted messages snipped]



cold_reads_to_omit and tombstones

2014-02-17 Thread David Chia
Hi,

The doc is not clear to me about cold_reads_to_omit: when
cold_reads_to_omit makes compaction ignore low-read-rate sstables, what
would GC the tombstones in those cold sstables? Do we rely on tombstone
compactions? I may have misconfigured those, as I've never seen a
tombstone compaction get kicked off.

Cassandra version: 2.0.5


David
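
For what it's worth, the tombstone-compaction knobs are per-table
compaction sub-options. A hedged sketch of setting them explicitly
(keyspace/table names are placeholders; tombstone_threshold 0.2 and
tombstone_compaction_interval 86400 are the documented defaults, and
setting cold_reads_to_omit to 0.0 should disable the cold-sstable skipping
entirely):

    ALTER TABLE ks.cf WITH compaction = {
      'class': 'SizeTieredCompactionStrategy',
      'tombstone_threshold': '0.2',
      'tombstone_compaction_interval': '86400',
      'cold_reads_to_omit': '0.0'
    };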


Re: Expired column showing up

2014-02-17 Thread mahesh rajamani
Christian,

There are 2 use cases which are failing, and both look to be the same
issue; both happen in column families set with TTL.

Case 1) I manage an index for specific data as a single row in a column
family. I set TTL to 1 second when the data needs to be removed from the
index row. Under some scenarios, get and count for the row key give
different column counts. In the application, if I do a get I see the
correct set of columns (expired columns don't return), but if I do a slice
query and read 100 columns at a time, the columns set with TTL are
returned. I am not able to understand what triggers this issue.

Case 2) I have a column family for managing locks. In this case I insert a
column with a default TTL of 15 seconds. If the transaction completes
earlier, I remove the column by again setting TTL to 1 second.

In this case, when running flush, the flush hangs with the following
assertion exception:

ERROR [FlushWriter:1] 2014-02-17 11:49:29,349 CassandraDaemon.java (line 187) Exception in thread Thread[FlushWriter:1,5,main]
java.lang.AssertionError
        at org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:198)
        at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:186)
        at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:360)
        at org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:315)
        at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)


Thanks
Mahesh



On Mon, Feb 17, 2014 at 12:43 PM, horschi hors...@gmail.com wrote:

 the problem is that every column is only tombstoned for as long as the
 original column was valid. [...]

 [rest of quoted thread snipped]



-- 
Regards,
Mahesh Rajamani


Cassandra DSC installation fail due to some python dependecies. How to rectify ?

2014-02-17 Thread Ertio Lew
I am trying to install Cassandra dsc20 but the installation fails due to
some Python dependencies. How can I make this work?


root@server1:~# sudo apt-get install dsc20
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  cassandra libjna-java libopts25 ntp python python-minimal
python-support python2.7
  python2.7-minimal
Suggested packages:
  libjna-java-doc ntp-doc apparmor python-doc python-tk python2.7-doc
binutils binfmt-support
Recommended packages:
  perl
The following NEW packages will be installed:
  cassandra dsc20 libjna-java libopts25 ntp python python-minimal
python-support python2.7
  python2.7-minimal
0 upgraded, 10 newly installed, 0 to remove and 0 not upgraded.
Need to get 17.1 MB of archives.
After this operation, 23.2 MB of additional disk space will be used.
Do you want to continue [Y/n]? y
Get:1 http://debian.datastax.com/community/ stable/main cassandra all
2.0.5 [14.3 MB]
Get:2 http://us.archive.ubuntu.com/ubuntu/ raring/main libopts25 amd64
1:5.17.1-1ubuntu2 [62.2 kB]
Get:3 http://us.archive.ubuntu.com/ubuntu/ raring/main ntp amd64
1:4.2.6.p5+dfsg-2ubuntu1 [614 kB]
Get:4 http://us.archive.ubuntu.com/ubuntu/ raring/universe libjna-java
amd64 3.2.7-4 [416 kB]
Get:5 http://us.archive.ubuntu.com/ubuntu/ raring-security/main
python2.7-minimal amd64 2.7.4-2ubuntu3.2 [1223 kB]
Get:6 http://debian.datastax.com/community/ stable/main dsc20 all
2.0.5-1 [1302 B]
Get:7 http://us.archive.ubuntu.com/ubuntu/ raring-security/main
python2.7 amd64 2.7.4-2ubuntu3.2 [263 kB]
Get:8 http://us.archive.ubuntu.com/ubuntu/ raring/main python-minimal
amd64 2.7.4-0ubuntu1 [30.8 kB]
Get:9 http://us.archive.ubuntu.com/ubuntu/ raring/main python amd64
2.7.4-0ubuntu1 [169 kB]
Get:10 http://us.archive.ubuntu.com/ubuntu/ raring/universe
python-support all 1.0.15 [26.7 kB]
Fetched 17.1 MB in 3s (4842 kB/s)
Selecting previously unselected package libopts25.
(Reading database ... 27688 files and directories currently installed.)
Unpacking libopts25 (from .../libopts25_1%3a5.17.1-1ubuntu2_amd64.deb) ...
Selecting previously unselected package ntp.
Unpacking ntp (from .../ntp_1%3a4.2.6.p5+dfsg-2ubuntu1_amd64.deb) ...
Selecting previously unselected package libjna-java.
Unpacking libjna-java (from .../libjna-java_3.2.7-4_amd64.deb) ...
Selecting previously unselected package python2.7-minimal.
Unpacking python2.7-minimal (from
.../python2.7-minimal_2.7.4-2ubuntu3.2_amd64.deb) ...
Selecting previously unselected package python2.7.
Unpacking python2.7 (from .../python2.7_2.7.4-2ubuntu3.2_amd64.deb) ...
Selecting previously unselected package python-minimal.
Unpacking python-minimal (from .../python-minimal_2.7.4-0ubuntu1_amd64.deb) ...
Selecting previously unselected package python.
Unpacking python (from .../python_2.7.4-0ubuntu1_amd64.deb) ...
Selecting previously unselected package python-support.
Unpacking python-support (from .../python-support_1.0.15_all.deb) ...
Selecting previously unselected package cassandra.
Unpacking cassandra (from .../cassandra_2.0.5_all.deb) ...
Selecting previously unselected package dsc20.
Unpacking dsc20 (from .../archives/dsc20_2.0.5-1_all.deb) ...
Processing triggers for man-db ...
Processing triggers for desktop-file-utils ...
Setting up libopts25 (1:5.17.1-1ubuntu2) ...
Setting up ntp (1:4.2.6.p5+dfsg-2ubuntu1) ...
 * Starting NTP server ntpd
 [ OK ]
Setting up libjna-java (3.2.7-4) ...
Setting up python2.7-minimal (2.7.4-2ubuntu3.2) ...
# Empty sitecustomize.py to avoid a dangling symlink
Traceback (most recent call last):
  File "/usr/lib/python2.7/py_compile.py", line 170, in <module>
    sys.exit(main())
  File "/usr/lib/python2.7/py_compile.py", line 162, in main
    compile(filename, doraise=True)
  File "/usr/lib/python2.7/py_compile.py", line 106, in compile
    with open(file, 'U') as f:
IOError: [Errno 2] No such file or directory:
'/usr/lib/python2.7/sitecustomize.py'
dpkg: error processing python2.7-minimal (--configure):
 subprocess installed post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of python2.7:
 python2.7 depends on python2.7-minimal (= 2.7.4-2ubuntu3.2); however:
  Package python2.7-minimal is not configured yet.

dpkg: error processing python2.7 (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of python-minimal:
 python-minimal depends on python2.7-minimal (= 2.7.4-1~); however:
  Package python2.7-minimal is not configured yet.

dpkg: error processing python-minimal (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of python:
 python depends on python2.7 (= 2.7.4-1~); however:
  Package python2.7 is not configured yet.
 python depends on python-minimal (= 2.7.4-0ubuntu1); however:
  Package python-minimal is not configured yet.

dpkg: error processing python (--configure):
 dependency problems 

Re: Cassandra DSC installation fail due to some python dependecies. How to rectify ?

2014-02-17 Thread Al Tobey
This is the root cause:

IOError: [Errno 2] No such file or directory:
'/usr/lib/python2.7/sitecustomize.py'

Off the top of my head, you may be able to work around this packaging issue
in Ubuntu with:

sudo touch /usr/lib/python2.7/sitecustomize.py
sudo apt-get -f install

Then re-run your original apt command.

-Al Tobey
Open Source Mechanic
Datastax



On Mon, Feb 17, 2014 at 3:21 PM, Ertio Lew ertio...@gmail.com wrote:

 I am trying to install Cassandra dsc20 but the installation fails due to
 some Python dependencies. How can I make this work?


 root@server1:~# sudo apt-get install dsc20

 [apt output snipped; see the original message above]

Re: Cass 1.2.11 : java.lang.AssertionError: originally calculated column size

2014-02-17 Thread Peter Sanford
The issue you should look at is CASSANDRA-4206.

This is apparently fixed in 2.0, so upgrading is one option. If you are
not ready to upgrade to 2.0, you can try increasing
in_memory_compaction_limit_in_mb.

We were hitting this exception on one of our nodes, and increasing
in_memory_compaction_limit_in_mb did fix it.

-psanford
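
For concreteness, the knob referred to above lives in cassandra.yaml; a
hedged example (the value is illustrative; size it above the row that
trips the assertion, roughly 85 MB in the log above):

    # rows larger than this limit are compacted in the slower two-pass
    # ("lazy") path, which is where this assertion is thrown
    in_memory_compaction_limit_in_mb: 128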


On Mon, Feb 17, 2014 at 4:53 AM, Oleg Dulin oleg.du...@gmail.com wrote:

 Bumping this up -- anything? Anyone?


 On 2014-02-13 16:01:50 +0000, Oleg Dulin said:

  I am getting these exceptions on one of the nodes, quite often, during
 compactions:

 java.lang.AssertionError: originally calculated column size of 84562492
 but now it is 84562600

 Usually this is on the same column family.

 I believe this is preventing compactions from completing, and
 subsequently causing other performance issues for me.

 Is there a way to fix this? Would nodetool scrub take care of it?



 --
 Regards,
 Oleg Dulin
 http://www.olegdulin.com





GCInspector GC for ConcurrentMarkSweep running every 15 seconds

2014-02-17 Thread John Pyeatt
I have a 6-node cluster running on AWS. We are using m1.large instances
with the heap size set to 3G.

5 of the 6 nodes seem quite healthy. The 6th one, however, is logging
GCInspector GC for ConcurrentMarkSweep every 15 seconds or so. There is
nothing going on on this box: no repairs and almost no user activity. But
the CPU is almost continuously at 50% or more.

The only messages in the log at all are the GCInspector lines:
 INFO 2014-02-17 22:58:53,429 [ScheduledTasks:1] GCInspector GC for
ConcurrentMarkSweep: 213 ms for 1 collections, 1964940024 used; max is
3200253952
 INFO 2014-02-17 22:59:07,431 [ScheduledTasks:1] GCInspector GC for
ConcurrentMarkSweep: 250 ms for 1 collections, 1983269488 used; max is
3200253952
 INFO 2014-02-17 22:59:21,522 [ScheduledTasks:1] GCInspector GC for
ConcurrentMarkSweep: 280 ms for 1 collections, 1998214480 used; max is
3200253952
 INFO 2014-02-17 22:59:36,527 [ScheduledTasks:1] GCInspector GC for
ConcurrentMarkSweep: 305 ms for 1 collections, 2013065592 used; max is
3200253952
 INFO 2014-02-17 22:59:50,529 [ScheduledTasks:1] GCInspector GC for
ConcurrentMarkSweep: 334 ms for 1 collections, 2028069232 used; max is
3200253952

We don't see any of these messages on the other nodes in the cluster.

We are seeing similar behaviour for both our production and QA clusters.
Production is running cassandra 1.2.9 and QA is running 1.2.13.

Here are some of the cassandra settings that I would think might be
relevant.

flush_largest_memtables_at: 0.75
reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6
in_memory_compaction_limit_in_mb: 64

Does anyone have any ideas why we are seeing this so selectively on one box?

Any cures???
-- 
John Pyeatt
Singlewire Software, LLC
www.singlewire.com
--
608.661.1184
john.pye...@singlewire.com
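
A few non-invasive checks that may help narrow down what is filling the
old generation on that one node (the process lookup is an assumption;
adjust to however you find the Cassandra PID):

    # watch old-gen occupancy and GC counts, sampling every 5s
    jstat -gcutil $(pgrep -f CassandraDaemon) 5000
    # look for backed-up or blocked stages on the odd node
    nodetool tpstats
    # compare per-CF heap consumers (bloom filters, memtables) across nodes
    nodetool cfstats

Comparing bloom filter and memtable sizes against a healthy node often
shows whether the heap pressure is data-driven (e.g. uneven token or
request distribution) or node-local.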


Turn off compression (1.2.11)

2014-02-17 Thread Plotnik, Alexey
Each compressed SSTable uses an additional transfer buffer in its
CompressedRandomAccessReader instance. After analyzing the heap, I saw
this buffer is about 70KB per SSTable, and I have more than 30K SSTables
per node.

I want to turn off compression for this column family to save some heap.
How can I do it safely? I mean, after that, should the
SCRUB/UPGRADESSTABLES process be started?

The next question is how these processes work in terms of disk space.
CLEANUP is safe: it takes one SSTable, analyzes its keys, builds a new
one, replaces the existing one and moves on. Do SCRUB/UPGRADESSTABLES work
in the same manner? My concern is the disk space usage overhead.
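
As a hedged sketch (the keyspace/table names are placeholders): in 1.2,
compression can be disabled per column family, after which upgradesstables
rewrites the data files. As far as I know it, like cleanup, rewrites one
sstable at a time, so the transient extra disk usage is roughly the size
of the sstable currently being rewritten rather than the whole column
family:

    ALTER TABLE ks.cf WITH compression = {'sstable_compression': ''};
    -- then, from the shell:  nodetool upgradesstables ks cf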


Getting different results each time after inserting data into cassandra using lightweight transaction

2014-02-17 Thread Jim Xu

Hi all,

The cluster has 5 nodes, one keyspace (RF=3), a table (named t1) and a
column family (named cf1, which has two fixed columns per row).
Single-threaded, two test programs insert data using lightweight
transactions.


The pseudocode is as follows:

program 1:

    ...
    for (int i = 1; i < N; i++) {   // N is a placeholder; the original loop bound is unreadable here
        ...
        client.execute_cql3_query("insert into t1(id,name) values(" + i + ",'jane') if not exists", ...);
    }

program 2:

    ...
    List<Column> expected = new ArrayList<Column>();
    for (int i = 1; i < N; i++) {
        List<Column> updates = new ArrayList<Column>();
        updates.add(...);
        ...
        client.cas(ByteBufferUtil.bytes(i), "cf1", expected, updates,
                   ConsistencyLevel.SERIAL, ConsistencyLevel.QUORUM);
    }

Programs 1 and 2 are executed synchronously. If there is no exception, the
data is checked from cqlsh using the following queries:

cqlsh> select count(*) from t1 limit 1;
cqlsh> select count(*) from cf1 limit 1;

However, the results of the two queries are different each time I execute
them, and every node is UP according to nodetool status.


At the beginning of a suite of tests, we didn't find any problem. The case
to notice is that the scenario described above occurs in a test where the
former table and column family were dropped and then newly created.



I'm using Cassandra 2.0.3.

Any insights would be greatly appreciated.

Thanks.
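
Two hedged things that may be worth ruling out (assumptions on my part,
since the numbers in this message are garbled in the archive): CQL3 SELECT
applies an implicit LIMIT of 10000, so a count without an explicit higher
LIMIT is capped; and cqlsh reads at consistency ONE by default, which can
trail QUORUM/Paxos writes. For example:

    cqlsh> CONSISTENCY QUORUM;                    -- if your cqlsh supports the CONSISTENCY command
    cqlsh> SELECT count(*) FROM t1 LIMIT 1000000; -- explicit LIMIT so the count is not capped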