Re: Exception in cassandra logs while processing the message
Hello, does anyone have any idea about this exception?

Regards,
Ankit Tyagi

On Fri, Feb 14, 2014 at 7:02 PM, ankit tyagi ankittyagi.mn...@gmail.com wrote:

Hello,

I am seeing the exception below in my Cassandra logs (/var/log/cassandra/system.log):

INFO [ScheduledTasks:1] 2014-02-13 13:13:57,641 GCInspector.java (line 119) GC for ParNew: 273 ms for 1 collections, 2319121816 used; max is 4456448000
INFO [ScheduledTasks:1] 2014-02-13 13:14:02,695 GCInspector.java (line 119) GC for ParNew: 214 ms for 1 collections, 2315368976 used; max is 4456448000
INFO [OptionalTasks:1] 2014-02-13 13:14:08,093 MeteredFlusher.java (line 64) flushing high-traffic column family CFS(Keyspace='comsdb', ColumnFamily='product_update') (estimated 213624220 bytes)
INFO [OptionalTasks:1] 2014-02-13 13:14:08,093 ColumnFamilyStore.java (line 626) Enqueuing flush of Memtable-product_update@1067619242(31239028/213625108 serialized/live bytes, 222393 ops)
INFO [FlushWriter:94] 2014-02-13 13:14:08,127 Memtable.java (line 400) Writing Memtable-product_update@1067619242(31239028/213625108 serialized/live bytes, 222393 ops)
INFO [ScheduledTasks:1] 2014-02-13 13:14:08,696 GCInspector.java (line 119) GC for ParNew: 214 ms for 1 collections, 2480175160 used; max is 4456448000
INFO [FlushWriter:94] 2014-02-13 13:14:10,836 Memtable.java (line 438) Completed flushing /cassandra1/data/comsdb/product_update/comsdb-product_update-ic-416-Data.db (15707248 bytes) for commitlog position ReplayPosition(segmentId=1391568233618, position=13712751)
ERROR [Thrift:13] 2014-02-13 13:15:45,694 CustomTThreadPoolServer.java (line 213) Thrift error occurred during processing of message.
org.apache.thrift.TException: Negative length: -2147418111
    at org.apache.thrift.protocol.TBinaryProtocol.checkReadLength(TBinaryProtocol.java:388)
    at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_args.read(Cassandra.java:20304)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:21)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:679)
ERROR [Thrift:103] 2014-02-13 13:21:25,719 CustomTThreadPoolServer.java (line 213) Thrift error occurred during processing of message.
org.apache.thrift.TException: Negative length: -2147418111

Below are the Cassandra and Hector client versions currently in use:

Cassandra version: 1.2.11
Hector client: 1.0-2

Any lead would be appreciated. We are planning to move to Cassandra 2.0 with the Java driver, but that may take some time; meanwhile we need to find the root cause and resolve this issue.

Regards,
Ankit Tyagi
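As an aside (this observation is mine, not from the thread): the "negative length" is suspiciously structured. Reinterpreted as an unsigned 32-bit value, -2147418111 is 0x80010001, which is exactly the strict TBinaryProtocol message-begin header (VERSION_1 = 0x80010000 ORed with TMessageType.CALL = 1). A minimal sketch of the arithmetic:

```python
import struct

# The value the server reported when it tried to read a 4-byte binary length.
reported_length = -2147418111

# The same 4 bytes viewed as an unsigned big-endian integer.
as_unsigned = reported_length & 0xFFFFFFFF
print(hex(as_unsigned))  # 0x80010001

# Thrift's strict TBinaryProtocol writes a message-begin header of
# VERSION_1 | message_type, where VERSION_1 is 0x80010000 and CALL is 1.
VERSION_1, CALL = 0x80010000, 1
assert as_unsigned == VERSION_1 | CALL

# Equivalently: the 4 header bytes of a CALL message, read as a signed i32,
# produce exactly the reported "negative length".
header_bytes = struct.pack(">I", VERSION_1 | CALL)
(as_signed,) = struct.unpack(">i", header_bytes)
print(as_signed)  # -2147418111
```

If that reading is right, the server hit the start of a new Thrift message at a point in the stream where it expected a field length, i.e. the connection had become desynchronized (for example through a framing/transport mismatch or a connection reused after a partial write), rather than the client deliberately sending a negative length.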
Re: Exception in cassandra logs while processing the message
Looks like a Thrift interoperability issue: perhaps the column family or its data was created via CQL3 and is being read through the Thrift-based API. Otherwise, recreate your schema and try again.

-Vivek
Re: Exception in cassandra logs while processing the message
Hi,

I am using Hector client 1.0-2 to insert the data. The problem is that I am not seeing any exception in my application logs where I insert data through Hector; it looks like something internal to Cassandra.

Regards,
Ankit Tyagi
Re: Exception in cassandra logs while processing the message
That looks like a Thrift error. My best bet would be that there is a version incompatibility between the Thrift library used by your Hector version and the one used by your Cassandra version. All I can tell you is that Cassandra 1.2 uses libthrift 0.7.0; I am not sure what Hector client 1.0-2 uses, however.
Re: Intermittent long application pauses on nodes
Hi all,

we are seeing the same kind of long pauses in Cassandra. We tried to switch from CMS to G1, without positive results. The stress test is read-heavy: 2 datacenters, 6 nodes, 400 reqs/sec on one datacenter. We see spikes in latency at the 99.99th percentile and higher, caused by threads being stopped in the JVM. The GC in G1 looks like this:

{Heap before GC invocations=4073 (full 1):
 garbage-first heap   total 8388608K, used 3602914K [0x0005f5c0, 0x0007f5c0, 0x0007f5c0)
  region size 4096K, 142 young (581632K), 11 survivors (45056K)
 compacting perm gen  total 28672K, used 27428K [0x0007f5c0, 0x0007f780, 0x0008)
   the space 28672K, 95% used [0x0007f5c0, 0x0007f76c9108, 0x0007f76c9200, 0x0007f780)
No shared spaces configured.
2014-02-17T04:44:16.385+0100: 222346.218: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 37748736 bytes, new threshold 15 (max 15)
- age   1:   17213632 bytes,   17213632 total
- age   2:   19391208 bytes,   36604840 total
, 0.1664300 secs]
   [Parallel Time: 163.9 ms, GC Workers: 2]
      [GC Worker Start (ms): Min: 222346218.3, Avg: 222346218.3, Max: 222346218.3, Diff: 0.0]
      [Ext Root Scanning (ms): Min: 6.0, Avg: 6.9, Max: 7.7, Diff: 1.7, Sum: 13.7]
      [Update RS (ms): Min: 20.4, Avg: 21.3, Max: 22.1, Diff: 1.7, Sum: 42.6]
         [Processed Buffers: Min: 49, Avg: 60.0, Max: 71, Diff: 22, Sum: 120]
      [Scan RS (ms): Min: 23.2, Avg: 23.2, Max: 23.3, Diff: 0.1, Sum: 46.5]
      [Object Copy (ms): Min: 112.3, Avg: 112.3, Max: 112.4, Diff: 0.1, Sum: 224.6]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.1]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 163.8, Avg: 163.8, Max: 163.8, Diff: 0.0, Sum: 327.6]
      [GC Worker End (ms): Min: 222346382.1, Avg: 222346382.1, Max: 222346382.1, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.4 ms]
   [Other: 2.1 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 1.1 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.4 ms]
   [Eden: 524.0M(524.0M)->0.0B(476.0M) Survivors: 44.0M->68.0M Heap: 3518.5M(8192.0M)->3018.5M(8192.0M)]
Heap after GC invocations=4074 (full 1):
 garbage-first heap   total 8388608K, used 3090914K [0x0005f5c0, 0x0007f5c0, 0x0007f5c0)
  region size 4096K, 17 young (69632K), 17 survivors (69632K)
 compacting perm gen  total 28672K, used 27428K [0x0007f5c0, 0x0007f780, 0x0008)
   the space 28672K, 95% used [0x0005f5c0, 0x0007f76c9108, 0x0007f76c9200, 0x0007f780)
No shared spaces configured.
}
[Times: user=0.35 sys=0.00, real=27.58 secs]
222346.219: G1IncCollectionPause [ 111 0 0] [ 0 0 0 0 27586] 0

The total time for which application threads were stopped is 27.58 seconds. CMS behaves in a similar manner. We thought it would be GC waiting for mmapped files being read from disk (a thread cannot reach a safepoint during that operation), but it doesn't explain the huge time. We'll try jHiccup to see if it provides any additional information.

The test was done in a mixed AWS/OpenStack environment, OpenJDK 1.7.0_45, Cassandra 1.2.11. Upgrading to 2.0.x is not an option for us.

regards,
ondrej cernos

On Fri, Feb 14, 2014 at 8:53 PM, Frank Ng fnt...@gmail.com wrote:
Sorry, I have not had a chance to file a JIRA ticket. We have not been able to resolve the issue, but since Joel mentioned that upgrading to Cassandra 2.0.x solved it for them, we may need to upgrade. We are currently on Java 1.7 and Cassandra 1.2.8.

On Thu, Feb 13, 2014 at 12:40 PM, Keith Wright kwri...@nanigans.com wrote:
You're running 2.0.* in production? May I ask what C* version and OS? Any hardware details would be appreciated as well. Thx!

From: Joel Samuelsson samuelsson.j...@gmail.com
Date: Thursday, February 13, 2014 at 11:39 AM
To: user@cassandra.apache.org
Subject: Re: Intermittent long application pauses on nodes

We have had similar issues, and upgrading C* to 2.0.x and Java to 1.7 seems to have helped.

2014-02-13 Keith Wright kwri...@nanigans.com:
Frank, did you ever file a ticket for this issue or find the root cause? I believe we are seeing the same issues when attempting to bootstrap. Thanks

From: Robert Coli rc...@eventbrite.com
Date: Monday, February 3, 2014 at 6:10 PM
To: user@cassandra.apache.org
Subject: Re: Intermittent long application pauses on nodes

On Mon, Feb 3, 2014 at 8:52 AM, Benedict Elliott Smith belliottsm...@datastax.com wrote:
It's possible that this is a JVM issue, but if so there may be some remedial action we can take anyway. There are some more flags we should add, but we can discuss that once you open a
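A quick heuristic for GC log lines like `[Times: user=0.35 sys=0.00, real=27.58 secs]` above: when `real` vastly exceeds `user + sys`, the JVM was stopped but barely doing any CPU work, which points away from GC itself and toward something external such as paging, an overcommitted hypervisor, or a slow time-to-safepoint. A small sketch of that check (the regex and threshold are my own, not from any standard tool):

```python
import re

# Matches HotSpot's "[Times: user=U sys=S, real=R secs]" summary line.
TIMES_RE = re.compile(
    r"\[Times:\s*user=(?P<user>[\d.]+)\s*sys=(?P<sys>[\d.]+),\s*real=(?P<real>[\d.]+)\s*secs\]"
)

def suspicious_pause(line, factor=5.0):
    """Return (user, sys, real, flagged) for a GC [Times: ...] line.

    flagged is True when wall-clock time dwarfs CPU time, i.e. the pause
    was mostly spent waiting rather than collecting.
    """
    m = TIMES_RE.search(line)
    if not m:
        return None
    user, sys_, real = (float(m.group(g)) for g in ("user", "sys", "real"))
    flagged = real > factor * max(user + sys_, 0.01)
    return user, sys_, real, flagged

line = "[Times: user=0.35 sys=0.00, real=27.58 secs]"
print(suspicious_pause(line))  # (0.35, 0.0, 27.58, True)
```

Applied to the pause quoted above, 27.58s of wall clock against 0.35s of CPU time strongly suggests the collector was stalled, not busy.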
Re: Intermittent long application pauses on nodes
Ondrej,

It seems like your issue is much less difficult to diagnose: your collection times are long. At least, the pause you printed the time for is all attributable to the G1 pause.

Note that G1 has not generally performed well with Cassandra in our testing. There are a number of changes going in soon that may change that, but for the time being it is advisable to stick with CMS. With tuning you can no doubt bring your pauses down considerably.
Re: Exception in cassandra logs while processing the message
Hi Sylvain,

Hector Core 1.0-2 uses libthrift 0.6.1, but this exception is not consistent; we get it intermittently. If there were a compatibility issue with the Thrift jar, shouldn't the error be consistent?

Regards,
Ankit Tyagi
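Intermittency does not by itself rule out a protocol-level problem: a single desynchronized connection in a pool can fail occasionally while the other connections keep working. A toy sketch (illustrative only, not Hector or Cassandra code) of how one message whose declared length disagrees with its actual bytes leaves the stream misaligned, so a later read lands on leftover bytes and reports a negative length:

```python
import io
import struct

def write_msg(buf, payload, declared_len=None):
    """Write a length-prefixed message; declared_len lets us simulate a bug."""
    n = len(payload) if declared_len is None else declared_len
    buf.write(struct.pack(">i", n))
    buf.write(payload)

def read_msg(buf):
    """Read one length-prefixed message, as a strict binary reader would."""
    (n,) = struct.unpack(">i", buf.read(4))
    if n < 0:
        raise ValueError(f"Negative length: {n}")
    return buf.read(n)

buf = io.BytesIO()
write_msg(buf, b"hello")  # well-formed message
# Buggy writer: declares 4 bytes but writes 8; the trailing 4 bytes happen
# to be a Thrift-style message-begin header, 0x80010001.
write_msg(buf, b"zzzz\x80\x01\x00\x01", declared_len=4)
write_msg(buf, b"world")  # well-formed, but now unreachable as such
buf.seek(0)

print(read_msg(buf))  # b'hello' - fine
print(read_msg(buf))  # b'zzzz' - silently truncated; stream is now desynced
try:
    read_msg(buf)     # next "length" is read from the leftover header bytes
except ValueError as e:
    print(e)          # Negative length: -2147418111
```

Only traffic that goes through the one broken connection fails, which would look exactly like an intermittent server-side error even though the client libraries are the actual mismatch.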
Re: Intermittent long application pauses on nodes
Hi,

we tried to switch to G1 because we observed this behaviour on CMS too (and a 27-second pause on G1 is quite an argument against using it). The pauses with CMS were not easily traceable: the JVM stopped even without a stop-the-world pause scheduled (defragmentation, remarking). We thought the time to reach a safepoint might be involved (we saw threads waiting for safepoint resolution), especially because access to mmapped files is not preemptible, afaik; but that doesn't explain tens of seconds of waiting, since even slow IO should read our sstables into memory in much less time. We switched to G1 out of desperation, and to try different code paths, not because we thought it was a great idea. So I think we were hit by the problem discussed in this thread; the G1 report just wasn't very clear, sorry.

regards,
ondrej
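For what it's worth, the safepoint statistics line quoted earlier in the thread (`222346.219: G1IncCollectionPause [ 111 0 0] [ 0 0 0 0 27586] 0`) can itself separate the two hypotheses. Assuming the usual HotSpot `-XX:+PrintSafepointStatistics` column order (threads: total / initially running / waiting to block, then times in ms: spin / block / sync / cleanup / vmop), essentially all 27.586 seconds were spent inside the VM operation, not in bringing threads to the safepoint. A rough parser under that assumption:

```python
import re

def parse_safepoint(line):
    """Parse one -XX:+PrintSafepointStatistics line (assumed HotSpot format):
       <time>: <vmop> [ total initially_running wait_to_block] [ spin block sync cleanup vmop] <page_traps>
    """
    m = re.search(r"(\S+):\s+(\S+)\s+\[([\d\s]+)\]\s+\[([\d\s]+)\]\s+(\d+)", line)
    if not m:
        return None
    spin, block, sync, cleanup, vmop = (int(x) for x in m.group(4).split())
    return {
        "vmop_name": m.group(2),
        "spin": spin,
        "block": block,
        "sync": sync,        # ~ time to bring all threads to the safepoint
        "cleanup": cleanup,
        "vmop": vmop,        # time spent in the operation while stopped
    }

line = "222346.219: G1IncCollectionPause [ 111 0 0] [ 0 0 0 0 27586] 0"
print(parse_safepoint(line))
```

Here sync is 0 and vmop is 27586, so the threads stopped promptly; the stall happened during the collection itself, which is more consistent with paging or starvation during GC than with mmapped reads delaying the safepoint.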
Re: Intermittent long application pauses on nodes
Hi Ondrej, It's possible you were hit by the problems in this thread before, but it looks potentially like you may have other issues. Of course it may be that on G1 you have one issue and CMS another, but 27s is extreme even for G1, so it seems unlikely. If you're hitting these pause times in CMS and you get some more output from the safepoint tracing, please do contribute, as I would love to get to the bottom of that. However, is it possible you're experiencing paging activity? Have you made certain the VM memory is locked (and preferably that paging is entirely disabled)? The bloom filters and other memory won't be locked, although that shouldn't cause pauses during GC. Note that mmapped file accesses and other native work shouldn't in any way inhibit GC activity or other safepoint pause times, unless there's a bug in the VM. These threads will simply enter a safepoint as they return to the VM execution context, and are considered safe for the duration they are outside. On 17 February 2014 12:30, Ondřej Černoš cern...@gmail.com wrote: Hi, we tried to switch to G1 because we observed this behaviour on CMS too (a 27-second pause in G1 is quite an argument against using it). Pauses with CMS were not easily traceable - the JVM stopped even without a stop-the-world pause scheduled (defragmentation, remarking). We thought the go-to-safepoint waiting time might have been involved (we saw waiting for safepoint resolution) - especially because access to mmapped files is not preemptive, afaik - but it doesn't explain tens of seconds of waiting. We switched to G1 out of desperation - and to try different code paths - not that we'd thought it was a great idea. So I think we were hit by the problem discussed in this thread, just the G1 report wasn't very clear, sorry.
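The safepoint tracing mentioned above can be enabled with stock HotSpot flags. A sketch, assuming the options are appended to JVM_OPTS in cassandra-env.sh as in a default install (flag names are standard OpenJDK 7 HotSpot options):

```shell
# Log how long application threads were stopped at each safepoint
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
# Print per-safepoint statistics (spin/block/sync/vmop times) after every safepoint
JVM_OPTS="$JVM_OPTS -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1"
```

With these in place, a long pause that is *not* attributable to the GC phase itself shows up as time spent in the "sync" column (threads struggling to reach the safepoint), which is exactly the symptom being discussed.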
regards, ondrej On Mon, Feb 17, 2014 at 11:45 AM, Benedict Elliott Smith belliottsm...@datastax.com wrote: Ondrej, It seems like your issue is much less difficult to diagnose: your collection times are long. At least, the pause you printed the time for is all attributable to the G1 pause. Note that G1 has not generally performed well with Cassandra in our testing. There are a number of changes going in soon that may change that, but for the time being it is advisable to stick with CMS. With tuning you can no doubt bring your pauses down considerably. On 17 February 2014 10:17, Ondřej Černoš cern...@gmail.com wrote: Hi all, we are seeing the same kind of long pauses in Cassandra. We tried to switch from CMS to G1 without positive result. The stress test is read heavy, 2 datacenters, 6 nodes, 400 reqs/sec on one datacenter. We see spikes in latency on the 99.99th percentile and higher, caused by threads being stopped in the JVM. The GC in G1 looks like this: [GC log snipped; quoted in full earlier in the thread]
Re: Cass 1.2.11 : java.lang.AssertionError: originally calculated column size
Bumping this up -- anything? anyone? On 2014-02-13 16:01:50 +, Oleg Dulin said: I am getting these exceptions on one of the nodes, quite often, during compactions: java.lang.AssertionError: originally calculated column size of 84562492 but now it is 84562600 Usually this is on the same column family. I believe this is preventing compactions from completing, and subsequently causing other performance issues for me. Is there a way to fix that? Would nodetool scrub take care of this? -- Regards, Oleg Dulin http://www.olegdulin.com
Failed to decode value
Hi, When I do a CQL3 select * on TABLE1 (described below), I see rows displayed but get this error at the end: Failed to decode value '9131146' (for column 'accountId') as bigint: unpack requires a string argument of length 8. Cassandra version 1.2.5. accountId is of type text. Why would it be looking for a bigint? Thanks Jay

CREATE TABLE TABLE1 (
  key text PRIMARY KEY,
  accountId text,
  balanceImpactCode text,
  billSequenceNumber text,
  creditCreationDate text,
  creditLevelCode text,
  creditNetAmount double,
  creditReason text,
  eventId text,
  productId text,
  productType text,
  reversalReason text,
  reversalTransId text,
  totalTaxAmount double,
  transactionId text,
  updateDate text
) WITH COMPACT STORAGE
  AND bloom_filter_fp_chance=0.01
  AND caching='KEYS_ONLY'
  AND comment=''
  AND dclocal_read_repair_chance=0.00
  AND gc_grace_seconds=864000
  AND read_repair_chance=0.10
  AND replicate_on_write='true'
  AND populate_io_cache_on_flush='false'
  AND compaction={'class': 'SizeTieredCompactionStrategy'}
  AND compression={'sstable_compression': 'SnappyCompressor'};

CREATE INDEX indx_accountId ON EnablerAccountCreditChargeInfo (accountId);
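The "unpack requires a string argument of length 8" text is Python's struct module erroring inside cqlsh: the client is decoding the raw cell bytes as an 8-byte big-endian bigint, but the cell holds the 7 ASCII bytes of the text '9131146'. A minimal reproduction of just the decode step, outside cqlsh (standard library only; the bytes are taken from the error message above):

```python
import struct

# The cell actually stores the 7 ASCII bytes of the text value '9131146'.
raw = b'9131146'
assert len(raw) == 7

# cqlsh, believing the column is a bigint, tries to unpack 8 bytes:
try:
    struct.unpack('>q', raw)  # '>q' = big-endian signed 64-bit integer
except struct.error as e:
    print('decode failed:', e)

# A genuine bigint cell would be exactly 8 bytes:
print(struct.unpack('>q', struct.pack('>q', 9131146))[0])  # prints 9131146
```

So the question reduces to why the client-side metadata says bigint when the schema declares text; that points at a schema/metadata mismatch (for the table or its index) rather than corrupt data, though the root cause here is a guess.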
Re: Expired column showing up
Christian, Yes. Is it a problem? Can you explain what happens in this scenario? Thanks Mahesh On Fri, Feb 14, 2014 at 3:07 PM, horschi hors...@gmail.com wrote: Hi Mahesh, is it possible you are creating columns with a long TTL, then update these columns with a smaller TTL? kind regards, Christian On Fri, Feb 14, 2014 at 3:45 PM, mahesh rajamani rajamani.mah...@gmail.com wrote: Hi, I am using Cassandra 2.0.2 version. On a wide row (approx. 1 columns), I expire a few columns by setting TTL to 1 second. At times these columns show up during slice queries. When I have this issue, running count and get commands for that row using the Cassandra cli gives different column counts. But once I run flush and compact, the issue goes away and expired columns don't show up. Can someone provide some help on this issue? -- Regards, Mahesh Rajamani -- Regards, Mahesh Rajamani
Re: Expired column showing up
Hi Mahesh, the problem is that every column is only tombstoned for as long as the original column was valid. So if the last update was only valid for 1 sec, then the tombstone will also be valid for 1 second! If the previous value was valid for a longer time, then this old value might reappear. Maybe you can explain why you are doing this? kind regards, Christian [earlier messages in this thread snipped]
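Christian's explanation can be sketched as a toy reconciliation model. This is NOT Cassandra's actual code, just an illustration of the resurrection scenario under the stated assumption that an expired short-TTL overwrite shadows an older long-TTL value only while its tombstone survives compaction:

```python
# Toy model of read reconciliation across SSTable cells.
def reconcile(cells, now):
    """Newest surviving cell by timestamp wins; a cell past its TTL
    behaves as a tombstone (shadows older data, returns nothing)."""
    live = [c for c in cells if not c.get('purged')]
    if not live:
        return None
    newest = max(live, key=lambda c: c['ts'])
    return newest['value'] if now < newest['expires'] else None

long_ttl  = {'value': 'v1', 'ts': 0,  'expires': 86400}  # original write, 1-day TTL
short_ttl = {'value': 'v2', 'ts': 10, 'expires': 11}     # overwrite with TTL=1s

cells = [long_ttl, short_ttl]
print(reconcile(cells, now=10.5))  # v2   (overwrite still live)
print(reconcile(cells, now=100))   # None (v2 expired -> tombstone shadows v1)

short_ttl['purged'] = True         # compaction drops the short-lived tombstone
print(reconcile(cells, now=100))   # v1   (old value reappears)
```

The last step is the reported symptom: once the short-lived tombstone is gone, the long-TTL value in an older SSTable is live again until its own TTL passes.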
cold_reads_to_omit and tombstones
Hi, The doc on cold_reads_to_omit isn't clear to me. When cold_reads_to_omit makes compaction ignore SSTables with a low read rate, what would GC the tombstones in those cold SSTables? Do we rely on tombstone compactions? I may have misconfigured those - I've never seen a tombstone compaction get kicked off. Cassandra version: 2.0.5 David
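For reference, both knobs being asked about are per-table compaction subproperties in 2.0, settable via CQL. A hedged sketch (keyspace/table names are placeholders and the values are illustrative, not recommendations):

```sql
ALTER TABLE ks.cf WITH compaction = {
    'class': 'SizeTieredCompactionStrategy',
    'cold_reads_to_omit': 0.05,              -- fraction of coldest SSTables compaction may skip
    'tombstone_threshold': 0.2,              -- droppable-tombstone ratio triggering single-SSTable compaction
    'tombstone_compaction_interval': 86400   -- minimum SSTable age (seconds) before tombstone compaction
};
```

Note that a tombstone compaction only fires when the *estimated* droppable-tombstone ratio of an SSTable crosses the threshold and the table is older than the interval, which is one reason it can appear to never kick off.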
Re: Expired column showing up
Christian, There are 2 use cases which are failing, and both look to be a similar issue; both happen in column families set with TTL. Case 1) I manage an index for specific data as a single row in a column family. I set the TTL to 1 second if the data needs to be removed from the index row. In some scenarios, get and count for the row key give different column counts. In the application, if I do a get I receive the correct set of columns (expired columns don't return), but if I do a slice query and read 100 columns at a time, the columns set with TTL return. I am not able to understand what is triggering this issue. Case 2) I have a column family for managing locks. In this case I insert a column with a default TTL of 15 seconds. If the transaction completes earlier, I remove the column by again setting the TTL to 1 second. In this case, running flush hangs with the following AssertionError:
ERROR [FlushWriter:1] 2014-02-17 11:49:29,349 CassandraDaemon.java (line 187) Exception in thread Thread[FlushWriter:1,5,main]
java.lang.AssertionError
        at org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:198)
        at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:186)
        at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:360)
        at org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:315)
        at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)
Thanks Mahesh On Mon, Feb 17, 2014 at 12:43 PM, horschi hors...@gmail.com wrote: Hi Mahesh, the problem is that every column is only tombstoned for as long as the original column was valid.
So if the last update was only valid for 1 sec, then the tombstone will also be valid for 1 second! If the previous was valid for a longer time, then this old value might reappear. Maybe you can explain why you are doing this? kind regards, Christian [earlier messages in this thread snipped] -- Regards, Mahesh Rajamani
Cassandra DSC installation fails due to some python dependencies. How to rectify?
I am trying to install cassandra dsc20 but the installation fails due to some python dependencies. How could I make this work?
root@server1:~# sudo apt-get install dsc20
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed: cassandra libjna-java libopts25 ntp python python-minimal python-support python2.7 python2.7-minimal
Suggested packages: libjna-java-doc ntp-doc apparmor python-doc python-tk python2.7-doc binutils binfmt-support
Recommended packages: perl
The following NEW packages will be installed: cassandra dsc20 libjna-java libopts25 ntp python python-minimal python-support python2.7 python2.7-minimal
0 upgraded, 10 newly installed, 0 to remove and 0 not upgraded.
Need to get 17.1 MB of archives. After this operation, 23.2 MB of additional disk space will be used.
Do you want to continue [Y/n]? y
Get:1 http://debian.datastax.com/community/ stable/main cassandra all 2.0.5 [14.3 MB]
Get:2 http://us.archive.ubuntu.com/ubuntu/ raring/main libopts25 amd64 1:5.17.1-1ubuntu2 [62.2 kB]
Get:3 http://us.archive.ubuntu.com/ubuntu/ raring/main ntp amd64 1:4.2.6.p5+dfsg-2ubuntu1 [614 kB]
Get:4 http://us.archive.ubuntu.com/ubuntu/ raring/universe libjna-java amd64 3.2.7-4 [416 kB]
Get:5 http://us.archive.ubuntu.com/ubuntu/ raring-security/main python2.7-minimal amd64 2.7.4-2ubuntu3.2 [1223 kB]
Get:6 http://debian.datastax.com/community/ stable/main dsc20 all 2.0.5-1 [1302 B]
Get:7 http://us.archive.ubuntu.com/ubuntu/ raring-security/main python2.7 amd64 2.7.4-2ubuntu3.2 [263 kB]
Get:8 http://us.archive.ubuntu.com/ubuntu/ raring/main python-minimal amd64 2.7.4-0ubuntu1 [30.8 kB]
Get:9 http://us.archive.ubuntu.com/ubuntu/ raring/main python amd64 2.7.4-0ubuntu1 [169 kB]
Get:10 http://us.archive.ubuntu.com/ubuntu/ raring/universe python-support all 1.0.15 [26.7 kB]
Fetched 17.1 MB in 3s (4842 kB/s)
Selecting previously unselected package libopts25.
(Reading database ...
27688 files and directories currently installed.)
Unpacking libopts25 (from .../libopts25_1%3a5.17.1-1ubuntu2_amd64.deb) ...
Selecting previously unselected package ntp.
Unpacking ntp (from .../ntp_1%3a4.2.6.p5+dfsg-2ubuntu1_amd64.deb) ...
Selecting previously unselected package libjna-java.
Unpacking libjna-java (from .../libjna-java_3.2.7-4_amd64.deb) ...
Selecting previously unselected package python2.7-minimal.
Unpacking python2.7-minimal (from .../python2.7-minimal_2.7.4-2ubuntu3.2_amd64.deb) ...
Selecting previously unselected package python2.7.
Unpacking python2.7 (from .../python2.7_2.7.4-2ubuntu3.2_amd64.deb) ...
Selecting previously unselected package python-minimal.
Unpacking python-minimal (from .../python-minimal_2.7.4-0ubuntu1_amd64.deb) ...
Selecting previously unselected package python.
Unpacking python (from .../python_2.7.4-0ubuntu1_amd64.deb) ...
Selecting previously unselected package python-support.
Unpacking python-support (from .../python-support_1.0.15_all.deb) ...
Selecting previously unselected package cassandra.
Unpacking cassandra (from .../cassandra_2.0.5_all.deb) ...
Selecting previously unselected package dsc20.
Unpacking dsc20 (from .../archives/dsc20_2.0.5-1_all.deb) ...
Processing triggers for man-db ...
Processing triggers for desktop-file-utils ...
Setting up libopts25 (1:5.17.1-1ubuntu2) ...
Setting up ntp (1:4.2.6.p5+dfsg-2ubuntu1) ...
 * Starting NTP server ntpd [ OK ]
Setting up libjna-java (3.2.7-4) ...
Setting up python2.7-minimal (2.7.4-2ubuntu3.2) ...
# Empty sitecustomize.py to avoid a dangling symlink
Traceback (most recent call last):
  File /usr/lib/python2.7/py_compile.py, line 170, in <module>
    sys.exit(main())
  File /usr/lib/python2.7/py_compile.py, line 162, in main
    compile(filename, doraise=True)
  File /usr/lib/python2.7/py_compile.py, line 106, in compile
    with open(file, 'U') as f:
IOError: [Errno 2] No such file or directory: '/usr/lib/python2.7/sitecustomize.py'
dpkg: error processing python2.7-minimal (--configure):
 subprocess installed post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of python2.7:
 python2.7 depends on python2.7-minimal (= 2.7.4-2ubuntu3.2); however:
  Package python2.7-minimal is not configured yet.
dpkg: error processing python2.7 (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of python-minimal:
 python-minimal depends on python2.7-minimal (>= 2.7.4-1~); however:
  Package python2.7-minimal is not configured yet.
dpkg: error processing python-minimal (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of python:
 python depends on python2.7 (>= 2.7.4-1~); however:
  Package python2.7 is not configured yet.
 python depends on python-minimal (= 2.7.4-0ubuntu1); however:
  Package python-minimal is not configured yet.
dpkg: error processing python (--configure):
 dependency problems
Re: Cassandra DSC installation fails due to some python dependencies. How to rectify?
This is the root cause:
IOError: [Errno 2] No such file or directory: '/usr/lib/python2.7/sitecustomize.py'
Off the top of my head, you may be able to work around this packaging issue in Ubuntu with:
sudo touch /usr/lib/python2.7/sitecustomize.py
sudo apt-get -f install
Then re-run your original apt command. -Al Tobey Open Source Mechanic Datastax On Mon, Feb 17, 2014 at 3:21 PM, Ertio Lew ertio...@gmail.com wrote: I am trying to install cassandra dsc20 but the installation fails due to some python dependencies. How could I make this work? [quoted apt transcript snipped]
Re: Cass 1.2.11 : java.lang.AssertionError: originally calculated column size
The issue you should look at is CASSANDRA-4206. This is apparently fixed in 2.0, so upgrading is one option. If you are not ready to upgrade to 2.0, then you can try increasing in_memory_compaction_limit_in_mb. We were hitting this exception on one of our nodes and increasing it did fix it. -psanford On Mon, Feb 17, 2014 at 4:53 AM, Oleg Dulin oleg.du...@gmail.com wrote: Bumping this up -- anything? anyone? [earlier message snipped]
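The setting referred to above lives in cassandra.yaml. A sketch of the change (128 is an illustrative value, not a recommendation; tune to your largest rows):

```yaml
# Rows larger than this limit are compacted on disk in two passes
# instead of in memory; raising it can avoid the size-estimate
# assertion, at the cost of more memory during compaction.
in_memory_compaction_limit_in_mb: 128   # 1.2.x default is 64
```

The change takes effect on node restart.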
GCInspector GC for ConcurrentMarkSweep running every 15 seconds
I have a 6 node cluster running on AWS. We are using m1.large instances with heap size set to 3G. 5 of the 6 nodes seem quite healthy. The 6th one, however, is running GCInspector GC for ConcurrentMarkSweep every 15 seconds or so. There is nothing going on on this box. No repairs and almost no user activity. But the CPU is almost continuously at 50% or more. The only messages in the log are these:
INFO 2014-02-17 22:58:53,429 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 213 ms for 1 collections, 1964940024 used; max is 3200253952
INFO 2014-02-17 22:59:07,431 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 250 ms for 1 collections, 1983269488 used; max is 3200253952
INFO 2014-02-17 22:59:21,522 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 280 ms for 1 collections, 1998214480 used; max is 3200253952
INFO 2014-02-17 22:59:36,527 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 305 ms for 1 collections, 2013065592 used; max is 3200253952
INFO 2014-02-17 22:59:50,529 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 334 ms for 1 collections, 2028069232 used; max is 3200253952
We don't see any of these messages on the other nodes in the cluster. We are seeing similar behaviour in both our production and QA clusters. Production is running Cassandra 1.2.9 and QA is running 1.2.13. Here are some of the cassandra settings that I would think might be relevant:
flush_largest_memtables_at: 0.75
reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6
in_memory_compaction_limit_in_mb: 64
Does anyone have any ideas why we are seeing this so selectively on one box? Any cures??? -- John Pyeatt Singlewire Software, LLC www.singlewire.com -- 608.661.1184 john.pye...@singlewire.com
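One quick sanity check is what fraction of the 3.2 GB max each logged "used" figure represents, since CMS cycles are driven by old-gen occupancy. Pure arithmetic over the numbers in the log above, no Cassandra involved:

```python
max_heap = 3200253952  # "max is 3200253952" from the GCInspector lines

# (used bytes, logged pause ms) from the five log lines above
samples = [
    (1964940024, 213),
    (1983269488, 250),
    (1998214480, 280),
    (2013065592, 305),
    (2028069232, 334),
]

for used, ms in samples:
    pct = 100.0 * used / max_heap
    print('used %.1f%% of heap, CMS took %d ms' % (pct, ms))
```

Occupancy climbs steadily from about 61% to 63% over a minute and never drops, which suggests the old gen on that one node is dominated by live data rather than collectable garbage; a heap histogram (jmap -histo:live) on that node versus a healthy one would be the next diagnostic step.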
Turn off compression (1.2.11)
Each compressed SSTable uses an additional transfer buffer in its CompressedRandomAccessReader instance. After analyzing the heap I saw this buffer is about 70KB per SSTable, and I have more than 30K SSTables per node. I want to turn off compression for this column family to save some heap. How can I do it safely? I mean, after changing the schema, should a SCRUB/UPGRADESSTABLES process be started? The next question is how these processes work in terms of disk space. I.e. CLEANUP is safe - it takes one SSTable, analyzes the keys, builds a new one, replaces the existing one and moves on. Do SCRUB/UPGRADESSTABLES work in the same manner? My concern is disk space usage overhead.
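A back-of-the-envelope check of the heap cost implied by the numbers quoted (70KB per reader buffer, 30K+ SSTables):

```python
sstables = 30000          # "more than 30K SSTables per node"
buffer_bytes = 70 * 1024  # ~70KB CompressedRandomAccessReader buffer each

total = sstables * buffer_bytes
print('%.2f GiB of heap in decompression buffers' % (total / 2**30))
# prints: 2.00 GiB of heap in decompression buffers
```

So the buffers alone account for roughly 2 GiB of heap, which explains the pressure; it may also be worth asking why a single node holds 30K SSTables at all, since that count is itself unusually high for size-tiered compaction.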
Getting different results each time after inserting data into cassandra using lightweight transaction
Hi all, The cluster has 5 nodes, one keyspace (RF=3), a table (named t1) and a column family (named cf1, with two fixed columns per row). Single thread, two test programs inserting data using lightweight transactions; the pseudocode is as follows:

program 1:
...
for (int i = 1; i < N; i++) {  // N = iteration count
    ...
    client.execute_cql3_query("insert into t1(id,name) values(" + i + ",'jane') if not exists", ...);
}

program 2:
...
List<Column> expected = new ArrayList<Column>();
for (int i = 1; i < N; i++) {
    List<Column> updates = new ArrayList<Column>();
    updates.add(...);
    ...
    client.cas(ByteBufferUtil.bytes(i), "cf1", expected, updates, ConsistencyLevel.SERIAL, ConsistencyLevel.QUORUM);
}

Programs 1 and 2 are executed synchronously. If there is no exception, data is fetched from cqlsh using the following CQL:
cqlsh> select count(*) from t1 limit 1;
cqlsh> select count(*) from cf1 limit 1;
However, the results of the two queries are different each time I execute them, and I find every node is UP using nodetool status. In the beginning of the suite of tests, we didn't find any problem. The case to note is that the scenario described above occurs in a test where the former table and column family were dropped and then newly recreated. I'm using Cassandra 2.0.3. Any insights would be greatly appreciated. Thanks.