multiple threads updating result in TransportException
We're running into a problem where our client runs fine single-threaded, but gets a TransportException when we use multiple threads. The DataStax driver hits an NIO checkBounds error. Here is a link to a Stack Overflow question we found that describes the problem we're seeing. That question was asked 7 months ago and got no answers. We're running C* 2.0.9 and see the problem on our single-node test cluster. Here is the stack trace we see:

at java.nio.Buffer.checkBounds(Buffer.java:559) ~[na:1.7.0_55]
at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:143) ~[na:1.7.0_55]
at org.jboss.netty.buffer.HeapChannelBuffer.setBytes(HeapChannelBuffer.java:136) ~[netty-3.7.0.Final.jar:na]
at org.jboss.netty.buffer.AbstractChannelBuffer.writeBytes(AbstractChannelBuffer.java:472) ~[netty-3.7.0.Final.jar:na]
at com.datastax.driver.core.CBUtil.writeValue(CBUtil.java:272) ~[cassandra-driver-core-2.0.0-rc2.jar:na]
at com.datastax.driver.core.CBUtil.writeValueList(CBUtil.java:297) ~[cassandra-driver-core-2.0.0-rc2.jar:na]
at com.datastax.driver.core.Requests$QueryProtocolOptions.encode(Requests.java:223) ~[cassandra-driver-core-2.0.0-rc2.jar:na]
at com.datastax.driver.core.Requests$Execute$1.encode(Requests.java:122) ~[cassandra-driver-core-2.0.0-rc2.jar:na]
at com.datastax.driver.core.Requests$Execute$1.encode(Requests.java:119) ~[cassandra-driver-core-2.0.0-rc2.jar:na]
at com.datastax.driver.core.Message$ProtocolEncoder.encode(Message.java:184) ~[cassandra-driver-core-2.0.0-rc2.jar:na]
at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.doEncode(OneToOneEncoder.java:66) ~[netty-3.7.0.Final.jar:na]
at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:59) ~[netty-3.7.0.Final.jar:na]
at org.jboss.netty.channel.Channels.write(Channels.java:704) ~[netty-3.7.0.Final.jar:na]
at org.jboss.netty.channel.Channels.write(Channels.java:671) ~[netty-3.7.0.Final.jar:na]
at org.jboss.netty.channel.Ab

--
http://about.me/BrianTarbox
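One plausible cause, offered as an assumption rather than a confirmed diagnosis for this report: in the DataStax Java driver, Session and PreparedStatement are thread-safe and meant to be shared, but a BoundStatement is not, and sharing one across threads can corrupt the value list while it is being encoded, which is where this trace fails. A minimal sketch of the safe per-thread pattern (keyspace, table, and values are illustrative):

    import java.util.UUID;
    import com.datastax.driver.core.BoundStatement;
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;

    public class SafeConcurrentWrites {
        public static void main(String[] args) throws Exception {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            final Session session = cluster.connect("my_ks");      // thread-safe: share one instance
            final PreparedStatement ps = session.prepare(
                    "INSERT INTO users (id, name) VALUES (?, ?)"); // thread-safe: share one instance

            Runnable task = new Runnable() {
                public void run() {
                    // bind() returns a fresh BoundStatement; never reuse one across threads
                    BoundStatement bound = ps.bind(UUID.randomUUID(),
                                                   Thread.currentThread().getName());
                    session.execute(bound);
                }
            };

            Thread t1 = new Thread(task);
            Thread t2 = new Thread(task);
            t1.start();
            t2.start();
            t1.join();
            t2.join();
            cluster.shutdown(); // close() in newer driver versions
        }
    }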
Re: High cpu usage segfaulting
When I see a segfault, my first reaction is always to suspect OpenJDK. Are you using OpenJDK or the Oracle JDK? If you're using the former, I recommend the latter.

On Tue, Nov 25, 2014 at 10:40 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

Hi Stan,

Put some monitoring on this. The first thing I think of when I hear "chewing up CPU" for Java apps is GC. In SPM http://sematext.com/spm/ you can easily see individual JVM memory pools and see if any of them are at (or close to) 100%. You can typically correlate that with increased GC times and counts. I'd look at that before looking at strace and such.

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

On Tue, Nov 25, 2014 at 11:07 PM, Stan Lemon sle...@salesforce.com wrote:

We are using v2.0.11 and have seen several instances in our 24-node cluster where a node becomes unresponsive. When we look into it, we find a cassandra process chewing up a lot of CPU. There are no other indications in the logs as to what might be happening, but if we strace the process that is chewing up CPU we see a segmentation fault:

--- SIGSEGV (Segmentation fault) @ 0 (0) ---
rt_sigreturn(0x7fd61110f862) = 30618997712
futex(0x7fd614844054, FUTEX_WAIT_PRIVATE, 27333, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7fd614844028, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7fd6148e2e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7fd6148e2e50, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7fd6148e2e28, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7fd614844054, FUTEX_WAIT_PRIVATE, 27335, NULL) = 0
futex(0x7fd614844028, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7fd6148e2e54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7fd6148e2e50, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x7fd6148e2e28, FUTEX_WAKE_PRIVATE, 1) = 1

And this happens over and over again while running strace. Has anyone seen this? Does anyone have any ideas what might be happening, or how we could debug it further?

Thanks for your help,
Stan

--
Tyler Hobbs
DataStax http://datastax.com/
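Along the lines of Otis's suggestion, the JVM memory pools can also be watched without external tooling via the java.lang.management API. A generic sketch (not Cassandra-specific; run it inside the JVM in question, or adapt it to read the same MXBeans remotely over JMX):

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;
    import java.lang.management.MemoryUsage;

    public class PoolWatcher {
        public static void main(String[] args) throws InterruptedException {
            while (true) {
                for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                    MemoryUsage usage = pool.getUsage();
                    if (usage.getMax() > 0) { // getMax() is -1 when the pool is unbounded
                        // A pool pinned near 100% typically correlates with long/frequent GCs
                        System.out.printf("%-24s %3d%% of %5d MB%n",
                                pool.getName(),
                                100 * usage.getUsed() / usage.getMax(),
                                usage.getMax() >> 20);
                    }
                }
                Thread.sleep(5000);
            }
        }
    }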
Re: High cpu usage segfaulting
Thanks everyone for the feedback. Some additional details:

1. We are definitely using the Oracle JDK (1.7.0_71-b14).
2. Yes, the segfaulting does go away after a restart.
3. There are no OOM log messages when this occurs.
4. We are seeing many GC pauses that take a long time, as in over 2 seconds. We are aware that our GC performance is bad and believe this is because of IO, which we are addressing. However, we see this runaway CPU during low-load times, and even when we took the cluster completely out of use.

Thanks again,
Stan

On Wed, Nov 26, 2014 at 12:03 PM, Tyler Hobbs ty...@datastax.com wrote:

When I see a segfault, my first reaction is always to suspect OpenJDK. Are you using OpenJDK or the Oracle JDK? If you're using the former, I recommend the latter.

--
Tyler Hobbs
DataStax http://datastax.com/
Re: Repair completes successfully but data is still inconsistent
On 24 Nov 2014, at 18:54, Robert Coli rc...@eventbrite.com wrote:

But for any given value on any given node, you can verify the value it has in 100% of SStables... that's what both the normal read path and repair should do when reconciling row fragments into the materialized row? Hard to understand a case where repair fails, and I might provide that set of SStables attached to an Apache JIRA.

These were the sstables present on node 3 on the 13th:

-rw-r--r-- 1 andre staff 167794724 Nov 13 17:57 Disco-NamespaceFile2-ic-232-Data.db
-rw-r--r-- 1 andre staff 167809485 Nov 13 17:58 Disco-NamespaceFile2-ic-3608-Data.db
-rw-r--r-- 1 andre staff 167800404 Nov 13 17:58 Disco-NamespaceFile2-ic-3609-Data.db
-rw-r--r-- 1 andre staff  59773136 Nov 13 17:58 Disco-NamespaceFile2-ic-3610-Data.db
-rw-r--r-- 1 andre staff 167804631 Nov 13 17:59 Disco-NamespaceFile2-ic-4022-Data.db
-rw-r--r-- 1 andre staff 167795369 Nov 13 17:59 Disco-NamespaceFile2-ic-4023-Data.db
-rw-r--r-- 1 andre staff  30930871 Nov 13 18:00 Disco-NamespaceFile2-ic-4024-Data.db
-rw-r--r-- 1 andre staff 167806410 Nov 13 18:00 Disco-NamespaceFile2-ic-5334-Data.db
-rw-r--r-- 1 andre staff 167789187 Nov 13 18:01 Disco-NamespaceFile2-ic-5336-Data.db
-rw-r--r-- 1 andre staff 167786564 Nov 13 18:02 Disco-NamespaceFile2-ic-5337-Data.db
-rw-r--r-- 1 andre staff 167794911 Nov 13 18:02 Disco-NamespaceFile2-ic-5338-Data.db
-rw-r--r-- 1 andre staff 167813170 Nov 13 18:03 Disco-NamespaceFile2-ic-5339-Data.db
-rw-r--r-- 1 andre staff  64246179 Nov 13 18:03 Disco-NamespaceFile2-ic-5340-Data.db
-rw-r--r-- 1 andre staff 167794716 Nov 13 18:04 Disco-NamespaceFile2-ic-5719-Data.db
-rw-r--r-- 1 andre staff 167782343 Nov 13 18:04 Disco-NamespaceFile2-ic-5721-Data.db
-rw-r--r-- 1 andre staff 167789502 Nov 13 18:05 Disco-NamespaceFile2-ic-5722-Data.db
-rw-r--r-- 1 andre staff 167800931 Nov 13 18:05 Disco-NamespaceFile2-ic-5723-Data.db
-rw-r--r-- 1 andre staff 167810143 Nov 13 18:06 Disco-NamespaceFile2-ic-5724-Data.db
-rw-r--r-- 1 andre staff 167800713 Nov 13 18:07 Disco-NamespaceFile2-ic-5725-Data.db
-rw-r--r-- 1 andre staff 167812647 Nov 13 18:07 Disco-NamespaceFile2-ic-5726-Data.db
-rw-r--r-- 1 andre staff 167782180 Nov 13 18:08 Disco-NamespaceFile2-ic-5727-Data.db
-rw-r--r-- 1 andre staff 167797488 Nov 13 18:09 Disco-NamespaceFile2-ic-5728-Data.db
-rw-r--r-- 1 andre staff  12950043 Nov 13 18:09 Disco-NamespaceFile2-ic-5729-Data.db
-rw-r--r-- 1 andre staff 167798810 Nov 13 18:09 Disco-NamespaceFile2-ic-5730-Data.db
-rw-r--r-- 1 andre staff 167805918 Nov 13 18:10 Disco-NamespaceFile2-ic-5731-Data.db
-rw-r--r-- 1 andre staff 167805189 Nov 13 18:10 Disco-NamespaceFile2-ic-5732-Data.db
-rw-r--r-- 1 andre staff  12563937 Nov 13 18:10 Disco-NamespaceFile2-ic-5733-Data.db
-rw-r--r-- 1 andre staff  16110649 Nov 13 18:11 Disco-NamespaceFile2-ic-5748-Data.db

Of these, the row in question was present on:

Disco-NamespaceFile2-ic-5337-Data.db - tombstone column
Disco-NamespaceFile2-ic-5719-Data.db - no trace of that column
Disco-NamespaceFile2-ic-5748-Data.db - live column with original timestamp

Is this the information you requested?

Best regards,
André
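For anyone wanting to reproduce this kind of check: the sstable2json tool that ships with Cassandra can dump individual rows from a given SSTable via its -k flag. A sketch of the invocation (the data path is illustrative, and whether the row key must be hex-encoded depends on the Cassandra version, so treat the key format as an assumption to verify):

    sstable2json /var/lib/cassandra/data/Disco/NamespaceFile2/Disco-NamespaceFile2-ic-5337-Data.db -k <row-key>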
Use of line number and file name in default cassandra logging configuration
In the logging configuration that ships with the cassandra distribution (log4j-server.properties in 2.0, and logback.xml in 2.1), the rolling file appender is configured to print the file name and the line number of each logging event:

log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %F (line %L) %m%n

Both the log4j documentation http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PatternLayout.html and the logback documentation http://logback.qos.ch/manual/layouts.html warn that generating the filename/line information is not a cheap operation: "Generating the file information is not particularly fast. Thus, its use should be avoided unless execution speed is not an issue."

The implementation in both cases involves creating a new Throwable and then printing its stack trace to find the file name and line number. Is there a particular reason why the official distribution configures logging like this, instead of using the logger name (%c)? Cassandra would seem like the very definition of a place where execution speed is an issue.
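For comparison, here is what the pattern might look like with the logger name instead; since Cassandra obtains its loggers per class, %c carries essentially the same information without the Throwable-based caller lookup. A sketch, not the shipped configuration:

    log4j.appender.R.layout.ConversionPattern=%5p [%t] %d{ISO8601} %c %m%n

and the rough logback equivalent for 2.1:

    <pattern>%-5level [%thread] %date{ISO8601} %logger %msg%n</pattern>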
Re: Use of line number and file name in default cassandra logging configuration
On Wed, Nov 26, 2014 at 10:39 AM, Matt Brown m...@mattnworb.com wrote:

Both the log4j http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PatternLayout.html and logback documentation http://logback.qos.ch/manual/layouts.html warn that generating the filename/line information is not a cheap operation: "Generating the file information is not particularly fast. Thus, its use should be avoided unless execution speed is not an issue."

Your observation seems reasonable on its face. I suggest filing it as a JIRA at http://issues.apache.org

=Rob
Re: Repair completes successfully but data is still inconsistent
On Wed, Nov 26, 2014 at 10:17 AM, André Cruz andre.c...@co.sapo.pt wrote:

Of these, the row in question was present on:

Disco-NamespaceFile2-ic-5337-Data.db - tombstone column
Disco-NamespaceFile2-ic-5719-Data.db - no trace of that column
Disco-NamespaceFile2-ic-5748-Data.db - live column with original timestamp

Is this the information you requested?

Yes. Do you know if 5748 was created as a result of compaction or via a flush from a memtable?

=Rob
Re: Use of line number and file name in default cassandra logging configuration
On Wed, Nov 26, 2014 at 11:57 AM, Matt Brown m...@mattnworb.com wrote:

I created https://issues.apache.org/jira/browse/CASSANDRA-8379 and attached patches against trunk and the cassandra-2.0 branch.

Sweet. Thanks for closing the loop and letting the list know the JIRA info.

=Rob
Never running repair: No need vs consequences in our usage pattern
I have a 30+ node cluster that is under heavy read and write load. Because we never delete data, all data is inserted with TTLs (and is somewhat temporal unless upserted), and we are fine with a consistency level of ONE plus read repair chance, we elected never to run repair. The reasoning behind this is that the data is so temporal it simply vanishes through normal compaction. We also adhere to the policy of doing full-row writes wherever possible, so we do not have to reassemble rows during reads. Are there any consequences we should be aware of with this strategy? We don't even run repair when adding nodes to the cluster; we just wait for the data to invalidate itself via TTL and be compacted away. Based on everything I've read, running repair only really helps with consistency (which we don't care about, because data is updated so often that being one update behind is fine) and with deleted data re-appearing (and we never delete; we just always use TTLs). Perhaps there is some other reason to run repair that we are not aware of?

Wayne
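A minimal CQL sketch of the write pattern described above, with an illustrative schema: every write is a full-row INSERT carrying a TTL, so all columns expire together and reads never have to reassemble fragments from older writes.

    -- keyspace, table, and values are illustrative; TTL is in seconds (one day here)
    INSERT INTO metrics.readings (sensor_id, ts, value, status)
    VALUES (42, '2014-11-26 12:00:00', 98.6, 'ok')
    USING TTL 86400;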
Re: Never running repair: No need vs consequences in our usage pattern
On Wed, Nov 26, 2014 at 12:16 PM, Wayne Schroeder wschroe...@pinsightmedia.com wrote:

Are there any consequences we should be aware of with this strategy? We don't even run repair when adding nodes to the cluster; we just wait for the data to invalidate itself via TTL and be compacted away.

You ultimately don't care about consistency OR durability, which is the other thing repair helps with. You are correct that you shouldn't bother running repair in this case.

=Rob
integrating cassandra and hadoop
Hey all,

I'd like to connect my Cassandra 2.1.2 cluster to Hadoop to have it process the data. Are there any good tutorials you can recommend on how to accomplish this? I'm running CentOS 6.5 on my Cassandra server, and the Hadoop name node is CentOS 7.

Thanks
Tim
--
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B