The problem was the cross_node_timeout value: I had it set to true, and my NTP clocks were not synchronized. As a result, some of the requests were dropped.
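For anyone hitting this later, the relevant setting is in cassandra.yaml. A rough sketch of the 1.2-era entry (comments are my summary of the behavior; check your own config for the exact default):

```yaml
# cassandra.yaml (sketch, 1.2-era setting)
#
# When cross_node_timeout is true, a replica uses the timestamp the
# coordinator put on the message to estimate how long the request has
# already been in flight, and drops it if that exceeds the timeout.
# This is only meaningful if every node's clock is kept in sync via NTP;
# with skewed clocks, fresh requests can look already expired and get
# dropped, which shows up as dropped MUTATION / REQUEST_RESPONSE
# messages and UnavailableException on the client.
cross_node_timeout: false   # the safe setting when NTP sync is not guaranteed
```

So either leave it false, or enable it only after confirming all nodes are NTP-synchronized (e.g. with `ntpq -p` on each node). `nodetool tpstats` shows the dropped-message counters per message type.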
Thanks,
Sandeep

On Sat, Nov 9, 2013 at 6:02 PM, srmore <comom...@gmail.com> wrote:
> I recently upgraded to 1.2.9 and I am seeing a lot of REQUEST_RESPONSE and
> MUTATION messages being dropped.
>
> This happens when I have multiple nodes in the cluster (about 3 nodes) and
> I send traffic to only one node. I don't think the traffic is that high; it
> is around 400 msg/sec with 100 threads. When I take down the other two
> nodes I don't see any errors (at least on the client side). I am using
> Pelops.
>
> On the client I get UnavailableException, but the nodes are up. Initially
> I thought I was hitting CASSANDRA-6297 (gossip thread blocking), so I
> changed memtable_flush_writers to 3. Still no luck.
>
> UnavailableException:
> org.scale7.cassandra.pelops.exceptions.UnavailableException: null at
> org.scale7.cassandra.pelops.exceptions.IExceptionTranslator$ExceptionTranslator.translate(IExceptionTranslator.java:61)
> ~[na:na] at
>
> In the debug log on the Cassandra node, this is the exception I see:
>
> DEBUG [Thrift:78] 2013-11-09 16:47:28,212 CustomTThreadPoolServer.java
> Thrift transport error occurred during processing of message.
> org.apache.thrift.transport.TTransportException
>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>     at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
>     at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:22)
>     at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>     at java.lang.Thread.run(Thread.java:662)
>
> Could this be because of high load? With Cassandra 1.0.11 I did not see
> this issue.
>
> Thanks,
> Sandeep