I'm not sure whether this is a problem with Hector or with Cassandra. We are seeing broken pipe errors on our client-side connections (exception below). A bit of googling suggests this can be caused by the amount of data being stored, although I'm certain our datasets are not all that large.
A nodetool ring command doesn't show any downed nodes:

Address         DC          Rack   Status State   Load       Owns    Token
                                                                     153951716904446304929228999025275230571
10.130.202.34   datacenter1 rack1  Up     Normal  470.74 KB  79.19%  118538200848404459763384037192174096102
10.130.202.35   datacenter1 rack1  Up     Normal  483.63 KB  20.81%  153951716904446304929228999025275230571

There are no errors in the Cassandra server logs. Are there any particular connection timeouts that we need to be aware of, or perhaps configure on the Cassandra nodes? Or is this purely an issue with the Hector API configuration?

Anthony

2011-08-02 08:43:06,541 ERROR [me.prettyprint.cassandra.connection.HThriftClient] - Could not flush transport (to be expected if the pool is shutting down) in close for client: CassandraClient<cassandradevrk1:9393-33>
org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
    at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
    at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
    at me.prettyprint.cassandra.connection.HThriftClient.close(HThriftClient.java:85)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:232)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:289)
    at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
    at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
    at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
    at com.wsgc.services.registry.persistenceservice.impl.cassandra.strategy.read.StandardFindRegistryPersistenceStrategy.findRegistryByProfileId(StandardFindRegistryPersistenceStrategy.java:237)
    at com.wsgc.services.registry.persistenceservice.impl.cassandra.strategy.read.StandardFindRegistryPersistenceStrategy.execute(StandardFindRegistryPersistenceStrategy.java:277)
    at com.wsgc.services.registry.registryservice.impl.service.StandardRegistryService.getRegistriesByProfileId(StandardRegistryService.java:327)
    at com.wsgc.services.registry.webapp.impl.RegistryServicesController.getRegistriesByProfileId(RegistryServicesController.java:247)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.springframework.web.bind.annotation.support.HandlerMethodInvoker.invokeHandlerMethod(HandlerMethodInvoker.java:175)
    at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.invokeHandlerMethod(AnnotationMethodHandlerAdapter.java:421)
    at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.handle(AnnotationMethodHandlerAdapter.java:409)
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:774)
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:719)
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:644)
    at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:549)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:77)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:76)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:563)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
    at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:190)
    at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291)
    at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:774)
    at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:703)
    at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:896)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
    at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
    ... 47 more
2011-08-02 08:43:06,543 ERROR [me.prettyprint.cassandra.connection.HConnectionManager] - MARK HOST AS DOWN TRIGGERED for host cassandradevrk1(10.130.202.34):9393
2011-08-02 08:43:06,543 ERROR [me.prettyprint.cassandra.connection.HConnectionManager] - Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{cassandradevrk1(10.130.202.34):9393}; IsActive?: true; Active: 1; Blocked: 0; Idle: 15; NumBeforeExhausted: 49
2011-08-02 08:43:06,543 ERROR [me.prettyprint.cassandra.connection.ConcurrentHClientPool] - Shutdown triggered on <ConcurrentCassandraClientPoolByHost>:{cassandradevrk1(10.130.202.34):9393}
2011-08-02 08:43:06,544 ERROR [me.prettyprint.cassandra.connection.ConcurrentHClientPool] - Shutdown complete on <ConcurrentCassandraClientPoolByHost>:{cassandradevrk1(10.130.202.34):9393}
2011-08-02 08:43:06,544 INFO [me.prettyprint.cassandra.connection.CassandraHostRetryService] - Host detected as down was added to retry queue: cassandradevrk1(10.130.202.34):9393
2011-08-02 08:43:06,544 WARN [me.prettyprint.cassandra.connection.HConnectionManager] - Could not fullfill request on this host CassandraClient<cassandradevrk1:9393-33>