[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176915#comment-14176915 ]
Christian Spriegel commented on CASSANDRA-7886:
-----------------------------------------------

[~slebresne]: Sorry, I meant CQLSH and not CQL. With "the standard CQL client" I meant the CQLSH client that was installed with the Debian packages.

Regarding the ErrorMessage class: A new error code "READ_FAILURE" was introduced with my patch, but no new fields were added to the ErrorMessage. I assume you worry about clients not being able to handle the new code. In my opinion any client code that does not have a default case should be punished, so I would not hesitate to add it ;-)

I assume that with CQL 4 (CASSANDRA-8043) a clean handling of the code and additional fields will be implemented for read_failures?

{code}
public void encode(ErrorMessage msg, ByteBuf dest, int version)
{
    dest.writeInt(msg.error.code().value); // TODO: make sure READ_FAILURE is only sent for CQL >=4
    CBUtil.writeString(msg.error.getMessage(), dest);
    switch (msg.error.code())
    {
        //case READ_FAILURE: // read failure case not implemented so far!
        //    if(version > x) // with the next version this could be implemented
        //    {
        //        RequestFailureException rfe = (RequestFailureException) msg.error;
        //        dest.writeInt(rfe.received);
        //        dest.writeInt(rfe.blockFor);
        //        dest.writeInt(rfe.failures);
        //    }
        //    break;
{code}

> TombstoneOverwhelmingException should not wait for timeout
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-7886
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Tested with Cassandra 2.0.8
>            Reporter: Christian Spriegel
>            Assignee: Christian Spriegel
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: 7886_v1.txt
>
>
> *Issue*
> When you have TombstoneOverwhelmingExceptions occurring in queries, this will
> cause the query to be simply dropped on every data node, but no response is
> sent back to the coordinator. Instead the coordinator waits for the specified
> read_request_timeout_in_ms.
> On the application side this can cause memory issues, since the application
> is waiting for the timeout interval for every request. Therefore, if our
> application runs into TombstoneOverwhelmingExceptions, then (sooner or later)
> our entire application cluster goes down :-(
> *Proposed solution*
> I think the data nodes should send an error message to the coordinator when
> they run into a TombstoneOverwhelmingException. Then the coordinator does not
> have to wait for the timeout interval.
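
The proposed solution quoted above could be sketched roughly as follows. This is only an illustration of the idea and not the code from 7886_v1.txt: the class FailFastReadCallback, its Outcome enum and the methods onResponse, onFailure and await are hypothetical names. The sketch also assumes that the coordinator gives up as soon as so many replicas have reported a failure that blockFor successful responses can no longer arrive; the actual patch may use a different rule.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical coordinator-side callback: it finishes as soon as enough
// replicas have answered OR enough replicas have reported a failure,
// instead of always waiting for the full read timeout.
public class FailFastReadCallback {
    public enum Outcome { SUCCESS, READ_FAILURE, TIMEOUT }

    private final int contacted; // number of replicas the read was sent to
    private final int blockFor;  // responses required by the consistency level
    private final AtomicInteger received = new AtomicInteger();
    private final AtomicInteger failures = new AtomicInteger();
    private final CompletableFuture<Outcome> done = new CompletableFuture<>();

    public FailFastReadCallback(int contacted, int blockFor) {
        this.contacted = contacted;
        this.blockFor = blockFor;
    }

    // Called when a replica returns data.
    public void onResponse() {
        if (received.incrementAndGet() >= blockFor) {
            done.complete(Outcome.SUCCESS);
        }
    }

    // Called when a replica reports an error, e.g. a TombstoneOverwhelmingException.
    // Assumption: fail once blockFor successes have become impossible.
    public void onFailure() {
        if (failures.incrementAndGet() > contacted - blockFor) {
            done.complete(Outcome.READ_FAILURE);
        }
    }

    // Waits at most timeoutMs, but returns earlier on success or failure.
    public Outcome await(long timeoutMs) {
        try {
            return done.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return Outcome.TIMEOUT;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With three contacted replicas and blockFor = 2, two onFailure() calls make await() return READ_FAILURE immediately, so the caller does not have to sit out read_request_timeout_in_ms.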