Invalid queries in Cassandra.Client cause subsequent valid queries to fail
--------------------------------------------------------------------------
Key: CASSANDRA-3339
URL: https://issues.apache.org/jira/browse/CASSANDRA-3339
Project: Cassandra
Issue Type: Bug
Components: API
Affects Versions: 0.8.6
Environment: Windows
Reporter: Ivo Ladage-van Doorn
First of all, I'm quite new to Cassandra, so I hope that my analysis is
correct.
I am using the Hector client to perform queries on Cassandra. The problem is
that once I invoke an invalid slice query with a null rowKey, subsequent
queries also fail with roughly the same error. The first time I invoke the
invalid query, I get this exception:
org.apache.thrift.protocol.TProtocolException: Required field 'key' was not
present!
Struct: get_slice_args(key:null,
column_parent:ColumnParent(column_family:AmdatuToken),
predicate:SlicePredicate(slice_range:SliceRange(start:, finish:,
reversed:false, count:1000000)), consistency_level:ONE)
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:56)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:285)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:268)
...
which is expected behavior. However, after invoking this invalid query,
subsequent valid calls also fail with roughly the same error:
me.prettyprint.hector.api.exceptions.HCassandraInternalException: Cassandra
encountered an internal error processing this request: TApplicationError type:
7 message:Required field 'key' was not present! Struct:
get_slice_args(key:null, column_parent:null, predicate:null,
consistency_level:ONE)
org.apache.felix.log.LogException:
me.prettyprint.hector.api.exceptions.HCassandraInternalException:
Cassandra encountered an internal error processing this request:
TApplicationError type: 7 message:Required field 'key' was not present! Struct:
get_slice_args(key:null, column_parent:null, predicate:null,
consistency_level:ONE)
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:29)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$2.execute(KeyspaceServiceImpl.java:121)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$2.execute(KeyspaceServiceImpl.java:114)
...
In the case of Hector it goes downhill from there, ending in socket write
errors and marking the Cassandra host as being down.
Now this is what I think happens:
The Hector client uses the org.apache.cassandra.thrift.Cassandra.Client class
to execute the queries on Cassandra.
When I perform a Thrift slice query from Hector, it invokes the get_slice
method in the Cassandra.Client, which in turn invokes send_get_slice. In my
case a bug in my own software caused this method to be invoked with a null
rowKey. Although the rowKey is invalid, the call continues all the way to
send_get_slice. This send_get_slice method looks like this:
(from org.apache.cassandra.thrift.Cassandra)
    public void send_get_slice(ByteBuffer key, ColumnParent column_parent,
            SlicePredicate predicate, ConsistencyLevel consistency_level)
            throws org.apache.thrift.TException
    {
        oprot_.writeMessageBegin(new org.apache.thrift.protocol.TMessage("get_slice",
                org.apache.thrift.protocol.TMessageType.CALL, ++seqid_));
        get_slice_args args = new get_slice_args();
        args.setKey(key);
        args.setColumn_parent(column_parent);
        args.setPredicate(predicate);
        args.setConsistency_level(consistency_level);
        args.write(oprot_);
        oprot_.writeMessageEnd();
        oprot_.getTransport().flush();
    }
The problem is that the TMessage is written to the output protocol in the
first line of the method, before anything has been validated. When the
arguments are subsequently written by args.write(oprot_), that method first
calls validate(). The validate() method detects the null rowKey and throws an
exception:
    public void validate() throws org.apache.thrift.TException {
        // check for required fields
        if (key == null) {
            throw new org.apache.thrift.protocol.TProtocolException(
                    "Required field 'key' was not present! Struct: " + toString());
        }
        ...
    }
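To make the ordering problem concrete, here is a toy model of it. This is not Thrift code: ToyClient, its string "protocol", and both method names are made up for illustration, and it assumes the transport buffers writes until flush(), which is what the behavior described in this report suggests. It contrasts writing the header before validation (as send_get_slice does) with validating first.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Toy stand-in for a client with a buffered output; not part of Thrift or Cassandra. */
class ToyClient {
    // Bytes written but not yet flushed (plays the role of the buffer behind oprot_).
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    // Bytes that actually reached the "server".
    private final ByteArrayOutputStream wire = new ByteArrayOutputStream();

    /** Same order as send_get_slice: message header first, validation afterwards. */
    void sendHeaderFirst(String key) {
        write("BEGIN get_slice;");
        validate(key); // throws while the header is already sitting in the buffer
        write("key=" + key + ";END;");
        flush();
    }

    /** Alternative order: validate before anything is written. */
    void sendValidateFirst(String key) {
        validate(key);
        write("BEGIN get_slice;");
        write("key=" + key + ";END;");
        flush();
    }

    private static void validate(String key) {
        if (key == null) {
            throw new IllegalArgumentException("Required field 'key' was not present!");
        }
    }

    private void write(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        pending.write(bytes, 0, bytes.length);
    }

    private void flush() {
        byte[] bytes = pending.toByteArray();
        wire.write(bytes, 0, bytes.length);
        pending.reset();
    }

    /** Everything the server has received so far. */
    String wire() {
        return new String(wire.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

In this model, a failed sendHeaderFirst(null) followed by a valid sendHeaderFirst("row1") puts two BEGIN markers on the wire, so the receiver sees a header with no arguments before the real request. With sendValidateFirst the failed call leaves nothing behind. Whether reordering is practical in the generated Thrift code is a separate question that this sketch does not answer.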
Hector eventually catches the exception and returns the Cassandra client to
its connection pool. When that client is later drawn from the pool and reused
for another query, it still has the TMessage written to its oprot_, so the
next call fails as well. Every call that reuses this client from the
connection pool keeps getting such errors, which in the end result in socket
write errors.
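On the client-library side, one possible mitigation is to never return a client to the pool after a failed call. The sketch below is only an illustration of that idea: DiscardingPool and its Operation interface are hypothetical names, not Hector's API, and I have not checked how Hector's pool actually handles this case.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Hypothetical pool that drops a client whenever an operation on it throws. */
class DiscardingPool<C> {
    /** A unit of work against a pooled client. */
    interface Operation<T, R> {
        R run(T client);
    }

    private final Deque<C> idle = new ArrayDeque<>();
    private final Supplier<C> factory;

    DiscardingPool(Supplier<C> factory) {
        this.factory = factory;
    }

    <R> R execute(Operation<C, R> op) {
        C client = idle.isEmpty() ? factory.get() : idle.pop();
        // If run() throws, the line below is never reached, so a client that
        // may hold a half-written message is simply not reused. A real pool
        // would also close the underlying connection here.
        R result = op.run(client);
        idle.push(client);
        return result;
    }

    int idleCount() {
        return idle.size();
    }
}
```

The trade-off is an extra connection setup after every failed call, in exchange for never reusing a client whose output buffer is in an unknown state.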
The problem is not restricted to this particular method, by the way. The same
construct is used in many more methods in this class. Passing a null value for
columnsPath or ConsistencyLevel also seems to cause the same issue.