[
https://issues.apache.org/jira/browse/CASSANDRA-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296200#comment-14296200
]
Philip Thompson commented on CASSANDRA-8358:
--------------------------------------------
Here is my current branch: https://github.com/ptnapoleon/cassandra/compare/8358
Sorry about the WIP changes pushed to BulkLoader; ignore those for now. I have
recently received a JAR of a tentative 2.1.5 build of the driver containing
JAVA-312, so I can finish work on this now.
I was having an issue where the Thread calling the java driver's connect() was
being interrupted, which was causing the connect() to fail. Currently I check
for Thread.interrupted() and retry if that is the reason for the failure. I am
not sure how to prevent the interruption in the first place.
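A minimal sketch of the interrupt-and-retry workaround described above, not the code on the branch. The class and method names (InterruptRetry, callWithInterruptRetry) and the attempt limit are hypothetical. The Supplier stands in for a call such as the driver's connect(), on the assumption that it fails with an unchecked exception.

```java
import java.util.function.Supplier;

public class InterruptRetry {
    /**
     * Runs the action and retries it when it fails while the calling
     * thread's interrupt flag is set. Any other failure is rethrown at once.
     */
    public static <T> T callWithInterruptRetry(Supplier<T> action, int maxAttempts) {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be >= 1");

        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                // Thread.interrupted() reports the interrupt flag and clears it,
                // so the next attempt starts with a clean flag.
                if (!Thread.interrupted())
                    throw e; // failure was not caused by an interruption
                last = e;
            }
        }
        throw last; // every attempt was interrupted
    }
}
```

Note that clearing the flag hides the interruption from callers further up the stack, so this only treats the symptom; it does not explain where the interrupt comes from.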
Currently when running pig-test, only one test that uses CqlNativeStorage is
failing, and that is testCqlNativeStorageCollectionColumnTable.
This is due to the following problem:
{code}
java.lang.IllegalArgumentException
    at java.nio.Buffer.limit(Buffer.java:267)
    at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:552)
    at org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:561)
    at org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:118)
    at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:100)
    at org.apache.cassandra.cql3.Maps$Value.fromSerialized(Maps.java:164)
    at org.apache.cassandra.cql3.Maps$Marker.bind(Maps.java:273)
    at org.apache.cassandra.cql3.Maps$Marker.bind(Maps.java:262)
    at org.apache.cassandra.cql3.Maps$Putter.doPut(Maps.java:355)
    at org.apache.cassandra.cql3.Maps$Setter.execute(Maps.java:292)
    at org.apache.cassandra.cql3.statements.UpdateStatement.addUpdateForKey(UpdateStatement.java:98)
    at org.apache.cassandra.cql3.statements.ModificationStatement.getMutations(ModificationStatement.java:655)
    at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:487)
    at org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:473)
    at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:233)
    at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:443)
    at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:134)
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
    at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
    at java.lang.Thread.run(Thread.java:745)
{code}
The exception is thrown from CollectionSerializer.readValue:
{code}
public static ByteBuffer readValue(ByteBuffer input, int version)
{
    if (version >= Server.VERSION_3)
    {
        int size = input.getInt();
        if (size < 0)
            return null;
        return ByteBufferUtil.readBytes(input, size);
    }
    else
    {
        return ByteBufferUtil.readBytesWithShortLength(input);
    }
}
{code}
The value of size from input.getInt() is an integer in the millions for one of
the map values. I am still figuring out what differs from cassandra-2.1,
where the test passes without my changes, but the ByteBuffer itself doesn't
appear to be different.
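The cause has not been identified, so the following is only an illustration of one way a length prefix can come out implausibly large: a value written with a 2-byte length and then read with a 4-byte length. It is an assumption for the sake of the example, not a diagnosis of this failure. The class and helper names are hypothetical and it uses nothing from Cassandra, only java.nio.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class LengthPrefixMismatch {
    /** Encodes a value with an unsigned 16-bit length prefix, big-endian. */
    static ByteBuffer writeWithShortLength(byte[] value) {
        ByteBuffer out = ByteBuffer.allocate(2 + value.length);
        out.putShort((short) value.length);
        out.put(value);
        out.flip();
        return out;
    }

    /** Reads the prefix as a 16-bit length without moving the buffer's position. */
    static int readShortLength(ByteBuffer in) {
        return in.duplicate().getShort() & 0xFFFF;
    }

    /** Reads the prefix as a 32-bit length without moving the buffer's position. */
    static int readIntLength(ByteBuffer in) {
        return in.duplicate().getInt();
    }

    public static void main(String[] args) {
        ByteBuffer buf = writeWithShortLength("value".getBytes(StandardCharsets.US_ASCII));

        System.out.println("bytes available: " + buf.remaining());
        System.out.println("length read as short: " + readShortLength(buf));
        // The int read swallows the 2-byte length plus the first two payload
        // bytes, so the result is far larger than the buffer itself.
        System.out.println("length read as int: " + readIntLength(buf));
    }
}
```

Setting a buffer limit from such a value exceeds the buffer's capacity, and Buffer.limit rejects that with an IllegalArgumentException, which is the exception type at the top of the trace.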
In CqlConfigHelper, should I be creating an OUTPUT_* property for each INPUT_*
property?
PigTestBase should be switched over to using the java driver, but I would
rather handle that in a separate ticket.
AbstractCassandraStorage may be deprecated for 3.0, but it is not working at
all with the current schema parsing queries. That also belongs in a separate
ticket.
> Bundled tools shouldn't be using Thrift API
> -------------------------------------------
>
> Key: CASSANDRA-8358
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8358
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Aleksey Yeschenko
> Assignee: Philip Thompson
> Fix For: 3.0
>
>
> In 2.1, we switched cqlsh to the python-driver.
> In 3.0, we got rid of cassandra-cli.
> Yet there is still code that's using legacy Thrift API. We want to convert it
> all to use the java-driver instead.
> 1. BulkLoader uses Thrift to query the schema tables. It should be using
> java-driver metadata APIs directly instead.
> 2. o.a.c.hadoop.cql3.CqlRecordWriter is using Thrift
> 3. o.a.c.hadoop.ColumnFamilyRecordReader is using Thrift
> 4. o.a.c.hadoop.AbstractCassandraStorage is using Thrift
> 5. o.a.c.hadoop.pig.CqlStorage is using Thrift
> Some of the things listed above use Thrift to get the list of partition key
> columns or clustering columns. Those should be converted to use the Metadata
> API of the java-driver.
> Somewhat related to that, we also have badly ported code from Thrift in
> o.a.c.hadoop.cql3.CqlRecordReader (see fetchKeys()) that manually fetches
> columns from schema tables instead of properly using the driver's Metadata
> API.
> We need all of it fixed. One exception, for now, is
> o.a.c.hadoop.AbstractColumnFamilyInputFormat - it's using Thrift for its
> describe_splits_ex() call that cannot be currently replaced by any
> java-driver call (?).
> Once this is done, we can stop starting Thrift RPC port by default in
> cassandra.yaml.