[ https://issues.apache.org/jira/browse/CASSANDRA-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277874#comment-14277874 ]

Philip Thompson commented on CASSANDRA-8358:
--------------------------------------------

Progress Update:
1. Completion of work on BulkLoader is blocked by https://datastax-oss.atlassian.net/browse/JAVA-312
2. I have an initial draft for both o.a.c.h.cql3.CqlRecordWriter and o.a.c.h.cql3.CqlRecordReader. pig-test is completely broken on trunk right now, so I haven't had a good opportunity to test them.
3. I am not touching o.a.c.h.ColumnFamily* on [~jjordan]'s recommendation.
4. o.a.c.h.pig.CqlNativeStorage extends CqlStorage which extends AbstractCassandraStorage. CassandraStorage also extends AbstractCassandraStorage. I will remove thrift from CqlNativeStorage. Should I also remove thrift from CqlStorage as well, or just deprecate it? It seems to me that I will need to remove the connection between CqlNativeStorage and CqlStorage, or CqlStorage and AbstractCassandraStorage in order to remove thrift without affecting CassandraStorage. What would be best here?

> Bundled tools shouldn't be using Thrift API
> -------------------------------------------
>
>                 Key: CASSANDRA-8358
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8358
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>            Assignee: Philip Thompson
>             Fix For: 3.0
>
>
> In 2.1, we switched cqlsh to the python-driver.
> In 3.0, we got rid of cassandra-cli.
> Yet there is still code that's using legacy Thrift API. We want to convert it
> all to use the java-driver instead.
> 1. BulkLoader uses Thrift to query the schema tables. It should be using
> java-driver metadata APIs directly instead.
> 2. o.a.c.hadoop.cql3.CqlRecordWriter is using Thrift
> 3. o.a.c.hadoop.ColumnFamilyRecordReader is using Thrift
> 4. o.a.c.hadoop.AbstractCassandraStorage is using Thrift
> 5. o.a.c.hadoop.pig.CqlStorage is using Thrift
> Some of the things listed above use Thrift to get the list of partition key
> columns or clustering columns. Those should be converted to use the Metadata
> API of the java-driver.
> Somewhat related to that, we also have badly ported code from Thrift in
> o.a.c.hadoop.cql3.CqlRecordReader (see fetchKeys()) that manually fetches
> columns from schema tables instead of properly using the driver's Metadata
> API.
> We need all of it fixed. One exception, for now, is
> o.a.c.hadoop.AbstractColumnFamilyInputFormat - it's using Thrift for its
> describe_splits_ex() call that cannot be currently replaced by any
> java-driver call (?).
> Once this is done, we can stop starting Thrift RPC port by default in
> cassandra.yaml.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)