[
https://issues.apache.org/jira/browse/CASSANDRA-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384715#comment-14384715
]
Aleksey Yeschenko commented on CASSANDRA-8358:
----------------------------------------------
Pushed a squashed version based on latest trunk to
https://github.com/iamaleksey/cassandra/commits/8358 with no changes.
So far things mostly look good. I'd like to do a few more cosmetic things,
however.
1. {{AbstractColumnFamilyRecordWriter}} is a small class, and both the
(deprecated) {{ColumnFamilyRecordWriter}} and {{CqlRecordWriter}} extend it. It
also has Thrift-specific logic. So I would prefer the abstract class to go away
entirely, with the shared bits duplicated, if needed, in
{{ColumnFamilyRecordWriter}} and {{CqlRecordWriter}}.
2. Same for {{AbstractColumnFamilyOutputFormat}}
3. Same for {{AbstractColumnFamilyInputFormat}}. At the very least it shouldn't
include Thrift-only functionality ({{createAuthenticatedClient}}); at most, I'd
like to get rid of the abstract class and have {{ColumnFamilyInputFormat}} and
{{CqlInputFormat}} duplicate the shared bits.
4. Same for {{AbstractBulkRecordWriter}} - more than half the class is
Thrift code. Plus, shouldn't the old {{BulkRecordWriter}} be {{@Deprecated}} too?
5. Same for {{AbstractBulkOutputFormat}}, plus deprecation of
{{BulkOutputFormat}} itself (right now both its methods are deprecated
individually).
With all the {{*ColumnFamily*}} versions getting deprecated in this version,
removing them in 3.later would be as simple as rm-ing the non-CQL classes.
It would also be nice to get rid of "column family" naming everywhere in Cql*
classes, in favor of Table* - in method names and class names.
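As a rough illustration of the end state suggested above (not code from the patch), the sketch below shows two writers with no shared abstract base: the legacy one is deprecated at class level, and the CQL one uses "table" naming. All class and method names here are hypothetical stand-ins, and the buffering logic is a placeholder for whatever the real shared bits are.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Thrift-based writer. It is deprecated as a
// whole class, so a later release can remove it by deleting this one type.
@Deprecated
class LegacyWriterSketch {
    private final List<String> buffer = new ArrayList<>();

    // Shared bit, duplicated here instead of inherited from an abstract base.
    void write(String row) {
        buffer.add("thrift:" + row);
    }

    int pending() {
        return buffer.size();
    }
}

// Hypothetical stand-in for the CQL-based writer, named after "table"
// rather than "column family". It extends nothing but Object.
class TableWriterSketch {
    private final List<String> buffer = new ArrayList<>();

    // The same shared bit, duplicated so the two classes stay independent.
    void write(String row) {
        buffer.add("cql:" + row);
    }

    int pending() {
        return buffer.size();
    }
}
```

With this shape, dropping the legacy path later touches only the deprecated class; the CQL class has no base type that would need cleaning up.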
> Bundled tools shouldn't be using Thrift API
> -------------------------------------------
>
> Key: CASSANDRA-8358
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8358
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Aleksey Yeschenko
> Assignee: Philip Thompson
> Fix For: 3.0
>
>
> In 2.1, we switched cqlsh to the python-driver.
> In 3.0, we got rid of cassandra-cli.
> Yet there is still code that's using legacy Thrift API. We want to convert it
> all to use the java-driver instead.
> 1. BulkLoader uses Thrift to query the schema tables. It should be using
> java-driver metadata APIs directly instead.
> 2. o.a.c.hadoop.cql3.CqlRecordWriter is using Thrift
> 3. o.a.c.hadoop.ColumnFamilyRecordReader is using Thrift
> 4. o.a.c.hadoop.AbstractCassandraStorage is using Thrift
> 5. o.a.c.hadoop.pig.CqlStorage is using Thrift
> Some of the things listed above use Thrift to get the list of partition key
> columns or clustering columns. Those should be converted to use the Metadata
> API of the java-driver.
> Somewhat related to that, we also have badly ported code from Thrift in
> o.a.c.hadoop.cql3.CqlRecordReader (see fetchKeys()) that manually fetches
> columns from schema tables instead of properly using the driver's Metadata
> API.
> We need all of it fixed. One exception, for now, is
> o.a.c.hadoop.AbstractColumnFamilyInputFormat - it's using Thrift for its
> describe_splits_ex() call that cannot be currently replaced by any
> java-driver call (?).
> Once this is done, we can stop starting Thrift RPC port by default in
> cassandra.yaml.