[ 
https://issues.apache.org/jira/browse/CASSANDRA-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941478#comment-14941478
 ] 

Yong Zhang commented on CASSANDRA-9618:
---------------------------------------

As an end user, I am sad that there is no maintained version of sstable2json 
in C* 2.x. For a NoSQL database, it is important that C* still gives end users 
an option to parse the data out of the SSTable files and ingest it into other 
storage systems, such as HDFS.

There are also other open source projects, such as Netflix Astyanax and 
hadoop-sstable (https://github.com/fullcontact/hadoop-sstable), that do exactly 
this and depend on the sstable2json logic to understand how SSTable files are 
parsed internally. That is now lost, or far more difficult than before, because 
sstable2json has not kept up with the latest changes, especially the new 
collection/map formats.

I understand there is still the option of using something like Spark to query 
data from C* directly, but to be honest, because the C* internal storage layout 
is designed around its own use cases, it is hard for C* to support arbitrary 
read paths over a significant amount of data.

Please consider supporting sstable2json in C* 2.x or 3.x or future releases.

> Consider deprecating sstable2json/json2sstable in 2.2
> -----------------------------------------------------
>
>                 Key: CASSANDRA-9618
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9618
>             Project: Cassandra
>          Issue Type: Task
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 2.2.0 rc2
>
>         Attachments: 0001-Deprecate-sstable2json-and-json2sstable.patch
>
>
> The rationale is explained in CASSANDRA-7464 but to rephrase a bit:
> * json2sstable is pretty much useless; {{CQLSSTableWriter}} is way more 
> flexible if you need to write sstables directly.
> * sstable2json is really only potentially useful for debugging, but it's 
> pretty bad at that (its output is not really all that helpful in modern 
> Cassandra in particular).
> Now, it happens that updating those tools for CASSANDRA-8099, while possible, 
> is a bit involved. So I don't think it makes sense to invest effort in 
> maintaining these tools. So I propose to deprecate these in 2.2 with removal 
> in 3.0.
> I'll note that having a tool to help debug sstables can be useful, but I 
> propose to add a tool for that purpose with CASSANDRA-7464.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
