The current code I have is using SSTableReader (like sstable2json) and
then trying to interpret the bytes with the help of CFMetaData.

Any pointers for where in the Cassandra codebase I could see how this is done?

Thanks for the help.


2015-05-26 16:07 GMT+02:00 Tyler Hobbs <ty...@datastax.com>:
> Trying to parse and export an sstable at a higher, CQL level with the
> current codebase is going to be pretty tough.  Handling static columns,
> collections (multi-cell columns), and the four minor variants of sstable
> formats (sparse vs dense, composite vs simple) is not easy.  If you want to
> handle things at a CQL level, you should probably go through the normal
> read path.
>
> With that said, CASSANDRA-8099 will substantially change the format of
> sstables to more closely match CQL, making this more feasible.
>
> On Tue, May 26, 2015 at 8:49 AM, Malcolm Matalka <malc...@spotify.com>
> wrote:
>
>> Thanks Tyler,
>>
>> The problem with sstable2json is that it does not support the CQL
>> types as far as I can see and there isn't any indication as to how to modify
>> it to do that.  It seems like the CQL things are a layer above the
>> SSTable.
>>
>> 2015-05-26 15:44 GMT+02:00 Tyler Hobbs <ty...@datastax.com>:
>> > I would start by looking at sstable2json.  It may be simplest for you to
>> > run sstable2json and then process the resulting json.  If that's not
>> > adequate, modifying the sstable2json code is probably your best bet.
>> >
>> > On Mon, May 25, 2015 at 11:12 AM, Malcolm Matalka <malc...@spotify.com>
>> > wrote:
>> >
>> >> Hello,
>> >>
>> >> For efficiency reasons I am trying to parse the raw SSTable files in
>> >> order to transform them into another format.  I understand this is
>> >> like poking a sleeping beast and there aren't many guarantees around
>> >> this but I'm asking if anyone has any pointers to make this possible?
>> >> In a search I have stumbled upon FullContact's SSTable parser, but it
>> >> does not parse the complicated data structures that CQL supports.  In
>> >> attempting to reverse engineer how Cassandra handles the actual data
>> >> there are a few cases that are unclear and I'm concerned that my
>> >> attempts to interpret them will be fragile.
>> >>
>> >> Are there any suggestions?  Existing libraries?  Tips on how Cassandra
>> >> parses the data itself?  Pointers into the code to read?  SSTable
>> >> design doc?
>> >>
>> >> Thanks,
>> >> /Malcolm
>> >>
>> >
>> >
>> >
>> > --
>> > Tyler Hobbs
>> > DataStax <http://datastax.com/>
>>
>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>

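For the "run sstable2json and then process the resulting json" suggestion quoted above, here is a minimal sketch of what the post-processing step might look like. Everything concrete in it is an assumption made for illustration: the SAMPLE text is hand-written and only loosely modeled on a JSON dump (a list of partitions, each with a "key" and a list of [name, value, timestamp] cells), the cell names are assumed to be "clustering:column" strings, and group_cells is a hypothetical helper, not part of Cassandra. It does not handle collections, static columns, or clustering values with several components, which are exactly the hard cases mentioned in the thread.

```python
import json
from collections import defaultdict

# Hand-written sample. The shape is an ASSUMPTION for illustration,
# not verified output from any particular Cassandra version.
SAMPLE = """
[
  {"key": "alice",
   "cells": [["2015-05-25:", "", 1432570000000000],
             ["2015-05-25:score", "42", 1432570000000000],
             ["2015-05-26:", "", 1432656400000000],
             ["2015-05-26:score", "17", 1432656400000000]]}
]
"""


def group_cells(dump_text):
    """Group flat cells into rows keyed by (partition key, clustering prefix).

    Hypothetical helper: assumes each cell name is "clustering:column"
    and that a name with an empty column part is a row marker.
    """
    rows = defaultdict(dict)
    for partition in json.loads(dump_text):
        for cell in partition["cells"]:
            # Only the first two elements are used; extra elements
            # (timestamp, flags) are ignored in this sketch.
            name, value = cell[0], cell[1]
            # Split on the last colon only.
            clustering, _, column = name.rpartition(":")
            if not column:
                continue  # skip the assumed row marker cell
            rows[(partition["key"], clustering)][column] = value
    return dict(rows)


if __name__ == "__main__":
    for row_id, columns in sorted(group_cells(SAMPLE).items()):
        print(row_id, columns)
```

Values stay as the strings found in the dump; mapping them back to CQL types would still need the table schema, which is the part the thread describes as difficult.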