My current code uses SSTableReader (as sstable2json does) and then tries to interpret the bytes with the help of CFMetaData.
Any pointers for where in the Cassandra codebase I could see how this is done?

Thanks for the help.

2015-05-26 16:07 GMT+02:00 Tyler Hobbs <ty...@datastax.com>:
> Trying to parse and export an sstable at a higher, CQL level with the
> current codebase is going to be pretty tough. Handling static columns,
> collections (multi-cell columns), and the four minor variants of sstable
> formats (sparse vs dense, composite vs simple) is not easy. If you want to
> handle things at a CQL level, you should probably go through the normal
> read path.
>
> With that said, CASSANDRA-8099 will substantially change the format of
> sstables to more closely match CQL, making this more feasible.
>
> On Tue, May 26, 2015 at 8:49 AM, Malcolm Matalka <malc...@spotify.com>
> wrote:
>
>> Thanks Tyler,
>>
>> The problem with sstable2json is that it does not support the CQL
>> types as far as I can see and there isn't any indication as to how to
>> modify it to do that. It seems like the CQL things are a layer above
>> the SSTable.
>>
>> 2015-05-26 15:44 GMT+02:00 Tyler Hobbs <ty...@datastax.com>:
>> > I would start by looking at sstable2json. It may be simplest for you to
>> > run sstable2json and then process the resulting json. If that's not
>> > adequate, modifying the sstable2json code is probably your best bet.
>> >
>> > On Mon, May 25, 2015 at 11:12 AM, Malcolm Matalka <malc...@spotify.com>
>> > wrote:
>> >
>> >> Hello,
>> >>
>> >> For efficiency reasons I am trying to parse the raw SSTable files in
>> >> order to transform them into another format. I understand this is
>> >> like poking a sleeping beast and there aren't many guarantees around
>> >> this but I'm asking if anyone has any pointers to make this possible?
>> >> In a search I have stumbled upon FullContact's SSTable parser, but it
>> >> does not parse the complicated data structures that CQL supports. In
>> >> attempting to reverse engineer how Cassandra handles the actual data
>> >> there are a few cases that are unclear and I'm concerned that my
>> >> attempts to interpret them will result in a fragile result.
>> >>
>> >> Are there any suggestions? Existing libraries? Tips on how Cassandra
>> >> parses the data itself? Pointers into the code to read? SSTable
>> >> design doc?
>> >>
>> >> Thanks,
>> >> /Malcolm
>> >>
>> >
>> >
>> >
>> > --
>> > Tyler Hobbs
>> > DataStax <http://datastax.com/>
>>
>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
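
For reference, here is a rough Python sketch of the "run sstable2json and then process the resulting json" suggestion from earlier in the thread. Everything specific in it is an assumption rather than something stated in the thread: the JSON shape (a list of partitions, each with a "key" and a "cells" list of [name, value, timestamp, optional flag] entries) is modeled on what the 2.1-era tool appears to emit, the sample data is made up, and group_cells is a hypothetical helper name. It splits composite cell names on ":" naively, so it does not handle collections, static columns, or clustering values that themselves contain a colon, which are the cases described in the thread as hard.

```python
import json

# Made-up sample, shaped like the ASSUMED sstable2json output for a CQL table
# with one text clustering column and one regular column named "score".
SAMPLE = """
[
  {"key": "user1",
   "cells": [["a:", "", 1432570000000000],
             ["a:score", "42", 1432570000000000],
             ["b:", "", 1432660000000000],
             ["b:score", 1432660000, 1432660000000001, "d"]]}
]
"""


def group_cells(partitions):
    """Group flat cells into CQL-like rows keyed by (partition key, clustering).

    Hypothetical helper: assumes the last ":"-separated component of a cell
    name is the CQL column name and the preceding components are the
    clustering values.
    """
    rows = {}
    for partition in partitions:
        key = partition["key"]
        for cell in partition.get("cells", []):
            name, value = cell[0], cell[1]
            # Assumption: a fourth element of "d" or "t" marks a tombstone.
            if len(cell) > 3 and cell[3] in ("d", "t"):
                continue
            *clustering, column = name.split(":")
            row = rows.setdefault((key, tuple(clustering)), {})
            # An empty column name is the row marker: it creates the row
            # but carries no column value.
            if column:
                row[column] = value
    return rows


if __name__ == "__main__":
    for row_id, columns in sorted(group_cells(json.loads(SAMPLE)).items()):
        print(row_id, columns)
```

The values stay as the strings the tool printed; turning them back into CQL types would still need the table schema, which is the gap raised above.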