Hello,

For efficiency reasons I am trying to parse the raw SSTable files in order to transform them into another format. I understand this is like poking a sleeping beast and that there aren't many guarantees around it, but does anyone have pointers to make this possible?

In a search I stumbled upon FullContact's SSTable parser, but it does not parse the complicated data structures that CQL supports. In attempting to reverse engineer how Cassandra handles the actual data, I have found a few cases that are unclear, and I'm concerned that my attempts to interpret them will produce a fragile implementation.

Are there any suggestions? Existing libraries? Tips on how Cassandra parses the data itself? Pointers into the code to read? An SSTable design doc?

Thanks,
/Malcolm