Hi,

I'm interested in the SSTable index file format, particularly in Cassandra 2.2, which uses the SSTable version "ma". Apart from the keys and their corresponding offsets in the data file, what else is included in each index entry?

I'm trying to trace the code that runs when an SSTable is flushed (especially in the class BigTableWriter.java). I see that each RowIndexEntry may contain a ColumnIndex, which in turn holds a list of IndexHelper.IndexInfo entries.
So I would expect the index format to be something like this:
<key><fields_list><offset_in_datafile>

On the other hand, it seems that the ColumnIndex does not contain all the columns of the data row.

Let me give you an example.
Assume the following schema for a column family:
CREATE TABLE ycsb.usertable (y_id varchar PRIMARY KEY, field0 varchar, field1 varchar, field2 varchar);

In this case, if I execute the queries below:
INSERT INTO ycsb.usertable (y_id, field0, field1, field2) VALUES ('k1', 'f1a', 'f1b', 'f1c');
INSERT INTO ycsb.usertable (y_id, field0) VALUES ('k2', 'f2a');

and then flush the table, I would expect the index to have the following info:
k1, [field0, field1, field2], <offset>
k2, [field0], <offset>
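
To make that guess concrete, here is a small Python sketch of the layout I have in mind for the two rows above. Everything in it is my own assumption for illustration only: the class name GuessedIndexEntry, the length prefixes, the big-endian field widths and the placeholder offsets are all hypothetical, and none of it is taken from Cassandra's code or from the real "ma" format.

```python
import struct
from dataclasses import dataclass, field
from typing import List


@dataclass
class GuessedIndexEntry:
    """Hypothetical entry: <key><fields_list><offset_in_datafile>."""
    key: bytes
    fields: List[bytes] = field(default_factory=list)
    offset: int = 0

    def serialize(self) -> bytes:
        # Assumed encoding: 2-byte key length, key bytes,
        # 4-byte field count, length-prefixed field names, 8-byte offset.
        out = struct.pack(">H", len(self.key)) + self.key
        out += struct.pack(">I", len(self.fields))
        for name in self.fields:
            out += struct.pack(">H", len(name)) + name
        out += struct.pack(">q", self.offset)
        return out

    @classmethod
    def deserialize(cls, buf: bytes) -> "GuessedIndexEntry":
        pos = 0
        (key_len,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        key = buf[pos:pos + key_len]
        pos += key_len
        (count,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        names = []
        for _ in range(count):
            (name_len,) = struct.unpack_from(">H", buf, pos)
            pos += 2
            names.append(buf[pos:pos + name_len])
            pos += name_len
        (offset,) = struct.unpack_from(">q", buf, pos)
        return cls(key, names, offset)


# The two entries I would expect after the flush (offsets are placeholders).
k1 = GuessedIndexEntry(b"k1", [b"field0", b"field1", b"field2"], 0)
k2 = GuessedIndexEntry(b"k2", [b"field0"], 100)
```

If the real index entry stores something else between the key and the offset, I would like to know which part of this guess is wrong.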

Is this correct?
Is there a documentation page that describes the format of the index file?
