> It means that uncompressed sstable data is compressed to approximately
> chunk_length_kb and every read needs to read approximately chunk_length_kb and
> decompress it to read any value from compressed range ?
>
> Or it means approximately chunk_length_kb of sstable data is compressed and
> stored on disk, so similar values must be in chunk_length_kb range to make
> compression efficient ?

Pretty much the second one. We compress the sstable data in blocks of chunk_length_kb (that is, chunk_length_kb of uncompressed data), yielding a number of compressed blocks that are hopefully smaller than that. It does mean, however, that every read needs to read and deserialize a full (compressed) block and decompress it to fetch any value within this block.

Sylvain
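
For illustration only, here is a minimal Python sketch of the block-wise scheme described above. It is not Cassandra's actual code or on-disk format: zlib stands in for whichever compressor is configured, and CHUNK_LENGTH, compress_in_chunks and read_range are hypothetical names made up for this example. It only shows that the chunk size applies to uncompressed data, and that reading even a few bytes means decompressing every chunk the range touches.

```python
import zlib

# Hypothetical chunk size in bytes (stand-in for chunk_length_kb = 64).
CHUNK_LENGTH = 64 * 1024


def compress_in_chunks(data, chunk_length=CHUNK_LENGTH):
    """Split uncompressed data into fixed-size chunks and compress each one independently."""
    return [
        zlib.compress(data[i:i + chunk_length])
        for i in range(0, len(data), chunk_length)
    ]


def read_range(chunks, offset, length, chunk_length=CHUNK_LENGTH):
    """Return `length` bytes starting at uncompressed `offset`.

    Every chunk overlapping the requested range is decompressed in full,
    even if only a handful of bytes are needed from it.
    """
    first = offset // chunk_length
    last = (offset + length - 1) // chunk_length
    block = b"".join(zlib.decompress(chunks[i]) for i in range(first, last + 1))
    start = offset - first * chunk_length
    return block[start:start + length]


if __name__ == "__main__":
    data = bytes(range(256)) * 1000          # 256000 bytes of sample data
    chunks = compress_in_chunks(data)
    print(len(chunks), "chunks,", sum(len(c) for c in chunks), "compressed bytes")
    print(read_range(chunks, 100, 10) == data[100:110])
```

A smaller chunk size means less data to decompress per read, while a larger one gives the compressor more similar data to work with, which is the trade-off behind the question.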
