"does it typically have to read in the entire SSTable into memory (assuming
the bloom filter said yes)?" --> No, that would be a performance killer.

 On the read path, after the Bloom filter, Cassandra checks the "Partition
Key Cache" to see whether the partition it is looking for is present there.

 If yes, it gets the offset (from the beginning of the SSTable), which lets
it skip a lot of data and move the disk head directly there.
 If not, it relies on the "Partition sample" (a sample of the partition
index) to move the disk head to the nearest location of the sought partition.
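
 To make the two cases above concrete, here is a toy sketch in Python. It is
not Cassandra's actual code or file format: the names (index, sample_keys,
key_cache, find_offset) and the sampling interval of 128 are assumptions
made up for illustration. It only shows the idea of "cache hit gives the
offset directly, otherwise use a sparse sample to land near the key and scan
a short stretch of the index".

```python
import bisect

# Hypothetical toy model, not Cassandra's real classes or on-disk layout.
# "index" stands in for the partition index: sorted (key, data_file_offset).
index = [("k%03d" % i, i * 100) for i in range(1000)]

# Keep every 128th index entry in memory (128 is an assumed interval).
SAMPLE_INTERVAL = 128
sample_pos = list(range(0, len(index), SAMPLE_INTERVAL))
sample_keys = [index[p][0] for p in sample_pos]

key_cache = {}  # partition key -> offset in the data file


def find_offset(key):
    """Return (offset, index_entries_scanned); offset is None if absent."""
    # Case 1: key cache hit, the offset is known without touching the index.
    if key in key_cache:
        return key_cache[key], 0

    # Case 2: find the last sampled key <= key, then scan forward from there.
    i = bisect.bisect_right(sample_keys, key) - 1
    if i < 0:
        return None, 0  # key sorts before everything in this table
    start = sample_pos[i]
    scanned = 0
    for k, offset in index[start:start + SAMPLE_INTERVAL]:
        scanned += 1
        if k == key:
            key_cache[key] = offset  # remember it for the next read
            return offset, scanned
        if k > key:
            break  # passed the spot where the key would be
    return None, scanned
```

 The first find_offset("k300") scans at most one sample interval of index
entries; a repeated call is answered from key_cache with no scan at all.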

 If compression is on (the default), there is one more step before
hitting disk: the compression offset. It is a translation table that maps
uncompressed file offsets to compressed file offsets.
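
 As a rough illustration of that translation table, here is a small Python
sketch. It is only a model of the idea, not Cassandra's implementation: the
chunk size, the use of zlib, and the function names (compress_chunks,
read_at) are all assumptions of mine. Data is compressed in fixed-size
chunks, and a list records where each compressed chunk starts, so a read at
an uncompressed offset only has to decompress the chunk(s) it touches.

```python
import zlib

CHUNK = 64 * 1024  # assumed uncompressed chunk size for this sketch


def compress_chunks(data):
    """Compress data chunk by chunk; return (blob, compressed_offsets)."""
    blob = bytearray()
    offsets = []  # offsets[i] = start of chunk i inside the compressed blob
    for i in range(0, len(data), CHUNK):
        offsets.append(len(blob))
        blob += zlib.compress(data[i:i + CHUNK])
    return bytes(blob), offsets


def read_at(blob, offsets, pos, length):
    """Read `length` bytes starting at *uncompressed* offset `pos`."""
    out = bytearray()
    while length > 0:
        # Translate the uncompressed position into (chunk number, offset in chunk).
        chunk_no, within = divmod(pos, CHUNK)
        if chunk_no >= len(offsets):
            break  # past the end of the data
        start = offsets[chunk_no]
        end = offsets[chunk_no + 1] if chunk_no + 1 < len(offsets) else len(blob)
        chunk = zlib.decompress(blob[start:end])  # only this chunk is inflated
        piece = chunk[within:within + length]
        if not piece:
            break
        out += piece
        pos += len(piece)
        length -= len(piece)
    return bytes(out)
```

 In this model a small read costs one chunk of decompression rather than the
whole file, which is the point of keeping the offset table around.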


On Wed, Sep 24, 2014 at 10:07 PM, Donald Smith <
[email protected]> wrote:

>  We’re using cassandra as a key-value store; our values are small.  So
> we’re thinking we don’t need much disk readahead (e.g., “blockdev --getra
> /dev/sda”).   We’re using SSDs.
>
>
>
> When cassandra does disk seeks to satisfy read requests does it typically
> have to read in the entire SStable into memory (assuming the bloom filter
> said yes)?  If cassandra needs to read in lots of blocks anyway or if it
> needs to read the entire file during compaction then I'd expect we might as
> well have a big readahead.   Perhaps there’s a tradeoff between read
> latency and compaction time.
>
>
>
> Any feedback welcome.
>
>
> Thanks
>
>
>
> *Donald A. Smith* | Senior Software Engineer
> P: 425.201.3900 x 3866
> C: (206) 819-5965
> F: (646) 443-2333
> [email protected]
>
>
