On Wed, Oct 22, 2014 at 4:34 AM, Thomas Whiteway <
thomas.white...@metaswitch.com> wrote:

> I’m working on an application using a Cassandra (2.1.0) cluster where:
>
> - our entire dataset is around 22GB
> - each node has 48GB of memory but only a single (mechanical) hard disk
> - in normal operation we have a low level of writes and no reads
> - very occasionally we need to read rows very fast (>1.5K rows/second),
>   and only read each row once.
>
> When we try to read the rows, it takes up to five minutes before Cassandra
> is able to keep up.  The problem seems to be that it takes a while to get
> the data into the page cache, and until then Cassandra can’t retrieve the
> data from disk fast enough (e.g. if I drop the page cache mid-test, then
> Cassandra slows down for the next 5 minutes).
>

Use:

populate_io_cache_on_flush

It's designed for this case. Here "flush" also covers the "flush" that
comes at the end of compaction, not just memtable flushes.

Kevin Burton's (hi! :D) https://code.google.com/p/linux-ftools/ will help
you keep the SSTables in the page cache, for example when rebooting nodes.
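
If you'd rather not install anything, the same idea can be roughed out in a
few lines of Python. This is only a sketch of the general technique (hinting
the kernel to pull files into the page cache via posix_fadvise), not how
linux-ftools itself is implemented. The function name warm_page_cache and the
"-Data.db" suffix filter are my own assumptions for illustration; adjust them
to your layout.

```python
import os


def warm_page_cache(data_dir, suffix="-Data.db"):
    """Ask the kernel to load matching files under data_dir into the page cache.

    Hypothetical helper: the name and the default suffix are assumptions.
    Returns the total size in bytes of the files that were hinted or read.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(data_dir):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            fd = os.open(path, os.O_RDONLY)
            try:
                size = os.fstat(fd).st_size
                if hasattr(os, "posix_fadvise"):
                    # Advisory only: the kernel may read ahead asynchronously
                    # and is free to ignore the hint under memory pressure.
                    os.posix_fadvise(fd, 0, size, os.POSIX_FADV_WILLNEED)
                else:
                    # Fallback where posix_fadvise is unavailable: a plain
                    # sequential read also fills the page cache.
                    while os.read(fd, 1 << 20):
                        pass
                total += size
            finally:
                os.close(fd)
    return total
```

You would point it at your data directory, e.g.
warm_page_cache("/var/lib/cassandra/data") if you use that path (check
data_file_directories in your cassandra.yaml), after a reboot and before the
burst of reads. With 22GB of data and 48GB of RAM the whole set should fit,
but on a single mechanical disk the initial read will still take a while.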

=Rob
