Last time I checked, it took about 120 seconds to load up 21,125 keys totalling about 500MB in memory (we have pretty wide rows :). So that works out to roughly 4 MB/sec.
Just curious, Andras: how do you manage such a big row cache (10-15GB currently)? The usual recommendation is to keep the row cache at about 10% of your heap, so is your heap over 100GB? The largest heap DataStax recommends is 8GB, and that appears to be a hardcoded cap in cassandra-env.sh (# calculate 1/4 ram and cap to 8192MB). Does your GC hold up with such a big heap? In my experience, a full GC can take over 20 seconds on a heap that size.

Thanks.

-Wei

________________________________
From: aaron morton <[email protected]>
To: [email protected]
Sent: Monday, November 19, 2012 1:00 PM
Subject: Re: row cache re-fill very slow

> i was just wondering if anyone else is experiencing very slow ( ~ 3.5 MB/sec ) re-fill of the row cache at start up.
It was mentioned the other day.

What version are you on?
Do you know how many rows were loaded? When it completes, it logs a message with the pattern "completed loading (%d ms; %d keys) row cache for %s.%s".

> How is the "saved row cache file" processed?
In version 1.1, after the SSTables have been opened, the keys in the saved row cache are read one at a time and the whole row is read into memory. This is a single-threaded operation.

In 1.2 reading the saved cache is still single threaded, but the row reads go through the read thread pool, so they run in parallel.

In both cases I do not believe the cache is stored in token (or key) order.

> ( Admittedly whatever is going on is still much more preferable to starting with a cold row cache )
row_cache_keys_to_save in the yaml may help you find a happy halfway point.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 20/11/2012, at 3:17 AM, Andras Szerdahelyi <[email protected]> wrote:

> Hey list,
>
> i was just wondering if anyone else is experiencing very slow ( ~ 3.5 MB/sec ) re-fill of the row cache at start up. We operate with a large row cache ( 10-15GB currently ) and we already measure startup times in hours :-)
>
> How is the "saved row cache file" processed? Are the cached row keys simply iterated over and their respective rows read from SSTables - possibly creating random reads with small enough sstable files, if the keys were not stored in a manner optimised for a quick re-fill? - or is there a smarter algorithm (i.e. scan through one sstable at a time, filter the rows that should be in the row cache) at work, making this operation purely disk-I/O bound?
>
> ( Admittedly whatever is going on is still much more preferable to starting with a cold row cache )
>
> thanks!
> Andras
>
> Andras Szerdahelyi
> Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A
> M: +32 493 05 50 88 | Skype: sandrew84
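To make the difference Aaron describes above concrete, here is a minimal sketch of the two startup paths. This is not Cassandra's actual code: the Store interface, class and method names are made up for illustration, and only the threading shape follows the description in the thread (1.1: one thread reads each saved key's row before moving to the next; 1.2: the walk over saved keys is still single threaded, but each row read is submitted to the read thread pool, so disk reads overlap).

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Hypothetical sketch of pre-loading a saved row cache. Names are illustrative,
    // not Cassandra's real classes.
    class RowCachePreload {

        interface Store {
            byte[] readWholeRow(String key);        // random read against the SSTables
            void cacheRow(String key, byte[] row);  // put the row into the row cache
        }

        // 1.1-style warm-up: a single thread walks the saved keys and reads each
        // row to completion before moving on, so throughput is bounded by one reader.
        static void loadSequentially(List<String> savedKeys, Store store) {
            for (String key : savedKeys) {
                store.cacheRow(key, store.readWholeRow(key));
            }
        }

        // 1.2-style warm-up: the walk over saved keys is still single threaded, but
        // each row read is handed to a read thread pool, so several reads are in
        // flight against the disks at once.
        static void loadInParallel(List<String> savedKeys, Store store, int readThreads)
                throws Exception {
            ExecutorService readStage = Executors.newFixedThreadPool(readThreads);
            List<Future<?>> pending = new ArrayList<>();
            for (String key : savedKeys) {
                pending.add(readStage.submit(
                        () -> store.cacheRow(key, store.readWholeRow(key))));
            }
            for (Future<?> f : pending) {
                f.get();                            // block until the warm-up finishes
            }
            readStage.shutdown();
        }
    }

Since the saved keys are not stored in token or key order, both paths tend to issue random reads; the parallel version just keeps more of them outstanding.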

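The halfway point Aaron mentions lives in cassandra.yaml. A hedged example, assuming the 1.1/1.2-era setting named in the thread; the value is purely illustrative, and leaving the option unset should keep the default behaviour of saving every cached key:

    # Save only the N hottest row cache keys when the cache is written to disk,
    # so the next startup has fewer rows to re-read (value is illustrative).
    row_cache_keys_to_save: 200000

Saving fewer keys trades a partially cold cache for a shorter warm-up; at the ~3.5-4 MB/sec rates reported in this thread, restart time scales roughly linearly with how much cached data has to be re-read.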