Last time I checked, it took about 120 seconds to load up 21,125 keys totalling about 500MB in memory (we have pretty wide rows :). So that works out to roughly 4 MB/sec.
Just curious, Andras: how do you manage such a big row cache (10-15GB currently)? The usual recommendation is to keep the row cache at about 10% of your heap, so is your heap over 100GB? The largest heap DataStax recommends is 8GB, and that appears to be a hardcoded cap in cassandra-env.sh (# calculate 1/4 ram and cap to 8192MB). Does your GC hold up with such a big heap? In my experience, a full GC can take over 20 seconds on a heap that size.

Thanks.

-Wei

________________________________
From: aaron morton <[email protected]>
To: [email protected]
Sent: Monday, November 19, 2012 1:00 PM
Subject: Re: row cache re-fill very slow

> i was just wondering if anyone else is experiencing very slow ( ~ 3.5 MB/sec ) re-fill of the row cache at start up.
It was mentioned the other day.

What version are you on?
Do you know how many rows were loaded? When it completes, it logs a message with the pattern "completed loading (%d ms; %d keys) row cache for %s.%s".

> How is the "saved row cache file" processed?
In version 1.1, after the SSTables have been opened, the keys in the saved row cache are read one at a time and the whole row is read into memory. This is a single-threaded operation.

In 1.2 reading the saved cache is still single threaded, but the row reads go through the read thread pool, so they run in parallel.

In both cases I do not believe the cache is stored in token (or key) order.

> ( Admittedly whatever is going on is still much more preferable to starting with a cold row cache )
row_cache_keys_to_save in the yaml may help you find a happy halfway point.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 20/11/2012, at 3:17 AM, Andras Szerdahelyi <[email protected]> wrote:

> Hey list,
>
> i was just wondering if anyone else is experiencing very slow ( ~ 3.5 MB/sec ) re-fill of the row cache at start up. We operate with a large row cache ( 10-15GB currently ) and we already measure startup times in hours :-)
>
> How is the "saved row cache file" processed? Are the cached row keys simply iterated over and their respective rows read from SSTables - possibly creating random reads with small enough sstable files, if the keys were not stored in a manner optimised for a quick re-fill? - or is there a smarter algorithm (i.e. scan through one sstable at a time, filter the rows that should be in the row cache) at work, making this operation purely disk-I/O bound?
>
> ( Admittedly whatever is going on is still much more preferable to starting with a cold row cache )
>
> thanks!
> Andras
>
> Andras Szerdahelyi
> Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A
> M: +32 493 05 50 88 | Skype: sandrew84
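To make the difference Aaron describes above concrete, here is a minimal sketch of the two startup paths. This is not Cassandra's actual code: the Store interface, class and method names are made up for illustration, and only the threading shape follows the description in the thread (1.1: one thread reads each saved key's row before moving to the next; 1.2: the walk over saved keys is still single threaded, but each row read is submitted to the read thread pool, so disk reads overlap).

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Hypothetical sketch of pre-loading a saved row cache. Names are illustrative,
    // not Cassandra's real classes.
    class RowCachePreload {

        interface Store {
            byte[] readWholeRow(String key);        // random read against the SSTables
            void cacheRow(String key, byte[] row);  // put the row into the row cache
        }

        // 1.1-style warm-up: a single thread walks the saved keys and reads each
        // row to completion before moving on, so throughput is bounded by one reader.
        static void loadSequentially(List<String> savedKeys, Store store) {
            for (String key : savedKeys) {
                store.cacheRow(key, store.readWholeRow(key));
            }
        }

        // 1.2-style warm-up: the walk over saved keys is still single threaded, but
        // each row read is handed to a read thread pool, so several reads are in
        // flight against the disks at once.
        static void loadInParallel(List<String> savedKeys, Store store, int readThreads)
                throws Exception {
            ExecutorService readStage = Executors.newFixedThreadPool(readThreads);
            List<Future<?>> pending = new ArrayList<>();
            for (String key : savedKeys) {
                pending.add(readStage.submit(
                        () -> store.cacheRow(key, store.readWholeRow(key))));
            }
            for (Future<?> f : pending) {
                f.get();                            // block until the warm-up finishes
            }
            readStage.shutdown();
        }
    }

Since the saved keys are not stored in token or key order, both paths tend to issue random reads; the parallel version just keeps more of them outstanding.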

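The halfway point Aaron mentions lives in cassandra.yaml. A hedged example, assuming the 1.1/1.2-era setting named in the thread; the value is purely illustrative, and leaving the option unset should keep the default behaviour of saving every cached key:

    # Save only the N hottest row cache keys when the cache is written to disk,
    # so the next startup has fewer rows to re-read (value is illustrative).
    row_cache_keys_to_save: 200000

Saving fewer keys trades a partially cold cache for a shorter warm-up; at the ~3.5-4 MB/sec rates reported in this thread, restart time scales roughly linearly with how much cached data has to be re-read.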