> Ravi,
>
> Sorry not getting back to you sooner.

No problems at all.

> There were many times when I
> used it when it would completely fill the configured block cache. This
> would in turn cause things that were actually being used to be pushed out
> of the BC and performance would suffer

Yes, quite true. Some sort of read storm in the network could cause
problems. I saw a ThrottledIndexInput in old-code. I guess it was meant for
this...

> Now the auto save and load warmup feature would be a completely different
> piece that I think would be of great benefit for restarts / failures.

Yup. This would be a good feature for single-machine failures/restarts. But
I just realized that the worst case for this auto-save/load will be
cluster-wide restarts. Do you think ThrottledIndexInput can serve us well
here too, or do we need some higher-order construct?
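To make that question concrete, the construct I am picturing is a delegating
IndexInput that takes permits from a rate limiter shared by every warmup read
on the node. This is only a sketch against the Lucene 4.x IndexInput API and
Guava's RateLimiter; apart from the ThrottledIndexInput name from old-code,
all the names here are my guesses:

import java.io.IOException;

import org.apache.lucene.store.IndexInput;

import com.google.common.util.concurrent.RateLimiter;

// Sketch only: throttles every read through a RateLimiter shared across
// all warmup inputs on the node, so warmup cannot flood HDFS or the disks.
public class ThrottledIndexInput extends IndexInput {

  private final IndexInput delegate;
  private final RateLimiter limiter; // shared; permits are bytes per second

  public ThrottledIndexInput(IndexInput delegate, RateLimiter limiter) {
    super("ThrottledIndexInput(" + delegate + ")");
    this.delegate = delegate;
    this.limiter = limiter;
  }

  @Override
  public byte readByte() throws IOException {
    limiter.acquire(1); // one permit per byte
    return delegate.readByte();
  }

  @Override
  public void readBytes(byte[] b, int offset, int len) throws IOException {
    limiter.acquire(Math.max(1, len)); // permits proportional to bytes read
    delegate.readBytes(b, offset, len);
  }

  @Override
  public long getFilePointer() {
    return delegate.getFilePointer();
  }

  @Override
  public void seek(long pos) throws IOException {
    delegate.seek(pos);
  }

  @Override
  public long length() {
    return delegate.length();
  }

  @Override
  public void close() throws IOException {
    delegate.close();
  }
}

A single RateLimiter.create(maxBytesPerSecond) shared per shard server would
cap warmup reads on one node. For the cluster-wide restart case, the
higher-order construct might be nothing more than handing each node a smaller
budget, or staggering shard opens, so the combined warmup load on HDFS stays
bounded.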
--
Ravi

On Thu, Apr 23, 2015 at 5:20 PM, Aaron McCurry <[email protected]> wrote:

> Ravi,
>
> Sorry not getting back to you sooner.
>
> I agree that auto saving and loading of the blocks that are actually used
> would be a benefit. However, the previous warmup process in practice pulled
> in too much data that was never accessed. There were many times when I
> used it when it would completely fill the configured block cache. This
> would in turn cause things that were actually being used to be pushed out
> of the BC and performance would suffer. So we ran Blur for some time with
> the warmup process disabled, and for the most part the entire system worked
> better.
>
> Now the auto save and load warmup feature would be a completely different
> piece that I think would be of great benefit for restarts / failures.
>
> Aaron
>
> On Thu, Apr 23, 2015 at 7:29 AM, Ravikumar Govindarajan <
> [email protected]> wrote:
>
> > Oh, I am sorry…
> >
> > I see that the Blur warmup code has now been completely removed from
> > the repository. Are there reasons for doing so?
> >
> > I thought auto-saving/loading of the block-cache could benefit nicely
> > from it.
> >
> > --
> > Ravi
> >
> > On Tue, Apr 21, 2015 at 6:06 PM, Ravikumar Govindarajan <
> > [email protected]> wrote:
> >
> > > I was just looking at the Blur warmup logic. I would classify it into
> > > two stages...
> > >
> > > Stage I
> > > It looks like openShard [DistributedIndexServer] submits the warmup
> > > request on a separate warmupExecutor. This is exactly what is needed
> > > for loading an auto-saved block-cache from HDFS...
> > >
> > > Stage II
> > > But when I prodded a little deeper, it got complex: TraceableDirectory,
> > > IndexTracer with thread-local stuff, etc… I could not follow the
> > > code...
> > >
> > > I decided on an impl as follows...
> > >
> > > public class BlockCacheWarmup extends BlurIndexWarmup {
> > >
> > >   @Override
> > >   public void warmBlurIndex(final TableDescriptor table, final String shard,
> > >       IndexReader reader, AtomicBoolean isClosed, ReleaseReader releaseReader,
> > >       AtomicLong pauseWarmup) throws IOException {
> > >     for (each segment) {
> > >       for (each file) {
> > >         // Read the cache-metadata from HDFS...
> > >         // Directly open a CacheIndexInput and populate the block-cache
> > >       }
> > >     }
> > >   }
> > > }
> > >
> > > My question is…
> > >
> > > If I explicitly bypass Stage II {TraceableDirectory and friends} and
> > > just populate the block-cache alone, will this work fine? Am I missing
> > > something obvious?
> > >
> > > Any help is much appreciated…
> > >
> > > --
> > > Ravi
> > > On Fri, Feb 6, 2015 at 2:58 PM, Aaron McCurry <[email protected]>
> > > wrote:
> > >
> > > > Yes, exactly. That way we could provide a set of blocks to be cached
> > > > with priority, so the most important bits get cached first.
> > > >
> > > > Aaron
> > > >
> > > > On Fri, Feb 6, 2015 at 12:43 AM, Ravikumar Govindarajan <
> > > > [email protected]> wrote:
> > > >
> > > > > That's a great idea...
> > > > >
> > > > > You mean that instead of saving the blocks themselves, we can store
> > > > > metadata {block-ids} in HDFS for each file/shard that is written to
> > > > > the block-cache...
> > > > >
> > > > > Opening a shard can then use this metadata to re-populate the hot
> > > > > parts of the files...
> > > > >
> > > > > We also need to handle evictions & file-deletes...
> > > > >
> > > > > Is this what you are hinting at?
> > > > >
> > > > > --
> > > > > Ravi
> > > > >
> > > > > On Thu, Feb 5, 2015 at 7:03 PM, Aaron McCurry <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > On Thu, Feb 5, 2015 at 6:30 AM, Ravikumar Govindarajan <
> > > > > > [email protected]> wrote:
> > > > > >
> > > > > > > I noticed in the BigTable-style impl of Cassandra that they
> > > > > > > store the "Memtable" info periodically onto disk to avoid cold
> > > > > > > start-ups...
> > > > > > >
> > > > > > > Is it possible to do something like that for Blur's block-cache,
> > > > > > > preferably in HDFS itself, so that both cold start-ups and shard
> > > > > > > take-overs don't affect end-user latencies...
> > > > > > >
> > > > > > > In Cassandra's case, the size of a Memtable will typically be
> > > > > > > 2 GB-4 GB. But in the case of Blur, it could even be 100 GB. So
> > > > > > > I don't know if attempting such stuff is a good idea.
> > > > > > >
> > > > > > > Any help is much appreciated...
> > > > > > >
> > > > > > > --
> > > > > > > Ravi
> > > > > >
> > > > > > Yeah, I agree that caches could be very large and storing them in
> > > > > > HDFS could be counterproductive. Also, the block cache represents
> > > > > > what is on the single node, and it's not really broken up by shard
> > > > > > or table. So if a node was restarted without a full cluster
> > > > > > restart, there's no guarantee that the shard server will get back
> > > > > > the same shards it was serving before.
> > > > > >
> > > > > > I like the idea, though. Perhaps we can write out what parts of
> > > > > > which files the cache was storing, with the LRU order. Then any
> > > > > > server that is opening the shard can know what parts of which
> > > > > > files were hot the last time it was open, and could choose to
> > > > > > populate the cache upon shard opening.
> > > > > >
> > > > > > Thoughts?
> > > > > >
> > > > > > Aaron
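P.S. For the metadata idea in the older part of the thread above, the format
could be as simple as a per-shard file in HDFS holding (file-name, block-id)
pairs written in LRU order, hottest first, so the loader can stop early once
the cache has enough and never push live entries out the way the old warmup
did. A rough sketch only; BlockCacheMetadata and every name in it are
invented, not existing Blur classes:

import java.io.IOException;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: saves and loads the hot-block list for one shard. Each entry is
// (file name within the shard, block id within the file), ordered hottest
// first so even a partial load warms the most important blocks.
public class BlockCacheMetadata {

  public static void save(FileSystem fs, Path path,
      List<Map.Entry<String, Long>> hotBlocksInLruOrder) throws IOException {
    FSDataOutputStream out = fs.create(path, true); // overwrite old snapshot
    try {
      out.writeInt(hotBlocksInLruOrder.size());
      for (Map.Entry<String, Long> block : hotBlocksInLruOrder) {
        out.writeUTF(block.getKey());    // file name
        out.writeLong(block.getValue()); // block id
      }
    } finally {
      out.close();
    }
  }

  public static List<Map.Entry<String, Long>> load(FileSystem fs, Path path)
      throws IOException {
    List<Map.Entry<String, Long>> hotBlocks =
        new ArrayList<Map.Entry<String, Long>>();
    FSDataInputStream in = fs.open(path);
    try {
      int count = in.readInt();
      for (int i = 0; i < count; i++) {
        String fileName = in.readUTF();
        long blockId = in.readLong();
        hotBlocks.add(new AbstractMap.SimpleEntry<String, Long>(fileName, blockId));
      }
    } finally {
      in.close();
    }
    return hotBlocks;
  }
}

Evictions and file-deletes could then be handled lazily at load time: skip
any pair whose file no longer exists in the shard directory.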
