Ideally those shouldn't be configurable; we should just set them to a level that makes more sense. If we do make them configurable and leave it at that, then we'll get questions like "what acceptable/min factor should I use?" and we'll spend hours going back and forth on the ML for minimal results.
Currently, having the acceptable factor set to where it is just means that we're using less memory than configured, e.g. if you need to cache 2GB per machine, set hfile.block.cache.size so that it works out to ~2.35GB and you'll have it.

The real issue is the minimum factor. The idea is that we don't want to overflow the configured maximum size while we're evicting. The problems:

- Evicting 10% of the cache (85-75) is pretty hardcore; it means that if you evict often then you're never close to using 85% of your cache.
- Evictions are purely CPU-bound and in my tests are almost never likely to be so slow that you reach 100% utilization (whereas loading the cache usually means you need to read from disk). It was too slow for caches of up to 32MB with data generated in-memory.
- Considering the previous two problems, it seems we should set the minimum factor close to the acceptable one, but on big caches this would waste a lot of CPU cycles (I haven't quantified that yet; I'm just stating this from experience).

So back to HBASE-6312: at the moment I think we should just set the minimum factor 5% closer to the acceptable one. Jie Huang doesn't mention whether, in their tests, their customers compared the caching ratio for caches of the same size but with different acceptable factors, or whether they tried to compare apples to apples. What I'm trying to say, going back to my earlier example, is that if they compared two caches with hfile.block.cache.size=0.2 but with different acceptable factors then, well, yes, the one with the bigger acceptable factor will win... because it's using a bigger cache.
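To put numbers on the example above, here's a back-of-the-envelope sketch (not HBase code; the class and method names are made up, and 0.85/0.75 are the current hardcoded defaults for the acceptable and minimum factors):

```java
// Illustrative sketch of how the two LruBlockCache watermarks interact.
// These are hypothetical helpers, not HBase's actual implementation.
public class BlockCacheMath {
    /** Bytes the cache actually holds in steady state: it hovers at the
     *  acceptable watermark, not at the configured maximum. */
    static long usableBytes(long maxSize, double acceptableFactor) {
        return (long) (maxSize * acceptableFactor);
    }

    /** Bytes freed by one eviction pass: the gap between the acceptable
     *  and minimum watermarks (10% of the cache with 0.85/0.75). */
    static long evictedPerPass(long maxSize, double acceptableFactor, double minFactor) {
        return (long) (maxSize * (acceptableFactor - minFactor));
    }

    public static void main(String[] args) {
        long maxSize = 2_350_000_000L; // ~2.35GB configured, per the example above
        // ~2GB actually cached: the acceptable factor eats the rest.
        System.out.println("usable ~= " + usableBytes(maxSize, 0.85));
        // ~235MB thrown away on every eviction pass with the defaults.
        System.out.println("freed per pass ~= " + evictedPerPass(maxSize, 0.85, 0.75));
    }
}
```

This is also why moving the minimum factor closer to the acceptable one trades eviction volume for eviction frequency: each pass frees less, so passes run more often.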
J-D

On Tue, Jul 3, 2012 at 9:54 AM, Ted Yu <[email protected]> wrote:
> Here're the knobs for block cache introduced in the patch:
>
> - static final String LRU_DEFAULT_LOAD_FACTOR =
>   "hbase.lru.blockcache.default.load.factor";
> - static final String LRU_DEFAULT_CONCURRENCY_LEVEL =
>   "hbase.lru.blockcache.default.concurrency.level";
> - static final String LRU_DEFAULT_MIN_FACTOR =
>   "hbase.lru.blockcache.default.min.factor";
> - static final String LRU_DEFAULT_ACCEPTABLE_FACTOR =
>   "hbase.lru.blockcache.default.acceptable.factor";
> - static final String LRU_DEFAULT_SINGLE_FACTOR =
>   "hbase.lru.blockcache.default.single.factor";
> - static final String LRU_DEFAULT_MULTI_FACTOR =
>   "hbase.lru.blockcache.default.multi.factor";
> - static final String LRU_DEFAULT_MEMORY_FACTOR =
>   "hbase.lru.blockcache.default.memory.factor";
>
> Slide 11 of J-D's talk mentioned using an acceptable factor of 0.95f and a
> min factor of 0.90f.
>
> We should expose these two knobs, in my opinion.
>
> On Tue, Jul 3, 2012 at 9:32 AM, Andrew Purtell <[email protected]> wrote:
>
>> Continuing discussion while JIRA is not available.
>>
>> OP wants to make the BlockCache eviction thresholds configurable so
>> they don't have to recompile to try J-D's advice on tuning them up. Is
>> that really useful? By that I mean, are there use cases where a lower
>> threshold would make sense? Or should we instead change the constants?
>> Or both?
>>
>> --
>> Best regards,
>>
>>   - Andy
>>
>> Problems worthy of attack prove their worth by hitting back. - Piet
>> Hein (via Tom White)
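For reference, here's a minimal sketch of what exposing the two knobs could look like. It uses java.util.Properties as a stand-in for Hadoop's Configuration so the snippet is self-contained; the key names are the ones quoted above, and the 0.85f/0.75f fallbacks are today's hardcoded constants, not values taken from the patch.

```java
// Hypothetical sketch of reading the eviction-threshold knobs at cache
// construction time. Not the actual patch; Properties stands in for
// org.apache.hadoop.conf.Configuration.
import java.util.Properties;

public class LruCacheKnobs {
    static final String ACCEPTABLE_KEY =
        "hbase.lru.blockcache.default.acceptable.factor";
    static final String MIN_KEY =
        "hbase.lru.blockcache.default.min.factor";

    /** Parse a float property, falling back to the hardcoded default. */
    static float getFloat(Properties conf, String key, float def) {
        String v = conf.getProperty(key);
        return v == null ? def : Float.parseFloat(v);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(ACCEPTABLE_KEY, "0.95"); // the values from slide 11
        conf.setProperty(MIN_KEY, "0.90");
        System.out.println("acceptable=" + getFloat(conf, ACCEPTABLE_KEY, 0.85f));
        System.out.println("min=" + getFloat(conf, MIN_KEY, 0.75f));
    }
}
```

With the real Configuration class this would be a one-liner per knob via conf.getFloat(key, default), which is the usual HBase pattern for this kind of tunable.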
