[
https://issues.apache.org/jira/browse/HBASE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814352#comment-13814352
]
Andrew Purtell commented on HBASE-9857:
---------------------------------------
Thanks for looking at the patch [~ndimiduk].
bq. don't see why it's limited to HFileV3. Can it be made a general feature
I put the preload logic into the v3 reader because v3 is 'experimental'. It
could trivially go into the v2 reader instead.
bq. I think it could be smart about loading the blocks, load either
sequentially or over a random distribution until the cache is full
Files to be preloaded are queued and scheduled to be handled by a small
threadpool. When a thread picks up work for a file, the blocks are loaded
sequentially using a non-pread scanner from offset 0 to the end of the index.
By 'random', did you mean randomly selecting work from the file queue?
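For reference, the current flow is roughly the following. This is only a
sketch against hypothetical stand-in types, not the actual reader classes the
patch touches:
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PrefetchSketch {
  /** Hypothetical stand-in for the file reader; only what the sketch needs. */
  interface BlockSource {
    long indexStartOffset();                  // where the block index begins
    long nextBlockOffset(long currentOffset); // offset of the block after this one
    void readAndCacheBlock(long offset);      // sequential (non-pread) read into cache
  }

  // Small shared pool; preload work for each opened file is queued here so we
  // don't stampede the disks when many files open at once.
  private static final ExecutorService PREFETCH_POOL = Executors.newFixedThreadPool(2);

  static void schedulePrefetch(final BlockSource file) {
    PREFETCH_POOL.submit(new Runnable() {
      @Override
      public void run() {
        // Walk blocks sequentially from offset 0 to the start of the index,
        // pulling each one through the block cache so later reads are hits.
        long offset = 0;
        while (offset < file.indexStartOffset()) {
          file.readAndCacheBlock(offset);
          offset = file.nextBlockOffset(offset);
        }
      }
    });
  }
}
{code}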
bq. The "until full" part seems tricky as eviction detection isn't very
straight-forward
Right. If we had it, I could make use of it.
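For anyone who wants to try the attached patch: prefetch is toggled per column
family or globally for the regionserver. As an illustration only, with the
attribute and config key names being assumptions (check 9857.patch for the
actual keys):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;

public class EnablePrefetchExample {
  public static void main(String[] args) {
    // Per column family: set a CF attribute (name here is a placeholder).
    HColumnDescriptor cf = new HColumnDescriptor("f");
    cf.setValue("PREFETCH_BLOCKS_ON_OPEN", "true");

    // Regionserver-wide: set a site config key (name here is a placeholder).
    Configuration conf = HBaseConfiguration.create();
    conf.setBoolean("hbase.rs.prefetchblocksonopen", true);
  }
}
{code}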
> Blockcache prefetch for HFile V3
> --------------------------------
>
> Key: HBASE-9857
> URL: https://issues.apache.org/jira/browse/HBASE-9857
> Project: HBase
> Issue Type: Improvement
> Reporter: Andrew Purtell
> Priority: Minor
> Attachments: 9857.patch
>
>
> The attached patch implements a prefetching function for HFile (v3) blocks,
> enabled by a column family or regionserver property. The purpose of this
> change is to warm the blockcache with all the data and index blocks of
> (presumably also in-memory) table data as rapidly after region open as is
> reasonable, without counting those block loads as cache misses. Great for
> fast reads and for keeping the cache hit ratio high. The IO impact can be
> tuned against the time until all data blocks are in cache. Works a bit like
> CompactSplitThread. Makes some effort not to stampede.
> I have been using this for setting up various experiments and thought I'd
> polish it up a bit and throw it out there. If the data to be preloaded will
> not fit in the blockcache, or is large as a percentage of the blockcache,
> this is not a good idea: it will just blow out the cache and trigger a lot
> of useless GC activity. Might be useful as an expert tuning option though.
> Or not.
--
This message was sent by Atlassian JIRA
(v6.1#6144)