[jira] [Commented] (HBASE-9969) Improve KeyValueHeap using loser tree

Matt Corgan (JIRA) Wed, 20 Nov 2013 12:55:39 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828100#comment-13828100
 ]


Matt Corgan commented on HBASE-9969:
------------------------------------

{quote}@Matt:
Putting patch on review board would be nice.{quote}It's still just for 
benchmarking.  We wouldn't want to commit this as is.  We'd probably only 
commit one implementation, or maybe a hybrid if we can't get a single clear 
winner.

{quote}numScanners is always <= the capacity provided at construction 
time{quote}I *think* this is always true in HBase?  Am I missing somewhere that 
we add new scanners that were not present at heap construction?

{quote}the number of scanners is usually < 10, almost always < 100{quote}Yes, 
it's almost always true in my cluster, but doesn't HBase try to enforce this in 
general by issuing compactions?  It also enforces it during the compactions 
themselves, with hbase.hstore.compaction.max defaulting to 10.

{quote}For KeyValueHeap.java, I don't see where numNextComparisons is 
updated.{quote}Yes, sorry, some of the counts are still missing or are not 
named well.

> Improve KeyValueHeap using loser tree
> -------------------------------------
>
>                 Key: HBASE-9969
>                 URL: https://issues.apache.org/jira/browse/HBASE-9969
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance, regionserver
>            Reporter: Chao Shi
>            Assignee: Chao Shi
>             Fix For: 0.98.0, 0.96.1
>
>         Attachments: 9969-0.94.txt, KeyValueHeapBenchmark_v1.ods, 
> KeyValueHeapBenchmark_v2.ods, hbase-9969-pq-v1.patch, hbase-9969-pq-v2.patch, 
> hbase-9969-v2.patch, hbase-9969-v3.patch, hbase-9969.patch, hbase-9969.patch, 
> kvheap-benchmark.png, kvheap-benchmark.txt
>
>
> LoserTree is a better data structure than a binary heap. It saves half of the 
> comparisons on each next(), though the time complexity is still O(log N).
> Currently a scan or get goes through two KeyValueHeaps: one merges KVs 
> read from multiple HFiles in a single store, the other merges results 
> from multiple stores. This patch should improve both cases whenever CPU 
> is the bottleneck (e.g. scan with filter over cached blocks, HBASE-9811).
> All of the optimization work is done in KeyValueHeap and does not change its 
> public interfaces. The new code is cleaner and simpler to understand.
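
For readers unfamiliar with the structure, here is a minimal sketch of a loser tree doing a k-way merge. It is not the code from any of the attached patches: the class name LoserTreeSketch, the int[][] inputs standing in for scanners, and the comparisons counter are all assumptions made for illustration. The point it tries to show is that after the winner advances, the replay loop in next() makes at most one comparison per tree level, whereas a binary heap sift-down typically makes two.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the HBASE-9969 patch. */
final class LoserTreeSketch {
    private final int[][] runs; // k sorted inputs (stand-ins for scanners)
    private final int[] pos;    // next unread index in each run
    private final int[] tree;   // tree[0] = overall winner, tree[1..k-1] = losers
    private final int k;
    long comparisons = 0;       // comparisons between two non-exhausted heads

    LoserTreeSketch(int[][] runs) {
        this.runs = runs;
        this.k = runs.length;
        this.pos = new int[k];
        this.tree = new int[Math.max(k, 1)];
        if (k > 0) {
            tree[0] = build(1);
        }
    }

    private boolean exhausted(int i) {
        return pos[i] >= runs[i].length;
    }

    /** True if run a's head sorts before run b's head; exhausted runs sort last. */
    private boolean less(int a, int b) {
        if (exhausted(a)) return false;
        if (exhausted(b)) return true;
        comparisons++;
        int x = runs[a][pos[a]];
        int y = runs[b][pos[b]];
        return x < y || (x == y && a < b); // ties go to the lower run index
    }

    /** Plays the initial tournament below node; stores losers, returns the winner. */
    private int build(int node) {
        if (node >= k) return node - k; // leaf: run i sits at node k + i
        int left = build(2 * node);
        int right = build(2 * node + 1);
        if (less(left, right)) {
            tree[node] = right;
            return left;
        }
        tree[node] = left;
        return right;
    }

    boolean hasNext() {
        return k > 0 && !exhausted(tree[0]);
    }

    int next() {
        int winner = tree[0];
        int value = runs[winner][pos[winner]++];
        // Replay only the path from the advanced leaf to the root:
        // one comparison against the stored loser at each level.
        for (int node = (winner + k) / 2; node >= 1; node /= 2) {
            if (less(tree[node], winner)) {
                int t = tree[node];
                tree[node] = winner;
                winner = t;
            }
        }
        tree[0] = winner;
        return value;
    }

    static List<Integer> merge(int[][] runs) {
        LoserTreeSketch t = new LoserTreeSketch(runs);
        List<Integer> out = new ArrayList<>();
        while (t.hasNext()) {
            out.add(t.next());
        }
        return out;
    }
}
```

Calling LoserTreeSketch.merge(new int[][] {{1, 4, 7}, {2, 5, 8}, {3, 6, 9}}) yields the nine values in ascending order. The fixed-size tree array also reflects the assumption discussed above that the number of inputs is known at construction time.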



--
This message was sent by Atlassian JIRA
(v6.1#6144)
