[ 
https://issues.apache.org/jira/browse/HBASE-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827066#comment-13827066
 ] 

Matt Corgan commented on HBASE-9969:
------------------------------------

{quote}Some classes, e.g. BenchmarkableKeyValueHeap.java, miss license 
header.{quote}
This patch is only for benchmarking.

{quote}Currently KeyValueHeap is used by StoreScanner and RegionScannerImpl.
I wonder if a config parameter, e.g. hbase.scanner.heap.impl.class, can be 
introduced so that different implementations of BenchmarkableKeyValueHeap can 
be plugged in.{quote}
Multiple heaps could be interesting, but may not be necessary if LoserTree 
can be optimized for consecutive KVs from the same scanner. It currently seems 
to handle consecutive KVs better than non-consecutive ones, but still not as 
well as trunk.
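
To make the "consecutive KVs from the same scanner" case concrete, here is a minimal sketch of a heap-based merge with a fast path: after emitting a value, the current source is kept as long as its next head does not exceed the head at the top of the heap, so a run from one source costs one comparison per element and no heap operations. This is only an illustration under stated assumptions. It merges plain integers instead of KVs, the names FastPathHeapMerge and Source are hypothetical, and it is not the KeyValueHeap code from trunk or from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative heap merge with a fast path for runs from one source (hypothetical names). */
class FastPathHeapMerge {
    /** One sorted input with a one-element lookahead; head == null means exhausted. */
    static final class Source {
        final Iterator<Integer> it;
        Integer head;
        Source(List<Integer> in) { it = in.iterator(); advance(); }
        void advance() { head = it.hasNext() ? it.next() : null; }
    }

    static List<Integer> merge(List<List<Integer>> inputs) {
        PriorityQueue<Source> heap =
            new PriorityQueue<>(Comparator.comparing((Source s) -> s.head));
        for (List<Integer> in : inputs) {
            Source s = new Source(in);
            if (s.head != null) heap.add(s);   // skip empty inputs
        }
        List<Integer> out = new ArrayList<>();
        Source current = heap.poll();          // the current source lives outside the heap
        while (current != null) {
            out.add(current.head);
            current.advance();
            Source top = heap.peek();
            if (current.head == null) {
                current = heap.poll();         // source exhausted, switch to the next one
            } else if (top != null && current.head.compareTo(top.head) > 0) {
                heap.add(current);             // slow path: re-insert and take the new minimum
                current = heap.poll();
            }
            // Fast path: otherwise keep reading from the same source.
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(merge(Arrays.asList(
            Arrays.asList(1, 2, 3, 10), Arrays.asList(4, 5, 6), Arrays.<Integer>asList())));
    }
}
```

With inputs dominated by long runs from a single source, almost every step takes the fast path, which is the access pattern the comment above refers to.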

> Improve KeyValueHeap using loser tree
> -------------------------------------
>
>                 Key: HBASE-9969
>                 URL: https://issues.apache.org/jira/browse/HBASE-9969
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance, regionserver
>            Reporter: Chao Shi
>            Assignee: Chao Shi
>             Fix For: 0.98.0, 0.96.1
>
>         Attachments: 9969-0.94.txt, KeyValueHeapBenchmark_v1.ods, 
> hbase-9969-pq-v1.patch, hbase-9969-v2.patch, hbase-9969-v3.patch, 
> hbase-9969.patch, hbase-9969.patch, kvheap-benchmark.png, kvheap-benchmark.txt
>
>
> A loser tree is a better data structure than a binary heap for this merge. It 
> saves half of the comparisons on each next(), though the time complexity is 
> still O(log N).
> Currently, a scan or get goes through two KeyValueHeaps: one merges KVs 
> read from multiple HFiles in a single store, the other merges results 
> from multiple stores. This patch should improve both cases whenever CPU 
> is the bottleneck (e.g. scan with filter over cached blocks, HBASE-9811).
> All of the optimization work is done in KeyValueHeap and does not change its 
> public interfaces. The new code is also cleaner and simpler to understand.
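
As a rough illustration of the data structure named in the description, the following is a minimal loser tree for a k-way merge. Each internal node stores the loser of the match played there and slot 0 stores the overall winner, so after the winner advances, the replay to the root needs one comparison per level (a binary heap sift-down typically needs two). This sketch is not taken from the attached patches: it merges sorted lists of non-null integers rather than KVs, and the class name LoserTreeMerge and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Illustrative loser tree for a k-way merge of sorted integer lists (hypothetical, not the HBase patch). */
class LoserTreeMerge {
    private final int k;
    private final int[] tree;                  // tree[0] = overall winner, tree[1..k-1] = loser at each internal node
    private final List<Iterator<Integer>> sources = new ArrayList<>();
    private final Integer[] heads;             // current head of each source; null means exhausted

    LoserTreeMerge(List<List<Integer>> inputs) {
        k = inputs.size();
        tree = new int[Math.max(k, 1)];
        heads = new Integer[k];
        for (int i = 0; i < k; i++) {
            Iterator<Integer> it = inputs.get(i).iterator();
            sources.add(it);
            heads[i] = it.hasNext() ? it.next() : null;
        }
        if (k > 0) tree[0] = build(1);
    }

    /** True if source a should be emitted before source b; exhausted sources always lose. */
    private boolean less(int a, int b) {
        if (heads[a] == null) return false;
        if (heads[b] == null) return true;
        int c = heads[a].compareTo(heads[b]);
        return c < 0 || (c == 0 && a < b);     // break ties by source index
    }

    /** Plays all matches below node, records the losers, and returns the winner. Nodes >= k are leaves. */
    private int build(int node) {
        if (node >= k) return node - k;
        int left = build(2 * node);
        int right = build(2 * node + 1);
        if (less(left, right)) { tree[node] = right; return left; }
        tree[node] = left;
        return right;
    }

    /** Returns the next smallest value, or null when every source is exhausted. */
    Integer next() {
        if (k == 0) return null;
        int winner = tree[0];
        Integer result = heads[winner];
        if (result == null) return null;
        Iterator<Integer> it = sources.get(winner);
        heads[winner] = it.hasNext() ? it.next() : null;
        // Replay from the winner's leaf to the root: one comparison per level.
        for (int node = (winner + k) / 2; node >= 1; node /= 2) {
            if (less(tree[node], winner)) {
                int t = tree[node];
                tree[node] = winner;
                winner = t;
            }
        }
        tree[0] = winner;
        return result;
    }

    static List<Integer> merge(List<List<Integer>> inputs) {
        LoserTreeMerge t = new LoserTreeMerge(inputs);
        List<Integer> out = new ArrayList<>();
        for (Integer v = t.next(); v != null; v = t.next()) out.add(v);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(merge(Arrays.asList(
            Arrays.asList(1, 4, 7), Arrays.asList(2, 5, 8), Arrays.asList(3, 6, 9))));
    }
}
```

In the scenario described above, the inputs would correspond to the per-HFile scanners within a store, or to the per-store scanners within a region.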



--
This message was sent by Atlassian JIRA
(v6.1#6144)
