[ 
https://issues.apache.org/jira/browse/HBASE-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828030#comment-13828030
 ] 

Ted Yu commented on HBASE-9969:
-------------------------------

@Matt:
Putting the patch on Review Board would be nice.

Some classes are missing the license header.

KeyValueScannerHeap duplicates some code from KeyValueHeap. Some refactoring 
would help ease maintenance of these two classes.
{code}
+  public KeyValueScannerPriorityQueue getHeap() {
+    return this.heap;
{code}
Can the return type be widened to PriorityQueue<KeyValueScanner>?
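
A minimal sketch of what that widening could look like. It assumes the specialised queue extends java.util.PriorityQueue, which the patch may or may not do. Scanner, ScannerPriorityQueue and ScannerHeap are hypothetical stand-ins, not the HBase classes.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Illustration only: every type in here is a hypothetical stand-in. */
final class WideningSketch {

  /** Stand-in for KeyValueScanner: exposes only the key it is positioned on. */
  interface Scanner {
    long peekKey();
  }

  /** Stand-in for the specialised queue; ASSUMED to extend java.util.PriorityQueue. */
  static final class ScannerPriorityQueue extends PriorityQueue<Scanner> {
    ScannerPriorityQueue(int capacity) {
      super(capacity, Comparator.comparingLong(Scanner::peekKey));
    }
  }

  /** Stand-in for the heap class that owns the queue. */
  static final class ScannerHeap {
    private final ScannerPriorityQueue heap = new ScannerPriorityQueue(4);

    // Widened return type: callers depend on the general queue type only,
    // so the concrete implementation can change without touching them.
    PriorityQueue<Scanner> getHeap() {
      return this.heap;
    }
  }
}
```

If the specialised queue does not extend PriorityQueue, the same idea would need a shared interface instead.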

For KeyValueScannerPriorityQueue:
{code}
+ * * numScanners is always <= the capacity provided at construction time<br/>
+ * * the number of scanners is usually < 10, almost always < 100<br/>
{code}
Did the above assumptions come from experience with your clusters?

For KeyValueHeap.java, I don't see where numNextComparisons is updated.

> Improve KeyValueHeap using loser tree
> -------------------------------------
>
>                 Key: HBASE-9969
>                 URL: https://issues.apache.org/jira/browse/HBASE-9969
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance, regionserver
>            Reporter: Chao Shi
>            Assignee: Chao Shi
>             Fix For: 0.98.0, 0.96.1
>
>         Attachments: 9969-0.94.txt, KeyValueHeapBenchmark_v1.ods, 
> KeyValueHeapBenchmark_v2.ods, hbase-9969-pq-v1.patch, hbase-9969-pq-v2.patch, 
> hbase-9969-v2.patch, hbase-9969-v3.patch, hbase-9969.patch, hbase-9969.patch, 
> kvheap-benchmark.png, kvheap-benchmark.txt
>
>
> LoserTree is a better data structure than a binary heap. It saves half of the 
> comparisons on each next(), though the time complexity is still O(log N).
> Currently a scan or get goes through two KeyValueHeaps: one merges KVs 
> read from multiple HFiles in a single store, the other merges results 
> from multiple stores. This patch should improve both cases whenever CPU 
> is the bottleneck (e.g. scan with filter over cached blocks, HBASE-9811).
> All of the optimization work is done in KeyValueHeap and does not change its 
> public interfaces. The new code is cleaner and simpler to understand.
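
For readers unfamiliar with the structure, here is a rough sketch of a loser-tree k-way merge. It is not the code from the attached patches. The class name LoserTreeSketch, the merge helper and the comparisons counter are made up for illustration, and null elements in the inputs are assumed not to occur. Each next() replays a single leaf-to-root path with about one comparison per level, whereas a binary heap sift-down typically needs about two per level.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

/** Illustrative k-way merge over sorted sources using a loser tree. */
final class LoserTreeSketch<T> {
  private final List<Iterator<T>> sources;
  private final Comparator<? super T> cmp;
  private final List<T> heads = new ArrayList<>(); // head per source, null = exhausted
  private final int[] tree; // tree[0] = overall winner, tree[1..k-1] = loser at each node
  private final int k;
  long comparisons; // hypothetical counter, handy for benchmarking

  LoserTreeSketch(List<Iterator<T>> sources, Comparator<? super T> cmp) {
    this.sources = sources;
    this.cmp = cmp;
    this.k = sources.size();
    this.tree = new int[Math.max(k, 1)];
    for (Iterator<T> it : sources) {
      heads.add(it.hasNext() ? it.next() : null);
    }
    if (k > 0) {
      tree[0] = build(1);
    }
  }

  /** Plays the initial tournament below node; stores losers, returns the winner. */
  private int build(int node) {
    if (node >= k) {
      return node - k; // leaf: nodes k..2k-1 map to sources 0..k-1
    }
    int left = build(2 * node);
    int right = build(2 * node + 1);
    if (beats(left, right)) {
      tree[node] = right;
      return left;
    }
    tree[node] = left;
    return right;
  }

  /** True if source a should be emitted before source b; exhausted sources sort last. */
  private boolean beats(int a, int b) {
    T ha = heads.get(a);
    T hb = heads.get(b);
    if (ha == null) return false;
    if (hb == null) return true;
    comparisons++;
    int c = cmp.compare(ha, hb);
    return c < 0 || (c == 0 && a < b); // ties go to the lower source index
  }

  /** Returns the next smallest element, or null once every source is exhausted. */
  T next() {
    if (k == 0) return null;
    int winner = tree[0];
    T result = heads.get(winner);
    if (result == null) return null;
    Iterator<T> it = sources.get(winner);
    heads.set(winner, it.hasNext() ? it.next() : null);
    // Replay only the path from the refilled leaf up to the root.
    for (int node = (winner + k) / 2; node >= 1; node /= 2) {
      int loser = tree[node];
      if (beats(loser, winner)) {
        tree[node] = winner;
        winner = loser;
      }
    }
    tree[0] = winner;
    return result;
  }

  /** Convenience helper: merges already-sorted lists into one sorted list. */
  static <T> List<T> merge(List<List<T>> sortedInputs, Comparator<? super T> cmp) {
    List<Iterator<T>> its = new ArrayList<>();
    for (List<T> in : sortedInputs) {
      its.add(in.iterator());
    }
    LoserTreeSketch<T> tree = new LoserTreeSketch<>(its, cmp);
    List<T> out = new ArrayList<>();
    for (T x = tree.next(); x != null; x = tree.next()) {
      out.add(x);
    }
    return out;
  }
}
```

Calling LoserTreeSketch.merge on the lists [1, 4, 7], [2, 5, 8] and [3, 6, 9] with natural ordering yields 1 through 9 in order.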



--
This message was sent by Atlassian JIRA
(v6.1#6144)
