[
https://issues.apache.org/jira/browse/HBASE-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822603#comment-13822603
]
Ted Yu commented on HBASE-9969:
-------------------------------
Putting patch on review board would make reviewing easier.
For benchmark B, can you try ColumnPaginationFilter with smaller offset ?
Giving percentage improvement in the tables is desirable.
{code}
+ */
+public class LoserTree<T> {
{code}
Please add annotation for audience.
{code}
+ * {@code tree[i]} where i > 0 stores the index to greater value between
{@code value[tree[2*i]]}
+ * and {@code value[tree[2*i + 1]]}.
{code}
'value[' should be 'values[', right ?
{code}
+ * @return the index to the minimal elements.
{code}
elements -> element
{code}
+ * Pushes next value from the stream that we previously taken the minimal
element from.
{code}
taken -> took
{code}
+ * Passes {@code NULL} to value if the stream has been reached EOF.
{code}
Remove 'been'
{code}
+ if (index != topIndex()) {
+ throw new IllegalArgumentException("Only the top index can be updated");
{code}
Consider including index and topIndex in the exception message.
{code}
+ if (value == null && values.get(index) != null) {
+ numOpenStreams--;
+ if (numOpenStreams < 0) {
{code}
In what condition would numOpenStreams become negative ?
{code}
+ throw new AssertionError("numOpenStreams is negative: " +
numOpenStreams);
{code}
Throw IllegalStateException.
{code}
+ public List<Integer> getOpenStreamsForTesting() {
{code}
The above is used in KeyValueHeap, consider renaming.
{code}
+ * from bottom up to the root. Once it "loses", it stops there and the
winner continues to fight to up.
+ */
+ private void update(int i) {
{code}
'fight to up' -> 'fight to top'
Please add @param for i.
KeyValueHeapBenchmark.java and TestLoserTree.java need license.
> Improve KeyValueHeap using loser tree
> -------------------------------------
>
> Key: HBASE-9969
> URL: https://issues.apache.org/jira/browse/HBASE-9969
> Project: HBase
> Issue Type: Improvement
> Reporter: Chao Shi
> Attachments: hbase-9969.patch, hbase-9969.patch,
> kvheap-benchmark.png, kvheap-benchmark.txt
>
>
> LoserTree is the better data structure than binary heap. It saves half of the
> comparisons on each next(), though the time complexity is on O(logN).
> Currently A scan or get will go through two KeyValueHeaps, one is merging KVs
> read from multiple HFiles in a single store, the other is merging results
> from multiple stores. This patch should improve the both cases whenever CPU
> is the bottleneck (e.g. scan with filter over cached blocks, HBASE-9811).
> All of the optimization work is done in KeyValueHeap and does not change its
> public interfaces. The new code looks more cleaner and simpler to understand.
--
This message was sent by Atlassian JIRA
(v6.1#6144)