[ 
https://issues.apache.org/jira/browse/HBASE-14826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15008553#comment-15008553
 ] 

ramkrishna.s.vasudevan commented on HBASE-14826:
------------------------------------------------

I ran the benchmark class attached to HBASE-9969 to test reseek with and 
without the patch. Results are below.


With patch
==========
next/reseek,numColsPerRow,prefixLength,impl,numScanners,opsPerSec(K),topScannerNull(K),next comparisons(K),heap comparisons(K),GS comparisons(K)
reseek,16,0,KVHeap,1,2987,0,0,-0,0
reseek,16,0,KVHeap,2,2724,0,0,-0,0
reseek,16,0,KVHeap,4,3329,0,0,-0,0
reseek,16,0,KVHeap,8,4333,0,0,-0,0
reseek,16,0,KVHeap,16,3495,0,0,-0,0
reseek,16,0,KVHeap,32,3361,0,0,-0,0

Without patch
=============
next/reseek,numColsPerRow,prefixLength,impl,numScanners,opsPerSec(K),topScannerNull(K),next comparisons(K),heap comparisons(K),GS comparisons(K)
reseek,16,0,KVHeap,1,2539,0,0,-0,0
reseek,16,0,KVHeap,2,2118,0,0,-0,0
reseek,16,0,KVHeap,4,2675,0,0,-0,0
reseek,16,0,KVHeap,8,3628,0,0,-0,0
reseek,16,0,KVHeap,16,2583,0,0,-0,0
reseek,16,0,KVHeap,32,2284,0,0,-0,0

The 6th column is opsPerSec (in thousands). In the runs above, reseek throughput is about 18% to 47% higher with the patch, depending on numScanners.
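
To read the per-row difference directly off the two tables, the relative gain can be computed from the opsPerSec(K) column. The snippet below is only a small helper sketch: the class and method names are made up for illustration, and the arrays are the opsPerSec(K) values copied from the tables above, ordered by numScanners.

```java
public class ReseekGain {
    // numScanners values and opsPerSec(K) columns copied from the tables above.
    static final int[] SCANNERS = {1, 2, 4, 8, 16, 32};
    static final int[] WITH_PATCH = {2987, 2724, 3329, 4333, 3495, 3361};
    static final int[] WITHOUT_PATCH = {2539, 2118, 2675, 3628, 2583, 2284};

    /** Relative throughput gain of the patched run over the unpatched run, in percent. */
    static double gainPercent(int patched, int unpatched) {
        return 100.0 * (patched - unpatched) / unpatched;
    }

    public static void main(String[] args) {
        for (int i = 0; i < SCANNERS.length; i++) {
            System.out.printf("numScanners=%d gain=%.1f%%%n",
                SCANNERS[i], gainPercent(WITH_PATCH[i], WITHOUT_PATCH[i]));
        }
    }
}
```

These are single runs, so some of the variation between rows is probably noise.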

> Small improvement in KVHeap seek() API
> --------------------------------------
>
>                 Key: HBASE-14826
>                 URL: https://issues.apache.org/jira/browse/HBASE-14826
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Minor
>         Attachments: HBASE-14826.patch
>
>
> Currently in the seek()/reseek() APIs we do a lot of PriorityQueue-related 
> operations. We first add the current scanner to the heap, then poll, and 
> add the scanner back again if the seekKey is greater than the top key in that 
> scanner. Since the KVs are always in increasing order, and in the ideal 
> scan flow every seek/reseek is followed by a next() call, it should be OK 
> to start by checking the current scanner and only then poll to get the 
> next scanner. This avoids the initial PQ.add(current) call and could save 
> some comparisons. 
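
The idea in the description can be sketched roughly as follows. This is not the code from HBASE-14826.patch or the real KeyValueHeap: ToyScanner and ToyHeap are hypothetical stand-ins that use int keys instead of Cells, and the sketch assumes each scanner's keys are sorted in increasing order. It only shows seek() starting from the current scanner rather than adding it to the priority queue first.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

public class HeapSeekSketch {
    /** Hypothetical scanner over sorted int keys; stands in for a KeyValueScanner. */
    static final class ToyScanner {
        private final int[] keys;
        private int pos = 0;

        ToyScanner(int... sortedKeys) { this.keys = sortedKeys; }

        /** Current key, or null when the scanner is exhausted. */
        Integer peek() { return pos < keys.length ? keys[pos] : null; }

        /** Moves to the first key >= seekKey; returns false if none is left. */
        boolean seek(int seekKey) {
            while (pos < keys.length && keys[pos] < seekKey) pos++;
            return pos < keys.length;
        }
    }

    /** Hypothetical heap of scanners; 'current' is held outside the queue. */
    static final class ToyHeap {
        private static final Comparator<ToyScanner> BY_TOP_KEY =
            Comparator.comparing(ToyScanner::peek);
        private final PriorityQueue<ToyScanner> heap = new PriorityQueue<>(BY_TOP_KEY);
        private ToyScanner current;

        ToyHeap(ToyScanner... scanners) {
            for (ToyScanner s : scanners) {
                if (s.peek() != null) heap.add(s); // skip empty scanners
            }
            current = heap.poll();
        }

        Integer peek() { return current == null ? null : current.peek(); }

        boolean seek(int seekKey) {
            if (current == null) return false;
            // Start with the current scanner; no heap.add(current) up front.
            ToyScanner scanner = current;
            current = null;
            while (scanner != null) {
                if (scanner.peek() >= seekKey) {
                    // Already at or past seekKey, and it holds the smallest top key.
                    current = scanner;
                    return true;
                }
                if (scanner.seek(seekKey)) {
                    heap.add(scanner); // still has data: re-enter the queue
                }
                scanner = heap.poll(); // only now touch the queue for the next scanner
            }
            return false;
        }
    }

    public static void main(String[] args) {
        ToyHeap h = new ToyHeap(
            new ToyScanner(1, 5, 9), new ToyScanner(2, 6, 10), new ToyScanner(3, 4));
        System.out.println(h.seek(5) + " " + h.peek());
        System.out.println(h.seek(7) + " " + h.peek());
    }
}
```

When the current scanner is already positioned at or past the seekKey, this version returns without any priority queue operation, which is the case the description is aiming at.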



