[ https://issues.apache.org/jira/browse/HBASE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532881#comment-13532881 ]

Kannan Muthukkaruppan commented on HBASE-4443:
----------------------------------------------

Todd: HBASE-5987 implements the "peek at/track the start key of the next block" 
optimization (so that a reseek in the middle of a scan doesn't have to go back 
to the index in many cases). 
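(Just to illustrate that shortcut, a rough sketch, not the actual patch; the 
class and method names here are made up:)

{code:java}
import java.util.Arrays;

// Illustrative sketch of the HBASE-5987 idea, not the real HBase code:
// remember the first key of the *next* block when the current block is
// loaded, so a forward reseek can decide "stay in this block" with one
// comparison instead of going back to the block index.
class ReseekSketch {
  private byte[] nextBlockFirstKey;  // captured when the current block was read

  boolean reseekTo(byte[] key) {
    if (nextBlockFirstKey != null
        && Arrays.compareUnsigned(key, nextBlockFirstKey) < 0) {
      // The sought key sorts before the next block's first key, so it can
      // only be in the current block: scan forward, no index lookup needed.
      return scanForwardInCurrentBlock(key);
    }
    // Otherwise fall back to the block index as before.
    return seekViaBlockIndex(key);
  }

  // Stand-ins for the real scanner machinery.
  private boolean scanForwardInCurrentBlock(byte[] key) { return true; }
  private boolean seekViaBlockIndex(byte[] key) { return true; }
}
{code}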

But this JIRA is for a different case: it should help when you are looking for 
the first key in a block. Normally that's not a common case, but if you are 
storing large objects in HBase it becomes common, because each block may 
contain only a very small number of keys. 

Say "r1,c3,ts=21" is the first key in a block, and we are looking for "r1,c3". 
The query itself doesn't have a TS. So, we'll search for r1,c3,ts=LONG.MAX_VAL 
and end up going to the previous block unnecessarily. This JIRA proposes to 
change the key that is kept in the HFile index. Instead of the index keeping 
the first key of the blocks it is pointing to, it'll keep a fake key - which is 
the last key of the previous block plus eplison. So, for my earlier example 
(from two posts prior), we propose that the index will keep the key for Block 2 
as r1,c2,19 (because the last key of Block 1 is r1,c2,20- so we simply pick the 
next key whether or not that's the actual start key of this block); instead of 
the current r1,c3,21.
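
A rough sketch of how such a fake index key could be derived (purely 
illustrative: the key is simplified to row/column/timestamp, and the class and 
method names are made up):

{code:java}
// Illustrative sketch only (simplified key: row, column, timestamp; a real
// KeyValue also carries family, type, etc.). Keys sort row asc, col asc,
// timestamp DESC.
public class FakeIndexKeySketch {
  record Key(String row, String col, long ts) {}

  // "Last key of the previous block plus epsilon": because timestamps sort
  // descending, the next possible key after (row, col, ts) is (row, col,
  // ts - 1). It sorts after every key of the previous block and at or before
  // every key of the next block, so it is a valid index entry for that block.
  static Key fakeIndexKey(Key lastKeyOfPreviousBlock) {
    return new Key(lastKeyOfPreviousBlock.row(), lastKeyOfPreviousBlock.col(),
        lastKeyOfPreviousBlock.ts() - 1);
  }

  public static void main(String[] args) {
    Key lastOfBlock1 = new Key("r1", "c2", 20);
    // The index entry for Block 2 becomes (r1, c2, 19) instead of its real
    // first key (r1, c3, 21).
    System.out.println(fakeIndexKey(lastOfBlock1));  // Key[row=r1, col=c2, ts=19]
  }
}
{code}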
                
> optimize/avoid seeking to "previous" block when key you are interested in is 
> the first one of a block
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4443
>                 URL: https://issues.apache.org/jira/browse/HBASE-4443
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Kannan Muthukkaruppan
>
> This issue primarily affects cases when you are storing large blobs, i.e. 
> when blocks contain a small number of keys and the chances of the key you 
> are looking for being the first key of a block are higher.
> Say, you are looking for "row/col", and "row/col/ts=5" is the latest version 
> of the key in the HFile and is at the beginning of block X.
> The search for the key is done by looking for "row/col/ts=Long.MAX_VALUE", 
> but this will land us in block X-1 (because ts=Long.MAX_VALUE sorts ahead of 
> ts=5), only to find that there is no matching "row/col" in block X-1, and 
> then we'll advance to block X to return the value.
> Seems like we should be able to optimize this somehow (a small sketch of 
> this lookup follows after the description below).
> Some possibilities:
> 1) Suppose we track that the file contains no deletes; then, if the CF 
> setting has MAX_VERSIONS=1, we know for sure that block X-1 does not contain 
> any relevant data and can position the seek directly at block X. [This will 
> also require the memstore flusher to remove extra versions when 
> MAX_VERSIONS=1 and not allow the file to contain duplicate entries for the 
> same ROW/COL.]  
> Tracking deletes will also, in many cases, avoid the seek to the top of the 
> row to look for a DeleteFamily marker.
> 2) Have a dense index (one entry per KV; this might be OK for the 
> large-object case since the index-to-data ratio will still be low).
> 3) Have the index contain the last KV of each block in addition to the first 
> KV. This doubles the size of the index, though.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
