[ 
https://issues.apache.org/jira/browse/HBASE-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905993#action_12905993
 ] 

stack commented on HBASE-2450:
------------------------------

So, we just had an interesting case here where an ICV (incrementColumnValue) was 
running really slowly -- two orders of magnitude slower than the old 0.20.x Get 
codepath, see HBASE-2959 -- because the ICV was being done on a row that had 
thousands of columns (the column to update was somewhere in the midst of these 
thousands of columns).  At first blush, the fix was to change ScanQueryMatcher so 
that the start key is no longer built from the row alone, as in 
'this.startKey = KeyValue.createFirstOnRow(scan.getStartRow());', but instead 
also considers the column.  But then, reading this issue, I'm reminded of deletes: 
a row delete is the first thing on the row, and a family delete is the first thing 
in a family.

Having to go to the start of the row and move forward is slowing Gets (and 
ICVs).
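
To make the cost concrete, here is a rough toy sketch in plain Java.  It is not 
HBase code: the class SeekSketch, its methods, and the ROW_DELETE key are all made 
up for illustration, and a TreeMap of qualifiers stands in for the sorted cells of 
one row.  It counts how many cells get touched when we start at the front of the 
row versus seeking to the column, and shows that a marker sorted first on the row 
is not seen by the direct seek.

```java
import java.util.TreeMap;

public class SeekSketch {
    // Hypothetical stand-in for a row-level delete marker: the empty
    // qualifier sorts before every real qualifier in this toy model.
    public static final String ROW_DELETE = "";

    // Build a toy row whose qualifiers are kept in sorted order.
    public static TreeMap<String, Long> buildRow(int columns) {
        TreeMap<String, Long> row = new TreeMap<>();
        for (int i = 0; i < columns; i++) {
            row.put(String.format("col%05d", i), 0L);
        }
        return row;
    }

    // Start at the first cell of the row and step forward until we reach
    // the wanted qualifier; returns the number of cells examined.
    public static int cellsExaminedFromRowStart(TreeMap<String, Long> row, String qualifier) {
        int examined = 0;
        for (String q : row.keySet()) {
            examined++;
            if (q.compareTo(qualifier) >= 0) {
                break;
            }
        }
        return examined;
    }

    // Seek straight to the first cell at or after the wanted qualifier;
    // returns the number of cells examined.
    public static int cellsExaminedFromColumn(TreeMap<String, Long> row, String qualifier) {
        int examined = 0;
        for (String q : row.tailMap(qualifier, true).keySet()) {
            examined++;
            if (q.compareTo(qualifier) >= 0) {
                break;
            }
        }
        return examined;
    }

    // True if a seek that starts at the given qualifier would still pass
    // over the row-level delete marker.
    public static boolean directSeekSeesRowDelete(TreeMap<String, Long> row, String qualifier) {
        return row.tailMap(qualifier, true).containsKey(ROW_DELETE);
    }

    public static void main(String[] args) {
        TreeMap<String, Long> row = buildRow(5000);
        row.put(ROW_DELETE, 1L); // pretend the whole row was deleted
        String target = "col02500";
        System.out.println("from row start: " + cellsExaminedFromRowStart(row, target));
        System.out.println("from column:    " + cellsExaminedFromColumn(row, target));
        System.out.println("direct seek sees row delete: " + directSeekSeesRowDelete(row, target));
    }
}
```

The direct seek touches far fewer cells, but in this model it never passes over the 
marker at the front of the row, which is the catch with the naive fix.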

Above it's mentioned that a get on a row needs to look at the start of the row to 
see if the whole row has been deleted (and at the start of the family to see if 
the family has been deleted) but, yeah, this seems wrong.

The other ideas sound better -- a dynamic bloom for deletes or extra info in the index.

Meantime we've changed our schema here so ICVs are done on rows of one column 
only, but this issue is going to burn us again.

> For single row reads of specific columns, seek to the first column in HFiles 
> rather than start of row
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2450
>                 URL: https://issues.apache.org/jira/browse/HBASE-2450
>             Project: HBase
>          Issue Type: Improvement
>          Components: io, regionserver
>            Reporter: Jonathan Gray
>            Assignee: Pranav Khaitan
>             Fix For: 0.90.0
>
>
> Currently we will always seek to the start of a row.  If we are getting 
> specific columns, we should seek to the first column in that row.

