[ https://issues.apache.org/jira/browse/HBASE-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857512#action_12857512 ]

Jonathan Gray commented on HBASE-2450:
--------------------------------------

Spoke w/ kannan/karthik offline and added some clarity to the issue here by 
talking about the case of multiple files.  Whether or not a minor compaction 
removes deleted stuff, we'll always have to read any row deletes, because they 
could apply to both newer and older storefiles.  This is the major difference 
between majors and minors: a major always includes all files.
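To make the multiple-file point concrete, here's a toy model (plain Python, not HBase code; the data layout and function names are made up for illustration): a row delete marker living in a newer storefile masks cells with older timestamps in *any* file, which is why the read path must always consult row deletes even when a minor compaction only touched some of the files.

```python
def read_row(row, storefiles, row_deletes):
    """Return live cells for `row`, applying row deletes across ALL files.

    storefiles: list of files, each a list of (row, col, ts, value) cells.
    row_deletes: list of (row, ts) row-delete markers.
    """
    # A row delete at ts masks every cell in the row with timestamp <= ts.
    delete_ts = max((ts for r, ts in row_deletes if r == row), default=None)
    live = []
    for sf in storefiles:
        for (r, col, ts, val) in sf:
            if r != row:
                continue
            if delete_ts is not None and ts <= delete_ts:
                continue  # masked by the row delete, regardless of which file
            live.append((col, ts, val))
    return sorted(live)

# Older file holds a ts=5 cell; newer file holds a ts=20 cell plus a row
# delete at ts=10.  The older file's cell is masked by the newer file's marker.
older = [("r1", "c1", 5, "old")]
newer = [("r1", "c1", 20, "new")]
deletes = [("r1", 10)]
print(read_row("r1", [older, newer], deletes))  # only the ts=20 cell survives
```

If a minor compaction rewrote only the older file, it couldn't safely drop the ts=5 cell's delete handling on its own, since the masking marker sits in a file outside the compaction; only a major, which sees all files, has the full picture.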

Another issue we brought up, beyond the scope of this jira but still 
interesting, is whether we should actually remove deletes even during a major.  
The problem is that this creates a scenario where a background process impacts 
user-facing behavior.  This is really just an issue w/ manual stamps, but we 
are moving towards a world where we support all kinds of weird out-of-order 
stuff.  If I have a delete row @ ts=10, then everything I insert with ts < 10 
will not get read.  But after a major compaction, this will no longer be the 
case, and inserts with ts < 10 will now show up.  Again, I'm not a big 
proponent of supporting wonky use cases like that, but it's weird nonetheless, 
and maybe there are other, less wonky scenarios where this could lead to 
user-facing behavior that differs pre/post major compaction.
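The ts=10 case above can be walked through with a toy timeline (again plain Python, not HBase code; `visible` is a made-up helper standing in for delete-marker filtering):

```python
def visible(cells, delete_ts):
    """Cells surviving a row delete marker; delete_ts=None means no marker."""
    return [c for c in cells if delete_ts is None or c[1] > delete_ts]

# Before major compaction: the row delete at ts=10 masks an insert @ ts=7.
before = visible([("c1", 7, "v")], delete_ts=10)
print(before)  # the ts=7 insert is hidden

# A major compaction drops the delete marker (and collects the cells it
# masked).  A fresh insert @ ts=7 arriving after that has no marker above
# it, so it is read back -- the same write that was invisible before the
# compaction is visible after it, purely because a background process ran.
after = visible([("c1", 7, "v2")], delete_ts=None)
print(after)
```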

Back to the jira at hand, it looks like we'll have to take the approach of 
either seeking first to the start of a row, looking for row deletes, and then, 
once done, re-seeking down to the first column... or we get fancy.  One fancy 
idea is a separate "delete" bloom filter which marks all rows that contain 
deletes (or just delete rows).  Another idea would be to add a small flag in 
the block index carrying a bit of extra information for each block (does it 
have deletes in it, for example)... could potentially do some neat stuff there, 
but need to determine if/when it would be worth it.
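The simpler two-seek read path can be sketched as follows (an assumed structure, not the actual StoreScanner code; row deletes are modeled as a special column name "" that sorts before all real columns, and `bisect` stands in for an HFile seek):

```python
import bisect

def get_columns(sorted_kvs, row, wanted_cols):
    """sorted_kvs: list of (row, col, val) sorted the way HFiles sort keys."""
    # Seek 1: start of the row, just to pick up any row-delete marker.
    i = bisect.bisect_left(sorted_kvs, (row, "", ""))
    row_deleted = i < len(sorted_kvs) and sorted_kvs[i][:2] == (row, "")
    if row_deleted:
        return []
    out = []
    for col in sorted(wanted_cols):
        # Seek 2..n: jump straight to each requested column instead of
        # iterating through every column in between.
        j = bisect.bisect_left(sorted_kvs, (row, col, ""))
        if j < len(sorted_kvs) and sorted_kvs[j][:2] == (row, col):
            out.append((col, sorted_kvs[j][2]))
    return out

kvs = sorted([("r1", "c1", "a"), ("r1", "c5", "b"), ("r1", "c9", "c")])
print(get_columns(kvs, "r1", ["c5", "c9"]))  # -> [('c5', 'b'), ('c9', 'c')]
```

The fancy variants would let us skip seek 1 entirely: a delete bloom (or a has-deletes flag per block in the block index) could tell us up front that the row has no deletes, so we'd go straight to the first requested column.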

In any case, the simpler solution should suffice for now, and will be a big 
improvement over today, where we only ever seek to the start of a row and so 
always have to iterate through all the columns until we find what we want.  
For hot rows, the first block in the row will land in the block cache, so 
further requests for columns in that row will get the initial row seek cheaply 
because it's already cached.

> For single row reads of specific columns, seek to the first column in HFiles 
> rather than start of row
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2450
>                 URL: https://issues.apache.org/jira/browse/HBASE-2450
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: io, regionserver
>            Reporter: Jonathan Gray
>             Fix For: 0.20.5, 0.21.0
>
>
> Currently we will always seek to the start of a row.  If we are getting 
> specific columns, we should seek to the first column in that row.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
