[ 
https://issues.apache.org/jira/browse/HBASE-29?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581465#action_12581465
 ] 

Bryan Duxbury commented on HBASE-29:
------------------------------------

I think we should make it a priority to get this fixed. Even if it performs 
worse, it's really unacceptable to give incorrect answers.

However, I think there's a decent alternative to just getting slower wholesale. 
When we fixed getClosestRowBefore, we decided it could assume that it only ever 
operates on a table where cells are always added in ascending timestamp order, 
which at least made it perform acceptably. Clearly, this issue is about 
situations where that assumption isn't true. So, what I think we should do is 
make the default get and getRow methods assume the mapfiles have no inherent 
ordering, so they always return the correct answer, and then add new methods 
getAscending and getRowAscending (names could change?) that assume mapfiles are 
sorted ascending and are faster as a result. 

With this approach, people can make the default choice of using get and getRow, 
pay the performance penalty, and get the right answer no matter what. If their 
use case happens to match the always-ascending constraint, they can switch to 
the getAscending variants with a one-line change and get the improved 
performance.
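
To make the trade-off concrete, here is a minimal sketch of the two read paths 
in a toy model. It is not HStore code: the Cell class, the list-of-lists store 
layout (stores ordered newest-written first, each store sorted by timestamp 
descending), and both method bodies are assumptions made up for illustration. 
Only the names get and getAscending come from the proposal above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReadPathSketch {
    /** Hypothetical stand-in for one stored version of a cell. */
    static final class Cell {
        final long timestamp;
        final String value;
        Cell(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * Safe path: collect candidates from every store (memcache plus each
     * MapFile), sort by timestamp descending, and keep the top `versions`.
     */
    static List<Cell> get(List<List<Cell>> stores, int versions) {
        List<Cell> all = new ArrayList<>();
        for (List<Cell> store : stores) {
            all.addAll(store);
        }
        all.sort((a, b) -> Long.compare(b.timestamp, a.timestamp));
        return new ArrayList<>(all.subList(0, Math.min(versions, all.size())));
    }

    /**
     * Fast path: walk stores newest-written first and stop as soon as enough
     * versions are found. Only correct when cells were written in ascending
     * timestamp order.
     */
    static List<Cell> getAscending(List<List<Cell>> stores, int versions) {
        List<Cell> result = new ArrayList<>();
        for (List<Cell> store : stores) {
            for (Cell c : store) {
                if (result.size() == versions) {
                    return result;
                }
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Memcache holds the later write with the older timestamp;
        // the flushed MapFile holds the earlier write with the "now" timestamp.
        List<List<Cell>> stores = Arrays.asList(
            Arrays.asList(new Cell(50L, "past")),
            Arrays.asList(new Cell(100L, "now")));
        System.out.println(get(stores, 1).get(0).value);          // prints "now"
        System.out.println(getAscending(stores, 1).get(0).value); // prints "past"
    }
}
```

In this model getAscending touches only the first store when it can, while get 
always reads every store, which is the performance penalty mentioned above.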

> [hbase] HStore#get and HStore#getFull may not return expected values by 
> timestamp when there is more than one MapFile
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-29
>                 URL: https://issues.apache.org/jira/browse/HBASE-29
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: client, regionserver
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.2.0
>
>         Attachments: 29.patch
>
>
> Ok, this one is a little tricky. Let's say that you write a row with some 
> value without a timestamp, thus meaning right now. Then, the memcache gets 
> flushed out to a MapFile. Then, you write another value to the same row, this 
> time with a timestamp that is in the past, i.e., before the "now" timestamp of 
> the first put. 
> Some time later, but before there is a compaction, if you do a get for this 
> row, and only ask for a single version, you will logically be expecting the 
> latest version of the cell, which you would assume would be the one written 
> at "now" time. Instead, you will get the value written into the "past" cell, 
> because even though it is tagged as having happened in the past, it actually 
> *was written* after the "now" cell, and thus when #get searches for 
> satisfying values, it runs into the one most recently written first. 
> The result of this problem is inconsistent query results. Note that this 
> problem only ever exists when there's an uncompacted HStore, because during 
> compaction, these cells will all get sorted into the correct order by 
> timestamp and such. In a way, this actually makes the problem worse, because 
> then you could easily get inconsistent results from HBase about the same 
> (unchanged) row depending on whether there's been a flush/compaction.
> The only solution I can think of for this problem at the moment is to scan 
> all the MapFiles and Memcache for possible results, sort them, and then 
> select the desired number of versions off of the top. This is unfortunate 
> because it means you never get the snazzy short-circuit logic except within a 
> single mapfile or memcache. 
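
The flush/compaction inconsistency described in the quoted issue can be 
reproduced in a toy model. The sketch below is an assumption-laden 
illustration, not HBase code: Cell, getLatest, and compact are hypothetical 
names, stores are listed newest-written first, and "compaction" is modelled as 
merging everything into one store sorted by timestamp descending.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CompactionSketch {
    /** Hypothetical stand-in for one stored version of a cell. */
    static final class Cell {
        final long timestamp;
        final String value;
        Cell(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Short-circuit read: return the first cell found, newest-written store first. */
    static String getLatest(List<List<Cell>> stores) {
        for (List<Cell> store : stores) {
            if (!store.isEmpty()) {
                return store.get(0).value;
            }
        }
        return null;
    }

    /** Toy compaction: merge all stores into one, sorted by timestamp descending. */
    static List<List<Cell>> compact(List<List<Cell>> stores) {
        List<Cell> merged = new ArrayList<>();
        for (List<Cell> store : stores) {
            merged.addAll(store);
        }
        merged.sort((a, b) -> Long.compare(b.timestamp, a.timestamp));
        List<List<Cell>> out = new ArrayList<>();
        out.add(merged);
        return out;
    }

    public static void main(String[] args) {
        // Write 1: timestamp 100 ("now"), then flushed to a MapFile.
        // Write 2: timestamp 50 ("past"), still sitting in the memcache.
        List<List<Cell>> stores = Arrays.asList(
            Arrays.asList(new Cell(50L, "past")),   // memcache
            Arrays.asList(new Cell(100L, "now")));  // flushed MapFile
        System.out.println(getLatest(stores));          // prints "past"
        System.out.println(getLatest(compact(stores))); // prints "now"
    }
}
```

The same unchanged row yields two different answers depending only on whether 
the toy compaction has run, which is the inconsistency the issue describes.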

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.