[jira] Commented: (HBASE-2265) HFile and Memstore should maintain minimum and maximum timestamps

HBase Review Board (JIRA) Wed, 07 Jul 2010 14:44:48 -0700

    [ https://issues.apache.org/jira/browse/HBASE-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886096#action_12886096 ]

HBase Review Board commented on HBASE-2265:
-------------------------------------------

Message from: "Ryan Rawson" <[email protected]>

bq.  On 2010-07-07 13:58:43, Ryan Rawson wrote:
bq.  > trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueSkipListSet.java, line 55
bq.  > <http://review.hbase.org/r/257/diff/2/?file=2159#file2159line55>
bq.  >
bq.  >     I think this information should be maintained in MemStore not inside this data structure. We might get rid of this data structure type and change to another one day. This makes it too hard to do that.
bq.  
bq.  Pranav Khaitan wrote:
bq.      When we flush the memstore to a storefile, we pass an object of KeyValueSkipListSet. This variable goes through several functions before reaching Store. If we don't have TimeRangeTracker inside KeyValueSkipListSet, we will have to change all flush-related functions to take an extra argument. Then, if we decide in the future to send another piece of information, we will have to add more arguments. Having TimeRangeTracker inside KeyValueSkipListSet lets us pass the information without changing all flush-related functions. Would it still be better to pass TimeRangeTracker as an additional argument?
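
For readers following the thread, the "tracker inside the set" design described above can be sketched roughly as below. This is only an illustration: `TrackedSetSketch`, `RangeTracker`, `Entry` and `TrackedSet` are hypothetical names, not the actual TimeRangeTracker or KeyValueSkipListSet code from the patch.

```java
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical sketch only; none of these classes are HBase's real ones.
final class TrackedSetSketch {

    /** Remembers the smallest and largest timestamp it has been shown. */
    static final class RangeTracker {
        private long min = Long.MAX_VALUE;
        private long max = Long.MIN_VALUE;

        synchronized void include(long ts) {
            if (ts < min) min = ts;
            if (ts > max) max = ts;
        }

        synchronized long min() { return min; }
        synchronized long max() { return max; }
    }

    /** A stand-in for a KeyValue: ordered by key, then newest timestamp first. */
    static final class Entry implements Comparable<Entry> {
        final String key;
        final long ts;

        Entry(String key, long ts) {
            this.key = key;
            this.ts = ts;
        }

        @Override
        public int compareTo(Entry other) {
            int c = key.compareTo(other.key);
            return c != 0 ? c : Long.compare(other.ts, ts);
        }
    }

    /** A sorted set that updates its own tracker on every add. */
    static final class TrackedSet {
        private final ConcurrentSkipListSet<Entry> delegate = new ConcurrentSkipListSet<>();
        final RangeTracker tracker = new RangeTracker();

        boolean add(Entry e) {
            tracker.include(e.ts);
            return delegate.add(e);
        }

        int size() { return delegate.size(); }
    }

    public static void main(String[] args) {
        TrackedSet set = new TrackedSet();
        set.add(new Entry("row1", 30L));
        set.add(new Entry("row2", 10L));
        set.add(new Entry("row1", 20L));
        // Whoever receives the set at flush time can read the range from it.
        System.out.println(set.tracker.min() + " .. " + set.tracker.max());
    }
}
```

The appeal is that anything holding the set also holds the range, so no flush signature has to change. The cost is that the range is tied to this particular set type.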

This totally makes sense. The only issue is that, historically, we have 
KeyValueSkipListSet because we couldn't use SkipListSet with the particular 
implementation of incrementColumnValue we had. Now that the implementation of 
ICV is changing (in an unrelated JIRA), we no longer need a specialized 
SkipListSet and could use the standard one instead.

We have the StoreFlusherImpl inside Store, which exists to capture this kind of 
metadata and carry it along, so maintaining the tracker in MemStore and passing 
it through the flusher might not be too painful or bogus. What do you think?
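
A rough sketch of that alternative, under loud assumptions: `FlushCarrierSketch`, `MemStoreLike` and `Flusher` are made-up names, and cells are reduced to bare timestamps to keep it short. It is not the real StoreFlusherImpl. It only shows a flusher object capturing the range alongside the snapshot, so a standard skip-list set can be used.

```java
import java.util.SortedSet;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical sketch only; not the real MemStore or StoreFlusherImpl.
final class FlushCarrierSketch {

    /** Owns a standard sorted set plus the min/max timestamps itself. */
    static final class MemStoreLike {
        // Cells are modelled as bare timestamps purely for illustration.
        private ConcurrentSkipListSet<Long> cells = new ConcurrentSkipListSet<>();
        private long minTs = Long.MAX_VALUE;
        private long maxTs = Long.MIN_VALUE;

        synchronized void add(long ts) {
            cells.add(ts);
            if (ts < minTs) minTs = ts;
            if (ts > maxTs) maxTs = ts;
        }

        /** Hands the current contents and range to a flusher, then resets. */
        synchronized Flusher prepareFlush() {
            Flusher f = new Flusher(cells, minTs, maxTs);
            cells = new ConcurrentSkipListSet<>();
            minTs = Long.MAX_VALUE;
            maxTs = Long.MIN_VALUE;
            return f;
        }
    }

    /** Carries the snapshot and its metadata through the flush path. */
    static final class Flusher {
        final SortedSet<Long> snapshot;
        final long minTs;
        final long maxTs;

        Flusher(SortedSet<Long> snapshot, long minTs, long maxTs) {
            this.snapshot = snapshot;
            this.minTs = minTs;
            this.maxTs = maxTs;
        }

        String describeFlush() {
            return snapshot.size() + " cells, timestamps [" + minTs + ", " + maxTs + "]";
        }
    }

    public static void main(String[] args) {
        MemStoreLike memstore = new MemStoreLike();
        memstore.add(7L);
        memstore.add(3L);
        memstore.add(5L);
        System.out.println(memstore.prepareFlush().describeFlush());
    }
}
```

Further metadata would become another field on the flusher object rather than another argument on every flush-related function.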

- Ryan

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/257/#review314
-----------------------------------------------------------

> HFile and Memstore should maintain minimum and maximum timestamps
> -----------------------------------------------------------------
>
>                 Key: HBASE-2265
>                 URL: https://issues.apache.org/jira/browse/HBASE-2265
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Todd Lipcon
>            Assignee: Pranav Khaitan
>
> In order to fix HBASE-1485 and HBASE-29, it would be very helpful to have 
> HFile and Memstore track their maximum and minimum timestamps. This has the 
> following nice properties:
> - for a straight Get, if an entry has already been found with timestamp 
> X, and X >= HFile.maxTimestamp, the HFile doesn't need to be checked. Thus, 
> the current fast behavior of get can be maintained for those who use strictly 
> increasing timestamps, but "correct" behavior for those who sometimes write 
> out-of-order.
> - for a scan, the "latest timestamp" of the storage can be used to decide 
> which cell wins, even if the timestamp of the cells is equal. In essence, 
> rather than comparing timestamps, instead you are able to compare tuples of 
> (row timestamp, storage.max_timestamp)
> - in general, min_timestamp(storage A) >= max_timestamp(storage B) if storage 
> A was flushed after storage B.
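
The first property in the description above (skipping an HFile during a Get) can be illustrated with a minimal sketch. `GetSkipSketch`, `FileLike`, `canSkip` and `filesToCheck` are hypothetical names chosen for this example, not HBase APIs. The comparison follows the rule exactly as stated in the issue: skip when X >= maxTimestamp.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only; FileLike stands in for an HFile with a known max timestamp.
final class GetSkipSketch {

    static final class FileLike {
        final String name;
        final long maxTimestamp;

        FileLike(String name, long maxTimestamp) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /** True if a file cannot hold anything newer than the entry already found. */
    static boolean canSkip(long foundTimestamp, FileLike file) {
        return foundTimestamp >= file.maxTimestamp;
    }

    /** Returns the names of the files a Get would still have to consult. */
    static List<String> filesToCheck(long foundTimestamp, List<FileLike> files) {
        List<String> remaining = new ArrayList<>();
        for (FileLike f : files) {
            if (!canSkip(foundTimestamp, f)) {
                remaining.add(f.name);
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        List<FileLike> files = Arrays.asList(
                new FileLike("older", 50L),
                new FileLike("newer", 150L));
        // An entry with timestamp 100 was already found, so only "newer" can beat it.
        System.out.println(filesToCheck(100L, files));
    }
}
```

With strictly increasing timestamps, every older file is skipped after the first hit, which keeps the current fast Get path. A file is only read when out-of-order writes mean it may hold a newer cell.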

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
