[ https://issues.apache.org/jira/browse/HBASE-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pranav Khaitan updated HBASE-2265:
----------------------------------
Attachment: TimestampOptimizationV6.patch
Hi Ryan, Jonathan,
There is a major correction to this JIRA. I just did some more testing and
realized that we missed one thing in the last round of refactoring and
reformatting.
On line 422, we renamed the variable from b to timerangeBytes but did not
update the if statement on the next line (which is a big deal):
byte[] timerangeBytes = metadataMap.get(TIMERANGE_KEY);
if (b!=null)
Should be changed to:
byte[] timerangeBytes = metadataMap.get(TIMERANGE_KEY);
if (timerangeBytes != null)
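To make the corrected check concrete, here is a minimal, self-contained sketch of the pattern. It is not the code from the patch: the class name, the key value "TIMERANGE", the Map type, and the two-long byte layout are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

final class TimeRangeMetadataSketch {
    // Hypothetical key value; the real TIMERANGE_KEY constant is defined in the patch.
    static final String TIMERANGE_KEY = "TIMERANGE";

    // Returns {min, max}, or null when the file carries no time-range metadata.
    static long[] readTimeRange(Map<String, byte[]> metadataMap) {
        byte[] timerangeBytes = metadataMap.get(TIMERANGE_KEY);
        // The null check must test the variable that was just assigned.
        if (timerangeBytes != null) {
            // Assumed layout: two big-endian longs, min followed by max.
            ByteBuffer buf = ByteBuffer.wrap(timerangeBytes);
            return new long[] { buf.getLong(), buf.getLong() };
        }
        return null;
    }

    // Helper for the sketch only: encodes min and max in the assumed layout.
    static byte[] encode(long min, long max) {
        return ByteBuffer.allocate(2 * Long.BYTES).putLong(min).putLong(max).array();
    }

    public static void main(String[] args) {
        Map<String, byte[]> metadataMap = new HashMap<>();
        metadataMap.put(TIMERANGE_KEY, encode(5L, 9L));
        long[] range = readTimeRange(metadataMap);
        System.out.println(range[0] + " .. " + range[1]);
    }
}
```

With the old check on a different variable, the branch no longer depends on whether the time-range metadata is actually present, which is why the one-line change matters.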
I am also attaching the corrected patch to this mail. Please apply it ASAP
and let me know if you have any questions.
Regards,
Pranav
> HFile and Memstore should maintain minimum and maximum timestamps
> -----------------------------------------------------------------
>
> Key: HBASE-2265
> URL: https://issues.apache.org/jira/browse/HBASE-2265
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: Todd Lipcon
> Assignee: Pranav Khaitan
> Fix For: 0.90.0
>
> Attachments: TimestampOptimizationV6.patch
>
>
> In order to fix HBASE-1485 and HBASE-29, it would be very helpful to have
> HFile and Memstore track their maximum and minimum timestamps. This has the
> following nice properties:
> - for a straight Get, if an entry has already been found with timestamp
> X, and X >= HFile.maxTimestamp, the HFile doesn't need to be checked. Thus,
> the current fast behavior of Get can be maintained for those who use strictly
> increasing timestamps, while still giving "correct" behavior for those who
> sometimes write out-of-order.
> - for a scan, the "latest timestamp" of the storage can be used to decide
> which cell wins, even if the timestamps of the cells are equal. In essence,
> rather than comparing timestamps alone, you are able to compare tuples of
> (row timestamp, storage.max_timestamp).
> - in general, min_timestamp(storage A) >= max_timestamp(storage B) if storage
> A was flushed after storage B.
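
The first two properties in the description above can be sketched as follows. This is only an illustration under stated assumptions: the Storage class, canSkip, and compareCells are hypothetical names, not HBase APIs, and the sketch ignores everything except the timestamp bounds.

```java
final class StorageTimestampSketch {
    // Hypothetical stand-in for an HFile or Memstore that tracks its timestamp bounds.
    static final class Storage {
        final long minTimestamp;
        final long maxTimestamp;

        Storage(long minTimestamp, long maxTimestamp) {
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
        }
    }

    // Get: once an entry with timestamp X has been found, a storage whose
    // maximum timestamp is <= X cannot hold anything newer and can be skipped.
    static boolean canSkip(long foundTimestamp, Storage storage) {
        return foundTimestamp >= storage.maxTimestamp;
    }

    // Scan: compare (cell timestamp, storage.maxTimestamp) tuples.
    // A positive result means the first cell wins.
    static int compareCells(long tsA, Storage a, long tsB, Storage b) {
        int byCell = Long.compare(tsA, tsB);
        return byCell != 0 ? byCell : Long.compare(a.maxTimestamp, b.maxTimestamp);
    }

    public static void main(String[] args) {
        Storage older = new Storage(10L, 20L);
        Storage newer = new Storage(20L, 30L);
        System.out.println(canSkip(25L, older));
        System.out.println(compareCells(20L, newer, 20L, older));
    }
}
```

The tuple comparison only changes the outcome when the cell timestamps tie, so writers that use strictly increasing timestamps see the same ordering as before.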