[
https://issues.apache.org/jira/browse/HBASE-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701107#comment-13701107
]
Jesse Yates edited comment on HBASE-8809 at 7/5/13 8:39 PM:
------------------------------------------------------------
As a slight follow-up to this, it feels like raw scans should also ignore the
column version/timestamp filtering. In particular, I'm talking about this
section in ScanQueryMatcher:
{code}
MatchCode colChecker = columns.checkColumn(bytes, offset, qualLength,
timestamp, type, kv.getMemstoreTS() > maxReadPointToTrackVersions);
/*
* According to current implementation, colChecker can only be
* SEEK_NEXT_COL, SEEK_NEXT_ROW, SKIP or INCLUDE. Therefore, always return
* the MatchCode. If it is SEEK_NEXT_ROW, also set stickyNextRow.
*/
...
{code}
Here the ScanWildcardColumnTracker does not ignore the timestamp, even in the
simple case: with four puts to the same row at increasing timestamps (the
default is to keep 3 versions), the scan skips the first put by default, even
though it's still "present" in the store, whether or not the scan is raw.
Thoughts?
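To make the four-puts case concrete, here is a minimal sketch of the version counting described above. It is a toy model only: the class VersionLimitSketch, the method visibleTimestamps and the ignoreVersionLimit flag are hypothetical names made up for illustration, not HBase classes or APIs, and the real ScanWildcardColumnTracker does considerably more.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: nothing here is an HBase class. */
class VersionLimitSketch {

    /**
     * Returns the timestamps that survive a per-column version limit.
     * Timestamps are supplied newest-first. When ignoreVersionLimit is
     * true (the behavior suggested for raw scans), every cell is kept.
     */
    static List<Long> visibleTimestamps(long[] newestFirst, int maxVersions,
                                        boolean ignoreVersionLimit) {
        List<Long> visible = new ArrayList<>();
        int seen = 0;
        for (long ts : newestFirst) {
            seen++;
            if (ignoreVersionLimit || seen <= maxVersions) {
                visible.add(ts);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        long[] puts = {4L, 3L, 2L, 1L}; // four puts to one column, newest first
        // Version limit applied: the oldest put (timestamp 1) is dropped.
        System.out.println(visibleTimestamps(puts, 3, false)); // [4, 3, 2]
        // Version limit ignored: all four puts are returned.
        System.out.println(visibleTimestamps(puts, 3, true));  // [4, 3, 2, 1]
    }
}
```

The only point of the sketch is that the oldest put disappears purely because of the version count, not because it has been removed from the store.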
was (Author: jesse_yates):
As a slight follow-up to this, it feels like raw scans should also ignore the
column version/timestamp filtering. In particular, I'm talking about this
section in ScanQueryMatcher:
{code}
MatchCode colChecker = columns.checkColumn(bytes, offset, qualLength,
timestamp, type, kv.getMemstoreTS() > maxReadPointToTrackVersions);
/*
* According to current implementation, colChecker can only be
* SEEK_NEXT_COL, SEEK_NEXT_ROW, SKIP or INCLUDE. Therefore, always return
* the MatchCode. If it is SEEK_NEXT_ROW, also set stickyNextRow.
*/
...
{code}
Here the ScanWildcardColumnTracker does not ignore the timestamp, even in the
simple case: with four puts to the same row at different timestamps, the scan
skips the oldest put by default, even though it's still "present" in the
store, whether or not the scan is raw.
Thoughts?
> Include deletes in the scan (setRaw) method does not respect the time range
> or the filter
> -----------------------------------------------------------------------------------------
>
> Key: HBASE-8809
> URL: https://issues.apache.org/jira/browse/HBASE-8809
> Project: HBase
> Issue Type: Bug
> Components: Scanners
> Reporter: Vasu Mariyala
> Assignee: Lars Hofhansl
> Fix For: 0.98.0, 0.95.2, 0.94.10
>
> Attachments: 8809-0.94.txt, 8809-trunk.txt, DeleteMarkers.doc
>
>
> If a row has been deleted at timestamp 'T' and a scan with time range (0,
> T-1) is executed, the scan still returns the delete marker at timestamp 'T'.
> This is because of the following code in ScanQueryMatcher.java:
> {code}
> if (retainDeletesInOutput
> || (!isUserScan && (EnvironmentEdgeManager.currentTimeMillis() -
> timestamp) <= timeToPurgeDeletes)
> || kv.getMemstoreTS() > maxReadPointToTrackVersions) {
> // always include or it is not time yet to check whether it is OK
> // to purge deletes or not
> return MatchCode.INCLUDE;
> }
> {code}
> The assumption is that a scan (even with setRaw set to true) should respect
> the filters and the time range specified.
> Please let me know if you think this behavior can be changed so that I can
> provide a patch for it.
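The time-range problem in the quoted description can be sketched as follows. This is a toy model under stated assumptions: the class DeleteMarkerTimeRangeSketch and its methods are hypothetical, the range is assumed to be half-open (minimum inclusive, maximum exclusive), and the non-raw path is collapsed to a plain SKIP. It is not the ScanQueryMatcher code and not the patch attached to the issue.

```java
/** Toy model only: these types imitate the idea, they are not HBase classes. */
class DeleteMarkerTimeRangeSketch {

    enum MatchCode { INCLUDE, SKIP }

    /** Assumed half-open range: minTs inclusive, maxTs exclusive. */
    static boolean inRange(long ts, long minTs, long maxTs) {
        return ts >= minTs && ts < maxTs;
    }

    /** Reported behavior: a raw scan includes the marker without consulting the range. */
    static MatchCode reported(boolean retainDeletesInOutput, long markerTs,
                              long minTs, long maxTs) {
        return retainDeletesInOutput ? MatchCode.INCLUDE : MatchCode.SKIP;
    }

    /** Expected behavior: the range is checked first, even for a raw scan. */
    static MatchCode expected(boolean retainDeletesInOutput, long markerTs,
                              long minTs, long maxTs) {
        if (!inRange(markerTs, minTs, maxTs)) {
            return MatchCode.SKIP;
        }
        return retainDeletesInOutput ? MatchCode.INCLUDE : MatchCode.SKIP;
    }

    public static void main(String[] args) {
        long t = 100L; // timestamp of the delete marker
        // Raw scan over the range (0, T-1): the marker at T lies outside it.
        System.out.println(reported(true, t, 0L, t - 1)); // INCLUDE
        System.out.println(expected(true, t, 0L, t - 1)); // SKIP
    }
}
```

The two methods differ only in whether the range check runs before the raw-scan shortcut, which is the ordering question the report raises.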