[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857617#comment-13857617 ]
Sergey Shelukhin commented on HBASE-10241:
------------------------------------------
bq. On mvcc giving consistent view on region, that is unnecessary, right – when
would we ever care about a consistent view across a region rather than just
across a row (other than the fact that row boundaries are only known after the
fact, after you have passed them out)
It can actually be pretty important... if recovery takes a while and the
scanner bounces, the data it reads can be several minutes apart. For certain
use cases it's much better to have consistent data for nearby rows (esp. if
some sharded data is stored). Also, if secondary reads are implemented, the
divergence between scanners can be even greater, so the negative effects of
scanner "jumping" will be even more visible. Then, as suggested above, by
querying mvcc from all requisite regions before the scanner runs, we can make
the scan even more consistent. That is about as close as you can get to a
consistent view of the data without implementing something like Percolator,
with external timestamps. Which is pretty neat :)
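
A minimal sketch of the "query mvcc from all requisite regions before the
scanner runs" idea, as a toy in-memory model. Everything here is an assumption
made for illustration: Region, Cell, captureReadPoints and scanAll are
hypothetical names, not HBase classes or APIs, and the model simply assumes
that each cell keeps its mvcc number across a restart or region move.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: Region, Cell and the method names are hypothetical, not HBase classes.
public class ConsistentScanSketch {

    // A cell that keeps the mvcc number of the write that produced it.
    static final class Cell {
        final String row;
        final String value;
        final long mvcc;

        Cell(String row, String value, long mvcc) {
            this.row = row;
            this.value = value;
            this.mvcc = mvcc;
        }
    }

    // Stand-in for a region. Cells are kept in write order together with their
    // mvcc, which models mvcc surviving a restart or a region move.
    static final class Region {
        final String name;
        final List<Cell> cells = new ArrayList<>();
        long readPoint = 0;

        Region(String name) {
            this.name = name;
        }

        void put(String row, String value) {
            readPoint++;
            cells.add(new Cell(row, value, readPoint));
        }

        // Newest value per row among cells written at or before the given read point.
        Map<String, String> scan(long atReadPoint) {
            Map<String, String> out = new TreeMap<>();
            for (Cell c : cells) {
                if (c.mvcc <= atReadPoint) {
                    out.put(c.row, c.value); // later writes overwrite earlier ones
                }
            }
            return out;
        }
    }

    // Done once, before the scan starts: remember each region's read point.
    static Map<String, Long> captureReadPoints(List<Region> regions) {
        Map<String, Long> points = new HashMap<>();
        for (Region r : regions) {
            points.put(r.name, r.readPoint);
        }
        return points;
    }

    // Scan every region at its remembered read point, however late the scan reaches it.
    static Map<String, String> scanAll(List<Region> regions, Map<String, Long> points) {
        Map<String, String> out = new TreeMap<>();
        for (Region r : regions) {
            out.putAll(r.scan(points.get(r.name)));
        }
        return out;
    }

    public static void main(String[] args) {
        Region a = new Region("a");
        Region b = new Region("b");
        a.put("row1", "v1");
        b.put("row9", "v1");
        List<Region> regions = List.of(a, b);

        Map<String, Long> points = captureReadPoints(regions);

        // Writes that land while the scanner is stalled (for example during recovery).
        a.put("row1", "v2");
        b.put("row9", "v2");

        System.out.println(scanAll(regions, points));                     // {row1=v1, row9=v1}
        System.out.println(scanAll(regions, captureReadPoints(regions))); // {row1=v2, row9=v2}
    }
}
```

In this model the scan that reuses the saved read points still sees v1 in both
regions even though both were updated in the meantime, while a scan that takes
fresh read points sees v2. A real implementation would also have to deal with
compactions discarding old versions, which the sketch ignores.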
> implement mvcc-consistent scanners (across recovery)
> ----------------------------------------------------
>
> Key: HBASE-10241
> URL: https://issues.apache.org/jira/browse/HBASE-10241
> Project: HBase
> Issue Type: New Feature
> Components: HFile, regionserver, Scanners
> Affects Versions: 0.99.0
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: Consistent scanners.pdf
>
>
> Scanners currently use mvcc for consistency. However, mvcc is lost on a
> server restart, or even on a region move. This JIRA is to enable the scanners
> to transfer mvcc (or seqId, or some other number, see HBASE-8763) between
> servers. First, the client scanner needs to get and store the readpoint.
> Second, mvcc needs to be preserved in the WAL. Third, mvcc needs to be stored
> in store files per KV and discarded when no longer needed.