[ 
https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857617#comment-13857617
 ] 

Sergey Shelukhin commented on HBASE-10241:
------------------------------------------

bq. On mvcc giving consistent view on region, that is unnecessary, right – when 
would we ever care about a consistent view across a region rather than just 
across a row (other than the fact that row boundaries are only known after the 
fact, after you have passed them out)
It can actually be pretty important. If recovery takes a while and the scanner 
bounces, the data it reads can be several minutes apart. For certain use cases 
it's much better to have consistent data for nearby rows (esp. if some sharded 
data is stored). Also, if secondary reads are implemented, the divergence between 
scanners can be even greater, so the negative effects of scanner "jumping" will 
be even more visible. Then, as suggested above, by querying mvcc from all 
requisite regions before the scanner runs, we can make it even more reasonable.
It then becomes as close as you can get to a consistent view of the data without 
implementing something like Percolator, with external timestamps. Which is 
pretty neat :)
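
To make the scanner "jumping" concrete, here is a rough toy sketch in Python. It is not HBase code: ToyRegion, read_point, scan and reopen_elsewhere are hypothetical names invented for illustration, and a single counter stands in for mvcc/seqId. The only assumption is that each cell keeps its sequence number when the region reopens, so a scanner that stored its readpoint can resume with the same view.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRegion:
    """Hypothetical stand-in for a region; not HBase's actual API."""
    cells: list = field(default_factory=list)  # (row, seq_id, value) tuples
    next_seq: int = 1

    def put(self, row, value):
        # Every write gets the next sequence id, stored with the cell.
        self.cells.append((row, self.next_seq, value))
        self.next_seq += 1

    def read_point(self):
        # Highest sequence id written so far.
        return self.next_seq - 1

    def scan(self, start_row, read_point):
        # Newest version of each row >= start_row that is visible at read_point.
        latest = {}
        for row, seq, value in self.cells:
            if row >= start_row and seq <= read_point:
                if row not in latest or seq > latest[row][0]:
                    latest[row] = (seq, value)
        return [(row, latest[row][1]) for row in sorted(latest)]

    def reopen_elsewhere(self):
        # Simulates a region move: state is rebuilt from the persisted cells,
        # which still carry their sequence ids.
        return ToyRegion(cells=list(self.cells), next_seq=self.next_seq)


region = ToyRegion()
region.put("a", "a1")
region.put("b", "b1")

rp = region.read_point()               # the client stores this readpoint
first_half = region.scan("a", rp)[:1]  # reads row "a", then the server goes away

region.put("b", "b2")                  # a write lands during recovery
moved = region.reopen_elsewhere()

# Resuming with the stored readpoint keeps the original view of row "b".
result = first_half + moved.scan("b", rp)

# Resuming with the new server's readpoint "jumps" to the later write.
jumped = first_half + moved.scan("b", moved.read_point())
```

With the stored readpoint the scan sees "a1" and "b1" together; without it, row "b" comes back as "b2" even though row "a" was read before that write happened.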


> implement mvcc-consistent scanners (across recovery)
> ----------------------------------------------------
>
>                 Key: HBASE-10241
>                 URL: https://issues.apache.org/jira/browse/HBASE-10241
>             Project: HBase
>          Issue Type: New Feature
>          Components: HFile, regionserver, Scanners
>    Affects Versions: 0.99.0
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: Consistent scanners.pdf
>
>
> Scanners currently use mvcc for consistency. However, mvcc is lost on server 
> restart, or even on a region move. This JIRA is to enable the scanners to 
> transfer mvcc (or seqId, or some other number, see HBASE-8763) between 
> servers. First, the client scanner needs to get and store the readpoint. 
> Second, mvcc needs to be preserved in the WAL. Third, mvcc needs to be stored 
> in store files per KV and discarded when no longer needed.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
