[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863531#comment-13863531 ] Enis Soztutar commented on HBASE-10241: --- We need to fix this for a couple of different reasons: - Fixing scanner consistency with multi-row transactions (see HBASE-9797) - Adding cell-based scanners, and streaming scans - Adding single-row scanners. - Consistent scanners with region replicas in case replicas are mostly up to date (HBASE-10070) What is the plan here? I think we should do subtasks 1 and 3 regardless of HBASE-8763. But it seems that if we do HBASE-8763 first, it will be much cleaner and we won't need subtask 2 at all. implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863546#comment-13863546 ] Sergey Shelukhin commented on HBASE-10241: -- Subtask 2 is trivial; the reason it is not done is that it is being done elsewhere (see how it's resolved as dup), so since it's not blocking us here and now it doesn't make sense to do double work. I am not working on this jira right now (will get back to it hopefully and there's a patch out in client subtask), but the plan was that I will do 1 and 3, and then take 2 if the other JIRA that does 2 is not done by then. HBASE-8763 does not need to block this, it's probably bigger than this entire JIRA. If it's done before this due to delays, good, if not, also good :) implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863552#comment-13863552 ] Sergey Shelukhin commented on HBASE-10241: -- Thanks for HBASE-9797 reference, yes, it is good for that! implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863590#comment-13863590 ] Enis Soztutar commented on HBASE-10241: --- bq. but the plan was that I will do 1 and 3, and then take 2 if the other JIRA that does 2 is not done by then. Sounds good. I thought HBASE-8721 is won't fix. bq. HBASE-8763 does not need to block this, it's probably bigger than this entire JIRA Indeed. But it will be a shame if we add mvcc's to WAL only to remove them again after HBASE-8763. BTW, I think we also have to handle mvcc / seqId as a part of the serialization in the KV byte array. Do we have any open issues for that? implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863615#comment-13863615 ] Sergey Shelukhin commented on HBASE-10241: -- There's another issue, HBASE-10227 for the WAL stuff. Mvcc can already be serialized with KV in HFile. Comment in KeyValue.java is a lie :) implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863651#comment-13863651 ] Enis Soztutar commented on HBASE-10241: --- bq. Mvcc can already be serialized with KV in HFile. Comment in KeyValue.java is a lie Sorry, I am not talking about mvcc serialization in hfile. I was talking about making mvcc number a part of the byte[] in KV. implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863654#comment-13863654 ] Sergey Shelukhin commented on HBASE-10241: -- That would be PITA from backward compat perspective - we'd both add a field, requiring HFileFormat v4 (don't really want tag overhead for this), and presumably (tags or not) remove the old magic mechanism implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863825#comment-13863825 ] Sergey Shelukhin commented on HBASE-10241: -- I will make a separate unrelated JIRA for this implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863829#comment-13863829 ] Sergey Shelukhin commented on HBASE-10241: -- HBASE-10288 implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13860996#comment-13860996 ] Sergey Shelukhin commented on HBASE-10241: -- final client-side patch (that works across region re-open, split and merge as long as you trick existing server into preserving MVCC) is attached to HBASE-10242 implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13858234#comment-13858234 ] Feng Honghua commented on HBASE-10241: -- I also encountered this issue when implementing JIRA-8721, the fix for deletes can mask puts that happen after the delete, and already persisted mvcc in WAL, and then they can be used to recover region's correct mvcc during re-opening. Anyone who has interest can refer it :-) Seems setting mvcc (per hfile) to zero for (minor, arguably) performance benefit can't offset the correctness penalty it brings. Persisting mvcc and survive them across regionservers is a matter of semantic correctness, seems most related issues can be resolved by making this correct, combining mvcc and seqid is not as critical as this correcting. implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857563#comment-13857563 ] stack commented on HBASE-10241: --- [~sershe] Nice writeup. sequenceid is not scoped to the region. Ergo mvcc should be too? bq. Client will have a configurable setting to enable consistent scanners. IMO, just say no to more configs. When would someone want 'inconsistent scanners'? (In my experience, users do not play with configs in 98.1745% of cases -- smile). bq. unless mvcc and seqId are merged) Would this issue be better if above work is done first? bq. ...Recovery will have to make use of mvcc when replaying the edit to new server. This work is ongoing over in @jeffrey zhong replay wal effort? mvcc is about giving you a consistent view on a row only. This work is to deal with the case where you have a wide row and you have already passed the client the first half of a row, a crash happens, and you need to return to the client the second half of the row? A section in doc. on implications of not having this change fixed and then the 'cost' of this fix going in would help. Thanks for working on this tough one [~sershe] [~lhofhansl] One for you to peruse boss... implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857600#comment-13857600 ] Sergey Shelukhin commented on HBASE-10241: -- bq. sequenceid is not scoped to the region. Ergo mvcc should be too? That is not necessary, strictly speaking. bq. IMO, just say no to more configs. When would someone want 'inconsistent scanners'? (In my experience, users do not play with configs in 98.1745% of cases – smile). That is ok, but in 1.8265% cases when you do need config you normally really need it :) It can be on by default, but at least for the first version the new functionality should be easy to disable if needed. {quote} bq.unless mvcc and seqId are merged) Would this issue be better if above work is done first? {quote} Not much. After examining stuff I'd say it almost doesn't matter. Except maybe some more work adding mvcc to WAL, but that's not much. ...Recovery will have to make use of mvcc when replaying the edit to new server. bq. This work is ongoing over in @jeffrey zhong replay wal effort? It will have to go into both that and old replay I guess, similar to nonces. There's also HBASE-10227, which is related. bq. mvcc is about giving you a consistent view on a row only. This work is to deal with the case where you have a wide row and you have already passed the client the first half of a row, a crash happens, and you need to return to the client the second half of the row? Actually, mvcc currently gives you a consistent view of the region, in theory (unless there's some glitch in code). All updates in the region are mvcc-ordered. bq. A section in doc. on implications of not having this change fixed and then the 'cost' of this fix going in would help. Will add. What do you mean by cost? implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857607#comment-13857607 ] stack commented on HBASE-10241: --- +1 on it being on by default and config is only to turn it off if problematic. On mvcc giving consistent view on region, that is unnecessary, right -- when would we ever care about a consistent view across a region rather than just across a row (other than the fact that row boundaries are only known after the fact, after you have passed them out) -- so if that the case, we should make do w/ the lesser scope if we can. When I say cost, I mean the amount of extra checks and work the server will have to do to enforce this consistency (the fancy dancing necessary enforcing the period during which a scanner may come in post recover of a server -- will we suspend compactions during this time or will compactions have a special flag which says do not drop mvcc so no full compactions during this period -- and so on). implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857617#comment-13857617 ] Sergey Shelukhin commented on HBASE-10241: -- bq. On mvcc giving consistent view on region, that is unnecessary, right – when would we ever care about a consistent view across a region rather than just across a row (other than the fact that row boundaries are only known after the fact, after you have passed them out) It can actually be pretty important... if recovery takes a while and scanner bounces the data read can be several minutes apart. For certain use cases it's much better to have consistent data for close rows (esp. if some sharded data is stored). Also, if secondary reads are implemented the divergence between scanners can be even greater, so the negative effects of scanner jumping will be even more visible. Then, as suggested above, by querying mvcc from all requisite regions before scanner runs we can make it even more reasonable. Then it becomes as close as you can get to consistent view of the data without implementing something like Percolator, with external timestamps. Which is pretty neat :) implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857696#comment-13857696 ] Lars Hofhansl commented on HBASE-10241: --- Nice writeup. It would be better even if it would state the failure scenario we're guarding against. I think we have to careful not to add any half guarantees. Currently HBase does MVCC based SI for rows (and serializable views for increment/append). If I understand this correctly, this is not needed for row consistency, right? While recovery is in progress we cannot scan, so we'd never see partial rows, right? Then what exactly are we guarding against? (As usually, I might just miss an important point :) ) implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857745#comment-13857745 ] Sergey Shelukhin commented on HBASE-10241: -- Region-level inconsistency, essentially. implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857911#comment-13857911 ] Sergey Shelukhin commented on HBASE-10241: -- WIP patch attached to child jira implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10241) implement mvcc-consistent scanners (across recovery)
[ https://issues.apache.org/jira/browse/HBASE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857348#comment-13857348 ] Sergey Shelukhin commented on HBASE-10241: -- one-pager doesn't cover merges and splits. Looks like mvccs on client will have to be tracked per range, rather than region. For even more consistent scanner, mvccs might even optionally be pre-fetched from all regions, but that is for later implement mvcc-consistent scanners (across recovery) Key: HBASE-10241 URL: https://issues.apache.org/jira/browse/HBASE-10241 Project: HBase Issue Type: New Feature Components: HFile, regionserver, Scanners Affects Versions: 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Consistent scanners.pdf Scanners currently use mvcc for consistency. However, mvcc is lost on server restart, or even a region move. This JIRA is to enable the scanners to transfer mvcc (or seqId, or some other number, see HBASE-8763) between servers. First, client scanner needs to get and store the readpoint. Second, mvcc needs to be preserved in WAL. Third, the mvcc needs to be stored in store files per KV and discarded when not needed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)