[
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871212#comment-13871212
]
Sergey Shelukhin commented on HBASE-10227:
------------------------------------------
This JIRA tries to solve it for some other issue (see Feng's comment above).
Scanners before opening the region would be possible (HBASE-10241) but they
don't require storing MVCC for all time, just for more time. You could also do
stuff like arbitrary file combinations for compactions with mvcc/seqId always
stored (so you don't need the files to be in order to compare KVs).
As for storage, one way to optimize would be a vlong delta, with the base value
in the file header. One would expect most KVs in most files to have mvccs in a
narrow range.
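
To make the delta idea concrete, here is a minimal sketch, not HBase code. The
class name MvccDeltaCodec and its methods are made up for illustration, and the
variable-length encoding is a plain 7-bits-per-byte varint rather than Hadoop's
actual vlong format. It assumes the base is the smallest mvcc in the file, so
every delta is non-negative.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch: one base value per file (kept in the header) plus a
// variable-length delta per KV. Not the HFile format and not HBase API.
final class MvccDeltaCodec {

    /** Smallest mvcc in the file; this is the value that would go in the header. */
    static long base(long[] mvccs) {
        long min = Long.MAX_VALUE;
        for (long m : mvccs) {
            min = Math.min(min, m);
        }
        return min;
    }

    /** Encodes each mvcc as a varint of (mvcc - base), 7 payload bits per byte. */
    static byte[] encode(long[] mvccs, long base) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (long m : mvccs) {
            long delta = m - base; // non-negative because base is the minimum
            while ((delta & ~0x7FL) != 0) {
                out.write((int) ((delta & 0x7F) | 0x80)); // high bit: more bytes follow
                delta >>>= 7;
            }
            out.write((int) delta);
        }
        return out.toByteArray();
    }

    /** Reverses encode(): reads count varints and adds the base back. */
    static long[] decode(byte[] buf, int count, long base) {
        long[] result = new long[count];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            long delta = 0;
            int shift = 0;
            byte b;
            do {
                b = buf[pos++];
                delta |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            result[i] = base + delta;
        }
        return result;
    }
}
```

With mvccs clustered in a narrow range, most deltas fit in a single byte instead
of the eight bytes a raw long would take.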
> When a region is opened, its mvcc isn't correctly recovered when there are
> split hlogs to replay
> ------------------------------------------------------------------------------------------------
>
> Key: HBASE-10227
> URL: https://issues.apache.org/jira/browse/HBASE-10227
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Feng Honghua
> Assignee: Feng Honghua
> Attachments: HBASE-10227-trunk_v0.patch
>
>
> When opening a region, all stores are examined to get the max MemstoreTS and
> it's used as the initial mvcc for the region, and then split hlogs are
> replayed. In fact the edits in split hlogs have kvs with greater mvcc than
> all MemstoreTS in all store files, but replaying them doesn't increment the
> mvcc accordingly at all. From an overall perspective this mvcc recovery is
> 'logically' incorrect/incomplete.
> The reason this currently doesn't cause problems is that no active scanners
> exist and no new scanners can be created before the region opening completes,
> so the mvcc of all kvs in the hfiles resulting from hlog replay can be safely
> set to zero. They are just treated as kvs put 'earlier' than the ones in
> HFiles with mvcc greater than zero (say 'earlier' since they have mvcc less
> than the ones with non-zero mvcc, but in fact they are put 'later'), and
> this has no incorrect impact only because no active scanners exist or are
> created during region opening.
> This bug is only a 'logical' one for the time being, but if later on we need
> mvcc to survive the region's whole logical lifecycle (across regionservers)
> and never be set to zero, this bug needs to be fixed first.
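
For illustration only, the 'logically complete' recovery described above could
be sketched as follows. This is not the attached patch and not HBase code: the
class MvccRecoverySketch and its method are hypothetical, and the store files
and replayed edits are reduced to plain arrays of long values.

```java
// Hypothetical sketch: the region's initial mvcc is the maximum over both the
// per-store max MemstoreTS values and the mvccs of edits replayed from split
// hlogs, rather than the store files alone.
final class MvccRecoverySketch {

    static long recoverMvcc(long[] maxMemstoreTsPerStore, long[] replayedEditMvccs) {
        long mvcc = 0;
        // Step 1: what region opening already does, take the max over all stores.
        for (long ts : maxMemstoreTsPerStore) {
            mvcc = Math.max(mvcc, ts);
        }
        // Step 2: the missing part, advance mvcc for every replayed edit.
        for (long m : replayedEditMvccs) {
            mvcc = Math.max(mvcc, m);
        }
        return mvcc;
    }
}
```

For example, with store maxima {5, 9, 7} and replayed edits carrying mvccs
{10, 12, 11}, the recovered value is 12 rather than 9.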