[
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508495#comment-14508495
]
Jeffrey Zhong commented on HBASE-13389:
---------------------------------------
[[email protected]] Well said and good examples! As of today, there are two
cases where we could have out-of-order puts: DLR or replication, where the order
of WAL files to be replayed isn't guaranteed.
For non-adjacent hfile compactions, it seems that we have to keep mvcc at the KV
level. For example, take hfile1 (max mvcc=1), hfile2 (max mvcc=2), and hfile3 (max
mvcc=3). If we compact just hfile1 and hfile3, we can't set the newly compacted
hfile's max mvcc to 3, because hfile2 may contain the same rows as hfile1 or hfile3.
Keeping mvcc per KV makes the "haunting" out-of-order issue go away, leaving one
less concern. Let me know which option we should go with; I can also help with the fix.
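To make the hfile1/hfile3 example concrete, here is a rough Java sketch. It is a
toy model, not HBase code: the Cell and HFile classes, the compact() and read()
helpers, and the rule "a cell without its own mvcc inherits the file's max mvcc"
are all assumptions made for illustration. It also assumes the competing cells
share the same timestamp, so that only mvcc can break the tie.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MvccCompactionSketch {

    /** Toy cell (hypothetical): mvcc == 0 means the per-cell mvcc was stripped. */
    static final class Cell {
        final String row;
        final String value;
        final long mvcc;

        Cell(String row, String value, long mvcc) {
            this.row = row;
            this.value = value;
            this.mvcc = mvcc;
        }
    }

    /** Toy hfile (hypothetical): a list of cells plus a file-level max mvcc. */
    static final class HFile {
        final long maxMvcc;
        final List<Cell> cells;

        HFile(long maxMvcc, List<Cell> cells) {
            this.maxMvcc = maxMvcc;
            this.cells = cells;
        }
    }

    /** Merge the inputs; the output file's max mvcc is the max over the inputs. */
    static HFile compact(boolean keepCellMvcc, HFile... inputs) {
        long maxMvcc = 0;
        List<Cell> merged = new ArrayList<>();
        for (HFile f : inputs) {
            maxMvcc = Math.max(maxMvcc, f.maxMvcc);
            for (Cell c : f.cells) {
                // Either carry the per-cell mvcc forward or strip it (0).
                merged.add(new Cell(c.row, c.value, keepCellMvcc ? c.mvcc : 0));
            }
        }
        return new HFile(maxMvcc, merged);
    }

    /** Return the value with the highest effective mvcc for the given row. */
    static String read(String row, HFile... files) {
        String best = null;
        long bestMvcc = -1;
        for (HFile f : files) {
            for (Cell c : f.cells) {
                if (!c.row.equals(row)) {
                    continue;
                }
                // A stripped cell falls back to its file's max mvcc.
                long effective = (c.mvcc != 0) ? c.mvcc : f.maxMvcc;
                if (effective > bestMvcc) {
                    best = c.value;
                    bestMvcc = effective;
                }
            }
        }
        return best;
    }

    /** Compact hfile1 + hfile3 (skipping hfile2), then read row "r". */
    static String newestAfterCompaction(boolean keepCellMvcc) {
        HFile hfile1 = new HFile(1, Arrays.asList(new Cell("r", "v1", 1)));
        HFile hfile2 = new HFile(2, Arrays.asList(new Cell("r", "v2", 2)));
        HFile hfile3 = new HFile(3, Arrays.asList(new Cell("s", "v3", 3)));
        HFile compacted = compact(keepCellMvcc, hfile1, hfile3);
        return read("r", compacted, hfile2);
    }

    public static void main(String[] args) {
        System.out.println("mvcc stripped: " + newestAfterCompaction(false)); // v1 (stale)
        System.out.println("mvcc kept:     " + newestAfterCompaction(true));  // v2
    }
}
```

In this model, stripping the per-cell mvcc lets the old "v1" from hfile1 inherit
max mvcc=3 and shadow the newer "v2" in hfile2; keeping mvcc on each cell
preserves the real order.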
> [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
> -------------------------------------------------------------
>
> Key: HBASE-13389
> URL: https://issues.apache.org/jira/browse/HBASE-13389
> Project: HBase
> Issue Type: Sub-task
> Components: Performance
> Reporter: stack
> Attachments: 13389.txt
>
>
> HBASE-12600 moved the edit sequenceid from tags to instead exploit the
> mvcc/sequenceid slot in a key. Now Cells near-always have an associated
> mvcc/sequenceid, where previously it was rare or the mvcc was kept up at the
> file level. This is sort of how it should be, many of us would argue, but as a
> side-effect, read-time optimizations that helped speed scans were undone by
> this change.
> In this issue, let's see if we can get the optimizations back -- or just
> remove the optimizations altogether.
> The parse of mvcc/sequenceid is expensive. It was noticed over in HBASE-13291.
> The optimizations undone by this change are (to quote the optimizer himself,
> Mr [~lhofhansl]):
> {quote}
> Looks like this undoes all of HBASE-9751, HBASE-8151, and HBASE-8166.
> We're always storing the mvcc readpoints, and we never compare them against
> the actual smallestReadpoint, and hence we're always performing all the
> checks, tests, and comparisons that these jiras removed in addition to
> actually storing the data - which with up to 8 bytes per Cell is not trivial.
> {quote}
> This is the 'breaking' change:
> https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96