[
https://issues.apache.org/jira/browse/HUDI-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678388#comment-17678388
]
Alexey Kudinkin commented on HUDI-3828:
---------------------------------------
This is addressed in the new scanV2 implementation.
> We need to revisit MOR block merging sequence
> ---------------------------------------------
>
> Key: HUDI-3828
> URL: https://issues.apache.org/jira/browse/HUDI-3828
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Alexey Kudinkin
> Assignee: Alexey Kudinkin
> Priority: Critical
> Fix For: 0.14.0
>
>
> Currently, block merging is configurable to be either lazy or non-lazy.
> However, the non-lazy sequence is incorrect: it merges blocks before
> actually rolling back the reverted ones. To make sure users do not accidentally
> hit this issue, we need to revisit the MOR block merging sequence and make sure
> that the following invariants are upheld:
> # Blocks are merged in 2 passes:
> ## First, we load all blocks while dropping rolled-back ones; then
> ## We merge them in another forward pass
> # We should try to avoid having 2 merging sequences and instead consolidate
> on just one: right now we have "block + block" and "base + block", but we
> should be able to get away with just the latter (this will simplify the
> merging sequence quite substantially, e.g. with respect to the handling of
> deletions)
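
The two invariants above can be illustrated with a small, self-contained sketch. This is not Hudi's actual reader code: `LogBlock`, `load_valid_blocks`, `merge_onto_base` and `scan` are hypothetical names, records are modeled as plain key/value pairs, and a rollback block is assumed to invalidate every block written by its target instant.

```python
from dataclasses import dataclass, field


@dataclass
class LogBlock:
    """Hypothetical, simplified stand-in for a MOR log block."""
    instant: str                                  # instant time that wrote the block
    kind: str                                     # "data", "delete" or "rollback"
    records: dict = field(default_factory=dict)   # key -> value (data blocks)
    deleted_keys: tuple = ()                      # keys to remove (delete blocks)
    target_instant: str = ""                      # instant being reverted (rollback blocks)


def load_valid_blocks(blocks):
    """Pass 1: load all blocks, dropping the rolled-back ones.

    Nothing is merged here, so a rollback that appears later in the log
    can still invalidate an earlier block.
    """
    rolled_back = {b.target_instant for b in blocks if b.kind == "rollback"}
    return [
        b for b in blocks
        if b.kind != "rollback" and b.instant not in rolled_back
    ]


def merge_onto_base(base, blocks):
    """Pass 2: one forward pass using only the "base + block" merge."""
    merged = dict(base)
    for b in blocks:
        if b.kind == "data":
            merged.update(b.records)      # upsert records from the block
        elif b.kind == "delete":
            for key in b.deleted_keys:
                merged.pop(key, None)     # deletions apply directly to the merged state
    return merged


def scan(base, blocks):
    """Run both passes in order: filter first, merge second."""
    return merge_onto_base(base, load_valid_blocks(blocks))
```

With a base of `{"a": 1, "b": 2}`, a data block from instant 001 updating `a`, a data block from instant 002 touching `b` and `c`, a rollback of 002, and a delete of `a` at instant 004, `scan` yields `{"b": 2}`: the reverted block never reaches the merge, which is the property the non-lazy sequence described above fails to guarantee.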
--
This message was sent by Atlassian Jira
(v8.20.10#820010)