[
https://issues.apache.org/jira/browse/HUDI-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Davis Zhang updated HUDI-8917:
------------------------------
Summary: Reset checkpoint handling upgrade path (was: Reset checkpoint
handling)
> Reset checkpoint handling upgrade path
> --------------------------------------
>
> Key: HUDI-8917
> URL: https://issues.apache.org/jira/browse/HUDI-8917
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Davis Zhang
> Assignee: Davis Zhang
> Priority: Blocker
> Fix For: 1.0.1
>
> Original Estimate: 16h
> Time Spent: 5h
> Remaining Estimate: 15h
>
> h2. Motivation
> When target table upgrades to v2, it uses completion time to track the ckp.
> In v1, it uses request time.
>
> From the user's perspective: * they only have 1 option for resetting the
> checkpoint. It does not imply it is completion time based or request time
> based.
> * the interpretation of those single ckp options are dependent on which
> version the target table is, which is beyond user's knowledge.
>
> From the system's standpoint: * It has no idea the user configured config is
> a request time or completion time. If we misread the semantics the behavior
> is totally different.
>
> Target: * When user set the checkpoint, it should be easy for them to figure
> out if they are setting completion time or request time.
> * When system playing with configs, it should treat all configs with the
> same semantics, we are only in danger when we compare request time with
> completion time.
>
> Acceptable case for simplifying the design: * Given the chance is low that 1
> instant's start time is another instants' completion time, operations like
> {{aCompletionTime == or != aRequestTimeConfig}} is allowed. But we should
> never allow {{<}} {{>}} {{<=}} {{>=}} between {{aRequestTimeConfig}} and
> {{aCompletionTime}}
> * User streamer config:
> ** cfg.ckp needs {{<}} {{>}} {{<=}} {{>=}}
> ** cfg.ignore_ckp only needs {{== / !=}}
> * Commit metadata
> ** ckp needs {{<}} {{>}} {{<=}} {{>=}}
> ** ignore_ckp and reset_ckp only needs {{== / !=}}
> h3. Broken case 1 - User wants to override checkpoint while upgrading
> {{cfg.checkpoint}}
> As a result, when user would like to override the checkpoint while target
> table upgrade is happening, they don't know if this is reset to some commit
> completion time or request time.
>
> *Auto Inference:* If it maps to request time of some instant, then we assume
> it is request time. If it maps to some completion time, it is used as
> completion time.
> Cons: User might not always set a time whose value maps to any instant
> request/completion time. Also {{cfg.checkpoint}} might involves in {{<}}
> {{>}} {{<=}} {{>=}} operations, so inferencing is not acceptable.
>
> *Introduce checkpoint v2 config:* Users make it explicit that I'm using v2
> config and we should use completion time. If the table has not upgraded to
> v2, we error out. If table has upgraded, we only allow v2.
> Cons: 1 more user config. Require user action
>
> *(Preferred) Do not allow user override ckp while upgrading:* If user set the
> ckp and we are trying to upgrade, fail the ingestion.
> {code:java}
> needsUpgradeOrDowngrade(metaClient) && cfg.checkpoint/ignoreCheckpoint
> => throw exception {code}
> Unless they turn off autoUpgrade. Also after upgrade complete, for checkpoint
> we only allow format:
> {code:java}
> before <checkpoint>
> after
> resumeFromInstantRequestTime: <checkpoint>
> or
> resumeFromInstantCompletionTime: <checkpoint>{code}
> h3. Case 2 - Ckp config from the previous commit if we go with (Preferred)
> above
> For reset ckp already taken effect before the upgrade, the reset ckp is
> interpreted as a request time. Later when the upgrade completes, the reset
> ckp is interpreted as completion time. Will this cause any issue?
>
> timeline
> t0.instant1.requested t0.instant2.completed t0_t1.instant1.completed
> t1.instant3.requested t1_t4.instant3.completed t4.instant4.requested
>
> prev commit metadata before upgrade, they are in the same semantic
> {code:java}
> {
> ckp.v1: t60 <-- request time
> reset_ckp.v1: t50 <-- request time
> } {code}
>
> after upgrade, in the first ingestion we do the following: # first resolve
> between reset_ckp.v1 and ckp.v1. They do not equal, so we continue with t1.
> *Because we do comparison before translation, we can never compare reset_key
> as a completion time to ckp as a request time.*
> # then the translation happens for t1, for non USE_TRANSITION_TIME case we
> convert to completion time t4. Then we start from there. Let's say it stop at
> some instant t5. Then we will end up with something as follows
>
> *Just no-op*
> {code:java}
> {
> ckp.v2: t60 <-- completion time
> reset_ckp.v2: t50 <-- request time
> } {code}
> Please note here we failed to keep the 2 options under the same semantics, so
> in the next ingestion, we might compare t0 and t5 and they are the same which
> leads to issues.
>
> *Just don't write* {{*reset_ckp.v1*}} *when upgrading to* {{*reset_ckp.v2*}}
> **
> { ckp.v2: t5 <-- completion time reset_ckp.v2: null }
>
>
> ----
> candidate ckp config to consider:
>
> - commitMetadata.ckp (1234) -> set checkpoint.ckp as commitMetadata.ckp
> (1234)
> - cfg.ckp (v1/v2:1234) -> set checkpoint.ckp as cfg.ckp (1234)
> checkpoint.reset_ckp (raw value - v1/v2:1234)
> - cfg.ignore_ckp (1234) -> set checkpoint.ckp as null
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)