[ 
https://issues.apache.org/jira/browse/HUDI-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-8917:
---------------------------------
    Labels: pull-request-available  (was: )

> Reset checkpoint handling upgrade path
> --------------------------------------
>
>                 Key: HUDI-8917
>                 URL: https://issues.apache.org/jira/browse/HUDI-8917
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Davis Zhang
>            Assignee: Davis Zhang
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.0.1
>
>   Original Estimate: 16h
>          Time Spent: 24h
>  Remaining Estimate: 0h
>
> h2. PR for review [https://github.com/apache/hudi/pull/12718]
>  
> h2. Motivation
> When target table upgrades to v2, it uses completion time to track the ckp. 
> In v1, it uses request time.
>  
> From the user's perspective: * they only have 1 option for resetting the 
> checkpoint. It does not imply it is completion time based or request time 
> based.
>  * the interpretation of those single ckp options are dependent on which 
> version the target table is, which is beyond user's knowledge.
>  
> From the system's standpoint: * It has no idea the user configured config is 
> a request time or completion time. If we misread the semantics the behavior 
> is totally different.
>  
> Target: * When user set the checkpoint, it should be easy for them to figure 
> out if they are setting completion time or request time.
>  * When system playing with configs, it should treat all configs with the 
> same semantics, we are only in danger when we compare request time with 
> completion time.
>  
> Acceptable case for simplifying the design: * Given the chance is low that 1 
> instant's start time is another instants' completion time, operations like 
> {{aCompletionTime == or != aRequestTimeConfig}} is allowed. But we should 
> never allow {{<}} {{>}} {{<=}} {{>=}} between {{aRequestTimeConfig}} and 
> {{aCompletionTime}}
>  * User streamer config:
>  ** cfg.ckp needs {{<}} {{>}} {{<=}} {{>=}}
>  ** cfg.ignore_ckp only needs {{== / !=}}
>  * Commit metadata
>  ** ckp needs {{<}} {{>}} {{<=}} {{>=}}
>  ** ignore_ckp and reset_ckp only needs {{== / !=}}
> h3. Broken case 1 - User wants to override checkpoint while upgrading
> {{cfg.checkpoint}}
> As a result, when user would like to override the checkpoint while target 
> table upgrade is happening, they don't know if this is reset to some commit 
> completion time or request time.
>  
> *Auto Inference:* If it maps to request time of some instant, then we assume 
> it is request time. If it maps to some completion time, it is used as 
> completion time.
> Cons: User might not always set a time whose value maps to any instant 
> request/completion time. Also {{cfg.checkpoint}} might involves in {{<}} 
> {{>}} {{<=}} {{>=}} operations, so inferencing is not acceptable.
>  
> *Introduce checkpoint v2 config:* Users make it explicit that I'm using v2 
> config and we should use completion time. If the table has not upgraded to 
> v2, we error out. If table has upgraded, we only allow v2.
> Cons: 1 more user config. Require user action
>  
> *(Preferred) Do not allow user override ckp while upgrading:* If user set the 
> ckp and we are trying to upgrade, fail the ingestion.
> {code:java}
> needsUpgradeOrDowngrade(metaClient) && cfg.checkpoint/ignoreCheckpoint 
> => throw exception {code}
> Unless they turn off autoUpgrade. Also after upgrade complete, for checkpoint 
> we only allow format:
> {code:java}
> before <checkpoint>
> after
> resumeFromInstantRequestTime: <checkpoint>
> or
> resumeFromInstantCompletionTime: <checkpoint>{code}
> h3. Also because we enforce ckp override to keep clean during auto upgrade 
> iteration, the commit metadata generated will not have reset key and ignore 
> key, so we don't need any explicit clean up at all.
>  
> h3. Case 2 - Ckp config from the previous commit if we go with (Preferred) 
> above
> For reset ckp already taken effect before the upgrade, the reset ckp is 
> interpreted as a request time. Later when the upgrade completes, the reset 
> ckp is interpreted as completion time. Will this cause any issue?
>  
> timeline
> t0.instant1.requested t0.instant2.completed t0_t1.instant1.completed 
> t1.instant3.requested t1_t4.instant3.completed t4.instant4.requested
>  
> prev commit metadata before upgrade, they are in the same semantic
> {code:java}
> { 
>   ckp.v1: t60 <-- request time 
>   reset_ckp.v1: t50 <-- request time
> } {code}
>  
> after upgrade, in the first ingestion we do the following: # first resolve 
> between reset_ckp.v1 and ckp.v1. They do not equal, so we continue with t1. 
> *Because we do comparison before translation, we can never compare reset_key 
> as a completion time to ckp as a request time.*
>  # then the translation happens for t1, for non USE_TRANSITION_TIME case we 
> convert to completion time t4. Then we start from there. Let's say it stop at 
> some instant t5. Then we will end up with something as follows
>  
> *Just no-op*
> {code:java}
> { 
> ckp.v2: t60 <-- completion time
> reset_ckp.v2: t50 <-- request time 
> } {code}
> Please note here we failed to keep the 2 options under the same semantics, so 
> in the next ingestion, we might compare t0 and t5 and they are the same which 
> leads to issues.
>  
> *Just don't write* {{*reset_ckp.v1*}} *when upgrading to* {{*reset_ckp.v2*}} 
> **
> { ckp.v2: t5 <-- completion time reset_ckp.v2: null }
>  
>  
> ----
> candidate ckp config to consider:
>  
>  - commitMetadata.ckp (1234) -> set checkpoint.ckp as commitMetadata.ckp 
> (1234)
>  - cfg.ckp (v1/v2:1234) -> set checkpoint.ckp as cfg.ckp (1234)
> checkpoint.reset_ckp (raw value - v1/v2:1234)
>  - cfg.ignore_ckp (1234) -> set checkpoint.ckp as null
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to