[
https://issues.apache.org/jira/browse/HUDI-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Davis Zhang updated HUDI-8917:
------------------------------
Description:
h2. PR for review [https://github.com/apache/hudi/pull/12718]
h2. Motivation
When target table upgrades to v2, it uses completion time to track the ckp. In
v1, it uses request time.
>From the user's perspective: * they only have 1 option for resetting the
>checkpoint. It does not imply it is completion time based or request time
>based.
* the interpretation of those single ckp options are dependent on which
version the target table is, which is beyond user's knowledge.
>From the system's standpoint: * It has no idea the user configured config is a
>request time or completion time. If we misread the semantics the behavior is
>totally different.
Target: * When user set the checkpoint, it should be easy for them to figure
out if they are setting completion time or request time.
* When system playing with configs, it should treat all configs with the same
semantics, we are only in danger when we compare request time with completion
time.
Acceptable case for simplifying the design: * Given the chance is low that 1
instant's start time is another instants' completion time, operations like
{{aCompletionTime == or != aRequestTimeConfig}} is allowed. But we should never
allow {{<}} {{>}} {{<=}} {{>=}} between {{aRequestTimeConfig}} and
{{aCompletionTime}}
* User streamer config:
** cfg.ckp needs {{<}} {{>}} {{<=}} {{>=}}
** cfg.ignore_ckp only needs {{== / !=}}
* Commit metadata
** ckp needs {{<}} {{>}} {{<=}} {{>=}}
** ignore_ckp and reset_ckp only needs {{== / !=}}
h3. Broken case 1 - User wants to override checkpoint while upgrading
{{cfg.checkpoint}}
As a result, when user would like to override the checkpoint while target table
upgrade is happening, they don't know if this is reset to some commit
completion time or request time.
*Auto Inference:* If it maps to request time of some instant, then we assume it
is request time. If it maps to some completion time, it is used as completion
time.
Cons: User might not always set a time whose value maps to any instant
request/completion time. Also {{cfg.checkpoint}} might involves in {{<}} {{>}}
{{<=}} {{>=}} operations, so inferencing is not acceptable.
*Introduce checkpoint v2 config:* Users make it explicit that I'm using v2
config and we should use completion time. If the table has not upgraded to v2,
we error out. If table has upgraded, we only allow v2.
Cons: 1 more user config. Require user action
*(Preferred) Do not allow user override ckp while upgrading:* If user set the
ckp and we are trying to upgrade, fail the ingestion.
{code:java}
needsUpgradeOrDowngrade(metaClient) && cfg.checkpoint/ignoreCheckpoint
=> throw exception {code}
Unless they turn off autoUpgrade. Also after upgrade complete, for checkpoint
we only allow format:
{code:java}
before <checkpoint>
after
resumeFromInstantRequestTime: <checkpoint>
or
resumeFromInstantCompletionTime: <checkpoint>{code}
h3. Also because we enforce ckp override to keep clean during auto upgrade
iteration, the commit metadata generated will not have reset key and ignore
key, so we don't need any explicit clean up at all.
h3. Case 2 - Ckp config from the previous commit if we go with (Preferred) above
For reset ckp already taken effect before the upgrade, the reset ckp is
interpreted as a request time. Later when the upgrade completes, the reset ckp
is interpreted as completion time. Will this cause any issue?
timeline
t0.instant1.requested t0.instant2.completed t0_t1.instant1.completed
t1.instant3.requested t1_t4.instant3.completed t4.instant4.requested
prev commit metadata before upgrade, they are in the same semantic
{code:java}
{
ckp.v1: t60 <-- request time
reset_ckp.v1: t50 <-- request time
} {code}
after upgrade, in the first ingestion we do the following: # first resolve
between reset_ckp.v1 and ckp.v1. They do not equal, so we continue with t1.
*Because we do comparison before translation, we can never compare reset_key as
a completion time to ckp as a request time.*
# then the translation happens for t1, for non USE_TRANSITION_TIME case we
convert to completion time t4. Then we start from there. Let's say it stop at
some instant t5. Then we will end up with something as follows
*Just no-op*
{code:java}
{
ckp.v2: t60 <-- completion time
reset_ckp.v2: t50 <-- request time
} {code}
Please note here we failed to keep the 2 options under the same semantics, so
in the next ingestion, we might compare t0 and t5 and they are the same which
leads to issues.
*Just don't write* {{*reset_ckp.v1*}} *when upgrading to* {{*reset_ckp.v2*}} **
{ ckp.v2: t5 <-- completion time reset_ckp.v2: null }
----
candidate ckp config to consider:
- commitMetadata.ckp (1234) -> set checkpoint.ckp as commitMetadata.ckp (1234)
- cfg.ckp (v1/v2:1234) -> set checkpoint.ckp as cfg.ckp (1234)
checkpoint.reset_ckp (raw value - v1/v2:1234)
- cfg.ignore_ckp (1234) -> set checkpoint.ckp as null
was:
h2. PR WIP https://github.com/Davis-Zhang-Onehouse/hudi-oss/pull/1
h2. Motivation
When target table upgrades to v2, it uses completion time to track the ckp. In
v1, it uses request time.
>From the user's perspective: * they only have 1 option for resetting the
>checkpoint. It does not imply it is completion time based or request time
>based.
* the interpretation of those single ckp options are dependent on which
version the target table is, which is beyond user's knowledge.
>From the system's standpoint: * It has no idea the user configured config is a
>request time or completion time. If we misread the semantics the behavior is
>totally different.
Target: * When user set the checkpoint, it should be easy for them to figure
out if they are setting completion time or request time.
* When system playing with configs, it should treat all configs with the same
semantics, we are only in danger when we compare request time with completion
time.
Acceptable case for simplifying the design: * Given the chance is low that 1
instant's start time is another instants' completion time, operations like
{{aCompletionTime == or != aRequestTimeConfig}} is allowed. But we should never
allow {{<}} {{>}} {{<=}} {{>=}} between {{aRequestTimeConfig}} and
{{aCompletionTime}}
* User streamer config:
** cfg.ckp needs {{<}} {{>}} {{<=}} {{>=}}
** cfg.ignore_ckp only needs {{== / !=}}
* Commit metadata
** ckp needs {{<}} {{>}} {{<=}} {{>=}}
** ignore_ckp and reset_ckp only needs {{== / !=}}
h3. Broken case 1 - User wants to override checkpoint while upgrading
{{cfg.checkpoint}}
As a result, when user would like to override the checkpoint while target table
upgrade is happening, they don't know if this is reset to some commit
completion time or request time.
*Auto Inference:* If it maps to request time of some instant, then we assume it
is request time. If it maps to some completion time, it is used as completion
time.
Cons: User might not always set a time whose value maps to any instant
request/completion time. Also {{cfg.checkpoint}} might involves in {{<}} {{>}}
{{<=}} {{>=}} operations, so inferencing is not acceptable.
*Introduce checkpoint v2 config:* Users make it explicit that I'm using v2
config and we should use completion time. If the table has not upgraded to v2,
we error out. If table has upgraded, we only allow v2.
Cons: 1 more user config. Require user action
*(Preferred) Do not allow user override ckp while upgrading:* If user set the
ckp and we are trying to upgrade, fail the ingestion.
{code:java}
needsUpgradeOrDowngrade(metaClient) && cfg.checkpoint/ignoreCheckpoint
=> throw exception {code}
Unless they turn off autoUpgrade. Also after upgrade complete, for checkpoint
we only allow format:
{code:java}
before <checkpoint>
after
resumeFromInstantRequestTime: <checkpoint>
or
resumeFromInstantCompletionTime: <checkpoint>{code}
h3. Also because we enforce ckp override to keep clean during auto upgrade
iteration, the commit metadata generated will not have reset key and ignore
key, so we don't need any explicit clean up at all.
h3. Case 2 - Ckp config from the previous commit if we go with (Preferred) above
For reset ckp already taken effect before the upgrade, the reset ckp is
interpreted as a request time. Later when the upgrade completes, the reset ckp
is interpreted as completion time. Will this cause any issue?
timeline
t0.instant1.requested t0.instant2.completed t0_t1.instant1.completed
t1.instant3.requested t1_t4.instant3.completed t4.instant4.requested
prev commit metadata before upgrade, they are in the same semantic
{code:java}
{
ckp.v1: t60 <-- request time
reset_ckp.v1: t50 <-- request time
} {code}
after upgrade, in the first ingestion we do the following: # first resolve
between reset_ckp.v1 and ckp.v1. They do not equal, so we continue with t1.
*Because we do comparison before translation, we can never compare reset_key as
a completion time to ckp as a request time.*
# then the translation happens for t1, for non USE_TRANSITION_TIME case we
convert to completion time t4. Then we start from there. Let's say it stop at
some instant t5. Then we will end up with something as follows
*Just no-op*
{code:java}
{
ckp.v2: t60 <-- completion time
reset_ckp.v2: t50 <-- request time
} {code}
Please note here we failed to keep the 2 options under the same semantics, so
in the next ingestion, we might compare t0 and t5 and they are the same which
leads to issues.
*Just don't write* {{*reset_ckp.v1*}} *when upgrading to* {{*reset_ckp.v2*}} **
{ ckp.v2: t5 <-- completion time reset_ckp.v2: null }
----
candidate ckp config to consider:
- commitMetadata.ckp (1234) -> set checkpoint.ckp as commitMetadata.ckp (1234)
- cfg.ckp (v1/v2:1234) -> set checkpoint.ckp as cfg.ckp (1234)
checkpoint.reset_ckp (raw value - v1/v2:1234)
- cfg.ignore_ckp (1234) -> set checkpoint.ckp as null
> Reset checkpoint handling upgrade path
> --------------------------------------
>
> Key: HUDI-8917
> URL: https://issues.apache.org/jira/browse/HUDI-8917
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Davis Zhang
> Assignee: Davis Zhang
> Priority: Blocker
> Fix For: 1.0.1
>
> Original Estimate: 16h
> Time Spent: 19.5h
> Remaining Estimate: 0.5h
>
> h2. PR for review [https://github.com/apache/hudi/pull/12718]
>
> h2. Motivation
> When target table upgrades to v2, it uses completion time to track the ckp.
> In v1, it uses request time.
>
> From the user's perspective: * they only have 1 option for resetting the
> checkpoint. It does not imply it is completion time based or request time
> based.
> * the interpretation of those single ckp options are dependent on which
> version the target table is, which is beyond user's knowledge.
>
> From the system's standpoint: * It has no idea the user configured config is
> a request time or completion time. If we misread the semantics the behavior
> is totally different.
>
> Target: * When user set the checkpoint, it should be easy for them to figure
> out if they are setting completion time or request time.
> * When system playing with configs, it should treat all configs with the
> same semantics, we are only in danger when we compare request time with
> completion time.
>
> Acceptable case for simplifying the design: * Given the chance is low that 1
> instant's start time is another instants' completion time, operations like
> {{aCompletionTime == or != aRequestTimeConfig}} is allowed. But we should
> never allow {{<}} {{>}} {{<=}} {{>=}} between {{aRequestTimeConfig}} and
> {{aCompletionTime}}
> * User streamer config:
> ** cfg.ckp needs {{<}} {{>}} {{<=}} {{>=}}
> ** cfg.ignore_ckp only needs {{== / !=}}
> * Commit metadata
> ** ckp needs {{<}} {{>}} {{<=}} {{>=}}
> ** ignore_ckp and reset_ckp only needs {{== / !=}}
> h3. Broken case 1 - User wants to override checkpoint while upgrading
> {{cfg.checkpoint}}
> As a result, when user would like to override the checkpoint while target
> table upgrade is happening, they don't know if this is reset to some commit
> completion time or request time.
>
> *Auto Inference:* If it maps to request time of some instant, then we assume
> it is request time. If it maps to some completion time, it is used as
> completion time.
> Cons: User might not always set a time whose value maps to any instant
> request/completion time. Also {{cfg.checkpoint}} might involves in {{<}}
> {{>}} {{<=}} {{>=}} operations, so inferencing is not acceptable.
>
> *Introduce checkpoint v2 config:* Users make it explicit that I'm using v2
> config and we should use completion time. If the table has not upgraded to
> v2, we error out. If table has upgraded, we only allow v2.
> Cons: 1 more user config. Require user action
>
> *(Preferred) Do not allow user override ckp while upgrading:* If user set the
> ckp and we are trying to upgrade, fail the ingestion.
> {code:java}
> needsUpgradeOrDowngrade(metaClient) && cfg.checkpoint/ignoreCheckpoint
> => throw exception {code}
> Unless they turn off autoUpgrade. Also after upgrade complete, for checkpoint
> we only allow format:
> {code:java}
> before <checkpoint>
> after
> resumeFromInstantRequestTime: <checkpoint>
> or
> resumeFromInstantCompletionTime: <checkpoint>{code}
> h3. Also because we enforce ckp override to keep clean during auto upgrade
> iteration, the commit metadata generated will not have reset key and ignore
> key, so we don't need any explicit clean up at all.
>
> h3. Case 2 - Ckp config from the previous commit if we go with (Preferred)
> above
> For reset ckp already taken effect before the upgrade, the reset ckp is
> interpreted as a request time. Later when the upgrade completes, the reset
> ckp is interpreted as completion time. Will this cause any issue?
>
> timeline
> t0.instant1.requested t0.instant2.completed t0_t1.instant1.completed
> t1.instant3.requested t1_t4.instant3.completed t4.instant4.requested
>
> prev commit metadata before upgrade, they are in the same semantic
> {code:java}
> {
> ckp.v1: t60 <-- request time
> reset_ckp.v1: t50 <-- request time
> } {code}
>
> after upgrade, in the first ingestion we do the following: # first resolve
> between reset_ckp.v1 and ckp.v1. They do not equal, so we continue with t1.
> *Because we do comparison before translation, we can never compare reset_key
> as a completion time to ckp as a request time.*
> # then the translation happens for t1, for non USE_TRANSITION_TIME case we
> convert to completion time t4. Then we start from there. Let's say it stop at
> some instant t5. Then we will end up with something as follows
>
> *Just no-op*
> {code:java}
> {
> ckp.v2: t60 <-- completion time
> reset_ckp.v2: t50 <-- request time
> } {code}
> Please note here we failed to keep the 2 options under the same semantics, so
> in the next ingestion, we might compare t0 and t5 and they are the same which
> leads to issues.
>
> *Just don't write* {{*reset_ckp.v1*}} *when upgrading to* {{*reset_ckp.v2*}}
> **
> { ckp.v2: t5 <-- completion time reset_ckp.v2: null }
>
>
> ----
> candidate ckp config to consider:
>
> - commitMetadata.ckp (1234) -> set checkpoint.ckp as commitMetadata.ckp
> (1234)
> - cfg.ckp (v1/v2:1234) -> set checkpoint.ckp as cfg.ckp (1234)
> checkpoint.reset_ckp (raw value - v1/v2:1234)
> - cfg.ignore_ckp (1234) -> set checkpoint.ckp as null
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)