[
https://issues.apache.org/jira/browse/HUDI-9073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Davis Zhang updated HUDI-9073:
------------------------------
Fix Version/s: 1.1.0
> Deprecate request time ordering with commit time ordering in 1.x
> ----------------------------------------------------------------
>
> Key: HUDI-9073
> URL: https://issues.apache.org/jira/browse/HUDI-9073
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Davis Zhang
> Priority: Major
> Fix For: 1.1.0
>
>
> h2. Motivation
> Back in Hudi 0.x, all instants were ordered by "request time", which marks
> the "initiation of a given write operation". Hudi 1.x, on the other hand,
> introduced "completion time", which marks the point at which the changes
> applied by the write operation are committed and visible to readers/writers.
>
> From a standard DBMS point of view, concurrent readers and writers should
> only ever observe "completion time based" event ordering. The notion of
> "request time" may still serve a purpose inside Hudi, for example to
> coordinate internal state across table services and other writers. Yet
> leaking "request time" to Hudi consumers is a design flaw, because what they
> actually care about is completion time ordering.
>
> As of today, various band-aids have been applied to fill the gap: we
> introduced hollow commit handling and also "state transition time
> ordering". These workarounds add unnecessary complexity and impair
> maintainability and dev velocity.
>
> h3. Ideal end state
> In 99.99% of cases we should use completion time based ordering. This means
> all V2 instant generators use CompletionTimeComparator.
>
> We should revisit all request time based ordering logic (request time
> comparator usages) and replace it with completion time based ordering where
> possible. Where that is not possible, document in the code why the request
> time is needed and why it cannot be avoided.
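
To make the difference between the two orderings concrete, here is a minimal sketch. It is not Hudi code: `WriteInstant`, `BY_REQUEST_TIME`, `BY_COMPLETION_TIME` and `order` are hypothetical names invented for illustration, and timestamps are assumed to be fixed-width strings that sort lexicographically. It only shows that a write requested first but completed last appears in a different position under each ordering.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class OrderingSketch {
    // Hypothetical stand-in for a timeline instant; not Hudi's own instant class.
    static final class WriteInstant {
        final String requestTime;     // when the write was initiated
        final String completionTime;  // when its changes became visible

        WriteInstant(String requestTime, String completionTime) {
            this.requestTime = requestTime;
            this.completionTime = completionTime;
        }
    }

    // Orders instants by when each write was requested.
    static final Comparator<WriteInstant> BY_REQUEST_TIME =
        Comparator.comparing((WriteInstant i) -> i.requestTime);

    // Orders instants by when each write completed; request time breaks ties.
    static final Comparator<WriteInstant> BY_COMPLETION_TIME =
        Comparator.comparing((WriteInstant i) -> i.completionTime)
                  .thenComparing(i -> i.requestTime);

    // Returns the request times of the instants, sorted with the given comparator.
    static List<String> order(List<WriteInstant> instants, Comparator<WriteInstant> cmp) {
        List<WriteInstant> sorted = new ArrayList<>(instants);
        sorted.sort(cmp);
        List<String> ids = new ArrayList<>();
        for (WriteInstant i : sorted) {
            ids.add(i.requestTime);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Write "001" starts first but finishes after write "002".
        List<WriteInstant> timeline = List.of(
            new WriteInstant("001", "005"),
            new WriteInstant("002", "003"));

        System.out.println(order(timeline, BY_REQUEST_TIME));    // [001, 002]
        System.out.println(order(timeline, BY_COMPLETION_TIME)); // [002, 001]
    }
}
```

A reader that only sees committed data observes write "002" before write "001", which is the order the completion time comparator produces.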