sivabalan narayanan created HUDI-8451:
-----------------------------------------

             Summary: Followup to fix all callers to HoodieLogRecordReader to 
set the right value for max instant time
                 Key: HUDI-8451
                 URL: https://issues.apache.org/jira/browse/HUDI-8451
             Project: Apache Hudi
          Issue Type: Improvement
          Components: reader-core
            Reporter: sivabalan narayanan


As part of [https://github.com/apache/hudi/pull/12033], we fixed an issue where 
the log record reader failed to read a data block in some edge cases. 

The fix ensures that the log record reader accounts for all rollback blocks, 
regardless of the max instant time configured on the reader.

 

But let's also follow through and see if we can fix all callers to set the 
right value for the max instant time. 

 
 
Say we have t1.dc and t2.dc, and t2.dc crashed midway.
The current layout is:
base file (t1), lf1 (partially committed data with t2 as instant time)
 
Then, say, we start t5.dc. Just as t5.dc starts, Hudi detects the pending commit 
and triggers a rollback. This rollback gets an *instant time of t6 
(t6.rb). Note that the rollback's commit time is greater than t5, the currently 
ongoing delta commit.* 
So, once the rollback completes, this is the layout:
 
base file, lf1(from t2.dc partially failed), lf3 (rollback command block with 
t6).
 
And once t5.dc completes, this is what the layout looks like:
 
base file, lf1 (from t2.dc partially failed), *lf3 (rollback command block with 
t6), lf4 (from t5)*
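
To make the scenario concrete, here is a minimal sketch of how a scan over this layout behaves for different max instant times. It is a simplified model, not Hudi's actual reader: the class LogScanSketch, the Block type, the scan method and the alwaysApplyRollbacks flag are all hypothetical, and instant times are plain integers (t2 becomes 2, and so on).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LogScanSketch {
    public enum BlockType { DATA, ROLLBACK }

    // Hypothetical stand-in for a log block; this is not Hudi's HoodieLogBlock.
    public static final class Block {
        final String logFile;
        final BlockType type;
        final int instantTime;   // instant that wrote the block (t2 -> 2)
        final int targetInstant; // ROLLBACK only: instant being rolled back, else -1

        Block(String logFile, BlockType type, int instantTime, int targetInstant) {
            this.logFile = logFile;
            this.type = type;
            this.instantTime = instantTime;
            this.targetInstant = targetInstant;
        }
    }

    // Layout from the description: lf1 (t2, partial), lf3 (rollback at t6), lf4 (t5).
    public static List<Block> scenario() {
        return Arrays.asList(
            new Block("lf1", BlockType.DATA, 2, -1),
            new Block("lf3", BlockType.ROLLBACK, 6, 2),
            new Block("lf4", BlockType.DATA, 5, -1));
    }

    // Returns the log files whose data blocks remain visible after the scan.
    public static List<String> scan(List<Block> blocks, int maxInstantTime,
                                    boolean alwaysApplyRollbacks) {
        List<Block> visible = new ArrayList<>();
        for (Block b : blocks) {
            boolean newerThanMax = b.instantTime > maxInstantTime;
            if (b.type == BlockType.ROLLBACK && (alwaysApplyRollbacks || !newerThanMax)) {
                // Drop every data block written by the rolled-back instant.
                visible.removeIf(v -> v.instantTime == b.targetInstant);
                continue;
            }
            if (newerThanMax) {
                break; // assumed strict behavior: stop at the first block past the max
            }
            visible.add(b);
        }
        List<String> files = new ArrayList<>();
        for (Block b : visible) {
            files.add(b.logFile);
        }
        return files;
    }

    public static void main(String[] args) {
        System.out.println(scan(scenario(), 5, false)); // max = t5, rollbacks gated by max
        System.out.println(scan(scenario(), 5, true));  // max = t5, rollbacks always applied
        System.out.println(scan(scenario(), 6, false)); // max = t6, set by the caller
    }
}
```

In this model, a max instant time of t5 with rollbacks gated by the max stops at lf3, so lf1 stays visible and lf4 is never reached. Either always applying rollback blocks or having the caller pass t6 leaves only lf4 visible.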
 

Callers involved: 
 * Global indexes (simple, bloom) are affected because deletes are not applied. 
For non-global indexes (bloom/simple) we read base files, and with only updates 
in the log, tagging is not affected.
 * Snapshot queries: once there is a new commit, snapshot queries will start 
returning lf4 (almost eventually consistent behavior).
 ** Spark does not factor rollbacks (RBs) into latestInstantTime.
 ** Hive/Trino/Presto: if they all use the inputFormat, 
{{BaseHoodieFileIndex#getLatestCompletedInstant}} handles this.
 ** Flink: FormatUtils does not handle this.
 * CDC: also has issues, irrespective of whether the end instant time is set by 
the user.
 * Incremental queries: fixing lastInstantTime alone may not suffice, since the 
instant time might be set by the user. So we might have to remove the 
"break" from within logRecordReader.
 * Indexing: what about all the new indexes added in 1.x? 
 * If clustering is scheduled right after this, or executed inline right after 
this ➝ this is not an issue, since clustering passes in its own instant time as 
latestInstantTime, which passes the check and exposes lf4.
 * If compaction is scheduled right after this, or executed inline right after 
this ➝ this is already handled, since the lastInstantTime that compaction 
passes in includes the rollback timestamp.
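
The difference between callers that do and do not factor rollbacks into the max instant time can be sketched as follows. This is only an illustration under stated assumptions: TimelineSketch, TimelineEntry and latestCompleted are hypothetical names, not Hudi APIs, and instant times are again plain integers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;

public class TimelineSketch {
    public enum Action { DELTA_COMMIT, ROLLBACK }

    // Hypothetical stand-in for a completed instant on the timeline.
    public static final class TimelineEntry {
        final int time;
        final Action action;

        TimelineEntry(int time, Action action) {
            this.time = time;
            this.action = action;
        }
    }

    // Completed instants after the scenario above: t1.dc, t5.dc and t6.rb.
    public static List<TimelineEntry> timeline() {
        return Arrays.asList(
            new TimelineEntry(1, Action.DELTA_COMMIT),
            new TimelineEntry(5, Action.DELTA_COMMIT),
            new TimelineEntry(6, Action.ROLLBACK));
    }

    // Picks the max instant time a caller would hand to the log record reader.
    public static OptionalInt latestCompleted(List<TimelineEntry> completed,
                                              boolean includeRollbacks) {
        return completed.stream()
            .filter(e -> includeRollbacks || e.action != Action.ROLLBACK)
            .mapToInt(e -> e.time)
            .max();
    }

    public static void main(String[] args) {
        // A caller that ignores rollbacks passes t5 as the max instant time.
        System.out.println(latestCompleted(timeline(), false).getAsInt());
        // A caller that includes rollbacks passes t6.
        System.out.println(latestCompleted(timeline(), true).getAsInt());
    }
}
```

A caller that ignores rollbacks ends up with t5 here, which is below the rollback block's instant time of t6. One that includes them ends up with t6, matching what the compaction path is described as doing.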



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
