satishkotha opened a new pull request #2388:
URL: https://github.com/apache/hudi/pull/2388
## What is the purpose of the pull request
* Add incremental timeline support for updating pending clustering operations
* Fix the timeline to include information about inflight clustering operations
## Brief change log
* Change timeline in filesystem views to include pending replacecommits
(Previously it only included completed commits and pending compaction instants).
* Because the filesystem view now includes pending clustering operations, change
HoodieFileGroup#lastInstant to track only completed instants. Note that this
required changing some assumptions in the TestUpgradeDowngrade tests; please take a
close look.
* Add incremental timeline support to refresh view based on pending
clustering operations
* Change the replacecommit.inflight file to also include the clustering plan
(previously only the requested file had the clustering plan). This is needed to
correctly block updates on file groups in pending clustering. One disadvantage is
that replacecommit.inflight is sometimes Avro and sometimes JSON (the
WorkloadProfile used by insert_overwrite), so a hack is needed to figure out
whether an inflight file was created by insert_overwrite or clustering.
Let me know if you have any suggestions.
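
One possible way to tell the two payloads apart, offered only as a sketch and not as what this PR implements: peek at the first bytes of the inflight file. It assumes the clustering plan is written as an Avro object container file (which begins with the magic bytes `O`, `b`, `j`, `0x01`) and that the insert_overwrite payload is a JSON object starting with `{`. The class `InflightFormatSniffer` and its `Format` enum are hypothetical names and are not part of Hudi.

```java
// Hypothetical helper, not part of Hudi: guesses the payload format of a
// replacecommit.inflight file from its leading bytes.
final class InflightFormatSniffer {

    enum Format { AVRO, JSON, EMPTY, UNKNOWN }

    // Avro object container files start with 'O', 'b', 'j', 0x01.
    private static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};

    static Format detect(byte[] content) {
        if (content == null || content.length == 0) {
            return Format.EMPTY;
        }
        if (startsWithAvroMagic(content)) {
            return Format.AVRO;
        }
        // Assumption: the JSON payload is an object, so its first
        // non-whitespace byte is '{'.
        for (byte b : content) {
            if (b == ' ' || b == '\t' || b == '\n' || b == '\r') {
                continue;
            }
            return b == '{' ? Format.JSON : Format.UNKNOWN;
        }
        // Only whitespace was found.
        return Format.UNKNOWN;
    }

    private static boolean startsWithAvroMagic(byte[] content) {
        if (content.length < AVRO_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < AVRO_MAGIC.length; i++) {
            if (content[i] != AVRO_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would pass in the raw bytes read from the inflight instant and branch on the result, treating `AVRO` as clustering and `JSON` as insert_overwrite. An explicit marker (for example, a distinct file extension or an operation-type field) would be more robust than sniffing, if changing the on-disk layout is acceptable.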
## Verify this pull request
This change added tests. See TestIncrementalFSViewSync.
## Committer checklist
- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under
an umbrella JIRA.