voonhous commented on PR #7669:
URL: https://github.com/apache/hudi/pull/7669#issuecomment-1408352092
Had a chat with @SteNicholas, and he brought up a very good point.
If a DropPartition DDL is issued while a Cleaning operation is still
executing, the timeline can end up looking something like this:
```txt
Clean0.requested -> Clean{P0->FG1}
DropPartition0.requested -> Drop{P0->FG1}
DropPartition0.inflight -> Drop{P0->FG1}
DropPartition0.completed -> Drop{P0->FG1}
Clean0.inflight -> Clean{P0->FG1}
Clean0.completed -> Clean{P0->FG1}
```
The next time the CleanPlanner is invoked, its plan may include a FileGroup
that has already been cleaned:
```txt
Clean1.requested -> Clean{P0->FG1}
...
```
As such, when creating `Clean1`, we should exclude files that may have
already been cleaned in the previous clean, `Clean(T-1)` (here `Clean0`).
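
To make the idea concrete, here is a rough sketch of the filtering step. It assumes we can get both the new plan's candidates and the previous clean's deleted files as plain partition-to-FileGroup maps. `CleanPlanFilterSketch` and `excludeAlreadyCleaned` are hypothetical names for illustration, not existing Hudi classes or methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: none of these names are Hudi APIs.
final class CleanPlanFilterSketch {

    /**
     * Returns the clean candidates per partition, minus anything the
     * previous clean already removed. Partitions left with no candidates
     * are dropped from the result.
     */
    static Map<String, List<String>> excludeAlreadyCleaned(
            Map<String, List<String>> candidates,
            Map<String, Set<String>> previouslyCleaned) {
        // TreeMap keeps partitions in a stable, sorted order.
        Map<String, List<String>> filtered = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : candidates.entrySet()) {
            Set<String> cleaned = previouslyCleaned.getOrDefault(
                    entry.getKey(), Collections.<String>emptySet());
            List<String> remaining = new ArrayList<>();
            for (String fileGroup : entry.getValue()) {
                if (!cleaned.contains(fileGroup)) {
                    remaining.add(fileGroup);
                }
            }
            if (!remaining.isEmpty()) {
                filtered.put(entry.getKey(), remaining);
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        // Clean1 candidates: P0 -> FG1, FG2 (FG2 is made up for the example).
        Map<String, List<String>> candidates = new HashMap<>();
        candidates.put("P0", Arrays.asList("FG1", "FG2"));

        // Clean0 already removed P0 -> FG1.
        Map<String, Set<String>> previouslyCleaned = new HashMap<>();
        previouslyCleaned.put("P0", new HashSet<>(Arrays.asList("FG1")));

        // Prints {P0=[FG2]}
        System.out.println(excludeAlreadyCleaned(candidates, previouslyCleaned));
    }
}
```

In the scenario above, `Clean1` would only contain `P0->FG1` as a candidate, so the filtered plan would come back empty and there would be nothing left to clean.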