[
https://issues.apache.org/jira/browse/HUDI-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HUDI-1800:
---------------------------------
Labels: pull-request-available sev:critical (was: sev:critical)
> Incorrect HoodieTableFileSystem API usage for pending slices causing issues
> ---------------------------------------------------------------------------
>
> Key: HUDI-1800
> URL: https://issues.apache.org/jira/browse/HUDI-1800
> Project: Apache Hudi
> Issue Type: Bug
> Components: Writer Core
> Reporter: Nishith Agarwal
> Assignee: Ryan Pifer
> Priority: Major
> Labels: pull-request-available, sev:critical
>
> From [~vbalaji]
>
> We are using the wrong FileSystemView API here:
> [https://github.com/apache/hudi/blob/release-0.6.0/hudi-client/src/main/java/org/apache/hudi/table/action/deltacommit/UpsertDeltaCommitPartitioner.java#L85]
> We don't include file groups that are in pending compaction, but with the
> HBase index we are including them. With the current state of the code,
> including files in pending compaction is an issue.
> The API "getLatestFileSlicesBeforeOrOn" was originally intended to be used by
> CompactionAdminClient to figure out log files that were added after pending
> compaction and rename them such that we can undo the effects of compaction
> scheduling. There is a different API "getLatestMergedFileSlicesBeforeOrOn"
> which gives a consolidated view of the latest file slice and includes all
> data both before and after compaction. This is what should be used in
> [https://github.com/apache/hudi/blob/release-0.6.0/hudi-client/src/main/java/org/apache/hudi/table/action/deltacommit/UpsertDeltaCommitPartitioner.java#L85]
> The other workaround would be to exclude file slices in pending compaction
> when we select small files, which avoids the interaction between the
> compactor and ingestion in this case. However, I think we can go with the
> first option.
>
> More details can be found here -> https://github.com/apache/hudi/issues/2633
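
To make the difference between the two views described above concrete, here is a minimal toy sketch. It does not use the Hudi classes: Slice, exampleFileGroup, latestUnmerged and latestMerged are hypothetical names invented for illustration, and the merge rule shown (fold the previous slice's base file and log files into the slice created by a pending compaction) is an assumption about what a "consolidated view" means, based only on the description in this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: none of these names are Hudi classes or methods.
class PendingCompactionSliceSketch {

    // Stand-in for a file slice: a base instant, an optional base file, and log files.
    static final class Slice {
        final String baseInstant;
        final String baseFile;       // null while the compaction that creates it is pending
        final List<String> logFiles;

        Slice(String baseInstant, String baseFile, List<String> logFiles) {
            this.baseInstant = baseInstant;
            this.baseFile = baseFile;
            this.logFiles = Collections.unmodifiableList(new ArrayList<>(logFiles));
        }

        @Override
        public String toString() {
            return "Slice{baseInstant=" + baseInstant + ", baseFile=" + baseFile
                + ", logFiles=" + logFiles + "}";
        }
    }

    // One file group, oldest slice first. Compaction was scheduled at instant "003",
    // so the newest slice has no base file yet and only holds logs written afterwards.
    static List<Slice> exampleFileGroup() {
        return Arrays.asList(
            new Slice("001", "base_001", Arrays.asList("log_001_v1", "log_001_v2")),
            new Slice("003", null, Arrays.asList("log_003_v1")));
    }

    // Unmerged view: return the newest slice exactly as it is stored.
    static Slice latestUnmerged(List<Slice> slicesOldestFirst) {
        return slicesOldestFirst.get(slicesOldestFirst.size() - 1);
    }

    // Merged view (assumed rule): if the newest slice has no base file, combine it
    // with the previous slice so data from before and after scheduling is visible.
    static Slice latestMerged(List<Slice> slicesOldestFirst) {
        Slice latest = latestUnmerged(slicesOldestFirst);
        if (latest.baseFile != null || slicesOldestFirst.size() < 2) {
            return latest;
        }
        Slice previous = slicesOldestFirst.get(slicesOldestFirst.size() - 2);
        List<String> allLogs = new ArrayList<>(previous.logFiles);
        allLogs.addAll(latest.logFiles);
        return new Slice(previous.baseInstant, previous.baseFile, allLogs);
    }

    public static void main(String[] args) {
        List<Slice> slices = exampleFileGroup();
        System.out.println("unmerged: " + latestUnmerged(slices));
        System.out.println("merged:   " + latestMerged(slices));
    }
}
```

In this toy model the unmerged view of the file group exposes only the log file written after compaction was scheduled, while the merged view also carries the earlier base file and log files. That matches the reasoning above for why a caller that needs the full data of a file group would want the merged view.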