yihua opened a new pull request, #8001:
URL: https://github.com/apache/hudi/pull/8001

   ### Change Logs
   
   Even though the metadata table writer used by the async indexer is 
configured with the `LAZY` failed-writes cleaning policy, 
`SparkHoodieBackedTableMetadataWriter` is hard-coded to roll back failed writes 
regardless of that configuration.  The async indexer should not trigger this 
rollback.  With the current logic, the async indexer can roll back an inflight 
delta commit that belongs to another regular writer in the metadata table, 
which can cause commits to the metadata table to fail.  This also makes 
`testHoodieIndexer` flaky, as the failure below shows.
   
   This PR fixes `SparkHoodieBackedTableMetadataWriter` so that the rollback of 
failed writes is not triggered by the async indexer.
   
   ```
   2023-02-16T13:46:06.1573775Z [ERROR] Tests run: 113, Failures: 0, Errors: 1, Skipped: 2, Time elapsed: 3,518.191 s <<< FAILURE! - in org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer
   2023-02-16T13:46:06.1576031Z [ERROR] testHoodieIndexer{HoodieRecordType}[2]  Time elapsed: 79.838 s  <<< ERROR!
   ...
   2023-02-16T13:46:06.1705711Z Caused by: java.lang.IllegalArgumentException
   2023-02-16T13:46:06.1706251Z         at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:31)
   2023-02-16T13:46:06.1706995Z         at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:633)
   2023-02-16T13:46:06.1707847Z         at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionRequestedToInflight(HoodieActiveTimeline.java:698)
   2023-02-16T13:46:06.1708751Z         at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.saveWorkloadProfileMetadataToInflight(BaseCommitActionExecutor.java:147)
   2023-02-16T13:46:06.1709792Z         at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.execute(BaseSparkCommitActionExecutor.java:172)
   2023-02-16T13:46:06.1710733Z         at org.apache.hudi.table.action.deltacommit.SparkUpsertPreppedDeltaCommitActionExecutor.execute(SparkUpsertPreppedDeltaCommitActionExecutor.java:44)
   2023-02-16T13:46:06.1712815Z         at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:111)
   2023-02-16T13:46:06.1713593Z         at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:80)
   2023-02-16T13:46:06.1714353Z         at org.apache.hudi.client.SparkRDDWriteClient.upsertPreppedRecords(SparkRDDWriteClient.java:154)
   2023-02-16T13:46:06.1715155Z         at org.apache.hudi.metadata.SparkHoodieBackedTableMetadataWriter.commit(SparkHoodieBackedTableMetadataWriter.java:186)
   ...
   ```
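
The intent of the fix described in the Change Logs can be illustrated with a small, self-contained sketch. This is not the code from this PR: `LazyRollbackSketch`, `CleaningPolicy`, `Timeline`, and `rollbackFailedWritesIfEager` are hypothetical stand-ins, and the real Hudi classes and method signatures differ. It only shows the idea that a writer configured as `LAZY` must leave inflight instants alone, because they may belong to another writer that is still running.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; these are NOT the actual Hudi classes.
final class LazyRollbackSketch {

    /** Stand-in for the failed-writes cleaning policy of a writer. */
    enum CleaningPolicy { EAGER, LAZY }

    /** Minimal stand-in for a timeline that tracks inflight instant times. */
    static final class Timeline {
        final List<String> inflight = new ArrayList<>();
        final List<String> rolledBack = new ArrayList<>();
    }

    /**
     * Called by a writer before it commits to the (stand-in) metadata table.
     * Only an EAGER writer treats every inflight instant as a failed write
     * and rolls it back. A LAZY writer, such as the async indexer in this
     * sketch, skips the rollback so it cannot remove another writer's
     * inflight delta commit.
     *
     * @return the number of instants rolled back
     */
    static int rollbackFailedWritesIfEager(CleaningPolicy policy, Timeline timeline) {
        if (policy != CleaningPolicy.EAGER) {
            return 0; // respect the configured policy instead of hard-coding a rollback
        }
        int count = timeline.inflight.size();
        timeline.rolledBack.addAll(timeline.inflight);
        timeline.inflight.clear();
        return count;
    }
}
```

With this guard, a call made on behalf of the async indexer (`CleaningPolicy.LAZY`) returns `0` and leaves the inflight list untouched, while a call with `CleaningPolicy.EAGER` moves every inflight instant to the rolled-back list.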
   
   ### Impact
   
   Fixes the rollback behavior of the async indexer, which also fixes the 
flaky test.  Adds a new test to guard the behavior (the test fails without this 
PR).
   
   ### Risk level
   
   low
   
   ### Documentation Update
   
   N/A
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   

