nbalajee commented on code in PR #9035:
URL: https://github.com/apache/hudi/pull/9035#discussion_r1258658841
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java:
##########
@@ -138,9 +139,35 @@ protected Path makeNewFilePath(String partitionPath, String fileName) {
*
* @param partitionPath Partition path
*/
- protected void createMarkerFile(String partitionPath, String dataFileName) {
- WriteMarkersFactory.get(config.getMarkersType(), hoodieTable, instantTime)
- .create(partitionPath, dataFileName, getIOType(), config, fileId, hoodieTable.getMetaClient().getActiveTimeline());
+ protected void createInProgressMarkerFile(String partitionPath, String dataFileName, String markerInstantTime) {
+ WriteMarkers writeMarkers = WriteMarkersFactory.get(config.getMarkersType(), hoodieTable, instantTime);
+ if (!writeMarkers.doesMarkerDirExist()) {
+ throw new HoodieIOException(String.format("Marker root directory absent : %s/%s (%s)",
+ partitionPath, dataFileName, markerInstantTime));
+ }
+ if (config.enforceFinalizeWriteCheck()
+ && writeMarkers.markerExists(writeMarkers.getCompletionMarkerPath("", "FINALIZE_WRITE", markerInstantTime, IOType.CREATE))) {
Review Comment:
Consider a job that has:
(a) completed the write stage and created the data files,
(b) started a commit and finalized the writes (keeping the files that are
part of the write statuses and removing the duplicate files), and
(c) while updating the metadata table (MDT) for the record-level index
(RLI), or just before updating the MDT, found that write status information
(RDD blocks persisted in the containers' local storage) was lost because
containers were lost or failed.
With this flag turned on, the job would be forced to fail instead of
retrying the tasks/stages to recreate the data files associated with the
missing write statuses.
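To make the concern concrete, here is a minimal, self-contained sketch of the scenario. It does not use Hudi's classes: `MarkerDirSketch`, its in-memory marker set, and the `enforceFinalizeWriteCheck` parameter are hypothetical stand-ins for the marker directory and the config flag in this diff, and the real marker paths and exception types differ.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory stand-in for the marker directory of one instant.
class MarkerDirSketch {
    static final String FINALIZE_MARKER = "FINALIZE_WRITE";
    private final Set<String> markers = new HashSet<>();

    // Step (b): the driver finalizes writes and records a completion marker.
    void finalizeWrite() {
        markers.add(FINALIZE_MARKER);
    }

    // Called by a task before it (re)creates a data file.
    // Returns true if a new in-progress marker was recorded.
    boolean createInProgressMarker(String dataFileName, boolean enforceFinalizeWriteCheck) {
        if (enforceFinalizeWriteCheck && markers.contains(FINALIZE_MARKER)) {
            // With the check enforced, a retried task cannot recreate the file.
            throw new IllegalStateException(
                "Writes already finalized; refusing to create " + dataFileName);
        }
        return markers.add(dataFileName + ".marker.CREATE");
    }

    public static void main(String[] args) {
        MarkerDirSketch dir = new MarkerDirSketch();
        dir.createInProgressMarker("file-1.parquet", true); // step (a): write stage
        dir.finalizeWrite();                                 // step (b): finalize

        // Step (c): write statuses were lost, so the task is retried.
        try {
            dir.createInProgressMarker("file-1-retry.parquet", true);
        } catch (IllegalStateException e) {
            System.out.println("flag on: retry fails: " + e.getMessage());
        }
        boolean created = dir.createInProgressMarker("file-1-retry.parquet", false);
        System.out.println("flag off: retry recreated file: " + created);
    }
}
```

In this sketch the retry after finalization only succeeds when the check is disabled, which is the trade-off raised above: the flag guards against late writers but also blocks legitimate stage retries.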