eric9204 commented on issue #6308: URL: https://github.com/apache/hudi/issues/6308#issuecomment-1232381134
@nsivabalan I know what you mean, but there is another problem. If one of the writers fails, the Hudi commit transaction fails on the Hudi side, yet the micro batch is still marked successful on the Spark side. The next micro batch will therefore consume records starting from the offset of the last "successful" commit, so the data of that micro batch, which actually failed on the Hudi side, may be lost:

```
22/08/05 15:01:18 INFO HoodieStreamingSink: Ignore the exception and move on streaming as per hoodie.datasource.write.streaming.ignore.failed.batch configuration
22/08/05 15:01:18 INFO HoodieStreamingSink: Micro batch id=32 succeeded
22/08/05 15:01:18 INFO BlockManager: Removing RDD 159
22/08/05 15:01:18 INFO CheckpointFileManager: Writing atomically to hdfs://host-10-4-6-18:8020/tmp/hudi/ckp1/commits/32 using temp file hdfs://host-10-4-6-18:8020/tmp/hudi/ckp1/commits/.32.a829b364-c1b9-4b4b-8fae-f7d866ed76e5.tmp
22/08/05 15:01:18 INFO CheckpointFileManager: Renamed temp file hdfs://host-10-4-6-18:8020/tmp/hudi/ckp1/commits/.32.a829b364-c1b9-4b4b-8fae-f7d866ed76e5.tmp to hdfs://host-10-4-6-18:8020/tmp/hudi/ckp1/commits/32
```

Could the community add a retry strategy to make the batch succeed instead of just discarding it?
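As a workaround until then, one way to avoid silent data loss is to stop swallowing failed batches, so the stream fails loudly and the checkpoint does not advance past a commit that never landed on the Hudi side. A minimal sketch of the write options, assuming the `streaming.retry.*` keys are available in your Hudi version (the `ignore.failed.batch` key is the one shown in the log above; the retry keys and values here are assumptions, not confirmed for issue #6308):

```python
# Hypothetical option map for a Spark Structured Streaming write into Hudi.
# Setting ignore.failed.batch to false makes a failed Hudi commit fail the
# Spark micro batch too, so the checkpoint is not advanced past lost data.
# The retry.count / retry.interval.ms keys are assumed to exist in the
# target Hudi version; verify against your version's configuration docs.
hudi_stream_opts = {
    "hoodie.datasource.write.streaming.ignore.failed.batch": "false",  # fail fast, do not drop the batch
    "hoodie.datasource.write.streaming.retry.count": "3",              # assumed: retry the commit 3 times
    "hoodie.datasource.write.streaming.retry.interval.ms": "2000",     # assumed: wait 2s between retries
}

# Usage sketch (df, table name, and paths are placeholders):
#   (df.writeStream
#      .format("hudi")
#      .options(**hudi_stream_opts)
#      .option("checkpointLocation", "/tmp/hudi/ckp1")
#      .start("/tmp/hudi/my_table"))
print(hudi_stream_opts["hoodie.datasource.write.streaming.ignore.failed.batch"])
```

With `ignore.failed.batch` set to `false`, Spark restarts from the last checkpoint on failure and re-reads the batch whose Hudi commit did not complete, instead of moving on as in the log above.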
