umehrot2 commented on issue #1764:
URL: https://github.com/apache/hudi/issues/1764#issuecomment-650478776


   Actually this is not just a problem with `Throttling`. AWS S3 can throw 
intermittent `Throttling` as well as `Internal Errors`, either of which can 
potentially succeed on retry.
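
   As a rough illustration of retrying such transient failures, here is a 
minimal sketch of retry with exponential backoff and jitter. It is not Hudi 
or AWS SDK code: `StorageError`, `call_with_retries`, and the set of codes 
treated as retryable are hypothetical names chosen for this example.

```python
import random
import time

# Assumption: these codes are the ones treated as transient for this sketch.
# The comment above only names throttling and internal errors.
RETRYABLE_CODES = {"Throttling", "SlowDown", "InternalError"}


class StorageError(Exception):
    """Hypothetical stand-in for an error raised by an S3 client."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def call_with_retries(operation, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying transient errors with backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except StorageError as err:
            # Give up on non-transient errors or once attempts are exhausted.
            if err.code not in RETRYABLE_CODES or attempt == max_attempts:
                raise
            # Double the delay each attempt and add random jitter so that
            # many tasks do not retry in lockstep.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))
```

   The `sleep` parameter is injectable only so the helper can be exercised 
without real delays. Errors outside the retryable set are re-raised 
immediately.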
   
   I wish we were able to use 
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-s3-optimized-committer.html
 which solved a lot of the S3-related commit problems. The EMR file system will 
not commit the file until the Spark task commit succeeds, essentially making 
the file commit atomic. Unfortunately, Hudi does not rely on Spark's commit 
mechanism, so it cannot leverage this.
   
   Yes, I think waiting for 1 attempt and, if it does not appear, just 
skipping it (**and not failing the job**) should work better as an interim 
solution given the current design of using marker files. @bvaradar also has 
some thoughts on improving this in the long run.
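
   The interim behaviour described above could look roughly like the sketch 
below. It assumes that "it" is a file expected to become visible on S3, and 
that the caller wants to act on that file (here, delete it). The names 
`delete_if_visible` and `clean_up`, and the `exists` / `delete` callables, 
are hypothetical and are not Hudi APIs.

```python
import time


def delete_if_visible(path, exists, delete, wait_seconds=1.0, sleep=time.sleep):
    """Delete path if it is visible; wait once if it is not, then skip.

    Returns True if the path was deleted and False if it was skipped.
    """
    if not exists(path):
        # A single wait-and-recheck instead of failing the job.
        sleep(wait_seconds)
        if not exists(path):
            return False
    delete(path)
    return True


def clean_up(paths, exists, delete, **kwargs):
    """Process every path and return the ones that were skipped."""
    skipped = []
    for path in paths:
        if not delete_if_visible(path, exists, delete, **kwargs):
            skipped.append(path)
    return skipped
```

   Returning the skipped paths, rather than raising, lets the caller log 
them and continue, which matches the "not failing the job" suggestion.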



