Github user steveloughran commented on the issue:
    @megaserg : if you are writing to GCS or Azure, algorithm 2 is fine. If S3 is 
the target, then it's only safe to use with a consistent store (Hadoop 3.0 
+ S3Guard, Amazon Consistent EMR), and you still take a major perf hit from the 
copy that S3 performs in place of a rename. The S3A committers in Hadoop 3.1 
deliver high-performance commit semantics, and the Netflix committers don't 
(directly) need a consistent store, though you will need one to chain work 
together.
    BTW, to verify that the v2 algorithm is actually being selected: set 
the version to 3 and expect a stack trace from the version switch code. It's 
what I do to make sure that the FileOutputCommitter isn't actually being picked 
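
    A minimal sketch of that check, assuming a Spark job configured through spark-defaults.conf (the comment itself doesn't say how the option is set). The Hadoop property is mapreduce.fileoutputcommitter.algorithm.version, and Spark forwards anything prefixed with spark.hadoop. into the Hadoop configuration:

```properties
# Deliberately invalid value: FileOutputCommitter only accepts 1 or 2,
# so a job that really instantiates it should fail with a stack trace.
spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version  3
```

If the job fails with an exception about the unsupported algorithm version, the FileOutputCommitter is in use and is reading the setting. If the job runs normally, some other committer was probably picked, or the option never reached the Hadoop configuration.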

