Hello! I noticed that in the documentation starting with 2.2.0 it states that the parameter spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version is 1 by default: https://issues.apache.org/jira/browse/SPARK-20107
I don't actually see this being set anywhere explicitly in the Spark code and so the documentation isn't entirely accurate in case you run on an environment that has MAPREDUCE-6406 implemented (starting with Hadoop 3.0). The default version was explicitly set to 2 in the FileOutputCommitter class, so any output committer that inherits from this class (ParquetOutputCommitter for example) would use v2 in a Hadoop 3.0 environment and v1 in the older Hadoop environments. Would it make sense for us to consider setting v1 as the default in code in case the configuration was not set by a user? Regards, Waleed