[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29895: [SPARK-33019][CORE] Use spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=1 by default

GitBox Tue, 29 Sep 2020 09:18:26 -0700


dongjoon-hyun commented on a change in pull request #29895:
URL: https://github.com/apache/spark/pull/29895#discussion_r496866857




##########
File path: docs/configuration.md
##########
@@ -1761,16 +1761,10 @@ Apart from these, the following properties are also available, and may be useful
 </tr>
 <tr>
   <td><code>spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version</code></td>
-  <td>Dependent on environment</td>
+  <td>1</td>
   <td>
     The file output committer algorithm version, valid algorithm version number: 1 or 2.
-    Version 2 may have better performance, but version 1 may handle failures better in certain situations,
-    as per <a href="https://issues.apache.org/jira/browse/MAPREDUCE-4815">MAPREDUCE-4815</a>.
-    The default value depends on the Hadoop version used in an environment:
-    1 for Hadoop versions lower than 3.0
-    2 for Hadoop versions 3.0 and higher
-    It's important to note that this can change back to 1 again in the future once <a href="https://issues.apache.org/jira/browse/MAPREDUCE-7282">MAPREDUCE-7282</a>
-    is fixed and merged.

Review comment:
This PR aims to provide a consistent view for Apache Spark users. For example, the statement `The default value depends on the Hadoop version used in an environment` is no longer valid. After this PR, Apache Spark users will consistently use `v1` by default.




