attilapiros commented on code in PR #37468:
URL: https://github.com/apache/spark/pull/37468#discussion_r964180597
##########
hadoop-cloud/src/hadoop-3/main/scala/org/apache/spark/internal/io/cloud/PathOutputCommitProtocol.scala:
##########
@@ -161,7 +204,16 @@ object PathOutputCommitProtocol {
val REJECT_FILE_OUTPUT_DEFVAL = false
/** Error string for tests. */
-  private[cloud] val UNSUPPORTED: String = "PathOutputCommitProtocol does not support" +
+  private[cloud] val UNSUPPORTED: String = "PathOutputCommitter does not support" +
     " dynamicPartitionOverwrite"
+ /**
+ * Stream Capabilities probe for spark dynamic partitioning compatibility.
+ */
+  private[cloud] val CAPABILITY_DYNAMIC_PARTITIONING =
+    "mapreduce.job.committer.dynamic.partitioning"
+
+ /**
+ * Scheme prefix for per-filesystem scheme committers.
+ */
+  private[cloud] val OUTPUTCOMMITTER_FACTORY_SCHEME =
+    "mapreduce.outputcommitter.factory.scheme"
Review Comment:
Our documentation doesn't mention the
`mapreduce.outputcommitter.factory.scheme` config, and as far as I can see this
config will be more important than it was before (when it was simply set to a
single specific value in `SparkContext#fillMissingMagicCommitterConfsIfNeeded`).
What about extending `cloud-integration.md` with some info about its role? Or
just adding a link to Hadoop's documentation of this config?
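For illustration, a per-scheme binding would look something like this (the `s3a` factory class name comes from Hadoop's S3A committer documentation; this is just a sketch of how the scheme-suffixed key is used, not a snippet from this PR):

```properties
# Bind an OutputCommitterFactory to the "s3a" filesystem scheme.
# Key pattern: mapreduce.outputcommitter.factory.scheme.<scheme>
# (the spark.hadoop. prefix passes it through to the Hadoop Configuration)
spark.hadoop.mapreduce.outputcommitter.factory.scheme.s3a=org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory
```

Documenting an example like this in `cloud-integration.md` would make it clearer that the factory is resolved per destination filesystem scheme.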
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]