Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/18714#discussion_r159352867
--- Diff: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala ---
@@ -39,8 +39,19 @@ import org.apache.spark.mapred.SparkHadoopMapRedUtil
*
* @param jobId the job's or stage's id
* @param path the job's output path, or null if committer acts as a noop
+ * @param dynamicPartitionOverwrite If true, Spark will overwrite partition directories at runtime
+ *                                  dynamically, i.e., we first write files under a staging
+ *                                  directory with partition path, e.g.
+ *                                  /path/to/staging/a=1/b=1/xxx.parquet. When committing the job,
+ *                                  we first clean up the corresponding partition directories at
+ *                                  destination path, e.g. /path/to/destination/a=1/b=1, and move
+ *                                  files from staging directory to the corresponding partition
+ *                                  directories under destination path.
*/
-class HadoopMapReduceCommitProtocol(jobId: String, path: String)
+class HadoopMapReduceCommitProtocol(
+ jobId: String,
+ path: String,
+ dynamicPartitionOverwrite: Boolean = false)
--- End diff ---
Indents.
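
For illustration only, here is a minimal, hypothetical sketch of the commit-time move that the Scaladoc above describes, written against plain Hadoop FileSystem APIs. The object name, method signature, and the partitionPaths argument are invented for this example; this is not the code added by the PR, just the described behaviour in isolation.

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical sketch -- not Spark's actual commit protocol implementation.
object DynamicPartitionOverwriteSketch {

  /**
   * Moves staged partition directories into the destination, replacing any
   * existing data for those partitions (the "dynamic partition overwrite"
   * behaviour described in the Scaladoc above).
   *
   * @param stagingDir     e.g. /path/to/staging
   * @param destDir        e.g. /path/to/destination
   * @param partitionPaths relative partition paths written by the job, e.g. Seq("a=1/b=1")
   */
  def commitPartitions(
      stagingDir: Path,
      destDir: Path,
      partitionPaths: Seq[String],
      conf: Configuration): Unit = {
    val fs: FileSystem = stagingDir.getFileSystem(conf)
    partitionPaths.foreach { part =>
      val staged = new Path(stagingDir, part) // /path/to/staging/a=1/b=1
      val dest = new Path(destDir, part)      // /path/to/destination/a=1/b=1
      // First clean up the corresponding partition directory at the destination.
      if (fs.exists(dest)) {
        fs.delete(dest, true)
      }
      // Ensure the parent directory exists, then move the staged files into place.
      fs.mkdirs(dest.getParent)
      fs.rename(staged, dest)
    }
  }
}

If I read the PR correctly, users opt into this behaviour through the spark.sql.sources.partitionOverwriteMode=dynamic setting together with INSERT OVERWRITE; with the default (static) mode, the whole matching output path is overwritten rather than only the partitions the job actually wrote.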