Github user szhem commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19294#discussion_r140076614
  
    --- Diff: 
core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala ---
    @@ -568,6 +568,51 @@ class PairRDDFunctionsSuite extends SparkFunSuite with 
SharedSparkContext {
         assert(FakeWriterWithCallback.exception.getMessage contains "failed to 
write")
       }
     
    +  test("saveAsNewAPIHadoopDataset should use current working directory " +
    +    "for files to be committed to an absolute output location when empty 
output path specified") {
    +    val pairs = sc.parallelize(Array((new Integer(1), new Integer(2))), 1)
    +
    +    val job = NewJob.getInstance(new Configuration(sc.hadoopConfiguration))
    +    job.setOutputKeyClass(classOf[Integer])
    +    job.setOutputValueClass(classOf[Integer])
    +    job.setOutputFormatClass(classOf[NewFakeFormat])
    +    val jobConfiguration = job.getConfiguration
    +
    +    val fs = FileSystem.get(jobConfiguration)
    +    fs.setWorkingDirectory(new 
Path(getClass.getResource(".").toExternalForm))
    +    try {
    +      // just test that the job does not fail with
    +      // java.lang.IllegalArgumentException: Can not create a Path from a 
null string
    +      pairs.saveAsNewAPIHadoopDataset(jobConfiguration)
    +    } finally {
    +      // close to prevent filesystem caching across different tests
    +      fs.close()
    +    }
    +  }
    +
    +  test("saveAsHadoopDataset should use current working directory " +
    +    "for files to be committed to an absolute output location when empty 
output path specified") {
    +    val pairs = sc.parallelize(Array((new Integer(1), new Integer(2))), 1)
    +
    +    val conf = new JobConf()
    +    conf.setOutputKeyClass(classOf[Integer])
    +    conf.setOutputValueClass(classOf[Integer])
    +    conf.setOutputFormat(classOf[FakeOutputFormat])
    +    conf.setOutputCommitter(classOf[FakeOutputCommitter])
    +
    +    val fs = FileSystem.get(conf)
    +    fs.setWorkingDirectory(new 
Path(getClass.getResource(".").toExternalForm))
    +    try {
    +      FakeOutputCommitter.ran = false
    +      pairs.saveAsHadoopDataset(conf)
    +    } finally {
    +      // close to prevent filesystem caching across different tests
    +      fs.close()
    --- End diff --
    
     I've updated the PR so that it does not use the filesystem at all.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to