[GitHub] spark issue #19487: [SPARK-21549][CORE] Respect OutputFormats with no/invali...

steveloughran Fri, 13 Oct 2017 12:06:05 -0700

Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/19487
  
    The more I see of the committer internals, the less confident I am that I 
understand any of it.
    If your committer isn't writing anything out, it doesn't need a value for 
`mapred.output.dir` at all, does it? If it does use one, it will handle an 
invalid entry in setupJob/setupTask by throwing an exception there. So the goal 
of the code above the committer should be to make sure the committer itself 
gets to validate its inputs.
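    To make that division of responsibility concrete, here is a minimal 
sketch. It is not the Hadoop or Spark API: `SketchCommitter`, 
`NoOutputCommitter`, `PathCommitter` and `JobRunner` are hypothetical names, 
and a plain `Map` stands in for the job configuration. The only point is that 
the calling layer never inspects `mapred.output.dir`; each committer decides 
in `setupJob` whether it needs the value and rejects a bad one itself.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a committer; NOT Hadoop's OutputCommitter.
interface SketchCommitter {
    void setupJob(Map<String, String> conf);
}

// A committer that writes no files: it never reads mapred.output.dir,
// so a missing or invalid value is irrelevant to it.
class NoOutputCommitter implements SketchCommitter {
    @Override
    public void setupJob(Map<String, String> conf) {
        // Nothing to validate.
    }
}

// A committer that does write files: it validates the directory itself
// and fails in setupJob when the entry is missing or blank.
class PathCommitter implements SketchCommitter {
    static final String OUTPUT_DIR = "mapred.output.dir";
    private String outputDir;

    @Override
    public void setupJob(Map<String, String> conf) {
        String dir = conf.get(OUTPUT_DIR);
        if (dir == null || dir.trim().isEmpty()) {
            throw new IllegalArgumentException("Invalid " + OUTPUT_DIR + ": " + dir);
        }
        outputDir = dir;
    }

    String getOutputDir() {
        return outputDir;
    }
}

// The calling layer (assumed here to play Spark's role): it hands the
// configuration to the committer without checking mapred.output.dir first.
class JobRunner {
    static void run(SketchCommitter committer, Map<String, String> conf) {
        committer.setupJob(conf);
    }
}
```

    With an empty configuration, `JobRunner.run(new NoOutputCommitter(), conf)` 
completes, while the same call with a `PathCommitter` throws from `setupJob`, 
which is where the failure belongs.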
    
    Hadoop trunk adds a new 
[PathOutputCommitter](https://github.com/steveloughran/hadoop/blob/s3guard/HADOOP-13786-committer/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/PathOutputCommitter.java)
 class for committers. It pulls up the useful getters of `FileOutputCommitter`, 
so other committers can give callers such as Spark the information they need 
without those callers looking into properties like `mapred.output.dir`. Have a 
look at that class, and if there is something extra you want pulled up, let me 
know before Hadoop 3.0 ships and I'll see what I can do.



