GitHub user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/21257#discussion_r187953870
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
---
@@ -163,6 +169,12 @@ class HadoopMapReduceCommitProtocol(
}
override def commitJob(jobContext: JobContext, taskCommits: Seq[TaskCommitMessage]): Unit = {
+ // first delete the should delete special file
+ val committerFs = jobContext.getWorkingDirectory.getFileSystem(jobContext.getConfiguration)
--- End diff --
I'm not sure you can guarantee that the working dir is always on the
destination FS. At least with @rdblue's committers, task attempt working dirs
are on file://, and task commit (somehow) gets the data to the dest FS in a
form where job commit will make it visible.
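
A minimal sketch of the alternative, assuming the commit protocol already knows
its destination path: resolve the filesystem from that path rather than from
the job's working directory. `DestinationFs`, `resolve` and `outputPath` are
hypothetical names for illustration, not part of the PR.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object DestinationFs {
  // Hypothetical helper: `outputPath` stands in for whatever destination
  // path the commit protocol was constructed with (an assumption here).
  def resolve(outputPath: String, conf: Configuration): FileSystem = {
    // Path.getFileSystem picks the FileSystem from the path's own scheme
    // and authority, so it does not depend on where the committer happens
    // to keep its working directory.
    new Path(outputPath).getFileSystem(conf)
  }
}
```

In `commitJob` this would be called as
`DestinationFs.resolve(path, jobContext.getConfiguration)`, where `path` is
assumed to be the job's output location.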
---