Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/22112
> 2. ask the output committer to be able to overwrite a committed task.
> Note that, the output committer here is the FileCommitProtocol interface in
> Spark, not the hadoop output committer. We don't have to make all the hadoop
> output committers work.
I disagree with this. Spark works with any Hadoop output committer via the
RDD API. Spark writing to HBase is a perfect example of this: you can't do
file moves in HBase. PairRDDFunctions.saveAsHadoopDataset can be used with
HBase, and it goes through SparkHadoopWriter.write, which uses the
FileCommitProtocol in Spark. If that path assumes moves are possible for all
output committers, then in my opinion it's a bug.
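To make the HBase case concrete, here is a sketch of the saveAsHadoopDataset path described above, using HBase's old-mapred-API TableOutputFormat. This is illustrative only (it assumes a running HBase cluster and the spark-core plus hbase-mapreduce dependencies on the classpath); the table name "my_table" and the column family/qualifier "cf"/"col" are made up for the example:

```scala
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapred.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapred.JobConf
import org.apache.spark.SparkContext

def writeToHBase(sc: SparkContext): Unit = {
  val conf = new JobConf(sc.hadoopConfiguration)
  conf.setOutputFormat(classOf[TableOutputFormat])
  conf.set(TableOutputFormat.OUTPUT_TABLE, "my_table") // hypothetical table

  // Each record becomes an HBase Put. TableOutputFormat writes directly
  // to the region servers -- there is no temp directory and no file move
  // at commit time, which is the point being made above.
  val puts = sc.parallelize(Seq("a" -> "1", "b" -> "2")).map { case (k, v) =>
    val put = new Put(Bytes.toBytes(k))
    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(v))
    (new ImmutableBytesWritable, put)
  }
  puts.saveAsHadoopDataset(conf)
}
```

This call goes through SparkHadoopWriter.write and therefore through FileCommitProtocol, even though the underlying committer has nothing to move.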
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]