Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/19487
> If it does use it, it'll handle an invalid entry in setupJob/setupTask by throwing an exception there.

This should already happen today, and `hasValidPath` does not prevent it.
That is, if the committer is unable to handle the specified output directory, it can throw an exception in `committer.setupJob`, based on whatever is specified in the configuration passed in via the `TaskAttemptContext` (see the sketch below).
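To make that concrete, here is a minimal sketch of such a committer; the class name and the choice of `FileOutputCommitter` as a base are my own illustration, not code from this PR:

```scala
import org.apache.hadoop.fs.Path
import org.apache.hadoop.mapreduce.{JobContext, TaskAttemptContext}
import org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter

// Hypothetical committer that validates the configured output directory in
// setupJob and fails fast if it cannot handle it.
class StrictOutputCommitter(output: Path, context: TaskAttemptContext)
    extends FileOutputCommitter(output, context) {

  override def setupJob(jobContext: JobContext): Unit = {
    // Same key FileOutputFormat uses for its output directory.
    val dir = jobContext.getConfiguration
      .get("mapreduce.output.fileoutputformat.outputdir")
    if (dir == null || dir.isEmpty) {
      throw new IllegalArgumentException(
        s"Committer cannot handle output directory: '$dir'")
    }
    super.setupJob(jobContext)
  }
}
```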
Note that `hasValidPath` and `path` handle the explicit case of absolute-path-based committers, where `HadoopMapReduceCommitProtocol` moves the results to their final destination (and removes them in case of failure): see the use of the `taskCommits` argument in `commitJob`, sketched below.
`commitJob` does invoke `committer.commitJob`, so the committer-specific commit will still happen.
None of this is relevant for non-path-based committers.
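For the absolute-path case, the flow I am describing is roughly the following standalone sketch (an abbreviated paraphrase, not the exact `HadoopMapReduceCommitProtocol` source; the `staged -> final` map shape reported by each task is assumed):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Each task reports a map of (staged file -> final absolute destination).
// After the committer's own commit, the protocol moves the staged files to
// their final destinations and drops the staging directory.
def moveAbsolutePathFiles(
    conf: Configuration,
    absPathStagingDir: Path,
    taskCommits: Seq[Map[String, String]]): Unit = {
  val fs = absPathStagingDir.getFileSystem(conf)
  val filesToMove = taskCommits.foldLeft(Map.empty[String, String])(_ ++ _)
  for ((staged, dest) <- filesToMove) {
    fs.rename(new Path(staged), new Path(dest)) // move to final destination
  }
  fs.delete(absPathStagingDir, true) // clean up staging on success
}
```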
What I would like clarification on is: what should be done when `path` is invalid?
My understanding was that this is up to the committer implementation to handle, since such a `path` could be a valid use case; and if it is invalid, the committer would throw an exception in `setupJob` or `commitJob`.
If this is an incorrect assumption, then I will change it back to explicitly supporting only `null` or `""` for `path`, instead of also accepting unparseable paths.
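For reference, the looser check I have in mind is essentially the sketch below; Hadoop's `new Path(...)` rejects `null`, `""`, and unparseable strings alike, so all three are left for the committer to accept or reject in `setupJob`/`commitJob`:

```scala
import scala.util.Try
import org.apache.hadoop.fs.Path

// Treat `path` as valid only if Hadoop can parse it; null, "" and
// unparseable strings all fail the Try and count as "no valid path".
def hasValidPath(path: String): Boolean = Try(new Path(path)).isSuccess
```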