GitHub user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/4155#issuecomment-72157326
Since we're nearing a working solution, I'd like to aim to include this
patch in 1.3. Because this patch involves major changes to the output commit
code, I'd like to propose that we "feature-flag" the output committer behind a
new configuration option. I think (but please correct me if I'm wrong) that we
can safely bypass the new driver coordination when speculation is disabled,
which should also alleviate some of the performance impact concerns that have
been raised here. When speculation is enabled, we should perform the new
checks by default, but we should have an emergency "escape hatch" option to
bypass them in case they cause problems or contain bugs. Therefore, I think
the new setting's default value could be set to `spark.speculation.enabled`'s
value. I'm open to naming suggestions for the new configuration option, but I
was thinking it could be something like
`spark.hadoop.outputCommitCoordination.enabled` (maybe that's too verbose).
It's a toss-up whether we'd want to include this in `configuration.md`, since
I can't imagine that users would want to disable it unless we found bugs in
the new code.
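
For concreteness, here's a minimal sketch of how that defaulting could look
in terms of `SparkConf`. The two setting names are the ones floated in this
comment, not finalized keys:

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()

// When speculation is disabled, the driver-side coordination can be bypassed
// entirely; when it is enabled, the new checks default to on but can still be
// explicitly disabled as an emergency escape hatch.
val speculationEnabled =
  conf.getBoolean("spark.speculation.enabled", defaultValue = false)
val outputCommitCoordinationEnabled = conf.getBoolean(
  "spark.hadoop.outputCommitCoordination.enabled",
  defaultValue = speculationEnabled)
```

The point of reading the speculation flag as the default is that users who
never enable speculation pay no coordination cost and need no new config.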