Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/3832#issuecomment-68466612
I'd be glad to add a test here, although it might be a little tricky
since the old behavior resulted in silent failures; I should still be able to
come up with one.
Regarding the streaming-specific
`spark.streaming.hadoop.validateOutputSpecs` setting, which of the following
behaviors is more intuitive?
1. Streaming jobs always respect the streaming version of the setting, and
non-streaming jobs respect the core version. If the streaming checks are
enabled but the core checks are disabled, we still perform output spec
validation for streaming.
2. The streaming setting is just a gate that controls whether the core
setting also applies to streaming jobs. If the streaming setting is true but
the core setting is false, the checks are not applied.
Of the two, I think that option 2 is the better backwards-compatibility
escape hatch / flag.
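For concreteness, here's a minimal sketch of how the two semantics differ. The helper names, the plain `Map`-based conf, and the core key name `spark.hadoop.validateOutputSpecs` are illustrative assumptions, not Spark's actual internals:

```scala
// Minimal sketch of the two proposed semantics; helper names and the
// Map-based conf are hypothetical, not Spark's real API.
object OutputSpecValidation {
  val CoreKey      = "spark.hadoop.validateOutputSpecs"           // assumed core key
  val StreamingKey = "spark.streaming.hadoop.validateOutputSpecs" // key from this PR

  private def flag(conf: Map[String, String], key: String): Boolean =
    conf.getOrElse(key, "true").toBoolean // both settings default to enabled

  // Option 1: streaming jobs consult only the streaming setting;
  // the core setting is ignored entirely for streaming output.
  def shouldValidateOption1(conf: Map[String, String], isStreaming: Boolean): Boolean =
    if (isStreaming) flag(conf, StreamingKey) else flag(conf, CoreKey)

  // Option 2: the streaming setting merely gates whether the core
  // setting applies to streaming jobs, so disabling the core setting
  // disables the checks everywhere.
  def shouldValidateOption2(conf: Map[String, String], isStreaming: Boolean): Boolean =
    flag(conf, CoreKey) && (!isStreaming || flag(conf, StreamingKey))
}
```

Under option 2, setting the core flag to false disables validation for both batch and streaming jobs, which is what makes it work as a single backwards-compatibility escape hatch.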