Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/10807#discussion_r50073547
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala ---
@@ -714,12 +714,20 @@ class StreamingContext private[streaming] (
// interrupted. See SPARK-12001 for more details. Because the
body of this case can be
// executed twice in the case of a partial stop, all methods
called here need to be
// idempotent.
- scheduler.stop(stopGracefully)
--- End diff --
I think one of the primary goal of this JIRA is to allow partial clean-up
and retry on `stop()` calls.
In this specific code path, it is already written in a way to allow for
retry by setting the state to `STOPPED` only almost at the end on [line
728](https://github.com/apache/spark/pull/10807/files#diff-8a7f0e3f26c15ba484e6312c3caf033dL728)
in the original code.
`tryLogNonFatalError` swallows and logs "non-fatal" exception, and with
that added, despite any non-critical error thrown it could reach the line
`state = STOPPED`. For instance, if `env.metricsSystem.removeSource()` throws
then it will continue on and setting `state` to `STOPPED`, at which point the
caller cannot get back to the same code to retry cleanup because of the `state`
match case
[above](https://github.com/apache/spark/pull/10807/files#diff-8a7f0e3f26c15ba484e6312c3caf033dR707).
Is that what we want?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]