GitHub user tdas opened a pull request:
https://github.com/apache/spark/pull/8781
[SPARK-10649][STREAMING] Prevent inheriting job group and irrelevant job
description in streaming jobs
The job group and job description are passed through thread-local
properties and are inherited by child threads. In Spark Streaming, the
streaming jobs inherit these properties from the thread that called
streamingContext.start(). This may not make sense.
1. Job group: This is mainly used for cancelling a group of jobs together.
It does not make sense to cancel streaming jobs like this, as the effect will
be unpredictable. It is also not a valid use case anyway; to cancel a
streaming context, call streamingContext.stop().
2. Job description: This is used to pass on readable text descriptions for
jobs to show up in the UI. The job description of the thread that calls
streamingContext.start() is not useful for the streaming jobs: it does not
make sense for all of them to share the same description, and that
description may or may not be related to streaming.
The solution in this PR is to explicitly set the job group and job description
in the thread that starts the streaming scheduler, so that all subsequent
child threads inherit relevant properties. Also, the starting is done in a new
child thread, so that setting the job group and description for streaming does
not change those properties in the thread that called streamingContext.start().
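The inheritance problem and the fix described above can be sketched in plain
Java, assuming the properties live in an InheritableThreadLocal that copies
the parent's values into each child thread. This is only an illustration, not
the code in this PR: the class LocalPropsSketch, the keys "job.group" and
"job.description", and the value strings are all hypothetical.

```java
import java.util.Properties;

// Hypothetical stand-in for per-thread job properties; not Spark's actual code.
class LocalPropsSketch {
    // Illustrative property keys, assumed for this sketch only.
    static final String GROUP = "job.group";
    static final String DESC = "job.description";

    // Each child thread receives a copy of its parent's properties when the
    // child Thread object is constructed.
    static final InheritableThreadLocal<Properties> PROPS =
        new InheritableThreadLocal<Properties>() {
            @Override protected Properties initialValue() {
                return new Properties();
            }
            @Override protected Properties childValue(Properties parent) {
                Properties copy = new Properties();
                copy.putAll(parent);
                return copy;
            }
        };

    // Start a thread and wait for it to finish.
    static void runAndJoin(Thread t) {
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    static String[] run() {
        final String[] seen = new String[4];

        // The calling thread has its own, unrelated job group.
        PROPS.get().setProperty(GROUP, "user-batch-group");

        // Problem: a thread spawned directly by the caller inherits that group.
        runAndJoin(new Thread(() -> seen[0] = PROPS.get().getProperty(GROUP)));

        // Fix: do the starting in a new child thread that overrides its own
        // copy of the properties before spawning any worker threads.
        runAndJoin(new Thread(() -> {
            PROPS.get().remove(GROUP);
            PROPS.get().setProperty(DESC, "streaming job");
            runAndJoin(new Thread(() -> {
                seen[1] = PROPS.get().getProperty(GROUP);
                seen[2] = PROPS.get().getProperty(DESC);
            }));
        }));

        // The caller's properties are unchanged by the starter thread.
        seen[3] = PROPS.get().getProperty(GROUP);
        return seen;
    }

    public static void main(String[] args) {
        for (String s : run()) {
            System.out.println(s);
        }
    }
}
```

In this sketch the directly spawned thread sees "user-batch-group", the worker
spawned from the starter thread sees no job group and the streaming
description, and the caller still sees "user-batch-group" afterwards.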
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tdas/spark SPARK-10649
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/8781.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #8781
----
commit 986cdd65b001460a6066f826b0169cfdc3eb0690
Author: Tathagata Das <[email protected]>
Date: 2015-09-08T19:59:09Z
Added information on backpressure
commit 2525bc58e791d05e139067b9ce4985301914911e
Author: Tathagata Das <[email protected]>
Date: 2015-09-16T20:51:14Z
Fixed job and job description for streaming jobs
----