Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/5124#issuecomment-85132588
It looks like this NPE bug has been around for a while, but it seems pretty
hard to hit (which is probably why it hasn't been reported before). I think
that we should be able to trigger / reproduce this by creating a new
SparkContext, ensuring that the thread-local properties are null, launching a
long-running job, then attempting to cancel all jobs in some non-existent job
group. Can we add a regression test for this? Shouldn't be too hard if my
hunch is right.
It looks like we don't directly expose the Properties object to users, so
if we wanted to we could go even further and convert all of the upstream
nullable `Properties` into `Options[Properties]` as well. If you look at the
call chain leading to this use of `properties`, it looks like it can only be
null if no local properties have ever been set the job submitting thread, its
parent thread, or any of its other ancestor threads. Therefore, maybe we can
just eliminate the whole null / Option stuff entirely by ensuring that the
thread-local has an `initialValue` instead of having it be `null` in some
circumstances and not others.
Therefore, here's my suggestion:
- Add a regression test and confirm that it reproduces the original bug.
- Override `SparkContext.initialValue` to return a new empty properties
object (since this is [how we lazily
initialize](https://github.com/hunglin/spark/blob/687434c9ab65601dde095d3cf6bb2f0de2ea90e1/core/src/main/scala/org/apache/spark/SparkContext.scala#L478)
the properties in the existing code. Update the other parts of SparkContext
that set this to account for this change.
- Add a few `assert(properties != null)` so that we catch errors up-front.
I'd add these checks at the entry points of the DAGScheduler, e.g. the
`private[spark]` `submitJob` methods that are called from SparkContext.
Your patch looks good overall, but if we can I think we should just fix the
underlying messiness.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]