Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/5124#issuecomment-85132588
  
    It looks like this NPE bug has been around for a while, but it seems pretty 
hard to hit (which is probably why it hasn't been reported before).  I think 
that we should be able to trigger / reproduce this by creating a new 
SparkContext, ensuring that the thread-local properties are null, launching a 
long-running job, then attempting to cancel all jobs in some non-existent job 
group.  Can we add a regression test for this?  Shouldn't be too hard if my 
hunch is right.
    
    It looks like we don't directly expose the Properties object to users, so 
if we wanted to we could go even further and convert all of the upstream 
nullable `Properties` into `Options[Properties]` as well.  If you look at the 
call chain leading to this use of `properties`, it looks like it can only be 
null if no local properties have ever been set the job submitting thread, its 
parent thread, or any of its other ancestor threads.  Therefore, maybe we can 
just eliminate the whole null / Option stuff entirely by ensuring that the 
thread-local has an `initialValue` instead of having it be `null` in some 
circumstances and not others.
    
    Therefore, here's my suggestion:
    
    - Add a regression test and confirm that it reproduces the original bug.
    - Override `SparkContext.initialValue` to return a new empty properties 
object (since this is [how we lazily 
initialize](https://github.com/hunglin/spark/blob/687434c9ab65601dde095d3cf6bb2f0de2ea90e1/core/src/main/scala/org/apache/spark/SparkContext.scala#L478)
 the properties in the existing code.  Update the other parts of SparkContext 
that set this to account for this change.
    - Add a few `assert(properties != null)` so that we catch errors up-front.  
I'd add these checks at the entry points of the DAGScheduler, e.g. the 
`private[spark]` `submitJob` methods that are called from SparkContext.
    
    Your patch looks good overall, but if we can I think we should just fix the 
underlying messiness.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to