GitHub user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5124#discussion_r27307726
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -474,13 +477,13 @@ class SparkContext(config: SparkConf) extends Logging 
with ExecutorAllocationCli
        * Spark fair scheduler pool.
        */
       def setLocalProperty(key: String, value: String) {
    -    if (localProperties.get() == null) {
    -      localProperties.set(new Properties())
    +    if (localProperties.get().isEmpty) {
    +      localProperties.set(Some(new Properties()))
    --- End diff --
    
    To address the actual issue at hand, though, I'm suggesting that it's 
clearer to change where we perform our initialization so that 
`localProperties.get()` always returns a valid Properties object, even if it's 
an empty one.  Once `localProperties.get()` can no longer return null, we can 
go through all of the code that checks `(localProperties.get() == null)` and 
refactor it to assume a non-null value, and so on, until we've removed all of 
the nullability and Options from this code path.
    
    If you look at DAGScheduler, I think that properties flow into it via 
`handleJobSubmitted`.  The properties that flow to this location come from 
either SparkContext.runJob or SparkContext.runApproximateJob, both of which 
pass `localProperties.get`: 
https://github.com/hunglin/spark/blob/baea4fd9f6df0af466e51a8f19380f194ec502ae/core/src/main/scala/org/apache/spark/SparkContext.scala#L1493
    
    If you continue to apply this sort of reasoning to all of the places where 
these properties flow, I think that we'll find that the properties won't be 
null unless they're null in the SparkContext run*Job methods, which will be 
prevented by overriding `initialValue` to return an empty properties object.
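
    As a rough illustration of the `initialValue` idea, here is a minimal 
standalone sketch.  The object name `LocalPropertiesSketch` and the 
`getLocalProperty` helper are hypothetical and exist only for this example; 
this is not the actual SparkContext code, just one way the initialization 
could look:

```scala
import java.util.Properties

// Hypothetical stand-in for the relevant part of SparkContext (assumption).
object LocalPropertiesSketch {
  // Overriding initialValue means get() hands back an empty Properties
  // object instead of null the first time a thread touches it.
  val localProperties = new InheritableThreadLocal[Properties] {
    override protected def initialValue(): Properties = new Properties()
  }

  def setLocalProperty(key: String, value: String): Unit = {
    // No null check on localProperties.get() is needed any more.
    if (value == null) {
      localProperties.get().remove(key)
    } else {
      localProperties.get().setProperty(key, value)
    }
  }

  // Illustrative helper: returns null when the key is not set.
  def getLocalProperty(key: String): String =
    localProperties.get().getProperty(key)
}
```

    With something like this in place, callers such as the run*Job methods 
could pass `localProperties.get()` along without wrapping it in an Option or 
checking it for null first.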


