Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/5124#discussion_r27307726
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -474,13 +477,13 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
* Spark fair scheduler pool.
*/
def setLocalProperty(key: String, value: String) {
- if (localProperties.get() == null) {
- localProperties.set(new Properties())
+ if (localProperties.get().isEmpty) {
+ localProperties.set(Some(new Properties()))
--- End diff --
To address the actual issue at hand, though, I'm suggesting that it's
clearer to change where we perform our initialization so that
`localProperties.get()` always returns a valid Properties object, even if it's
an empty one. Once `localProperties.get()` can never return null, we can go
through all of the code which checks whether `(localProperties.get() == null)`
and refactor it to assume that the result is non-null, and so on, until we've
removed all of the nullability and options from this code path.
If you look at DAGScheduler, I think that properties flow into it via
`handleJobSubmitted`. The properties that flow to this location come from
either SparkContext.runJob or SparkContext.runApproximateJob, both of which
pass `localProperties.get`:
https://github.com/hunglin/spark/blob/baea4fd9f6df0af466e51a8f19380f194ec502ae/core/src/main/scala/org/apache/spark/SparkContext.scala#L1493
If you continue to apply this sort of reasoning to all of the places where
these properties flow, I think that we'll find that the properties won't be
null unless they're null in the SparkContext run*Job methods, which will be
prevented by overriding `initialValue` to return an empty properties object.
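As a rough illustration of the suggestion above, here is a minimal standalone sketch, not the actual SparkContext code. It assumes `localProperties` is an `InheritableThreadLocal[Properties]`; the enclosing object `LocalPropertiesSketch` and the `getLocalProperty` helper are hypothetical names used only for this example. The point is that once `initialValue` returns an empty Properties object, `setLocalProperty` needs neither a null check nor an Option wrapper.

```scala
import java.util.Properties

// Hypothetical stand-in for the relevant part of SparkContext.
object LocalPropertiesSketch {

  // Assumption: the thread-local is an InheritableThreadLocal[Properties].
  // Overriding initialValue means get() returns an empty Properties object
  // instead of null the first time a thread reads it.
  val localProperties: InheritableThreadLocal[Properties] =
    new InheritableThreadLocal[Properties] {
      override protected def initialValue(): Properties = new Properties()
    }

  // No null check and no Option: get() is always a valid Properties object.
  def setLocalProperty(key: String, value: String): Unit = {
    if (value == null) {
      localProperties.get().remove(key)
    } else {
      localProperties.get().setProperty(key, value)
    }
  }

  // Hypothetical read helper; returns null when the key is unset.
  def getLocalProperty(key: String): String =
    localProperties.get().getProperty(key)
}
```

With something along these lines, callers such as the run*Job methods could pass `localProperties.get` downstream without any of the consumers needing to guard against null.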