Github user colorant commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1273#discussion_r14544250
  
    --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
    @@ -141,7 +141,7 @@ class HadoopRDD[K, V](
           // local process. The local cache is accessed through HadoopRDD.putCachedMetadata().
           // The caching helps minimize GC, since a JobConf can contain ~10KB of temporary objects.
           // synchronize to prevent ConcurrentModificationException (Spark-1097, Hadoop-10456)
    -      broadcastedConf.synchronized {
    +      broadcastedConf.value.value.synchronized {
    --- End diff --
    
    @aarondav, yes, I thought about this before. I kept
    broadcastedConf.value.value here because, although a broadcast variable is
    meant to be read-only, someone might misuse it and change the value, in
    which case reading the latest value could be helpful. The original code
    also reads the latest value on the next line, so I kept that style. On
    second thought, if the value is changed somewhere without holding any
    lock, this still would not solve the problem. I will update the pull
    request.
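
The trade-off described above can be illustrated with a minimal Scala sketch. `Holder` is a hypothetical stand-in for a broadcast wrapper, not Spark's `Broadcast` class, and the helper names are assumptions made for illustration. It suggests that a lock taken on the wrapped value only excludes other callers while that value is never replaced, whereas a lock taken on the holder uses the same monitor for every caller.

```scala
// Hypothetical stand-in for a broadcast wrapper; this is NOT Spark's Broadcast class.
final class Holder[T <: AnyRef](@volatile var value: T)

object LockChoice {
  // Lock on the holder itself: the monitor is the same object for every caller,
  // even if `value` is reassigned later.
  def withHolderLock[T <: AnyRef, R](h: Holder[T])(body: T => R): R =
    h.synchronized { body(h.value) }

  // Lock on the wrapped value: callers only exclude each other while
  // `value` keeps pointing at the same object.
  def withValueLock[T <: AnyRef, R](h: Holder[T])(body: T => R): R = {
    val v = h.value // read once so the lock target and the argument are the same object
    v.synchronized { body(v) }
  }

  // True when both references denote the same monitor object.
  def sameMonitor(a: AnyRef, b: AnyRef): Boolean = a eq b
}
```

If some code replaced `value` between two calls to `withValueLock`, the two calls would synchronize on different objects and could run concurrently. Neither variant helps if a writer mutates the wrapped object without taking any lock at all, which is the concern raised in the comment.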


