Github user colorant commented on the pull request:

    https://github.com/apache/spark/pull/1000#issuecomment-47628775
  
    @rxin, correct me if I am wrong.
    
    The problem here is that the broadcastedConf reference is per-task in HadoopRDD, so synchronizing on the method or on broadcastedConf itself only protects within that one task. But when you call broadcastedConf.value.value, you actually get back the value cached in the memory store (when memory is sufficient and the deserialized storage level is used), so that conf object should be the same one per node. In other words, when getConf is called from different tasks, nothing prevents them from getting the same conf object, and passing that shared conf object to JobConf(conf) leads to this problem.
    
    If I am right, then synchronizing on broadcastedConf.value.value might solve this problem?
    
    I am not 100% sure that cross-task references actually behave as I described above. What do you think? I will try modifying the code and see if it works; if it does, I can open a quick pull request.
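
    To make the idea concrete, here is a minimal Java sketch of the race and the proposed fix, under the assumption described above: tasks on one executor all see the SAME cached conf object, and JobConf(conf) copies it by iterating over it, which is unsafe against concurrent mutation unless every task locks that shared object. The names here are illustrative stand-ins, not Spark's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedConfSketch {

    // Stand-in for the single conf object served from the memory store
    // (what broadcastedConf.value.value would return on this node).
    static final Map<String, String> sharedConf = new HashMap<>();

    // Analogue of `new JobConf(conf)`: synchronizing on the shared object
    // itself makes the copy safe, because every task locks the same
    // monitor rather than a per-task one.
    static Map<String, String> newJobConf() {
        synchronized (sharedConf) {
            return new HashMap<>(sharedConf);
        }
    }

    public static void main(String[] args) throws Exception {
        sharedConf.put("fs.defaultFS", "hdfs://namenode:8020");

        // Several "tasks" copy the conf concurrently; with the shared
        // lock each copy is a consistent snapshot, and mutating the copy
        // never touches the shared object.
        Thread[] tasks = new Thread[4];
        for (int i = 0; i < tasks.length; i++) {
            tasks[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    Map<String, String> jobConf = newJobConf();
                    jobConf.put("task.local.key", "ok"); // per-copy mutation only
                }
            });
            tasks[i].start();
        }
        for (Thread t : tasks) t.join();
        System.out.println("shared conf size after run: " + sharedConf.size());
    }
}
```

    The key point is that a per-task lock (synchronizing on the method or on the per-task broadcast wrapper) would not help here, since each task would lock a different monitor while copying the same underlying object.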
