Tathagata Das created SPARK-12087:
-------------------------------------

             Summary: DStream.saveAsHadoopFiles can throw 
ConcurrentModificationException
                 Key: SPARK-12087
                 URL: https://issues.apache.org/jira/browse/SPARK-12087
             Project: Spark
          Issue Type: Bug
          Components: Streaming
    Affects Versions: 1.5.2, 1.4.1, 1.3.1
            Reporter: Tathagata Das
            Assignee: Tathagata Das


The JobConf object created in DStream.saveAsHadoopFiles is used concurrently in 
multiple places:
- The JobConf is updated by RDD.saveAsHadoopFile() before the job is launched
- The JobConf is serialized as part of the DStream checkpoints. 

These concurrent accesses (one thread updating the JobConf while another thread
is serializing it) can lead to a ConcurrentModificationException in the
underlying Java HashMap used by the internal Hadoop Configuration object.
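
The failure mode described above can be sketched without Spark or Hadoop on the
classpath. The following is only an illustration under stated assumptions: a
plain java.util.HashMap stands in for the Hadoop Configuration, the serialize()
helper is a hypothetical stand-in for checkpoint serialization, and the
two-thread interleaving is simulated deterministically in a single thread by
running the "job setup" step in the middle of the iteration. The second method
shows one possible mitigation (giving each job its own copy of the
configuration); it is not necessarily the fix adopted for this issue.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

class SharedConfSketch {

    // Hypothetical stand-in for a shared JobConf: a plain HashMap of settings.
    static Map<String, String> baseConf() {
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.defaultFS", "file:///");
        conf.put("io.file.buffer.size", "4096");
        return conf;
    }

    // Hypothetical stand-in for checkpoint serialization: it walks every entry.
    // The callback runs after the first entry to simulate another thread
    // getting scheduled while the iteration is still in progress.
    static int serialize(Map<String, String> conf, Runnable otherThread) {
        int size = 0;
        boolean first = true;
        for (Map.Entry<String, String> e : conf.entrySet()) {
            size += e.getKey().length() + e.getValue().length();
            if (first) {
                otherThread.run();
                first = false;
            }
        }
        return size;
    }

    // The "job setup" step adds a new key to the SAME map that is being
    // iterated, so the iterator detects the structural change and fails.
    static boolean sharedConfFails() {
        Map<String, String> conf = baseConf();
        try {
            serialize(conf, () -> conf.put("mapred.output.dir", "/tmp/out-1"));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // The "job setup" step works on its own copy, so the map being
    // serialized is never modified during the iteration.
    static boolean copiedConfSucceeds() {
        Map<String, String> conf = baseConf();
        try {
            serialize(conf, () -> {
                Map<String, String> perJobConf = new HashMap<>(conf);
                perJobConf.put("mapred.output.dir", "/tmp/out-1");
            });
            return conf.size() == 2;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("shared conf fails:    " + sharedConfFails());
        System.out.println("copied conf succeeds: " + copiedConfSucceeds());
    }
}
```

With real threads the exception is timing-dependent and may appear only
occasionally, which is why the sketch forces the interleaving instead of
relying on a race.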



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

