Tathagata Das created SPARK-12087:
-------------------------------------
Summary: DStream.saveAsHadoopFiles can throw ConcurrentModificationException
Key: SPARK-12087
URL: https://issues.apache.org/jira/browse/SPARK-12087
Project: Spark
Issue Type: Bug
Components: Streaming
Affects Versions: 1.5.2, 1.4.1, 1.3.1
Reporter: Tathagata Das
Assignee: Tathagata Das
The JobConf object created in DStream.saveAsHadoopFiles is used concurrently in
multiple places:
- The JobConf is updated by RDD.saveAsHadoopFile() before the job is launched.
- The JobConf is serialized as part of the DStream checkpoints.
These concurrent accesses (one thread updating the JobConf while another thread
is serializing it) can lead to a ConcurrentModificationException in the
underlying Java HashMap used by the internal Hadoop Configuration object.
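
The failure mode described above can be reproduced in miniature. The sketch below is an illustration only and does not use Spark or Hadoop: a plain java.util.HashMap stands in for the configuration's internal map, and the class and method names (JobConfRaceSketch, serialize, sharedConfFails, copiedConfSucceeds) are hypothetical. To keep the outcome deterministic, the "other thread's" update is simulated by a callback that fires partway through the serialization walk instead of by a real second thread. The second method shows one possible mitigation, giving each batch its own copy of the configuration; this is an assumption about a reasonable approach, not necessarily the fix adopted for this issue.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

class JobConfRaceSketch {

    // Hypothetical stand-in for the shared configuration (not a real JobConf).
    static Map<String, String> baseConf() {
        Map<String, String> conf = new HashMap<>();
        conf.put("output.format", "text");
        conf.put("output.key.class", "Text");
        conf.put("output.value.class", "IntWritable");
        return conf;
    }

    // Stand-in for checkpoint serialization: walks every entry of the map.
    // midwayUpdate runs once after the first entry, simulating an update
    // from another thread landing in the middle of the walk.
    static String serialize(Map<String, String> conf, Runnable midwayUpdate) {
        StringBuilder out = new StringBuilder();
        boolean fired = false;
        for (Map.Entry<String, String> e : conf.entrySet()) {
            out.append(e.getKey()).append('=').append(e.getValue()).append(';');
            if (!fired) {
                fired = true;
                midwayUpdate.run();
            }
        }
        return out.toString();
    }

    // Shared map: the update adds a new key to the map being iterated,
    // so the fail-fast iterator throws on its next step.
    static boolean sharedConfFails() {
        Map<String, String> conf = baseConf();
        try {
            serialize(conf, () -> conf.put("output.dir", "/tmp/batch-1"));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Per-batch copy: the update goes to a private copy, so the map being
    // serialized is only read, never structurally modified.
    static boolean copiedConfSucceeds() {
        Map<String, String> shared = baseConf();
        try {
            serialize(shared, () -> {
                Map<String, String> perBatch = new HashMap<>(shared);
                perBatch.put("output.dir", "/tmp/batch-1");
            });
            return true;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("shared conf threw CME: " + sharedConfFails());
        System.out.println("per-batch copy serialized cleanly: " + copiedConfSucceeds());
    }
}
```

With real threads the exception is intermittent rather than guaranteed, since it depends on whether an update happens to land while the serializer is iterating.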