[ 
https://issues.apache.org/jira/browse/HADOOP-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285546#comment-14285546
 ] 

Varun Saxena commented on HADOOP-11209:
---------------------------------------

Thanks a lot [~ozawa] for the review. 

Actually there was another issue raised in Spark, SPARK-1097 (linked to 
SPARK-2546), where ConcurrentModificationException was also a problem. I wasn't 
able to devise a deterministic way of simulating the infinite loop, hence I 
didn't include it in the test. The approach you gave above sounds good. I will 
include this as well in my original test case.

Regarding your other comments,
bq. 2. The definition of updatingResource and backup should be Map<String, 
String[]> instead of ConcurrentHashMap.
bq. 3. The indents of following lines are strange because of tab. Please 
replace them with 2 spaces.
bq. 4. Please remove trailing spaces.
Will fix.

bq. 1. Why not use Collections.synchronizedSet(new HashSet<String>()) 
straightforwardly?
{{Collections.synchronizedSet}} performs comparatively worse than a set view 
of ConcurrentHashMap.
This is because Collections.synchronizedSet is basically a wrapper around the 
underlying set and guards every operation with a single monolithic mutex.
That is why I did not use it: a Configuration object may be accessed by 
several threads, and frequently.
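
To make the comparison concrete, here is a minimal sketch of the two options. 
It is not code from the patch: the class name FinalParametersSketch and the 
helper fill() are made up for illustration, and the concurrent variant assumes 
the set view is built with Collections.newSetFromMap.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; not taken from Configuration.java.
public class FinalParametersSketch {

  // Option suggested in the review: every call on the wrapper takes one lock.
  static Set<String> synchronizedVariant() {
    return Collections.synchronizedSet(new HashSet<String>());
  }

  // Set view backed by ConcurrentHashMap: no single set-wide lock.
  static Set<String> concurrentVariant() {
    return Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
  }

  // Adds perThread distinct keys from each thread and returns the final size.
  static int fill(Set<String> set, int threads, int perThread)
      throws InterruptedException {
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      final int id = t;
      workers[t] = new Thread(() -> {
        for (int i = 0; i < perThread; i++) {
          set.add("key." + id + "." + i);
        }
      });
      workers[t].start();
    }
    for (Thread w : workers) {
      w.join();
    }
    return set.size();
  }

  public static void main(String[] args) throws InterruptedException {
    // Both variants are safe for concurrent add(); they differ in locking.
    System.out.println(fill(synchronizedVariant(), 4, 1000));
    System.out.println(fill(concurrentVariant(), 4, 1000));
  }
}
```

Both variants end up with the same contents; the difference is only in how 
much the threads block each other. Note also that iterating over a 
synchronizedSet still requires the caller to synchronize on it manually, 
whereas iterators over the ConcurrentHashMap-backed view are weakly consistent 
and do not throw ConcurrentModificationException.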




> Configuration is not thread-safe
> --------------------------------
>
>                 Key: HADOOP-11209
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11209
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>            Reporter: Josh Rosen
>            Assignee: Varun Saxena
>         Attachments: HADOOP-11209.001.patch, HADOOP-11209.002.patch, 
> HADOOP-11209.003.patch
>
>
> {{Configuration}} objects are not fully thread-safe, which causes problems in 
> multi-threaded frameworks like Spark that use these configurations to 
> interact with existing Hadoop APIs (such as InputFormats).
> SPARK-2546 is an example of a problem caused by this lack of thread-safety.  
> In that bug, multiple concurrent modifications of the same Configuration (in 
> third-party code) caused an infinite loop because Configuration's internal 
> {{java.util.HashMap}} is not thread-safe.
> One workaround is for our code to clone Configuration objects; unfortunately, 
> this also suffers from thread-safety issues on older Hadoop versions because 
> Configuration's constructor wasn't thread-safe (HADOOP-10456).
> [Looking at a recent version of 
> Configuration.java|https://github.com/apache/hadoop/blob/d989ac04449dc33da5e2c32a7f24d59cc92de536/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L666],
>  it seems that the private {{updatingResource}} HashMap and 
> {{finalParameters}} HashSet fields are the only non-thread-safe collections in 
> Configuration (Java's {{Properties}} class is thread-safe), so I don't think 
> that it would be hard to make Configuration fully thread-safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
