[ 
https://issues.apache.org/jira/browse/HADOOP-12532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14982957#comment-14982957
 ] 

Wei-Chiu Chuang commented on HADOOP-12532:
------------------------------------------

I think a simple fix is to remove the connection thread from the hash map only 
after its connection has been terminated. Creating a test case to verify the fix 
would be harder, though.
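
To make the proposed ordering concrete, here is a minimal sketch. It is not the actual ipc.Client code: the class FixedStopSketch, the Conn thread, the `closed` flag and the startThenStop() helper are all hypothetical names made up for illustration. It assumes a plain map guarded by a lock. Each connection thread marks its connection closed first and only then removes itself from the map, and stop() checks for an empty map under the same lock.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of the proposed fix; not the real ipc.Client.
class FixedStopSketch {
    // Stand-in for the client's table of connection threads.
    private final Map<Integer, Conn> connections = new HashMap<>();

    class Conn extends Thread {
        final int id;
        final AtomicBoolean closed = new AtomicBoolean(false);

        Conn(int id) { this.id = id; }

        @Override
        public void run() {
            try {
                Thread.sleep(60_000); // idle until stop() interrupts us
            } catch (InterruptedException e) {
                // interrupted by stop(); fall through to cleanup
            } finally {
                closed.set(true); // 1. terminate the connection first
                synchronized (connections) {
                    connections.remove(id); // 2. only then leave the map
                }
            }
        }
    }

    // Interrupts every connection, then waits until the map is empty.
    void stop() {
        synchronized (connections) {
            for (Conn c : connections.values()) {
                c.interrupt();
            }
        }
        while (true) {
            synchronized (connections) { // the emptiness check is synchronized
                if (connections.isEmpty()) {
                    return;
                }
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Starts n connections, stops them, and reports whether every
    // connection was already closed when stop() returned.
    boolean startThenStop(int n) {
        List<Conn> all = new ArrayList<>();
        synchronized (connections) {
            for (int i = 0; i < n; i++) {
                Conn c = new Conn(i);
                connections.put(i, c);
                all.add(c);
            }
        }
        for (Conn c : all) {
            c.start();
        }
        stop();
        for (Conn c : all) {
            if (!c.closed.get()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(new FixedStopSketch().startThenStop(4));
    }
}
```

With this ordering, an empty map implies that every connection has been closed. It does not by itself imply that the threads have exited; a caller that needs that guarantee would presumably also have to join them.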

> Data race in IPC client Client.stop()
> -------------------------------------
>
>                 Key: HADOOP-12532
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12532
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>
> I found a data race in ipc.Client.stop().
> ipc.Client maintains a hash map of connection threads. When stop() is called, 
> it interrupts all connection threads; each thread is supposed to remove 
> itself from the hash map as part of its cleanup work; and stop() 
> periodically checks whether the hash map is empty and returns once it is.
> The bug is that this check is not synchronized, and the connection 
> thread actually removes itself from the hash map before terminating 
> its connection. 
> This bug causes a regression for HDFS-4925. In fact, the fix in HDFS-4925 may 
> not be correct, because it assumes that when 
> QuorumJournalManager.close() returns, the IPC client connection threads have 
> terminated. In reality, the IPC code only assumes that connections are closed, 
> not that the IPC connection threads have terminated (and that is buggy as well 
> in any case).
> This is also likely related to the bug reported in HDFS-4925 
> (TestQuorumJournalManager.testPurgeLogs intermittently fails 
> assertNoThreadsMatching).
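
The ordering problem described in the issue can be reproduced in a small model. This is only a sketch under assumptions, not the Hadoop code: RacyStopSketch, the `closed` flag and the `finishClose` latch are hypothetical, and the latch is there only to make a slow close deterministic. It uses a ConcurrentHashMap so that the example isolates the ordering issue and leaves aside the unsynchronized check.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of the buggy ordering; not the real ipc.Client.
class RacyStopSketch {

    // Returns whether the connection was closed at the moment the
    // stop()-style emptiness check passed.
    static boolean closedWhenStopReturned() {
        Map<Integer, Thread> connections = new ConcurrentHashMap<>();
        AtomicBoolean closed = new AtomicBoolean(false);
        CountDownLatch finishClose = new CountDownLatch(1);

        Thread conn = new Thread(() -> {
            try {
                Thread.sleep(60_000); // idle until interrupted
            } catch (InterruptedException e) {
                // interrupted by the stop logic below
            }
            connections.remove(1); // buggy order: leave the map first
            try {
                finishClose.await(); // simulate a close that takes a while
            } catch (InterruptedException e) {
                // ignored in this sketch
            }
            closed.set(true); // the connection is only closed here
        });
        connections.put(1, conn);
        conn.start();

        // Model of stop(): interrupt, then poll until the map is empty.
        conn.interrupt();
        while (!connections.isEmpty()) {
            pause(10);
        }
        boolean observed = closed.get(); // what a caller of stop() would see

        finishClose.countDown(); // let the connection thread finish
        try {
            conn.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return observed;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(closedWhenStopReturned());
    }
}
```

In this model the emptiness check passes while the connection is still open and its thread is still running. That is the state in which a check such as assertNoThreadsMatching could plausibly still find a live thread.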



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
