[ 
https://issues.apache.org/jira/browse/HDFS-10267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229420#comment-15229420
 ] 

Colin Patrick McCabe commented on HDFS-10267:
---------------------------------------------

Thanks for the reviews, [~eddyxu] and [~xiaochen].

bq. Could you help me to understand where does stopWriterThread interrupt 
slowWriterThread?

{{stopWriterThread}} will call an operation that eventually leads to:
{code}
 e.getReplica().stopWriter(datanode.getDnConf().getXceiverStopTimeout());
{code}
{{ReplicaInPipeline#stopWriter}} interrupts the existing writer thread and then 
tries to join it.
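
For readers who do not have the source handy, the interrupt-then-join pattern 
can be sketched as follows.  This is a minimal illustration, not the actual 
Hadoop code: the class name {{StopWriterSketch}}, the method signature, and 
the boolean return value are all assumptions made for the example.

```java
class StopWriterSketch {
    /**
     * Hypothetical helper: interrupts the writer thread and waits up to
     * timeoutMs (assumed to be > 0) for it to exit.
     * Returns true if the writer is no longer running afterwards.
     */
    static boolean stopWriter(Thread writer, long timeoutMs)
            throws InterruptedException {
        if (writer == null || writer == Thread.currentThread()
                || !writer.isAlive()) {
            return true; // nothing to stop
        }
        writer.interrupt();     // ask the writer to stop
        writer.join(timeoutMs); // wait for it, bounded by the timeout
        return !writer.isAlive();
    }

    public static void main(String[] args) throws Exception {
        // A stand-in writer that exits as soon as it is interrupted.
        Thread writer = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // Interrupted: fall through and let the thread end.
            }
        });
        writer.start();
        System.out.println("stopped: " + stopWriter(writer, 5_000));
    }
}
```

A caller that holds a lock the writer needs while it sits in the join would 
wait out the whole timeout, which is why it matters where this call is made.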

bq. Should we break the while loop if the Thread holding this semaphore being 
interrupted?

Hmm.  No one thread "holds" a semaphore.  Are you proposing breaking on 
InterruptedException in {{uninterruptiblyAcquire}}?  We cannot do that because 
it would lead to the slowWriter thread terminating immediately when 
{{ReplicaInPipeline#stopWriter}} is called.  This defeats the point of the 
test, which is to ensure that nobody is holding the {{FsDatasetImpl}} lock 
when calling {{ReplicaInPipeline#stopWriter}}.
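
To make the reasoning concrete, here is a rough sketch of an acquire loop that 
does not break on interruption.  It is an illustration only, not the code in 
the patch: the class name, the exact signature of {{uninterruptiblyAcquire}}, 
and the small demo in {{main}} are assumptions.

```java
import java.util.concurrent.Semaphore;

class UninterruptibleAcquireSketch {
    /**
     * Hypothetical version of the test helper: blocks until a permit is
     * available, ignoring interrupts instead of returning early.
     */
    static void uninterruptiblyAcquire(Semaphore sem) {
        while (true) {
            try {
                sem.acquire();
                return; // got the permit
            } catch (InterruptedException e) {
                // Deliberately retry.  The exception clears the interrupt
                // status, and this sketch does not restore it.
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Semaphore sem = new Semaphore(0); // no permits yet
        Thread slowWriter = new Thread(() -> uninterruptiblyAcquire(sem));
        slowWriter.start();

        slowWriter.interrupt(); // what stopWriter would do
        slowWriter.join(200);   // give it a moment to react
        System.out.println("alive after interrupt: " + slowWriter.isAlive());

        sem.release();          // only now may the thread finish
        slowWriter.join();
        System.out.println("alive after release: " + slowWriter.isAlive());
    }
}
```

With a loop like this the thread outlives the interrupt and exits only once 
the test releases the semaphore, so the test controls how long the writer 
stays alive.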

bq. javadoc of uninterruptiblyAcquire has redundant parameter:

Fixed

bq. We could use GenericTestUtils#assertExceptionContains

Good question.  I did not want to use this method because, on failure, it 
throws an exception that is different from the one it is checking.  I want 
to see the original exception if there is a failure.
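
One way to get that behavior is to check the message by hand and rethrow the 
caught exception itself when it does not match.  The sketch below only 
illustrates the idea and is not taken from the patch; the class 
{{RethrowOriginalSketch}} and the helper {{expectMessage}} are hypothetical 
names.

```java
import java.io.IOException;

class RethrowOriginalSketch {
    /**
     * Hypothetical helper: returns normally if the exception message contains
     * the expected text, otherwise rethrows the very same exception so its
     * type, message, and stack trace show up in the test report.
     */
    static void expectMessage(String expected, Exception e) throws Exception {
        String msg = e.getMessage();
        if (msg == null || !msg.contains(expected)) {
            throw e; // surface the original failure unchanged
        }
    }

    public static void main(String[] args) throws Exception {
        Exception original = new IOException("disk failed");

        // Matching text: the check passes silently.
        expectMessage("disk", original);

        // Non-matching text: the original exception comes back out.
        try {
            expectMessage("timeout", original);
        } catch (IOException e) {
            System.out.println("same exception rethrown: " + (e == original));
        }
    }
}
```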

bq. javadoc on testStopWorker should be updated (replace initReplicaRecovery)

Fixed.

Also removed extra spaces.

> Extra "synchronized" on FsDatasetImpl#recoverAppend and 
> FsDatasetImpl#recoverClose
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-10267
>                 URL: https://issues.apache.org/jira/browse/HDFS-10267
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.8.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-10267.001.patch, HDFS-10267.002.patch, 
> HDFS-10267.003.patch
>
>
> There is an extra "synchronized" on FsDatasetImpl#recoverAppend and 
> FsDatasetImpl#recoverClose that prevents the HDFS-8496 fix from working as 
> intended.  This should be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
