[ https://issues.apache.org/jira/browse/HDFS-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388080#comment-14388080 ]

zhouyingchao commented on HDFS-7999:
------------------------------------

Thank you for looking into the patch.  Here is an explanation of the logic of 
createTemporary() after the patch is applied:
1.  If there is no ReplicaInfo in volumeMap for the passed-in ExtendedBlock b, 
we create one, insert it into volumeMap, and return from line 1443.
2.  If there is a ReplicaInfo in volumeMap and its GS is newer than that of the 
passed-in ExtendedBlock b, we throw ReplicaAlreadyExistsException from line 1447.
3.  If there is a ReplicaInfo in volumeMap whose GS is older than that of the 
passed-in ExtendedBlock b, this is a new write and the earlier writer should be 
stopped.  We release the FsDatasetImpl lock and stop the earlier writer without 
holding the lock.
4.  After the earlier writer is stopped, we need to evict its ReplicaInfo from 
volumeMap, and to that end we re-acquire the FsDatasetImpl lock.  However, since 
this thread released the FsDatasetImpl lock while stopping the earlier writer, 
another thread may have come in and changed the ReplicaInfo of this block in 
volumeMap.  This situation is unlikely, but we have to handle it just in case. 
The loop in the patch handles exactly this situation: after re-acquiring the 
FsDatasetImpl lock, it checks whether the current ReplicaInfo in volumeMap is 
still the one we saw before stopping the writer.  If so, we can simply evict 
it, create and insert a new one, and return from line 1443.  Otherwise, another 
thread slipped in and changed the ReplicaInfo while we were stopping the 
earlier writer.  In that case, we check whether that thread inserted a block 
with an even newer GS than ours; if so, we throw ReplicaAlreadyExistsException 
from line 1447.  Otherwise, we stop that thread's writer just as we stopped the 
earlier writer in step 3.  A simplified sketch of the whole loop follows the list.
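
The sketch below shows the shape of the loop in Java.  It is a simplification, 
not the actual patch: evictReplica() and createTemporaryReplica() are 
hypothetical helpers standing in for the real volumeMap and volume handling, 
and stopWriter() is shown without its timeout handling.

{code:java}
ReplicaInPipeline createTemporary(ExtendedBlock b) throws IOException {
  ReplicaInfo lastFound = null;  // replica whose writer we already stopped
  while (true) {
    synchronized (this) {  // "this" plays the role of the FsDatasetImpl lock
      ReplicaInfo current = volumeMap.get(b.getBlockPoolId(), b.getBlockId());
      if (current == lastFound) {
        // Step 1, or step 4 when nothing changed while we were unlocked:
        // evict the stopped writer's replica (if any), then create, insert
        // and return a new one (the "return from line 1443").
        if (lastFound != null) {
          evictReplica(b.getBlockPoolId(), lastFound);            // hypothetical
        }
        ReplicaInPipeline newReplica = createTemporaryReplica(b); // hypothetical
        volumeMap.add(b.getBlockPoolId(), newReplica);
        return newReplica;
      }
      if (current.getGenerationStamp() >= b.getGenerationStamp()
          || !(current instanceof ReplicaInPipeline)) {
        // Step 2, or step 4 when another thread inserted a newer GS
        // (the "throw from line 1447").
        throw new ReplicaAlreadyExistsException("Block " + b
            + " already exists and thus cannot be created.");
      }
      // Step 3: an older-GS writer exists; remember it so the next iteration
      // can tell whether volumeMap changed while we were unlocked.
      lastFound = current;
    }
    // Stop the earlier writer WITHOUT holding the FsDatasetImpl lock, so
    // heartbeats and other dataset operations are not blocked meanwhile.
    ((ReplicaInPipeline) lastFound).stopWriter();
  }
}
{code}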


> FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a 
> very long time
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-7999
>                 URL: https://issues.apache.org/jira/browse/HDFS-7999
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: zhouyingchao
>            Assignee: zhouyingchao
>         Attachments: HDFS-7999-001.patch
>
>
> I'm using 2.6.0 and noticed that the DN's heartbeats were sometimes delayed 
> for a very long time, say more than 100 seconds. I took a jstack twice, and 
> it looks like the heartbeat threads are all blocked (at getStorageReport) on 
> the dataset lock, which is held by a thread calling createTemporary, which is 
> in turn blocked waiting for the earlier incarnation's writer to exit. (A 
> minimal illustration of this locking pattern follows the stack traces below.)
> The heartbeat thread stack:
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
>         - waiting to lock <0x00000007b01428c0> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
>         - locked <0x00000007b0140ed0> (a java.lang.Object)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
>         at java.lang.Thread.run(Thread.java:662)
> The DataXceiver thread holds the dataset lock:
> "DataXceiver for client at XXXXX" daemon prio=10 tid=0x00007f14041e6480 nid=0x52bc in Object.wait() [0x00007f11d78f7000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Thread.join(Thread.java:1194)
>         - locked <0x00000007a33b85d8> (a org.apache.hadoop.util.Daemon)
>         at org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
>         - locked <0x00000007b01428c0> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:179)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>         at java.lang.Thread.run(Thread.java:662)
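>
> In other words, the delay boils down to joining the earlier writer's thread 
> while holding the dataset monitor. Below is a minimal, standalone 
> illustration of that pattern; all names here are hypothetical and this is 
> not HDFS code.
> {code:java}
> public class JoinUnderLockDemo {
>   private final Object datasetLock = new Object();
>
>   // Stands in for the pre-patch createTemporary(): it joins the old
>   // writer thread while still holding datasetLock.
>   void createTemporary(Thread oldWriter) throws InterruptedException {
>     synchronized (datasetLock) {
>       oldWriter.join();  // may block for a long time...
>     }                    // ...while every other user of datasetLock waits
>   }
>
>   // Stands in for the heartbeat path (getStorageReports/getDfsUsed):
>   // it needs the lock only briefly, yet is stuck behind the join above.
>   void sendHeartbeat() {
>     synchronized (datasetLock) {
>       // gather storage reports; blocked until createTemporary() returns
>     }
>   }
> }
> {code}
> Moving the join outside the synchronized block, as the patch does, lets 
> sendHeartbeat() acquire the lock without waiting for the writer to exit.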


