[jira] [Resolved] (HADOOP-19052) Hadoop use Shell command to get the count of the hard link which takes a lot of time

Shilun Fan (Jira) Sat, 23 Mar 2024 19:16:10 -0700


     [ 
https://issues.apache.org/jira/browse/HADOOP-19052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Shilun Fan resolved HADOOP-19052.
---------------------------------
       Fix Version/s: 3.5.0
                      3.4.1
        Hadoop Flags: Reviewed
    Target Version/s: 3.5.0, 3.4.1
          Resolution: Fixed

> Hadoop use Shell command to get the count of the hard link which takes a lot 
> of time
> ------------------------------------------------------------------------------------
>
>                 Key: HADOOP-19052
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19052
>             Project: Hadoop Common
>          Issue Type: Improvement
>         Environment: Hadoop 3.3.4
>            Reporter: liang yu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.5.0, 3.4.1
>
>         Attachments: debuglog.png
>
>
> Using Hadoop 3.3.4
>  
> When the QPS of `append` operations is very high (above 10000/s), we found 
> that the write speed in Hadoop is very slow. We traced some datanodes' logs 
> and found the following warning:
> {code:java}
> 2024-01-26 11:09:44,292 WARN impl.FsDatasetImpl 
> (InstrumentedLock.java:logWaitWarning(165)) Waited above threshold(300 ms) to 
> acquire lock: lock identifier: FsDatasetRwlock waitTimeMs=336 ms.Suppressed 0 
> lock wait warnings.Longest supressed waitTimeMs=0.The stack trace is
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1060)
> org.apache.hadoop.util.InstrumentedLock.logWaitWarning(InstrumentedLock.java:171)
> org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:222)
> org.apache.hadoop.util.InstrumentedLock.lock(InstrumentedLock.java:105)
> org.apache.hadoop.util.AutoCloseableLock.acquire(AutoCloseableLock.java:67)
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.append(FsDatasetImpl.java:1239)
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:230)
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.getBlockReceiver(DataXceiver.java:1313)
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:764)
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:176)
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:110)
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:293)
> java.lang.Thread.run(Thread.java:748)
> {code}
>  
> Then we traced the method 
> _org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.append(FsDatasetImpl.java:1239)_ 
> and printed how long each step takes to execute. We found that getting the 
> linkCount of the file takes 700 ms, which is really slow.
> !debuglog.png!
>  
> We traced the code and found that, on Java 1.8, a shell command is used to 
> get the linkCount. Each call starts a new process and waits for that process 
> to fork. When the QPS is very high, forking the process can sometimes take a 
> long time.
> Here is the shell command:
> {code:java}
> stat -c%h /path/to/file
> {code}
>  
> Solution:
> For a FileStore that supports the "unix" file attribute view, we can use the 
> method _Files.getAttribute(f.toPath(), "unix:nlink")_ to get the linkCount. 
> This method does not need to start a new process and returns the result in a 
> very short time.
>  
> With this method, we rarely see the WARN log above, even when the QPS of 
> append operations is high.
>  
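The approach described under "Solution" can be sketched roughly as follows. This is a minimal illustration, not the patch that was merged: the class name `LinkCountSketch`, the helper `getLinkCount`, and the `-1` "unsupported" return value are hypothetical names and conventions chosen for this example. It assumes a platform whose FileStore reports support for the "unix" attribute view (for example Linux); elsewhere a caller would have to fall back to another mechanism such as the `stat -c%h` command quoted above.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of Hadoop.
class LinkCountSketch {

    /**
     * Reads the hard-link count through the "unix" attribute view, without
     * starting a child process. Returns -1 (an assumption of this sketch)
     * when the file store does not support that view.
     */
    public static int getLinkCount(Path path) throws IOException {
        FileStore store = Files.getFileStore(path);
        if (!store.supportsFileAttributeView("unix")) {
            return -1; // caller would fall back, e.g. to `stat -c%h`
        }
        // The attribute value is a boxed number; go through Number to avoid
        // depending on the exact boxed type.
        Object nlink = Files.getAttribute(path, "unix:nlink");
        return ((Number) nlink).intValue();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("nlink-demo", ".dat");
        Path link = file.resolveSibling(file.getFileName() + ".link");
        try {
            System.out.println("before createLink: " + getLinkCount(file));
            Files.createLink(link, file); // adds a second hard link
            System.out.println("after createLink:  " + getLinkCount(file));
        } finally {
            Files.deleteIfExists(link);
            Files.deleteIfExists(file);
        }
    }
}
```

Checking `supportsFileAttributeView("unix")` first keeps the fast path limited to file stores that expose `nlink`, which matches the condition stated in the issue description.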



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

