[ https://issues.apache.org/jira/browse/MAPREDUCE-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873109#action_12873109 ]
Scott Chen commented on MAPREDUCE-1823:
---------------------------------------
Here's the corresponding jstack. It shows the TriggerMonitor thread blocked on a socket read inside HarFileSystem.getFileStatus while it scans the HAR index:
{code}
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
- locked <0x00002aaab7e19810> (a sun.nio.ch.Util$1)
- locked <0x00002aaab7e197f8> (a java.util.Collections$UnmodifiableSet)
- locked <0x00002aaab7e19468> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:332)
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
- locked <0x00002aaae427a320> (a java.io.BufferedInputStream)
at java.io.DataInputStream.readShort(DataInputStream.java:295)
at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1436)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1698)
- locked <0x00002aaae4264f38> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1815)
- locked <0x00002aaae4264f38> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
at java.io.DataInputStream.read(DataInputStream.java:83)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:187)
at org.apache.hadoop.fs.HarFileSystem.fileStatusInIndex(HarFileSystem.java:441)
at org.apache.hadoop.fs.HarFileSystem.getFileStatus(HarFileSystem.java:616)
at org.apache.hadoop.raid.RaidNode.getParityFile(RaidNode.java:541)
at org.apache.hadoop.raid.RaidNode.getParityFile(RaidNode.java:561)
at org.apache.hadoop.raid.RaidNode.recurse(RaidNode.java:639)
at org.apache.hadoop.raid.RaidNode.recurse(RaidNode.java:655)
at org.apache.hadoop.raid.RaidNode.recurse(RaidNode.java:655)
at org.apache.hadoop.raid.RaidNode.selectFiles(RaidNode.java:594)
at org.apache.hadoop.raid.RaidNode.access$300(RaidNode.java:63)
at org.apache.hadoop.raid.RaidNode$TriggerMonitor.doProcess(RaidNode.java:374)
at org.apache.hadoop.raid.RaidNode$TriggerMonitor.run(RaidNode.java:313)
at java.lang.Thread.run(Thread.java:619)
{code}
> Reduce the number of calls of HarFileSystem.getFileStatus in RaidNode
> ---------------------------------------------------------------------
>
> Key: MAPREDUCE-1823
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1823
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 0.22.0
> Reporter: Scott Chen
> Assignee: Scott Chen
> Fix For: 0.22.0
>
>
> RaidNode makes many calls to HarFileSystem.getFileStatus. This method
> fetches information from the DataNode, so it is slow and becomes the
> bottleneck of the RaidNode. It would be nice if we could make this more efficient.
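To make the idea concrete, here is a minimal, purely illustrative sketch (the CachedStatusLookup class and its names are invented for this example, not part of the RaidNode code) of one possible direction: memoize the per-path getFileStatus results for the duration of a single scan, so the HAR index is not re-read over the network for every parity-file lookup.
{code}
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative helper only: caches FileStatus lookups so one recursive
// scan does not call HarFileSystem.getFileStatus (and hit the network)
// for the same path more than once.
class CachedStatusLookup {
  private final FileSystem fs;  // e.g. a HarFileSystem instance
  private final Map<Path, FileStatus> cache = new HashMap<Path, FileStatus>();

  CachedStatusLookup(FileSystem fs) {
    this.fs = fs;
  }

  /** Returns the cached status, fetching it on the first miss; null if the path does not exist. */
  synchronized FileStatus getFileStatus(Path p) throws IOException {
    if (cache.containsKey(p)) {
      return cache.get(p);
    }
    FileStatus stat;
    try {
      stat = fs.getFileStatus(p);
    } catch (FileNotFoundException e) {
      stat = null;  // remember misses too, so repeated lookups of absent paths stay cheap
    }
    cache.put(p, stat);
    return stat;
  }

  /** Drop cached entries between scans so stale statuses are not reused. */
  synchronized void clear() {
    cache.clear();
  }
}
{code}
A per-scan cache like this keeps the existing getFileStatus semantics while bounding the HAR index reads to one per distinct path per pass; invalidation is just a clear() at the start of each TriggerMonitor iteration.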