[ https://issues.apache.org/jira/browse/MAPREDUCE-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881989#action_12881989 ]
Scott Chen commented on MAPREDUCE-1823:
---------------------------------------

In the patch, where the code used to call getFileStatus() for each path while recursing through the policy, it now calls listStatus() instead and stores the result in a map. This reduces the number of RPCs to the NameNode.

The patch adds no new unit test. This is an optimization, and the code path is already covered by the existing tests: TestRaidNode, TestRaidPurge and TestRaidHar.

> Reduce the number of calls of HarFileSystem.getFileStatus in RaidNode
> ---------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1823
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1823
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1823.txt
>
>
> RaidNode makes lots of calls of HarFileSystem.getFileStatus. This method
> fetches information from DataNode so it is slow. It becomes the bottleneck of
> the RaidNode. It will be nice if we can make this more efficient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.