[ https://issues.apache.org/jira/browse/MAPREDUCE-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xing Shi updated MAPREDUCE-1424:
--------------------------------

Description:

The Merger will open too many files on disk when there are too many empty segments in the shuffle memory.

We process large inputs, e.g. > 100 TB, in one job, and we use our own partitioner for the map output, so one map's output is shuffled wholly to a single reduce. All the other reduces therefore receive an empty segment from that map:

                whole
    map_n |-----------> reduce1
          |   empty
          |-----------> reduce2
          |   empty
          |-----------> reduce3
          |   empty
          |-----------> reduce4

Because the input data is large, there are lots of maps (~10^5), and typically several thousand maps feed one reduce, so each reduce also sees several thousand empty segments.

For example:

    1000 map outputs (on disk) + 3000 empty segments (in mem)

With io.sort.factor=100, the first merge cycle merges 10 + 3000 segments [per getPassFactor: (1000 - 1) % (100 - 1) + 1 = 10, plus the 3000 empty in-memory segments]. Because there is no real data in memory, the remaining 990 on-disk map outputs are pulled in to replace the 3000 empty in-memory segments, so 1000 fds end up open. Once several reduces run on one TaskTracker, several thousand fds are open at once.

I think we can remove the empty segments during the first collection; moreover, in the shuffle phase we can avoid adding such segments into memory at all.
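A minimal sketch of the pass-factor arithmetic and of the proposed filtering, for illustration only: getPassFactor mirrors the first-pass sizing logic in org.apache.hadoop.mapred.Merger, while the Segment stand-in and the filterEmpty helper are hypothetical, not existing Hadoop API.

{code:java}
import java.util.ArrayList;
import java.util.List;

public class MergePassSketch {

  // First-pass sizing, mirroring org.apache.hadoop.mapred.Merger:
  // merge fewer than 'factor' segments in the first pass so that all
  // later passes can merge exactly 'factor' segments each.
  static int getPassFactor(int factor, int passNo, int numSegments) {
    if (passNo > 1 || numSegments <= factor || factor == 1) {
      return factor;
    }
    int mod = (numSegments - 1) % (factor - 1);
    return mod == 0 ? factor : mod + 1;
  }

  // Hypothetical stand-in for a merge segment; only its length matters here.
  static class Segment {
    final long rawLength;
    Segment(long rawLength) { this.rawLength = rawLength; }
  }

  // Proposed fix (sketch): drop zero-length segments before the merger
  // ever sees them, so they cannot drag extra on-disk segments (and
  // their fds) into the first merge pass.
  static List<Segment> filterEmpty(List<Segment> segments) {
    List<Segment> nonEmpty = new ArrayList<Segment>();
    for (Segment s : segments) {
      if (s.rawLength > 0) {
        nonEmpty.add(s);
      }
    }
    return nonEmpty;
  }

  public static void main(String[] args) {
    // The example from the description: io.sort.factor=100 and
    // 1000 real on-disk outputs -> the first pass merges 10 of them.
    System.out.println(getPassFactor(100, 1, 1000)); // prints 10

    // 3000 empty in-mem segments plus one real one: after filtering,
    // only the real segment is left for the merger to consider.
    List<Segment> segments = new ArrayList<Segment>();
    for (int i = 0; i < 3000; i++) {
      segments.add(new Segment(0));
    }
    segments.add(new Segment(1024));
    System.out.println(filterEmpty(segments).size()); // prints 1
  }
}
{code}

Filtering the zero-length segments out up front keeps the merger from pulling extra on-disk map outputs (and their fds) into the first pass just to replace segments that turn out to be empty.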
> prevent Merger fd leak when there are lots of empty segments in mem
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1424
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1424
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: task
>            Reporter: Xing Shi
>

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.