[ https://issues.apache.org/jira/browse/MAPREDUCE-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xing Shi updated MAPREDUCE-1424:
--------------------------------

    Description: 
The Merger will open too many files on disk when there are too many empty 
segments in the shuffle memory.


We process large data sets, e.g. > 100 TB, in one job, and we use our own 
partitioner to partition the map output, so one map output is shuffled wholly 
to a single reduce. As a result, the other reduces get lots of empty segments.

                whole
mapOutput_n --+-------->  reduce1
              |  empty
              +-------->  reduce2
              |  empty
              +-------->  reduce3
              |  empty
              +-------->  reduce4
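
For illustration, here is a minimal sketch of the kind of partitioner described 
above (class and field names are hypothetical, not our actual code); every 
record of a map output carries the same group id, so the whole output lands on 
one reduce and the other reduces only ever see empty segments from it:

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.Partitioner;

    // Hypothetical partitioner: all records emitted by one map share the
    // same group id in the key, so the whole map output is shuffled to a
    // single reduce and every other reduce receives an empty segment.
    public class WholeOutputPartitioner implements Partitioner<LongWritable, Text> {
      public void configure(JobConf conf) {
      }

      public int getPartition(LongWritable groupId, Text value, int numPartitions) {
        // groupId is constant within one map output (and assumed non-negative),
        // so the modulo picks one and only one reduce for that entire output.
        return (int) (groupId.get() % numPartitions);
      }
    }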

Because our input data is so large, there are a lot of maps (about 10^5). 
Typically several thousand maps feed one reduce, so each reduce also receives 
several thousand empty segments.

For example:
     1000 map outputs (on disk) + 3000 empty segments (in memory)

Then, with io.sort.factor = 100:

    In the first merge cycle, the merger will merge 10 + 3000 segments 
[ by getPassFactor: (1000 - 1) % (100 - 1) + 1 = 10, plus the 3000 in-memory 
segments ]. Because there is no real data in memory, the remaining 990 map 
outputs are pulled in to replace the 3000 empty in-memory segments, so we end 
up opening 1000 fds.
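
A minimal sketch of that first-pass arithmetic, based on my reading of 
Merger#getPassFactor (this is just the calculation, not the real Merger code):

    // First-pass factor as I understand Merger#getPassFactor: the first pass
    // only takes enough segments to make the remaining passes line up with
    // io.sort.factor.
    static int firstPassFactor(int factor, int numSegments) {
      if (numSegments <= factor || factor == 1) {
        return factor;
      }
      int mod = (numSegments - 1) % (factor - 1);
      return mod == 0 ? factor : mod + 1;
    }

    // firstPassFactor(100, 1000) == 10: only 10 of the 1000 on-disk segments
    // should be merged first, but the 3000 empty in-memory segments are added
    // on top, and since they hold no data the other 990 on-disk segments get
    // opened as well -- 1000 open fds for a single reduce.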

    Once there are several reduces running on one TaskTracker, we will open 
several thousand fds.


    I think we can do a first pass over the segment collection to remove the 
empty segments; moreover, in the shuffle phase, we can avoid adding empty 
segments to memory in the first place.
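
A rough sketch of the first idea (a hypothetical helper, not a patch against 
the real Merger API): drop zero-length segments before they are handed to the 
merge queue, so the empty in-memory segments never force extra on-disk 
segments into the first pass:

    import java.util.ArrayList;
    import java.util.List;

    public class EmptySegmentFilter {
      /** Minimal stand-in for a merge segment; only its raw length matters here. */
      public interface HasLength {
        long getLength();
      }

      // Keep only the segments that actually contain data. Calling this on
      // the combined in-memory + on-disk segment list before the merge would
      // remove the 3000 empty segments from the example above.
      public static <S extends HasLength> List<S> dropEmpty(List<S> segments) {
        List<S> nonEmpty = new ArrayList<S>();
        for (S s : segments) {
          if (s.getLength() > 0) {
            nonEmpty.add(s);
          }
        }
        return nonEmpty;
      }
    }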



> prevent Merger fd leak when there are lots of empty segments in mem
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-1424
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1424
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: task
>            Reporter: Xing Shi
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
