[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-5124:
--------------------------------------

    Attachment: MAPREDUCE-5124-proto.2.txt

I attached a rough prototype that restricts onetime-style RPCs while keeping 
backward compatibility. The prototype includes the following changes:

1. adding a callType field to the RPC header to distinguish ONETIME calls from 
HEARTBEAT calls.
2. adding a new error code (ToBusyRetryLaterException).
3. adding a counter to Server#Handler that keeps the number of RPCs being 
processed within a high-water mark.
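
To make the three changes above easier to discuss, here is a minimal, 
standalone sketch of the idea. It is not code from the attached patch: the 
class FlowControlSketch, the nested Handler class, the begin/end methods and 
the highWaterMark parameter are all hypothetical names, and I am assuming 
that only ONETIME calls are counted while heartbeats are always admitted.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class FlowControlSketch {
    // Hypothetical call types that would be carried in the RPC header.
    enum CallType { ONETIME, HEARTBEAT }

    // Stand-in for the new error code; the real class may look different.
    static class ToBusyRetryLaterException extends Exception {
        ToBusyRetryLaterException(String message) { super(message); }
    }

    // Simplified stand-in for Server#Handler (assumption, not Hadoop code).
    static class Handler {
        private final int highWaterMark;
        private final AtomicInteger inFlightOnetime = new AtomicInteger();

        Handler(int highWaterMark) { this.highWaterMark = highWaterMark; }

        // Called before a call is processed; rejects ONETIME calls
        // once the number being processed would exceed the high-water mark.
        void begin(CallType type) throws ToBusyRetryLaterException {
            if (type != CallType.ONETIME) {
                return; // assumption: heartbeats are never throttled
            }
            if (inFlightOnetime.incrementAndGet() > highWaterMark) {
                inFlightOnetime.decrementAndGet(); // undo the reservation
                throw new ToBusyRetryLaterException("server busy, retry later");
            }
        }

        // Called after a call has been processed.
        void end(CallType type) {
            if (type == CallType.ONETIME) {
                inFlightOnetime.decrementAndGet();
            }
        }

        int inFlight() { return inFlightOnetime.get(); }
    }

    public static void main(String[] args) {
        Handler handler = new Handler(1);
        try {
            handler.begin(CallType.ONETIME);   // admitted
            handler.begin(CallType.HEARTBEAT); // admitted, not counted
            handler.begin(CallType.ONETIME);   // over the mark, rejected
        } catch (ToBusyRetryLaterException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A client that does not know about the new header field would presumably be 
handled as before, which is how backward compatibility is meant to be kept; 
the sketch does not model that part.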

Meanwhile, this prototype does NOT include:
1. test code.
2. a response that lets the server dynamically set the client's heartbeat 
period.

If this design is acceptable, I will prepare the next patch, which will 
include both of them. If you have any questions about the design, let me know.
                
> AM lacks flow control for task events
> -------------------------------------
>
>                 Key: MAPREDUCE-5124
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5124
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.0.3-alpha, 0.23.5
>            Reporter: Jason Lowe
>         Attachments: MAPREDUCE-5124-proto.2.txt, MAPREDUCE-5124-prototype.txt
>
>
> The AM does not have any flow control to limit the incoming rate of events 
> from tasks.  If the AM is unable to keep pace with the rate of incoming 
> events for a sufficient period of time then it will eventually exhaust the 
> heap and crash.  MAPREDUCE-5043 addressed a major bottleneck for event 
> processing, but the AM could still get behind if it's starved for CPU and/or 
> handling a very large job with tens of thousands of active tasks.
