[ 
https://issues.apache.org/jira/browse/TEZ-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136656#comment-14136656
 ] 

Siddharth Seth commented on TEZ-1589:
-------------------------------------

[~rajesh.balamohan] - this starts at 10ms, then degrades to 20ms, 30ms, 40ms, etc.
Instead of starting at 100ms, would it make sense to degrade this faster 
after the first few attempts, so that smaller jobs / clusters don't end up with 
a high ping rate?
When there are pending tasks for a job, I believe we typically end up getting a 
task assigned within the first 30ms of a task completing. I don't think the 
first heartbeat gets a task, since the AM takes some time to process the last 
completion message.
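
To make the suggestion concrete, here is a rough sketch of one possible 
schedule: a few quick linear steps, then doubling up to a cap. This is not the 
actual ContainerReporter code or the attached patch. The class name, the method 
names and the constants (3 fast attempts, doubling, 200ms cap taken from the 
issue description) are assumptions chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a getTask() backoff; not the real Tez code. */
class GetTaskBackoff {
    // All values below are assumptions for illustration.
    static final long STEP_MS = 10;        // linear step for the first attempts
    static final int FAST_ATTEMPTS = 3;    // attempts that stay on the linear step
    static final long MAX_DELAY_MS = 200;  // upper bound on the wait

    /** Returns the wait in ms before the given attempt (1-based). */
    static long delayForAttempt(int attempt) {
        if (attempt <= FAST_ATTEMPTS) {
            // 10, 20, 30: stay responsive while a task is likely to arrive.
            return Math.min(STEP_MS * attempt, MAX_DELAY_MS);
        }
        // After the fast attempts, double the wait each time, up to the cap.
        long delay = STEP_MS * FAST_ATTEMPTS;
        for (int i = FAST_ATTEMPTS; i < attempt; i++) {
            delay = Math.min(delay * 2, MAX_DELAY_MS);
        }
        return delay;
    }

    /** Returns the delays for attempts 1..attempts, in order. */
    static List<Long> schedule(int attempts) {
        List<Long> delays = new ArrayList<>();
        for (int a = 1; a <= attempts; a++) {
            delays.add(delayForAttempt(a));
        }
        return delays;
    }

    public static void main(String[] args) {
        // Prints the waits for the first 8 attempts.
        System.out.println(schedule(8));
    }
}
```

With these assumed values the first three waits stay within the 30ms window 
mentioned above, and an idle container reaches the cap after six attempts 
instead of continuing to ping every few tens of milliseconds.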

> ContainerReporter requests AM for getTask() too frequently (causing pressure 
> on AM side)
> ----------------------------------------------------------------------------------------
>
>                 Key: TEZ-1589
>                 URL: https://issues.apache.org/jira/browse/TEZ-1589
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>              Labels: performance
>         Attachments: TEZ-1589.1.patch
>
>
> Min time to wait before a task requests AM for another task is set to 200 ms. 
>  As per ContainerReporter->call() logic, it would be invoked every 10ms until 
> we reach 200ms.
> This wouldn't be much of a problem for small jobs.  But for large jobs with 
> many tasks, this will cause pressure on AM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
