[jira] [Commented] (YARN-110) AM releases too many containers due to the protocol

2015-12-21 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066946#comment-15066946
 ] 

Giovanni Matteo Fumarola commented on YARN-110:
---

[~kasha] My idea is to keep a list of requested containers in 
AppSchedulingInfo. 
When the RM sends containers to the AM and, in the same heartbeat, the AM asks 
for more containers, the added check forwards the correct number of containers 
to the capacity scheduler.

After the vacation I will rebase my patch and I will push it.
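
A minimal sketch of how such a check might behave, for readers following the thread. It is not the patch and not the real AppSchedulingInfo API: the class name `AppSchedulingInfoSketch`, its methods, and the idea of subtracting containers that were allocated but not yet delivered to the AM are all assumptions made for illustration.

```python
# Hypothetical model only; names and logic are illustrative assumptions,
# not code from YARN or from the YARN-110 patch.
class AppSchedulingInfoSketch:
    def __init__(self):
        self.pending = {}      # host -> containers still to be scheduled
        self.undelivered = {}  # host -> containers allocated, not yet seen by the AM

    def on_container_allocated(self, host):
        # The scheduler assigns one container only if something is pending.
        if self.pending.get(host, 0) > 0:
            self.pending[host] -= 1
            self.undelivered[host] = self.undelivered.get(host, 0) + 1

    def on_am_heartbeat(self, host, absolute_ask):
        # Containers handed over in this response were already counted in the
        # AM's absolute ask, so subtract them before forwarding the ask.
        in_flight = self.undelivered.pop(host, 0)
        self.pending[host] = max(0, absolute_ask - in_flight)
        return in_flight


def run_scenario():
    info = AppSchedulingInfoSketch()
    info.on_am_heartbeat("H1", 4)             # AM asks for 4 on H1
    for _ in range(4):
        info.on_container_allocated("H1")     # H1 gets 4 containers assigned
    delivered = info.on_am_heartbeat("H1", 7) # AM resends the absolute total of 7
    return delivered, info.pending["H1"]
```

Under these assumptions, `run_scenario()` delivers the 4 containers and forwards only the 3 that are still missing, so the AM ends with 7 rather than 11.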

> AM releases too many containers due to the protocol
> ---
>
> Key: YARN-110
> URL: https://issues.apache.org/jira/browse/YARN-110
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager, scheduler
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Attachments: YARN-110.patch
>
>
> - AM sends a request asking for 4 containers on host H1.
> - Asynchronously, host H1 reaches the RM and gets assigned 4 containers. At 
> this point, the RM sets the value against H1 to zero in its aggregate 
> request-table for all apps.
> - In the meantime, the AM comes to need 3 more containers, so a total of 7 
> including the 4 from the previous request.
> - Today, the AM sends the absolute number of 7 against H1 to the RM as part of 
> its request table.
> - The RM seems to override its earlier value of zero against H1 with 7, and 
> thus allocates 7 more containers.
> - The AM already gets 4 in this scheduling iteration, but gets 7 more, a total 
> of 11 instead of the required 7.
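
The sequence in the description can be replayed with a toy model. This is only a sketch: the `ResourceManager` class, its `pending` table, and the assumption that H1 always has room for everything pending on it are hypothetical simplifications, not YARN code.

```python
# Hypothetical model of the request table from the description; not YARN code.
class ResourceManager:
    def __init__(self):
        self.pending = {}  # host -> outstanding container count

    def update_request(self, host, absolute_count):
        # The ask carries an absolute number, so it overwrites the old value.
        self.pending[host] = absolute_count

    def node_heartbeat(self, host):
        # Assumption: the node can satisfy everything pending against it.
        granted = self.pending.get(host, 0)
        self.pending[host] = 0
        return granted


def simulate():
    rm = ResourceManager()
    rm.update_request("H1", 4)            # AM asks for 4 containers on H1
    received = rm.node_heartbeat("H1")    # H1 reaches the RM: 4 assigned, table set to 0
    # The AM has not yet seen those 4; it now needs 3 more and resends
    # the absolute total of 7, which overwrites the zero.
    rm.update_request("H1", 4 + 3)
    received += rm.node_heartbeat("H1")   # 7 more assigned
    return received
```

In this model `simulate()` returns 11, matching the description: 4 more than the 7 containers the AM actually needs.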



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-110) AM releases too many containers due to the protocol

2015-12-18 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065236#comment-15065236
 ] 

Karthik Kambatla commented on YARN-110:
---

Got it. I see the value in fixing it. Any proposals on how to do so?



[jira] [Commented] (YARN-110) AM releases too many containers due to the protocol

2015-12-15 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15057747#comment-15057747
 ] 

Karthik Kambatla commented on YARN-110:
---

bq. but it looks like this increases latencies for competing AMs
Not sure I fully understand this. Care to elaborate with a concrete example? 



[jira] [Commented] (YARN-110) AM releases too many containers due to the protocol

2015-12-15 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15057843#comment-15057843
 ] 

Arun Suresh commented on YARN-110:
--

Correct me if I am wrong, [~giovanni.fumarola].
Assume the situation described in the Description.
Currently, as per MAPREDUCE-4671, once an AM has received all its containers, 
the MR AM just sends an ask with a container count of 0 to cancel any 
outstanding requests. That implies that the RM has to wait for the next 
allocate call from the AM to free up any erroneously granted resources (7 
containers in the example), which means that any other AM running on a fully 
utilized cluster has to wait for an allocate heartbeat, or two, to satisfy its 
resource ask.



[jira] [Commented] (YARN-110) AM releases too many containers due to the protocol

2015-12-09 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049301#comment-15049301
 ] 

Arun Suresh commented on YARN-110:
--

[~ka...@cloudera.com], [~vinodkv], I understand from MAPREDUCE-4671 that the 
accounting burden for this has been pushed to the AM and that it will not pose 
a latency issue for the AM requesting the resources, but it looks like this 
increases latencies for competing AMs (they might have to wait for a subsequent 
allocate call for the resources). Also, custom AMs would need to be cognizant 
of this.

It also looks like [~giovanni.fumarola] is hitting this on some of the clusters 
he is working on. If [~acmurthy] is not actively looking into this, Giovanni 
would like to volunteer a patch.

Thoughts?



[jira] [Commented] (YARN-110) AM releases too many containers due to the protocol

2015-06-22 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596614#comment-14596614
 ] 

Giovanni Matteo Fumarola commented on YARN-110:
---

[~acmurthy], [~vinodkv] any updates on this? 
If you don't mind, can I work on this?
