[jira] [Commented] (YARN-779) AMRMClient should clean up dangling unsatisfied request

2013-09-22 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774293#comment-13774293
 ] 

Maysam Yabandeh commented on YARN-779:
--

Sure [~acmurthy]. Note taken for other jiras. But the current jira is still in 
the brainstorming phase and the attached patch is just a unit test showing the 
problem.

 AMRMClient should clean up dangling unsatisfied request
 ---

 Key: YARN-779
 URL: https://issues.apache.org/jira/browse/YARN-779
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.0.4-alpha
Reporter: Alejandro Abdelnur
Assignee: Maysam Yabandeh
Priority: Critical
 Attachments: YARN-779.patch, YARN-779.patch


 If an AMRMClient places a ContainerRequest for 10 containers on node1 or
 node2 (assuming a single rack), the resulting ResourceRequests will be:
 {code}
 location - containers
 ---------------------
 node1    - 10
 node2    - 10
 rack     - 10
 ANY      - 10
 {code}
 Assuming 5 containers are allocated on node1 and 5 containers are allocated
 on node2, the following ResourceRequests will be outstanding on the RM:
 {code}
 location - containers
 ---------------------
 node1    - 5
 node2    - 5
 {code}
 If the AMRMClient then adds a new ContainerRequest, this time for 5
 containers on node3, the resulting outstanding ResourceRequests on the RM
 will be:
 {code}
 location - containers
 ---------------------
 node1    - 5
 node2    - 5
 node3    - 5
 rack     - 5
 ANY      - 5
 {code}
 At this point, the scheduler may assign 5 containers to node1 and never
 assign the 5 containers that were requested on node3.
 AMRMClient should keep track of the outstanding allocation count per
 ContainerRequest and, when it gets to zero, update the RACK/ANY requests by
 decrementing the dangling requests.
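
The three tables above can be reproduced with a small simulation. This is only a sketch: `OutstandingTable` and its methods are hypothetical stand-ins for the per-location bookkeeping, not the actual AMRMClient or AppSchedulingInfo code, and the decrement rule (a node-local allocation decrements the node, rack and ANY entries, and leaves the sibling node untouched) is an assumption read off the tables.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the per-location outstanding counts; not Hadoop code.
class OutstandingTable {
    static final String RACK = "rack";
    static final String ANY = "ANY";

    final Map<String, Integer> counts = new LinkedHashMap<>();

    // A ContainerRequest for n containers on any of `nodes` adds n to every
    // listed node and to the rack and ANY entries (single rack assumed).
    void addContainerRequest(List<String> nodes, int n) {
        for (String node : nodes) {
            counts.merge(node, n, Integer::sum);
        }
        counts.merge(RACK, n, Integer::sum);
        counts.merge(ANY, n, Integer::sum);
    }

    // Assumed rule: a node-local allocation decrements the node, rack and ANY
    // entries, but not the other nodes named by the same ContainerRequest.
    // An entry that reaches zero is dropped from the table.
    void allocateOn(String node) {
        for (String key : List.of(node, RACK, ANY)) {
            counts.computeIfPresent(key, (k, v) -> v > 1 ? v - 1 : null);
        }
    }

    // Replays the scenario from the issue description.
    static Map<String, Integer> danglingScenario() {
        OutstandingTable table = new OutstandingTable();
        table.addContainerRequest(List.of("node1", "node2"), 10);
        for (int i = 0; i < 5; i++) {
            table.allocateOn("node1");
            table.allocateOn("node2");
        }
        table.addContainerRequest(List.of("node3"), 5);
        return table.counts;
    }
}
```

In this model `danglingScenario()` ends with node1 and node2 still at 5 next to the new node3, rack and ANY entries, so nothing stops a scheduler from spending the 5 rack/ANY slots on node1.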

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-779) AMRMClient should clean up dangling unsatisfied request

2013-08-24 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749341#comment-13749341
 ] 

Maysam Yabandeh commented on YARN-779:
--

I am thinking perhaps we can solve the problem without needing a complete
change in the API. Since we are using Protocol Buffers, we can freely add new
fields to the message.

What we need is a way to express, in a set of ResourceRequests, the disjunction
between the containers requested in a ContainerRequest. For that we can use a
locally unique resourceRequestId generated in AMRMClientImpl.java. For
example, if an application requires one container in (node1 || node2),
#addContainerRequest decomposes it into two ResourceRequests tagged with
the same resourceRequestId:
* ResourceRequest(node1, id1234);
* ResourceRequest(node2, id1234);

Later, when the ResourceManager services a ResourceRequest with ID id1234, it
can update all other ResourceRequests from the same application that carry
the same ID. Thanks to Protocol Buffers, there will be no inconsistency
between old/new clients and new/old servers.

Feedback is appreciated.
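
To make the tagging idea concrete, here is a minimal sketch. Everything in it is hypothetical: `TaggedRequest`, `TaggedRequestBook` and the `resourceRequestId` field are illustrative names, not existing YARN classes or protobuf fields, and for brevity one class plays both the client side (decomposition) and the RM side (servicing).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical ResourceRequest carrying the proposed resourceRequestId tag.
class TaggedRequest {
    final String location;
    final long resourceRequestId;
    int numContainers;

    TaggedRequest(String location, long resourceRequestId, int numContainers) {
        this.location = location;
        this.resourceRequestId = resourceRequestId;
        this.numContainers = numContainers;
    }
}

// Hypothetical bookkeeping for a single application.
class TaggedRequestBook {
    private final List<TaggedRequest> outstanding = new ArrayList<>();
    private long nextId = 1;

    // Client side: decompose "n containers on any of these nodes" into
    // per-node requests that all share one locally unique id.
    long addContainerRequest(List<String> nodes, int n) {
        long id = nextId++;
        for (String node : nodes) {
            outstanding.add(new TaggedRequest(node, id, n));
        }
        return id;
    }

    // RM side: servicing a request on `node` decrements every request that
    // carries the same id, so no sibling is left dangling.
    boolean allocateOn(String node) {
        for (TaggedRequest r : outstanding) {
            if (r.location.equals(node) && r.numContainers > 0) {
                for (TaggedRequest sibling : outstanding) {
                    if (sibling.resourceRequestId == r.resourceRequestId) {
                        sibling.numContainers--;
                    }
                }
                return true;
            }
        }
        return false;
    }

    // Total containers still outstanding on `node`.
    int outstandingOn(String node) {
        int total = 0;
        for (TaggedRequest r : outstanding) {
            if (r.location.equals(node)) {
                total += r.numContainers;
            }
        }
        return total;
    }
}
```

With one container requested on (node1 || node2), allocating on node1 also brings the node2 request to zero, so a later offer from node2 is refused.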



[jira] [Commented] (YARN-779) AMRMClient should clean up dangling unsatisfied request

2013-06-16 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684839#comment-13684839
 ] 

Sandy Ryza commented on YARN-779:
-

Ah, I misunderstood how removeContainerRequest works.  That gets rid of my 
concerns.



[jira] [Commented] (YARN-779) AMRMClient should clean up dangling unsatisfied request

2013-06-15 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684104#comment-13684104
 ] 

Maysam Yabandeh commented on YARN-779:
--

Thanks [~sandyr]. Let me run my understanding of the problem by you, to ensure
that we are on the same page. The reported erroneous scenario could be
addressed by resetting the outstanding requests at the RM whenever ANY gets to 0.
The actual problem, however, still remains, since the AMRMClient receives a
ContainerRequest and decomposes it into independent ResourceRequests. The
information about the disjunction between the requested resources is thus not
available at the RM to properly maintain the list of outstanding requests.
Building on top of the original example, here is the erroneous scenario:

{code}
@AMRMClient
ContainerRequest(..., {node1, node2}, ..., 10)
ContainerRequest(..., {node3}, ..., 5)
{code}

The internal state at the RM will be:

{code}
@AppSchedulingInfo
Resource  #
-----------
node1     10
node2     10
node3     5
ANY       15
{code}

In other words, the original request of (10*(node1 or node2)) and 5*node3
could be interpreted in a different way, such as 10*node1 and (5*(node2 or
node3)). If my understanding is correct, then the solution lies in changing the
API between AM and RM to also send the original disjunction between the
requested resources. We then need to change AppSchedulingInfo to properly
maintain the added information. Does this make sense?
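
The ambiguity can be illustrated with a toy scheduler that sees only the flattened counts from the table above. This is a sketch under stated assumptions: `FlattenedCounts` is a hypothetical class, not AppSchedulingInfo, and its rule (a node may receive a container while both its own count and ANY are positive) is a deliberate simplification.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical RM-side view holding only the flattened per-location counts.
class FlattenedCounts {
    final Map<String, Integer> perNode = new LinkedHashMap<>();
    int any;

    FlattenedCounts() {
        // State from the @AppSchedulingInfo table above.
        perNode.put("node1", 10);
        perNode.put("node2", 10);
        perNode.put("node3", 5);
        any = 15;
    }

    // Simplified rule: allocate on `node` while both its own count and the
    // ANY count are positive.
    boolean tryAllocate(String node) {
        int left = perNode.getOrDefault(node, 0);
        if (left == 0 || any == 0) {
            return false;
        }
        perNode.put(node, left - 1);
        any--;
        return true;
    }

    // node1 and node2 each offer ten slots; node3 never offers any.
    static Map<String, Integer> allocatedWhenNode3IsBusy() {
        FlattenedCounts rm = new FlattenedCounts();
        Map<String, Integer> allocated = new LinkedHashMap<>();
        for (String node : new String[] {"node1", "node2"}) {
            for (int i = 0; i < 10; i++) {
                if (rm.tryAllocate(node)) {
                    allocated.merge(node, 1, Integer::sum);
                }
            }
        }
        return allocated;
    }
}
```

Under these assumptions the application ends up with 10 containers on node1 and 5 on node2, which exhausts ANY and leaves nothing for node3, even though the AM asked for at most 10 containers across node1 and node2.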





[jira] [Commented] (YARN-779) AMRMClient should clean up dangling unsatisfied request

2013-06-15 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684249#comment-13684249
 ] 

Sandy Ryza commented on YARN-779:
-

[~maysamyabandeh], I follow you until the end.  What API changes do you have in 
mind? i.e. what would be required to send the disjunction between requested 
resources that is not available now?



[jira] [Commented] (YARN-779) AMRMClient should clean up dangling unsatisfied request

2013-06-15 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684413#comment-13684413
 ] 

Sandy Ryza commented on YARN-779:
-

The MR AM has solved this problem purely on the AM side (can't remember the 
JIRA number, but I'll post it when I find it), so I think it should be possible 
to do this without changing the RM or the AMRM protocol.  The basic issue is 
that, when a container is given to the app, we need to associate it with a 
ContainerRequest so that we can cancel the right resource requests.  In 
general, the AMRMClient cannot automatically perform this association.  
Consider a situation where an app needs two tasks, one on node1 or node2, and 
one on node2 or node3.  When the app receives a container on node2, it will 
assign it to one of these tasks, but only the app knows which task it is 
assigning it to.  So we need some sort of API for the app to communicate this 
knowledge to the AMRMClient.
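
The two-task example can be sketched as follows. The names `TaskRequest` and `AppSideMatcher` are hypothetical, the "first pending task that accepts the node" policy is just one choice an app might make, and recording the task in `cancelled` stands in for whatever call tells the AMRMClient which request was satisfied (removeContainerRequest is mentioned elsewhere in this thread).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical app-side record: one task and the nodes it can run on.
class TaskRequest {
    final String task;
    final Set<String> nodes;

    TaskRequest(String task, Set<String> nodes) {
        this.task = task;
        this.nodes = nodes;
    }
}

// Hypothetical app-side matcher; only the app knows which task a container
// is meant for, so the association has to happen here.
class AppSideMatcher {
    private final List<TaskRequest> pending = new ArrayList<>();
    // Tasks whose requests the app would ask the client library to cancel.
    final List<String> cancelled = new ArrayList<>();

    void add(TaskRequest request) {
        pending.add(request);
    }

    // Assign the container to the first pending task that accepts `node`
    // and record that its request must be cancelled. Returns null when no
    // pending task can use the node.
    TaskRequest onContainerAllocated(String node) {
        Iterator<TaskRequest> it = pending.iterator();
        while (it.hasNext()) {
            TaskRequest candidate = it.next();
            if (candidate.nodes.contains(node)) {
                it.remove();
                cancelled.add(candidate.task);
                return candidate;
            }
        }
        return null;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

With taskA on {node1, node2} and taskB on {node2, node3}, a container on node2 could serve either task. This sketch picks taskA, and only because the app made that choice does it know to cancel taskA's requests rather than taskB's.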



[jira] [Commented] (YARN-779) AMRMClient should clean up dangling unsatisfied request

2013-06-15 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684548#comment-13684548
 ] 

Bikas Saha commented on YARN-779:
-

That API is AMRMClient.removeContainerRequest().



[jira] [Commented] (YARN-779) AMRMClient should clean up dangling unsatisfied request

2013-06-14 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683407#comment-13683407
 ] 

Maysam Yabandeh commented on YARN-779:
--

Can you please explain the erroneous scenario a bit more? I understand how the
state of table 1 is created, but I am wondering which method call exactly
changes the state to what is depicted in table 2. The only public methods that
manipulate the state are addContainerRequest and removeContainerRequest,
neither of which seems able to perform such a state transformation.



[jira] [Commented] (YARN-779) AMRMClient should clean up dangling unsatisfied request

2013-06-14 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683510#comment-13683510
 ] 

Sandy Ryza commented on YARN-779:
-

Table 2 reflects the state on the RM, which is updated whenever a container is 
allocated (see AppSchedulingInfo.java).



[jira] [Commented] (YARN-779) AMRMClient should clean up dangling unsatisfied request

2013-06-06 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677644#comment-13677644
 ] 

Alejandro Abdelnur commented on YARN-779:
-

It may be that if a ContainerRequest has been partially allocated and is then
removed, the count also goes out of sync.
