[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2016-01-04 Thread Subru Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082312#comment-15082312
 ] 

Subru Krishnan commented on YARN-3870:
--

Regarding the ID, I am in principle fine with asking the AM to set it. We do 
have the option of reusing the _responseID_ of *AllocateRequest* which both the 
RM and AM maintain today. It would be good to also link the _responseID_ to the 
actual allocated container in *AllocateResponse* as this is a useful hint for 
the AMs. In fact has been requested by [~markus.weimer] to simplify certain 
bookkeeping for the [REEF | http://reef.apache.org/ ] AM.

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>Assignee: Karthik Kambatla
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2016-01-04 Thread Lei Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081242#comment-15081242
 ] 

Lei Guo commented on YARN-3870:
---

I am not against to combine this JIRA and YARN-371, There is common ground 
between these two JIRAs. And more likely the final technical solution will be 
single solution to cover both, though it's not necessary. Maybe we can view 
YARN-371 as a technical speculation and YARN-3870 as one related use case (if 
YARN-371 is resolved, YARN-3870 should be covered). 

>From another angle, YARN-3870 could be resolved via approaches without ID. The 
>scheduling is more care about the current snapshot of resource requests from 
>applications. It's not mandatory to have the ID, as long as the snapshot can 
>provide detailed resource request information, scheduler can do fine 
>scheduling. The ID will mainly help to prevent/handle issues from asynchronous 
>protocol.

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>Assignee: Karthik Kambatla
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2016-01-03 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15080363#comment-15080363
 ] 

Karthik Kambatla commented on YARN-3870:


I am not saying we shouldn't do this. I am only saying we should likely discuss 
this on YARN-371 so folks there are aware of this conversation. 

bq. even without an ID, what happens today if an application makes each request 
with a different Priority ?
In that case, we would still run into too many ResourceRequests, however this 
is unlikely to happen on a cluster that isn't compromised.

Outside of this, today, we hold a lot more information about all running 
containers. Storing resource-requests for outstanding requests might not be 
prohibitively expensive. 



> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>Assignee: Karthik Kambatla
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2016-01-02 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15076642#comment-15076642
 ] 

Arun Suresh commented on YARN-3870:
---

+1 to having the AM create the ID (which I guess is required if the AM is to 
co-relate a resourcereq with a container response). Can we set the ID that is 
something of a combination of {{app_attempt_id + seq_id}}, where seq_id is 
incremented monotonically per app attempt ? to distinguish between different 
app attempts.

Also, I guess the outstanding requests for an app needs to be stored in the 
state store so subsequent app attempts can be intimated of unfullfulled reqs


> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>Assignee: Karthik Kambatla
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2016-01-02 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15076629#comment-15076629
 ] 

Karthik Kambatla commented on YARN-3870:


bq. Karthik Kambatla, I am thinking whether we also need some update for the 
response part to correlate it with the ResourceRequest ID. As the scheduling is 
asynchronous, AM will also need to know the relation between response and 
request.

bq. If the AM doesn't add an ID, the RM could add one. Or, we could have the RM 
add the IDs and return them to the AM for help with book keeping.

Thought more about this. Since one AllocateRequest could have multiple 
ResourceRequests, the protocol becomes quite complicated if the RM creates an 
ID instead of the AM. How about we expect the AM to set this ID? If the AM 
doesn't set, we treat the requests the same way we do today (ID = -1). In the 
AllocateResponse, the RM could send the last received ResourceRequest. The AM 
could look at this ACK to see if it has to resend the requests? The AMRMClient 
and MR-AM could be updated to do this. 

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2016-01-02 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15076743#comment-15076743
 ] 

Karthik Kambatla commented on YARN-3870:


Was fleshing this out further. The number of IDs and hence ResourceRequests 
could be O(num. outstanding containers) which could pose problems as outlined 
in YARN-371. In fact, this JIRA is a duplicate of YARN-371: may be, we should 
close this and continue the discussion there. 

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>Assignee: Karthik Kambatla
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2016-01-02 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15076746#comment-15076746
 ] 

Arun Suresh commented on YARN-3870:
---

Hmmm... not sure entirely sure the argument holds, or I might be missing 
something. For eg. even without an ID, what happens today if an application 
makes each request with a different Priority ?

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>Assignee: Karthik Kambatla
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2016-01-02 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15079334#comment-15079334
 ] 

Karthik Kambatla commented on YARN-3870:


[~leftnoteasy] - agree that YARN-4485 doesn't necessarily need ID-ing all 
requests corresponding to one task. 

The JIRAs are related only due to the data-structures in AppSchedulingInfo:
# If we add IDs as discussed here, we don't need any other data-structure 
changes except adding a timestamp to each ResourceRequest. 
# If we don't add IDs, we might need to store the resource-requests as 
Map Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>Assignee: Karthik Kambatla
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2016-01-02 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15076775#comment-15076775
 ] 

Wangda Tan commented on YARN-3870:
--

[~kasha],

bq. Was fleshing this out further. The number of IDs and hence ResourceRequests 
could be O(num. outstanding containers) which could pose problems as outlined 
in YARN-371. In fact, this JIRA is a duplicate of YARN-371: may be, we should 
close this and continue the discussion there.

As I mentioned above, I think we shouldn't combine YARN-371 and YARN-4485 
together: YARN-4485 is more like an internal change of scheduler to me:

Let's say an AM originally requests 1000 container (T1), then AM requests 1200 
containers (T2), then after scheduler allocated 100 containers, AM requests 
1200 containers again (T3).

For the original request, scheduler records: T1, 1000.
After T2, scheduler records: T1, 1000; T2, 200.
After T3, scheduler records: T1, 900 (scheduler allocates 100 containers); T2, 
200; T3, 100.

Instead recording timestamps for all resource requests, AM only needs to record 
timestamp to #pending-requests. And scheduler will "dequeue" from the timestamp 
to #pending-requests (sorted by time) when container allocated.

Like what you said, it will be hard to ask AM to set the ID, but scheduler 
should easily set it. But this solution needs more work if we want to save 
these timestamps when RM restart.

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>Assignee: Karthik Kambatla
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2015-12-31 Thread Lei Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075991#comment-15075991
 ] 

Lei Guo commented on YARN-3870:
---

[~kasha], I am thinking whether we also need some update for the response part 
to correlate it with the ResourceRequest ID. As the scheduling is asynchronous, 
AM will also need to know the relation between response and request.

Another question is whether we should make this detailed information as 
parameter controlled, as it may increase the AM/RM communication overhead.

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2015-12-30 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075740#comment-15075740
 ] 

Karthik Kambatla commented on YARN-3870:


bq. I think this JIRA and YARN-4485 are different:

bq. I think we cannot simply update timestamp when new resource request 
arrives. For example, at T1, AM asks 100 * 1G container; after 2 mins (T2), 
assume there's no container allocated, AM asks 100 * 1G container, we cannot 
say the resource request is added at T2. Instead, we should only set new 
timestamp for incremental asks

For YARN-4485, I was planning on taking this exact approach. While the two 
JIRAs and their purposes are different, the ability to identify a set of 
requests that arrived at one point in time requires similar updates to the data 
structures we use in AppSchedulingInfo. 

bq. what do you think of using  for id?
Timestamps for Ids might not be a good idea especially when an AM can restart. 
Also, there might be merit to differentiating two ResourceRequests (say, at 
different priorities) received at the same time. 

Discussed this with [~asuresh] and [~subru] offline. We felt the following 
changes would help us address multiple JIRAs (as [~xinxianyin] listed):
# Add an ID field to ResourceRequest - this can be a sequence number for each 
application. On AM restarts, a subsequent attempt could choose to resume from 
appropriate sequence number. If the AM doesn't add an ID, the RM could add one. 
Or, we could have the RM add the IDs and return them to the AM for help with 
book keeping. 
# YARN-4485 would likely want to add a timestamp in addition to this. Given the 
IDs, we likely don't have to do special delta handling. 
# In case the number of containers in the existing ResourceRequest increases, 
the delta is given a new ID. e.g  - e.g. App increases request from 3 
containers to 7 containers of same capability etc., the first three would have 
ID '1' and the next four would have ID '2'. 
# In case the number of containers corresponding to an existing ResourceRequest 
decreases, the number of containers is reduced from the largest ID to the 
smallest ID until the decrease is accounted for. e.g. If an app asks for 3, 7 
and 2 containers in subsequent allocate calls, once these calls are processed, 
the app has 2 containers with ID '1'. 
# The resource-request data structure in AppSchedulingInfo will be this 
{{Map Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to 

[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2015-12-29 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074755#comment-15074755
 ] 

Xianyin Xin commented on YARN-3870:
---

+1 for adding an unique id for a resource request, but i would suggest we 
consider these kind of problems in a more systematic way, considering YARN-314, 
YARN-1042, YARN-371, YARN-4485 and this.

Like my comment in YARN-314, a natural way the scheduler works should like a 
factory, it receives orders, and prepare for that. Once we accept the work 
philosophy, we'll find it's natural and necessary for a resource order has the 
following dimensions
1. order id, which can identify an order, and can get overdue, or has a time 
limit;
2. priority;
3. a collection of request unit, each specifies a kind of resource request,that 
should have a coordinate of ;
4. relaxLocality;
5. canbeDecomposed, or ifGangScheduling;
6. ...
Scheduler do scheduling based on order form, and should not swallow any 
information passed from the app.

Any thoughts?

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2015-12-29 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074658#comment-15074658
 ] 

Karthik Kambatla commented on YARN-3870:


+1 to improving the way we are storing ResourceRequests in AppSchedulingInfo. 

In the context of YARN-4485, I would like to put a timestamp on when each 
ResourceRequest was received. We can use this to determine the container 
allocation latency at allocation-time. Today, we store all ResourceRequests for 
the same priority and locality together irrespective of whether the requests 
came all together or separate.

[~asuresh] - what do you think of using  for id? I 
forget the details - how do we handle an AM restart, do we create a new 
AppSchedulingInfo? If so, we can just use the timestamp as is. 

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2015-12-29 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074685#comment-15074685
 ] 

Wangda Tan commented on YARN-3870:
--

Hi [~kasha]/[~asuresh],

I think this JIRA and YARN-4485 are different:
- This JIRA is focus on "resource request group", like the example in 
description.
{code}
c1: host1, host2, host4
c2: host2, host3, host5
{code}
Current resource request cannot store "group by" info.
- YARN-4485 is focus on timestamp:
Some random thoughts for YARN-4485,
I think we cannot simply update timestamp when new resource request arrives. 
For example, at T1, AM asks 100 * 1G container; after 2 mins (T2), assume 
there's no container allocated, AM asks 100 * 1G container, we cannot say the 
resource request is added at T2. Instead, we should only set new timestamp for 
incremental asks (e.g. AM asks 120 * 1G container at T2, we should say, 100 
resource requests from T1, and 20 resource requests from T2).

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2015-12-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15070186#comment-15070186
 ] 

Wangda Tan commented on YARN-3870:
--

Hi [~grey],

Thanks for raising this, we definitely need such mechanism to better describe 
our resource request.

[~asuresh], I'm not sure how the unique id works? Are you planing to add it as 
a key to AppSchedulingInfo resource requests map? (e.g. {{Map = >>}})

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2015-12-23 Thread Subru Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15070178#comment-15070178
 ] 

Subru Krishnan commented on YARN-3870:
--

+1 on this.

Thanks [~grey] for raising this. I have been having offline discussions with 
[~asuresh] and [~curino] around Distributed Scheduling (YARN-2877) and 
Federation (YARN-2915). In both scenarios, sending the raw container request 
and letting the RM expand will save us a lot of pain as currently we are 
finding it very difficult to route requests correctly in the AMRMProxy 
(YARN-2844) 

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2015-12-23 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15070279#comment-15070279
 ] 

Arun Suresh commented on YARN-3870:
---

[~leftnoteasy], With respect to the AM, I was thinking.. just having it as a 
field in the ReseourceRequest as well as the Container (returned by the 
allocate call) would suffice.
>From the perspective of the Scheduler, yes, {{Map = >>}} was the direction I was thinking.. Correct me if I 
>am wrong, but, currently, there is an implicit understanding that all 
>resources requests for the same resource requirement should have the same 
>priority. Having an explicit request id would allow us to remove that 
>constraint as well..

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2015-12-22 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15069208#comment-15069208
 ] 

Arun Suresh commented on YARN-3870:
---

Thank you for starting this discussion [~grey]

Correct me if I am wrong, what you are proposing, I guess is some way for the 
Scheduler to co-relate the expanded Resource Requests. I do feel this would be 
genuinely useful, not only from a Scheduling perspective for eg. making 
affinity / anti-afinity scheduling decisions viz. YARN-1042. This will also 
greatly help improving pre-emption decisions in the FairScheduler viz. 
YARN-2154.. 

This would also be extremely useful for AMs too. Currently the MRAM does the 
book keeping and matches an allocated container to ResourceRequest. AMs can be 
generally relieved of this job if an allocated Container Token can easily be 
matched against a Resource Request.

One possible approach could be to have the AMClient generate a unique id for a 
Resource request and tag each of the expanded requests (Node, Rack and ANY) 
with this id. This Id can then be passed around in the 
Container/ContainerTokenIdentifier.

[~ka...@cloudera.com], [~vinodkv], [~leftnoteasy], Thoughts ?

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)