[jira] [Commented] (YARN-7903) Method getStarvedResourceRequests() only consider the first encountered resource

2018-02-08 Thread Steven Rand (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357923#comment-16357923
 ] 

Steven Rand commented on YARN-7903:
---

Agreed that having a concept of delay scheduling for preemption is a good idea 
and would help with both JIRAs. We might be able to use 
{{FSAppAttempt.getAllowedLocalityLevel}} or 
{{FSAppAttempt.getAllowedLocalityLevelByTime}}, since those already have logic 
for checking whether the app has waited longer than the threshold for requests 
with some {{SchedulerRequestKey}} (which seems to really just mean priority?). I'll 
defer to others though on whether it makes sense for delay logic in preemption 
to match delay logic in allocation -- possibly there are differences between 
the two that call for separate logic.
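
To make the idea of time-based delay for preemption concrete, here is a minimal sketch. It is not the logic in {{FSAppAttempt.getAllowedLocalityLevelByTime}}; the class {{PreemptionLocalityDelay}}, its {{Locality}} enum, and the two delay parameters are hypothetical names chosen for illustration, and the cumulative-threshold behavior is a simplifying assumption.

```java
/**
 * Hypothetical sketch of delay scheduling applied to preemption.
 * Not a YARN class; all names and thresholds are assumptions.
 */
final class PreemptionLocalityDelay {
  enum Locality { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

  private final long nodeDelayMs;
  private final long rackDelayMs;

  PreemptionLocalityDelay(long nodeDelayMs, long rackDelayMs) {
    this.nodeDelayMs = nodeDelayMs;
    this.rackDelayMs = rackDelayMs;
  }

  /**
   * Returns the loosest locality level that preemption may target for an
   * app that has been starved for the given time. The app first waits
   * nodeDelayMs at NODE_LOCAL, then a further rackDelayMs at RACK_LOCAL.
   */
  Locality allowedLevel(long starvedForMs) {
    if (starvedForMs < nodeDelayMs) {
      return Locality.NODE_LOCAL;
    }
    if (starvedForMs < nodeDelayMs + rackDelayMs) {
      return Locality.RACK_LOCAL;
    }
    return Locality.OFF_SWITCH;
  }
}
```

With delays of 1000 ms and 2000 ms, an app starved for 500 ms would only be preempted for on its requested node, while one starved for 3000 ms or more could be preempted for anywhere. Whether preemption should share thresholds with allocation is exactly the open question raised above.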

I'm also quite confused as to how we should be thinking about different RRs 
from the same app at the same priority. I spent some time digging through the 
code today, but don't really understand it yet. There are a couple pieces of 
code I found that deal with deduping/deconflicting RRs, but I wasn't sure how 
to interpret them:

* {{VisitedResourceRequestTracker}} seems to consider RRs with the same 
priority and capability to be logically the same
* {{AppSchedulingInfo#internalAddResourceRequests}} seems to consider RRs with 
the same {{SchedulerRequestKey}} and resourceName to be logically the same
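
The difference between the two notions of "same request" listed above can be illustrated with a small sketch. The {{Request}} class and both counting methods are hypothetical stand-ins, not YARN APIs, and the scheduler key is reduced to a bare priority for simplicity.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of two ways to decide that RRs are "the same". */
final class RequestGrouping {
  /** Simplified stand-in for a ResourceRequest; fields are illustrative only. */
  static final class Request {
    final int priority;
    final int memoryMb;
    final String resourceName;

    Request(int priority, int memoryMb, String resourceName) {
      this.priority = priority;
      this.memoryMb = memoryMb;
      this.resourceName = resourceName;
    }
  }

  /** Counts distinct requests when identity is (priority, capability). */
  static int countByPriorityAndCapability(List<Request> requests) {
    Set<List<Object>> keys = new HashSet<>();
    for (Request r : requests) {
      keys.add(Arrays.<Object>asList(r.priority, r.memoryMb));
    }
    return keys.size();
  }

  /** Counts distinct requests when identity is (priority, resourceName). */
  static int countByPriorityAndResourceName(List<Request> requests) {
    Set<List<Object>> keys = new HashSet<>();
    for (Request r : requests) {
      keys.add(Arrays.<Object>asList(r.priority, r.resourceName));
    }
    return keys.size();
  }
}
```

For a node, rack, and ANY request at the same priority and size, the first grouping yields one logical request and the second yields three, which is the ambiguity described above.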

> Method getStarvedResourceRequests() only consider the first encountered 
> resource
> 
>
>                 Key: YARN-7903
>                 URL: https://issues.apache.org/jira/browse/YARN-7903
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 3.1.0
>            Reporter: Yufei Gu
>            Priority: Major
>
> We need to specify the rack and ANY when submitting a node-local resource 
> request, as discussed in YARN-7561. For example:
> {code}
> ResourceRequest nodeRequest =
>     createResourceRequest(GB, node1.getHostName(), 1, 1, false);
> ResourceRequest rackRequest =
>     createResourceRequest(GB, node1.getRackName(), 1, 1, false);
> ResourceRequest anyRequest =
>     createResourceRequest(GB, ResourceRequest.ANY, 1, 1, false);
> List<ResourceRequest> resourceRequests =
>     Arrays.asList(nodeRequest, rackRequest, anyRequest);
> {code}
> However, the method getStarvedResourceRequests() only considers the first 
> resource request it encounters, which is most likely ResourceRequest.ANY. 
> That's a mismatch for a locality request.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7903) Method getStarvedResourceRequests() only consider the first encountered resource

2018-02-08 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357515#comment-16357515
 ] 

Yufei Gu commented on YARN-7903:


{quote}
I think that we should try to make progress on that JIRA as well as this one.
{quote}
Agreed.
The resource request in the description is actually one RR rather than three. 
If the RR is strict about locality (RelaxLocality is false), I don't think we 
need to consider other RRs. It's a different story, however, if the RR's 
RelaxLocality is true. Enabling some kind of delay scheduling for preemption 
seems a reasonable solution for both YARN-6956 and this JIRA. 

I am still confused about how to parse the multiple RRs of an app. For 
instance, the example in the description is actually one RR rather than three, 
but what if the size and number of containers differ between nodeRequest and 
rackRequest? Do we consider them one RR or multiple RRs? Without a good 
understanding of this, I don't think we can make any progress on YARN-6956. 
Please let me know your thoughts on this, [~Steven Rand]. I'll try to dig into 
it more as well. Thanks.




[jira] [Commented] (YARN-7903) Method getStarvedResourceRequests() only consider the first encountered resource

2018-02-07 Thread Steven Rand (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356327#comment-16356327
 ] 

Steven Rand commented on YARN-7903:
---

Agreed that it seems weird/wrong to ignore locality when considering which of 
an app's RRs to preempt for. I think it's worth noting though that if we change 
the code to choose the most local request, then we increase the frequency of 
the failure mode described in YARN-6956, where we fail to preempt because 
{{getStarvedResourceRequests}} returns only {{NODE_LOCAL}} RRs, and there 
aren't any preemptable containers on those nodes (even though there are 
preemptable containers on other nodes). I think that we should try to make 
progress on that JIRA as well as this one.
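
The trade-off can be illustrated with a small sketch. Nothing here is FairScheduler code: {{LocalityPreemptionDemo}} and its methods are hypothetical, requests are reduced to their resource names, and the sketch assumes that rack names start with "/" and that ANY is the string "*".

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of preferring the most local RR for preemption. */
final class LocalityPreemptionDemo {
  static final String ANY = "*";

  /** Ranks a resource name: 0 = node, 1 = rack, 2 = ANY (assumed naming). */
  static int localityRank(String resourceName) {
    if (ANY.equals(resourceName)) {
      return 2;
    }
    return resourceName.startsWith("/") ? 1 : 0;
  }

  /** Picks the most local resource name among an app's requests. */
  static String mostLocal(List<String> resourceNames) {
    return Collections.min(resourceNames,
        Comparator.comparingInt(LocalityPreemptionDemo::localityRank));
  }

  /**
   * Returns true if some node matching resourceName (the node itself, a node
   * on that rack, or any node for ANY) has at least one preemptable container.
   */
  static boolean canPreemptFor(String resourceName,
      Map<String, String> rackOfNode, Map<String, Integer> preemptableByNode) {
    for (Map.Entry<String, Integer> e : preemptableByNode.entrySet()) {
      String node = e.getKey();
      boolean matches = ANY.equals(resourceName)
          || node.equals(resourceName)
          || resourceName.equals(rackOfNode.get(node));
      if (matches && e.getValue() > 0) {
        return true;
      }
    }
    return false;
  }
}
```

If an app asks for node1, its rack, and ANY, {{mostLocal}} selects node1. When node1 has no preemptable containers but a node on another rack does, {{canPreemptFor}} is false for node1 and for its rack yet true for ANY, which is the YARN-6956 failure mode in miniature.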
