[jira] [Commented] (YARN-7903) Method getStarvedResourceRequests() only consider the first encountered resource
[ https://issues.apache.org/jira/browse/YARN-7903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357923#comment-16357923 ]

Steven Rand commented on YARN-7903:
-----------------------------------

Agreed that having a concept of delay scheduling for preemption is a good idea and would help with both JIRAs. We might be able to use {{FSAppAttempt.getAllowedLocalityLevel}} or {{FSAppAttempt.getAllowedLocalityLevelByTime}}, since those already have logic for checking whether the app has waited longer than the threshold for requests with some {{SchedulerKey}} (which seems to really just mean priority?). I'll defer to others though on whether it makes sense for delay logic in preemption to match delay logic in allocation -- possibly there are differences between the two that call for separate logic.

I'm also quite confused as to how we should be thinking about different RRs from the same app at the same priority. I spent some time digging through the code today, but don't really understand it yet. There are a couple pieces of code I found that deal with deduping/deconflicting RRs, but I wasn't sure how to interpret them:

* {{VisitedResourceRequestTracker}} seems to consider RRs with the same priority and capability to be logically the same
* {{AppSchedulingInfo#internalAddResourceRequests}} seems to consider RRs with the same {{SchedulerRequestKey}} and resourceName to be logically the same

> Method getStarvedResourceRequests() only consider the first encountered resource
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-7903
>                 URL: https://issues.apache.org/jira/browse/YARN-7903
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 3.1.0
>            Reporter: Yufei Gu
>            Priority: Major
>
> We need to specify rack and ANY while submitting a node local resource request, as YARN-7561 discussed. For example:
> {code}
> ResourceRequest nodeRequest =
>     createResourceRequest(GB, node1.getHostName(), 1, 1, false);
> ResourceRequest rackRequest =
>     createResourceRequest(GB, node1.getRackName(), 1, 1, false);
> ResourceRequest anyRequest =
>     createResourceRequest(GB, ResourceRequest.ANY, 1, 1, false);
> List<ResourceRequest> resourceRequests =
>     Arrays.asList(nodeRequest, rackRequest, anyRequest);
> {code}
> However, method getStarvedResourceRequests() only consider the first encountered resource, which most likely is ResourceRequest.ANY. That's a mismatch for locality request.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
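
To make the two deduplication readings in the comment above concrete, here is a minimal sketch. It is not the real YARN code: the {{Request}} class, the host and rack names, and both counting helpers are hypothetical, and {{SchedulerRequestKey}} is reduced to the priority alone, following the comment's guess. It counts how many "logically distinct" requests the description's three RRs collapse into under each notion of identity.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model. None of these types are the real YARN classes.
class RequestIdentitySketch {

  /** Stand-in for a resource request: priority, memory, and resource name. */
  static final class Request {
    final int priority;
    final int memoryMb;
    final String resourceName;

    Request(int priority, int memoryMb, String resourceName) {
      this.priority = priority;
      this.memoryMb = memoryMb;
      this.resourceName = resourceName;
    }
  }

  /** Distinct requests when identity is (priority, capability). */
  static int countByPriorityAndCapability(List<Request> requests) {
    Set<String> keys = new HashSet<>();
    for (Request r : requests) {
      keys.add(r.priority + "|" + r.memoryMb);
    }
    return keys.size();
  }

  /** Distinct requests when identity is (priority, resourceName). */
  static int countByPriorityAndResourceName(List<Request> requests) {
    Set<String> keys = new HashSet<>();
    for (Request r : requests) {
      keys.add(r.priority + "|" + r.resourceName);
    }
    return keys.size();
  }

  /** Node, rack, and ANY requests modeled on the issue description. */
  static List<Request> descriptionExample() {
    return Arrays.asList(
        new Request(1, 1024, "node1"),   // assumed host name
        new Request(1, 1024, "/rack1"),  // assumed rack name
        new Request(1, 1024, "*"));      // "*" is the value of ResourceRequest.ANY
  }

  public static void main(String[] args) {
    List<Request> rrs = descriptionExample();
    System.out.println(countByPriorityAndCapability(rrs));   // prints 1
    System.out.println(countByPriorityAndResourceName(rrs)); // prints 3
  }
}
```

Under these assumptions the same three requests count as one request by the first rule and as three by the second, which is the ambiguity the comment raises.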
[jira] [Commented] (YARN-7903) Method getStarvedResourceRequests() only consider the first encountered resource
[ https://issues.apache.org/jira/browse/YARN-7903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357515#comment-16357515 ]

Yufei Gu commented on YARN-7903:
--------------------------------

{quote}
I think that we should try to make progress on that JIRA as well as this one.
{quote}
Agreed. The request in the description is actually one RR rather than three RRs. If the RR is strict about locality (RelaxLocality is false), I don't think there is any need to consider other RRs. However, it's a different story if the RR's RelaxLocality is true. Enabling some kind of delay scheduling for preemption seems like a reasonable solution for both YARN-6956 and this JIRA.

I am still confused about how to interpret multiple RRs from one app. For example, the description's example is really one RR rather than three, but what if the size and number of containers differ between nodeRequest and rackRequest? Do we consider them one RR or multiple RRs? Without a good understanding of this, I don't think we can make any progress on YARN-6956.

Please let me know your thoughts about this. I'll try to dig into this more as well. Thanks, [~Steven Rand].
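
As a rough illustration of what "delay scheduling for preemption" could mean, the sketch below widens the locality level that preemption may target as the request waits longer, and never widens it for a strict request. Everything here is hypothetical: the {{Locality}} enum, the {{allowedLevel}} helper, and the cumulative millisecond thresholds are assumptions for illustration and do not reproduce the logic of {{FSAppAttempt.getAllowedLocalityLevelByTime}}.

```java
// Hypothetical sketch of a wait-time-based locality gate for preemption.
class PreemptionDelaySketch {

  /** Locality levels, ordered from most to least local. */
  enum Locality { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

  /**
   * Returns the least local level that preemption may target for a request.
   *
   * @param waitedMs      how long the request has been starved
   * @param nodeDelayMs   assumed wait before rack-local preemption is allowed
   * @param rackDelayMs   assumed additional wait before off-switch is allowed
   * @param relaxLocality false means the request is strict about locality
   */
  static Locality allowedLevel(long waitedMs, long nodeDelayMs,
                               long rackDelayMs, boolean relaxLocality) {
    if (!relaxLocality) {
      // A strict request never falls back to a less local level.
      return Locality.NODE_LOCAL;
    }
    if (waitedMs < nodeDelayMs) {
      return Locality.NODE_LOCAL;
    }
    if (waitedMs < nodeDelayMs + rackDelayMs) {
      return Locality.RACK_LOCAL;
    }
    return Locality.OFF_SWITCH;
  }

  public static void main(String[] args) {
    System.out.println(allowedLevel(0, 1000, 2000, true));      // prints NODE_LOCAL
    System.out.println(allowedLevel(1500, 1000, 2000, true));   // prints RACK_LOCAL
    System.out.println(allowedLevel(3000, 1000, 2000, true));   // prints OFF_SWITCH
    System.out.println(allowedLevel(10000, 1000, 2000, false)); // prints NODE_LOCAL
  }
}
```

Whether preemption should reuse the allocation thresholds or use its own is exactly the open question in this thread; the sketch takes no position on it.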
[jira] [Commented] (YARN-7903) Method getStarvedResourceRequests() only consider the first encountered resource
[ https://issues.apache.org/jira/browse/YARN-7903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356327#comment-16356327 ]

Steven Rand commented on YARN-7903:
-----------------------------------

Agreed that it seems weird/wrong to ignore locality when considering which of an app's RRs to preempt for.

I think it's worth noting though that if we change the code to choose the most local request, then we increase the frequency of the failure mode described in YARN-6956, where we fail to preempt because {{getStarvedResourceRequests}} returns only {{NODE_LOCAL}} RRs, and there aren't any preemptable containers on those nodes (even though there are preemptable containers on other nodes). I think that we should try to make progress on that JIRA as well as this one.
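
The failure mode described above can be shown with a toy model. This is only a sketch under stated assumptions: the cluster is reduced to a hypothetical map from node name to the number of preemptable containers, and {{pickNodeLocalOnly}} and {{pickWithFallback}} are invented helpers, not FairScheduler methods.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy model of choosing a node to preempt on.
class LocalityPreemptionSketch {

  /** Considers only the requested node; returns null if nothing is preemptable there. */
  static String pickNodeLocalOnly(Map<String, Integer> preemptable, String requestedNode) {
    Integer count = preemptable.get(requestedNode);
    return (count != null && count > 0) ? requestedNode : null;
  }

  /** Tries the requested node first, then any node with preemptable containers. */
  static String pickWithFallback(Map<String, Integer> preemptable, String requestedNode) {
    String local = pickNodeLocalOnly(preemptable, requestedNode);
    if (local != null) {
      return local;
    }
    for (Map.Entry<String, Integer> entry : preemptable.entrySet()) {
      if (entry.getValue() > 0) {
        return entry.getKey();
      }
    }
    return null;
  }

  /** Assumed cluster state: node1 has nothing to preempt, node2 has three containers. */
  static Map<String, Integer> exampleCluster() {
    Map<String, Integer> cluster = new LinkedHashMap<>();
    cluster.put("node1", 0);
    cluster.put("node2", 3);
    return cluster;
  }

  public static void main(String[] args) {
    Map<String, Integer> cluster = exampleCluster();
    System.out.println(pickNodeLocalOnly(cluster, "node1")); // prints null
    System.out.println(pickWithFallback(cluster, "node1"));  // prints node2
  }
}
```

With only the node-local request considered, nothing is preempted even though node2 has preemptable containers. Some form of fallback avoids that, at the cost of the locality the request asked for.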