[ https://issues.apache.org/jira/browse/GOBBLIN-1728?focusedWorklogId=819787&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-819787 ]
ASF GitHub Bot logged work on GOBBLIN-1728:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 24/Oct/22 17:17
Start Date: 24/Oct/22 17:17
Worklog Time Spent: 10m
Work Description: homatthew commented on code in PR #3586:
URL: https://github.com/apache/gobblin/pull/3586#discussion_r1003565204
##########
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnService.java:
##########
@@ -889,11 +902,11 @@ public void onContainersAllocated(List<Container> containers) {
allocatedContainerCountMap.putIfAbsent(containerHelixTag, new AtomicInteger(0));
allocatedContainerCountMap.get(containerHelixTag).incrementAndGet();
- // Find matching requests and remove the request to reduce the chance that a subsequent request
- // will request extra containers. YARN does not have a delta request API and the requests are not
- // cleaned up automatically.
+ // Find matching requests and remove the request (YARN-660). We the scheduler are responsible
Review Comment:
https://issues.apache.org/jira/browse/YARN-660?focusedCommentId=13655384&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13655384
> This jira takes up the problem of helping schedulers find matching
> requests for allocated containers.
>
> ...
>
> Now the API becomes elegant and intuitive. addContainerRequest() to add
> requests. getMatchingRequests() to get all matching requests. Pick a
> container from matching requests and call removeContainerRequest() with it
> to remove it. That is all that there is to it. There are some other minor
> fixes to AMRMClientAsync. We are on the same page wrt this being a needed
> functionality.
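
The add / match / remove cycle described in the quoted YARN-660 comment can be sketched roughly as below. This is a self-contained illustration, not Hadoop code: RequestTable, ContainerRequest and onContainerAllocated are hypothetical in-memory stand-ins, and only the three method names are borrowed from the quote. The real AMRMClient methods have different signatures, so treat this as the shape of the bookkeeping only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a YARN client's request bookkeeping.
// It is NOT the Hadoop AMRMClient; only the method names mirror the quote.
class MatchingRequestSketch {

    /** Hypothetical request, described only by the capability it asks for. */
    static final class ContainerRequest {
        final int memoryMb;
        final int vcores;

        ContainerRequest(int memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    /** Hypothetical table of outstanding requests, grouped by capability. */
    static final class RequestTable {
        private final Map<String, List<ContainerRequest>> byCapability = new HashMap<>();

        private static String key(int memoryMb, int vcores) {
            return memoryMb + "mb/" + vcores + "vc";
        }

        void addContainerRequest(ContainerRequest request) {
            byCapability
                .computeIfAbsent(key(request.memoryMb, request.vcores), k -> new ArrayList<>())
                .add(request);
        }

        /** Returns a copy of the outstanding requests with exactly this capability. */
        List<ContainerRequest> getMatchingRequests(int memoryMb, int vcores) {
            return new ArrayList<>(
                byCapability.getOrDefault(key(memoryMb, vcores), new ArrayList<>()));
        }

        /** Removes this exact request instance, if it is still outstanding. */
        void removeContainerRequest(ContainerRequest request) {
            List<ContainerRequest> requests =
                byCapability.get(key(request.memoryMb, request.vcores));
            if (requests != null) {
                requests.remove(request);
            }
        }

        int outstanding() {
            int total = 0;
            for (List<ContainerRequest> requests : byCapability.values()) {
                total += requests.size();
            }
            return total;
        }
    }

    /**
     * On allocation, retire one matching request so that it is not counted
     * (or re-sent) again. Returns false when nothing matched.
     */
    static boolean onContainerAllocated(RequestTable table, int memoryMb, int vcores) {
        List<ContainerRequest> matches = table.getMatchingRequests(memoryMb, vcores);
        if (matches.isEmpty()) {
            return false;
        }
        table.removeContainerRequest(matches.get(0));
        return true;
    }

    public static void main(String[] args) {
        RequestTable table = new RequestTable();
        table.addContainerRequest(new ContainerRequest(4096, 2));
        table.addContainerRequest(new ContainerRequest(4096, 2));
        onContainerAllocated(table, 4096, 2);
        System.out.println("outstanding after one allocation: " + table.outstanding());
    }
}
```

The point of the sketch is that the request table does not shrink on its own: if the allocation callback skips the remove step, the stale request is still outstanding the next time the caller counts requests.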
Issue Time Tracking
-------------------
Worklog Id: (was: 819787)
Time Spent: 4h 50m (was: 4h 40m)
> Yarn Service requests too many containers due to improper calculation
> ---------------------------------------------------------------------
>
> Key: GOBBLIN-1728
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1728
> Project: Apache Gobblin
> Issue Type: New Feature
> Reporter: Matthew Ho
> Priority: Major
> Time Spent: 4h 50m
> Remaining Estimate: 0h
>
> Yarn Service is responsible for calculating the number of instances based on
> the helix tasks. Yarn service tracks the number of instances by asking Yarn
> for the number of resource requests and the number of allocated containers.
>
> It uses this count to determine if it should ask for more containers or
> shrink the number of containers. This calculation is currently done
> improperly and we continue to request containers when we have enough
> requested.
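
One way to read the calculation described above, as a minimal sketch: it assumes that both allocated containers and still-pending requests count toward the target, so a new request is sent only for the remaining gap. The class, method and parameter names are hypothetical and are not taken from Gobblin's YarnService.

```java
// Hypothetical sketch of the container-count arithmetic; not Gobblin code.
class ContainerDeltaSketch {

    /**
     * Positive result: number of additional containers to request.
     * Negative result: number of containers that could be released.
     * Zero: allocated containers plus pending requests already cover the target.
     */
    static int containerDelta(int desired, int allocated, int pendingRequests) {
        return desired - (allocated + pendingRequests);
    }

    public static void main(String[] args) {
        // Target of 10, with 6 allocated and 4 already requested: nothing more to ask for.
        System.out.println(containerDelta(10, 6, 4));
        // Ignoring the 4 pending requests would instead suggest asking for 4 more.
        System.out.println(containerDelta(10, 6, 0));
    }
}
```

Under this reading, leaving pending requests out of the sum (or counting stale ones that were never removed) skews the result, which matches the over-requesting symptom in the issue title.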