[
https://issues.apache.org/jira/browse/GOBBLIN-1728?focusedWorklogId=818904&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-818904
]
ASF GitHub Bot logged work on GOBBLIN-1728:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 20/Oct/22 19:02
Start Date: 20/Oct/22 19:02
Worklog Time Spent: 10m
Work Description: hanghangliu commented on code in PR #3586:
URL: https://github.com/apache/gobblin/pull/3586#discussion_r1000919913
##########
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnAutoScalingManager.java:
##########
@@ -266,7 +266,7 @@ void runInternal() {
}
slidingWindowReservoir.add(yarnContainerRequestBundle);
- log.debug("There are {} containers being requested in total, tag-count
map {}, tag-resource map {}",
+ log.info("There are {} containers being requested in total, tag-count
map {}, tag-resource map {}",
Review Comment:
This log seems very similar to the log in
YarnService.requestTargetNumberOfContainers() "Current tag-container desired
count:". Maybe we can just use that one and add tag-resource map there?
##########
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnService.java:
##########
@@ -474,11 +477,21 @@ public synchronized void
requestTargetNumberOfContainers(YarnContainerRequestBun
for (Map.Entry<String, Integer> entry :
yarnContainerRequestBundle.getHelixTagContainerCountMap().entrySet()) {
String currentHelixTag = entry.getKey();
int desiredContainerCount = entry.getValue();
+ Resource resourceForHelixTag =
yarnContainerRequestBundle.getHelixTagResourceMap().get(currentHelixTag);
+
// Calculate requested container count based on adding allocated count
and outstanding ContainerRequests in Yarn
- int requestedContainerCount =
allocatedContainerCountMap.getOrDefault(currentHelixTag, 0)
- +
getMatchingRequestsCount(yarnContainerRequestBundle.getHelixTagResourceMap().get(currentHelixTag));
- for(; requestedContainerCount < desiredContainerCount;
requestedContainerCount++) {
- requestContainer(Optional.absent(),
yarnContainerRequestBundle.getHelixTagResourceMap().get(currentHelixTag));
+ allocatedContainerCountMap.putIfAbsent(currentHelixTag, new
AtomicInteger(0));
+ int allocatedContainers =
allocatedContainerCountMap.get(currentHelixTag).get();
+ int outstandingContainerRequests =
getMatchingRequestsCount(resourceForHelixTag);
+ int requestedContainerCount = allocatedContainers +
outstandingContainerRequests;
+ int numContainersNeeded = desiredContainerCount -
requestedContainerCount;
+ LOGGER.info("helixTag={}, allocatedContainers={},
outstandingContainerRequests={}, desiredContainerCount={},
numContainersNeeded={}",
+ currentHelixTag, allocatedContainers, outstandingContainerRequests,
desiredContainerCount, numContainersNeeded);
+
+ if (numContainersNeeded > 0) {
+ requestContainers(numContainersNeeded, resourceForHelixTag);
+ } else {
+ LOGGER.info("Not requesting any containers because
numContainersNeeded={} which is not > 0", numContainersNeeded);
Review Comment:
maybe can set this as debug? As the previous log should already indicate the
behavior
##########
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnService.java:
##########
@@ -857,6 +887,8 @@ public void onContainersAllocated(List<Container>
containers) {
ContainerInfo containerInfo = new ContainerInfo(container,
instanceName, containerHelixTag);
containerMap.put(container.getId(), containerInfo);
+ allocatedContainerCountMap.putIfAbsent(containerHelixTag, new
AtomicInteger(0));
Review Comment:
Not sure if this is still needed, as it should already been added in line
483 inside requestTargetNumberOfContainers. But no harm to add though
##########
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnService.java:
##########
@@ -474,11 +477,21 @@ public synchronized void
requestTargetNumberOfContainers(YarnContainerRequestBun
for (Map.Entry<String, Integer> entry :
yarnContainerRequestBundle.getHelixTagContainerCountMap().entrySet()) {
String currentHelixTag = entry.getKey();
int desiredContainerCount = entry.getValue();
+ Resource resourceForHelixTag =
yarnContainerRequestBundle.getHelixTagResourceMap().get(currentHelixTag);
+
// Calculate requested container count based on adding allocated count
and outstanding ContainerRequests in Yarn
- int requestedContainerCount =
allocatedContainerCountMap.getOrDefault(currentHelixTag, 0)
- +
getMatchingRequestsCount(yarnContainerRequestBundle.getHelixTagResourceMap().get(currentHelixTag));
- for(; requestedContainerCount < desiredContainerCount;
requestedContainerCount++) {
- requestContainer(Optional.absent(),
yarnContainerRequestBundle.getHelixTagResourceMap().get(currentHelixTag));
+ allocatedContainerCountMap.putIfAbsent(currentHelixTag, new
AtomicInteger(0));
+ int allocatedContainers =
allocatedContainerCountMap.get(currentHelixTag).get();
+ int outstandingContainerRequests =
getMatchingRequestsCount(resourceForHelixTag);
+ int requestedContainerCount = allocatedContainers +
outstandingContainerRequests;
Review Comment:
allocatedContainers looks confusing as we have another variable
numAllocatedContainers. Maybe just directly calculate requestedContainerCount =
allocatedContainerCountMap.get(currentHelixTag).get() +
getMatchingRequestsCount(resourceForHelixTag); ? Or you have any better idea?
##########
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnService.java:
##########
@@ -845,8 +877,6 @@ public void onContainersAllocated(List<Container>
containers) {
instanceName = null;
}
}
- allocatedContainerCountMap.put(containerHelixTag,
Review Comment:
I'm okay with your change, but the original way should also work
Issue Time Tracking
-------------------
Worklog Id: (was: 818904)
Time Spent: 2h 20m (was: 2h 10m)
> Yarn Service requests too many containers due to improper calculation
> ---------------------------------------------------------------------
>
> Key: GOBBLIN-1728
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1728
> Project: Apache Gobblin
> Issue Type: New Feature
> Reporter: Matthew Ho
> Priority: Major
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> Yarn Service is responsible for calculating the number of instances based on
> the helix tasks. Yarn service tracks the number of instances by asking Yarn
> for the number of resource requests and the number of allocated containers.
>
> It uses this count to determine if it should ask for more containers or
> shrink the number of containers. This calculation is currently done
> improperly and we continue to request containers when we have enough
> requested.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)