homatthew commented on code in PR #3586:
URL: https://github.com/apache/gobblin/pull/3586#discussion_r999752146
##########
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnService.java:
##########
@@ -474,11 +477,21 @@ public synchronized void
requestTargetNumberOfContainers(YarnContainerRequestBun
for (Map.Entry<String, Integer> entry :
yarnContainerRequestBundle.getHelixTagContainerCountMap().entrySet()) {
String currentHelixTag = entry.getKey();
int desiredContainerCount = entry.getValue();
+ Resource resourceForHelixTag =
yarnContainerRequestBundle.getHelixTagResourceMap().get(currentHelixTag);
+
// Calculate requested container count based on adding allocated count
and outstanding ContainerRequests in Yarn
- int requestedContainerCount =
allocatedContainerCountMap.getOrDefault(currentHelixTag, 0)
- +
getMatchingRequestsCount(yarnContainerRequestBundle.getHelixTagResourceMap().get(currentHelixTag));
- for(; requestedContainerCount < desiredContainerCount;
requestedContainerCount++) {
- requestContainer(Optional.absent(),
yarnContainerRequestBundle.getHelixTagResourceMap().get(currentHelixTag));
+ allocatedContainerCountMap.putIfAbsent(currentHelixTag, new
AtomicInteger(0));
Review Comment:
The main logical change is changing the value type from `Integer` to
`AtomicInteger`. This prevents potential desync between our
allocatedContainerCountMap (HelixTag -> numCountainers) and the containerMap
(containerId -> Container POJO) due to race condition.
I suspect the incorrect value for allocated container map is one of the
reasons we saw incorrect behavior with shrink + allocating. This edge case
would be generally rare and that's why we only start to see issues after a
large number of days and the issue is resolved after a restart.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]