[
https://issues.apache.org/jira/browse/GOBBLIN-1781?focusedWorklogId=845250&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-845250
]
ASF GitHub Bot logged work on GOBBLIN-1781:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 14/Feb/23 00:57
Start Date: 14/Feb/23 00:57
Worklog Time Spent: 10m
Work Description: Will-Lo commented on code in PR #3638:
URL: https://github.com/apache/gobblin/pull/3638#discussion_r1105137753
##########
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/YarnService.java:
##########
@@ -462,10 +464,16 @@ private EventSubmitter buildEventSubmitter() {
*
* @param yarnContainerRequestBundle the desired containers information,
including numbers, resource and helix tag
* @param inUseInstances a set of in use instances
+ * @return whether the requestTargetNumberOfContainers function has executed
yet
Review Comment:
I would argue that the return value is whether or not the requested number
of containers could actually be obtained after service initialization.
Describing the execution itself is ambiguous, since the function still
executes when you return false.
Issue Time Tracking
-------------------
Worklog Id: (was: 845250)
Time Spent: 1.5h (was: 1h 20m)
> Helix offline instance purging is not thread safe in the yarn service
> ---------------------------------------------------------------------
>
> Key: GOBBLIN-1781
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1781
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Andy Jiang
> Priority: Major
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> Helix instances are purged during startup of the yarn service. This operation
> must be done without new helix instances being added or removed (i.e. the API
> call is not thread safe).
>
> The current implementation blocks the yarn service from allocating initial
> containers while the helix instance purging is enabled, but it does not
> prevent other external services from requesting containers through its public
> methods.
> These 2 services start up concurrently, and it's possible that the
> AutoScalingYarnManager starts up before the Yarn Service has completely
> finished purging. This can lead the AutoScalingYarnManager to
> requestContainers while the instances are still purging.
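
The race described above can be illustrated with a minimal sketch. This is not the code from PR #3638: the class name, the startupInProgress flag, finishStartup and getTargetContainers are hypothetical names chosen for illustration, and the purge itself is passed in as a plain Runnable rather than a real Helix call. The idea is only that the public request method and the startup purge share one lock, and that a request arriving before the purge has finished is rejected and reports that through its return value.

```java
/** Hypothetical stand-in for the yarn service; not the actual YarnService code. */
class PurgeGatedContainerRequester {
  // True until the startup purge of offline Helix instances has finished.
  // Guarded by the instance lock (all accessors are synchronized).
  private boolean startupInProgress = true;
  private int targetContainers = 0;

  /**
   * Runs the purge and only then opens the gate for external callers.
   * The purge runs under the same lock as the request method, so the two
   * can never interleave.
   *
   * @param purgeOfflineInstances assumed placeholder for the real purge call
   */
  synchronized void finishStartup(Runnable purgeOfflineInstances) {
    purgeOfflineInstances.run();
    startupInProgress = false;
  }

  /**
   * @param numContainers the desired number of containers
   * @return true if the request was accepted after service initialization,
   *         false if it was ignored because startup purging has not finished
   */
  synchronized boolean requestTargetNumberOfContainers(int numContainers) {
    if (startupInProgress) {
      // The method still executes, but the request is not acted on.
      return false;
    }
    targetContainers = numContainers;
    return true;
  }

  synchronized int getTargetContainers() {
    return targetContainers;
  }
}
```

Under these assumptions, a caller such as an autoscaler that receives false would simply retry on its next cycle instead of assuming the containers were requested.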
--
This message was sent by Atlassian Jira
(v8.20.10#820010)