alirezazamani commented on a change in pull request #1326:
URL: https://github.com/apache/helix/pull/1326#discussion_r480375367
##########
File path:
helix-core/src/main/java/org/apache/helix/controller/stages/task/TaskSchedulingStage.java
##########
@@ -110,117 +103,9 @@ private BestPossibleStateOutput compute(ClusterEvent
event, Map<String, Resource
restOfResources.remove(jobName);
}
- // Current rest of resources including: only current state left over ones
- // Original resource map contains workflows + jobs + other invalid
resources
- // After removing workflows + jobs, only leftover ones will go over old
rebalance pipeline.
- for (Resource resource : restOfResources.values()) {
Review comment:
I think previously, when currentState exist and configs are missing, we
were relying on this legacy code to remove the currentState (DROP the task).
However, since you removed this part, how can we ensure that functionality?
##########
File path:
helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java
##########
@@ -518,7 +519,7 @@ protected void handleJobTimeout(JobContext jobCtx,
WorkflowContext workflowCtx,
}
_clusterStatusMonitor.updateJobCounters(jobCfg, TaskState.TIMED_OUT);
_rebalanceScheduler.removeScheduledRebalance(jobResource);
- TaskUtil.cleanupJobIdealStateExtView(_manager.getHelixDataAccessor(),
jobResource);
+ RebalanceUtil.scheduleOnDemandPipeline(_manager.getClusterName(),0L,false);
Review comment:
Maybe you can add comment for onDemand rebalances and point out why you
add this line? Something like making sure next pipeline runs and context get
updated maybe?
##########
File path: helix-core/src/main/java/org/apache/helix/task/JobDispatcher.java
##########
@@ -85,15 +86,15 @@ public ResourceAssignment
processJobStatusUpdateAndAssignment(String jobName,
// completed)
TaskState workflowState = workflowCtx.getWorkflowState();
TaskState jobState = workflowCtx.getJobState(jobName);
- // The job is already in a final state (completed/failed).
+ // Do not include workflowState == TIMED_OUT here, as later logic needs to
handle this case
Review comment:
I am assuming this is not related to this PR? Anyways it is fine to have
this comment, I just want to make sure I understand it correctly.
##########
File path: helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
##########
@@ -746,13 +705,11 @@ public void deleteAndWaitForCompletion(String workflow,
long timeout)
BaseDataAccessor baseDataAccessor = _accessor.getBaseDataAccessor();
PropertyKey.Builder keyBuilder = _accessor.keyBuilder();
- String idealStatePath = keyBuilder.idealStates(workflow).getPath();
String workflowConfigPath = keyBuilder.resourceConfig(workflow).getPath();
String workflowContextPath =
keyBuilder.workflowContext(workflow).getPath();
while (System.currentTimeMillis() <= endTime) {
- if (baseDataAccessor.exists(idealStatePath, AccessOption.PERSISTENT)
- || baseDataAccessor.exists(workflowConfigPath,
AccessOption.PERSISTENT)
+ if (baseDataAccessor.exists(workflowConfigPath, AccessOption.PERSISTENT)
Review comment:
Can we keep TaskDriver to clean up IS for a while? What would happen to
the workflows/jobs that are running if we switch to new controller? Are these
IS are getting deleted anyways? What do you think?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]