NealSun96 commented on a change in pull request #1076:
URL: https://github.com/apache/helix/pull/1076#discussion_r437766166



##########
File path: 
helix-core/src/main/java/org/apache/helix/controller/stages/TaskGarbageCollectionStage.java
##########
@@ -23,34 +27,53 @@ public AsyncWorkerType getAsyncWorkerType() {
   }
 
   @Override
-  public void execute(ClusterEvent event) {
+  public void process(ClusterEvent event) throws Exception {
     WorkflowControllerDataProvider dataProvider =
         event.getAttribute(AttributeName.ControllerDataProvider.name());
+    event.addAttribute(AttributeName.WORKFLOW_CONFIG_MAP.name(),
+        dataProvider.getWorkflowConfigMap());

Review comment:
       I understand your proposal better now. After looking at it 
carefully, we can ditch the ZK reads and preprocess with lightweight logic. In 
the preprocess section of the stage, we can loop through all WorkflowContexts 
and figure out 1. which workflows need to be garbage collected, and 2. which 
jobs need to be purged. All of this work can be done against the cache.
   
   For your point 1, you did mention that we can do the calculation in the 
"rebalance pipeline" (which I assume is a synonym for TaskSchedulingStage). I'd 
prefer doing the calculation in TaskGarbageCollectionStage for code clarity, 
and the overhead isn't much because it's a single loop through the workflows 
using in-memory operations. What do you think? 
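   To make the idea concrete, here is a minimal sketch of that single preprocess loop. All class, field, and method names below (`WorkflowState`, `GcPlan`, `preprocess`) are hypothetical stand-ins, not the actual Helix API; the point is only that one in-memory pass over cached state is enough to split workflows into "garbage collect" and "purge completed jobs" buckets:

   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.Map;

   // Hypothetical sketch of the preprocess loop described above: one in-memory
   // pass over cached workflow state, no ZK reads. Names are illustrative only.
   public class GcPreprocessSketch {

     // Minimal stand-in for the cached per-workflow state the stage would read.
     public static final class WorkflowState {
       final String name;
       final boolean terminal;           // workflow reached a terminal state
       final long finishTimeMs;          // when it finished (if terminal)
       final long expiryMs;              // how long to keep it after finishing
       final List<String> completedJobs; // jobs eligible for purging

       public WorkflowState(String name, boolean terminal, long finishTimeMs,
           long expiryMs, List<String> completedJobs) {
         this.name = name;
         this.terminal = terminal;
         this.finishTimeMs = finishTimeMs;
         this.expiryMs = expiryMs;
         this.completedJobs = completedJobs;
       }
     }

     // The two outputs of the preprocess step.
     public static final class GcPlan {
       public final List<String> workflowsToGc = new ArrayList<>();
       public final List<String> jobsToPurge = new ArrayList<>();
     }

     // Single loop over the cache: decide 1) which workflows to garbage
     // collect and 2) which completed jobs to purge from still-live workflows.
     public static GcPlan preprocess(Map<String, WorkflowState> cache, long nowMs) {
       GcPlan plan = new GcPlan();
       for (WorkflowState ws : cache.values()) {
         if (ws.terminal && nowMs - ws.finishTimeMs > ws.expiryMs) {
           // Expired terminal workflow: collect the whole workflow.
           plan.workflowsToGc.add(ws.name);
         } else {
           // Live workflow: only its completed jobs are purged.
           plan.jobsToPurge.addAll(ws.completedJobs);
         }
       }
       return plan;
     }
   }
   ```

   The overhead is exactly one O(#workflows) loop over in-memory objects, which is why doing it in the GC stage rather than the rebalance pipeline costs little.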




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
