[ 
https://issues.apache.org/jira/browse/GOBBLIN-1837?focusedWorklogId=864303&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-864303
 ]

ASF GitHub Bot logged work on GOBBLIN-1837:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/Jun/23 22:39
            Start Date: 07/Jun/23 22:39
    Worklog Time Spent: 10m 
      Work Description: phet commented on code in PR #3700:
URL: https://github.com/apache/gobblin/pull/3700#discussion_r1222198550


##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/restli/GobblinServiceFlowExecutionResourceHandlerWithWarmStandby.java:
##########
@@ -54,27 +53,26 @@ public void resume(ComplexResourceKey<org.apache.gobblin.service.FlowStatusId, E
     String flowName = key.getKey().getFlowName();
     Long flowExecutionId = key.getKey().getFlowExecutionId();
     try {
-      // If an existing resume or kill request is still pending then do not accept this request
-      if (this.dagActionStore.exists(flowGroup, flowName, flowExecutionId.toString())) {
-        DagActionStore.DagActionValue action = this.dagActionStore.getDagAction(flowGroup, flowName, flowExecutionId.toString()).getDagActionValue();
-        this.handleException(flowGroup, flowName, flowExecutionId.toString(),
-            new RuntimeException("There is already a pending action " + action + " for this flow. Please wait to resubmit and wait for"
+      // If an existing resume request is still pending then do not accept this request
+      if (this.dagActionStore.exists(flowGroup, flowName, flowExecutionId.toString(), DagActionStore.FlowActionType.RESUME)) {
+        this.handleException(flowGroup, flowName, flowExecutionId.toString(), DagActionStore.FlowActionType.RESUME,
+            new RuntimeException("There is already a pending RESUME action for this flow. Please wait to resubmit and wait for"
                 + " action to be completed."));
         return;
       }
-      this.dagActionStore.addDagAction(flowGroup, flowName, flowExecutionId.toString(), DagActionStore.DagActionValue.RESUME);
-    } catch (IOException | SQLException | SpecNotFoundException e) {
+      this.dagActionStore.addDagAction(flowGroup, flowName, flowExecutionId.toString(), DagActionStore.FlowActionType.RESUME);
+    } catch (IOException | SQLException e) {
       log.warn(
           String.format("Failed to add execution resume action for flow %s %s %s to dag action store due to", flowGroup,
               flowName, flowExecutionId), e);
-      this.handleException(flowGroup, flowName, flowExecutionId.toString(), e);
+      this.handleException(flowGroup, flowName, flowExecutionId.toString(), DagActionStore.FlowActionType.RESUME, e);
     }
 
   }
 
-  private void handleException (String flowGroup, String flowName, String flowExecutionId, Exception e) {
+  private void handleException (String flowGroup, String flowName, String flowExecutionId, DagActionStore.FlowActionType flowActionType, Exception e) {
     try {
-      if (this.dagActionStore.exists(flowGroup, flowName, flowExecutionId)) {
+      if (this.dagActionStore.exists(flowGroup, flowName, flowExecutionId, flowActionType)) {

Review Comment:
   Having a separate, subsequent, repeated check of `exists` looks like a 
race condition. Instead, whoever calls `handleException` should pass in the 
result of the first (and ideally only) call to `exists`.
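
   A minimal sketch of that suggestion, assuming nothing about the real
Gobblin classes: `InMemoryActionStore`, `ResumeHandlerSketch`, the string key,
and the returned status strings are all hypothetical stand-ins. The diff does
not show what `handleException` does in each branch, so the branch bodies here
are placeholders. The only point illustrated is that `exists` is called once
and its result is handed to `handleException`.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the action store; NOT the real DagActionStore API.
class InMemoryActionStore {
  private final Set<String> pending = ConcurrentHashMap.newKeySet();

  boolean exists(String key) {
    return pending.contains(key);
  }

  void add(String key) {
    pending.add(key);
  }
}

// Hypothetical handler showing a single exists() call per request.
class ResumeHandlerSketch {
  private final InMemoryActionStore store;

  ResumeHandlerSketch(InMemoryActionStore store) {
    this.store = store;
  }

  // Returns a status string (an assumption made so the outcome is observable).
  String resume(String flowGroup, String flowName, String flowExecutionId) {
    String key = flowGroup + "/" + flowName + "/" + flowExecutionId + "/RESUME";
    // First and only call to exists() for this request.
    boolean alreadyPending = store.exists(key);
    if (alreadyPending) {
      return handleException(alreadyPending,
          new RuntimeException("There is already a pending RESUME action"));
    }
    try {
      store.add(key);
      return "accepted";
    } catch (RuntimeException e) {
      return handleException(alreadyPending, e);
    }
  }

  // Takes the caller's earlier result instead of querying the store again,
  // so the store cannot change between the decision and the error handling.
  private String handleException(boolean alreadyPending, Exception e) {
    if (alreadyPending) {
      return "conflict: " + e.getMessage();   // placeholder branch
    }
    return "error: " + e.getMessage();        // placeholder branch
  }
}
```

   Passing the boolean keeps the error path consistent with the decision that
was already made. It does not make the check-then-add sequence itself atomic;
that would need an add-if-absent style operation on the store.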





Issue Time Tracking
-------------------

    Worklog Id:     (was: 864303)
    Time Spent: 9h  (was: 8h 50m)

> Implement multi-active, non blocking for leader host
> ----------------------------------------------------
>
>                 Key: GOBBLIN-1837
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1837
>             Project: Apache Gobblin
>          Issue Type: Bug
>          Components: gobblin-service
>            Reporter: Urmi Mustafi
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 9h
>  Remaining Estimate: 0h
>
> This task will include the implementation of a non-blocking, multi-active 
> scheduler for each host. It will NOT include metric emission or unit tests 
> for validation. Those will be done in a separate follow-up ticket. The work in 
> this ticket includes:
>  * define a table to do scheduler lease determination for each flow's trigger 
> event and related methods to execute actions on this table
>  * update DagActionStore schema and DagActionStoreMonitor to act upon new 
> "LAUNCH" type events in addition to KILL/RESUME
>  * update scheduler/orchestrator logic to apply the non-blocking algorithm 
> when "multi-active scheduler mode" is enabled, otherwise submit events 
> directly to the DagManager after receiving a scheduler trigger
>  * implement the non-blocking algorithm, particularly handling reminder 
> events if another host is in the process of securing the lease for a 
> particular flow trigger



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
