[
https://issues.apache.org/jira/browse/GOBBLIN-1837?focusedWorklogId=864303&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-864303
]
ASF GitHub Bot logged work on GOBBLIN-1837:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 07/Jun/23 22:39
Start Date: 07/Jun/23 22:39
Worklog Time Spent: 10m
Work Description: phet commented on code in PR #3700:
URL: https://github.com/apache/gobblin/pull/3700#discussion_r1222198550
##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/restli/GobblinServiceFlowExecutionResourceHandlerWithWarmStandby.java:
##########
@@ -54,27 +53,26 @@ public void resume(ComplexResourceKey<org.apache.gobblin.service.FlowStatusId, E
String flowName = key.getKey().getFlowName();
Long flowExecutionId = key.getKey().getFlowExecutionId();
try {
- // If an existing resume or kill request is still pending then do not accept this request
- if (this.dagActionStore.exists(flowGroup, flowName, flowExecutionId.toString())) {
- DagActionStore.DagActionValue action = this.dagActionStore.getDagAction(flowGroup, flowName, flowExecutionId.toString()).getDagActionValue();
- this.handleException(flowGroup, flowName, flowExecutionId.toString(),
- new RuntimeException("There is already a pending action " + action + " for this flow. Please wait to resubmit and wait for"
+ // If an existing resume request is still pending then do not accept this request
+ if (this.dagActionStore.exists(flowGroup, flowName, flowExecutionId.toString(), DagActionStore.FlowActionType.RESUME)) {
+ this.handleException(flowGroup, flowName, flowExecutionId.toString(), DagActionStore.FlowActionType.RESUME,
+ new RuntimeException("There is already a pending RESUME action for this flow. Please wait to resubmit and wait for"
+ " action to be completed."));
return;
}
- this.dagActionStore.addDagAction(flowGroup, flowName, flowExecutionId.toString(), DagActionStore.DagActionValue.RESUME);
- } catch (IOException | SQLException | SpecNotFoundException e) {
+ this.dagActionStore.addDagAction(flowGroup, flowName, flowExecutionId.toString(), DagActionStore.FlowActionType.RESUME);
+ } catch (IOException | SQLException e) {
log.warn(
String.format("Failed to add execution resume action for flow %s %s %s to dag action store due to", flowGroup,
flowName, flowExecutionId), e);
- this.handleException(flowGroup, flowName, flowExecutionId.toString(), e);
+ this.handleException(flowGroup, flowName, flowExecutionId.toString(), DagActionStore.FlowActionType.RESUME, e);
}
}
- private void handleException (String flowGroup, String flowName, String flowExecutionId, Exception e) {
+ private void handleException (String flowGroup, String flowName, String flowExecutionId, DagActionStore.FlowActionType flowActionType, Exception e) {
try {
- if (this.dagActionStore.exists(flowGroup, flowName, flowExecutionId)) {
+ if (this.dagActionStore.exists(flowGroup, flowName, flowExecutionId, flowActionType)) {
Review Comment:
Having a separate, subsequent, and repeated check of `exists` looks like a
race condition. Instead, whoever calls `handleException` should pass in the
result of the first (and ideally only) call to `exists`.
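A minimal sketch of that suggestion, not the PR's actual code: the store is consulted once and the caller hands the outcome to `handleException`, which never queries the store itself. `InMemoryActionStore`, `addIfAbsent`, the flat string key, and the returned status strings are all hypothetical stand-ins invented for illustration; the real `DagActionStore` API may not offer an atomic add like this.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class PendingActionSketch {
  /** Hypothetical in-memory stand-in for DagActionStore (assumption, not the real API). */
  static final class InMemoryActionStore {
    private final Set<String> pending = ConcurrentHashMap.newKeySet();

    /** Atomically records the action; returns false if it was already pending. */
    boolean addIfAbsent(String key) {
      return pending.add(key);
    }
  }

  /** Accepts a RESUME request unless one is already pending for the same execution. */
  static String resume(InMemoryActionStore store, String flowGroup, String flowName, long flowExecutionId) {
    String key = String.join("/", flowGroup, flowName, Long.toString(flowExecutionId), "RESUME");
    // The single check: the add and the existence test are one atomic step.
    boolean alreadyPending = !store.addIfAbsent(key);
    if (alreadyPending) {
      return handleException(alreadyPending,
          new RuntimeException("There is already a pending RESUME action for this flow."));
    }
    return "ACCEPTED";
  }

  /** Uses the caller's result instead of asking the store a second time. */
  static String handleException(boolean actionAlreadyPending, Exception e) {
    return (actionAlreadyPending ? "CONFLICT: " : "ERROR: ") + e.getMessage();
  }

  public static void main(String[] args) {
    InMemoryActionStore store = new InMemoryActionStore();
    System.out.println(resume(store, "group", "name", 1L));
    System.out.println(resume(store, "group", "name", 1L));
  }
}
```

Because the outcome is decided in one place, a concurrent change to the store between the check and the error handling cannot make the two disagree.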
Issue Time Tracking
-------------------
Worklog Id: (was: 864303)
Time Spent: 9h (was: 8h 50m)
> Implement multi-active, non blocking for leader host
> ----------------------------------------------------
>
> Key: GOBBLIN-1837
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1837
> Project: Apache Gobblin
> Issue Type: Bug
> Components: gobblin-service
> Reporter: Urmi Mustafi
> Assignee: Abhishek Tiwari
> Priority: Major
> Time Spent: 9h
> Remaining Estimate: 0h
>
> This task covers the implementation of a non-blocking, multi-active
> scheduler for each host. It will NOT include metric emission or unit tests
> for validation; those will be done in a separate follow-up ticket. The work in
> this ticket includes:
> * define a table to do scheduler lease determination for each flow's trigger
> event and related methods to execute actions on this table
> * update DagActionStore schema and DagActionStoreMonitor to act upon new
> "LAUNCH" type events in addition to KILL/RESUME
> * update scheduler/orchestrator logic to apply the non-blocking algorithm
> when "multi-active scheduler mode" is enabled, otherwise submit events
> directly to the DagManager after receiving a scheduler trigger
> * implement the non-blocking algorithm, particularly handling reminder
> events if another host is in the process of securing the lease for a
> particular flow trigger
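
The ticket does not spell out the lease table or the algorithm, so the following is only a rough sketch of the general idea of non-blocking lease determination with reminders. Everything in it is an assumption made for illustration: the class `LeaseArbiterSketch`, its method and status names, the in-memory map standing in for the database table, and the choice to schedule the reminder at the current lease's expiry.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class LeaseArbiterSketch {
  enum Status { LEASE_OBTAINED, LEASE_ALREADY_HELD }

  /** Result of one attempt; remindAtMillis is -1 when the lease was obtained. */
  static final class Attempt {
    final Status status;
    final long remindAtMillis;

    Attempt(Status status, long remindAtMillis) {
      this.status = status;
      this.remindAtMillis = remindAtMillis;
    }
  }

  private static final class Lease {
    final String owner;
    final long expiresAtMillis;

    Lease(String owner, long expiresAtMillis) {
      this.owner = owner;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  // Hypothetical stand-in for the lease table, keyed by flow trigger event.
  private final ConcurrentMap<String, Lease> leases = new ConcurrentHashMap<>();
  private final long leaseDurationMillis;

  LeaseArbiterSketch(long leaseDurationMillis) {
    this.leaseDurationMillis = leaseDurationMillis;
  }

  /** Never waits: either takes the lease or says when to check again. */
  Attempt tryAcquire(String flowTriggerKey, String host, long nowMillis) {
    Lease candidate = new Lease(host, nowMillis + leaseDurationMillis);
    // compute() runs atomically per key, so only one caller can install its candidate.
    Lease winner = leases.compute(flowTriggerKey, (key, current) ->
        (current == null || current.expiresAtMillis <= nowMillis) ? candidate : current);
    if (winner == candidate) {
      return new Attempt(Status.LEASE_OBTAINED, -1L);
    }
    // An unexpired lease exists: return a reminder time instead of blocking.
    return new Attempt(Status.LEASE_ALREADY_HELD, winner.expiresAtMillis);
  }
}
```

A host that receives `LEASE_ALREADY_HELD` would schedule a reminder event for `remindAtMillis` and retry then, which lets it take over if the original lease holder never completed the work.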
--
This message was sent by Atlassian Jira
(v8.20.10#820010)