[ 
https://issues.apache.org/jira/browse/GOBBLIN-1831?focusedWorklogId=861246&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-861246
 ]

ASF GitHub Bot logged work on GOBBLIN-1831:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/May/23 19:13
            Start Date: 09/May/23 19:13
    Worklog Time Spent: 10m 
      Work Description: Will-Lo opened a new pull request, #3694:
URL: https://github.com/apache/gobblin/pull/3694

   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [ ] My PR addresses the following [Gobblin 
JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references 
them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
       - https://issues.apache.org/jira/browse/GOBBLIN-1831
   
   
   ### Description
   - [ ] Here are some details about my PR, including screenshots (if 
applicable):
   This PR handles the scenario where jobs submitted from GaaS can run 
concurrently on a Gobblin cluster, and makes cancellation act only on the 
intended flow execution.
   
   When GaaS executes jobs on a Gobblin cluster, its flow execution IDs can 
get out of sync with the jobs actually running on the cluster. The old behavior 
was to delete the current job regardless of its execution ID, so a state 
mismatch could delete a current job that should have continued to run.
   
   To address this, we first tried adding the FlowExecutionId to the jobSpec, 
but that allowed jobs to run concurrently on the Gobblin cluster when they 
should have been deduped.
   
   So instead, during cancellation, we check whether the incoming spec has a 
flow execution ID. If it does, the existing job is cancelled only if the flow 
execution IDs match. If they do not match, the current job does not belong to 
the incoming request and is not deleted.
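   
   The check described above can be sketched roughly as follows. This is an 
illustration only, not the code in this PR: the class name, the method name 
`shouldCancel`, and the use of `Optional<String>` for the IDs are all 
hypothetical. It also assumes that a request without a flow execution ID falls 
back to the old behavior of cancelling regardless, which the description does 
not state explicitly.

```java
import java.util.Optional;

// Hypothetical helper, not part of Gobblin: illustrates the matching rule only.
final class CancellationCheck {

    private CancellationCheck() {
    }

    /**
     * Decides whether a cancel request should delete the currently running job.
     *
     * @param incomingFlowExecutionId ID carried by the incoming cancel spec, if any
     * @param runningFlowExecutionId  ID recorded for the running job, if any
     */
    static boolean shouldCancel(Optional<String> incomingFlowExecutionId,
                                Optional<String> runningFlowExecutionId) {
        if (!incomingFlowExecutionId.isPresent()) {
            // Assumption: with no ID on the request, keep the old behavior
            // and cancel whatever is running.
            return true;
        }
        // Cancel only when both IDs are present and equal. A running job with
        // no recorded ID is treated as a non-match (also an assumption).
        return incomingFlowExecutionId.equals(runningFlowExecutionId);
    }

    public static void main(String[] args) {
        // IDs below are made-up sample values.
        System.out.println(shouldCancel(Optional.of("1001"), Optional.of("1001")));
        System.out.println(shouldCancel(Optional.of("1001"), Optional.of("1002")));
        System.out.println(shouldCancel(Optional.empty(), Optional.of("1002")));
    }
}
```

   With this rule, a stale cancel request for an earlier execution leaves a 
newer execution of the same job untouched.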
   
   This PR also uses the flow execution ID, if applicable, as the planningjob 
ID and job_actualJob ID, as there should only be one of each per flow.
   
   ### Tests
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   
   ### Commits
   - [ ] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
       1. Subject is separated from body by a blank line
       2. Subject is limited to 50 characters
       3. Subject does not end with a period
       4. Subject uses the imperative mood ("add", not "adding")
       5. Body wraps at 72 characters
       6. Body explains "what" and "why", not "how"
   
   




Issue Time Tracking
-------------------

            Worklog Id:     (was: 861246)
    Remaining Estimate: 0h
            Time Spent: 10m

> Use Flow Execution ID in Gobblin cluster cancellation semantics and jobname 
> IDs if possible
> -------------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-1831
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1831
>             Project: Apache Gobblin
>          Issue Type: Bug
>            Reporter: William Lo
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When executing jobs from GaaS to Gobblin cluster, there can be a mismatch 
> between flow execution IDs and the jobs running on Gobblin cluster.
> To address this, we tried adding the FlowExecutionId to the jobSpec, but that 
> meant that jobs could run concurrently on Gobblin cluster when they should 
> have been deduped.
> So instead, during cancellation, we check whether the incoming spec has a 
> flow execution ID. If it does, the existing job is cancelled only if the 
> flow execution IDs match. If they do not match, the current job does not 
> belong to the incoming request and is not deleted.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
