[ 
https://issues.apache.org/jira/browse/BEAM-8539?focusedWorklogId=341996&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-341996
 ]

ASF GitHub Bot logged work on BEAM-8539:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Nov/19 16:03
            Start Date: 12/Nov/19 16:03
    Worklog Time Spent: 10m 
      Work Description: chadrik commented on issue #9969: [BEAM-8539] Provide 
an initial definition of all job states and the state transition diagram
URL: https://github.com/apache/beam/pull/9969#issuecomment-552959589
 
 
   Good to note, thanks.
   
   On Tue, Nov 12, 2019 at 7:41 AM Maximilian Michels <[email protected]>
   wrote:
   
   > *@mxm* commented on this pull request.
   > ------------------------------
   >
   > In model/job-management/src/main/proto/beam_job_api.proto
   > <https://github.com/apache/beam/pull/9969#discussion_r345280401>:
   >
   > > @@ -201,6 +201,16 @@ message JobMessagesResponse {
   >  }
   >
   >  // Enumeration of all JobStates
   > +//
   > +// The state transition diagram is:
   > +//   STOPPED -> STARTING -> RUNNING -> DONE
   > +//                                  \> FAILED
   > +//                                  \> CANCELLING -> CANCELLED
   > +//                                  \> UPDATING -> UPDATED
   > +//                                  \> DRAINING -> DRAINED
   >
   > Should add here that it is possible for jobs to transition from FAILING
   > to RUNNING in Flink (i.e. recovering from a checkpoint), though this does
   > not have to be exposed through the Beam API.
   >
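
The diagram quoted above can be read as a lookup table of allowed next states. The following is only a sketch under assumptions: the `JobState` enum and `is_valid_transition` helper are hypothetical names and not part of the Beam API, and the table assumes that every `\>` branch in the diagram leaves RUNNING. It does not model the Flink-internal FAILING to RUNNING recovery mentioned in the review comment.

```python
from enum import Enum, auto


class JobState(Enum):
    """Hypothetical stand-in for the job states named in the diagram."""
    STOPPED = auto()
    STARTING = auto()
    RUNNING = auto()
    DONE = auto()
    FAILED = auto()
    CANCELLING = auto()
    CANCELLED = auto()
    UPDATING = auto()
    UPDATED = auto()
    DRAINING = auto()
    DRAINED = auto()


# Allowed next states, read off the diagram.
# Assumption: each "\>" branch starts at RUNNING.
VALID_TRANSITIONS = {
    JobState.STOPPED: {JobState.STARTING},
    JobState.STARTING: {JobState.RUNNING},
    JobState.RUNNING: {
        JobState.DONE,
        JobState.FAILED,
        JobState.CANCELLING,
        JobState.UPDATING,
        JobState.DRAINING,
    },
    JobState.CANCELLING: {JobState.CANCELLED},
    JobState.UPDATING: {JobState.UPDATED},
    JobState.DRAINING: {JobState.DRAINED},
}


def is_valid_transition(current: JobState, new: JobState) -> bool:
    """Return True if the diagram allows moving from `current` to `new`."""
    # States with no outgoing edges have no entry in the table.
    return new in VALID_TRANSITIONS.get(current, set())
```

A state with no entry in the table (DONE, FAILED, CANCELLED, UPDATED, DRAINED) has no outgoing transitions under this reading.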
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 341996)
    Time Spent: 6.5h  (was: 6h 20m)

> Clearly define the valid job state transitions
> ----------------------------------------------
>
>                 Key: BEAM-8539
>                 URL: https://issues.apache.org/jira/browse/BEAM-8539
>             Project: Beam
>          Issue Type: Improvement
>          Components: beam-model, runner-core, sdk-java-core, sdk-py-core
>            Reporter: Chad Dombrova
>            Priority: Major
>          Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> The Beam job state transitions are ill-defined, which is a big problem for 
> anything that relies on the values coming from JobAPI.GetStateStream.
> I was hoping to find something like a state transition diagram in the docs so 
> that I could determine the start state, the terminal states, and the valid 
> transitions, but I could not find this. The code reveals that the SDKs differ 
> on the fundamentals:
> Java InMemoryJobService:
>  * start state: *STOPPED*
>  * run - about to submit to executor:  STARTING
>  * run - actually running on executor:  RUNNING
>  * terminal states: DONE, FAILED, CANCELLED, DRAINED
> Python AbstractJobServiceServicer / LocalJobServicer:
>  * start state: STARTING
>  * terminal states: DONE, FAILED, CANCELLED, *STOPPED*
> I think it would be good to make Python work like Java, so that there is a 
> difference in state between a job that has been prepared and one that has 
> additionally been run.
> It's hard to tell how far this problem has spread within the various runners. 
>  I think a simple thing that can be done to help standardize behavior is to 
> implement the terminal states as an enum in the beam_job_api.proto, or create 
> a utility function in each language for checking if a state is terminal, so 
> that it's not left up to each runner to reimplement this logic.
>  
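
The shared utility function proposed in the description could look roughly like the sketch below. It assumes the terminal states listed for the Java InMemoryJobService (DONE, FAILED, CANCELLED, DRAINED) and uses plain state names rather than the generated proto enum. `TERMINAL_STATES`, `is_terminal`, and `first_terminal_state` are hypothetical names, not existing Beam functions.

```python
from typing import Iterable, Optional

# Assumption: terminal states as listed for the Java InMemoryJobService.
TERMINAL_STATES = frozenset({"DONE", "FAILED", "CANCELLED", "DRAINED"})


def is_terminal(state_name: str) -> bool:
    """Return True if no further transitions are expected from this state."""
    return state_name in TERMINAL_STATES


def first_terminal_state(states: Iterable[str]) -> Optional[str]:
    """Consume a stream of state names and return the first terminal one.

    Returns None if the stream ends before a terminal state is seen.
    """
    for state_name in states:
        if is_terminal(state_name):
            return state_name
    return None
```

A consumer of a state stream could pass the stream's state names to `first_terminal_state` rather than hard-coding its own list, which keeps the definition of "terminal" in one place per language.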



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
