[ 
https://issues.apache.org/jira/browse/BEAM-8539?focusedWorklogId=337543&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-337543
 ]

ASF GitHub Bot logged work on BEAM-8539:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Nov/19 20:28
            Start Date: 01/Nov/19 20:28
    Worklog Time Spent: 10m 
      Work Description: lukecwik commented on pull request #9969: [BEAM-8539] 
Provide an initial definition of all job states and the state transition diagram
URL: https://github.com/apache/beam/pull/9969#discussion_r341739058
 
 

 ##########
 File path: model/job-management/src/main/proto/beam_job_api.proto
 ##########
 @@ -201,6 +201,16 @@ message JobMessagesResponse {
 }
 
 // Enumeration of all JobStates
+//
+// The state transition diagram is:
+//   STOPPED -> STARTING -> RUNNING -> DONE
+//                                  \> FAILED
+//                                  \> CANCELLING -> CANCELLED
+//                                  \> UPDATING -> UPDATED
+//                                  \> DRAINING -> DRAINED
 
 Review comment:
   Yeah, this is the problem with UPDATED since it is dependent on the 
implementation within the Runner. Is an UPDATED job the same job as the prior 
one or a new job which continues from the prior state? In Dataflow it is the 
latter but the former can make sense for other runners.
   
   Not sure of any RUNNING -> STOPPED scenarios today but adding 
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 337543)
    Time Spent: 1h  (was: 50m)

> Clearly define the valid job state transitions
> ----------------------------------------------
>
>                 Key: BEAM-8539
>                 URL: https://issues.apache.org/jira/browse/BEAM-8539
>             Project: Beam
>          Issue Type: Improvement
>          Components: beam-model, runner-core, sdk-java-core, sdk-py-core
>            Reporter: Chad Dombrova
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> The Beam job state transitions are ill-defined, which is a big problem for 
> anything that relies on the values coming from JobAPI.GetStateStream.
> I was hoping to find something like a state transition diagram in the docs so 
> that I could determine the start state, the terminal states, and the valid 
> transitions, but I could not find this. The code reveals that the SDKs differ 
> on the fundamentals:
> Java InMemoryJobService:
>  * start state: *STOPPED*
>  * run - about to submit to executor:  STARTING
>  * run - actually running on executor:  RUNNING
>  * terminal states: DONE, FAILED, CANCELLED, DRAINED
> Python AbstractJobServiceServicer / LocalJobServicer:
>  * start state: STARTING
>  * terminal states: DONE, FAILED, CANCELLED, *STOPPED*
> I think it would be good to make Python work like Java, so that there is a 
> difference in state between a job that has been prepared and one that has 
> additionally been run.
> It's hard to tell how far this problem has spread within the various runners. 
>  I think a simple thing that can be done to help standardize behavior is to 
> implement the terminal states as an enum in the beam_job_api.proto, or create 
> a utility function in each language for checking if a state is terminal, so 
> that it's not left up to each runner to reimplement this logic.
>  
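
The "utility function for checking if a state is terminal" suggested in the
description could look roughly like the sketch below. It is illustrative only:
the JobState enum, VALID_TRANSITIONS, is_terminal and is_valid_transition are
hypothetical names, not part of the Beam SDK. The transition table is copied
from the diagram proposed in the PR diff, reading each "\>" line as a branch
out of RUNNING, and that diagram is still under review. In particular, whether
UPDATED counts as terminal depends on the runner, as the review comment notes.

```python
from enum import Enum


class JobState(Enum):
    """Hypothetical mirror of the job states named in this thread."""
    STOPPED = "STOPPED"
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    DONE = "DONE"
    FAILED = "FAILED"
    CANCELLING = "CANCELLING"
    CANCELLED = "CANCELLED"
    UPDATING = "UPDATING"
    UPDATED = "UPDATED"
    DRAINING = "DRAINING"
    DRAINED = "DRAINED"


# Transitions taken from the proposed diagram (an assumption, not a
# settled definition). States absent from the keys have no outgoing edge.
VALID_TRANSITIONS = {
    JobState.STOPPED: {JobState.STARTING},
    JobState.STARTING: {JobState.RUNNING},
    JobState.RUNNING: {
        JobState.DONE,
        JobState.FAILED,
        JobState.CANCELLING,
        JobState.UPDATING,
        JobState.DRAINING,
    },
    JobState.CANCELLING: {JobState.CANCELLED},
    JobState.UPDATING: {JobState.UPDATED},
    JobState.DRAINING: {JobState.DRAINED},
}


def is_terminal(state):
    """Return True if the diagram gives `state` no outgoing transition."""
    return not VALID_TRANSITIONS.get(state)


def is_valid_transition(src, dst):
    """Return True if the diagram allows moving from `src` to `dst`."""
    return dst in VALID_TRANSITIONS.get(src, set())
```

Deriving the terminal set from the transition table, rather than listing it
separately, keeps the two from drifting apart if the diagram changes.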



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
