[ 
https://issues.apache.org/jira/browse/BEAM-8539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chad Dombrova updated BEAM-8539:
--------------------------------
    Description: 
The Beam job state transitions are ill-defined, which is a big problem for 
anything that relies on the values coming from JobAPI.GetStateStream.

I was hoping to find something like a state transition diagram in the docs so 
that I could determine the start state, the terminal states, and the valid 
transitions, but I could not find this. The code reveals that the SDKs differ 
on the fundamentals:

Java InMemoryJobService:
 * start state: *STOPPED*
 * run - about to submit to executor:  STARTING
 * run - actually running on executor:  RUNNING
 * terminal states: DONE, FAILED, CANCELLED, DRAINED

Python AbstractJobServiceServicer / LocalJobServicer:
 * start state: STARTING
 * terminal states: DONE, FAILED, CANCELLED, *STOPPED*

I think it would be good to make Python work like Java, so that there is a 
difference in state between a job that has been prepared and one that has 
additionally been run.
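
To illustrate the distinction, here is a minimal sketch of the lifecycle listed for the Java InMemoryJobService above, written in Python. It is not Beam code: JobState is a hypothetical stand-in for the state enum in beam_job_api.proto, SketchJob is an invented class, and the actual submission to an executor is elided.

```python
from enum import Enum


class JobState(Enum):
    # Hypothetical stand-in for the job state enum in beam_job_api.proto.
    STOPPED = "STOPPED"
    STARTING = "STARTING"
    RUNNING = "RUNNING"


class SketchJob:
    """Illustrative job that is STOPPED once prepared and only advances on run()."""

    def __init__(self):
        # A prepared job has not been submitted to an executor yet.
        self.state = JobState.STOPPED
        self.history = [self.state]

    def _set_state(self, new_state):
        self.state = new_state
        self.history.append(new_state)

    def run(self):
        if self.state is not JobState.STOPPED:
            raise RuntimeError("run() called on a job in state %s" % self.state.name)
        # About to submit to the executor.
        self._set_state(JobState.STARTING)
        # Actually running on the executor (submission itself is elided here).
        self._set_state(JobState.RUNNING)
```

With this shape, anything watching the state can tell a prepared job (STOPPED) apart from one that has been run (STARTING, then RUNNING).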

It's hard to tell how far this problem has spread within the various runners.  
I think a simple thing that can be done to help standardize behavior is to 
implement the terminal states as an enum in the beam_job_api.proto, or create a 
utility function in each language for checking if a state is terminal, so that 
it's not left up to each runner to reimplement this logic.
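
A minimal sketch of what such a utility could look like in Python. The names here (JobState, TERMINAL_STATES, is_terminal, wait_for_terminal) are hypothetical and do not exist in Beam; the terminal set is an assumption that follows the Java list above, with STOPPED treated as the start state rather than a terminal one.

```python
from enum import Enum


class JobState(Enum):
    # Hypothetical stand-in for the job state enum in beam_job_api.proto.
    STOPPED = "STOPPED"
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    DONE = "DONE"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"
    DRAINED = "DRAINED"


# Assumption: terminal states as listed for the Java InMemoryJobService.
TERMINAL_STATES = frozenset(
    [JobState.DONE, JobState.FAILED, JobState.CANCELLED, JobState.DRAINED]
)


def is_terminal(state):
    """Return True if no further transitions are expected from this state."""
    return state in TERMINAL_STATES


def wait_for_terminal(state_stream):
    """Consume an iterable of states and return the first terminal one, or None."""
    for state in state_stream:
        if is_terminal(state):
            return state
    return None
```

A consumer of a state stream could then call wait_for_terminal on the sequence of states it receives instead of hard-coding its own list of terminal states.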

 

  was:
The Beam job state transitions are ill-defined, which is a big problem for 
anything that relies on the values coming from JobAPI.GetStateStream.

I was hoping to find something like a state transition diagram in the docs so 
that I could determine the start state, the terminal states, and the valid 
transitions, but I could not find this. The code reveals that the SDKs differ 
on the fundamentals:

Java InMemoryJobService:
 * start state: *STOPPED*
 * run: about to submit to executor:  STARTING
 * run: actually running on executor:  RUNNING
 * terminal states: DONE, FAILED, CANCELLED, DRAINED

Python AbstractJobServiceServicer / LocalJobServicer:
 * start state: STARTING
 * terminal states: DONE, FAILED, CANCELLED, *STOPPED*

I think it would be good to make Python work like Java, so that there is a 
difference in state between a job that has been prepared and one that has 
additionally been run.

It's hard to tell how far this problem has spread within the various runners.  
I think a simple thing that can be done to help standardize behavior is to 
implement the terminal states as an enum in the beam_job_api.proto, or create a 
utility function in each language for checking if a state is terminal, so that 
it's not left up to each runner to reimplement this logic.

 


> Clearly define the valid job state transitions
> ----------------------------------------------
>
>                 Key: BEAM-8539
>                 URL: https://issues.apache.org/jira/browse/BEAM-8539
>             Project: Beam
>          Issue Type: Improvement
>          Components: beam-model, runner-core, sdk-java-core, sdk-py-core
>            Reporter: Chad Dombrova
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
