[ 
https://issues.apache.org/jira/browse/BEAM-8539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964981#comment-16964981
 ] 

Chad Dombrova commented on BEAM-8539:
-------------------------------------

I searched the Java code for STOPPED, and it seems to be used correctly 
everywhere except possibly one spot in the Spark runner:

[https://github.com/apache/beam/blob/master/runners/spark/src/main/java/org/apache/beam/runners/spark/SparkPipelineResult.java#L133]

It's hard to tell from the [Spark docs|https://spark.apache.org/docs/latest/api/java/index.html?org/apache/spark/api/java/JavaSparkContext.html]
whether {{JavaSparkContext.stop()}} is considered terminal (i.e. can the 
context be resumed afterwards?).  Who would know for sure?
{quote}We should add documentation to the JobState enum in the Job API (e.g. 
state machine diagram).
{quote}
I'm happy to do this, but I'm not that familiar with the documentation 
generation, so I have a few questions:
 * Where should this go? I don't see the generated code in the Python or 
Java docs. Also, since Java and Python use different documentation generators, 
I'm not sure the diagram can be rendered in both. If not there, then 
where? JobInvocation?
 * Can you give me an example of somewhere else in the code that is currently 
generating a diagram, so that I can see how it's done?

> Clearly define the valid job state transitions
> ----------------------------------------------
>
>                 Key: BEAM-8539
>                 URL: https://issues.apache.org/jira/browse/BEAM-8539
>             Project: Beam
>          Issue Type: Improvement
>          Components: beam-model, runner-core, sdk-java-core, sdk-py-core
>            Reporter: Chad Dombrova
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The Beam job state transitions are ill-defined, which is a big problem for 
> anything that relies on the values coming from JobAPI.GetStateStream.
> I was hoping to find something like a state transition diagram in the docs so 
> that I could determine the start state, the terminal states, and the valid 
> transitions, but I could not find this. The code reveals that the SDKs differ 
> on the fundamentals:
> Java InMemoryJobService:
>  * start state: *STOPPED*
>  * run - about to submit to executor:  STARTING
>  * run - actually running on executor:  RUNNING
>  * terminal states: DONE, FAILED, CANCELLED, DRAINED
> Python AbstractJobServiceServicer / LocalJobServicer:
>  * start state: STARTING
>  * terminal states: DONE, FAILED, CANCELLED, *STOPPED*
> I think it would be good to make Python work like Java, so that there is a 
> difference in state between a job that has been prepared and one that has 
> additionally been run.
> It's hard to tell how far this problem has spread within the various runners. 
>  I think a simple thing that can be done to help standardize behavior is to 
> implement the terminal states as an enum in the beam_job_api.proto, or create 
> a utility function in each language for checking if a state is terminal, so 
> that it's not left up to each runner to reimplement this logic.
>  
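
To make the utility-function suggestion above concrete, here is a minimal Python sketch. It is not taken from the Beam codebase: the {{JobState}} class is a hypothetical stand-in for the enum in beam_job_api.proto, the names {{TERMINAL_STATES}}, {{VALID_TRANSITIONS}}, {{is_terminal}} and {{is_valid_transition}} are invented for illustration, and the transition table is an assumption based on the Java InMemoryJobService lifecycle described in the issue (STOPPED, then STARTING, then RUNNING, then a terminal state).

```python
from enum import Enum


class JobState(Enum):
    # Hypothetical stand-in for the JobState enum in beam_job_api.proto.
    # Only the states named in this issue are listed.
    STOPPED = "STOPPED"
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    DONE = "DONE"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"
    DRAINED = "DRAINED"


# Terminal states as listed for the Java InMemoryJobService in the issue.
TERMINAL_STATES = frozenset(
    {JobState.DONE, JobState.FAILED, JobState.CANCELLED, JobState.DRAINED}
)

# ASSUMPTION: this table is illustrative, not an agreed specification.
# It follows the Java lifecycle: prepared (STOPPED) -> STARTING -> RUNNING
# -> terminal. Allowing STARTING to fail or be cancelled is a guess.
VALID_TRANSITIONS = {
    JobState.STOPPED: frozenset({JobState.STARTING}),
    JobState.STARTING: frozenset(
        {JobState.RUNNING, JobState.FAILED, JobState.CANCELLED}
    ),
    JobState.RUNNING: TERMINAL_STATES,
}


def is_terminal(state: JobState) -> bool:
    """Return True if no further transitions are expected from `state`."""
    return state in TERMINAL_STATES


def is_valid_transition(old: JobState, new: JobState) -> bool:
    """Return True if `old` -> `new` appears in the assumed transition table."""
    # States missing from the table (the terminal ones) allow no transitions.
    return new in VALID_TRANSITIONS.get(old, frozenset())
```

With a shared helper like this, a runner would call {{is_terminal(state)}} rather than keeping its own list of terminal states, and a state stream could reject anything for which {{is_valid_transition}} returns False.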


