[ 
https://issues.apache.org/jira/browse/HADOOP-9933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760127#comment-13760127
 ] 

Steve Loughran commented on HADOOP-9933:
----------------------------------------

Thinking some more, it gets even more complex, as you don't want to allow the 
following state flows:

create -> stop -> start
create -> init -> stop -> start

Yes, a "started" flag could be added, but the pure way to do this in an FSM is 
to have explicit states "stopped before started" and "stopped after start", 
where a start is only a valid transition from the latter.
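
A minimal sketch of that explicit-state approach, to make the transition rules concrete. The names {{LifecycleState}} and {{LifecycleStateMachine}} are hypothetical and are not the existing Hadoop service API; it also assumes a restart goes straight from "stopped after start" back to STARTED without a second init, which the discussion above does not settle.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical lifecycle states; not Hadoop's actual service state enum. */
enum LifecycleState {
    CREATED,
    INITED,
    STARTED,
    STOPPED_BEFORE_START, // stopped without ever having been started
    STOPPED_AFTER_START;  // stopped after at least one successful start

    /** States reachable from this one in a single transition. */
    Set<LifecycleState> successors() {
        switch (this) {
            case CREATED:
                return EnumSet.of(INITED, STOPPED_BEFORE_START);
            case INITED:
                return EnumSet.of(STARTED, STOPPED_BEFORE_START);
            case STARTED:
                return EnumSet.of(STOPPED_AFTER_START);
            case STOPPED_AFTER_START:
                // Assumption: restart skips init; this is the only state
                // from which a start after a stop is legal.
                return EnumSet.of(STARTED);
            default:
                // STOPPED_BEFORE_START is terminal.
                return EnumSet.noneOf(LifecycleState.class);
        }
    }
}

/** Holds the current state and rejects any transition not listed above. */
final class LifecycleStateMachine {
    private LifecycleState state = LifecycleState.CREATED;

    synchronized LifecycleState getState() {
        return state;
    }

    synchronized void moveTo(LifecycleState next) {
        if (!state.successors().contains(next)) {
            throw new IllegalStateException(
                "Invalid transition " + state + " -> " + next);
        }
        state = next;
    }
}
```

With this table, both create -> stop -> start and create -> init -> stop -> start end in STOPPED_BEFORE_START and the final start is rejected, while init -> start -> stop -> start is accepted.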

Now, back in the HADOOP-3628 service model, I did try to separate out started 
and live; the service could take itself in and out of LIVE depending on the 
state of its dependencies (DN -> NN, TT -> JT, JT -> HDFS out of safe mode), 
along with an explicit FAILED state:
[https://github.com/apache/hadoop-common/blob/HADOOP-3628/src/core/org/apache/hadoop/util/Service.java#L391]

It complicated a lot of the logic, as live now has two states, as does failed. 
I also left it to the service itself to perform the STARTED <--> LIVE 
transitions, and to decide when it fails. 

For the YARN service model, things are simpler:
* LIVE, with the ability to add/remove a list of things you are waiting for 
(blockers), which is meant to be there purely for the benefit of management 
tools. This hasn't been turned on for anything yet, though I should go through 
the services and add it, starting with the DN when we get round to 
service-modelling it.
* STOPPED has an exception; any exception thrown during init & start goes in 
there, as does anything raised during shutdown (if not already set), though 
that's just a hint. If we drop the latter then you can define {{FAILED := 
STOPPED & exception != null}}.
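
As a rough illustration of the two points above, here is a sketch in which FAILED is derived rather than stored, and blockers are just a named set exposed for management tools. {{ServiceStatus}} and all its methods are hypothetical names for this example, not the YARN service API, and it assumes only init/start failures are recorded (that is, the "drop the latter" variant).

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical status holder; illustrative only, not the YARN API. */
final class ServiceStatus {
    enum State { NOTINITED, INITED, LIVE, STOPPED }

    private State state = State.NOTINITED;
    // Assumption: only exceptions from init/start are recorded here.
    private Throwable failureCause;
    // Names of things a LIVE service is still waiting for.
    private final Set<String> blockers = ConcurrentHashMap.newKeySet();

    synchronized void enter(State next) {
        state = next;
    }

    /** Record an init/start failure and stop; the first cause wins. */
    synchronized void failed(Throwable cause) {
        if (failureCause == null) {
            failureCause = cause;
        }
        state = State.STOPPED;
    }

    /** FAILED is derived: stopped, with a recorded exception. */
    synchronized boolean isFailed() {
        return state == State.STOPPED && failureCause != null;
    }

    void addBlocker(String name) {
        blockers.add(name);
    }

    void removeBlocker(String name) {
        blockers.remove(name);
    }

    /** Read-only view for management tools. */
    Set<String> getBlockers() {
        return Collections.unmodifiableSet(blockers);
    }
}
```

A service that stops cleanly reports {{isFailed() == false}}; one whose init or start threw reports {{true}}, with no separate FAILED state to keep in sync.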


# Could we have explicit active/passive modes for the RM, either purely as 
part of that service, or for other things we could take on/offline?
# What about just creating a new service instance on each startup, in the 
existing process? This would ensure that the service is cleanly initialised, 
and we could verify that there aren't leakages by having a test run that tries 
to do this a few thousand times.

Option #2 appeals to me if the cost of creation and startup is low enough; if 
there's a lot of pre-startup initialisation then the ready-to-start instance 
could be created at the same time its predecessor is stopped.
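
A sketch of what option #2 might look like, assuming a factory that can build a fresh instance on demand. {{RestartableUnit}} and {{FreshInstanceRunner}} are hypothetical names made up for this example; a real version would wrap the actual service lifecycle and handle start/stop failures.

```java
import java.util.function.Supplier;

/** Hypothetical minimal lifecycle; stands in for a real service. */
interface RestartableUnit {
    void start();
    void stop();
}

/** Keeps at most one live instance; builds a new one per activation. */
final class FreshInstanceRunner<T extends RestartableUnit> {
    private final Supplier<T> factory;
    private T current; // null while passive

    FreshInstanceRunner(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Create and start a brand-new instance; never reuse a stopped one. */
    synchronized T activate() {
        if (current != null) {
            throw new IllegalStateException("already active");
        }
        T fresh = factory.get();
        fresh.start();
        current = fresh;
        return fresh;
    }

    /** Stop and discard the live instance, if any. */
    synchronized void deactivate() {
        if (current == null) {
            return;
        }
        T old = current;
        current = null;
        old.stop();
    }
}
```

The leak test suggested above would then be a loop calling {{activate()}} and {{deactivate()}} a few thousand times while watching threads, file handles and heap.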


                
> Augment Service model to support starting stopped services
> ----------------------------------------------------------
>
>                 Key: HADOOP-9933
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9933
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 2.1.0-beta
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>              Labels: service
>
> For ResourceManager-HA (YARN-149 and co), we would want to start/stop/start 
> RM's active services as it transitions to Active/Standby/Active respectively. 
> In the current service model, we can't start the services that are already 
> stopped.
> Would be nice to augment this. To avoid accidental restart of stopped 
> services, we can add another API: start(boolean restartIfStopped). Thoughts?
