[
https://issues.apache.org/jira/browse/HADOOP-9933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760127#comment-13760127
]
Steve Loughran commented on HADOOP-9933:
----------------------------------------
Thinking some more, it gets even more complex, as you don't want to allow the
following state flows:
create -> stop -> start
create -> init -> stop -> start
Yes, a flag could be added ("started"), but the pure way to do this in an FSM is
to have explicit states "stopped before started" and "stopped after start",
where a start is only a valid transition from the latter.
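A minimal sketch of that two-stopped-states FSM, for illustration only. The names {{LifecycleState}} and {{LifecycleFsm}} are hypothetical and are not the existing Hadoop service classes; treating stop() on an already-stopped service as a no-op is also an assumption.

```java
/** Hypothetical lifecycle states; not the existing Hadoop service state enum. */
enum LifecycleState {
    CREATED, INITED, STARTED, STOPPED_BEFORE_START, STOPPED_AFTER_START
}

/** Sketch of an FSM in which only a service that has already run may be started again. */
final class LifecycleFsm {
    private LifecycleState state = LifecycleState.CREATED;

    LifecycleState getState() {
        return state;
    }

    void init() {
        require(state == LifecycleState.CREATED, "init");
        state = LifecycleState.INITED;
    }

    void start() {
        // A start is valid from INITED, or as a restart from STOPPED_AFTER_START.
        require(state == LifecycleState.INITED
                || state == LifecycleState.STOPPED_AFTER_START, "start");
        state = LifecycleState.STARTED;
    }

    void stop() {
        switch (state) {
            case CREATED:
            case INITED:
                // Never started, so a later start() must be rejected.
                state = LifecycleState.STOPPED_BEFORE_START;
                break;
            case STARTED:
                state = LifecycleState.STOPPED_AFTER_START;
                break;
            default:
                // Already stopped: assumed to be a no-op.
                break;
        }
    }

    private void require(boolean ok, String op) {
        if (!ok) {
            throw new IllegalStateException("Cannot " + op + " from state " + state);
        }
    }
}
```

With this table, create -> stop -> start and create -> init -> stop -> start both fail in start(), while create -> init -> start -> stop -> start succeeds.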
Now, back in the HADOOP-3628 service model, I did try to separate out started
and live; the service could take itself in and out of LIVE depending on the
state of dependencies (DN -> NN, TT -> JT, JT -> HDFS out of safe mode), along
with an explicit FAILED state:
[https://github.com/apache/hadoop-common/blob/HADOOP-3628/src/core/org/apache/hadoop/util/Service.java#L391]
It complicated a lot of the logic as now live has two states, as does failed. I
also left it to the service itself to perform the STARTED <--> LIVE
transitions, and decide when it fails.
For the YARN service model, things are simpler:
* LIVE, with the ability to add/remove a list of things you are waiting for
(blockers), which is meant to be there purely for the benefit of management
tools. This hasn't been turned on for anything yet, though I should go through
the services and add it, starting with the DN when we get round to
service-modelling it.
* STOPPED has an exception; any exception thrown during init & start goes in
there, and anything raised during shutdown (if not already set), though that's
just a hint. If we drop the latter then you can define {{FAILED := STOPPED &
exception != null}}.
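As a rough illustration of the two points above, here is a sketch that reads FAILED as "stopped with a recorded init/start failure". {{ServiceStatus}} and all of its methods are hypothetical names for this example, not the YARN service API, and it assumes shutdown exceptions are not recorded.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical status holder: blockers for management tools plus a derived FAILED state. */
final class ServiceStatus {
    private boolean stopped;
    private Throwable failureCause;
    private final Set<String> blockers = new LinkedHashSet<>();

    /** Record something the service is waiting for, e.g. "NN". */
    synchronized void addBlocker(String name) {
        blockers.add(name);
    }

    synchronized void removeBlocker(String name) {
        blockers.remove(name);
    }

    /** Read-only snapshot, intended purely for management tools. */
    synchronized Set<String> getBlockers() {
        return Collections.unmodifiableSet(new LinkedHashSet<>(blockers));
    }

    /** Clean stop; assumed not to record any exception raised during shutdown. */
    synchronized void stop() {
        stopped = true;
    }

    /** An exception thrown during init or start: keep the first cause, then stop. */
    synchronized void failDuringStartup(Throwable cause) {
        if (failureCause == null) {
            failureCause = cause;
        }
        stopped = true;
    }

    synchronized Throwable getFailureCause() {
        return failureCause;
    }

    /** FAILED is derived rather than stored: stopped, with a recorded cause. */
    synchronized boolean isFailed() {
        return stopped && failureCause != null;
    }
}
```

Because FAILED is computed from the stopped flag and the stored cause, no extra state or transitions are needed in the FSM itself.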
# Could we have explicit active/passive modes for the RM, either purely as
part of that service, or for other things we could take on/offline?
# what about just creating a new service instance on each startup, in the
existing process? This would ensure that the service is cleanly initialised,
and we could verify that there aren't leakages by having a test run that tries
to do this a few thousand times.
Option #2 appeals to me if the cost of creation and startup is low enough; if
there's a lot of pre-startup initialisation then the ready-to-start instance
could be created at the same time as its predecessor is stopped.
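Option #2 could look something like the sketch below. {{SimpleService}}, {{RestartableHolder}} and {{CountingService}} are hypothetical names invented for this example and are not part of Hadoop; the counters stand in for whatever leak check a real test would use.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical minimal service contract for the sketch. */
interface SimpleService {
    void start();
    void stop();
}

/** Creates a brand-new service instance on every activation instead of restarting one. */
final class RestartableHolder {
    private final Supplier<SimpleService> factory;
    private SimpleService current;

    RestartableHolder(Supplier<SimpleService> factory) {
        this.factory = factory;
    }

    synchronized void activate() {
        if (current != null) {
            return; // already active
        }
        SimpleService fresh = factory.get(); // cleanly initialised instance
        fresh.start();
        current = fresh;
    }

    synchronized void deactivate() {
        if (current == null) {
            return; // already passive
        }
        current.stop();
        current = null; // the stopped instance is never reused
    }

    synchronized boolean isActive() {
        return current != null;
    }
}

/** Test double that counts created and currently running instances. */
final class CountingService implements SimpleService {
    static final AtomicInteger CREATED = new AtomicInteger();
    static final AtomicInteger RUNNING = new AtomicInteger();

    CountingService() {
        CREATED.incrementAndGet();
    }

    @Override
    public void start() {
        RUNNING.incrementAndGet();
    }

    @Override
    public void stop() {
        RUNNING.decrementAndGet();
    }
}
```

A leak test would then call activate() and deactivate() a few thousand times on a holder built with {{CountingService::new}} and check that the running count returns to where it started.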
> Augment Service model to support starting stopped services
> ----------------------------------------------------------
>
> Key: HADOOP-9933
> URL: https://issues.apache.org/jira/browse/HADOOP-9933
> Project: Hadoop Common
> Issue Type: Improvement
> Affects Versions: 2.1.0-beta
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Labels: service
>
> For ResourceManager-HA (YARN-149 and co), we would want to start/stop/start
> RM's active services as it transitions to Active/Standby/Active respectively.
> In the current service model, we can't start the services that are already
> stopped.
> Would be nice to augment this. To avoid accidental restart of stopped
> services, we can add another API: start(boolean restartIfStopped). Thoughts?