[
https://issues.apache.org/jira/browse/HADOOP-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628176#action_12628176
]
Steve Loughran commented on HADOOP-3628:
----------------------------------------
Konstantin,
-Of course the docs can be part of the Hadoop documentation; they've got Apache
licenses on them for that reason.
-The nice thing about getServiceState() is that it is side-effect free. Once we
add more advanced health checks, the cost of a ping() will increase. So yes,
the same info will be returned, but a ping() does more work. If you take that
away, then every service would have to create another thread to do its own
health checks; not that expensive for big services, but costly for small
things deployed many-to-a-JVM.
-If we do away with the separate operation, then HTML pages for each service
could still get the state and return something other than 200 if the service
was not in one of the desired states. Load balancers and other tools could use
this with a GET of /service/state?state=LIVE to demand a live service.
-I'll try and prototype a response that includes state+other info, though I'm
pretty busy with other things for the next week. I will propose some ideas
first.
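
To make the getServiceState()/ping() distinction above concrete, here is a
rough Java sketch. It is not taken from the patch: the class names, the state
names other than LIVE, and the choice of 503 as the "something other than 200"
status are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical lifecycle states; only LIVE is named in the comment above. */
enum ServiceState { CREATED, LIVE, FAILED, TERMINATED }

/** Hypothetical service: a cheap state read versus a ping() that does real work. */
class SketchService {
    private volatile ServiceState state = ServiceState.CREATED;
    private final AtomicInteger healthChecksRun = new AtomicInteger();

    /** Side-effect free: only reads a field, so it is safe to call often. */
    ServiceState getServiceState() {
        return state;
    }

    /** Runs the health check on the caller's thread, then reports the state. */
    ServiceState ping() {
        healthChecksRun.incrementAndGet();
        if (state == ServiceState.LIVE && !checkHealth()) {
            state = ServiceState.FAILED;
        }
        return state;
    }

    /** Placeholder for the more advanced (and more expensive) checks. */
    protected boolean checkHealth() {
        return true;
    }

    void start() {
        state = ServiceState.LIVE;
    }

    int healthChecksRun() {
        return healthChecksRun.get();
    }
}

/** What a handler for GET /service/state?state=LIVE might compute. */
class StateEndpoint {
    /** 200 if the service is in the desired state; 503 is an assumed choice otherwise. */
    static int statusFor(SketchService service, ServiceState desired) {
        return service.getServiceState() == desired ? 200 : 503;
    }
}
```

A load balancer polling the state endpoint only ever triggers the cheap read;
the health checks run when something calls ping(), so the service does not
need a thread of its own for them.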
Some of the WS-* management APIs have a state model that can include lots of
inner information
[http://docs.oasis-open.org/wsdm/wsdm-muws2-1.1-spec-os-01.htm#_Toc129683823].
I'm not sure how far down that path we should go. Maybe people that use the
various management tools to monitor their cluster will have opinions here?
> Add a lifecycle interface for Hadoop components: namenodes, job clients, etc.
> -----------------------------------------------------------------------------
>
> Key: HADOOP-3628
> URL: https://issues.apache.org/jira/browse/HADOOP-3628
> Project: Hadoop Core
> Issue Type: Improvement
> Components: dfs, mapred
> Affects Versions: 0.19.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: AbstractHadoopComponent.java, hadoop-3628.patch,
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch,
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch,
> hadoop-3628.patch, hadoop-3628.patch, hadoop-lifecycle.pdf,
> hadoop-lifecycle.sxw
>
>
> I'd like to propose we have a standard interface for Hadoop components, the
> things that get started or stopped when you bring up a namenode. Currently,
> some of these classes have a stop() or shutdown() method, with no standard
> name/interface, and there is no way of seeing if they are live, checking
> their health, or shutting them down reliably. Indeed, there is a tendency
> for the spawned threads to not want to die; to require the entire process to
> be killed to stop the workers.
> Having a standard interface would make it easier for
> * management tools to manage the different things
> * monitoring the state of things
> * subclassing
> The last is interesting as right now TaskTracker and JobTracker start up
> threads in their constructor; that's very dangerous as subclasses may have
> their methods called before they are fully initialised. Adding this interface
> would be the right time to clean up the startup process so that subclassing
> is less risky.
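
The constructor hazard described in the quoted issue can be reproduced in a
few lines of Java. This is only a sketch: the class names are invented, and a
direct call from the constructor stands in for a thread started there, so that
the outcome is deterministic rather than timing-dependent.

```java
import java.util.ArrayList;
import java.util.List;

/** Risky pattern: work begins inside the constructor. */
abstract class EagerComponent {
    /** Records what the subclass hook observed when it ran. */
    protected String observed = "not called";

    EagerComponent() {
        // Stands in for starting a thread here that calls back into this object.
        work();
    }

    abstract void work();
}

class EagerWorker extends EagerComponent {
    // Subclass field initialisers run only after the superclass constructor returns.
    private final List<String> queue = new ArrayList<>();

    @Override
    void work() {
        observed = (queue == null) ? "queue was null" : "queue ready";
    }
}

/** Safer pattern: construction and startup are separate steps. */
abstract class LifecycleComponent {
    protected String observed = "not called";

    /** Called explicitly once the object, including subclasses, is fully built. */
    final void start() {
        work();
    }

    abstract void work();
}

class LifecycleWorker extends LifecycleComponent {
    private final List<String> queue = new ArrayList<>();

    @Override
    void work() {
        observed = (queue == null) ? "queue was null" : "queue ready";
    }
}
```

With a real thread the subclass might or might not see the uninitialised
field, depending on scheduling, which makes the bug harder to track down than
this deterministic version suggests.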