[ 
https://issues.apache.org/jira/browse/HADOOP-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623026#action_12623026
 ] 

Konstantin Shvachko commented on HADOOP-3628:
---------------------------------------------

I do not see any difference in the behavior of ping() between the states 
(UNDEFINED, CREATED, INITIALIZED, STARTED) and LIVE:
In both cases ping will return nothing and the caller will not be able to know 
whether the service is live or not.
Do I understand the purpose of ping() correctly?
In the case of FAILED or TERMINATED, ping() throws ServiceStateException, and 
this is the only way to know that the service is not live.
But is this the right way to pass information outside? Exceptions are for 
errors, right?
I think ping() is somewhat redundant because Service already has 
getServiceState(). This is as close to ping as it could be, imo.
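
To make the comparison concrete, here is a minimal sketch of the two ways of 
asking about liveness. It assumes ping() behaves as I read it above; the 
Service class, the ServiceState enum layout and the pingSucceeded() helper are 
hypothetical stand-ins written for illustration, not the code from the patch.

```java
public class PingSketch {
    // State names taken from the discussion; the enum itself is an assumption.
    enum ServiceState { UNDEFINED, CREATED, INITIALIZED, STARTED, LIVE, FAILED, TERMINATED }

    // Hypothetical stand-in for the ServiceStateException mentioned above.
    static class ServiceStateException extends Exception {
        ServiceStateException(String message) {
            super(message);
        }
    }

    // Hypothetical minimal service holding a fixed state.
    static class Service {
        private final ServiceState state;

        Service(ServiceState state) {
            this.state = state;
        }

        ServiceState getServiceState() {
            return state;
        }

        // ping() as read in this comment: silent unless FAILED or TERMINATED.
        void ping() throws ServiceStateException {
            if (state == ServiceState.FAILED || state == ServiceState.TERMINATED) {
                throw new ServiceStateException("Service is in state " + state);
            }
        }
    }

    // All a caller can learn from ping(): it threw, or it did not.
    static boolean pingSucceeded(Service service) {
        try {
            service.ping();
            return true;
        } catch (ServiceStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (ServiceState state : ServiceState.values()) {
            Service service = new Service(state);
            boolean live = service.getServiceState() == ServiceState.LIVE;
            System.out.println(state + ": ping succeeded=" + pingSucceeded(service)
                    + ", live per getServiceState()=" + live);
        }
    }
}
```

In this sketch a STARTED service and a LIVE service both pass ping(), so only 
getServiceState() tells them apart, and the not-live answer arrives as an 
exception rather than as a return value.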

> 1. is a datanode considered live only once registered?

Maybe this needs to be clarified. A data-node is indeed not alive until it 
registers for the first time. It can later be asked to re-register when the 
name-node restarts, but it should still be considered alive while the 
name-node is down, even though formally the data-node is then not registered 
with anything.

> 2. similarly, should the ping() require dnRegistration to be non null?

After registration, dnRegistration is not null, so ping() can assume that, if 
ping() is really necessary (see above).

> Add a lifecycle interface for Hadoop components: namenodes, job clients, etc.
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-3628
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3628
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs, mapred
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: AbstractHadoopComponent.java, hadoop-3628.patch, 
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, 
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch
>
>
> I'd like to propose we have a standard interface for hadoop components, the 
> things that get started or stopped when you bring up a namenode. Currently, 
> some of these classes have a stop() or shutdown() method, with no standard 
> name/interface, and no way of seeing if they are live, checking their health 
> or shutting them down reliably. Indeed, there is a tendency for the spawned 
> threads to not want to die; to require the entire process to be killed to 
> stop the workers. 
> Having a standard interface would make it easier for 
>  * management tools to manage the different things
>  * monitoring the state of things
>  * subclassing
> The last is interesting as right now TaskTracker and JobTracker start up 
> threads in their constructor; that's very dangerous as subclasses may have 
> their methods called before they are fully initialised. Adding this interface 
> would be the right time to clean up the startup process so that subclassing 
> is less risky.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
