[ 
https://issues.apache.org/jira/browse/HADOOP-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627827#action_12627827
 ] 

Konstantin Shvachko commented on HADOOP-3628:
---------------------------------------------

Steve, 
- I totally support your idea of returning a structured response. The NodeHealth 
structure could also include the service state, a message, and an error code 
(as Pete proposes) in the form of an enum (as Owen asks).
- But I don't think it should be thrown as an exception, because an exception 
can only carry text.
- I also think that ping() can then be merged with getServiceState(), because 
the getServiceState() result is a subset of what ping() returns.
- BTW, I was not proposing to return a string, and my main concern is not the 
cost of marshaling.
- I was advocating against throwing exceptions as the normal result of an 
operation, because exceptions are for error handling.
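
To make the points above concrete, here is a rough sketch of what a structured 
ping() result could look like. All names below (ServiceState, HealthCode, the 
NodeHealth fields, Pingable, DemoService) are my own assumptions for 
illustration and are not taken from the attached patch. The state values in 
particular are placeholders, not the ones in the lifecycle document.

```java
// Hypothetical sketch only: these types are illustrative assumptions,
// not the API from the HADOOP-3628 patch.

// Placeholder lifecycle states; the real set is defined by the lifecycle proposal.
enum ServiceState { CREATED, LIVE, FAILED, TERMINATED }

// Error code expressed as an enum rather than free text.
enum HealthCode { OK, NOT_READY, DEGRADED }

// Immutable value returned by ping(): state, error code and a message.
final class NodeHealth {
    private final ServiceState state;
    private final HealthCode code;
    private final String message;

    NodeHealth(ServiceState state, HealthCode code, String message) {
        this.state = state;
        this.code = code;
        this.message = message;
    }

    ServiceState getState() { return state; }
    HealthCode getCode() { return code; }
    String getMessage() { return message; }

    // Healthy means the service is live and reports no problem.
    boolean isHealthy() {
        return state == ServiceState.LIVE && code == HealthCode.OK;
    }
}

// A single call replaces a separate getServiceState(): the state is part of the result.
interface Pingable {
    NodeHealth ping();
}

// Toy component showing that an unhealthy node is a normal return value, not an exception.
final class DemoService implements Pingable {
    private volatile ServiceState state = ServiceState.CREATED;
    private volatile boolean diskOk = true;

    void start() { state = ServiceState.LIVE; }

    void simulateDiskFailure() { diskOk = false; }

    @Override
    public NodeHealth ping() {
        if (state != ServiceState.LIVE) {
            return new NodeHealth(state, HealthCode.NOT_READY, "service is not live");
        }
        if (!diskOk) {
            return new NodeHealth(state, HealthCode.DEGRADED, "disk check failed");
        }
        return new NodeHealth(state, HealthCode.OK, "ok");
    }
}
```

With something like this, a caller that only wants the service state reads 
ping().getState(), and a monitoring tool can switch on getCode() without 
parsing text. Exceptions stay reserved for the case where the ping itself 
cannot be carried out.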

The life cycle api is a very interesting and useful abstraction.
And that is why I think it is so important to refine it.
It is a public api, so once committed it will be hard to change.

The attached pdf helps a lot. Is it going to be part of the hadoop documentation?
A minor correction: safe mode does not apply to data-nodes, only to the name-node.
I think your previous variant, which defined a data-node as live once it has 
registered with the name-node, was correct.

> Add a lifecycle interface for Hadoop components: namenodes, job clients, etc.
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-3628
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3628
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs, mapred
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: AbstractHadoopComponent.java, hadoop-3628.patch, 
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, 
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, 
> hadoop-3628.patch, hadoop-3628.patch, hadoop-lifecycle.pdf, 
> hadoop-lifecycle.sxw
>
>
> I'd like to propose we have a standard interface for hadoop components, the 
> things that get started or stopped when you bring up a namenode. Currently, 
> some of these classes have a stop() or shutdown() method, with no standard 
> name/interface, and no way of seeing if they are live, checking their health 
> or shutting them down reliably. Indeed, there is a tendency for the spawned 
> threads to not want to die; to require the entire process to be killed to 
> stop the workers. 
> Having a standard interface would make it easier for 
>  * management tools to manage the different things
>  * monitoring the state of things
>  * subclassing
> The latter is interesting as right now TaskTracker and JobTracker start up 
> threads in their constructor; that's very dangerous as subclasses may have 
> their methods called before they are fully initialised. Adding this interface 
> would be the right time to clean up the startup process so that subclassing 
> is less risky.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.