[
https://issues.apache.org/jira/browse/HADOOP-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627827#action_12627827
]
Konstantin Shvachko commented on HADOOP-3628:
---------------------------------------------
Steve,
- I totally support your idea of returning a structured response. The NodeHealth
structure may also include the service state, a message, and an error code (as
Pete proposes) in the form of an enum (as Owen asks).
- But I don't think it should be thrown as an exception, because you can only
pack text into an exception.
- I also think that ping() can be merged with getServiceState() after that,
because the getServiceState() result is a subset of what ping() returns.
- BTW, I was not proposing to return a string, and my main concern is not the
cost of marshaling.
- I was advocating against throwing exceptions as the normal result of an
operation, because exceptions are for error handling.
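
To make the points above concrete, here is a minimal Java sketch of a structured
ping() result. Only NodeHealth, ping() and getServiceState() are names from this
discussion; the enum constants, HealthCode, SketchDataNode, register() and the
message text are hypothetical and are not taken from the attached patch.

```java
import java.util.Objects;

/** Hypothetical lifecycle states; the constant names are illustrative only. */
enum ServiceState { CREATED, STARTED, LIVE, FAILED, TERMINATED }

/** Hypothetical error codes, modelled as an enum rather than free text. */
enum HealthCode { OK, NOT_REGISTERED, INTERNAL_ERROR }

/** Structured result that ping() returns instead of throwing an exception. */
final class NodeHealth {
    private final ServiceState state;
    private final HealthCode code;
    private final String message;

    NodeHealth(ServiceState state, HealthCode code, String message) {
        this.state = Objects.requireNonNull(state);
        this.code = Objects.requireNonNull(code);
        this.message = Objects.requireNonNull(message);
    }

    ServiceState getState() { return state; }
    HealthCode getCode() { return code; }
    String getMessage() { return message; }
    boolean isHealthy() { return code == HealthCode.OK; }
}

/** Hypothetical component whose ping() reports problems as data. */
class SketchDataNode {
    private boolean registered = false;

    void register() { registered = true; }

    /** An unhealthy node is a normal result here, so nothing is thrown. */
    NodeHealth ping() {
        if (!registered) {
            return new NodeHealth(ServiceState.STARTED, HealthCode.NOT_REGISTERED,
                    "not yet registered with the name-node");
        }
        return new NodeHealth(ServiceState.LIVE, HealthCode.OK, "");
    }

    /** The state is a subset of the ping() result, so it is derived from it. */
    ServiceState getServiceState() { return ping().getState(); }
}
```

In this sketch a caller branches on getCode() and never has to parse exception
text, and getServiceState() becomes a thin view over ping(), which is the
argument for merging the two calls.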
The lifecycle API is a very interesting and useful abstraction.
And that is why I think it is so important to refine it.
It is a public API, so once committed it will be hard to change.
The attached pdf helps a lot. Is it going to be a part of hadoop documentation?
A minor correction: safe mode does not apply to data-nodes, only to the
name-node. I think your previous variant, which defined a data-node as live
once it has registered with the name-node, was correct.
> Add a lifecycle interface for Hadoop components: namenodes, job clients, etc.
> -----------------------------------------------------------------------------
>
> Key: HADOOP-3628
> URL: https://issues.apache.org/jira/browse/HADOOP-3628
> Project: Hadoop Core
> Issue Type: Improvement
> Components: dfs, mapred
> Affects Versions: 0.19.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: AbstractHadoopComponent.java, hadoop-3628.patch,
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch,
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch,
> hadoop-3628.patch, hadoop-3628.patch, hadoop-lifecycle.pdf,
> hadoop-lifecycle.sxw
>
>
> I'd like to propose we have a standard interface for Hadoop components, the
> things that get started or stopped when you bring up a namenode. Currently,
> some of these classes have a stop() or shutdown() method, with no standard
> name/interface, and there is no way of seeing if they are live, checking
> their health, or shutting them down reliably. Indeed, there is a tendency for
> the spawned threads to not want to die; to require the entire process to be
> killed to stop the workers.
> Having a standard interface would make it easier for
> * management tools to manage the different things
> * monitoring the state of things
> * subclassing
> The latter is interesting as right now TaskTracker and JobTracker start up
> threads in their constructor; that's very dangerous as subclasses may have
> their methods called before they are fully initialised. Adding this interface
> would be the right time to clean up the startup process so that subclassing
> is less risky.
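
The constructor hazard described in the issue text above can be reproduced with
a small Java sketch. Tracker and QueueTracker are hypothetical stand-ins, not
the real TaskTracker or JobTracker, and the join() is there only to make the
demonstration deterministic.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical base class; it can start its worker from the constructor or later. */
class Tracker {
    boolean workerRan;
    boolean workerSawReadyState;

    Tracker(boolean startInConstructor) {
        if (startInConstructor) {
            start();   // 'this' escapes before subclass fields are assigned
        }
    }

    /** Starts the worker thread and waits for it to finish. */
    final void start() {
        Thread worker = new Thread(() -> {
            workerSawReadyState = isReady();   // virtual call into the subclass
            workerRan = true;
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isReady() { return true; }
}

/** Subclass whose state is assigned only after the superclass constructor returns. */
class QueueTracker extends Tracker {
    private final List<String> queue = new ArrayList<>();

    QueueTracker(boolean startInConstructor) { super(startInConstructor); }

    @Override
    boolean isReady() { return queue != null; }
}
```

With new QueueTracker(true) the worker runs while queue is still null. With
new QueueTracker(false) followed by an explicit start(), the worker sees the
fully constructed object, which is what a separate lifecycle start step buys.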
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.