[
https://issues.apache.org/jira/browse/HADOOP-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624021#action_12624021
]
Steve Loughran commented on HADOOP-3628:
----------------------------------------
Konstantin, thank you for your comments.
I've just posted some slides where I talk about my experience of using the
current prototype, where I'd been blocking my filesystem operations until the
namenode and datanodes went live.
http://people.apache.org/~stevel/slides/deploying_hadoop_with_smartfrog.pdf
I want to delay work until the cluster itself is live, which is slighly harder
to determine than aggregating the liveness of every node; an HDFS cluster is
"live" if enough datanodes are connected to the namenode, where "enough" is a
matter of personal preference. If there is a namenode outage then all the
datanodes are still live, but the cluster itself is down.
How do we represent these states to the caller?
One option: fail the ping() with explicit exceptions for NoManager and
NoWorkers. The Job and Task Trackers could throw the same exceptions when they
were down. Callers would then be able to look out for these exceptions and
recognise that these are probably transient states and not overreact, or react
by checking the health of other parts of the cluster.
An alternate option -suggested by Doug Cutting- is to have the nodes switch
from LIVE to STARTED when they arent connected to the other parts of the
system. More formally
A Namenode is only only live when it has >'enough' worker. When the number of
datanodes drops below this value, the name node reverts to the STARTED state
until the count goes above a predefined minimum value.
A Datanode is only live when it is connected to a name node. When it
deregisters or loses contact, it goes back to STARTED until it successfully
registers.
The job/task trackers would do the same.
This approach would still aggregate well. A full M/R cluster would revert to
STARTED if the namenode considered itself in the STARTED state, or if the
JobTracker was in that state.
1. I can prototype this if people think it is good.
2. it would seem to make sense to have the ping() operation return the current
state, so you'd easily differentiate the started versus live ping(), the latter
being stricter about system health.
3. There will be some fun tests here, but good ones, as we get to verify the
cluster copes with with transient node outages. They may be slow, though.
> Add a lifecycle interface for Hadoop components: namenodes, job clients, etc.
> -----------------------------------------------------------------------------
>
> Key: HADOOP-3628
> URL: https://issues.apache.org/jira/browse/HADOOP-3628
> Project: Hadoop Core
> Issue Type: Improvement
> Components: dfs, mapred
> Affects Versions: 0.19.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: AbstractHadoopComponent.java, hadoop-3628.patch,
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch,
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch
>
>
> I'd like to propose we have a standard interface for hadoop components, the
> things that get started or stopped when you bring up a namenode. currently,
> some of these classes have a stop() or shutdown() method, with no standard
> name/interface, but no way of seeing if they are live, checking their health
> of shutting them down reliably. Indeed, there is a tendency for the spawned
> threads to not want to die; to require the entire process to be killed to
> stop the workers.
> Having a standard interface would make it easier for
> * management tools to manage the different things
> * monitoring the state of things
> * subclassing
> The latter is interesting as right now TaskTracker and JobTracker start up
> threads in their constructor; that's very dangerous as subclasses may have
> their methods called before they are full initialised. Adding this interface
> would be the right time to clean up the startup process so that subclassing
> is less risky.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.