[ 
https://issues.apache.org/jira/browse/HDFS-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778992#action_12778992
 ] 

Steve Loughran commented on HDFS-326:
-------------------------------------

Stu,

I've not touched this code for a while because it was working well enough to 
show we could push out dynamically configured Hadoop installations. Now I've 
been pulled into the other half of the problem: asking for allocated 
real/virtual machines, creating valid configurations for the workers based on 
the master nodes' hostnames, pushing them out, and so on. That is a good test 
case for all this dynamic stuff, but the other bits and their tests do take up 
a lot of time. 

I should put up a plan for doing this properly, something like:
 * switch to Git to keep changes more isolated
 * write some tests to demonstrate the problems with JobTracker hanging if the 
filesystem isn't there
 * come up with a solution that involves interrupting threads or similar
 * get the base changes into -common, then worry about hdfs and mapred as 
separate issues
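
To make the "interrupting threads" item concrete, here is a minimal sketch, not 
the actual JobTracker code. The class InterruptibleWaiter and its latch are 
hypothetical stand-ins for a service blocked waiting on a filesystem that never 
comes up. It assumes the blocking call responds to interrupts, which real 
blocking I/O may not.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a service that blocks until the filesystem is up. */
class InterruptibleWaiter implements Runnable {
    // Never counted down in this sketch: simulates a filesystem that is not there.
    private final CountDownLatch filesystemReady = new CountDownLatch(1);
    // Counted down when the worker thread leaves run().
    private final CountDownLatch exited = new CountDownLatch(1);
    private volatile boolean interrupted = false;
    private Thread worker;

    /** Start the worker thread explicitly, not from the constructor. */
    public synchronized void start() {
        worker = new Thread(this, "fs-waiter");
        worker.setDaemon(true);
        worker.start();
    }

    @Override
    public void run() {
        try {
            filesystemReady.await();      // would hang forever without an interrupt
        } catch (InterruptedException e) {
            interrupted = true;           // treat the interrupt as a stop request
        } finally {
            exited.countDown();
        }
    }

    /** Interrupt the worker and wait up to timeoutMillis for it to exit. */
    public synchronized boolean stop(long timeoutMillis) throws InterruptedException {
        if (worker == null) {
            return true;                  // never started, nothing to stop
        }
        worker.interrupt();
        return exited.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public boolean wasInterrupted() {
        return interrupted;
    }
}
```

A test for the hang would start the waiter, call stop() with a timeout, and 
fail if stop() returns false.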

Thoughts?

> Add a lifecycle interface for Hadoop components: namenodes, job clients, etc.
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-326
>                 URL: https://issues.apache.org/jira/browse/HDFS-326
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: AbstractHadoopComponent.java, HADOOP-3628-18.patch, 
> HADOOP-3628-19.patch, hadoop-3628.patch, hadoop-3628.patch, 
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, 
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, 
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, 
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, 
> hadoop-lifecycle-tomw.sxw, hadoop-lifecycle.pdf, hadoop-lifecycle.pdf, 
> hadoop-lifecycle.sxw
>
>
> I'd like to propose we have a standard interface for Hadoop components, the 
> things that get started or stopped when you bring up a namenode. Currently, 
> some of these classes have a stop() or shutdown() method, with no standard 
> name/interface, and there is no way of seeing if they are live, checking 
> their health, or shutting them down reliably. Indeed, there is a tendency for 
> the spawned threads to not want to die; to require the entire process to be 
> killed to stop the workers. 
> Having a standard interface would make it easier for 
>  * management tools to manage the different things
>  * monitoring the state of things
>  * subclassing
> The latter is interesting as right now TaskTracker and JobTracker start up 
> threads in their constructor; that's very dangerous as subclasses may have 
> their methods called before they are fully initialised. Adding this interface 
> would be the right time to clean up the startup process so that subclassing 
> is less risky.
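
As an illustration of the proposal quoted above, here is a minimal sketch of 
what such a lifecycle interface could look like. All names (Lifecycle, 
AbstractComponent, HeartbeatComponent, innerStart, innerStop) are hypothetical 
and are not taken from the attached patches. The point it shows is that the 
constructor only records configuration, and threads are created in start(), so 
a subclass is fully initialised before any of its methods run on another thread.

```java
/** Hypothetical lifecycle contract; names are illustrative only. */
interface Lifecycle {
    enum State { CREATED, LIVE, STOPPED }

    void start();
    void stop();
    State getState();
    boolean isHealthy();
}

/** Base class that owns the state machine; subclasses fill in the hooks. */
abstract class AbstractComponent implements Lifecycle {
    private State state = State.CREATED;

    public final synchronized void start() {
        if (state != State.CREATED) {
            throw new IllegalStateException("cannot start from " + state);
        }
        innerStart();
        state = State.LIVE;
    }

    public final synchronized void stop() {
        if (state == State.STOPPED) {
            return;                       // stop() is idempotent
        }
        try {
            innerStop();
        } finally {
            state = State.STOPPED;
        }
    }

    public final synchronized State getState() {
        return state;
    }

    public synchronized boolean isHealthy() {
        return state == State.LIVE;
    }

    protected abstract void innerStart();
    protected abstract void innerStop();
}

/** Example component: the constructor starts no threads. */
class HeartbeatComponent extends AbstractComponent {
    private final String name;
    private Thread thread;

    HeartbeatComponent(String name) {
        this.name = name;                 // configuration only
    }

    @Override
    protected void innerStart() {
        thread = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(100);    // placeholder for real work
                }
            } catch (InterruptedException e) {
                // interrupt is the stop signal; fall through and exit
            }
        }, name);
        thread.setDaemon(true);
        thread.start();
    }

    @Override
    protected void innerStop() {
        if (thread == null) {
            return;                       // stopped before it was ever started
        }
        thread.interrupt();
        try {
            thread.join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isThreadAlive() {
        return thread != null && thread.isAlive();
    }
}
```

A management or monitoring tool would then only need the Lifecycle type to 
start, stop, and poll any component, whatever its concrete class.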
