[jira] Commented: (HADOOP-3628) Add a lifecycle interface for Hadoop components: namenodes, job clients, etc.

Steve Loughran (JIRA) Wed, 16 Jul 2008 05:54:25 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613942#action_12613942
 ]


Steve Loughran commented on HADOOP-3628:
----------------------------------------

3 tests have failed. 

#1: testJobTrackerPorts throws an NPE on job tracker termination, when the 
notifier (a singleton) has already been terminated and set to null. The fact 
that the notifier is a singleton is something I'd been hoping to deal with 
separately. Either it needs to be dealt with first, or the JobTracker 
terminate code needs to be made more robust. 

org.apache.hadoop.mapred.TestMRServerPorts.testJobTrackerPorts
Failing for the past 1 build (Since Failed#2833 )
Took 1 second.

java.lang.NullPointerException
        at org.apache.hadoop.mapred.JobEndNotifier.stopNotifier(JobEndNotifier.java:92)
        at org.apache.hadoop.mapred.JobTracker.innerTerminate(JobTracker.java:812)
        at org.apache.hadoop.util.Service.terminate(Service.java:134)
        at org.apache.hadoop.util.Service.deploy(Service.java:266)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:150)
        at org.apache.hadoop.mapred.TestMRServerPorts.canStartJobTracker(TestMRServerPorts.java:66)
        at org.apache.hadoop.mapred.TestMRServerPorts.testJobTrackerPorts(TestMRServerPorts.java:107)
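
For the "more robust" option, here is a minimal sketch of a stop method that 
tolerates being called before start, or more than once. NotifierSketch and its 
methods are hypothetical stand-ins for illustration; this is not the actual 
JobEndNotifier code, and the real fix may well look different.

```java
// Hypothetical stand-in for a singleton notifier; NOT the real JobEndNotifier.
final class NotifierSketch {
    private static Thread worker;            // null until started, null again after stop
    private static volatile boolean running;

    static synchronized void startNotifier() {
        if (worker != null) {
            return;                          // already started
        }
        running = true;
        worker = new Thread(() -> {
            while (running) {
                try {
                    Thread.sleep(50);        // placeholder for real notification work
                } catch (InterruptedException e) {
                    return;                  // interrupted by stopNotifier()
                }
            }
        }, "notifier-sketch");
        worker.setDaemon(true);
        worker.start();
    }

    static synchronized void stopNotifier() {
        Thread t = worker;
        if (t == null) {
            return;                          // never started, or already stopped: no NPE
        }
        running = false;
        t.interrupt();
        worker = null;
    }

    static synchronized boolean isRunning() {
        return worker != null;
    }
}
```

With a guard like this, a terminate path that runs after a failed or partial 
startup can call stopNotifier() unconditionally.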


#2 and #3 have almost the same stack trace: the filesystem is closed when the 
client expects it to be open. This doesn't show up on my work box, so it could 
be a race condition that is surfacing here.


org.apache.hadoop.mapred.TestRackAwareTaskPlacement.testTaskPlacement
Failing for the past 1 build (Since Failed#2833 )
Took 41 seconds.

java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:201)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:569)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:383)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:663)
        at org.apache.hadoop.mapred.TestRackAwareTaskPlacement.launchJobAndTestCounters(TestRackAwareTaskPlacement.java:77)
        at org.apache.hadoop.mapred.TestRackAwareTaskPlacement.testTaskPlacement(TestRackAwareTaskPlacement.java:155)

org.apache.hadoop.mapred.TestMultipleLevelCaching.testMultiLevelCaching
java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:201)
        at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:532)
        at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:185)
        at org.apache.hadoop.mapred.TestMultipleLevelCaching.testCachingAtLevel(TestMultipleLevelCaching.java:117)
        at org.apache.hadoop.mapred.TestMultipleLevelCaching.testMultiLevelCaching(TestMultipleLevelCaching.java:69)

Expect updated patches in the week of July 25.

> Add a lifecycle interface for Hadoop components: namenodes, job clients, etc.
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-3628
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3628
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs, mapred
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: AbstractHadoopComponent.java, hadoop-3628.patch, 
> hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch
>
>
> I'd like to propose we have a standard interface for Hadoop components, the 
> things that get started or stopped when you bring up a namenode. Currently, 
> some of these classes have a stop() or shutdown() method, with no standard 
> name/interface, and no way of seeing if they are live, checking their health 
> or shutting them down reliably. Indeed, there is a tendency for the spawned 
> threads to not want to die; to require the entire process to be killed to 
> stop the workers. 
> Having a standard interface would make it easier for 
>  * management tools to manage the different things
>  * monitoring the state of things
>  * subclassing
> The last of these is interesting, as right now TaskTracker and JobTracker 
> start up threads in their constructors; that's very dangerous, as subclasses 
> may have their methods called before they are fully initialised. Adding this 
> interface would be the right time to clean up the startup process so that 
> subclassing is less risky.
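
As an illustration of the kind of interface described above, here is a minimal 
sketch, assuming a base class with an explicit start step. LifecycleSketch, 
WorkerSketch, the State enum and hasThread() are hypothetical names chosen for 
this example; they are not the API in the attached patches. The point it shows 
is that the constructor starts no threads, so a subclass is fully constructed 
before any of its methods run, and that terminate is safe to call repeatedly.

```java
// Hypothetical lifecycle sketch; names are illustrative, NOT the HADOOP-3628 patch API.
abstract class LifecycleSketch {
    enum State { CREATED, STARTED, TERMINATED }

    private State state = State.CREATED;

    // Constructors do no work: threads are only created from start().
    public final synchronized void start() {
        if (state != State.CREATED) {
            throw new IllegalStateException("cannot start from " + state);
        }
        try {
            innerStart();
            state = State.STARTED;
        } catch (RuntimeException e) {
            terminate();                 // roll back a partial startup
            throw e;
        }
    }

    // Idempotent: a second call is a no-op.
    public final synchronized void terminate() {
        if (state == State.TERMINATED) {
            return;
        }
        state = State.TERMINATED;
        innerTerminate();
    }

    public final synchronized State getState() {
        return state;
    }

    protected abstract void innerStart();

    // Implementations must tolerate being called after a partial startup.
    protected abstract void innerTerminate();
}

// Hypothetical component that owns one worker thread.
class WorkerSketch extends LifecycleSketch {
    private Thread thread;

    @Override
    protected void innerStart() {
        thread = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);   // placeholder for real work
            } catch (InterruptedException e) {
                // interrupted by innerTerminate(); exit the thread
            }
        }, "worker-sketch");
        thread.setDaemon(true);
        thread.start();
    }

    @Override
    protected void innerTerminate() {
        if (thread != null) {            // null if startup never got this far
            thread.interrupt();
            thread = null;
        }
    }

    boolean hasThread() {
        return thread != null;
    }
}
```

A management or monitoring tool could then treat every component the same way: 
construct it, call start(), poll getState(), and call terminate() when done.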

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
