[jira] Commented: (MAPREDUCE-437) JobTracker may need to close its filesystem when being terminated

Steve Loughran (JIRA) Sat, 12 Feb 2011 01:09:25 -0800

    [ https://issues.apache.org/jira/browse/MAPREDUCE-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993863#comment-12993863 ]


Steve Loughran commented on MAPREDUCE-437:
------------------------------------------

Reviewing the code in trunk, the problem is a bit more serious. It relates to 
what happens when a cached FS instance is closed: everyone else who holds a 
reference to that instance can no longer use the filesystem. 

This does not normally surface in production, as the JT runs in its own VM. It 
does exist in MiniMR clusters in testing, but hasn't shown up because nobody 
other than me has tried to shut down an FS instance while the JT is still live.
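
To make the failure mode concrete, here is a minimal, self-contained sketch of 
the shared-instance problem. It does not use the Hadoop classes: SketchFs, 
SketchFsCache and SharedCloseDemo are hypothetical stand-ins I made up for 
illustration, and the URI is a placeholder. The assumption is only that the 
cache hands the same object to every caller asking for the same URI, which is 
how FileSystem.get() behaves when caching is enabled.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a filesystem client; NOT the Hadoop FileSystem class.
class SketchFs implements Closeable {
    private boolean open = true;

    // Every operation first checks that nobody has closed this instance.
    boolean exists(String path) throws IOException {
        if (!open) {
            throw new IOException("Filesystem closed");
        }
        return true; // the sketch pretends every path exists
    }

    @Override
    public void close() {
        open = false;
    }
}

// Hypothetical cache keyed by URI string.
class SketchFsCache {
    private static final Map<String, SketchFs> CACHE = new HashMap<>();

    // Returns the shared, cached instance for this URI.
    static synchronized SketchFs get(String uri) {
        return CACHE.computeIfAbsent(uri, k -> new SketchFs());
    }

    // Returns a private instance that bypasses the cache.
    static SketchFs newInstance(String uri) {
        return new SketchFs();
    }
}

class SharedCloseDemo {
    // Returns true if closing one holder's reference broke the other holder.
    static boolean closeBreaksOtherHolder() {
        SketchFs testFs = SketchFsCache.get("hdfs://example:8020");
        SketchFs trackerFs = SketchFsCache.get("hdfs://example:8020");
        testFs.close(); // the test shuts down "its" filesystem
        try {
            trackerFs.exists("/tmp"); // the tracker is still live and tries to use it
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("other holder broken: " + closeBreaksOtherHolder());
    }
}
```

In real Hadoop code the non-cached path would, as far as I can tell, be 
FileSystem.newInstance(conf) rather than FileSystem.get(conf); the sketch only 
mirrors that distinction, not the actual implementation.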

Proposed actions:
 1. Rename this issue to be more explicit: the JT must ask for a new FS 
    instance and close it when terminated.
 2. Add a test to verify that a MiniMR cluster will fail if you get the same 
    instance and close it.
 3. Have the JT get a new instance on startup/going live, and verify that the 
    test from step 2 now passes.
 4. Have the JT close its filesystem on shutdown and set its local reference 
    to null.

I can't think of an easy way to test step 4 unless there is a method to get the 
JT filesystem reference.
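
The last two proposed actions could look roughly like the sketch below. This is 
not the JobTracker code: FsHandle, SketchTracker and getFsForTesting() are 
hypothetical names, and the test-only accessor is just one assumed way to make 
the close-on-shutdown behaviour checkable.

```java
import java.io.Closeable;
import java.util.function.Supplier;

// Hypothetical filesystem handle; stands in for a real filesystem client.
class FsHandle implements Closeable {
    private volatile boolean open = true;

    boolean isOpen() {
        return open;
    }

    @Override
    public void close() {
        open = false;
    }
}

// Hypothetical tracker that owns a private filesystem instance.
class SketchTracker {
    private final Supplier<FsHandle> fsFactory;
    private FsHandle fs;

    // The factory is expected to return a NEW instance on every call,
    // never a shared, cached one.
    SketchTracker(Supplier<FsHandle> fsFactory) {
        this.fsFactory = fsFactory;
    }

    // On startup/going live: ask for a fresh instance.
    synchronized void start() {
        fs = fsFactory.get();
    }

    // On shutdown: close the owned instance and drop the reference.
    synchronized void stop() {
        if (fs != null) {
            fs.close();
            fs = null;
        }
    }

    // Assumed test-only accessor so a test can observe the reference.
    synchronized FsHandle getFsForTesting() {
        return fs;
    }
}
```

A test would then grab the reference via getFsForTesting() before calling 
stop(), and afterwards check that the grabbed handle is closed, that the 
accessor returns null, and that any handle held by the test itself is still 
open.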

> JobTracker may need to close its filesystem when being terminated
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-437
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-437
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Steve Loughran
>            Priority: Minor
>
> This is something I've been experimenting with in HADOOP-3268; I'm not sure 
> what the right action is here.
> -currently, the JobTracker does not close() its filesystem when it is shut 
> down. This will cause it to leak filesystem references if JobTrackers are 
> started and stopped in the same process.
> -The TestMRServerPorts test explicitly closes the filesystem
>         jt.fs.close();
>         jt.stopTracker();
> -If you move the close() operation into the stopTracker()/terminate logic, 
> the filesystem gets cleaned up, but 
> TestRackAwareTaskPlacement and TestMultipleLevelCaching fail with a 
> FilesystemClosed error (stack traces to follow)
> Should the JobTracker close its filesystem whenever it is terminated? If so, 
> there are some tests that need to be reworked slightly to not expect the 
> filesystem to be live after the jobtracker is taken down.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
