[
https://issues.apache.org/jira/browse/MAPREDUCE-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993863#comment-12993863
]
Steve Loughran commented on MAPREDUCE-437:
------------------------------------------
Reviewing the code in trunk, the problem is a bit more serious and relates to
what happens when a cached FS instance is closed: everyone else who holds a
reference to that instance can no longer use the filesystem.

This does not normally surface in production, as the JT runs in its own VM. It
does exist in MiniMR clusters, in testing, but hasn't shown up because nobody
other than me has tried to shut down an FS instance while the JT is still live.
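
To make the failure mode concrete, here is a minimal, self-contained sketch of a
cached instance being closed underneath a second holder. It is a toy model, not
Hadoop code: ToyFileSystem, its get() method, and the "hdfs://example" URI are
all hypothetical stand-ins, assuming only that the real cache hands the same
object to every caller that asks for the same filesystem.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Demo: a test and a (pretend) JobTracker share one cached instance. */
class SharedCloseDemo {
    public static void main(String[] args) {
        ToyFileSystem testFs = ToyFileSystem.get("hdfs://example");
        ToyFileSystem jtFs = ToyFileSystem.get("hdfs://example"); // same object

        testFs.close(); // the test is done with "its" filesystem

        try {
            jtFs.exists("/jobs");
            System.out.println("JT filesystem still usable");
        } catch (IOException e) {
            System.out.println("JT filesystem broken: " + e.getMessage());
        }
    }
}

/** Hypothetical stand-in for a filesystem client with a per-URI cache. */
class ToyFileSystem {
    private static final Map<String, ToyFileSystem> CACHE = new HashMap<>();

    private final String uri;
    private boolean closed;

    private ToyFileSystem(String uri) {
        this.uri = uri;
    }

    /** Returns the shared instance for this URI, creating it on first use. */
    static synchronized ToyFileSystem get(String uri) {
        return CACHE.computeIfAbsent(uri, ToyFileSystem::new);
    }

    /** Any operation on a closed instance fails, whoever holds the reference. */
    boolean exists(String path) throws IOException {
        if (closed) {
            throw new IOException("Filesystem closed");
        }
        return true;
    }

    /** Marks the instance closed and drops it from the cache. */
    void close() {
        synchronized (ToyFileSystem.class) {
            CACHE.remove(uri, this);
        }
        closed = true;
    }
}
```

The point of the sketch is that close() is invisible to the other holder until
its next call fails, which is why the problem only appears when something
shares a process with the JT.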
Proposed actions:
1. Rename this issue to be more explicit: JT must ask for a new FS instance and
   close it when terminated.
2. Add a test to verify that a MiniMR cluster will fail if you get the same
   instance and close it.
3. Have the JT get a new instance on startup/going live, and verify that test 2
   now passes.
4. Have the JT close its filesystem on shutdown and set its local reference to
   null.
I can't think of an easy way to test #4 unless there is a method to get the JT
filesystem reference.
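
The proposed actions can be sketched in the same toy style. Everything here is
hypothetical (ToyFs, ToyJobTracker, and the getFileSystem() accessor are
illustrative names, not JobTracker code); the real change would presumably use
an uncached instance such as the one FileSystem.newInstance() returns, but that
mapping is an assumption on my part.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Demo: the tracker owns a private instance, so closing the shared one is harmless. */
class PrivateInstanceDemo {
    public static void main(String[] args) throws IOException {
        ToyFs testFs = ToyFs.get("hdfs://example");
        ToyJobTracker jt = new ToyJobTracker();
        jt.start("hdfs://example");

        testFs.close(); // action 2: the test closes the cached instance
        System.out.println(jt.getFileSystem().exists("/jobs")); // action 3: JT unaffected

        jt.stop(); // action 4: close and null the reference
        System.out.println(jt.getFileSystem() == null);
    }
}

/** Hypothetical filesystem client offering cached and uncached instances. */
class ToyFs {
    private static final Map<String, ToyFs> CACHE = new HashMap<>();
    private boolean closed;

    /** Shared, cached instance: closing it affects every holder. */
    static synchronized ToyFs get(String uri) {
        return CACHE.computeIfAbsent(uri, u -> new ToyFs());
    }

    /** Private, uncached instance: the caller owns it and must close it. */
    static ToyFs newInstance(String uri) {
        return new ToyFs();
    }

    boolean exists(String path) throws IOException {
        if (closed) {
            throw new IOException("Filesystem closed");
        }
        return true;
    }

    boolean isClosed() {
        return closed;
    }

    void close() {
        closed = true;
    }
}

/** Hypothetical tracker that acquires its own instance and releases it on stop. */
class ToyJobTracker {
    private ToyFs fs;

    void start(String uri) {
        fs = ToyFs.newInstance(uri);
    }

    /** Accessor that would make action 4 testable; returns null once stopped. */
    ToyFs getFileSystem() {
        return fs;
    }

    void stop() {
        if (fs != null) {
            fs.close();
            fs = null;
        }
    }
}
```

With an accessor like this, a test for #4 could grab the reference before
stopping the tracker, then check both that the instance is closed and that the
tracker no longer holds it.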
> JobTracker may need to close its filesystem when being terminated
> -----------------------------------------------------------------
>
> Key: MAPREDUCE-437
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-437
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Steve Loughran
> Priority: Minor
>
> This is something I've been experimenting with in HADOOP-3268; I'm not sure
> what the right action is here.
> - Currently, the JobTracker does not close() its filesystem when it is shut
>   down. This will cause it to leak filesystem references if JobTrackers are
>   started and stopped in the same process.
> - The TestMRServerPorts test explicitly closes the filesystem:
>     jt.fs.close();
>     jt.stopTracker();
> - If you move the close() operation into the stopTracker()/terminate logic,
>   the filesystem gets cleaned up, but TestRackAwareTaskPlacement and
>   TestMultipleLevelCaching fail with a FilesystemClosed error (stack traces
>   to follow).
> Should the JobTracker close its filesystem whenever it is terminated? If so,
> there are some tests that need to be reworked slightly to not expect the
> filesystem to be live after the JobTracker is taken down.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira