[
https://issues.apache.org/jira/browse/HADOOP-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714436#action_12714436
]
Steve Loughran commented on HADOOP-5933:
----------------------------------------
There isn't direct caching for DFSClients, but there is in {{FileSystem.get()}},
which is how I've been getting DFSClient instances (and those of other
filesystems).
Up until March I could have different things get filesystems, do some work and
then close them, but now in SVN_HEAD I'm seeing stack traces when different
threads try to work with filesystem instances they have been holding on to. So
one thread running a TaskTracker is happily spinning away, until something on a
different thread does a quick check that a specific file exists in the
filesystem and then closes the instance afterwards.
My stack traces are here: http://jira.smartfrog.org/jira/browse/SFOS-1208
The semantics of {{FileSystem.get()}} have changed; if I moved my code to the
new {{FileSystem.newInstance()}} method then things should work again. That
doesn't mean we don't benefit from tracing who closed the instance, only that
anyone else doing work in different threads and getting filesystem clients by
way of {{FileSystem.get()}} is going to encounter the same problems. I just saw
them first :)
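To make the failure mode concrete, here is a minimal stand-alone sketch of the pattern as I understand it. It is not Hadoop code: {{SharedClientDemo}}, {{Client}}, the string cache key and the paths are all hypothetical stand-ins, and the only things borrowed from the real API are the ideas of a caching {{get()}}, a non-caching {{newInstance()}} and a {{checkOpen()}} guard.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-alone model of a cached, shared client. Hypothetical names; NOT Hadoop code. */
public class SharedClientDemo {

    /** Hypothetical stand-in for a DFSClient-like object. */
    static class Client {
        private volatile boolean open = true;

        /** Same idea as a checkOpen() guard: fail once anyone has closed this client. */
        void checkOpen() {
            if (!open) {
                throw new IllegalStateException("client is closed");
            }
        }

        boolean exists(String path) {
            checkOpen();
            return path != null && !path.isEmpty(); // placeholder for a real lookup
        }

        void close() {
            open = false;
        }
    }

    private static final Map<String, Client> CACHE = new HashMap<>();

    /** Caching lookup: every caller using the same key shares one instance. */
    public static synchronized Client get(String key) {
        Client c = CACHE.get(key);
        if (c == null || !c.open) {   // a closed entry is replaced on the next lookup
            c = new Client();
            CACHE.put(key, c);
        }
        return c;
    }

    /** Non-caching lookup: the caller owns the instance and may close it safely. */
    public static Client newInstance() {
        return new Client();
    }

    /** Runs the given work on a second thread and waits for it to finish. */
    static void runOnOtherThread(Runnable work) {
        Thread t = new Thread(work);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("interrupted while waiting", e);
        }
    }

    /** True if the long-lived holder's client was closed underneath it. */
    static boolean holderIsBroken(Client longLived) {
        try {
            longLived.exists("/other/file");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    /** The quick check uses the shared instance and closes it. */
    public static boolean sharedHolderBreaks() {
        Client longLived = get("hdfs");          // held by the long-running thread
        runOnOtherThread(() -> {
            Client c = get("hdfs");              // same object as longLived
            c.exists("/some/file");
            c.close();                           // closes it for every holder
        });
        return holderIsBroken(longLived);
    }

    /** The quick check uses its own private instance and closes only that. */
    public static boolean privateHolderBreaks() {
        Client longLived = get("hdfs");
        runOnOtherThread(() -> {
            Client c = newInstance();            // not shared with anyone
            c.exists("/some/file");
            c.close();
        });
        return holderIsBroken(longLived);
    }

    public static void main(String[] args) {
        System.out.println("shared get():  holder broken = " + sharedHolderBreaks());
        System.out.println("newInstance(): holder broken = " + privateHolderBreaks());
    }
}
```

In this model the first case reports the holder as broken and the second does not, which is the behaviour described above: the thread that already holds a reference never goes back to the cache, so it never notices that the entry was replaced.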
> Make it harder to accidentally close a shared DFSClient
> -------------------------------------------------------
>
> Key: HADOOP-5933
> URL: https://issues.apache.org/jira/browse/HADOOP-5933
> Project: Hadoop Core
> Issue Type: Improvement
> Components: fs
> Affects Versions: 0.21.0
> Reporter: Steve Loughran
> Priority: Minor
> Attachments: HADOOP-5933.patch
>
>
> Every so often I get stack traces telling me that DFSClient is closed,
> usually in {{org.apache.hadoop.hdfs.DFSClient.checkOpen()}}. The root cause
> of this is usually that one thread has closed a shared fsclient while another
> thread still has a reference to it. If the other thread then asks for a new
> client it will get one, and the cache is repopulated, but if it already holds
> one, then I get to see a stack trace.
> It's effectively a race condition between clients in different threads.