[jira] Commented: (HADOOP-5933) Make it harder to accidentally close a shared DFSClient

Steve Loughran (JIRA) Fri, 29 May 2009 05:30:21 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714436#action_12714436
 ]


Steve Loughran commented on HADOOP-5933:
----------------------------------------

There isn't direct caching for DFSClients, but there is in {{FileSystem.get()}}, 
which is how I've been getting DFSClient instances (and those of other 
filesystems).

Up until March, different parts of my code could each get a filesystem, do some 
work and then close it, but now in SVN_HEAD I'm seeing stack traces when 
different threads try to work with filesystem instances they have been holding 
on to. So one thread running a TaskTracker is happily spinning away, until 
something on a different thread does a quick check that a specific file exists 
in the filesystem and closes its instance afterwards. 

My stack traces are here: http://jira.smartfrog.org/jira/browse/SFOS-1208

The semantics of {{FileSystem.get()}} have changed; if I moved my code to the 
new {{FileSystem.newInstance()}} method then things should work again. That 
doesn't mean we don't benefit from tracing who closed the instance, only that 
anyone else doing work in different threads and getting their filesystem 
clients by way of {{FileSystem.get()}} is going to encounter the same 
problems. I just saw them first :)
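
To make the failure mode concrete, here is a minimal, self-contained sketch. 
It does not use Hadoop at all: {{Client}}, {{get()}}, {{newInstance()}} and 
{{evict()}} are hypothetical stand-ins that only model the behaviour described 
above (a cache keyed by URI handing the same instance to every caller), and the 
recorded close-site stack trace is my own illustration of "tracing who closed 
the instance", not the contents of the attached patch.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class SharedClientSketch {

    /** Hypothetical stand-in for a filesystem client; NOT the real DFSClient. */
    static class Client {
        // Remembers where close() was called so a later failure can report it.
        private volatile Throwable closedAt;

        void close() {
            evict(this); // a closed client leaves the cache, so the next get() repopulates it
            closedAt = new Throwable("closed by thread " + Thread.currentThread().getName());
        }

        boolean exists(String path) throws IOException {
            checkOpen();
            return true; // placeholder: a real client would ask the filesystem
        }

        private void checkOpen() throws IOException {
            Throwable closer = closedAt;
            if (closer != null) {
                // Chain the close site as the cause so the trace shows who closed it.
                throw new IOException("Filesystem closed", closer);
            }
        }
    }

    // Assumed model of the cache: one shared client per URI.
    private static final Map<String, Client> CACHE = new HashMap<>();

    /** Models a cached lookup: every caller for the same URI gets the same object. */
    static synchronized Client get(String uri) {
        return CACHE.computeIfAbsent(uri, u -> new Client());
    }

    /** Models an uncached lookup: the caller owns the instance and may close it. */
    static Client newInstance(String uri) {
        return new Client();
    }

    static synchronized void evict(Client c) {
        CACHE.values().remove(c);
    }

    /**
     * A long-lived holder keeps a cached client while a second thread does a
     * quick existence check and closes its client. Returns true if the
     * long-lived handle still works afterwards.
     */
    static boolean sharedHandleSurvivesQuickCheck(boolean usePrivateInstance) throws Exception {
        String uri = "hdfs://example"; // illustrative URI only
        Client longLived = get(uri);
        Thread checker = new Thread(() -> {
            Client c = usePrivateInstance ? newInstance(uri) : get(uri);
            try {
                c.exists("/some/file");
            } catch (IOException ignored) {
                // not relevant to this sketch
            }
            c.close();
        }, "checker");
        checker.start();
        checker.join();
        try {
            longLived.exists("/other/file");
            return true;
        } catch (IOException e) {
            return false; // e.getCause() carries the "closed by thread checker" trace
        } finally {
            longLived.close(); // reset the cache between runs
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("cached get():  " + sharedHandleSurvivesQuickCheck(false));
        System.out.println("newInstance(): " + sharedHandleSurvivesQuickCheck(true));
    }
}
```

In this model the long-lived handle breaks when the checker thread goes through 
the cached {{get()}}, and survives when the checker uses its own instance. 
Whether the real {{FileSystem}} cache behaves identically in every release is 
something to verify against the code rather than take from this sketch.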

> Make it harder to accidentally close a shared DFSClient
> -------------------------------------------------------
>
>                 Key: HADOOP-5933
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5933
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.21.0
>            Reporter: Steve Loughran
>            Priority: Minor
>         Attachments: HADOOP-5933.patch
>
>
> Every so often I get stack traces telling me that DFSClient is closed, 
> usually in {{org.apache.hadoop.hdfs.DFSClient.checkOpen()}}. The root cause 
> of this is usually that one thread has closed a shared fsclient while another 
> thread still has a reference to it. If the other thread then asks for a new 
> client it will get one (and the cache is repopulated), but if it has one 
> already, then I get to see a stack trace. 
> It's effectively a race condition between clients in different threads. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
