[jira] Commented: (HADOOP-2915) mapred output files and directories should be created as the job submitter, not tasktracker or jobtracker

Sameer Paranjpye (JIRA) Tue, 04 Mar 2008 16:55:00 -0800

    [ https://issues.apache.org/jira/browse/HADOOP-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575211#action_12575211 ]


Sameer Paranjpye commented on HADOOP-2915:
------------------------------------------

> It seems really error-prone having a cache that gives out FileSystems more
> than once and requiring users to close them. Take, for instance, the case
> where a user has two jobs running at the same time: you can easily end up
> with two copies of the FileSystem being used for different jobs. I propose
> that we fix this by making the FileSystem cache contain weak references to
> the file systems and making the finalizers close the filesystem.

Will the finalizer also delete the filesystem entry from the cache? It probably
should; otherwise we could end up with lots of dead weak references in the
cache. We need to be careful when deleting the entry, though. A weak reference
starts returning null before the finalizer is run. Between the time the weak
reference dies and the time the finalizer runs, another call to get() could
cause a new filesystem object to be instantiated for the same authority and
user. The finalizer should therefore delete the entry if and only if the weak
reference in the cache is currently returning null. It should also deal with
the entry not being present.
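
A minimal sketch of the deletion rule described above, in case it helps the
discussion. WeakCache, removeIfDead and clearForDemo are hypothetical names, the
string key stands in for the authority and user, and none of this is the actual
FileSystem cache code. clearForDemo only simulates the garbage collector
clearing the reference so that the window can be reproduced on demand.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a cache of weakly referenced FileSystem objects.
// Not Hadoop's actual FileSystem cache; all names are illustrative only.
final class WeakCache<V> {
    private final Map<String, WeakReference<V>> entries = new HashMap<>();

    // Return the live value for key, creating a new one if the entry is
    // missing or its weak reference has already been cleared.
    synchronized V get(String key, Supplier<V> factory) {
        WeakReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = factory.get();
            entries.put(key, new WeakReference<>(value));
        }
        return value;
    }

    // Cleanup hook: remove the entry only if its reference currently returns
    // null. Returns false when the entry is absent or has been replaced by a
    // live object in the meantime.
    synchronized boolean removeIfDead(String key) {
        WeakReference<V> ref = entries.get(key);
        if (ref == null || ref.get() != null) {
            return false;
        }
        entries.remove(key);
        return true;
    }

    // Demo helper (assumption): clears the reference to simulate the garbage
    // collector doing so, which makes the behaviour deterministic to observe.
    synchronized void clearForDemo(String key) {
        WeakReference<V> ref = entries.get(key);
        if (ref != null) {
            ref.clear();
        }
    }

    synchronized int size() {
        return entries.size();
    }
}
```

If the old reference dies and get() installs a replacement before the cleanup
runs, removeIfDead sees a live reference and leaves the new entry alone. If the
entry is already gone, it simply returns false.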






> mapred output files and directories should be created as the job submitter, 
> not tasktracker or jobtracker
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2915
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2915
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs, mapred
>    Affects Versions: 0.16.0
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.16.1
>
>         Attachments: 2915_20080229.patch, 2915_20080302.patch, 
> 2915_20080303.patch
>
>
> Quoted from an email sent to core-dev by Andy Li:
> {quote}
> For example, assuming I have installed Hadoop with an account 'hadoop' and I 
> am going to run my program with user account 'test'. I have created an input 
> folder as /user/test/input/ with user 'test' and the permission is set to 
> 0775.
> /user/test/input      <dir>          2008-02-27 01:20 rwxr-xr-x      test  hadoop
> When I run the MapReduce code, the output I specified will be set to user 
> 'hadoop' instead of 'test'.
> /bin/hadoop jar /tmp/test_perm.jar -m 57 -r 3 "/user/test/input/l" "/user/test/output/"
> The directory "/user/test/output/" will have the following permission and 
> user:group.
> /user/test/output    <dir>          2008-02-27 03:53        rwxr-xr-x hadoop  hadoop
> {quote}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
