[jira] Commented: (HADOOP-2915) mapred output files and directories should be created as the job submitter, not tasktracker or jobtracker

Owen O'Malley (JIRA) Tue, 04 Mar 2008 14:50:52 -0800

    [ 
https://issues.apache.org/jira/browse/HADOOP-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575175#action_12575175
 ]


Owen O'Malley commented on HADOOP-2915:
---------------------------------------

You cannot use UserGroupInformation as the key in a hash table, because it 
defines neither hashCode nor equals. I would recommend using the user name 
instead, since String has all of the important methods defined.
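
A minimal sketch of the problem, assuming a hypothetical stand-in class Ugi 
that, like the class discussed above, overrides neither equals nor hashCode. 
The names Ugi, UserKeyDemo, and "fs-for-test" are illustrative only and are 
not taken from the patch:

```java
import java.util.HashMap;
import java.util.Map;

public class UserKeyDemo {
  // Hypothetical stand-in: no equals/hashCode, so Object's identity-based
  // versions are inherited.
  static final class Ugi {
    final String userName;
    Ugi(String userName) { this.userName = userName; }
  }

  // Keyed by the object itself: a second, logically equal instance misses.
  static boolean lookupByObject() {
    Map<Ugi, String> cache = new HashMap<>();
    cache.put(new Ugi("test"), "fs-for-test");
    return cache.containsKey(new Ugi("test"));
  }

  // Keyed by the user name: String defines equals and hashCode by value.
  static boolean lookupByName() {
    Map<String, String> cache = new HashMap<>();
    cache.put(new Ugi("test").userName, "fs-for-test");
    return cache.containsKey(new Ugi("test").userName);
  }

  public static void main(String[] args) {
    System.out.println("by object: " + lookupByObject());
    System.out.println("by name:   " + lookupByName());
  }
}
```

With the object as the key, the lookup only succeeds for the exact instance 
that was inserted, so a cache keyed that way would rarely hit.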

In general abbreviations in variable names are hard to read and "saToFs" is 
pretty opaque.

The finally should go on the same line as the closing brace.
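
For illustration, the brace placement meant here looks like the following 
sketch. The class and method names are hypothetical and the body is a 
placeholder, not code from the patch:

```java
public class FinallyStyle {
  static String run() {
    StringBuilder log = new StringBuilder();
    try {
      log.append("work;");
    } finally {  // "finally" shares a line with the closing brace of "try"
      log.append("cleanup;");
    }
    return log.toString();
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```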

It seems really error-prone to have a cache that gives out the same FileSystem 
more than once and requires users to close it. Take, for instance, the case 
where a user has two jobs running at the same time: you can easily end up with 
both jobs sharing one FileSystem, and one job closing it while the other is 
still using it. I propose that we fix this by making the FileSystem cache 
contain weak references to the file systems and making the finalizers close 
the file system.
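
A rough sketch of the weak-reference part of this proposal, not the actual 
FileSystem cache. Handle is a hypothetical stand-in for a FileSystem, and 
WeakFsCache and its get method are invented for illustration. The cleanup 
hook that would close a handle once it becomes unreachable is left out, since 
its timing depends on the garbage collector:

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

public class WeakFsCache {
  // Hypothetical stand-in for a per-user FileSystem handle.
  static final class Handle {
    final String user;
    Handle(String user) { this.user = user; }
  }

  // The cache holds only weak references, so it never keeps a handle alive
  // by itself; callers that still need a handle hold the strong reference.
  private final Map<String, WeakReference<Handle>> cache = new HashMap<>();

  synchronized Handle get(String user) {
    WeakReference<Handle> ref = cache.get(user);
    Handle handle = (ref == null) ? null : ref.get();
    if (handle == null) {
      // Never cached, or already collected: create and cache a new handle.
      handle = new Handle(user);
      cache.put(user, new WeakReference<>(handle));
    }
    return handle;
  }

  public static void main(String[] args) {
    WeakFsCache cache = new WeakFsCache();
    Handle first = cache.get("test");
    Handle second = cache.get("test");
    System.out.println("shared while referenced: " + (first == second));
  }
}
```

Under this design no caller has to close a handle explicitly: two jobs of the 
same user share one handle for as long as either job references it, and the 
entry becomes collectable once both have dropped their references.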

The JobTracker should keep a reference to the output file system in 
JobInProgress and reuse it for the lifetime of the job. When the job 
completes, JobTracker.markCompletedJob should clear that field, so that the 
file system can be reclaimed by the garbage collector.

> mapred output files and directories should be created as the job submitter, 
> not tasktracker or jobtracker
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2915
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2915
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs, mapred
>    Affects Versions: 0.16.0
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.16.1
>
>         Attachments: 2915_20080229.patch, 2915_20080302.patch, 
> 2915_20080303.patch
>
>
> Quoted from an email sent to core-dev by Andy Li:
> {quote}
> For example, assuming I have installed Hadoop with an account 'hadoop' and I 
> am going to run my program with user account 'test'. I have created an input 
> folder as /user/test/input/ with user 'test' and the permission is set to 
> 0775.
> /user/test/input      <dir>          2008-02-27 01:20 rwxr-xr-x      test  
> hadoop
> When I run the MapReduce code, the output I specified will be set to user 
> 'hadoop' instead of 'test'.
> /bin/hadoop jar /tmp/test_perm.jar -m 57 -r 3 "/user/test/input/l" 
> "/user/test/output/"
> The directory "/user/test/output/" will have the following permission and 
> user:group.
> /user/test/output    <dir>          2008-02-27 03:53        rwxr-xr-x hadoop  
> hadoop
> {quote}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
