[jira] Commented: (HADOOP-5327) Job files for a job failing because of ACLs are not clean from the system directory

Hemanth Yamijala (JIRA) Fri, 06 Mar 2009 02:18:29 -0800

    [ 
https://issues.apache.org/jira/browse/HADOOP-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679548#action_12679548
 ]


Hemanth Yamijala commented on HADOOP-5327:
------------------------------------------

I think this patch should also add the system directory to the cleanup thread 
in the code path where job submission fails due to ACLs. In most cases, this 
alone will prevent the problem from occurring in the first place. However, it 
is only an addition to the changes in the patch; those are still needed for 
cases where the JobTracker is restarted before the cleanup thread has had a 
chance to delete the system directory completely.
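
A minimal sketch of that suggestion, under stated assumptions: the class, the
queue, the ACL check and the user names below are all hypothetical and are not
taken from the patch or from the JobTracker code. It only illustrates the idea
that a submission rejected by the ACL check hands the job's directory to a
cleanup thread before returning.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.stream.Stream;

// Hypothetical sketch; none of these names come from the Hadoop code base.
class AclCleanupSketch {
    // Stand-in for the work queue drained by the cleanup thread.
    static final BlockingQueue<Path> CLEANUP_QUEUE = new LinkedBlockingQueue<>();

    // Placeholder ACL check (assumption): only "allowed-user" may submit.
    static boolean aclAllows(String user) {
        return "allowed-user".equals(user);
    }

    // Returns true if the job is accepted. On an ACL failure the job's
    // directory is queued for deletion before the submission is rejected.
    static boolean submitJob(String user, Path jobDir) {
        if (!aclAllows(user)) {
            CLEANUP_QUEUE.add(jobDir);
            return false;
        }
        return true;
    }

    // Blocks until a directory is queued, then deletes it recursively
    // (children before parents). A cleanup thread would call this in a loop.
    static void cleanupOne() throws IOException, InterruptedException {
        Path dir = CLEANUP_QUEUE.take();
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            walk.sorted(Comparator.reverseOrder())
                .forEach(p -> p.toFile().delete());
        }
    }

    public static void main(String[] args) throws Exception {
        // Temporary directory standing in for the system directory.
        Path systemDir = Files.createTempDirectory("mapred-system");
        Path jobDir = Files.createDirectory(systemDir.resolve("job_0001"));
        Files.createFile(jobDir.resolve("job.xml"));

        // One-shot cleanup thread for the demo.
        Thread cleaner = new Thread(() -> {
            try {
                cleanupOne();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        cleaner.start();

        boolean accepted = submitJob("some-other-user", jobDir);
        cleaner.join();
        System.out.println("accepted=" + accepted
            + ", job files remain=" + Files.exists(jobDir));
    }
}
```

Because the deletion is asynchronous, a restart between the rejection and the
delete would still leave files behind, which is why the changes in the patch
remain necessary.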

Regarding cleanup, there seem to be two different cases here:

- The job was never submitted in the first place
- The job was already running and, after a restart, can no longer run because 
the ACLs were changed.

I think the patch handles the first case cleanly (once the review comments are 
incorporated). In the second case, ideally the job should be killed by the 
JobTracker so that everything related to the job (system directory, running 
tasks, cleanup task, etc.) is cleaned up properly. I think handling the second 
case (which should be rare) belongs in a separate JIRA. Thoughts?
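
To make the two cases concrete, here is a rough sketch of the decision that
restart recovery would face for each job found in the system directory. The
enum, the method and its two boolean inputs are hypothetical names introduced
for illustration only; how the JobTracker would actually determine either
input is not covered here.

```java
// Hypothetical sketch; not part of the patch or of the JobTracker.
class RecoveryDecisionSketch {
    enum Action { RECOVER, DELETE_JOB_FILES, KILL_JOB }

    // aclCheckPasses: the job's user still passes the ACL check after restart.
    // wasRunningBeforeRestart: the job had been accepted and was running.
    static Action decide(boolean aclCheckPasses, boolean wasRunningBeforeRestart) {
        if (aclCheckPasses) {
            return Action.RECOVER;
        }
        // First case: never submitted, so only leftover files need removing.
        // Second case: was running, so the job should be killed so that its
        // tasks and system directory are cleaned up together.
        return wasRunningBeforeRestart ? Action.KILL_JOB : Action.DELETE_JOB_FILES;
    }

    public static void main(String[] args) {
        System.out.println(decide(false, false));
        System.out.println(decide(false, true));
    }
}
```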

> Job files for a job failing because of  ACLs are not clean from the system 
> directory
> ------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5327
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5327
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Karam Singh
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-5327-v2.3.patch
>
>
> Jobs which failed because of ACLs gets added during JT restart recovery 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
