[ https://issues.apache.org/jira/browse/HADOOP-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573153#action_12573153 ]

Hemanth Yamijala commented on HADOOP-2899:
------------------------------------------

Luca,

bq. After all, I'm not sure exactly why this directory is created, so I might be 
wrong here, but if it's like a "lock" or a placeholder for the hodring node, I 
see solution 3 as potentially dangerous, in the sense that having that info in 
the user space would not prevent the system from reusing the same resource by a 
different process run by a different user.

This is not a lock directory. It is the directory where the JobTracker stores 
the job files for the jobs it runs - the job.jar, job.xml, etc. It creates a 
subdirectory for each job it runs.

Nicholas,

bq. Hemanth, is jobConf.getSystemDir() the dfs directory 
(/mapredsystem/hostname) you mentioned above? If it is the case, I believe we 
should implement solution 2.

Yes. It is the same.

bq. In addition, we should also implement solution 3, the path should be unique 
across HOD. If we use /mapredsystem/hostname, I guess two users in the same 
host cannot allocate clusters at the same time. Something like 
/mapredsystem/user-name.clusterid seems good. It might be even better to append 
a random number at the end.

Makes sense; I have no problem doing this. If done in addition to 2, this 
might actually resolve the question I raised about the number of DFS 
directories created. In the current Torque configuration, a clash cannot 
arise, because we don't allow nodes to be allocated to more than one cluster. 
But it's definitely better not to assume this.
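The naming scheme proposed for solution 3 could be sketched roughly as follows. This is a hypothetical helper, not actual HOD code; the function name and the random-suffix width are my own illustrative choices:

```python
import random


def unique_mapred_system_dir(user, cluster_id):
    """Build a per-cluster DFS path like /mapredsystem/user.clusterid.rand.

    Hypothetical sketch of the scheme discussed above: keying the path on
    the user and cluster id makes it unique across HOD allocations, and the
    random suffix guards against a clash if the same id were ever reused.
    """
    suffix = random.randint(0, 2 ** 30)
    return "/mapredsystem/%s.%s.%d" % (user, cluster_id, suffix)
```

With this scheme, two users on the same host get distinct system directories, so concurrent allocations no longer collide on a shared /mapredsystem/hostname path.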

Even if 2 is not done for some reason, this problem would be addressed, 
except for the question above. Can someone shed some light on that? Is it 
preferable not to create too many directories in DFS? (Note these will be 
empty once the cluster is deallocated - so no files.)
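Solution 2 (removing the directory during deallocation) could look roughly like this. The sketch shells out to `hadoop fs -rmr` for illustration; an actual HOD deallocation hook would go through its own DFS client, and the helper name is hypothetical:

```python
import subprocess


def cleanup_mapred_system_dir(hadoop_bin, system_dir):
    """Remove the per-cluster mapred system directory on deallocation.

    Illustrative only: invokes 'hadoop fs -rmr <dir>' (the 0.16-era shell
    command). A failure is reported rather than fatal, so deallocation can
    still proceed even if the directory was already gone.
    """
    result = subprocess.call([hadoop_bin, "fs", "-rmr", system_dir])
    if result != 0:
        print("warning: could not remove %s (exit %d)" % (system_dir, result))
    return result == 0
```

Running this at the end of cluster deallocation would prevent the stale /mapredsystem/srvX directory from blocking the next user's JobTracker, as in the permission-denied scenario in the issue description.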




> hdfs:///mapredsystem directory not cleaned up after deallocation 
> -----------------------------------------------------------------
>
>                 Key: HADOOP-2899
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2899
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/hod
>    Affects Versions: 0.16.0
>            Reporter: Luca Telloli
>
> Each submitted job creates a hdfs:///mapredsystem directory, created by (I 
> guess) the hodring process. Problem is that it's not cleaned up at the end of 
> the process; a use case would be:
> - user A allocates a cluster, the hodring is svrX, so a /mapredsystem/svrX 
> directory is created
> - user A deallocates the cluster, but that directory is not cleaned up
> - user B allocates a cluster, and the first node chosen as hodring is svrX, 
> so hodring tries to write hdfs:///mapredsystem but it fails
> - allocation succeeds, but there's no hodring running; looking at
> 0-jobtracker/logdir/hadoop.log under the temporary directory I can read:
> 2008-02-26 17:28:42,567 WARN org.apache.hadoop.mapred.JobTracker: Error 
> starting tracker: org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.fs.permission.AccessControlException: Permission denied: 
> user=B, access=WRITE, inode="mapredsystem":hadoop:supergroup:rwxr-xr-x
> I guess a possible solution would be to clean up those directories during the 
> deallocation process. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
