[
https://issues.apache.org/jira/browse/HADOOP-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573153#action_12573153
]
Hemanth Yamijala commented on HADOOP-2899:
------------------------------------------
Luca,
bq. After all, I'm not sure exactly why this directory is created, so I might be
wrong here, but if it's like a "lock" or a placeholder for the hodring node, I
see solution 3 as potentially dangerous, in the sense that having that info in
the user space would not prevent the system from reusing the same resource for a
different process run by a different user.
This is not a lock directory. This is the directory where the jobtracker stores
the job files for the jobs it runs, such as job.jar and job.xml. It creates a
subdirectory for each job it runs.
Nicholas,
bq. Hemanth, is jobConf.getSystemDir() the dfs directory
(/mapredsystem/hostname) you mentioned above? If that is the case, I believe we
should implement solution 2.
Yes. It is the same.
bq. In addition, we should also implement solution 3, the path should be unique
across HOD. If we use /mapredsystem/hostname, I guess two users in the same
host cannot allocate clusters at the same time. Something like
/mapredsystem/user-name.clusterid seems good. It might be even better to append
a random number at the end.
Makes sense; I have no problem doing this. If done in addition to 2, this
might actually address the doubt I raised about the number of DFS directories
created. In the current configuration of Torque, a clash will not arise,
because we don't allow nodes to be allocated to more than one cluster. But it's
definitely better not to assume this.
Just in case 2 is not done for some reason, this problem would still be
addressed, except for my doubt. Can someone shed some light on that? Is it
preferable not to have too many directories created in DFS? (Note these will be
empty once the cluster is deallocated - so no files.)
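The naming scheme Nicholas proposes (user-name.clusterid plus a random suffix) could be sketched roughly as below. This is a minimal illustration, not HOD code; the helper name and the exact separator are assumptions.

```python
import getpass
import random

def unique_mapred_system_dir(cluster_id, base="/mapredsystem"):
    """Build a per-allocation mapred system directory path.

    Follows the naming proposed in the comments:
    <base>/<user-name>.<clusterid>.<random>, so that two users (or two
    clusters) landing on the same host do not collide on one path.
    """
    user = getpass.getuser()
    # Random suffix guards against clusterid reuse across allocations.
    suffix = random.randint(0, 2 ** 31 - 1)
    return "%s/%s.%s.%d" % (base, user, cluster_id, suffix)
```

With a scheme like this, the path no longer depends on the hostname at all, which is what makes solution 3 independent of which node ends up running the jobtracker.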
> hdfs:///mapredsystem directory not cleaned up after deallocation
> -----------------------------------------------------------------
>
> Key: HADOOP-2899
> URL: https://issues.apache.org/jira/browse/HADOOP-2899
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/hod
> Affects Versions: 0.16.0
> Reporter: Luca Telloli
>
> Each submitted job creates an hdfs:///mapredsystem directory, created by (I
> guess) the hodring process. The problem is that it's not cleaned up at the end
> of the process. A use case would be:
> - user A allocates a cluster; the hodring is svrX, so a /mapredsystem/svrX
> directory is created
> - user A deallocates the cluster, but that directory is not cleaned up
> - user B allocates a cluster, and the first node chosen as hodring is svrX,
> so hodring tries to write hdfs:///mapredsystem but fails
> - allocation succeeds, but there is no hodring running; looking at
> 0-jobtracker/logdir/hadoop.log under the temporary directory I can read:
> 2008-02-26 17:28:42,567 WARN org.apache.hadoop.mapred.JobTracker: Error
> starting tracker: org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.fs.permission.AccessControlException: Permission denied:
> user=B, access=WRITE, inode="mapredsystem":hadoop:supergroup:rwxr-xr-x
> I guess a possible solution would be to clean up those directories during the
> deallocation process.
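The cleanup proposed at the end of the description could be sketched as a best-effort step in deallocation. This is a hypothetical helper, not actual HOD code; it assumes the hadoop launcher script and the `dfs -rmr` removal command available in the 0.16-era CLI.

```python
import subprocess

def cleanup_mapred_system_dir(hadoop_bin, system_dir):
    """Best-effort removal of the per-cluster mapred system directory
    at deallocation time.

    'hadoop_bin' is the path to the hadoop launcher script and
    'system_dir' the DFS path to remove (e.g. /mapredsystem/svrX);
    both parameter names are illustrative, not HOD identifiers.
    """
    # 'hadoop dfs -rmr <path>' recursively deletes a DFS directory.
    ret = subprocess.call([hadoop_bin, "dfs", "-rmr", system_dir])
    return ret == 0
```

Since the directory is empty after deallocation (per the comment above), the removal is cheap; making it best-effort keeps a DFS hiccup from failing the whole deallocation.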
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.