[ https://issues.apache.org/jira/browse/HADOOP-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hemanth Yamijala updated HADOOP-2899:
-------------------------------------
Attachment: 2899.1.patch
The attached patch adds code to the hodcleanup process to delete the mapred
system directory.
- The hodring process passes arguments required for the deletion (process id of
the job tracker, namenode url, path to the hadoop command, and the name of the
mapred system directory) to the hodcleanup process.
- The hodcleanup process first waits until the job tracker is down, by checking
for the job tracker process to end.
- It then issues hadoop dfs -fs namenode-url -rmr mapredsystemdir to remove the
directory.
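As a rough illustration of the sequence above (not the code in the patch), the
wait-then-delete step could be sketched as follows. All function names, the
polling interval and the timeout are assumptions made for this example; only
the shape of the hadoop dfs command comes from the description. The existence
check relies on POSIX signal 0 semantics.

```python
import errno
import os
import subprocess
import time


def process_is_running(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except OSError as err:
        if err.errno == errno.ESRCH:  # no such process
            return False
        if err.errno == errno.EPERM:  # exists, but owned by another user
            return True
        raise
    return True


def wait_for_exit(pid, poll_interval=1.0, timeout=None):
    """Poll until the process is gone; return False if the timeout expires."""
    deadline = None if timeout is None else time.time() + timeout
    while process_is_running(pid):
        if deadline is not None and time.time() >= deadline:
            return False
        time.sleep(poll_interval)
    return True


def build_rmr_command(hadoop_cmd, namenode_url, mapred_system_dir):
    """Build: hadoop dfs -fs namenode-url -rmr mapredsystemdir."""
    return [hadoop_cmd, "dfs", "-fs", namenode_url, "-rmr", mapred_system_dir]


def cleanup_mapred_system_dir(jt_pid, hadoop_cmd, namenode_url,
                              mapred_system_dir, timeout=None):
    """Wait for the job tracker to exit, then remove its system directory.

    Returns the exit code of the hadoop command, or None if the job tracker
    was still running when the (hypothetical) timeout expired.
    """
    if not wait_for_exit(jt_pid, timeout=timeout):
        return None
    cmd = build_rmr_command(hadoop_cmd, namenode_url, mapred_system_dir)
    return subprocess.call(cmd)
```

Waiting for the job tracker to exit before deleting matters because a job
tracker that is still running may be using the directory.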
This patch also changes the way the namenode URL is specified when copying
logs to the dfs as part of the hod cleanup process.
I added a few tests, but they are not really complete. We need to refactor the
hod code a bit to enable more unit testing, for example by providing the
ability to create a mock simplecommand. This can be handled later.
> [HOD] hdfs:///mapredsystem directory not cleaned up after deallocation
> -----------------------------------------------------------------------
>
> Key: HADOOP-2899
> URL: https://issues.apache.org/jira/browse/HADOOP-2899
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/hod
> Affects Versions: 0.16.0
> Reporter: Luca Telloli
> Assignee: Hemanth Yamijala
> Fix For: 0.17.0
>
> Attachments: 2899.1.patch
>
>
> Each submitted job creates a hdfs:///mapredsystem directory, created by (I
> guess) the hodring process. Problem is that it's not cleaned up at the end of
> the process; a use case would be:
> - user A allocates a cluster, the hodring is svrX, so a /mapredsystem/svrX
> directory is created
> - user A deallocates the cluster, but that directory is not cleaned up
> - user B allocates a cluster, and the first node chosen as hodring is svrX,
> so hodring tries to write hdfs:///mapredsystem but it fails
> - allocation succeeds, but there's no hodring running; looking at
> 0-jobtracker/logdir/hadoop.log under the temporary directory I can read:
> 2008-02-26 17:28:42,567 WARN org.apache.hadoop.mapred.JobTracker: Error
> starting tracker: org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.fs.permission.AccessControlException: Permission denied:
> user=B, access=WRITE, inode="mapredsystem":hadoop:supergroup:rwxr-xr-x
> I guess a possible solution would be to clean up those directories during the
> deallocation process.