[
https://issues.apache.org/jira/browse/HADOOP-2862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12571275#action_12571275
]
Hemanth Yamijala commented on HADOOP-2862:
------------------------------------------
HOD currently uses two separate configuration options for temp space. One is
temp-dir, which it uses for HOD-specific temporary work. The other is
work-dirs, which it uses for Hadoop-specific work space. It is in the latter
that the HDFS and Mapred work directories are set up. The value specified in
the configuration file for each option is the root under which directories
are set up per job, owned by the user. These are deleted at the end of the
job. At least, AFAIK, the data is deleted, but sometimes the directories
themselves are not cleaned up, which looks like a bug in HOD.
That said, I see that what you are proposing has two advantages:
- Currently, HOD requires the temp directory to be world-writable, so that
different users can write to it. With your method, we no longer need that
requirement. It seems cleaner.
- We are assured of clean-up of the directories as well, though I think HOD
should still take responsibility for cleaning up what it creates.
It seems, therefore, like a useful addition.
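To make the idea concrete, here is a rough sketch of what deferred
substitution could look like. It is not HOD code: the function names
expand_on_node and resolve_work_dirs are hypothetical, and it assumes the
work-dirs value is a comma-separated list that is shipped unexpanded to the
node and expanded there against that node's environment.

```python
from string import Template


def expand_on_node(raw_value, env):
    """Expand $VAR references in one config value.

    Hypothetical helper: 'env' is the environment of the node where the
    directory will actually be created. Variables missing from 'env' are
    left untouched rather than raising an error.
    """
    return Template(raw_value).safe_substitute(env)


def resolve_work_dirs(raw_work_dirs, env):
    """Split a comma-separated work-dirs value and expand each entry.

    Assumption: the raw value is passed around unexpanded and this is
    only called on the node that will use the directories.
    """
    entries = [d.strip() for d in raw_work_dirs.split(",")]
    return [expand_on_node(d, env) for d in entries if d]
```

On the compute node this would be called with os.environ as env, so that a
configured value such as /scratch/pbstmp.$PBS_JOBID only picks up the job id
once pbs_mom has set it.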
> [HOD] Support PBS env vars in hod configuration
> -----------------------------------------------
>
> Key: HADOOP-2862
> URL: https://issues.apache.org/jira/browse/HADOOP-2862
> Project: Hadoop Core
> Issue Type: Improvement
> Components: contrib/hod
> Affects Versions: 0.16.0
> Environment: Torque PBS
> Reporter: Craig Macdonald
>
> In some batch environments, e.g. using Torque PBS, scratch spaces are
> provided on cluster nodes where jobs should put their temporary files. These
> are automatically cleaned up when the job exits by an epilogue script.
> For instance, in our local Torque cluster, all nodes have a /scratch
> partition. For each job, the prologue script creates a scratch folder owned
> by the user at /scratch/pbstmp.$PBS_JOBID - $PBS_JOBID is then the env var
> containing the job id, as set by pbs_mom.
> Would it be possible to use these env vars in the configuration of hod? For
> instance, say I want to create an hdfs on demand using hod, but the hdfs
> space should be in /scratch/pbstmp.$PBS_JOBID, not in, say, /tmp/hod. This
> would involve HOD supporting env vars in configuration, but knowing when to
> substitute the env var with its current value (i.e. not until running on the
> correct node where the operation should take place).