[ 
https://issues.apache.org/jira/browse/HADOOP-2025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534177
 ] 

Chris Douglas commented on HADOOP-2025:
---------------------------------------

Automatically creating a working directory may become awkward with permissions 
and when delegating tasks to an agent. Further, creating a default place for a 
user's data should be part of adding that user to the system, not executing a 
task.

Requesting a "home" directory for a given set of credentials (rather than a 
default "working directory") from a FileSystem seems more correct; the working 
directory seems like FileSystem state owned by an application (i.e. the 
FileSystem object). If one wants to resolve relative paths, the working 
directory must be set first on the particular instance.

This way, relative Paths can only be resolved against a FileSystem where the 
working directory is set, absolute Paths are always OK, FileSystems can return 
a default directory for a given user (but not in general), and all Paths from a 
FileSystem are fully qualified (HADOOP-1909).
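
To make the proposed semantics concrete, here is a minimal sketch. It assumes 
nothing about the real FileSystem API: SketchFileSystem, getHomeDirectory, 
setWorkingDirectory and qualify are hypothetical names used only for 
illustration, and the "/user/<username>" layout is taken from the issue 
description below.

```java
import java.util.Optional;

/** Hypothetical stand-in for a FileSystem; not Hadoop's actual API. */
final class SketchFileSystem {
    private final String authority; // e.g. "hdfs://namenode" (assumed value)
    private String workingDir;      // null until the application sets it

    SketchFileSystem(String authority) {
        this.authority = authority;
    }

    /** A default directory for a user; a file system may have none to offer. */
    Optional<String> getHomeDirectory(String user) {
        if (user == null || user.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of("/user/" + user);
    }

    /** The working directory is state of this instance, set explicitly. */
    void setWorkingDirectory(String absoluteDir) {
        if (!absoluteDir.startsWith("/")) {
            throw new IllegalArgumentException("not absolute: " + absoluteDir);
        }
        this.workingDir = absoluteDir;
    }

    /** Returns a fully qualified path; relative paths need a working directory. */
    String qualify(String path) {
        if (path.startsWith("/")) {
            return authority + path; // absolute paths are always OK
        }
        if (workingDir == null) {
            throw new IllegalStateException(
                "working directory not set; cannot resolve: " + path);
        }
        if (path.isEmpty() || path.equals(".")) {
            return authority + workingDir;
        }
        String base = workingDir.endsWith("/") ? workingDir : workingDir + "/";
        return authority + base + path;
    }
}

class SketchDemo {
    public static void main(String[] args) {
        SketchFileSystem fs = new SketchFileSystem("hdfs://namenode");
        System.out.println(fs.qualify("/tmp/data")); // no working directory needed
        // The application opts in to a working directory, here the user's home.
        fs.getHomeDirectory("alice").ifPresent(fs::setWorkingDirectory);
        System.out.println(fs.qualify("foo"));
    }
}
```

In this sketch nothing is created as a side effect of instantiating the 
object: a relative path fails loudly until the application has chosen a working 
directory, and creating "/user/<username>" is left to whoever adds the user.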

At the moment, the working directory is set by the TaskTracker (to the property 
provided) and by IsolationRunner (for local, temporary storage). It is used 
sparingly, but notably by applications like FsShell and distcp (where 
reasonable defaults can be set and checked). Are there other places where this 
is relied on that might make effecting this change more difficult?

> Instantiating a FileSystem object should guarantee the existence of the 
> working directory
> -----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2025
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2025
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.14.1
>            Reporter: Sameer Paranjpye
>             Fix For: 0.16.0
>
>
> Issues like HADOOP-1891 and HADOOP-1916 illustrate the need for this behavior.
> In HADOOP-1916 the problem is that the default working directory for a user 
> on HDFS '/user/<username>' does not exist. This results in the command 
> 'hadoop dfs -copyFromLocal foo .' creating a *file* called /user/<username> 
> and copying the contents of the file 'foo' into this file.
> HADOOP-1891 is basically the same problem. The problem that Olga observed was 
> that copying a file to '.' on HDFS when her 'home directory' did not exist 
> resulted in the creation of a file with the path as her home directory. The 
> problem is incorrectly filed as a bug in the Path class. The behavior of Path 
> is correct; as Doug points out, it is perfectly reasonable for Path(".") to 
> convert to an empty path. When this empty path is resolved in HDFS or any 
> other filesystem the resolution to '/user/<username>' is also correct (at 
> least for HDFS). The problem IMO is that the existence of the working 
> directory is not guaranteed.
> When I log in to a machine my default working directory is '/home/sameerp' 
> and filesystem operations that I execute with relative paths all work 
> correctly because this directory exists. My home directory lives on a filer; 
> in the event of it being unmountable, the default working directory I get is 
> '/', which is also guaranteed to exist.
> In the context of Hadoop, instantiating a FileSystem object is the analogue 
> of logging in and should result in a working directory whose existence has 
> been validated. In the case of HDFS this should be '/user/<username>' or '/' 
> if the directory does not exist.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
