[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-1979:
------------------------------------

    Description: 
An "Output directory already exists" error is seen in gridmix when
gridmix.output.directory is not defined. When gridmix.output.directory is not
defined, gridmix uses inputDir/gridmix/ as the output path for the gridmix run.
Because gridmix creates outputPath (in this case, inputDir/gridmix/) at the
beginning, the output path of the generate-data mapreduce job (i.e. inputDir)
already exists, and mapreduce fails with this error.

This outputPath needs to be created in any case (whether the user specifies
the path using gridmix.output.directory or gridmix itself falls back to
inputDir/gridmix/), even though output paths of mapreduce jobs are created
automatically (like mkdir -p), because gridmix needs to set 777 permissions
on this outputPath so that different users can create the output directories
of different mapreduce jobs within this gridmix run.

The other case in which this problem is seen is when gridmix.output.directory
is defined as a relative path. In this case too, gridmix tries to create the
relative path under ioPath/, which leads to the same issue.

  was:
An "Output directory already exists" error is seen in gridmix when
gridmix.output.directory is not defined. When gridmix.output.directory is not
defined, gridmix uses inputDir/gridmix/ as the output path for the gridmix run.
Because gridmix creates outputPath (in this case, inputDir/gridmix/) at the
beginning, the output path of the generate-data mapreduce job (i.e. inputDir)
already exists, and mapreduce fails with this error.

There is no need to create this outputPath in any case (whether the user
specifies the path using gridmix.output.directory or gridmix itself falls back
to inputDir/gridmix/), because output paths of mapreduce jobs are created
automatically (like mkdir -p).

The other case in which this problem is seen is when gridmix.output.directory
is defined as a relative path. In this case too, gridmix tries to create the
relative path under ioPath/, which leads to the same issue.


> "Output directory already exists" error in gridmix when 
> gridmix.output.directory is not defined
> -----------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1979
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1979
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/gridmix
>            Reporter: Ravi Gummadi
>            Assignee: Ravi Gummadi
>         Attachments: 1979.patch, 1979.v1.1.patch, 1979.v1.2.patch, 
> 1979.v1.patch
>
>
> An "Output directory already exists" error is seen in gridmix when
> gridmix.output.directory is not defined. When gridmix.output.directory is not
> defined, gridmix uses inputDir/gridmix/ as the output path for the gridmix
> run. Because gridmix creates outputPath (in this case, inputDir/gridmix/) at
> the beginning, the output path of the generate-data mapreduce job (i.e.
> inputDir) already exists, and mapreduce fails with this error.
> This outputPath needs to be created in any case (whether the user specifies
> the path using gridmix.output.directory or gridmix itself falls back to
> inputDir/gridmix/), even though output paths of mapreduce jobs are created
> automatically (like mkdir -p), because gridmix needs to set 777 permissions
> on this outputPath so that different users can create the output directories
> of different mapreduce jobs within this gridmix run.
> The other case in which this problem is seen is when gridmix.output.directory
> is defined as a relative path. In this case too, gridmix tries to create the
> relative path under ioPath/, which leads to the same issue.
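
To make the failure mode described above concrete, here is a minimal sketch
in plain Java. It uses java.nio.file paths as a stand-in for HDFS paths and
is not the actual Gridmix code: the class name OutputPathSketch, both helper
methods, and the example paths are hypothetical and only mirror the behaviour
reported in this issue. It shows how an undefined or relative
gridmix.output.directory ends up under ioPath, so that creating it up front
also creates ioPath itself.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration only; not taken from the Gridmix sources.
class OutputPathSketch {

    // Mirrors the resolution described in the issue:
    //   undefined     -> ioPath/gridmix
    //   relative path -> resolved under ioPath
    //   absolute path -> used as given (Path.resolve returns an absolute argument unchanged)
    static Path resolveOutputPath(Path ioPath, String configured) {
        if (configured == null) {
            return ioPath.resolve("gridmix");
        }
        return ioPath.resolve(configured).normalize();
    }

    // If outputPath is ioPath itself or lies below it, creating outputPath
    // (mkdir -p style) also creates ioPath, which the data generation job
    // expects not to exist yet.
    static boolean collidesWithInputDir(Path ioPath, Path outputPath) {
        return outputPath.normalize().startsWith(ioPath.normalize());
    }

    public static void main(String[] args) {
        Path ioPath = Paths.get("/user/gridmix/io"); // assumed example location
        String[] settings = { null, "out", "/tmp/gridmix-out" };
        for (String setting : settings) {
            Path out = resolveOutputPath(ioPath, setting);
            System.out.println(setting + " -> " + out
                    + " collides=" + collidesWithInputDir(ioPath, out));
        }
    }
}
```

Under these assumptions, only an absolute output directory outside ioPath
avoids the collision. A fix still has to create outputPath with 777
permissions, as the description explains, so the sketch covers only the path
check and not the permission handling.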

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
