Ilya,
Could you try passing -Dmapred.local.dir=<randomly generated tmp location>
(-Dyarn.nodemanager.local-dirs=<randomly generated tmp location> in case of
Hadoop 23) when launching the Pig local mode tests, and see if that works?
TestDriver.pm already has a block that passes additional java_params in
local mode:
    if ($testCmd->{'exectype'} eq "local") {
        push(@{$testCmd->{'java_params'}}, "-Xmx1024m");
        push(@pigCmd, ("-x", "local"));
    }
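As a rough sketch of the idea outside the harness (untested against the e2e
suite; the directory prefix and the script name some_test.pig are
placeholders, and placing the -D option ahead of -x is an assumption), each
test process would get its own local directory along these lines:

```shell
#!/bin/sh
# Create a unique scratch directory for this test process.
# The "pig_e2e_local" prefix is an arbitrary placeholder.
local_dir=$(mktemp -d "${TMPDIR:-/tmp}/pig_e2e_local.XXXXXX")

# Property name for Hadoop 1.x; for Hadoop 23 the suggestion above is
# yarn.nodemanager.local-dirs instead.
java_param="-Dmapred.local.dir=${local_dir}"

# Print the command instead of running it; some_test.pig is a placeholder.
echo pig "${java_param}" -x local some_test.pig
```

Because every invocation creates a fresh directory, two tests running at the
same time should no longer share the same job_local_0001 path.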
Regards,
Rohini
On Fri, Sep 21, 2012 at 5:01 AM, Ilya Katsov <[email protected]> wrote:
> Hello All,
>
> I'm trying to run Pig e2e tests in parallel and there are many
> failures like this in local mode:
>
> WARN org.apache.hadoop.mapred.Task - Could not find output size
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> output/file.out in any of the configured local directories
> at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:429)
> at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
> at org.apache.hadoop.mapred.MapOutputFile.getOutputFile(MapOutputFile.java:56)
> at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:944)
> at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:924)
> at org.apache.hadoop.mapred.Task.done(Task.java:875)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:374)
>
> It seems that the problem is concurrent access to the JobTracker's
> temporary directory: file.out is a temporary JobTracker file. It's
> clearly visible that different tests open files in the same directory:
>
> $ lsof | grep output
> java 20719 ikatsov 13r REG 8,1 3486 17039996 /tmp/hadoop-ikatsov/mapred/local/taskTracker/ikatsov/jobcache/job_local_0001/attempt_local_0001_r_000000_0/output/map_0.out
> java 20719 ikatsov 16r REG 8,1 349196 17039986 /tmp/hadoop-ikatsov/mapred/local/taskTracker/ikatsov/jobcache/job_local_0001/attempt_local_0001_r_000000_0/output/map_1.out
>
> $ lsof | grep output
> java 25410 ikatsov 13w REG 8,1 8145 17039997 /tmp/hadoop-ikatsov/mapred/local/taskTracker/ikatsov/jobcache/job_local_0001/attempt_local_0001_m_000000_0/output/spill0.out
>
> $ lsof | grep output
> java 2223 ikatsov 13r REG 8,1 289196 16384629 /tmp/hadoop-ikatsov/mapred/local/taskTracker/ikatsov/jobcache/job_local_0003/attempt_local_0003_r_000000_0/output/map_0.out
>
> $ lsof | grep output
> java 12187 ikatsov 14r REG 8,1 349196 17039996 /tmp/hadoop-ikatsov/mapred/local/taskTracker/ikatsov/jobcache/job_local_0001/attempt_local_0001_r_000000_0/output/map_0.out
> java 12187 ikatsov 17r REG 8,1 349196 17039999 /tmp/hadoop-ikatsov/mapred/local/taskTracker/ikatsov/jobcache/job_local_0001/attempt_local_0001_r_000000_0/output/map_1.out
>
>
> I wonder, is there a way to specify the temporary Hadoop directory
> (mapreduce.cluster.local.dir) when launching Pig in local mode?
>
> Thank you in advance,
> Ilya
>