Ilya,

Could you try passing -Dmapred.local.dir=<randomly generated tmp location> (or -Dyarn.nodemanager.local-dirs=<randomly generated tmp location> in the case of Hadoop 0.23) when launching the Pig local mode tests, and see if that works?
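As a rough sketch of how such a location might be generated from Perl: the helper name local_dir_params and the "pig-e2e-local-" directory prefix below are made up for illustration and are not part of TestDriver.pm. It assumes that one scratch directory per harness process is enough to keep parallel runs apart.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not in TestDriver.pm): builds JVM options that point
# Hadoop's local scratch space at a directory unique to this call.
sub local_dir_params {
    # tempdir() creates a new, uniquely named directory under the system
    # temp location; CLEANUP => 1 removes it when this Perl process exits.
    my $dir = tempdir("pig-e2e-local-XXXXXX", TMPDIR => 1, CLEANUP => 1);
    return (
        "-Dmapred.local.dir=$dir",             # property suggested above
        "-Dyarn.nodemanager.local-dirs=$dir",  # Hadoop 0.23 variant suggested above
    );
}

# Show what would be handed to the JVM.
print join(" ", local_dir_params()), "\n";
```

The returned strings are ordinary JVM options, so they could be appended wherever the harness already collects java_params for local mode. Whether both properties are needed, or only the one matching your Hadoop version, is something to confirm by experiment.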
TestDriver.pm already has a block that passes additional java_params to local mode:

    if ($testCmd->{'exectype'} eq "local") {
        push(@{$testCmd->{'java_params'}}, "-Xmx1024m");
        push(@pigCmd, ("-x", "local"));
    }

Regards,
Rohini

On Fri, Sep 21, 2012 at 5:01 AM, Ilya Katsov <ikat...@griddynamics.com> wrote:
> Hello All,
>
> I'm trying to run Pig e2e tests in parallel and there are many
> failures like this in local mode:
>
> WARN org.apache.hadoop.mapred.Task - Could not find output size
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find output/file.out in any of the configured local directories
>     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:429)
>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
>     at org.apache.hadoop.mapred.MapOutputFile.getOutputFile(MapOutputFile.java:56)
>     at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:944)
>     at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:924)
>     at org.apache.hadoop.mapred.Task.done(Task.java:875)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:374)
>
> It seems that the problem is in concurrent access to the JobTracker's
> temporary directory - file.out is a temporary JobTracker's file.
> It's clearly visible that different tests open files in the same directory:
>
> $ lsof | grep output
> java 20719 ikatsov 13r REG 8,1 3486 17039996 /tmp/hadoop-ikatsov/mapred/local/taskTracker/ikatsov/jobcache/job_local_0001/attempt_local_0001_r_000000_0/output/map_0.out
> java 20719 ikatsov 16r REG 8,1 349196 17039986 /tmp/hadoop-ikatsov/mapred/local/taskTracker/ikatsov/jobcache/job_local_0001/attempt_local_0001_r_000000_0/output/map_1.out
>
> $ lsof | grep output
> java 25410 ikatsov 13w REG 8,1 8145 17039997 /tmp/hadoop-ikatsov/mapred/local/taskTracker/ikatsov/jobcache/job_local_0001/attempt_local_0001_m_000000_0/output/spill0.out
>
> $ lsof | grep output
> java 2223 ikatsov 13r REG 8,1 289196 16384629 /tmp/hadoop-ikatsov/mapred/local/taskTracker/ikatsov/jobcache/job_local_0003/attempt_local_0003_r_000000_0/output/map_0.out
>
> $ lsof | grep output
> java 12187 ikatsov 14r REG 8,1 349196 17039996 /tmp/hadoop-ikatsov/mapred/local/taskTracker/ikatsov/jobcache/job_local_0001/attempt_local_0001_r_000000_0/output/map_0.out
> java 12187 ikatsov 17r REG 8,1 349196 17039999 /tmp/hadoop-ikatsov/mapred/local/taskTracker/ikatsov/jobcache/job_local_0001/attempt_local_0001_r_000000_0/output/map_1.out
>
> I wonder, is there a way to specify the temporary Hadoop directory
> (mapreduce.cluster.local.dir) when launching Pig in local mode?
>
> Thank you in advance,
> Ilya