In case it is required, I was trying to run this using 400 mappers (my DFS block size is 128MB) and 4 reducers. Each of my machines has a 2.4 GHz 64-bit quad-core Xeon E5530 "Nehalem" processor, and I am using 32-bit Ubuntu 10.04.
-Virajith

On Thu, Jun 23, 2011 at 3:09 PM, Virajith Jalaparti <virajit...@gmail.com> wrote:

> Hi,
>
> I am trying to run a sort job (from hadoop-0.20.2-examples.jar) on 50GB of
> data (generated using randomwriter). I am using hadoop-0.20.2 on a cluster
> of 3 machines with one machine serving as the master and the other two as
> slaves.
> I get the following errors for the various task attempts:
> =======================================================================
> 11/06/23 07:57:14 INFO mapred.JobClient: Task Id :
> attempt_201106230747_0001_m_000119_0, Status : FAILED
> Error: java.io.IOException: No space left on device
>         at java.io.FileOutputStream.writeBytes(Native Method)
>         at java.io.FileOutputStream.write(FileOutputStream.java:282)
>         at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:190)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at java.io.FilterOutputStream.close(FilterOutputStream.java:140)
>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
>         at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1298)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)
>
> Error initializing attempt_201106230747_0001_m_000119_0:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any
> valid local directory for taskTracker/jobcache/job_201106230747_0001/job.xml
>         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
>         at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:750)
>         at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1664)
>         at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:97)
>         at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1629)
> =======================================================================
>
> The dfsadmin -report gives me the following:
>
> ==================================================================
> Configured Capacity: 465230045184 (433.28 GB)
> Present Capacity: 440799092736 (410.53 GB)
> DFS Remaining: 371988148224 (346.44 GB)
> DFS Used: 68810944512 (64.09 GB)
> DFS Used%: 15.61%
> Under replicated blocks: 1
> Blocks with corrupt replicas: 0
> Missing blocks: 0
>
> -------------------------------------------------
> Datanodes available: 2 (2 total, 0 dead)
>
> Name: 10.1.1.4:50010
> Decommission Status : Normal
> Configured Capacity: 232615022592 (216.64 GB)
> DFS Used: 32243871744 (30.03 GB)
> Non DFS Used: 12215377920 (11.38 GB)
> DFS Remaining: 188155772928(175.23 GB)
> DFS Used%: 13.86%
> DFS Remaining%: 80.89%
> Last contact: Thu Jun 23 08:04:51 MDT 2011
>
>
> Name: 10.1.1.3:50010
> Decommission Status : Normal
> Configured Capacity: 232615022592 (216.64 GB)
> DFS Used: 36567072768 (34.06 GB)
> Non DFS Used: 12215574528 (11.38 GB)
> DFS Remaining: 183832375296(171.21 GB)
> DFS Used%: 15.72%
> DFS Remaining%: 79.03%
> Last contact: Thu Jun 23 08:04:51 MDT 2011
>
> ==================================================================
>
> I have the following parameters configured in core-site.xml and
> mapred-site.xml:
>
> *core-site.xml:*
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/mnt/local/mapred/</value>
> </property>
> </configuration>
>
> *mapred-site.xml:*
>   <name>mapred.system.dir</name>
>   <value>/mnt/local/mapred/system</value>
> </property>
>
> <property>
>   <name>mapred.local.dir</name>
>   <value>/mnt/local/mapred/local</value>
> </property>
>
> <property>
>   <name>mapred.temp.dir</name>
>   <value>/mnt/local/mapred/temp</value>
> </property>
>
> /mnt/ is on a local disk at each node in my cluster and it is just 17% full,
> with a total disk capacity of around 220GB. Each of the above directories
> is created with read/write permissions.
>
> I don't see why I am getting the "No space left on device" error from these
> configurations. Any ideas how to solve this problem?
>
> Thanks,
> Virajith
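
One way to double-check the "17% full" observation on each slave is to ask the operating system directly how much space is free on the filesystem that actually backs the configured directories. The following is only a minimal sketch, not part of Hadoop: the helper names (`nearest_existing`, `free_space_report`) are hypothetical, and the list is limited to the `hadoop.tmp.dir` and `mapred.local.dir` values quoted above on the assumption that these are the local-disk settings relevant to the `LocalDirAllocator` error.

```python
import os
import shutil

# Paths copied from the configuration quoted above (hadoop.tmp.dir and
# mapred.local.dir). Assumption: these are the local-disk directories
# that matter for the spill and job-localization errors.
LOCAL_DIRS = [
    "/mnt/local/mapred/",
    "/mnt/local/mapred/local",
]


def nearest_existing(path):
    """Walk up from `path` until a path that exists is found."""
    path = os.path.abspath(path)
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return path


def free_space_report(dirs):
    """Return (dir, probed_path, free_gib, total_gib) for each directory.

    If a directory does not exist, its nearest existing ancestor is
    probed instead, which shows where the directory would be created.
    """
    report = []
    for d in dirs:
        probe = nearest_existing(d)
        usage = shutil.disk_usage(probe)
        report.append((d, probe, usage.free / 2**30, usage.total / 2**30))
    return report


if __name__ == "__main__":
    for d, probe, free_gib, total_gib in free_space_report(LOCAL_DIRS):
        print(f"{d} (probed {probe}): "
              f"{free_gib:.1f} GiB free of {total_gib:.1f} GiB")
```

Run on each slave as the user that starts the TaskTracker. If the reported total is far below the roughly 220GB expected for /mnt/, the directories may be resolving to a different, smaller filesystem than intended.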