Hi Marcin, did you solve this error? I stumbled into the same thing, and I have no NFS involved either...
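For what it's worth, the explanation I keep running into is that FileSystem.get() hands back a cached, JVM-wide instance, so a single close() anywhere in the task JVM also invalidates the copy the framework is still holding. A minimal stand-alone sketch of that behaviour (hypothetical snippet; it needs a live cluster to talk to, and the URI is just the one from your config below):

// Sketch only: FileSystem.get() returns a shared, cached instance, so
// closing it once breaks every later caller in the same JVM.
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsCacheDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        URI uri = URI.create("hdfs://blade02:5432/");   // taken from the quoted core-site.xml

        FileSystem a = FileSystem.get(uri, conf);
        FileSystem b = FileSystem.get(uri, conf);
        System.out.println(a == b);      // true: both names point at the same cached object

        a.close();                       // closes the shared instance...
        b.exists(new Path("/"));         // ...so this throws "IOException: Filesystem closed"
    }
}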
Johannes

> Hi there,
>
> I've got a simple MapReduce application that works perfectly when I use
> NFS as the underlying filesystem (not using HDFS at all).
> I've got a working HDFS configuration as well - the grep example works for
> me with this configuration.
>
> However, when I try to run the same application on HDFS instead of NFS, I
> keep receiving an "IOException: Filesystem closed" exception and the job
> fails.
> I've spent a day searching for a solution with Google and scanning through
> old archives, but no results so far...
>
> The job summary is:
>
> --->output
> 10/05/26 17:29:13 INFO mapred.JobClient: Job complete: job_201005261710_0002
> 10/05/26 17:29:13 INFO mapred.JobClient: Counters: 4
> 10/05/26 17:29:13 INFO mapred.JobClient:   Job Counters
> 10/05/26 17:29:13 INFO mapred.JobClient:     Rack-local map tasks=12
> 10/05/26 17:29:13 INFO mapred.JobClient:     Launched map tasks=16
> 10/05/26 17:29:13 INFO mapred.JobClient:     Data-local map tasks=4
> 10/05/26 17:29:13 INFO mapred.JobClient:     Failed map tasks=1
>
> Each map task's attempt log reads something like:
>
> --->attempt_201005261710_0001_m_000000_3/syslog:
> 2010-05-26 17:13:47,297 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
> 2010-05-26 17:13:47,470 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
> 2010-05-26 17:13:47,688 INFO org.apache.hadoop.mapred.MapTask: data buffer = 79691776/99614720
> 2010-05-26 17:13:47,688 INFO org.apache.hadoop.mapred.MapTask: record buffer = 262144/327680
> 2010-05-26 17:13:47,712 INFO org.apache.hadoop.mapred.MapTask: Starting flush of map output
> 2010-05-26 17:13:47,784 INFO org.apache.hadoop.mapred.MapTask: Finished spill 0
> 2010-05-26 17:13:47,788 INFO org.apache.hadoop.mapred.TaskRunner: Task:attempt_201005261710_0001_m_000000_3 is done. And is in the process of commiting
> 2010-05-26 17:13:47,797 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
> java.io.IOException: Filesystem closed
>         at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:226)
>         at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:617)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:453)
>         at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:648)
>         at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.needsTaskCommit(FileOutputCommitter.java:217)
>         at org.apache.hadoop.mapred.Task.done(Task.java:671)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:309)
>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
> 2010-05-26 17:13:47,802 INFO org.apache.hadoop.mapred.TaskRunner: Runnning cleanup for the task
> 2010-05-26 17:13:47,802 WARN org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Error discarding output
> java.io.IOException: Filesystem closed
>         at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:226)
>         at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:580)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:227)
>         at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.abortTask(FileOutputCommitter.java:179)
>         at org.apache.hadoop.mapred.Task.taskCleanup(Task.java:815)
>         at org.apache.hadoop.mapred.Child.main(Child.java:191)
>
> No reduce tasks are run, as the map tasks haven't managed to save their output.
>
> These exceptions are visible in the JobTracker's log as well. What is the reason for this exception?
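The stack trace above is what makes me suspect the shared FileSystem instance: needsTaskCommit() runs after the map function has finished, and by then something has already closed the DFS client. The pattern that usually gets blamed looks roughly like the sketch below (a hypothetical mapper, not code from either of our jobs): FileSystem.get() returns the same cached object the framework later uses in the commit step, so a close() in user code breaks it.

// Hypothetical mapper illustrating the commonly reported cause of
// "Filesystem closed": closing the cached FileSystem in user code.
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SideDataMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private FileSystem fs;

    @Override
    protected void setup(Context context) throws IOException {
        // Same cached instance that FileOutputCommitter.needsTaskCommit() will use later.
        fs = FileSystem.get(context.getConfiguration());
        fs.exists(new Path("/side/data"));   // reading side data like this is fine
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        fs.close();   // DON'T: this also closes the instance the committer needs
    }
}

If nothing in either of our jobs (or in a library they pull in) ever calls close(), then this theory doesn't explain it, of course.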
> Is it critical? (I guess it is, but it's listed in the JobTracker's log as INFO, not ERROR.)
>
> My config (I'm not sure which directories should be local and which should be
> located on HDFS - maybe the issue is somewhere here?):
>
> ---->core-site.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://blade02:5432/</value>
>   </property>
>   <property>
>     <name>hadoop.tmp.dir</name>
>     <value>/tmp/hadoop/tmp</value> <!-- local -->
>   </property>
> </configuration>
>
> ---->hdfs-site.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <configuration>
>   <property>
>     <name>dfs.replication</name>
>     <value>1</value>
>   </property>
>   <property>
>     <name>dfs.name.dir</name>
>     <value>/tmp/hadoop/name2</value> <!-- local dir where HDFS is located -->
>   </property>
>   <property>
>     <name>dfs.data.dir</name>
>     <value>/tmp/hadoop/data</value> <!-- local dir where HDFS is located -->
>   </property>
> </configuration>
>
> ---->mapred-site.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <configuration>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>blade02:5435</value>
>   </property>
>   <property>
>     <name>mapred.temp.dir</name>
>     <value>mapred_tmp</value> <!-- on HDFS, I suppose -->
>   </property>
>   <property>
>     <name>mapred.system.dir</name>
>     <value>system</value> <!-- on HDFS, I suppose -->
>   </property>
>   <property>
>     <name>mapred.local.dir</name>
>     <value>/tmp/hadoop/local</value> <!-- local -->
>   </property>
>   <property>
>     <name>mapred.task.tracker.http.address</name>
>     <value>0.0.0.0:0</value>
>   </property>
>   <property>
>     <name>mapred.textoutputformat.separator</name>
>     <value>,</value>
>   </property>
> </configuration>
>
> I'm using Hadoop 0.20.2 (the new API -> org.apache.hadoop.mapreduce.*, with the
> default OutputFormat and RecordWriter), running on a 3-node cluster
> (blade02, blade03, blade04). blade02 is the master; all of them are slaves.
> My OS: Linux blade02 2.6.9-42.0.2.ELsmp #1 SMP Tue Aug 22 17:26:55 CDT 2006
> i686 i686 i386 GNU/Linux.
>
> Note that there are currently 3 filesystems in my configuration:
> /tmp/* - the local fs on each node
> /home/* - NFS, shared by all nodes - this is where Hadoop is installed
> hdfs://blade02:5432/* - HDFS
>
> I'm not sure if this is relevant, but the intermediate (key, value) pair is
> of type (Text, TermVector), and TermVector's Writable methods are implemented
> like this:
>
> public class TermVector implements Writable {
>     private Map<Text, IntWritable> vec = new HashMap<Text, IntWritable>();
>
>     @Override
>     public void write(DataOutput out) throws IOException {
>         out.writeInt(vec.size());
>         for (Map.Entry<Text, IntWritable> e : vec.entrySet()) {
>             e.getKey().write(out);
>             e.getValue().write(out);
>         }
>     }
>
>     @Override
>     public void readFields(DataInput in) throws IOException {
>         int n = in.readInt();
>         for (int i = 0; i < n; ++i) {
>             Text t = new Text();
>             t.readFields(in);
>             IntWritable iw = new IntWritable();
>             iw.readFields(in);
>             vec.put(t, iw);
>         }
>     }
>     ...
> }
>
> Any help appreciated.
>
> Many thanks,
> Marcin Sieniek
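One small thing I noticed in the quoted TermVector code, though it's probably unrelated to the exception: readFields() never clears vec, and Hadoop reuses Writable instances when deserializing, so entries read for one record can survive into the next. A defensive version would just reset the map first, something like:

    // Same readFields() as above, with the map reset at the start, since the
    // framework may reuse one TermVector instance across many records.
    @Override
    public void readFields(DataInput in) throws IOException {
        vec.clear();   // drop state left over from a previously read record
        int n = in.readInt();
        for (int i = 0; i < n; ++i) {
            Text t = new Text();
            t.readFields(in);
            IntWritable iw = new IntWritable();
            iw.readFields(in);
            vec.put(t, iw);
        }
    }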