Hi there,

I've got a simple MapReduce application that works perfectly when I use NFS as the underlying filesystem (not using HDFS at all). I also have a working HDFS configuration - the grep example runs fine with it.

However, when I try to run the same application on HDFS instead of NFS, I keep receiving an "IOException: Filesystem closed" exception and the job fails. I've spent a day searching for a solution with Google and scanning through old archives, but no results so far...

Job summary is:
--->output
10/05/26 17:29:13 INFO mapred.JobClient: Job complete: job_201005261710_0002
10/05/26 17:29:13 INFO mapred.JobClient: Counters: 4
10/05/26 17:29:13 INFO mapred.JobClient:   Job Counters
10/05/26 17:29:13 INFO mapred.JobClient:     Rack-local map tasks=12
10/05/26 17:29:13 INFO mapred.JobClient:     Launched map tasks=16
10/05/26 17:29:13 INFO mapred.JobClient:     Data-local map tasks=4
10/05/26 17:29:13 INFO mapred.JobClient:     Failed map tasks=1

Each map task attempt's log reads something like:
--->attempt_201005261710_0001_m_000000_3/syslog:
2010-05-26 17:13:47,297 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2010-05-26 17:13:47,470 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
2010-05-26 17:13:47,688 INFO org.apache.hadoop.mapred.MapTask: data buffer = 79691776/99614720
2010-05-26 17:13:47,688 INFO org.apache.hadoop.mapred.MapTask: record buffer = 262144/327680
2010-05-26 17:13:47,712 INFO org.apache.hadoop.mapred.MapTask: Starting flush of map output
2010-05-26 17:13:47,784 INFO org.apache.hadoop.mapred.MapTask: Finished spill 0
2010-05-26 17:13:47,788 INFO org.apache.hadoop.mapred.TaskRunner: Task:attempt_201005261710_0001_m_000000_3 is done. And is in the process of commiting
2010-05-26 17:13:47,797 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:226)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:617)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:453)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:648)
        at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.needsTaskCommit(FileOutputCommitter.java:217)
        at org.apache.hadoop.mapred.Task.done(Task.java:671)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:309)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
2010-05-26 17:13:47,802 INFO org.apache.hadoop.mapred.TaskRunner: Runnning cleanup for the task
2010-05-26 17:13:47,802 WARN org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Error discarding output
java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:226)
        at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:580)
        at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:227)
        at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.abortTask(FileOutputCommitter.java:179)
        at org.apache.hadoop.mapred.Task.taskCleanup(Task.java:815)
        at org.apache.hadoop.mapred.Child.main(Child.java:191)

No reduce tasks are run, since the map tasks never manage to commit their output.

These exceptions are visible in the JobTracker's log as well. What is the reason for this exception? Is it critical? (I guess it is, but it's listed in the JobTracker's log as INFO, not ERROR.)

My config (I'm not sure which directories should be local and which should be on HDFS; maybe the issue is somewhere here?):

---->core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://blade02:5432/</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop/tmp</value> <!-- local -->
</property>

</configuration>

---->hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/tmp/hadoop/name2</value> <!-- local dir where HDFS is located-->
</property>
<property>
<name>dfs.data.dir</name>
<value>/tmp/hadoop/data</value> <!-- local dir where HDFS is located -->
</property>
</configuration>

---->mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>blade02:5435</value>
</property>
<property>
<name>mapred.temp.dir</name>
<value>mapred_tmp</value> <!-- on HDFS I suppose -->
</property>
<property>
<name>mapred.system.dir</name>
<value>system</value> <!-- on HDFS I suppose -->
</property>
<property>
<name>mapred.local.dir</name>
<value>/tmp/hadoop/local</value> <!-- local -->
</property>
<property>
<name>mapred.task.tracker.http.address</name>
<value>0.0.0.0:0</value>
</property>
<property>
<name>mapred.textoutputformat.separator</name>
<value>,</value>
</property>
</configuration>
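
To see where relative values like mapred.system.dir and mapred.temp.dir actually end up, a quick standalone sketch along these lines (illustrative only, not part of my job) prints them qualified against the default filesystem:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ResolvePaths {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // picks up core-site.xml etc. from the classpath
        FileSystem fs = FileSystem.get(conf);       // default FS, i.e. hdfs://blade02:5432/
        // Relative paths are qualified against the default filesystem
        // and the working directory on it:
        System.out.println(new Path("system").makeQualified(fs));
        System.out.println(new Path("mapred_tmp").makeQualified(fs));
    }
}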

I'm using Hadoop 0.20.2 (new API -> org.apache.hadoop.mapreduce.*, default OutputFormat and RecordWriter), running on a 3-node cluster (blade02, blade03, blade04). blade02 is the master, and all three nodes are slaves. My OS: Linux blade02 2.6.9-42.0.2.ELsmp #1 SMP Tue Aug 22 17:26:55 CDT 2006 i686 i686 i386 GNU/Linux.
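
The driver itself is just the standard new-API boilerplate, roughly along these lines (class names simplified here for illustration, reducer setup omitted):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TermVectorJob {                         // illustrative driver name
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "termvector");
        job.setJarByClass(TermVectorJob.class);
        job.setMapperClass(TermVectorMapper.class);  // illustrative mapper name
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(TermVector.class);
        // default output format / record writer, so FileOutputCommitter handles task commit
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}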

Note that there are currently 3 filesystems in my setup:
/tmp/* - the local filesystem on each node
/home/* - NFS, shared by all nodes - this is where Hadoop is installed
hdfs://blade02:5432/* - HDFS

I'm not sure if this is relevant, but the intermediate (key, value) pairs are of type (Text, TermVector), and TermVector's Writable methods are implemented like this:
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

public class TermVector implements Writable {
    private Map<Text, IntWritable> vec = new HashMap<Text, IntWritable>();

    @Override
    public void write(DataOutput out) throws IOException {
        // write the number of entries, then each (term, count) pair
        out.writeInt(vec.size());
        for (Map.Entry<Text, IntWritable> e : vec.entrySet()) {
            e.getKey().write(out);
            e.getValue().write(out);
        }
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // clear old state first, since Hadoop reuses Writable instances
        vec.clear();
        int n = in.readInt();
        for (int i = 0; i < n; ++i) {
            Text t = new Text();
            t.readFields(in);
            IntWritable iw = new IntWritable();
            iw.readFields(in);
            vec.put(t, iw);
        }
    }
    ...
}
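
A plain serialization round-trip outside of MapReduce, roughly like the sketch below, can be used to sanity-check the Writable (the commented-out add() call stands in for a hypothetical accessor that isn't shown above):

import org.apache.hadoop.io.DataInputBuffer;
import org.apache.hadoop.io.DataOutputBuffer;

public class TermVectorRoundTrip {
    public static void main(String[] args) throws Exception {
        TermVector original = new TermVector();
        // original.add(new Text("term"), new IntWritable(3));  // hypothetical accessor

        DataOutputBuffer out = new DataOutputBuffer();
        original.write(out);                        // serialize

        DataInputBuffer in = new DataInputBuffer();
        in.reset(out.getData(), out.getLength());
        TermVector copy = new TermVector();
        copy.readFields(in);                        // deserialize into a fresh instance
    }
}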

Any help appreciated.

Many thanks,
Marcin Sieniek
