Worth checking whether this is caused by the open-file-limit issue: http://sudhirvn.blogspot.com/2010/07/hadoop-error-logs-orgapachehadoophdfsse.html
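If file descriptors are the suspect, the usage of a running JVM can be cross-checked without restarting anything. A minimal sketch (the class name FdCheck is mine; it assumes a HotSpot/OpenJDK JVM on Unix, where the com.sun.management extension is available):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

public class FdCheck {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        long open = -1, max = -1;
        // On Unix JVMs the bean exposes file-descriptor counters.
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            open = unix.getOpenFileDescriptorCount(); // descriptors in use right now
            max  = unix.getMaxFileDescriptorCount();  // the ulimit -n ceiling
        }
        System.out.println("open fds: " + open + " / limit: " + max);
    }
}
```

On the cluster itself, `ulimit -n` for the user account running the datanode/tasktracker processes is the number that blog post is about; the JVM view above is just a quick sanity check from inside a task.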
On 10/4/10 2:04 AM, "[email protected]" <[email protected]> wrote:

> From: AJ Chen <[email protected]>
> Date: Sat, 2 Oct 2010 10:28:29 -0700
> To: <[email protected]>, nutch-user <[email protected]>
> Subject: Re: hadoop or nutch problem?
>
> More observations: while the Hadoop job is running, this "Filesystem closed"
> error happens consistently.
>
> 2010-10-02 05:29:58,951 WARN mapred.TaskTracker - Error running child
> java.io.IOException: Filesystem closed
>     at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:226)
>     at org.apache.hadoop.hdfs.DFSClient.access$600(DFSClient.java:67)
>     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.close(DFSClient.java:1678)
>     at java.io.FilterInputStream.close(FilterInputStream.java:155)
>     at org.apache.hadoop.io.SequenceFile$Reader.close(SequenceFile.java:1584)
>     at org.apache.hadoop.mapred.SequenceFileRecordReader.close(SequenceFileRecordReader.java:125)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.close(MapTask.java:198)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:362)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
> 2010-10-02 05:29:58,979 WARN mapred.TaskRunner - Parent died. Exiting attempt_201009301134_0006_m_000074_1
>
> Could this error turn on safemode in Hadoop? I suspect this because the next
> Hadoop job is supposed to create a segment directory and write out segment
> results, but it does not create the directory. Could anything else have
> happened to HDFS?
>
> thanks,
> -aj
>
> On Tue, Sep 28, 2010 at 4:40 PM, AJ Chen <[email protected]> wrote:
>
>> I'm doing web crawling using Nutch, which runs on Hadoop in distributed
>> mode. When the crawldb has tens of millions of URLs, I have started to see
>> strange failures in generating a new segment and updating the crawldb.
>> For segment generation, the Hadoop job for select completes successfully
>> and generate-temp-1285641291765 is created, but it does not start the
>> partition job, and the segment is not created in the segments directory.
>> I am trying to understand where it fails. There is no error message except
>> for a few WARN messages about "connection reset by peer". Hadoop fsck and
>> dfsadmin show the nodes and directories are healthy. Is this a Hadoop
>> problem or a Nutch problem? I'd appreciate any suggestions on how to debug
>> this fatal problem.
>>
>> A similar problem is seen for the updatedb step, which creates the temp
>> dir but never actually updates the crawldb.
>>
>> thanks,
>> aj
>> --
>> AJ Chen, PhD
>> Chair, Semantic Web SIG, sdforum.org
>> web2express.org
>> twitter: @web2express
>> Palo Alto, CA, USA
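For what it's worth, the "Filesystem closed" trace above is the classic symptom of one consumer closing a handle that others still share: FileSystem.get() returns a cached, per-JVM instance, so a close() anywhere invalidates it everywhere, and the next checkOpen() throws. A plain-JDK sketch of the same failure mode (no Hadoop on the classpath; all names here are mine):

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class SharedHandleDemo {
    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, "hello".getBytes());

        // Two aliases for the SAME stream, standing in for two tasks
        // holding the one cached FileSystem/DFSClient instance.
        InputStream shared = new FileInputStream(tmp.toFile());
        InputStream taskA = shared;
        InputStream taskB = shared;

        taskA.close(); // the first task finishes and closes "its" handle

        try {
            taskB.read(); // the second task is now broken through no fault of its own
            System.out.println("read ok");
        } catch (IOException e) {
            System.out.println("taskB failed: " + e.getMessage());
        } finally {
            Files.delete(tmp);
        }
    }
}
```

In Hadoop, the usual triggers are user or library code calling FileSystem.close() directly, or the JVM shutdown hook racing task cleanup (consistent with the "Parent died" line above). Later Hadoop versions expose a fs.hdfs.impl.disable.cache property to give each caller a private instance, though I'm not sure it exists in the version this thread is running.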
