Crash with multiple temp directories
------------------------------------
Key: NUTCH-170
URL: http://issues.apache.org/jira/browse/NUTCH-170
Project: Nutch
Type: Bug
Reporter: Rod Taylor
Priority: Critical
A brief read of the code suggested that multiple local directories could be
configured with something like the following:
<property>
  <name>mapred.local.dir</name>
  <value>/local,/local1,/local2</value>
  <description>The local directory where MapReduce stores intermediate
  data files.
  </description>
</property>
This configuration failed with the exception below during either the generate
or the update phase (I am not entirely sure which).
java.lang.ArrayIndexOutOfBoundsException
at java.util.zip.CRC32.update(CRC32.java:51)
at org.apache.nutch.fs.NFSDataInputStream$Checker.read(NFSDataInputStream.java:92)
at org.apache.nutch.fs.NFSDataInputStream$PositionCache.read(NFSDataInputStream.java:156)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
at java.io.DataInputStream.readFully(DataInputStream.java:176)
at org.apache.nutch.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:55)
at org.apache.nutch.io.DataOutputBuffer.write(DataOutputBuffer.java:89)
at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:378)
at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:301)
at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:323)
at org.apache.nutch.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:60)
at org.apache.nutch.segment.SegmentReader$InputFormat$1.next(SegmentReader.java:80)
at org.apache.nutch.mapred.MapTask$2.next(MapTask.java:106)
at org.apache.nutch.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.nutch.mapred.MapTask.run(MapTask.java:116)
at org.apache.nutch.mapred.TaskTracker$Child.main(TaskTracker.java:604)
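
The top frame of the trace is java.util.zip.CRC32.update, which throws an
ArrayIndexOutOfBoundsException when it is handed an offset/length pair that
does not fit inside the byte array. The sketch below only illustrates that
condition; it assumes the failing call is the update(byte[], int, int)
overload (the trace does not say which), and the class and method names are
hypothetical, not part of Nutch. It does not explain why the checksum reader
would pass such a range when several local directories are configured.

```java
import java.util.zip.CRC32;

// Hypothetical demo class, not part of Nutch.
public class Crc32BoundsDemo {

    // Returns true if CRC32.update(buf, off, len) rejects the given range
    // with an ArrayIndexOutOfBoundsException, false if the update succeeds.
    public static boolean rejectsRange(byte[] buf, int off, int len) {
        CRC32 crc = new CRC32();
        try {
            crc.update(buf, off, len);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] buf = new byte[16];
        // The range fits the buffer, so the update is accepted.
        System.out.println(rejectsRange(buf, 0, 16));
        // The range runs past the end of the buffer, so it is rejected.
        System.out.println(rejectsRange(buf, 8, 16));
        // A negative length is rejected as well.
        System.out.println(rejectsRange(buf, 0, -1));
    }
}
```

If this is the overload involved, the exception would point at a length or
offset reported by the stream read that does not match the buffer being
checksummed, which may help narrow down where to look in
NFSDataInputStream$Checker.read.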