Crash with multiple temp directories
------------------------------------

         Key: NUTCH-170
         URL: http://issues.apache.org/jira/browse/NUTCH-170
     Project: Nutch
        Type: Bug
    Reporter: Rod Taylor
    Priority: Critical


A brief read of the code suggested that multiple local directories can be
configured by giving mapred.local.dir a comma-separated value, like this:

  <property>
    <name>mapred.local.dir</name>
    <value>/local,/local1,/local2</value>
    <description>The local directory where MapReduce stores intermediate
    data files.
    </description>
  </property>
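
For illustration only, here is a minimal sketch of how a comma-separated
value like the one above could be split and how a directory could be chosen
per file. The class LocalDirSketch and the methods splitDirs and pickDir are
hypothetical names invented for this example; they are not Nutch code and do
not describe how Nutch actually handles mapred.local.dir.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Nutch.
class LocalDirSketch {

    /** Splits a comma-separated list, trimming whitespace and dropping empty entries. */
    static List<String> splitDirs(String value) {
        List<String> dirs = new ArrayList<>();
        for (String part : value.split(",")) {
            String dir = part.trim();
            if (!dir.isEmpty()) {
                dirs.add(dir);
            }
        }
        return dirs;
    }

    /**
     * Picks a directory from the file name alone, so that a writer and a
     * later reader of the same file resolve to the same directory.
     */
    static String pickDir(List<String> dirs, String fileName) {
        int index = Math.floorMod(fileName.hashCode(), dirs.size());
        return dirs.get(index);
    }

    public static void main(String[] args) {
        List<String> dirs = splitDirs("/local,/local1,/local2");
        System.out.println(dirs);
        System.out.println(pickDir(dirs, "part-00000"));
    }
}
```

The choice is made deterministic on purpose: if the writing and reading sides
picked directories independently, a reader could look for a file in a
directory other than the one it was written to.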

This configuration failed with the exception below during either the generate
or the update phase (I am not sure which).

java.lang.ArrayIndexOutOfBoundsException
        at java.util.zip.CRC32.update(CRC32.java:51)
        at org.apache.nutch.fs.NFSDataInputStream$Checker.read(NFSDataInputStream.java:92)
        at org.apache.nutch.fs.NFSDataInputStream$PositionCache.read(NFSDataInputStream.java:156)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
        at java.io.DataInputStream.readFully(DataInputStream.java:176)
        at org.apache.nutch.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:55)
        at org.apache.nutch.io.DataOutputBuffer.write(DataOutputBuffer.java:89)
        at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:378)
        at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:301)
        at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:323)
        at org.apache.nutch.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:60)
        at org.apache.nutch.segment.SegmentReader$InputFormat$1.next(SegmentReader.java:80)
        at org.apache.nutch.mapred.MapTask$2.next(MapTask.java:106)
        at org.apache.nutch.mapred.MapRunner.run(MapRunner.java:48)
        at org.apache.nutch.mapred.MapTask.run(MapTask.java:116)
        at org.apache.nutch.mapred.TaskTracker$Child.main(TaskTracker.java:604)
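
The top frame is java.util.zip.CRC32.update, which throws
ArrayIndexOutOfBoundsException when the offset and length passed to it do not
fit inside the byte array. The standalone sketch below only demonstrates that
JDK behaviour. The class Crc32BoundsSketch and the method rejects are
hypothetical names for this example, and the sketch does not establish why
the Nutch checksum reader passed such a range.

```java
import java.util.zip.CRC32;

// Hypothetical demo class, not part of Nutch.
class Crc32BoundsSketch {

    /** Returns true if CRC32.update rejects the given offset/length for buf. */
    static boolean rejects(byte[] buf, int off, int len) {
        CRC32 crc = new CRC32();
        try {
            crc.update(buf, off, len);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] buf = new byte[8];
        // The whole buffer is a valid range.
        System.out.println(rejects(buf, 0, 8));
        // Offset plus length runs past the end of the buffer.
        System.out.println(rejects(buf, 4, 8));
        // A negative length, such as an unchecked -1 from read(), is also rejected.
        System.out.println(rejects(buf, 0, -1));
    }
}
```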

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
