Crash with multiple temp directories
------------------------------------

         Key: NUTCH-170
         URL: http://issues.apache.org/jira/browse/NUTCH-170
     Project: Nutch
        Type: Bug
    Reporter: Rod Taylor
    Priority: Critical


A brief read of the code suggested that multiple local directories can be
configured as a comma-separated list, with something like the following:

  <property>
    <name>mapred.local.dir</name>
    <value>/local,/local1,/local2</value>
    <description>The local directory where MapReduce stores intermediate
    data files.
    </description>
  </property>
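
For illustration, here is a minimal sketch of how a comma-separated value like
the one above could be split and a directory chosen per file. This is not
Nutch's actual code: the class LocalDirs and the methods parse and pick are
hypothetical names, and the hash-based choice is only an assumption about one
reasonable policy. The point it shows is that the choice must be deterministic,
so that a reader looks in the same directory the writer used.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Nutch.
public class LocalDirs {

    // Split a comma-separated property value into trimmed, non-empty entries.
    static List<String> parse(String value) {
        List<String> dirs = new ArrayList<>();
        for (String part : value.split(",")) {
            String dir = part.trim();
            if (!dir.isEmpty()) {
                dirs.add(dir);
            }
        }
        return dirs;
    }

    // Choose a directory from the file name alone, so the same name always
    // maps to the same directory (assumed policy, for illustration only).
    static String pick(List<String> dirs, String fileName) {
        if (dirs.isEmpty()) {
            throw new IllegalArgumentException("no local directories configured");
        }
        // Mask off the sign bit so the index is never negative.
        int index = (fileName.hashCode() & Integer.MAX_VALUE) % dirs.size();
        return dirs.get(index) + "/" + fileName;
    }

    public static void main(String[] args) {
        List<String> dirs = parse("/local,/local1,/local2");
        System.out.println(pick(dirs, "part-00000"));
    }
}
```

If the writing and reading sides disagreed on which directory holds a file, the
reader could end up with a missing or mismatched data/checksum pair. Whether
that is what happens here is not established by this report.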

This failed with the exception below during either the generate or the update
phase (I am not entirely sure which).

java.lang.ArrayIndexOutOfBoundsException
        at java.util.zip.CRC32.update(CRC32.java:51)
        at org.apache.nutch.fs.NFSDataInputStream$Checker.read(NFSDataInputStream.java:92)
        at org.apache.nutch.fs.NFSDataInputStream$PositionCache.read(NFSDataInputStream.java:156)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
        at java.io.DataInputStream.readFully(DataInputStream.java:176)
        at org.apache.nutch.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:55)
        at org.apache.nutch.io.DataOutputBuffer.write(DataOutputBuffer.java:89)
        at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:378)
        at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:301)
        at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:323)
        at org.apache.nutch.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:60)
        at org.apache.nutch.segment.SegmentReader$InputFormat$1.next(SegmentReader.java:80)
        at org.apache.nutch.mapred.MapTask$2.next(MapTask.java:106)
        at org.apache.nutch.mapred.MapRunner.run(MapRunner.java:48)
        at org.apache.nutch.mapred.MapTask.run(MapTask.java:116)
        at org.apache.nutch.mapred.TaskTracker$Child.main(TaskTracker.java:604)
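
The top frame is java.util.zip.CRC32.update. As a small sketch of when that
call throws, the standard update(byte[], int, int) overload rejects a negative
offset, a negative length, or an offset/length pair that runs past the end of
the buffer. The class Crc32Bounds and the method rejects below are hypothetical
names used only for this demonstration. Which of these conditions the checksum
reader hit is not known from the trace; a length of -1 (for example an
end-of-stream return value passed through unchecked) is only one possibility.

```java
import java.util.zip.CRC32;

// Hypothetical demonstration class, not part of Nutch.
public class Crc32Bounds {

    // Returns true if CRC32.update rejects the given offset and length
    // for a zero-filled buffer of bufSize bytes.
    static boolean rejects(int bufSize, int off, int len) {
        CRC32 crc = new CRC32();
        try {
            crc.update(new byte[bufSize], off, len);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("in range:        " + rejects(16, 0, 16));
        System.out.println("negative length: " + rejects(16, 0, -1));
        System.out.println("past the end:    " + rejects(16, 8, 16));
    }
}
```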
