I think I found the problem. At MapOutputFile.java:123:

    bytesToRead = Math.min((int) unread, buffer.length);

If unread is greater than 2^31 - 1 (Integer.MAX_VALUE), the (int) cast keeps
only the low 32 bits, so bytesToRead can come out negative. A negative length
would explain the IndexOutOfBoundsException thrown from readFully in the trace
quoted below.

M> It seems that the new version fixed this problem; I haven't seen this
M> error anymore, but a new problem arose during the indexing process (I'm
M> using mapred revision 291801):
M>
M> I'm trying to index via "./nutch index"; the segments were created by a
M> slightly modified version of the crawl.Crawl class. With 1-2 segments
M> everything works OK; with about 20 segments the task tracker logs on both
M> servers show a repeating error block:
M>
M> 050926 180831 task_r_o4tt4z Got 1 map output locations.
M> 050926 180831 Client connection to 127.0.0.1:60218: starting
M> 050926 180831 Server connection on port 60218 from 127.0.0.1: starting
M> 050926 180831 Client connection to 127.0.0.1:60218 caught:
M> java.lang.IndexOutOfBoundsException
M> java.lang.IndexOutOfBoundsException
M>     at java.io.DataInputStream.readFully(DataInputStream.java:263)
M>     at org.apache.nutch.mapred.MapOutputFile.readFields(MapOutputFile.java:123)
M>     at org.apache.nutch.io.ObjectWritable.readObject(ObjectWritable.java:232)
M>     at org.apache.nutch.io.ObjectWritable.readFields(ObjectWritable.java:60)
M>     at org.apache.nutch.ipc.Client$Connection.run(Client.java:163)
M> 050926 180831 Client connection to 127.0.0.1:60218: closing
M> 050926 180831 Server handler on 60218 caught:
M> java.net.SocketException: Connection reset
M> java.net.SocketException: Connection reset
M>     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
M>     at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
M>     at java.io.BufferedOutputStream.write(BufferedOutputStream.java:106)
M>     at java.io.DataOutputStream.write(DataOutputStream.java:85)
M>     at org.apache.nutch.mapred.MapOutputFile.write(MapOutputFile.java:98)
M>     at org.apache.nutch.io.ObjectWritable.writeObject(ObjectWritable.java:117)
M>     at org.apache.nutch.io.ObjectWritable.write(ObjectWritable.java:64)
M>     at org.apache.nutch.ipc.Server$Handler.run(Server.java:213)
M> 050926 180831 Server connection on port 60218 from 127.0.0.1: exiting
M> 050926 180931 task_r_o4tt4z copy failed: task_m_ypindn from goku1.deeptown.net/127.0.0.1:60218
M> java.io.IOException: timed out waiting for response
M>     at org.apache.nutch.ipc.Client.call(Client.java:296)
M>     at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
M>     at $Proxy2.getFile(Unknown Source)
M>     at org.apache.nutch.mapred.ReduceTaskRunner.prepare(ReduceTaskRunner.java:94)
M>     at org.apache.nutch.mapred.TaskRunner.run(TaskRunner.java:61)
M>
M> Michael
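
To make the overflow at MapOutputFile.java:123 concrete, here is a small standalone sketch. It is not the actual Nutch code: the class name, the method names, the 8192-byte buffer size and the 3 GiB value for unread are all made up for illustration. It compares the quoted expression with one possible fix, which is to take the minimum as a long first and only then narrow to int.

```java
public class BytesToReadSketch {

    // Same shape as the expression quoted from MapOutputFile.java:123:
    // the long is narrowed to int BEFORE the comparison.
    static int castThenMin(long unread, int bufferLength) {
        return Math.min((int) unread, bufferLength);
    }

    // Possible fix (an assumption, not the committed patch): compare as
    // long, then narrow. The result never exceeds bufferLength, so the
    // cast is safe as long as unread is not negative.
    static int minThenCast(long unread, int bufferLength) {
        return (int) Math.min(unread, (long) bufferLength);
    }

    public static void main(String[] args) {
        long unread = 3L * 1024 * 1024 * 1024; // 3 GiB left to read (hypothetical)
        int bufferLength = 8192;               // hypothetical buffer size

        System.out.println(castThenMin(unread, bufferLength)); // prints -1073741824
        System.out.println(minThenCast(unread, bufferLength)); // prints 8192
    }
}
```

With the second form, the length passed to readFully stays between 0 and buffer.length for any non-negative unread, so map outputs larger than 2 GB should no longer trip the bounds check.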
