I think I found the problem.
At MapOutputFile.java:123:
bytesToRead = Math.min((int) unread, buffer.length);

If unread is 2^31 or larger, the (int) cast overflows and bytesToRead can become negative, which readFully rejects with an IndexOutOfBoundsException.
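
To illustrate, here is a small standalone sketch of the overflow and of one possible way to avoid it. The class and method names (ClampDemo, buggy, fixed) and the 8192-byte buffer size are made up for the demo and are not from the Nutch source; I have not checked how the actual code was or will be changed.

```java
// Hypothetical demo class, not part of Nutch.
class ClampDemo {
    // Same shape as the line at MapOutputFile.java:123: the long is
    // narrowed to int BEFORE the comparison, so large values wrap around.
    static int buggy(long unread, int bufferLength) {
        return Math.min((int) unread, bufferLength);
    }

    // One possible fix (an assumption, not the committed change):
    // compare as longs first, then narrow. The result can never exceed
    // bufferLength, so the cast back to int is safe.
    static int fixed(long unread, int bufferLength) {
        return (int) Math.min(unread, (long) bufferLength);
    }

    public static void main(String[] args) {
        long unread = 3L * 1024 * 1024 * 1024; // 3 GB still to read
        int bufferLength = 8192;               // assumed buffer size
        System.out.println("buggy: " + buggy(unread, bufferLength));
        System.out.println("fixed: " + fixed(unread, bufferLength));
    }
}
```

With 3 GB unread, the buggy form returns a negative length, which is the kind of value that makes DataInputStream.readFully throw; the second form stays clamped to the buffer size.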

M> It seems that the new version fixed this problem; I haven't seen this
M> error anymore, but a new problem arose during the indexing process (I'm
M> using mapred revision 291801):

M> I'm trying to index via "./nutch index"; the segments were created by a
M> slightly modified version of the crawl.Crawl class. With 1-2 segments
M> everything works OK; with about 20 segments the task tracker logs on both
M> servers show a repeating error block:

M> 050926 180831 task_r_o4tt4z Got 1 map output locations.
M> 050926 180831 Client connection to 127.0.0.1:60218: starting
M> 050926 180831 Server connection on port 60218 from 127.0.0.1: starting
M> 050926 180831 Client connection to 127.0.0.1:60218 caught:
M> java.lang.IndexOutOfBoundsException
M> java.lang.IndexOutOfBoundsException
M>         at java.io.DataInputStream.readFully(DataInputStream.java:263)
M>         at org.apache.nutch.mapred.MapOutputFile.readFields(MapOutputFile.java:123)
M>         at org.apache.nutch.io.ObjectWritable.readObject(ObjectWritable.java:232)
M>         at org.apache.nutch.io.ObjectWritable.readFields(ObjectWritable.java:60)
M>         at org.apache.nutch.ipc.Client$Connection.run(Client.java:163)
M> 050926 180831 Client connection to 127.0.0.1:60218: closing
M> 050926 180831 Server handler on 60218 caught:
M> java.net.SocketException: Connection reset
M> java.net.SocketException: Connection reset
M>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
M>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
M>         at java.io.BufferedOutputStream.write(BufferedOutputStream.java:106)
M>         at java.io.DataOutputStream.write(DataOutputStream.java:85)
M>         at org.apache.nutch.mapred.MapOutputFile.write(MapOutputFile.java:98)
M>         at org.apache.nutch.io.ObjectWritable.writeObject(ObjectWritable.java:117)
M>         at org.apache.nutch.io.ObjectWritable.write(ObjectWritable.java:64)
M>         at org.apache.nutch.ipc.Server$Handler.run(Server.java:213)
M> 050926 180831 Server connection on port 60218 from 127.0.0.1: exiting
M> 050926 180931 task_r_o4tt4z copy failed: task_m_ypindn from goku1.deeptown.net/127.0.0.1:60218
M> java.io.IOException: timed out waiting for response
M>         at org.apache.nutch.ipc.Client.call(Client.java:296)
M>         at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
M>         at $Proxy2.getFile(Unknown Source)
M>         at org.apache.nutch.mapred.ReduceTaskRunner.prepare(ReduceTaskRunner.java:94)
M>         at org.apache.nutch.mapred.TaskRunner.run(TaskRunner.java:61)

Michael
