Hi fredy, I've done all that. I didn't mention in my previous reply that I run it in local mode. The findings I posted are based on debugging and catching the exact moment when the exception is thrown.
For some reason, at a certain moment the DataInputStream has values of length and startPos whose sum, startPos+length, is bigger than the max int value, so the result turns negative, which makes the code think that the stream has ended. I have a base URL list of 1 million URLs, and I've tried running them in many different ways, with segments containing different numbers of URLs. The problem always repeats itself: for each input it occurs at a different URL, but within a given input it is always in the same place.

--
View this message in context: http://lucene.472066.n3.nabble.com/NegativeArraySizeException-and-problem-advancing-port-rec-during-fetching-tp3994633p3998375.html
Sent from the Nutch - User mailing list archive at Nabble.com.
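
P.S. To show what I mean about the sum going negative, here is a minimal sketch. It is not the actual Nutch or Hadoop code: the class and method names are made up for illustration, and only startPos and length correspond to the values I saw in the debugger. The example numbers are also invented.

```java
// Hypothetical illustration only; not taken from Nutch or Hadoop.
class OverflowSketch {

    // Plain int addition: silently wraps around past Integer.MAX_VALUE.
    static int endNaive(int startPos, int length) {
        return startPos + length;
    }

    // Widening one operand to long first keeps the true sum.
    static long endSafe(int startPos, int length) {
        return (long) startPos + length;
    }

    public static void main(String[] args) {
        int startPos = Integer.MAX_VALUE - 10; // made-up value near the int limit
        int length = 100;                      // made-up value

        // The wrapped sum is negative, so a check like "end < position"
        // would wrongly conclude that the stream has ended.
        System.out.println("naive end: " + endNaive(startPos, length));
        System.out.println("safe end:  " + endSafe(startPos, length));

        // Math.addExact throws instead of wrapping, which makes the bug visible.
        try {
            Math.addExact(startPos, length);
        } catch (ArithmeticException e) {
            System.out.println("addExact: " + e.getMessage());
        }
    }
}
```

If the real code does something similar, doing the addition in long (or using Math.addExact) would at least turn the silent wrap-around into something detectable.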

