Thanks, Markus, for the correction. You may well be right.
However, in my case the server was taking very long to respond (to serve the page), and I received something similar to the following when fetching the document:

2010-08-28 01:24:53,212 INFO fetcher.Fetcher - fetch of http://www.------.com/sudoku/2324-2008-10-02-15-00-56 failed with: java.io.EOFException

My understanding is that the issue is with the server, not with the Nutch crawler.

Warm Regards,

YT Thet

On Wed, Nov 17, 2010 at 1:28 AM, matinte wrote:

> The URL does exist, but when I try, for example, curl <url>, it returns:
> curl: (56) Failure when receiving data from the peer
>
> Could it be a problem with the server?
>
> 2010/11/16 Markus Jelsma-2 [via Lucene]
>
> > That should generate an IOException if I'm not mistaken.
> >
> > On Tuesday 16 November 2010 18:16:45 Ye T Thet wrote:
> > > Matinte,
> > >
> > > I have encountered that before.
> > >
> > > In my experience, it is caused by <url>. The URL you are trying to
> > > crawl does not exist or the server is not responding.
> > >
> > > Warm Regards,
> > >
> > > YT Thet
> > >
> > > On Wed, Nov 17, 2010 at 12:44 AM, matinte wrote:
> > > > Hi,
> > > > I am trying to crawl with a given seed URL but I'm getting the
> > > > following error:
> > > > ...
> > > > fetch of <url> failed with: java.io.EOFException
> > > > -finishing thread FetcherThread, activeThreads=0
> > > > -activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
> > > > -activeThreads=0
> > > > Fetcher: done
> > > >
> > > > Do you have any idea?
> > > >
> > > > Thanks in advance
> >
> > --
> > Markus Jelsma - CTO - Openindex
> > http://www.linkedin.com/in/markus17
> > 050-8536600 / 06-50258350
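To check whether the server rather than the crawler is at fault, the URL can be requested outside Nutch, much like the curl test quoted above. The following is only a minimal sketch, assuming Python 3 is available. The function name `probe`, its return messages, and the 10-second default timeout are illustrative choices, not part of Nutch, and the sketch does not reproduce Nutch's own fetch path.

```python
import http.client
import socket
import urllib.error
import urllib.request


def probe(url, timeout=10.0):
    """Fetch url once and describe how the server behaved."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read()  # read the whole body, as a fetcher would
            return f"ok: HTTP {resp.status}, {len(body)} bytes"
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (404, 500, ...).
        return f"http error: {err.code}"
    except (http.client.IncompleteRead, ConnectionResetError):
        # The connection ended before a full response arrived.
        # http.client.RemoteDisconnected is a ConnectionResetError subclass.
        return "server closed the connection early"
    except (socket.timeout, TimeoutError):
        # The server accepted the connection but was too slow to answer.
        return "timed out waiting for the server"
    except urllib.error.URLError as err:
        # DNS failure, refused connection, connect timeout, and so on.
        return f"unreachable: {err.reason}"
```

Calling `probe(...)` on the failing seed URL, once with a short timeout and once with a generous one, should give a rough idea of whether the server is slow, drops the connection, or cannot be reached at all. If it also fails here, the server is the more likely culprit.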

