Thanks, Markus, for the correction.

You may be right.

However, in my case the server was taking a very long time to respond (to
serve the page), and I received a message similar to the following when
fetching the document:

2010-08-28 01:24:53,212 INFO  fetcher.Fetcher - fetch of http://www.------.com/sudoku/2324-2008-10-02-15-00-56 failed with: java.io.EOFException

My understanding is that the issue is with the server, not with the Nutch
crawler.
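
If you want to check this outside Nutch, below is a rough Python sketch. It is
not Nutch code, and `probe` and `describe_failure` are names I made up for
illustration. It fetches the URL once with a timeout and reports whether the
server timed out, refused the connection, or closed it early. It is only a
diagnostic under those assumptions and does not reproduce how the Nutch
fetcher talks to the server.

```python
import http.client
import socket
import urllib.error
import urllib.request


def describe_failure(exc):
    """Map an exception from urllib/http.client to a short diagnosis."""
    # An HTTPError means the server did answer, just with an error status.
    if isinstance(exc, urllib.error.HTTPError):
        return f"server answered with HTTP {exc.code}"
    # urllib wraps some low-level errors in URLError; look at the cause.
    if isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, BaseException):
        exc = exc.reason
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timed out waiting for the server"
    if isinstance(exc, ConnectionRefusedError):
        return "connection refused"
    # RemoteDisconnected is a ConnectionResetError; IncompleteRead means
    # the body ended before the announced length was received.
    if isinstance(exc, (ConnectionError, http.client.IncompleteRead)):
        return "server closed or reset the connection early"
    return f"other failure: {type(exc).__name__}"


def probe(url, timeout=30.0):
    """Fetch url once, outside the crawler, and report what happened."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
            return f"ok: HTTP {response.status}"
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        return describe_failure(exc)
```

Calling `print(probe("<url>"))` with the failing address in place of `<url>`
should give a hint: if it also reports a timeout or an early close, that
points at the server rather than at the crawler.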

Warm Regards,

YT Thet

On Wed, Nov 17, 2010 at 1:28 AM, matinte <[email protected]> wrote:

>
> The url does exist but for example, when I try curl <url> it returns:
> curl: (56) Failure when receiving data from the peer
>
> Could it be a problem with the server?
>
> 2010/11/16 Markus Jelsma-2 [via Lucene] <[email protected]>
> >
>
> > That should generate an IOException if I'm not mistaken.
> >
> > On Tuesday 16 November 2010 18:16:45 Ye T Thet wrote:
> >
> > > Matinte,
> > >
> > > I have encountered that before.
> > >
> > > > In my experience, it is caused by <url>. The url you are trying to
> > > > crawl does not exist or the server is not responding.
> > >
> > > Warm Regards,
> > >
> > > YT Thet
> > >
> > > > On Wed, Nov 17, 2010 at 12:44 AM, matinte <[hidden email]> wrote:
> > > > Hi,
> > > > > I am trying to crawl with a given seed url, but I'm getting the
> > > > > following error:
> > > > ...
> > > > fetch of <url> failed with: java.io.EOFException
> > > > -finishing thread FetcherThread, activeThreads=0
> > > > -activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
> > > > -activeThreads=0
> > > > Fetcher: done
> > > >
> > > > Do you have any idea?
> > > >
> > > > Thanks in advance
> > > > --
> > > > > View this message in context:
> > > > > http://lucene.472066.n3.nabble.com/Fetch-error-during-crawling-tp1911847p1911847.html
> > > > > Sent from the Nutch - User mailing list archive at Nabble.com.
> >
> > --
> > Markus Jelsma - CTO - Openindex
> > http://www.linkedin.com/in/markus17
> > 050-8536600 / 06-50258350
> >
> >
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Fetch-error-during-crawling-tp1911847p1912096.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
