Hi all! I have a question. My WebDB currently contains 779 pages with 899 links. When I run the segread command, it also reports 779 pages in one segment. However, when I run a search, or when I inspect the index with the Luke tool, the maximum number of documents is 437. I've checked the recrawl logs, and while the script is fetching pages, some of them show this message:
... failed with: java.lang.Exception: org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later. I think this happens because of some network problem: the fetcher tries to fetch those pages but fails to retrieve them, so when the segment is indexed, only the successfully fetched pages appear in the search results. This is a problem for me. Could someone explain what I should do to refetch these pages and so increase my search results? Should I change the http.max.delays and fetcher.server.delay properties in nutch-default.xml?

Regards,

--
Lourival Junior
Universidade Federal do Pará
Curso de Bacharelado em Sistemas de Informação
http://www.ufpa.br/cbsi
Msn: [EMAIL PROTECTED]
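P.S. For reference, this is roughly the kind of override I was thinking of putting in conf/nutch-site.xml (the values here are just examples I made up, not tested, and the description text is my own paraphrase):

```xml
<!-- Sketch of possible overrides for conf/nutch-site.xml (example values only). -->
<property>
  <name>http.max.delays</name>
  <!-- How many times the fetcher will wait on a busy server before giving
       up on that page with "Exceeded http.max.delays: retry later". -->
  <value>100</value>
</property>
<property>
  <name>fetcher.server.delay</name>
  <!-- Seconds the fetcher waits between successive requests to the same
       server; lowering it speeds up the crawl but is less polite. -->
  <value>2.0</value>
</property>
```

Would raising http.max.delays like this be enough to get those 342 missing pages fetched on the next recrawl, or is there a proper way to requeue only the failed pages?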