On 13-jan-2007, at 14:34, Mathijs Homminga wrote:
I'm using Nutch 0.8.1 and I noticed the following.
When pageA redirects to pageB (HTTP 3xx), pageA remains unfetched in the crawlDB (pageB is fetched).

Hence, pageA shows up in each generate/fetch/updatedb iteration.

Is this a bug? I found a previous thread on this list which describes this issue too:
http://www.mail-archive.com/[email protected]/msg04599.html

Yes.  See http://issues.apache.org/jira/browse/NUTCH-273
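
For readers who want to see why pageA keeps coming back, here is a toy model of the reported behaviour. It is only a sketch: a plain dict stands in for the crawl db, and the names REDIRECTS, generate, fetch_and_update and run_iterations are made up for illustration. None of this is Nutch code or the Nutch API.

```python
# Toy model only: a dict stands in for the crawl db; this is not Nutch code.
UNFETCHED, FETCHED = "unfetched", "fetched"

# Hypothetical redirect table: pageA answers with HTTP 3xx pointing at pageB.
REDIRECTS = {"pageA": "pageB"}


def generate(crawl_db):
    """Select every URL whose status is still unfetched."""
    return sorted(url for url, status in crawl_db.items() if status == UNFETCHED)


def fetch_and_update(crawl_db, urls):
    """Fetch each URL; on a redirect only the target is marked fetched."""
    for url in urls:
        target = REDIRECTS.get(url)
        if target is not None:
            # pageB is fetched, but pageA's own status is never updated,
            # which is the behaviour described in the report.
            crawl_db[target] = FETCHED
        else:
            crawl_db[url] = FETCHED


def run_iterations(n):
    """Run n generate/fetch/update rounds starting from pageA alone."""
    crawl_db = {"pageA": UNFETCHED}
    selected = []
    for _ in range(n):
        batch = generate(crawl_db)
        selected.append(batch)
        fetch_and_update(crawl_db, batch)
    return crawl_db, selected
```

Calling run_iterations(3) selects pageA in all three rounds, because nothing ever moves it out of the unfetched state.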

--
Regards,

Eelco Lempsink


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
