Is there a way to log all URLs that Nutch 1.8 finds but ignores because of the regex URL filter settings?
Second question: I have a couple of URLs that are fetched but not parsed. The log shows that each of these URLs was fetched, but there is no parse entry for it, and I can’t find the reason it was not parsed anywhere in the log file. Is there a way to log the reason for not parsing a fetched URL?

Cheers
Peter

