Hi,

I need to get multiple search result entries from a single URL. For
example, I want to index the photo captions at this URL,
http://racer007.albumpost.com/montreal, without having to navigate to each
picture page, because sometimes there is no individual picture page.

I did it by writing an HTMLParserFilter, modifying ParseData and Fetcher,
and then disabling the duplicate-removal code in the CrawlerTool. I did this
in Nutch 0.7.1. Is there a better way of doing this?
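
To make the goal concrete, here is a minimal standalone sketch of turning one
page into several entries. It is not the Nutch code and does not use any Nutch
API: the class CaptionSplitter, the Entry type, the "#photo-N" URL suffix, and
the assumption that captions sit in the alt attribute of img tags are all
made up for illustration, as is the sample HTML.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits one HTML page into one entry per photo caption.
public class CaptionSplitter {

    /** One search entry derived from a single image on the page. */
    public static final class Entry {
        public final String url;     // page URL plus a synthetic fragment
        public final String caption; // text to index for this entry

        Entry(String url, String caption) {
            this.url = url;
            this.caption = caption;
        }
    }

    // Assumption: each caption is the alt attribute of an <img> tag.
    private static final Pattern IMG_ALT = Pattern.compile(
            "<img\\b[^>]*\\balt\\s*=\\s*\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    /** Returns one entry per non-empty caption found in the page. */
    public static List<Entry> split(String pageUrl, String html) {
        List<Entry> entries = new ArrayList<>();
        Matcher m = IMG_ALT.matcher(html);
        int n = 0;
        while (m.find()) {
            String caption = m.group(1).trim();
            if (caption.isEmpty()) {
                continue; // skip images without a caption
            }
            // Give every entry its own URL so entries stay distinguishable.
            entries.add(new Entry(pageUrl + "#photo-" + n, caption));
            n++;
        }
        return entries;
    }

    public static void main(String[] args) {
        // Made-up sample markup, not taken from the real page.
        String html = "<img src=\"a.jpg\" alt=\"Old Port at dusk\">"
                    + "<img src=\"b.jpg\" alt=\"\">"
                    + "<img src=\"c.jpg\" alt=\"Mount Royal lookout\">";
        for (Entry e : split("http://racer007.albumpost.com/montreal", html)) {
            System.out.println(e.url + " -> " + e.caption);
        }
    }
}
```

Whether distinct per-entry URLs like these would get along with Nutch's
duplicate handling is something I have not tested.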

Regards

--Ragy
