Hi, I need to get multiple search result entries from a single URL. For example, I want to index the photo captions at http://racer007.albumpost.com/montreal without having to navigate to each picture page, because sometimes there is no individual picture page.
I did it by writing an HTMLParserFilter, modifying ParseData and Fetcher, and then disabling the duplicate-cleaning code in the CrawlerTool. I did this in Nutch 0.7.1. Is there a better way of doing this?
Regards
--Ragy
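
To make the goal concrete, here is a minimal standalone sketch of the splitting step only: turning one fetched page into several caption entries. It does not use any Nutch API. The class name `CaptionSplitter`, the `Entry` type, and the `#photo-N` key scheme are hypothetical, and it assumes the captions sit in the `alt` attribute of `<img>` tags, which may not match the real markup of the album page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaptionSplitter {

    /** One would-be search result: a synthetic key plus the caption text. */
    public static final class Entry {
        public final String key;
        public final String caption;

        Entry(String key, String caption) {
            this.key = key;
            this.caption = caption;
        }
    }

    // Assumption: each caption is the alt text of an <img> tag.
    private static final Pattern IMG_ALT = Pattern.compile(
            "<img\\b[^>]*\\balt\\s*=\\s*\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    /** Returns one entry per non-empty caption found in the page. */
    public static List<Entry> split(String pageUrl, String html) {
        List<Entry> entries = new ArrayList<>();
        Matcher m = IMG_ALT.matcher(html);
        int index = 0;
        while (m.find()) {
            String caption = m.group(1).trim();
            if (caption.isEmpty()) {
                continue; // skip images without a caption
            }
            // Hypothetical key scheme: page URL plus a fragment per photo,
            // so every entry has a distinct identifier.
            entries.add(new Entry(pageUrl + "#photo-" + index, caption));
            index++;
        }
        return entries;
    }

    public static void main(String[] args) {
        String html = "<img src=\"a.jpg\" alt=\"Old Port at dusk\">"
                + "<img src=\"b.jpg\" alt=\"\">"
                + "<img src=\"c.jpg\" alt=\"Mount Royal lookout\">";
        for (Entry e : split("http://racer007.albumpost.com/montreal", html)) {
            System.out.println(e.key + " -> " + e.caption);
        }
    }
}
```

Giving each entry its own key might let the entries coexist without turning off duplicate removal, but I have not verified that against Nutch 0.7.1. A regular expression is used here only to keep the sketch short; a real parse filter would more likely walk the parsed DOM.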
