Hi, I am Nobin, and I am working on a search engine based on Nutch.
I have some questions about Nutch, and it would be very helpful if somebody could answer them.

1) I am working on a plugin (an anchor-based URL filter) for which I need the anchor text in CrawlDbFilter (Nutch 1.2). After going through the source, it seems that getting the anchor text in CrawlDbFilter will not be easy, because none of the parameters of

    public void map(Text key, CrawlDatum value, OutputCollector<Text, CrawlDatum> output, Reporter reporter)

store the anchor text. Is there any class through which I can access it?

2) In Nutch 2.0 (nutch base), I think there is a way to get the anchor text in the GeneratorMapper class, in

    public void map(String reversedUrl, WebPage page, Context context)

through the WebPage class. But there is a problem: I think this WebPage object belongs to this URL (the reverse of reversedUrl), not to its parent (the page containing this outlink), and only the parent contains the anchor text.

3) What is the use of the reprUrl member in the WebPage class?

Thanks,
Nobin Mathew

