Lucas Rockwell wrote:
I am fairly new to Nutch (though I have been wading through the code, docs, and mailing lists), and I am wondering whether there is a way to get the URL of an anchor as well as the text of an anchor. I have a feeling there is, but I have not pulled things apart enough to know for sure.

At present this is not supported. It could easily be added, but doing so would substantially slow things down. With a little more work it could be made somewhat efficient. I will attempt to include this feature in the MapReduce rewrite that I'm now starting. So, one way or another, this feature will be added.


Doug
