Hi,
I'm pretty new to Nutch, and I'm trying to modify the code so that it stores the words before and after a hyperlink as well as the anchor text.
I've been looking through the Nutch code for a couple of days, and I'm still a little unclear about the layout.
Nutch parses incoming web pages in HTMLParser.java, right? I can't seem to find the URL-processing code in there, though. Where exactly does it parse the anchor text and write it to the database?

Any help would be greatly appreciated!
              Brian
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
