Hi,
I'm pretty new to Nutch and I'm trying to modify the code so that it stores the
words before and after a hyperlink as well as the anchor text.
I've been looking through the Nutch code for a couple of days and I'm still a
little unclear as to the layout...
Nutch parses incoming webpages in HTMLParser.java, right? I can't seem to
find the code in there for URL processing though. Where exactly does it
parse the anchor text and write it to the database?

Any help greatly appreciated!
             Brian
