Robert Goene wrote:

> I am trying to extend the current HTMLParser of lenya 1.2.1 to support keywords.

that is some of the nastiest code in lenya, as you might have figured out by now. if i recall correctly, that code is auto-generated by a parser generator and is almost illegible. i tried to document things a little bit at


http://lenya.apache.org/apidocs/1.4/org/apache/lenya/lucene/html/HTMLParser.html

michi is apparently working on replacing that custom crawler with the nutch codebase, which should hopefully be easier to deal with:

http://incubator.apache.org/nutch/apidocs/index.html

michi, why not do your experiments in the sandbox?
