rubdabadub wrote:
On 2/20/07, Renaud Richardet <[EMAIL PROTECTED]> wrote:
Hi Thorsten,

I have quickly looked at the Droid code, and was wondering why you don't
want to completely reuse the Nutch plugin API in Droid. This way, you
could reuse the Nutch parse-* plugins without modifications. Just trying
to understand...


Hmm, interesting. I am not fully on board with Nutch. But what would
the end output of such a crawl be? For example:

HTML file 1 crawl --> parse-html --> parsing rule, say "pick up <a
href=> tags" --> dump it to a predefined text file (e.g. one "a href"
tag per line, or whatever, based on a template or something)... or?
The current Nutch saves it in a binary format, so I am trying to
understand here as well...
Errr, you're right: the parsers return an object of type Parse, not a file... Does anybody see a way to integrate this?
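
To make the pipeline described above concrete (parse an HTML page, pick up the <a href=> values, dump one per line to a text file), here is a rough standalone sketch. It does not use the Nutch or Droid APIs at all: the class name HrefDump, its methods, and the regex-based extraction are assumptions made up for illustration, and a real parser plugin would handle HTML far more robustly than a regex can.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch or Droid.
class HrefDump {
    // Matches the href value of an <a> tag: double-quoted, single-quoted, or unquoted.
    private static final Pattern A_HREF = Pattern.compile(
        "<a\\b[^>]*?\\bhref\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))",
        Pattern.CASE_INSENSITIVE);

    // Returns every href value found in the given HTML, in document order.
    static List<String> extractHrefs(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = A_HREF.matcher(html);
        while (m.find()) {
            // Exactly one of the three capture groups is set per match.
            for (int g = 1; g <= 3; g++) {
                if (m.group(g) != null) {
                    links.add(m.group(g));
                    break;
                }
            }
        }
        return links;
    }

    // Writes one link per line to the output file, as UTF-8.
    static void dump(List<String> links, Path out) throws IOException {
        Files.write(out, links, StandardCharsets.UTF_8);
    }

    // Usage: java HrefDump <input.html> <output.txt>
    public static void main(String[] args) throws IOException {
        String html = new String(
            Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        dump(extractHrefs(html), Paths.get(args[1]));
    }
}
```

The point of the sketch is only the shape of the output step: whatever the parser hands back (an object rather than a file), a small writer like dump() could turn the extracted links into the plain text file asked about.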

Thanks,
Renaud
