Antony Bowesman wrote:
> I'm looking to use the Nutch parsing framework in a separate Lucene
> project. I'd like to be able to use the existing plugins directory
> structure as-is, so wondered how Nutch sets up the class loading environment
> to find all the jar files in the plugins directories.

There are dedicated class loaders for each plugin. The classpath is
constructed (recursively) based on plugin metadata (plugin.xml).
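
To make the idea concrete, here is a rough sketch of that approach in plain
Java. It is not the Nutch code. The class name PluginLoaderSketch and its
methods are made up for illustration, and it assumes that each plugin lives
in a directory named after its id, that plugin.xml lists its jars as
<library name="..."/> elements, and that it lists its dependencies as
<import plugin="..."/> elements. Nutch itself may be stricter about which
libraries of a dependency are visible; check the real code before relying
on this.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Nutch.
public class PluginLoaderSketch {

    /**
     * Adds the jar URLs of one plugin to 'urls', then recurses into the
     * plugins it imports. Assumes the plugin directory is named after the id.
     */
    static void collect(Path pluginsRoot, String pluginId,
                        Set<String> seen, List<URL> urls) throws Exception {
        if (!seen.add(pluginId)) {
            return; // already visited; also guards against import cycles
        }
        Path dir = pluginsRoot.resolve(pluginId);
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(dir.resolve("plugin.xml").toFile());

        // Jars declared by this plugin (assumed element/attribute names).
        NodeList libs = doc.getElementsByTagName("library");
        for (int i = 0; i < libs.getLength(); i++) {
            String name = ((Element) libs.item(i)).getAttribute("name");
            urls.add(dir.resolve(name).toUri().toURL());
        }

        // Plugins this plugin depends on, followed recursively.
        NodeList imports = doc.getElementsByTagName("import");
        for (int i = 0; i < imports.getLength(); i++) {
            String dep = ((Element) imports.item(i)).getAttribute("plugin");
            collect(pluginsRoot, dep, seen, urls);
        }
    }

    /** Builds a dedicated class loader for a single plugin. */
    static URLClassLoader loaderFor(Path pluginsRoot, String pluginId)
            throws Exception {
        List<URL> urls = new ArrayList<>();
        collect(pluginsRoot, pluginId, new HashSet<>(), urls);
        return new URLClassLoader(urls.toArray(new URL[0]),
                PluginLoaderSketch.class.getClassLoader());
    }
}
```

With something like this, PluginLoaderSketch.loaderFor(pluginsDir, "parse-text")
would give you a loader that sees the jars of that plugin plus those of
everything it imports, while the rest of your application stays on the
normal classpath.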

> Any pointers to the Nutch class(es) that do the work?

Check the package org.apache.nutch.plugin, which contains most of the
general plug-in code.

There's also a recently established project called Apache Tika [1], which
aims to put together a generally usable parsing/extraction framework. It
hasn't gotten off the ground yet, so there is a good chance to get your
voice heard.

[1] http://incubator.apache.org/tika/

-- 
 Sami Siren

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general