Antony Bowesman wrote:
> I'm looking to use the Nutch parsing framework in a separate Lucene
> project. I'd like to be able to use the existing plugins directory
> structure as-is, so wondered how Nutch sets up the class loading
> environment to find all the jar files in the plugins directories.

There is a dedicated class loader for each plugin. Its classpath is
constructed (recursively) from the plugin metadata (plugin.xml).

> Any pointers to the Nutch class(es) that do the work?

Check the package o.a.n.plugin, which contains most of the general
plug-in code.

There's also a recently established project called Apache Tika [1],
whose goal is to put together a generally usable parsing/extracting
framework. It hasn't yet got off the ground, so there is a good chance
to get your voice heard.

[1] http://incubator.apache.org/tika/

--
Sami Siren
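
For illustration only, the "one class loader per plugin" idea can be
sketched with the plain JDK. This is not the Nutch code: the class name
PluginLoaders and its methods are hypothetical, it assumes a layout of
one subdirectory per plugin with the jars directly inside, and it skips
what Nutch derives from plugin.xml (declared libraries and the recursive
handling of other plugins a plugin depends on).

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Nutch: one URLClassLoader per plugin directory.
public class PluginLoaders {

    /** Collects a URL for every .jar file directly inside one plugin directory. */
    static URL[] jarUrls(File pluginDir) throws MalformedURLException {
        List<URL> urls = new ArrayList<>();
        File[] files = pluginDir.listFiles();
        if (files != null) {
            Arrays.sort(files); // stable order, so the classpath is reproducible
            for (File f : files) {
                if (f.isFile() && f.getName().endsWith(".jar")) {
                    urls.add(f.toURI().toURL());
                }
            }
        }
        return urls.toArray(new URL[0]);
    }

    /** Builds one class loader per subdirectory of pluginsRoot, keyed by directory name. */
    static Map<String, URLClassLoader> createLoaders(File pluginsRoot, ClassLoader parent)
            throws MalformedURLException {
        Map<String, URLClassLoader> loaders = new HashMap<>();
        File[] dirs = pluginsRoot.listFiles(File::isDirectory);
        if (dirs == null) {
            return loaders; // root is missing or not a directory
        }
        for (File dir : dirs) {
            loaders.put(dir.getName(), new URLClassLoader(jarUrls(dir), parent));
        }
        return loaders;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the plugins root is passed as the first argument, else "plugins".
        File root = new File(args.length > 0 ? args[0] : "plugins");
        Map<String, URLClassLoader> loaders =
                createLoaders(root, PluginLoaders.class.getClassLoader());
        for (Map.Entry<String, URLClassLoader> e : loaders.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().getURLs().length + " jar(s)");
        }
    }
}
```

A parser class could then be loaded with loaders.get(name).loadClass(...),
which keeps each plugin's jars isolated from the others. Reading the jar
list and dependencies from plugin.xml instead of scanning the directory
would be the next step toward what the o.a.n.plugin package does.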
