Hi Nutch community,
as you may have noticed, I posted a patch for the plugin system to the Nutch bug tracking system.
SourceForge did not accept my file, perhaps because it is too large at 1.6 MB, so I uploaded it to
http://www.media-style.com/gfx/nutch/nutch-plugin-patch.zip
It is now mentioned on SourceForge three times already, sorry for that;
the description can be found here:
https://sourceforge.net/tracker/?func=detail&atid=491356&aid=954964&group_id=59548
Doug, can you please close the bug again, since I have no rights to do that? Thanks!
The code has been ready for some months, but sorry, I did not find the time to make the last small changes.
For people who remember the conversation some months ago, the patch comes with:
+ the required ant build script update
+ the first standard plugin for it, which contains a set of content extractors
+ an HTML content extractor that respects robots.txt
+ strongly improved Javadoc (but there is still room for improvement since I am not a native speaker)
To install the patch, copy dom4j.jar to $HOME/nutch/libs.
Apply nutch_plugin_patch.txt to $HOME/nutch.
Copy "nutch-extractors" to $HOME (so at the same level as "nutch").
Then run "ant test" or "ant tar".
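The install steps above can be sketched as a small Python script. This is only a rough sketch under assumptions: that the unpacked zip has dom4j.jar, nutch_plugin_patch.txt and nutch-extractors at its top level, and that the patch applies with strip level -p0. The function name install_plugin_patch and its parameters are made up for illustration. By default it only reports the steps and changes nothing.

```python
import shutil
import subprocess
from pathlib import Path


def install_plugin_patch(unpacked, home, dry_run=True):
    """Carry out the install steps; with dry_run=True only report them."""
    unpacked, home = Path(unpacked), Path(home)
    nutch = home / "nutch"
    steps = [
        # (action, source or ant target, destination or working directory)
        ("copy", unpacked / "dom4j.jar", nutch / "libs" / "dom4j.jar"),
        ("patch", unpacked / "nutch_plugin_patch.txt", nutch),
        ("copytree", unpacked / "nutch-extractors", home / "nutch-extractors"),
        ("ant", "test", nutch),
    ]
    if dry_run:
        return steps
    for action, src, dst in steps:
        if action == "copy":
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy(src, dst)
        elif action == "copytree":
            shutil.copytree(src, dst)
        elif action == "patch":
            # ASSUMPTION: strip level -p0; adjust if the paths in the patch differ.
            with open(src, "rb") as patch_input:
                subprocess.run(["patch", "-p0"], cwd=dst, stdin=patch_input, check=True)
        elif action == "ant":
            subprocess.run(["ant", src], cwd=dst, check=True)
    return steps
```

Calling install_plugin_patch("/path/to/unpacked", Path.home()) lists the planned steps; passing dry_run=False actually performs them, which requires the patch and ant commands to be on the PATH.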
It would be great if a native speaker could help me write a "how to write a plugin" tutorial by next week.
If there is anything I can do to help bring this patch into the CVS head, let me know.
Greetings,
Stefan
---------------------------------------------------------------
open technology: http://www.media-style.com
open source: http://www.weta-group.net
open discussion: http://www.text-mining.org
