Andrzej Bialecki wrote:

Philipp Suter wrote:

I would have some spare cycles from the end of July until the end of August, but I would need a short explanation of where and how to integrate the Flash text extractor. Furthermore, is there any document explaining the Nutch design approach? I have never looked at the Nutch sources, and the design is very much tuned for performance, which makes it no easier to understand but better to use :-)


First, you need to check out the complete source from SVN trunk. Then you can copy one of the existing plugins and use it as a template. I attached an Eclipse project - just put these two files in the main directory (where README.txt is), import the project into Eclipse and off you go.

Thanks!

I will have a look at it as soon as I come back from my holiday (approx. 23 July). Do Stefan Groschupf's sources still exist, or are they included somehow in another plugin? They would probably be a good starting point for a new implementation.
