You can add new file formats by writing a new content type parser plugin. Browse the code of one of the existing parsers, such as the pdf parser or the new swf parser, to get an idea of what you need to do. In the end you only need to write a parser for the content, return some values, and write a plugin.xml :)
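To give a rough idea of the "parse the content and return some values" part, here is a minimal sketch. Everything in it is an assumption made up for illustration: ContentParser, ParsedContent, TitledTextParser and the "text/x-titled" format are hypothetical stand-ins, not Nutch classes. The real extension point (org.apache.nutch.parse.Parser) and the exact values it expects are best copied from an existing parser plugin in the source tree.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical holder for the values a parser hands back; NOT a Nutch class.
final class ParsedContent {
    final String title;
    final String text;

    ParsedContent(String title, String text) {
        this.title = title;
        this.text = text;
    }
}

// Hypothetical stand-in for a parser extension point; NOT the Nutch interface.
interface ContentParser {
    ParsedContent parse(byte[] raw, String contentType);
}

// Toy parser for an invented "titled text" format:
// the first line is the title, everything after it is the body text.
final class TitledTextParser implements ContentParser {
    @Override
    public ParsedContent parse(byte[] raw, String contentType) {
        String all = new String(raw, StandardCharsets.UTF_8);
        int newline = all.indexOf('\n');
        // With no line break, the whole input is treated as the title.
        String title = (newline < 0 ? all : all.substring(0, newline)).trim();
        String body = newline < 0 ? "" : all.substring(newline + 1).trim();
        return new ParsedContent(title, body);
    }
}
```

A real plugin would do the same thing with the real format's library in place of the string splitting, and its plugin.xml would declare which content types the parser handles.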
Good luck.
Stefan
On 13.02.2006 at 11:43, Raghavendra Prabhu wrote:

Hi

How do we go about adding more file types and parsers to Nutch?

How do we develop a new file parser so that we can contribute it to Nutch?

What about parsing image files as well and retrieving data from them?

Rgds
Prabhu

---------------------------------------------
George Orwell was an Optimist
blog: http://www.find23.org
company: http://www.media-style.com

