Hi!

As I read on Nutch's homepage, you are looking for contributions that support content types other than HTML over HTTP. I have done a lot of work on MPEG-7 over the last two years and have implemented two of its descriptors for fast color-based and color-distribution-based retrieval of images (query by example image). I tested the efficiency of these descriptors with a relational database and found retrieval to be very fast (under 300 ms on an Intel PIII 1 GHz with 512 MB RAM) on a database of about 32,000 images. Both descriptors are part of a SourceForge project of mine for photo annotation and retrieval called Caliph & Emir (where I also integrated Lucene, because I did not want to re-invent the wheel and make it less round :)
URI: http://sourceforge.net/projects/caliph-emir
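
To give a rough idea of what retrieval by color distribution means, here is a minimal sketch. It is not one of the MPEG-7 descriptors and not code from Caliph & Emir: it assumes a plain quantized RGB histogram compared with an L1 distance, and the class and method names (ColorHistogramSketch, histogram, distance, solid) are made up for illustration only.

```java
import java.awt.image.BufferedImage;

public class ColorHistogramSketch {
    // Assumption for this sketch: 4 levels per channel, i.e. 4*4*4 = 64 bins.
    static final int LEVELS = 4;

    /** Builds a normalized, quantized RGB histogram of the image. */
    static double[] histogram(BufferedImage img) {
        double[] bins = new double[LEVELS * LEVELS * LEVELS];
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                // Map each 0..255 channel value to a level in 0..LEVELS-1.
                int r = ((rgb >> 16) & 0xFF) * LEVELS / 256;
                int g = ((rgb >> 8) & 0xFF) * LEVELS / 256;
                int b = (rgb & 0xFF) * LEVELS / 256;
                bins[(r * LEVELS + g) * LEVELS + b] += 1.0;
            }
        }
        // Normalize so that images of different sizes stay comparable.
        double total = (double) img.getWidth() * img.getHeight();
        for (int i = 0; i < bins.length; i++) {
            bins[i] /= total;
        }
        return bins;
    }

    /** L1 distance between two histograms: 0 means identical distributions. */
    static double distance(double[] a, double[] b) {
        double d = 0.0;
        for (int i = 0; i < a.length; i++) {
            d += Math.abs(a[i] - b[i]);
        }
        return d;
    }

    /** Helper for the demo: an image filled with a single color. */
    static BufferedImage solid(int width, int height, int rgb) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                img.setRGB(x, y, rgb);
            }
        }
        return img;
    }

    public static void main(String[] args) {
        double[] query = histogram(solid(8, 8, 0xFF0000));     // example image
        double[] similar = histogram(solid(16, 16, 0xF00000)); // slightly darker red
        double[] different = histogram(solid(8, 8, 0x0000FF)); // blue
        System.out.println("query vs. similar:   " + distance(query, similar));
        System.out.println("query vs. different: " + distance(query, different));
    }
}
```

Given an example image, a query computes its histogram once and ranks the stored images by ascending distance. Since each histogram is just a short vector of numbers, it can be kept alongside the image record in a database or index.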


Another descriptor, which also performs very well, extracts the "edginess" of a picture for retrieval purposes. It was developed by a research group in Vienna in a project called VizIR and has also been integrated into Caliph & Emir.

If you are interested in using these content-based image descriptors, it would be a pleasure for me to contribute them to Nutch. All three are written in Java and licensed under the GPL.

best wishes,
 Mathias

--

    '   '    '
       '   '    '        Mathias Lux:
  o/          '  \o      [EMAIL PROTECTED]
  /-'            -\      +43 (316) 818296
 /\               /\     http://www.juggle.at


_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
