Re: will nutch-2 be able to index image files

alxsss Tue, 08 Mar 2011 13:51:36 -0800

I meant extracting the title, src link, and alt attributes from <img> tags, not
storing the image files themselves. For a keyword search, it must display the
link, which automatically displays the image itself on the search page.
I'm not sure what you mean by image content-based retrieval. Do image files have
tags like mp3 files do?
Must a parse plugin be written in both cases?
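
To make the first case concrete, here is a minimal sketch of the kind of
extraction I have in mind. It is not Nutch code and not a parse plugin: it uses
only Python's standard html.parser, and the names ImgAttrParser and
extract_images are made up for illustration. A real plugin would presumably do
the equivalent inside the parse step.

```python
from html.parser import HTMLParser


class ImgAttrParser(HTMLParser):
    """Collects src, alt and title from every <img> tag.

    Hypothetical helper for illustration only; not part of Nutch or Tika.
    """

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling us,
        # and routes self-closing tags (<img ... />) through this method too.
        if tag != "img":
            return
        attr_map = dict(attrs)
        self.images.append({
            "src": attr_map.get("src"),
            "alt": attr_map.get("alt"),
            "title": attr_map.get("title"),
        })


def extract_images(html):
    """Return one dict per <img> tag; attributes that are absent are None."""
    parser = ImgAttrParser()
    parser.feed(html)
    parser.close()
    return parser.images
```

The alt and title text would be indexed as searchable fields, and the src link
stored so the search page can display the image without keeping the file.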

Thanks.
Alex.

-----Original Message-----
From: Andrzej Bialecki <[email protected]>
To: user <[email protected]>
Sent: Tue, Mar 8, 2011 12:58 pm
Subject: Re: will nutch-2 be able to index image files

On 3/8/11 9:09 PM, [email protected] wrote:
> Hello,
>
> I wondered if nutch version 2 be able to index image files?

In what way? Extract metadata and index image metadata as text? Sure, if
we implement a plugin for it. Tika already supports EXIF, so this
shouldn't be complicated, perhaps it's a tweak to the parse-tika
configuration. Or did you mean the image content-based retrieval (e.g.
using wavelets)?

-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
