Hi,

As Lewis said before, if you are going to use Nutch for image retrieval and indexing in Solr, you'll need to invest some time writing tools for your particular needs. I've been working on a search engine that uses Nutch for crawling and Solr as the indexing server (the typical setup). When we started dealing with images, we realized that Nutch (through the Tika project) extracts too little information about the image itself (basically only metadata gets extracted). I think this is the biggest problem with Nutch. One particular requirement for me was to show a thumbnail of the image, so I wrote a plugin that generates the thumbnail, encodes it using Base64, and stores it in the Solr index. Another need was to annotate the image with the surrounding text to improve search, so I also wrote a plugin for that.
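To give an idea of the thumbnail step, here is a rough sketch of the scale-and-encode part only. It is not the actual plugin code: the class name ThumbnailEncoder, the method names, and the maxSide parameter are all hypothetical, and it assumes Java 8+ (for java.util.Base64) and nothing beyond the standard library. The Nutch plugin wiring and the Solr field that would hold the resulting string are left out on purpose.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Base64;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of Nutch or Tika.
class ThumbnailEncoder {

    /**
     * Returns a copy of the image whose longer side is at most maxSide
     * pixels, keeping the aspect ratio. Smaller images are not upscaled.
     */
    static BufferedImage scale(BufferedImage source, int maxSide) {
        int w = source.getWidth();
        int h = source.getHeight();
        double factor = Math.min(1.0, (double) maxSide / Math.max(w, h));
        int tw = Math.max(1, (int) Math.round(w * factor));
        int th = Math.max(1, (int) Math.round(h * factor));

        // TYPE_INT_RGB drops transparency; transparent areas become black.
        BufferedImage thumb = new BufferedImage(tw, th, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = thumb.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(source, 0, 0, tw, th, null);
        } finally {
            g.dispose();
        }
        return thumb;
    }

    /** Serializes the image as PNG and returns the bytes as a Base64 string. */
    static String toBase64Png(BufferedImage image) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            ImageIO.write(image, "png", out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }

    /** Convenience method: scale first, then encode. */
    static String toBase64Thumbnail(BufferedImage source, int maxSide) {
        return toBase64Png(scale(source, maxSide));
    }
}
```

In a real plugin the source image would probably come from the fetched content bytes (for example via ImageIO.read on a ByteArrayInputStream), and the returned string would be added to the document as a stored, non-indexed field. Keep in mind that Base64 inflates the data by about a third, so a small maxSide helps keep the index size reasonable.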
To summarize, Nutch is a very good starting point, but depending on your particular needs you'll have to write some plugins of your own.

Greetings

On Oct 20, 2012, at 10:02 AM, Lewis John Mcgibbney <[email protected]> wrote:

> Hi,
>
> On Fri, Oct 19, 2012 at 10:48 PM, Santosh Mahto
> <[email protected]> wrote:
>> Hi all
>
>> I have a few questions:
>> 1. Does Nutch support image crawling and indexing (or how much support is
>> there)?
>
> Depending on how you wish to process and then present your images, e.g.
> as thumbnails, I would say you need to invest some time
> writing a custom parser for images. You can read a pretty thorough and
> comprehensive thread [0] on this topic.
>
>> 2. I found some links where the apache-tika plugin is used to make an image
>> search engine. With a little exploration I found that
>> Tika is the default in Nutch (I think, not sure). So does image searching
>> also happen by default?
>
> Image processing and indexing is not enabled by default in the above context.
>
>> 3. I think I also need to configure Solr to show the image results.
>> Could you guide me on what extra configuration needs to be set on the Solr side?
>
> Unless someone here who has worked with image indexing in Solr can
> help you in a more verbose manner than me, I would certainly direct
> you to the solr-user@ list archives [1]. There appears to be plenty
> there.
>
> hth
>
> Lewis
>
> [0] http://www.mail-archive.com/[email protected]/msg06758.html
> [1] http://www.mail-archive.com/search?q=image&l=solr-user%40lucene.apache.org
>
> 10th ANNIVERSARY OF THE FOUNDING OF THE UNIVERSIDAD DE LAS CIENCIAS
> INFORMATICAS...
> CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION
>
> http://www.uci.cu
> http://www.facebook.com/universidad.uci
> http://www.flickr.com/photos/universidad_uci

