Yes! I am very interested.

Regards
On 6/3/06, Dima Mazmanov <[EMAIL PROTECTED]> wrote:
> Hi, Stefan.
>
> That would be great!!!
> I think many people would vote for this.
> Since Nutch is a really powerful search engine, it would be nice to
> see several types of search in it.
>
> You wrote on 3 June 2006, 20:17:06:
>
> > Having an image search component for Nutch would be nice.
> > However, I think we need to implement it as a kind of separate tool
> > outside of the Nutch code itself, since it is not 100% integrable
> > into the Nutch code.
> > (E.g. Nutch defines one URL == one index document.)
> > Maybe this would be a nice project for a Nutch sandbox.
> > If you like, you can open an issue to request a Nutch sandbox project
> > "image search".
> > If enough people vote for the issue, we may have a chance to
> > get it created.
> >
> > Stefan
> >
> > On 03.06.2006 at 10:38, TDLN wrote:
> >
> >> I am interested in developing such a solution as well.
> >>
> >> I am currently storing the thumbnails on the file system under
> >> system-generated names. My indexing plugin stores the filename in the
> >> index. Thumbnails are later served to the client by a separate Apache
> >> HTTP server. This required some changes but is otherwise pretty
> >> straightforward and performs very well for my current 300,000+
> >> images, around 15 KB each.
> >>
> >> If you are developing the more "Nutch-like" solution, I could
> >> contribute to that. For instance, I have some code that generates the
> >> thumbnails using ImageJ and yields very good results.
> >>
> >> But I would definitely need some guidance in writing the Hadoop
> >> MapReduce job. We could even contribute this back and base a small
> >> tutorial on it.
> >>
> >> What do you think?
> >>
> >> Rgrds, Thomas
> >>
> >> On 6/2/06, Stefan Groschupf <[EMAIL PROTECTED]> wrote:
> >>> Hi,
> >>> Using the search front end for this is a bad idea, since you get
> >>> many but not all pages.
> >>> Just write a Hadoop MapReduce job that processes the fetched content
> >>> in your segments; that should be easy.
> >>> Storing images in a file system will be very slow as soon as you
> >>> have too many.
> >>> I personally don't like databases, since compared to Nutch they are
> >>> as slow as a snail.
> >>> For another project, also related to images, I created my own
> >>> ImageWritable that contained the binary data of a compressed image
> >>> combined with some metadata.
> >>> If you use a MapFile, finding an image by key should be very fast.
> >>> I think much faster than a database with binary content.
> >>>
> >>> HTH
> >>> Stefan
> >>>
> >>> On 02.06.2006 at 21:10, Marco Pereira wrote:
> >>>
> >>> > Hi everybody,
> >>> >
> >>> > I've got Nutch to index images by searching their URLs and alt
> >>> > and title tags.
> >>> > But the problem comes when storing the thumbnails.
> >>> > I've indexed 3 million images for a national search engine.
> >>> > I was in doubt whether to use a file system scheme or a database
> >>> > to store the thumbnails.
> >>> > The thumbnails are created with a script that gets the image URLs
> >>> > from the Nutch index by doing a search for http
> >>> > (search.jsp?query=http).
> >>> >
> >>> > Do you have any tips or ideas on this?
> >>> >
> >>> > Thank you,
> >>> > Marco
> >>>
> >>> ---------------------------------------------
> >>> blog: http://www.find23.org
> >>> company: http://www.media-style.com
>
> --
> Regards,
> Dima mailto:[EMAIL PROTECTED]
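
For anyone picking this up, here are a few rough sketches of the pieces discussed
above. None of this is the actual code the posters mention; it is just a starting
point under stated assumptions.

Stefan's ImageWritable is not posted in the thread, but the idea is simple: a Hadoop
Writable that carries the already-compressed thumbnail bytes plus a little metadata.
A minimal sketch (field names and the choice of metadata are my assumptions):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

/**
 * Minimal sketch of the kind of ImageWritable Stefan describes above:
 * the compressed (e.g. JPEG) bytes of a thumbnail plus some metadata.
 * Field names and metadata fields are assumptions, not Nutch code.
 */
public class ImageWritable implements Writable {

  private byte[] data = new byte[0]; // compressed image bytes
  private String mimeType = "";      // e.g. "image/jpeg"

  public ImageWritable() {
    // required by Hadoop: instances are created reflectively and
    // then filled in via readFields()
  }

  public ImageWritable(byte[] data, String mimeType) {
    this.data = data;
    this.mimeType = mimeType;
  }

  public void write(DataOutput out) throws IOException {
    out.writeInt(data.length);
    out.write(data);
    out.writeUTF(mimeType);
  }

  public void readFields(DataInput in) throws IOException {
    int length = in.readInt();
    data = new byte[length];
    in.readFully(data);
    mimeType = in.readUTF();
  }

  public byte[] getData() { return data; }

  public String getMimeType() { return mimeType; }
}

Because the image is stored already compressed, serialization is just a
length-prefixed byte copy, which is part of why a MapFile lookup stays cheap
compared to re-encoding on the fly or going through a database.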
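
Stefan's suggestion to write a Hadoop MapReduce job over the fetched content in the
segments could look roughly like the sketch below, written against the old
org.apache.hadoop.mapred API (class and method names have shifted between early
Hadoop releases, so treat them as approximate for whatever version ships with your
Nutch). It assumes the segment's content directory holds (url, Content) records, and
Thumbnails.create() is a hypothetical helper sketched further down.

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapFileOutputFormat;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.lib.IdentityReducer;
import org.apache.nutch.protocol.Content;

/**
 * Sketch of a job that walks a segment's fetched content and writes a
 * MapFile of (url, ImageWritable) thumbnails. Paths and key classes are
 * assumptions, not working Nutch code.
 */
public class ImageExtractorJob {

  /** Keeps only image responses and turns them into thumbnails. */
  public static class ImageMapper extends MapReduceBase
      implements Mapper<WritableComparable, Writable, Text, ImageWritable> {

    public void map(WritableComparable key, Writable value,
                    OutputCollector<Text, ImageWritable> output,
                    Reporter reporter) throws IOException {
      Content content = (Content) value;
      String type = content.getContentType();
      if (type == null || !type.startsWith("image/")) {
        return; // not an image, nothing to do
      }
      // Thumbnails.create() is the hypothetical ImageJ-based helper below.
      byte[] thumb = Thumbnails.create(content.getContent(), 120);
      if (thumb != null) {
        output.collect(new Text(content.getUrl()),
                       new ImageWritable(thumb, "image/jpeg"));
      }
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(ImageExtractorJob.class);
    conf.setJobName("extract-image-thumbnails");

    // The data files inside <segment>/content are sequence files of
    // (url, Content) records; a real job would add every part, not
    // just part-00000.
    FileInputFormat.addInputPath(conf,
        new Path(args[0], "content/part-00000/data"));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    conf.setInputFormat(SequenceFileInputFormat.class);
    conf.setOutputFormat(MapFileOutputFormat.class);

    conf.setMapperClass(ImageMapper.class);
    // identity reduce so the keys reach the MapFile in sorted order
    conf.setReducerClass(IdentityReducer.class);

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(ImageWritable.class);

    JobClient.runJob(conf);
  }
}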

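Thomas mentions generating the thumbnails with ImageJ; his code is not in the
thread, so the helper below is only a guess at what it might look like: decode the
fetched bytes, scale them down with ImageJ's ImageProcessor, and re-encode as JPEG.

import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import javax.imageio.ImageIO;

import ij.ImagePlus;
import ij.process.ImageProcessor;

/**
 * Hypothetical ImageJ-based thumbnail helper (Thomas's actual code is
 * not part of the thread). Decodes the fetched image bytes, scales them
 * down, and re-encodes them as JPEG.
 */
public class Thumbnails {

  /** Returns JPEG bytes of a thumbnail at most maxWidth pixels wide, or null. */
  public static byte[] create(byte[] imageBytes, int maxWidth) throws IOException {
    BufferedImage original = ImageIO.read(new ByteArrayInputStream(imageBytes));
    if (original == null) {
      return null; // not a decodable image
    }

    ImageProcessor ip = new ImagePlus("image", original).getProcessor();
    ip.setInterpolate(true); // bilinear scaling for smoother thumbnails

    // scale down, keeping the aspect ratio
    int width = Math.min(maxWidth, ip.getWidth());
    int height = Math.max(1, ip.getHeight() * width / ip.getWidth());
    ImageProcessor scaled = ip.resize(width, height);

    ByteArrayOutputStream out = new ByteArrayOutputStream();
    ImageIO.write(scaled.getBufferedImage(), "jpg", out);
    return out.toByteArray();
  }
}

And Stefan's point about key-based lookup: once the thumbnails sit in a MapFile,
serving one is a single seek by URL. A sketch, assuming the MapFile written by the
job above (pass the path of one part-NNNNN directory of the job output, since each
part is itself a MapFile):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

/**
 * Sketch of the lookup side: fetch one thumbnail by image URL from the
 * MapFile produced by the job above (directory layout is an assumption).
 */
public class ThumbnailLookup {

  public static byte[] lookup(String mapFileDir, String imageUrl) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    MapFile.Reader reader = new MapFile.Reader(fs, mapFileDir, conf);
    try {
      ImageWritable value = new ImageWritable();
      // get() positions on the key via the in-memory index and reads the value
      if (reader.get(new Text(imageUrl), value) != null) {
        return value.getData();
      }
      return null; // no thumbnail stored for this URL
    } finally {
      reader.close();
    }
  }
}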