Sounds like everyone, me included, is interested in being able to provide this
service.

If the process requires that we split it off from the Nutch code, what would
be required to make this happen?

r/d

-----Original Message-----
From: Zaheed Haque [mailto:[EMAIL PROTECTED] 
Sent: Saturday, June 03, 2006 9:28 AM
To: [email protected]
Subject: Re: Re[2]: Image Search

Yes! I am very interested.

Regards


On 6/3/06, Dima Mazmanov <[EMAIL PROTECTED]> wrote:
> Hi, Stefan.
>
> That would be great!!!
> I think many people would vote for this.
> Since Nutch is a really powerful search engine, it would be nice to
> see several types of search in it.
>
> You wrote on June 3, 2006, 20:17:06:
>
> > Having an image search component for Nutch would be nice.
> > However, I think we need to implement this as a separate tool
> > outside of the Nutch code itself, since it cannot be fully integrated
> > into the Nutch code.
> > (E.g., Nutch defines one URL == one index document.)
> > Maybe this would be a nice project for a Nutch sandbox.
> > If you like, you can open an issue to request a Nutch sandbox project
> > "image search".
> > If we get enough people to vote for this issue, we may have a chance to
> > get it created.
>
> > Stefan
>
> > On 03.06.2006 at 10:38, TDLN wrote:
>
> >> I am interested in developing such a solution as well.
> >>
> >> I am currently storing the thumbnails on the file system under a
> >> system-generated name. My indexing plugin stores the filename in the
> >> index. Thumbnails are later served to the client by a separate Apache
> >> HTTP server. This required some changes but is otherwise pretty
> >> straightforward and performs very well for my current 300,000+
> >> images, around 15 KB each.
> >>
> >> If you are developing the more "Nutch-like" solution, I could
> >> contribute to that. For instance, I have some code that generates the
> >> thumbnails using ImageJ and yields very good results.
> >>
> >> But I would definitely need some guidance in writing the Hadoop
> >> MapReduce job. We could even contribute this back and base a small
> >> tutorial on this work.
> >>
> >> What do you think?
> >>
> >> Rgrds, Thomas
> >>
> >> On 6/2/06, Stefan Groschupf <[EMAIL PROTECTED]> wrote:
> >>> Hi,
> >>> Using a search for "http" is a bad idea, since you get many but not all
> >>> pages.
> >>> Just write a Hadoop MapReduce job that processes the fetched content
> >>> in your segments; that should be easy.
> >>> Storing images in a file system will be very slow as soon as you have
> >>> too many.
> >>> I personally don't like databases since, compared to Nutch, they are
> >>> as slow as a snail.
> >>> For another project also related to images, I created my own
> >>> ImageWritable that contained the binary data of a compressed image
> >>> combined with some metadata.
> >>> If you use a MapFile, finding an image based on a key should be very
> >>> fast. I think much faster than a database with binary content.
> >>>
> >>> HTH
> >>> Stefan
> >>>
> >>>
> >>>
> >>>
> >>> On 02.06.2006 at 21:10, Marco Pereira wrote:
> >>>
> >>> > Hi Everybody,
> >>> >
> >>> > I've got Nutch to index images by searching their URL, alt and title
> >>> > tags.
> >>> > But the problem comes when storing the thumbnails.
> >>> > I've indexed 3 million images for a national search engine.
> >>> > I am unsure whether to use a file system scheme or a database to
> >>> > store the thumbnails.
> >>> > The thumbnails are created with a script that gets the image URLs
> >>> > from the Nutch index by doing a search for http
> >>> > (search.jsp?query=http).
> >>> >
> >>> > Do you have any tips, ideas on this?
> >>> >
> >>> > Thank you,
> >>> > Marco
> >>>
> >>> ---------------------------------------------
> >>> blog: http://www.find23.org
> >>> company: http://www.media-style.com
> >>>
> >>>
> >>>
> >>
>
>
>
>
>
>
>
>
> --
> Regards,
>  Dima                          mailto:[EMAIL PROTECTED]
>
>
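
A minimal sketch of the storage idea Stefan describes above: an image record
that serializes compressed bytes together with some metadata, kept in a sorted
store and looked up by key. This is an illustration under assumptions, not
Stefan's code. ImageRecord and ThumbnailStore are hypothetical names, the
metadata fields are guesses, and an in-memory TreeMap stands in for a Hadoop
MapFile so that the example runs on a plain JDK. Only the write/readFields
method shapes are borrowed from Hadoop's Writable interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.TreeMap;

/** Hypothetical stand-in for the ImageWritable mentioned in the thread. */
class ImageRecord {
    String mimeType = "";          // assumed metadata field
    int width;                     // assumed metadata field
    int height;                    // assumed metadata field
    byte[] data = new byte[0];     // compressed image bytes

    // Same method shape as Hadoop's Writable.write(DataOutput).
    void write(DataOutput out) throws IOException {
        out.writeUTF(mimeType);
        out.writeInt(width);
        out.writeInt(height);
        out.writeInt(data.length);
        out.write(data);
    }

    // Same method shape as Hadoop's Writable.readFields(DataInput).
    void readFields(DataInput in) throws IOException {
        mimeType = in.readUTF();
        width = in.readInt();
        height = in.readInt();
        data = new byte[in.readInt()];
        in.readFully(data);
    }
}

/** In-memory sorted store; a TreeMap stands in for a MapFile here. */
class ThumbnailStore {
    private final TreeMap<String, byte[]> entries = new TreeMap<>();

    /** Serializes the record and stores it under the image URL. */
    void put(String url, ImageRecord rec) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            rec.write(new DataOutputStream(buf));
            entries.put(url, buf.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the record stored under the URL, or null if there is none. */
    ImageRecord get(String url) {
        byte[] raw = entries.get(url);
        if (raw == null) {
            return null;
        }
        try {
            ImageRecord rec = new ImageRecord();
            rec.readFields(new DataInputStream(new ByteArrayInputStream(raw)));
            return rec;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real setup the same write/readFields pair could back a class that
implements Hadoop's Writable, with the image URL as the key of a MapFile.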

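Thomas mentions generating thumbnails with ImageJ. His code is not shown in
the thread, so the following is only a rough sketch of the scaling step, using
the JDK's own java.awt classes instead of ImageJ. The class name Thumbnails
and the maxSide parameter are made up for illustration.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Hypothetical helper; plain java.awt is used here in place of ImageJ. */
class Thumbnails {

    /**
     * Scales src so that its longer side is at most maxSide pixels,
     * keeping the aspect ratio. Images that are already small enough
     * are copied at their original size.
     */
    static BufferedImage scale(BufferedImage src, int maxSide) {
        int w = src.getWidth();
        int h = src.getHeight();
        // Never enlarge: the scale factor is capped at 1.0.
        double factor = Math.min(1.0, (double) maxSide / Math.max(w, h));
        int tw = Math.max(1, (int) Math.round(w * factor));
        int th = Math.max(1, (int) Math.round(h * factor));

        BufferedImage dst = new BufferedImage(tw, th, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                               RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, 0, 0, tw, th, null);
        } finally {
            g.dispose();
        }
        return dst;
    }
}
```

The result could then be encoded (for example with javax.imageio.ImageIO) and
either written to disk under a generated name, as Thomas does, or stored in a
keyed store as Stefan suggests.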


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
