Dima,

I think there are several issues that need to be thought through
thoroughly before we can implement this.

I created a Wiki page to discuss the design:

http://wiki.apache.org/nutch/Image_Search_Design

Writing a MapReduce job is completely new to me, so with my limited
knowledge in this area I cannot answer your question.

Anyway, I think now is the time to read the Hadoop MapReduce code :)

Rgrds, Thomas



On 6/3/06, Dima Mazmanov <[EMAIL PROTECTED]> wrote:
> Hi, TDLN.
>
> But how will image data be stored in the Nutch database?
> Would it affect the rest of the data in it?
> >> (E.g., Nutch defines one URL == one index document.)
>
> > Why can't we create a document for every image that is found?
>
> > Then it is as if we will have a parse-image plugin just like we have a
> > parse-html and parse-pdf plugin, with the only difference that it will
> > be run after all the pages in the segment have been fetched?
>
> > Rgrds, Thomas
>
>
>
>
> --
> Regards,
>  Dima                          mailto:[EMAIL PROTECTED]
>
>


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
