[ 
https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583877#action_12583877
 ] 

Gordon Mohr commented on NUTCH-296:
-----------------------------------

FYI: We've suggested image-search extensions to Nutch as a possible
Internet Archive-mentored Google Summer of Code 2008 student project. (See our
ideas page at
<http://webteam.archive.org/confluence/display/SOC06/Summer+of+Code+2008>.)
It is too early to say whether we'll get any good proposals, or whether this
project will make the cut once we see the final list of proposals and how
many projects we get.

> Image Search
> ------------
>
>                 Key: NUTCH-296
>                 URL: https://issues.apache.org/jira/browse/NUTCH-296
>             Project: Nutch
>          Issue Type: New Feature
>            Reporter: Thomas Delnoij
>            Priority: Minor
>
> Per the discussion in the Nutch-User mailing list, there is a wish for an 
> "Image Search" add-on component that will index images.
> Must have:
> - retrieve outlinks to image files from fetched pages
> - generate thumbnails from images
> - thumbnails are stored in the segments as an ImageWritable that contains
> the compressed binary data and some metadata
> Should have:
> - implemented as a Hadoop MapReduce job
> - should be separate from the main Nutch codeline, as it breaks the general
> Nutch logic of one URL == one index document.
> Could have:
> - store the original image in the segments
> Would like to have:
> - search interface for image index
> - parameterizable thumbnail generation (width, height, quality)
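
To make the "Must have" items above more concrete, here is a rough sketch of
what a thumbnail record might look like. Everything in it is an assumption
for illustration: the issue does not define ImageWritable, so the class name
ImageWritableSketch, its fields, and the thumbnail() helper are hypothetical.
The write()/readFields() methods only mirror the shape of Hadoop's Writable
interface using plain java.io types, so the sketch runs without Hadoop on the
classpath. PNG is used for the compressed bytes; the "quality" parameter from
the wish list is left out.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical sketch, not Nutch API: a thumbnail plus a little metadata.
public class ImageWritableSketch {
    private String sourceUrl = "";   // URL of the original image (assumed field)
    private int width;               // thumbnail width in pixels
    private int height;              // thumbnail height in pixels
    private byte[] data = new byte[0]; // PNG-compressed thumbnail bytes

    public ImageWritableSketch() {}

    public ImageWritableSketch(String sourceUrl, int width, int height, byte[] data) {
        this.sourceUrl = sourceUrl;
        this.width = width;
        this.height = height;
        this.data = data;
    }

    public String getSourceUrl() { return sourceUrl; }
    public int getWidth() { return width; }
    public int getHeight() { return height; }
    public byte[] getData() { return data; }

    // Same method shape as Hadoop's Writable.write(DataOutput).
    public void write(DataOutput out) throws IOException {
        out.writeUTF(sourceUrl);
        out.writeInt(width);
        out.writeInt(height);
        out.writeInt(data.length);
        out.write(data);
    }

    // Same method shape as Hadoop's Writable.readFields(DataInput).
    public void readFields(DataInput in) throws IOException {
        sourceUrl = in.readUTF();
        width = in.readInt();
        height = in.readInt();
        data = new byte[in.readInt()];
        in.readFully(data);
    }

    // Scales the image down to fit within maxWidth x maxHeight, keeping the
    // aspect ratio and never upscaling, then encodes the result as PNG.
    public static ImageWritableSketch thumbnail(String url, BufferedImage original,
                                                int maxWidth, int maxHeight)
            throws IOException {
        double scale = Math.min(1.0, Math.min(
                (double) maxWidth / original.getWidth(),
                (double) maxHeight / original.getHeight()));
        int w = Math.max(1, (int) Math.round(original.getWidth() * scale));
        int h = Math.max(1, (int) Math.round(original.getHeight() * scale));

        BufferedImage scaled = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = scaled.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(original, 0, 0, w, h, null);
        g.dispose();

        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        ImageIO.write(scaled, "png", buf);
        return new ImageWritableSketch(url, w, h, buf.toByteArray());
    }
}
```

In a real MapReduce job, a class like this would presumably implement
org.apache.hadoop.io.Writable directly and be emitted as the map output
value, keyed by the image URL; that wiring is not shown here.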

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
