> We can use SourceForge as the CVS host; in the worst case we can use SF. However, I would love to wait and hear what Doug thinks about having a sandbox repository in the nutch svn with limited access.
>
> -----Original Message-----
> From: TDLN [mailto:[EMAIL PROTECTED]
> Sent: Saturday, June 03, 2006 10:25 AM
> To: [email protected]
> Subject: Re: Re[2]: Image Search
>
> Dan - this sounds really good! Participation in an Open Source project
> is new to me as well, but hey, that's why we get to start in the
> sandbox :)
>
> I was also thinking about source control. We definitely need a
> repository, don't you think?
>
> Rgrds, Thomas
>
> On 6/3/06, Dan Morrill <[EMAIL PROTECTED]> wrote:
>> Well I can do the project management side of it, and can volunteer some
>> time, but have never done this in an open source model before. But I can
>> do documentation, project management support, and make a decent
>> cheerleader as well.
>>
>> Let me know.
>> r/d
>>
>> -----Original Message-----
>> From: TDLN [mailto:[EMAIL PROTECTED]
>> Sent: Saturday, June 03, 2006 9:59 AM
>> To: [email protected]
>> Subject: Re: Re[2]: Image Search
>>
>> Ok, I created a Jira Issue for this:
>>
>> http://issues.apache.org/jira/browse/NUTCH-296
>>
>> I did not assign the Issue to any component. Maybe we can have a
>> "Sandbox" component?
>>
>> Now, the question is how we can support several people working on this
>> from a "project management" or code management perspective?
>>
>> I mean, if we want the Sandbox to flourish, we need some kind of
>> infrastructure, right?
>>
>> Rgrds, Thomas Delnoij
>>
>> On 6/3/06, Dan Morrill <[EMAIL PROTECTED]> wrote:
>>> Sounds like everyone, even me, is interested in being able to provide
>>> this service.
>>>
>>> If the process requires that we break it off of nutch code, what all
>>> would be required to make this happen?
>>>
>>> r/d
>>>
>>> -----Original Message-----
>>> From: Zaheed Haque [mailto:[EMAIL PROTECTED]
>>> Sent: Saturday, June 03, 2006 9:28 AM
>>> To: [email protected]
>>> Subject: Re: Re[2]: Image Search
>>>
>>> Yes! I am very interested.
>>>
>>> Regards
>>>
>>> On 6/3/06, Dima Mazmanov <[EMAIL PROTECTED]> wrote:
>>>> Hi, Stefan.
>>>>
>>>> That would be great!!!
>>>> I think many people would vote for this.
>>>> Since nutch is a really powerful search engine, it would be nice to
>>>> see several types of search in it.
>>>>
>>>> You wrote on June 3, 2006, 20:17:06:
>>>>
>>>>> Having an image search component for nutch would be nice.
>>>>> However, I think we need to implement this as a kind of separate tool
>>>>> outside of the nutch code itself, since it cannot be 100% integrated
>>>>> into the nutch code.
>>>>> (E.g. Nutch defines one url == one index document.)
>>>>> Maybe this would be a nice project for a nutch sandbox.
>>>>> If you like, you can open an issue to request a nutch sandbox project
>>>>> "image search".
>>>>> If we get enough people to vote for this issue, we may have a chance
>>>>> to get it created.
>>>>
>>>>> Stefan
>>>>
>>>>> On 03.06.2006 at 10:38, TDLN wrote:
>>>>
>>>>>> I am interested in developing such a solution as well.
>>>>>>
>>>>>> I am currently storing the thumbnails on the file system under a
>>>>>> system generated name. My indexing plugin stores the filename in the
>>>>>> index. Thumbnails are later served to the client by a separate Apache
>>>>>> HTTP server. This required some changes but is otherwise pretty
>>>>>> straightforward and performs very well for my current 300,000+
>>>>>> images, around 15kb each.
>>>>>>
>>>>>> If you are developing the more "Nutch-like" solution I could
>>>>>> contribute to that. For instance, I have some code that generates the
>>>>>> thumbs using ImageJ that yields very good results.
>>>>>>
>>>>>> But I would definitely need some guidance in writing the hadoop map
>>>>>> reduce job. We could even contribute this back and base a small
>>>>>> tutorial on this work.
>>>>>>
>>>>>> What do you think?
>>>>>>
>>>>>> Rgrds, Thomas
>>>>>>
>>>>>> On 6/2/06, Stefan Groschupf <[EMAIL PROTECTED]> wrote:
>>>>>>> Hi,
>>>>>>> using a search for http is a bad idea, since you get many but not
>>>>>>> all pages.
>>>>>>> Just write a hadoop map reduce job that processes the fetched
>>>>>>> content in your segments; that should be easy.
>>>>>>> Storing images in a file system will be very slow as soon as you
>>>>>>> have too many.
>>>>>>> I personally don't like databases since, compared to nutch, they
>>>>>>> are as slow as a snail.
>>>>>>> For another project also related to images I had created my own
>>>>>>> ImageWritable that contained the binary data of a compressed image
>>>>>>> together with some meta data.
>>>>>>> If you use a MapFile, finding an image based on a key should be
>>>>>>> very fast. I think much faster than a database with binary content.
>>>>>>>
>>>>>>> HTH
>>>>>>> Stefan
>>>>>>>
>>>>>>> On 02.06.2006 at 21:10, Marco Pereira wrote:
>>>>>>>
>>>>>>>> Hi Everybody,
>>>>>>>>
>>>>>>>> I've got nutch to index images by searching their url, alt and
>>>>>>>> title tags.
>>>>>>>> But the problem comes when storing the thumbnails.
>>>>>>>> I've indexed 3 million images for a national search engine.
>>>>>>>> I was in doubt whether to use a file system scheme or a database
>>>>>>>> to store the thumbnails.
>>>>>>>> The thumbnails are created with a script that gets the image urls
>>>>>>>> from the nutch index by doing a search for http
>>>>>>>> (search.jsp?query=http).
>>>>>>>>
>>>>>>>> Do you have any tips or ideas on this?
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>> Marco
>>>>>>>
>>>>>>> ---------------------------------------------
>>>>>>> blog: http://www.find23.org
>>>>>>> company: http://www.media-style.com
>>>>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Dima mailto:[EMAIL PROTECTED]
>>>>
>
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
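
To make the storage idea from the quoted thread a bit more concrete (looking up a thumbnail by key inside one large file rather than keeping one file per image), here is a minimal Python sketch. It is only an illustration under stated assumptions: it is not Hadoop's MapFile and uses no Nutch or Hadoop API. The record layout and the names write_pack and read_thumbnail are hypothetical, and the index is simply kept in memory.

```python
import os
import struct
import tempfile

# Hypothetical record layout (an assumption, not a Hadoop format):
# 4-byte key length, 4-byte value length, key bytes, value bytes.
_HEADER = struct.Struct(">II")


def write_pack(path, thumbnails):
    """Write {url: thumbnail_bytes} into one file, sorted by key.

    Returns an index mapping each url to (offset, length) of its data.
    """
    index = {}
    with open(path, "wb") as out:
        for url in sorted(thumbnails):
            key = url.encode("utf-8")
            data = thumbnails[url]
            out.write(_HEADER.pack(len(key), len(data)))
            out.write(key)
            # Remember where the image bytes start so a reader can seek there.
            index[url] = (out.tell(), len(data))
            out.write(data)
    return index


def read_thumbnail(path, index, url):
    """Fetch one thumbnail with a single seek and read."""
    offset, length = index[url]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


if __name__ == "__main__":
    # Placeholder bytes stand in for real compressed thumbnails.
    demo = {
        "http://example.org/b.png": b"fake-png-bytes",
        "http://example.org/a.jpg": b"fake-jpg-bytes",
    }
    with tempfile.TemporaryDirectory() as tmp:
        pack = os.path.join(tmp, "thumbs.pack")
        idx = write_pack(pack, demo)
        assert read_thumbnail(pack, idx, "http://example.org/a.jpg") == b"fake-jpg-bytes"
```

With millions of thumbnails the file system then holds a handful of large files instead of millions of small ones. A real setup would probably also persist the index and split the data across several pack files.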
