thank u, I will try

2005/8/21, Fuad Efendi <[EMAIL PROTECTED]>:
>
> If you need search engine - you can use Nutch.
> For front-end, you can use links to images from source sites, it's
> better for performance of your own site.
>
> If you need just to collect image files, probably best is
> http://htmlparser.sourceforge.net/
> org.htmlparser.parserapplications.SiteCapturer
> However, I'd prefer to write some additional readers for Nutch! Better
> for troubleshooting, I need it.
> Thanks
>
>
> -----Original Message-----
> From: Zhou LiBing [mailto:[EMAIL PROTECTED]
> Sent: Saturday, August 20, 2005 7:59 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [Nutch-general] Re: about the nutch function
>
>
> can I specify more than one URL to crawl the whole web? are you sure? how
> to edit the crawl-urlfilter.txt to fetch the images? how could I extract
> these *segment* images' feature?
> thank you
>
>
> 2005/8/19, Piotr Kosiorowski <[EMAIL PROTECTED]>:
> >
> > Yes. Yes. :)
> > You can specify more than one url while injecting pages to WebDB. You
> > can fetch the image file (you have to edit crawl-urlfilter.txt or
> > regex-urlfilter.txt to allow particular extension as majority of image
> > extensions are blocked by default). But such data would be only stored
> > in segment - I do not think it would be accessible by search.
> > P.
> >
> >
> > On 8/19/05, Zhou LiBing <[EMAIL PROTECTED]> wrote:
> > > Can Nutch use one or more start URL to crawl the WEB?
> > > Can Nutch fetch the IMAGE file?
> > > thank you
> > >
> > >
> > > --
> > > ---Letter From your friend Blue at HUST CGCL---
> >
> >
> > -------------------------------------------------------
> > SF.Net email is Sponsored by the Better Software Conference & EXPO
> > September 19-22, 2005 * San Francisco, CA * Development Lifecycle
> > Practices Agile & Plan-Driven Development * Managing Projects & Teams
> > * Testing & QA Security * Process Improvement & Measurement *
> > http://www.sqe.com/bsce5sf
> > _______________________________________________
> > Nutch-general mailing list [email protected]
> > https://lists.sourceforge.net/lists/listinfo/nutch-general
--
---Letter From your friend Blue at HUST CGCL---
