Re: [Nutch-general] Re: about the nutch function

Zhou LiBing Sat, 20 Aug 2005 16:59:29 -0700

Can I specify more than one URL to crawl the whole web? Are you sure?
How do I edit crawl-urlfilter.txt to fetch the images? And how can I extract
features from the images stored in the segment?
Thank you.


 2005/8/19, Piotr Kosiorowski <[EMAIL PROTECTED]>: 
> 
> Yes. Yes. :)
> You can specify more than one URL when injecting pages into the WebDB.
> You can fetch image files (you have to edit crawl-urlfilter.txt or
> regex-urlfilter.txt to allow the particular extensions, as the majority
> of image extensions are blocked by default).
> But such data would only be stored in the segment; I do not think it
> would be accessible by search.
> P.
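
To make the filter edit mentioned above concrete, here is a rough Python sketch of how I understand the "+"/"-" rules in crawl-urlfilter.txt and regex-urlfilter.txt to behave: each line is a sign followed by a regular expression, the first rule whose pattern is found in the URL decides, and a URL that matches no rule is ignored. The rule text, the suffix lists, and the helper names parse_rules and accepts are all made up for illustration; this is not Nutch code, and the file shipped with your Nutch version is the real reference.

```python
import re

# Illustrative rules only (an assumption, not copied from any Nutch release).
BLOCKING_RULES = r"""
# skip image and other binary suffixes
-\.(gif|jpg|png|css|zip|exe)$
# accept anything else
+.
"""

# The same rules with the image suffixes taken out of the "-" line,
# which is the kind of edit needed to let image URLs through.
IMAGE_FRIENDLY_RULES = r"""
-\.(css|zip|exe)$
+.
"""


def parse_rules(text):
    """Turn '+regex' / '-regex' lines into (accept, compiled_pattern) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with '+' or '-': " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(url, rules):
    """First rule whose pattern is found in the URL decides; no match means reject."""
    for accept, pattern in rules:
        if pattern.search(url):
            return accept
    return False
```

With these sample rules, accepts("http://example.com/logo.jpg", parse_rules(BLOCKING_RULES)) is False, while the same URL checked against IMAGE_FRIENDLY_RULES is True. In the real file the edit amounts to removing the image extensions you want (gif, jpg, and so on) from the "-" suffix line.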
> 
> 
> On 8/19/05, Zhou LiBing <[EMAIL PROTECTED]> wrote:
> > Can Nutch use one or more start URLs to crawl the web?
> > Can Nutch fetch image files?
> > thank you
> >
> >
> > --
> > ---Letter From your friend Blue at HUST CGCL---
> >
> >
> 
> 



-- 
---Letter From your friend Blue at HUST CGCL---
