Re: [Nutch-general] searching limits, and fetcher output

ogjunk-nutch Wed, 27 Jul 2005 15:51:11 -0700

I _think_ the answer to the first question is: yes.
As for the second question, you can use the regex URL filter (see the
conf/ dir).
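
To make the second answer a bit more concrete, here is a rough sketch, not a tested Nutch config. It assumes the filter file uses the usual one-rule-per-line format ('+' accepts, '-' rejects, '#' starts a comment, the first matching pattern wins) and that patterns are applied with "find" semantics. The rule lines, the helper names parse_rules/accepts and the example.* URLs are made up for illustration. Also note that a URL filter limits which URLs get fetched, rather than trimming results at query time.

```python
import re

# Hypothetical rule lines in the style of conf/regex-urlfilter.txt.
# Assumed semantics: '+' accepts, '-' rejects, first matching rule wins.
RULES = r"""
# skip non-web schemes
-^(file|ftp|mailto):
# accept only hosts under the TLDs listed in the question
+^https?://([a-z0-9-]+\.)+(com|edu|org|net|tv|info|gov|biz|us|cc|name|bz)(:[0-9]+)?(/|$)
# reject everything else
-.
"""


def parse_rules(text):
    """Turn rule lines into a list of (allow, compiled_pattern) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with '+' or '-': " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(url, rules):
    """Return the verdict of the first rule whose pattern is found in url."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False  # no rule matched: treat the URL as rejected


if __name__ == "__main__":
    rules = parse_rules(RULES)
    for url in ("http://example.com/index.html",
                "http://example.de/",
                "ftp://example.com/pub/"):
        print(url, "->", "keep" if accepts(url, rules) else "skip")
```

Trying patterns this way before a crawl is only a convenience; the host pattern assumes lowercase host names, and the real check is whatever the filter plugin in your Nutch version does with the same lines.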


Otis


--- Jay Pound <[EMAIL PROTECTED]> wrote:

> if I crawl with the -noParsing tag can I trash the fetcher output folder
> after I parse that segment?
> 
> searching, is there any way to limit the results to only english, or only
> websites ending in extensions I
> define(.com.edu.org.net.tv.info.gov.biz.us.cc.name.bz)?
> thanks,
> -J
> _______________________________________________
> Nutch-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/nutch-general
> 



