There is an FLFilter in OC which uses Nutch's regex-urlfilter.txt. I believe it's
called NutchUrlFLFilter.
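
For illustration, here is a rough sketch of how a regex-urlfilter.txt-style rule list can restrict a crawl to a set of seed domains. It is not OC's NutchUrlFLFilter and not Nutch's own code: the class name RegexRuleFilterSketch, the accepts method, and the example.org domain are all made up for this example. It only assumes the usual file convention, where each rule line starts with + (accept) or - (reject) followed by a regular expression, the first matching rule decides, and a URL that matches no rule is dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of regex-urlfilter.txt-style filtering.
// Not OC's NutchUrlFLFilter; all names here are assumptions.
class RegexRuleFilterSketch {

    // One rule: accept (+) or reject (-) URLs in which the pattern is found.
    private static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, Pattern pattern) {
            this.accept = accept;
            this.pattern = pattern;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    // Parses rule lines; blank lines and lines starting with '#' are skipped.
    RegexRuleFilterSketch(List<String> lines) {
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            char sign = line.charAt(0);
            if (sign != '+' && sign != '-') {
                throw new IllegalArgumentException("Bad rule: " + line);
            }
            rules.add(new Rule(sign == '+', Pattern.compile(line.substring(1))));
        }
    }

    // The first rule whose pattern is found in the URL decides;
    // a URL that matches no rule is rejected.
    boolean accepts(String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.accept;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed rule set: skip images, then accept one seed domain.
        RegexRuleFilterSketch filter = new RegexRuleFilterSketch(Arrays.asList(
                "# skip common image suffixes",
                "-\\.(gif|jpg|png)$",
                "# accept anything under the seed domain",
                "+^http://([a-z0-9]*\\.)*example\\.org/"));
        System.out.println(filter.accepts("http://www.example.org/index.html"));
        System.out.println(filter.accepts("http://www.example.org/logo.png"));
        System.out.println(filter.accepts("http://other.example.com/"));
    }
}
```

With rules like these, a seed domain list becomes one + line per domain; anything outside those domains falls through to the default rejection.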

On Tue, 6 Sep 2005 19:32:15 -0700 (PDT), Michael Ji wrote:
> Hi Kelvin:
>
> Does OC support domain crawling like url-filter.txt?
> If so, how do I insert the seed domain list into OC?
>
> I looked at OC's org.supermind.crawl.scope package and didn't
> see a similar concept.
>
> thanks,
>
> Michael Ji
>
> ------------------------------------------------------- SF.Net
> email is Sponsored by the Better Software Conference & EXPO
> September 19-22, 2005 * San Francisco, CA * Development Lifecycle
> Practices Agile & Plan-Driven Development * Managing Projects &
> Teams * Testing & QA Security * Process Improvement & Measurement *
> http://www.sqe.com/bsce5sf
> _______________________________________________ Nutch-general
> mailing list [email protected]
> https://lists.sourceforge.net/lists/listinfo/nutch-general