There is an FLFilter in OC that uses Nutch's regex-urlfilter.txt. I believe it's called NutchUrlFLFilter.
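
For the seed-domain part of the question, here is a rough sketch of how rules in the regex-urlfilter.txt style can restrict a crawl to a list of domains. It is not OC's NutchUrlFLFilter and not Nutch's own filter class: the class name RegexRuleFilterSketch and the example.com / example.org domains are hypothetical placeholders. It assumes the usual rule format, where each line starts with + (accept) or - (reject) followed by a regex, lines starting with # are comments, and the first matching rule decides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch only: not OC's NutchUrlFLFilter and not a Nutch class.
class RegexRuleFilterSketch {

    // One rule: accept (+) or reject (-) any URL in which the pattern is found.
    private static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, Pattern pattern) {
            this.accept = accept;
            this.pattern = pattern;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    // Parses rule text written in the regex-urlfilter.txt style.
    RegexRuleFilterSketch(String ruleText) {
        for (String raw : ruleText.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            char sign = line.charAt(0);
            if (sign != '+' && sign != '-') {
                throw new IllegalArgumentException("Rule must start with + or -: " + line);
            }
            rules.add(new Rule(sign == '+', Pattern.compile(line.substring(1))));
        }
    }

    // The first rule whose regex is found in the URL decides; no match means reject.
    boolean accepts(String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.accept;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String rules = String.join("\n",
            "# skip non-http schemes",
            "-^(file|ftp|mailto):",
            "# seed domains (placeholders), one accept rule per domain",
            "+^http://([a-z0-9-]+\\.)*example\\.com/",
            "+^http://([a-z0-9-]+\\.)*example\\.org/",
            "# reject everything else",
            "-.");
        RegexRuleFilterSketch filter = new RegexRuleFilterSketch(rules);
        System.out.println(filter.accepts("http://www.example.com/index.html"));
        System.out.println(filter.accepts("http://other.test/"));
    }
}
```

Under these assumptions, the seed domain list would go into the rules file as one + line per domain, followed by a final "-." line so that everything outside those domains is rejected.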

On Tue, 6 Sep 2005 19:32:15 -0700 (PDT), Michael Ji wrote:
> Hi Kelvin:
>
> Does OC support domain crawling like url-filter.txt?
> If so, how to insert the seeds domain list to OC?
>
> I saw OC's org.supermind.crawl.scope package, didn't
> see a similar concept.
>
> thanks,
>
> Michael Ji

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
