Hi All,

I want to crawl only sites whose language is XXX. I wrote a ParseFilter
that detects the language of each site and stores it in the metadata
column. With this plugin I can prevent crawling the outlinks of sites that
are not in XXX, but I cannot prevent the main page itself from being
re-crawled. Is there a filter I can use for this? Is it possible with a
FetchSchedule? (I need to use the data in the metadata column to filter
URLs.)
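
To make the question concrete, the check I would like to apply before a
re-fetch looks roughly like the sketch below. It is plain Java and uses no
Nutch classes: a Map stands in for the page's metadata column, and the key
"detected.lang", the class name, and the method shouldRefetch are all
hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class LanguageRefetchCheck {
    // Hypothetical metadata key that the language-detecting ParseFilter writes.
    static final String LANG_KEY = "detected.lang";

    /**
     * Decides whether a page should be fetched again.
     * A page with no recorded language is fetched so that it can be classified;
     * otherwise it is fetched only if its recorded language matches wantedLang.
     */
    static boolean shouldRefetch(Map<String, String> metadata, String wantedLang) {
        String detected = metadata.get(LANG_KEY);
        if (detected == null) {
            return true; // never parsed yet, language unknown
        }
        return detected.equalsIgnoreCase(wantedLang);
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put(LANG_KEY, "other");
        // "xxx" is a placeholder for the target language code.
        System.out.println(shouldRefetch(metadata, "xxx"));
    }
}
```

What I am missing is the place in Nutch where a decision like this can be
made for a page that has already been fetched once.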

Note: The Content-Language and Accept-Language headers are not suitable for
my case.

Nutch 2.1 / HBase
