See http://wiki.apache.org/nutch/bin/nutch_inject => "*nutch.fetchInterval*:
allows to set a custom fetch interval for a specific URL"
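
As a rough illustration of the above, here is a minimal Python sketch that writes a seed file in which every URL carries a one-day nutch.fetchInterval. It assumes the injector reads tab-separated key=value metadata after each URL and that the interval is given in seconds. The example URLs, the urls/seed.txt path and the helper names are hypothetical.

```python
from pathlib import Path
from typing import Iterable, Optional

ONE_DAY = 24 * 60 * 60  # assumption: nutch.fetchInterval is expressed in seconds


def seed_line(url: str, fetch_interval: Optional[int] = None) -> str:
    """Return one seed-file line, optionally tagged with a custom fetch interval."""
    if fetch_interval is None:
        return url
    # Metadata is appended after the URL, separated by a tab character.
    return f"{url}\tnutch.fetchInterval={fetch_interval}"


def render_seed_file(urls: Iterable[str], fetch_interval: Optional[int] = None) -> str:
    """Return the full seed-file content, one URL per line."""
    return "".join(seed_line(u, fetch_interval) + "\n" for u in urls)


if __name__ == "__main__":
    # Hypothetical homepages that should be refetched daily.
    homepages = ["http://www.example.com/", "http://www.example.org/"]
    seed_path = Path("urls/seed.txt")  # hypothetical location
    seed_path.parent.mkdir(parents=True, exist_ok=True)
    seed_path.write_text(render_seed_file(homepages, ONE_DAY), encoding="utf-8")
```

The resulting directory can then be passed to bin/nutch inject together with the crawldb path. Whether this also changes the interval of URLs that are already in the crawldb is not something the sketch addresses, so it is worth checking on a test crawl first.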


On 5 February 2013 15:04, kemical <[email protected]> wrote:

> Hi,
>
> I'd like to invalidate the fetch interval of given URLs without waiting
> for db.default.fetch.interval to expire.
> It seems -adddays does the job, but only for the whole database.
>
> I was thinking about the freegen command (on my seed URLs file), but how
> can I be sure it will fetch URLs whose fetch interval has not expired yet?
>
> A short explanation of why I'm looking for this:
> The tool is meant to improve search on newly featured content (website
> homepages), so almost every URL in my seed list needs to be refetched every
> day (but I still want to keep 30 days for all the others).
>
> I'm using Nutch 1.6 and, as far as possible, I don't really want to write
> plugins since I'm not a Java dev (as soon as my crawler is clean, I'll
> focus on the front end with my usual tools/languages).



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
