Hi,

While working for a client we came across a use case that seems like it might not be uncommon. We may have some code to contribute.
The use case is that we have a few seed URLs that we need to fetch at relatively high frequency (e.g. every N minutes). These URLs have pointers to news-type content. Thus, these seed URLs are used primarily for URL discovery. From there we do a relatively shallow crawl.

The important thing is that we need to make sure we refetch the seed URLs (depth=0) at that high frequency, while all other URLs can be refetched at their default frequency. In the case of news, that probably means "fetch once and never again".

So I'm wondering if a simple custom "seed URL scheduler" would be of interest. Something like:

  if (URL is seed)
    fetch at seed URL fetch freq
  else
    fetch at standard freq

... or if this can already be done without a custom scheduler, I'd love to know how!

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/
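
To make the "seed URL scheduler" idea above a bit more concrete, here is a minimal, crawler-agnostic sketch in Python. It is only an illustration of the if/else rule, not the code mentioned above and not any existing crawler's API. The names `UrlRecord`, `is_seed`, `next_fetch_time` and `is_due`, and both interval values, are hypothetical. It assumes a seed can be recognized by depth == 0.

```python
from dataclasses import dataclass

# Hypothetical intervals, in seconds. Real values would come from configuration.
SEED_FETCH_INTERVAL_S = 5 * 60              # assumed: refetch seeds every 5 minutes
DEFAULT_FETCH_INTERVAL_S = 30 * 24 * 3600   # assumed: standard interval of 30 days


@dataclass
class UrlRecord:
    """Hypothetical per-URL state kept by the crawler."""
    url: str
    depth: int              # 0 means the URL is a seed
    last_fetch_time: float  # seconds since the epoch


def is_seed(record: UrlRecord) -> bool:
    # Assumption: seeds are exactly the URLs at depth 0.
    return record.depth == 0


def next_fetch_time(record: UrlRecord) -> float:
    # Seeds get the short interval, everything else the standard one.
    if is_seed(record):
        interval = SEED_FETCH_INTERVAL_S
    else:
        interval = DEFAULT_FETCH_INTERVAL_S
    return record.last_fetch_time + interval


def is_due(record: UrlRecord, now: float) -> bool:
    # A URL is due once the current time reaches its next fetch time.
    return now >= next_fetch_time(record)
```

In this sketch, setting `DEFAULT_FETCH_INTERVAL_S` to `float("inf")` would model the "fetch once and never again" behaviour for non-seed URLs, since their next fetch time would never be reached.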

