We should add a dont_cache request meta key for this.
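Something along these lines could then cover Alvaro's use case. This is a rough, untested sketch that assumes HttpCacheMiddleware would skip any request carrying that meta key; the spider name, URLs and CSS selectors below are just placeholders for his site:

import scrapy

class NewsSpider(scrapy.Spider):
    name = "news"  # placeholder spider name
    start_urls = ["http://www.example.com/"]  # placeholder home page

    def start_requests(self):
        # Mark only the start requests so the cache middleware would skip
        # them; the article pages requested later stay cacheable as usual.
        for url in self.start_urls:
            yield scrapy.Request(url, meta={"dont_cache": True})

    def parse(self, response):
        # Placeholder selector: follow article links found on the home page.
        for href in response.css("a.article-link::attr(href)").getall():
            yield response.follow(href, callback=self.parse_article)

    def parse_article(self, response):
        yield {
            "url": response.url,
            "title": response.css("title::text").get(),
        }

Only the start requests would bypass the cache, so the home page gets fetched fresh on every run while the already-seen article pages keep being served from HTTPCACHE.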
On Thu, Oct 31, 2013 at 8:18 AM, Alvaro Moe <[email protected]> wrote:
> Hi list,
>
> I want to avoid caching the start_urls, but not the inner pages. Is this possible?
>
> The use case: I'm scraping articles from a news website. I assume articles don't change, but the home page is my source of new articles. So I need to run the scraper regularly, hit the start_urls, get all the fresh links and ignore the old ones.
>
> How would you go about this?
>
> Thanks in advance!!
