Re: add delay to start_urls

Nicolás Alejandro Ramírez Quiros Mon, 13 Oct 2014 10:57:12 -0700

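If the URLs are on different domains, override start_requests and set 
meta['download_slot'] = <some_name> on each request. The download delay is 
applied per download slot, and by default each domain gets its own slot, so 
requests to different domains are not delayed relative to one another unless 
they share a slot.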


On Tuesday, 7 October 2014 18:17:11 UTC-2, [email protected] wrote:
>
> It looks like Scrapy just runs all start_urls at the same time. How do I 
> tell Scrapy to start with url1, wait 30s, then fetch url2?
>
> Here are my settings:
>
> AUTOTHROTTLE_ENABLED = True
> AUTOTHROTTLE_DEBUG = True
>
> DOWNLOAD_DELAY = 60
> DOWNLOAD_TIMEOUT = 30
> CONCURRENT_REQUESTS_PER_DOMAIN = 1
> AUTOTHROTTLE_START_DELAY = 10
>
>  
> And this is the spider:
>
>     start_urls = [
>         "url1",
>         "url2",
>         "url3",
>         "url4",
>         "url5",
>      ]
>
>
> Here is the log:
>
> 2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET url1> (referer: None)
> 2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET url2> (referer: None)
> 2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET url3> (referer: None)
> 2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET url4> (referer: None)
> 2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET url5> (referer: None)
>
>

