I have a working Scrapy spider deployed on an Amazon EC2 instance (c4.xlarge)
and running under Scrapyd.

No matter what I do, I can't seem to get past ~200 processed items per minute
(according to the Scrapy logs).

I tried playing around with Scrapyd's concurrency settings, but nothing helped.
I also tried lowering Scrapyd's max_proc_per_cpu to 1 to avoid context
switching, and tried running separate Scrapy crawlers from the command line;
all of them together still produce the same aggregate of around 200 items per
minute. A rough sketch of what I tried is below.
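For reference, this is roughly the kind of thing I've been changing. The exact
values below are examples rather than my precise config; I've tried several
combinations:

    # settings.py -- example values, not my exact config
    CONCURRENT_REQUESTS = 32              # global request concurrency (default 16)
    CONCURRENT_REQUESTS_PER_DOMAIN = 16   # per-domain cap (default 8)
    DOWNLOAD_DELAY = 0                    # no artificial delay between requests
    AUTOTHROTTLE_ENABLED = False          # make sure AutoThrottle isn't limiting the rate
    REACTOR_THREADPOOL_MAXSIZE = 20       # more threads for DNS lookups etc.

And on the Scrapyd side:

    # scrapyd.conf
    [scrapyd]
    # lowered from the default of 4 to avoid context switching
    max_proc_per_cpu = 1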

I can see from the Scrapy logs that the aggregate number of pages crawled
increases almost linearly, but the scraped-items rate seems stuck at about
200 per minute.
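Concretely, the periodic stats lines in the log look roughly like this (the
page counts here are made up for illustration; the ~200 items/min figure is
the real, stubborn one):

    2015-11-08 12:00:00 [scrapy] INFO: Crawled 4832 pages (at 412 pages/min), scraped 9731 items (at 200 items/min)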

Any tips? Has anybody come across this before? Have I missed a setting
somewhere?

Much appreciated, Daniel.

*Also asked on Stack Overflow:
http://stackoverflow.com/questions/33595986/scrapy-scrpyd-cant-process-more-than-200-items-per-minute
