I have a working Scrapy spider deployed on an Amazon EC2 instance (c4.xlarge), running under scrapyd.

No matter what I do, I can't seem to get above ~200 processed items per minute (according to the Scrapy logs). I tried playing around with the scrapyd concurrency settings; nothing helped. I tried lowering scrapyd's max_proc_per_cpu to 1 to avoid context switching, and I tried running separate Scrapy crawlers from the command line; all of them together still give the same result, an aggregate of around 200 items per minute. I can see from the Scrapy logs that the aggregate number of web pages hit increases almost linearly, but the scraped items per minute stays stuck at 200.

Any tips? Has anybody come across this before? Have I missed a setting somewhere?

Much appreciated,
Daniel

(Also asked on Stack Overflow: http://stackoverflow.com/questions/33595986/scrapy-scrpyd-cant-process-more-than-200-items-per-minute)
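P.S. For reference, this is roughly the shape of the configuration I've been tweaking. The values below are illustrative, not my exact numbers, which varied between runs:

    # settings.py (Scrapy side) -- illustrative sketch
    CONCURRENT_REQUESTS = 64             # global request concurrency cap
    CONCURRENT_REQUESTS_PER_DOMAIN = 32  # per-domain request cap
    DOWNLOAD_DELAY = 0                   # no artificial throttling
    AUTOTHROTTLE_ENABLED = False         # rule out AutoThrottle capping throughput

    # scrapyd.conf (scrapyd side)
    [scrapyd]
    max_proc = 0             # 0 = derive the limit from max_proc_per_cpu
    max_proc_per_cpu = 1     # lowered from the default 4 to avoid context switching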
No matter what I do, I can't seem to top ~200 processed items per minute (according to scrapy logs). I tried playing around with scrapyd conccurency settings, nothing helped, tried playing around with scrapyd max_proc_per_cpu(lowered to 1 to avoid context switch), tried to run separate scrapy crawlers from command line, still, all of them together give the same results of an aggregate amount of around 200 items. I can see from scrapy logs that the aggregate amount of web pages hit is increasing almost linearly but the scraped items per minute seems stuck at 200. Any tips? has anybody come across this before? Have i missed a setting somewhere? Much appreciated, Daniel. *also asked on stackoverflow.com http://stackoverflow.com/questions/33595986/scrapy-scrpyd-cant-process-more-than-200-items-per-minute -- You received this message because you are subscribed to the Google Groups "scrapy-users" group. To unsubscribe from this group and stop receiving emails from it, send an email to scrapy-users+unsubscr...@googlegroups.com. To post to this group, send email to scrapy-users@googlegroups.com. Visit this group at http://groups.google.com/group/scrapy-users. For more options, visit https://groups.google.com/d/optout.