Hello,

This might be useful to those who store thumbnails/images on S3.
I noticed that the current implementation of S3FilesStore issues
blocking boto requests to check whether images already exist in the S3
bucket. This check runs in a Twisted thread pool to avoid blocking the
main thread, but that pool is capped at something like 20 threads.
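
To illustrate why that cap matters, here is a minimal sketch using only the
Python standard library. It is not Scrapy's or boto's code: blocking_check
and async_check are hypothetical stand-ins for an S3 existence check, the
50 ms latency is an assumed figure, and asyncio stands in for Twisted's
non-blocking I/O.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.05   # assumed round-trip time of one existence check (seconds)
POOL_SIZE = 20   # the "something like 20" cap on the thread pool
N_CHECKS = 100   # number of images to check


def blocking_check(key):
    # Hypothetical stand-in for a blocking request: holds a thread while waiting.
    time.sleep(LATENCY)
    return key


async def async_check(key):
    # Hypothetical stand-in for a non-blocking request: no thread is held.
    await asyncio.sleep(LATENCY)
    return key


def run_threaded():
    # At most POOL_SIZE checks are in flight at once, so the total time is
    # at least (N_CHECKS / POOL_SIZE) * LATENCY.
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        results = list(pool.map(blocking_check, range(N_CHECKS)))
    return results, time.monotonic() - start


def run_async():
    # All checks wait concurrently on the event loop; no pool cap applies.
    async def main():
        return await asyncio.gather(*(async_check(k) for k in range(N_CHECKS)))

    start = time.monotonic()
    results = asyncio.run(main())
    return results, time.monotonic() - start


if __name__ == "__main__":
    _, threaded = run_threaded()
    _, non_blocking = run_async()
    print("thread pool: %.2fs, non-blocking: %.2fs" % (threaded, non_blocking))
```

With these assumed numbers the thread pool needs at least five rounds of
50 ms, while the non-blocking version waits on all checks at once. Real
gains depend on network latency and on whatever else limits the crawl.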

As a quick experiment I replaced boto with a Twisted library (txaws, for
reads only) and saw an immediate 2x throughput increase on the same crawl.
The code is available here:
https://github.com/Curbside/scrapy/commit/2b544df2bfb347de9963fed4f3546da19ca3cc8f

The txaws library is somewhat outdated and I couldn't get the upload part
working. If someone is interested in updating it or making it work natively,
or has an alternative implementation, I'd be interested to hear about it.

Cheers
Denis

-- 
You received this message because you are subscribed to the Google Groups 
"scrapy-users" group.