It depends on your use-case, but in general:

1. Downloading multiple files

If you want to download multiple files / objects, you can parallelize the
process, either by downloading each object in a separate thread or process,
or by using a thread or process pool. If you want to speed things up further
and reduce the thread / process overhead, you should also have a look at
gevent (http://www.gevent.org/). That's the approach I use in file_syncer,
where a common case is performing multiple independent operations
(downloading / uploading files) in parallel -
https://github.com/Kami/python-file-syncer/blob/master/file_syncer/syncer.py#L143
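For example, here is a rough (and untested) sketch of the pool approach
using the stdlib thread pool (the futures backport on Python 2). The bucket
name, object names and credentials are all made up, and each worker builds
its own driver since driver instances aren't guaranteed to be thread safe:

    from concurrent.futures import ThreadPoolExecutor

    from libcloud.storage.types import Provider
    from libcloud.storage.providers import get_driver

    cls = get_driver(Provider.S3)

    def download(name):
        # One driver (and therefore one connection) per worker.
        driver = cls('api key', 'api secret')
        container = driver.get_container(container_name='my-bucket')
        obj = container.get_object(object_name=name)
        obj.download(destination_path='/tmp/' + name,
                     overwrite_existing=True)

    names = ['file-1.bin', 'file-2.bin', 'file-3.bin']

    with ThreadPoolExecutor(max_workers=10) as pool:
        # Consume the iterator so exceptions raised in workers surface.
        list(pool.map(download, names))

If you go the gevent route instead, the shape is basically the same with
gevent.pool.Pool - just make sure the monkey patching runs before libcloud
is imported so the underlying HTTP calls become cooperative:

    from gevent import monkey
    monkey.patch_all()  # must run before libcloud / httplib are imported

    from gevent.pool import Pool

    pool = Pool(10)
    pool.map(download, names)  # same download() function as above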
2. Downloading a single file when the container and object ID are known
in advance

If you know the container and object ID in advance, you can avoid two HTTP
requests (get_container, get_object) by manually instantiating the Container
and Object classes with the known IDs. There are some examples of how to do
that at https://libcloud.readthedocs.org/en/latest/other/working-with-oo-apis.html
and a quick sketch at the bottom of this mail, below the quoted message.

In this case, using gevent wouldn't really speed things up much, since you
are only issuing a single HTTP request (unless an object is composed of
multiple chunks and the provider allows you to retrieve the chunks
independently...).

On Tue, Sep 2, 2014 at 5:30 PM, Chris Richards <ch...@infiniteio.com> wrote:
> Howdy. I've noticed a variance in the download time of a file depending on
> the method of download, and I'm hoping to shave off overhead. I'm using the
> standard S3 provider. The stats I present are consistent between my office
> and my home within +/-100 ms. In shortened form:
>
> Timing via driver.get_container().get_object().download()
> get_container: 431.4596652984619 ms
> get_object: 808.0205917358398 ms
> download: 8257.043838500977 ms
> Complete, downloaded 8.15 MB
>
> Timing via driver.get_object().download()
> get_object: 811.8221759796143 ms
> download: 4801.661729812622 ms
> Complete, downloaded 8.15 MB
>
> In the first case, it appears that getting the container has significant
> overhead and should be avoided if possible--which I can do--and trims
> 400-500 ms per download (for small files, this is significant). Is my
> observation and conclusion correct?
>
> In the second case, I want to examine the .get_object() requirement to
> download a file. This adds another significant overhead, on the order of
> 700-900 ms. Is there a way to bypass this? I have many small files where
> the .get_object() time exceeds the .download() time!
>
> import std.newbie.disclaimer
>
> Thanks!
> Chris
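And here is the quick sketch of the approach from 2. that I mentioned above
(again untested and with made-up names). One caveat: if I remember
correctly, the base driver compares the number of bytes downloaded against
obj.size, so pass the real object size rather than a dummy value:

    from libcloud.storage.types import Provider
    from libcloud.storage.providers import get_driver
    from libcloud.storage.base import Container, Object

    cls = get_driver(Provider.S3)
    driver = cls('api key', 'api secret')

    # No HTTP requests are issued here - the instances are built locally
    # from the IDs you already know.
    container = Container(name='my-bucket', extra={}, driver=driver)
    obj = Object(name='my-file.bin', size=8546942, hash=None, extra={},
                 meta_data=None, container=container, driver=driver)

    # Only this call hits the network (a single GET).
    obj.download(destination_path='/tmp/my-file.bin',
                 overwrite_existing=True)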