Hi,

I am trying to understand whether it is possible to read with parallel requests 
when using the /vsis3/ virtual file system. I am currently experimenting with a 
regular GTiff file and have not yet found a way to read faster than the 
bandwidth of a single S3 read.

I am no expert in HTTP communication, so I would be very happy if you pointed 
out any misconceptions. My thought process is:

As S3 does not support multiple ranges per request (see 
https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html), I assume 
that GDAL_HTTP_MULTIPLEX would not be an option.

The old documentation 
(https://trac.osgeo.org/gdal/wiki/ConfigOptions#GDAL_HTTP_MULTIRANGE) made me 
hopeful that maybe I can convince GDAL to generate multi-range requests, which 
it could then send in parallel as single-range requests via several HTTP 
connections.
> (GDAL >= 2.3) Can be set to SINGLE_GET, SERIAL or YES. Defaults to YES. 
> Controls how ReadMultiRange() requests emitted by the GeoTIFF driver are 
> satisfied. SINGLE_GET means that several ranges will be expressed in the 
> Range header of a single GET requests, which is not supported by a majority 
> of servers (including AWS S3 or Google GCS). SERIAL means that each range 
> will be requested sequentially. YES means that each range will be requested 
> in parallel, using HTTP/2 multiplexing or several HTTP connections.

(The option seems to still exist, but uses PARALLEL and SERIAL now; found here: 
https://github.com/OSGeo/gdal/blob/8633aa6f0c57d38b768059ac5dc137531421b9d7/port/cpl_vsil_curl.cpp#L3405)
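To illustrate what I mean by sending a multi-range read as parallel single-range requests, here is a minimal Python sketch. It is not how GDAL implements this; the helper names (split_ranges, range_header, read_parallel) and the fetch callback are hypothetical, and fetch stands in for whatever performs one GET with a single Range header on its own connection.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_ranges(offset: int, size: int, chunk: int) -> List[Tuple[int, int]]:
    """Split [offset, offset + size) into inclusive (first, last) byte ranges."""
    end = offset + size
    return [(s, min(s + chunk, end) - 1) for s in range(offset, end, chunk)]


def range_header(first: int, last: int) -> str:
    """Format one single-range value for the HTTP Range header."""
    return f"bytes={first}-{last}"


def read_parallel(fetch: Callable[[str], bytes], offset: int, size: int,
                  chunk: int, workers: int = 8) -> bytes:
    """Issue one single-range request per chunk and join the results.

    `fetch` is a hypothetical callback: it receives a Range header value
    and returns the bytes of that range (e.g. one GET per call).
    """
    headers = [range_header(a, b) for a, b in split_ranges(offset, size, chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so the parts join correctly.
        parts = list(pool.map(fetch, headers))
    return b"".join(parts)
```

Each chunk becomes its own request with exactly one range, which is the form S3 accepts; the parallelism comes only from running those requests concurrently.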

I tried playing around with GDAL_HTTP_VERSION and GDAL_HTTP_MULTIRANGE, but 
they don’t seem to change the parallelism (at least I don’t seem to be able to 
get beyond ~20Mb/s even with optimal reading conditions).
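For reference, this is roughly how such a test can be set up, as a sketch only. It relies on GDAL picking up configuration options from environment variables; the values "2" and "YES" are just examples, the bucket path is a placeholder, and the helper names gdal_env and run_read are hypothetical.

```python
import os
import subprocess
from typing import Dict


def gdal_env(http_version: str, multirange: str) -> Dict[str, str]:
    """Copy the current environment and add the GDAL options under test."""
    env = dict(os.environ)
    env["GDAL_HTTP_VERSION"] = http_version    # e.g. "1.1" or "2"
    env["GDAL_HTTP_MULTIRANGE"] = multirange   # e.g. "SERIAL" or "YES"
    env["CPL_DEBUG"] = "ON"                    # debug output shows the requests made
    return env


def run_read(src: str, dst: str, env: Dict[str, str]) -> int:
    """Run gdal_translate with the given environment; return its exit code."""
    return subprocess.run(["gdal_translate", src, dst], env=env).returncode


# Example (placeholder path, not a real bucket):
# run_read("/vsis3/my-bucket/my.tif", "/tmp/out.tif", gdal_env("2", "YES"))
```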

Is it currently possible at all to get parallel reads from S3? If so, what 
settings would I need?

Thanks so much,
Christian 



