Hi, I am trying to understand whether it is possible to read with parallel requests when using the /vsis3/ virtual file system. I am currently experimenting with a regular GeoTIFF file and have not yet found a way to read at speeds beyond the bandwidth of a single S3 read.
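
To illustrate what is meant by "parallel requests" here: the access pattern in question is one large read split into several single-range GETs that are in flight at the same time. The following is only a minimal sketch of that pattern using the Python standard library, not a description of how GDAL implements it. The names `split_ranges`, `http_fetch_range` and `parallel_read` are hypothetical, and the URL is assumed to be a public or pre-signed HTTPS URL of the object.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple
import urllib.request


def split_ranges(offset: int, length: int, chunk: int) -> List[Tuple[int, int]]:
    """Split [offset, offset + length) into inclusive (first, last) byte ranges."""
    end = offset + length
    return [(start, min(start + chunk, end) - 1) for start in range(offset, end, chunk)]


def http_fetch_range(url: str, first: int, last: int) -> bytes:
    """Issue one GET carrying a single byte range in its Range header."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={first}-{last}"})
    with urllib.request.urlopen(req) as resp:
        # Assumes the server honours the Range header (HTTP 206 response).
        return resp.read()


def parallel_read(
    fetch: Callable[[str, int, int], bytes],
    url: str,
    offset: int,
    length: int,
    chunk: int = 1 << 20,
    workers: int = 8,
) -> bytes:
    """Fetch the chunks concurrently and reassemble them in order."""
    ranges = split_ranges(offset, length, chunk)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the order of the input ranges.
        parts = pool.map(lambda r: fetch(url, r[0], r[1]), ranges)
        return b"".join(parts)
```

Calling `parallel_read(http_fetch_range, url, 0, 64 << 20)` would request the first 64 MiB as 1 MiB chunks over up to 8 concurrent connections. The fetch function is passed in as a parameter so the splitting and reassembly logic can be checked without network access.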
I am no expert in HTTP communication, so I would be very happy if you pointed out any misconceptions. My thought process is: as S3 does not support multiple ranges per request (see https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html), I assume that GDAL_HTTP_MULTIPLEX would not be an option.

The old documentation (https://trac.osgeo.org/gdal/wiki/ConfigOptions#GDAL_HTTP_MULTIRANGE) made me hopeful that maybe I can convince GDAL to generate multi-range requests, which it could then send in parallel as single-range requests via several HTTP connections.

> (GDAL >= 2.3) Can be set to SINGLE_GET, SERIAL or YES. Defaults to YES.
> Controls how ReadMultiRange() requests emitted by the GeoTIFF driver are
> satisfied. SINGLE_GET means that several ranges will be expressed in the
> Range header of a single GET requests, which is not supported by a majority
> of servers (including AWS S3 or Google GCS). SERIAL means that each range
> will be requested sequentially. YES means that each range will be requested
> in parallel, using HTTP/2 multiplexing or several HTTP connections.

(The option seems to still exist, but uses PARALLEL and SERIAL now. Found here: https://github.com/OSGeo/gdal/blob/8633aa6f0c57d38b768059ac5dc137531421b9d7/port/cpl_vsil_curl.cpp#L3405)

I tried playing around with GDAL_HTTP_VERSION and GDAL_HTTP_MULTIRANGE, but they do not seem to change the parallelism (at least I do not seem to be able to get beyond ~20Mb/s, even with optimal reading conditions).

Is it currently possible at all to get parallel reads from S3? If so, what settings would I need?

Thanks so much,
Christian

_______________________________________________
gdal-dev mailing list
https://lists.osgeo.org/mailman/listinfo/gdal-dev
