To close this out on the GDAL side: judging by the rasterio logging and the
source code that emits those log lines, it appears that rasterio eagerly
determines the `nodata` value for every band, even when only one is
requested. That in turn forces GDAL to read through most of the file. I'll
continue chasing this on the rasterio side separately.
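
To illustrate the difference in access pattern, here is a minimal sketch. It
is not rasterio's actual code. It assumes, as the logs suggest, that asking a
GRIB band for its nodata value makes the driver load that band's message.
`CountingDataset` and `CountingBand` are hypothetical stand-ins that only
mimic the `RasterCount`, `GetRasterBand` and `GetNoDataValue` names from
osgeo.gdal and record which bands get touched; the band count of 700 is
illustrative, not taken from the file.

```python
class CountingBand:
    """Hypothetical stand-in for a gdal.Band; records nodata lookups."""

    def __init__(self, number, touched):
        self.number = number
        self.touched = touched

    def GetNoDataValue(self):
        # Assumption: for GRIB, this is the point where the band's
        # message would have to be fetched and parsed.
        self.touched.append(self.number)
        return None


class CountingDataset:
    """Hypothetical stand-in for a gdal.Dataset with `count` bands."""

    def __init__(self, count):
        self.RasterCount = count
        self.touched = []

    def GetRasterBand(self, number):
        return CountingBand(number, self.touched)


def nodata_eager(ds):
    """Look up nodata for every band (the suspected pattern)."""
    return [
        ds.GetRasterBand(i).GetNoDataValue()
        for i in range(1, ds.RasterCount + 1)
    ]


def nodata_lazy(ds, band):
    """Look up nodata only for the band that was actually requested."""
    return ds.GetRasterBand(band).GetNoDataValue()


eager_ds = CountingDataset(count=700)  # illustrative band count
nodata_eager(eager_ds)

lazy_ds = CountingDataset(count=700)
nodata_lazy(lazy_ds, 636)

print(len(eager_ds.touched), "bands touched by the eager lookup")
print(lazy_ds.touched, "touched by the lazy lookup")
```

If the assumption holds, the eager variant touches every band's message,
while the lazy one touches only band 636.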

Cheers,
Daniel

On Fri, 10 Oct 2025 at 11:47, Daniel Evans <[email protected]>
wrote:

> Hmm, yes - I see it jumping straight to the relevant band when run via a
> locally compiled GDAL 3.11.4 using your code, but rasterio built on top of
> that same GDAL 3.11.4 pages through the whole file. It seems that either my
> Python environment/code or rasterio itself configures something that
> changes the behaviour.
>
> Thoughts on where to look welcome, but it doesn't appear to be a
> GDAL-level problem.
>
> Cheers,
> Daniel
>
> On Thu, 9 Oct 2025 at 16:36, Daniel Baston <[email protected]> wrote:
>
>> FWIW, the following snippet is working with gdal master:
>>
>> from osgeo import gdal
>>
>> with gdal.config_options({
>>     "AWS_NO_SIGN_REQUEST": "True",
>>     "CPL_DEBUG": "True",
>>     "CPL_CURL_VERBOSE": "True",
>> }):
>>     ds = gdal.Open("/vsis3/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012")
>>     band = ds.GetRasterBand(636)
>>     x = band.ReadAsArray()
>>     print(x.mean())
>>
>> Dan
>>
>> On Thu, Oct 9, 2025 at 10:27 AM Daniel Evans via gdal-dev <
>> [email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I am attempting to read a single band from a NOAA GRIB2 file on S3, with
>>> an associated .idx file. The GRIB2 driver documentation states that the
>>> existence of such an .idx file allows a file to be opened without
>>> reading all bands.
>>>
>>> However, looking at the CPL_CURL_VERBOSE=True logs, it appears that GDAL
>>> is still paging through the file from the start until reaching the
>>> requested band.
>>>
>>> GDAL identifies the existence of the .idx file:
>>>
>>> DEBUG:CPLE_None in GRIB: Reading inventories from sidecar file /vsis3/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012.idx
>>> DEBUG:CPLE_None in S3: Downloading 0-41215 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012.idx)...
>>>
>>> But it then appears to scan the file from the start until it has passed
>>> the requested band:
>>>
>>> DEBUG:CPLE_None in S3: Downloading 16384-999423 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012)...
>>> DEBUG:CPLE_None in S3: Downloading 999424-2965503 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012)...
>>> DEBUG:CPLE_None in S3: Downloading 2965504-6897663 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012)...
>>> [...]
>>> DEBUG:S3: Downloading 449626112-450461695 (https://s3.amazonaws.com/noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012)...
>>>
>>> Band 636 is listed in the .idx with offset 443333308, Band 637 having
>>> offset 444174665. The total filesize is 545533166.
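
For scale, a minimal sketch of the arithmetic behind those numbers. It
assumes the .idx offsets mark the first byte of each message and that
messages are stored back to back, so band 636 ends one byte before band 637
starts. The constant and function names are illustrative only.

```python
# Values quoted in this thread.
IDX_OFFSETS = {636: 443333308, 637: 444174665}  # band -> start offset
FILE_SIZE = 545533166
LAST_LOGGED_BYTE = 450461695  # end of the last "Downloading" range shown


def band_byte_range(offsets, band):
    """Inclusive (first, last) byte of `band`.

    Assumes the next band's offset marks the end of this band's message.
    """
    start = offsets[band]
    end = offsets[band + 1] - 1
    return start, end


start, end = band_byte_range(IDX_OFFSETS, 636)
band_bytes = end - start + 1

# The HTTP Range header a single targeted request would carry.
print(f"Range: bytes={start}-{end}")
print(f"band 636: {band_bytes} bytes ({band_bytes / FILE_SIZE:.2%} of the file)")
print(f"logged downloads reach {(LAST_LOGGED_BYTE + 1) / FILE_SIZE:.1%} of the file")
```

Under those assumptions band 636 occupies 841357 bytes, a small fraction of
what the logged downloads cover.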
>>>
>>>
>>> Do I need to do something extra to trigger GDAL to read only the
>>> requested band based on the .idx? Are some GRIB/.idx files not able to be
>>> loaded in this way?
>>>
>>> I am running via rasterio v1.4.3 which is using GDAL v3.9.3. My code is
>>> below, the file is in a public NOAA-hosted bucket.
>>>
>>> Cheers,
>>> Daniel
>>>
>>> ###
>>>
>>> import logging
>>> import rasterio
>>>
>>> logging.basicConfig(format="%(levelname)s:%(message)s",
>>> level=logging.DEBUG)
>>>
>>> with rasterio.Env(USE_IDX=True, AWS_VIRTUAL_HOSTING=False,
>>> CPL_DEBUG=True, CPL_CURL_VERBOSE=True):
>>>     with rasterio.open("s3://noaa-gfs-bdp-pds/gfs.20250918/00/atmos/gfs.t00z.pgrb2.0p25.f012") as ds:
>>>         band = ds.read(636)
>>>
>>> ###
>>>
>>> _______________________________________________
>>> gdal-dev mailing list
>>> [email protected]
>>> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>>>
>>