Hi,

last week the openEO project [1] started, in which we aim to develop an interface (API) to cloud-based processing of Earth observation (EO) imagery. For this project, GDAL has been the blueprint for successfully integrating a diverse landscape of file formats [2]. The project has allocated a budget for subcontracting some GDAL development.
During discussions, it became clear that for practically everyone involved, GDAL plays an important role, be it for ingesting images or for processing them on the fly. A key requirement in all cases seems to be doing something useful with time series of EO images. Here, GDAL's current inability to report the time associated with datasets, subdatasets and/or bands was noted as an area with high potential for extending GDAL. Currently, parsing this from the metadata strings is possible but messy, ad hoc and error prone, and has to be done by each client for every driver.

My proposal is to augment the GDAL interface to raster data with two methods, GetStartTime() and GetEndTime(), which operate (at least) on bands and report the start and end times of the data acquisition using a simple interface, e.g. similar to what poFeature->GetFieldAsDateTime [3] does (returning, IIRC, either GMT or local time zone). Drivers should fill these fields (which might be equal), or else a flag should indicate that the time is missing.

## Start and end time

Although most snapshot-based products may have identical start and end times, many others don't; a lot of derived (e.g. climate) products give daily or monthly averages, where start and end time differ.

## Multiple time stamps

Datasets might have many time stamps, e.g. some related to the processing steps that a dataset has undergone. Other datasets, e.g. forecast data, may have two relevant times (two-dimensional time): the time at which the forecast was made (t0), and the forecast times the bands refer to (e.g. t0+6h, t0+12h, t0+18h, t0+24h etc.). For both cases, I believe there is a "default time": the time of observation or prediction a band refers to. Access to other time aspects could be obtained through tags, as in GetStartTime(..., reference = "TIME_OF_REPROCESSING"), which might be driver dependent.
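To make the semantics sketched above a bit more concrete, here is a rough illustration in Python. It is not GDAL API: the class `BandTimes`, the method names `get_start_time`/`get_end_time` and the tags `DEFAULT` and `FORECAST_REFERENCE_TIME` are all hypothetical. It only shows the idea of a default interval per band, optional tagged intervals, and a "missing" result (here `None`) when a driver could not fill in a time.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Tuple

# Hypothetical names throughout; nothing here is existing GDAL API.
DEFAULT_REFERENCE = "DEFAULT"

Interval = Tuple[datetime, datetime]  # (start, end), both timezone-aware


@dataclass
class BandTimes:
    """Time intervals attached to one band, keyed by a reference tag."""

    intervals: Dict[str, Interval] = field(default_factory=dict)

    def get_start_time(self, reference: str = DEFAULT_REFERENCE) -> Optional[datetime]:
        # None plays the role of the "time is missing" flag.
        interval = self.intervals.get(reference)
        return interval[0] if interval is not None else None

    def get_end_time(self, reference: str = DEFAULT_REFERENCE) -> Optional[datetime]:
        interval = self.intervals.get(reference)
        return interval[1] if interval is not None else None


# Snapshot product: start and end time are equal.
snapshot = datetime(2017, 3, 14, 10, 40, 11, 26000, tzinfo=timezone.utc)
snapshot_band = BandTimes({DEFAULT_REFERENCE: (snapshot, snapshot)})

# Monthly average: start and end time differ.
monthly_band = BandTimes({
    DEFAULT_REFERENCE: (
        datetime(2017, 3, 1, tzinfo=timezone.utc),
        datetime(2017, 4, 1, tzinfo=timezone.utc),
    )
})

# Forecast band: the default time is the time the band refers to (t0+6h),
# the time the forecast was made (t0) is reachable through a tag.
t0 = datetime(2017, 3, 14, 0, 0, tzinfo=timezone.utc)
forecast_band = BandTimes({
    DEFAULT_REFERENCE: (t0 + timedelta(hours=6), t0 + timedelta(hours=6)),
    "FORECAST_REFERENCE_TIME": (t0, t0),
})

# A band for which the driver found no time information at all.
untimed_band = BandTimes()
```

With this layout a client that only cares about "the" time of a band never needs to know about tags, while driver-specific extras stay reachable by name.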
## NetCDF/udunits time

Most file formats will have time strings like 2017-03-14T10:40:11.026Z that should be pretty straightforward to handle. NetCDF, however, uses time encoded in a form understood by udunits2 [4]; for example, band metadata may have

    time#units=days since 1978-01-01 00:00:00
    NETCDF_DIM_time=1339

which refers to 1339 days after 1978-01-01 00:00:00. Since the units can be set very flexibly, for such data I think one should either:

(i) have GDAL link (optionally) to udunits2 and, if it is present, use the library to convert to OFTDateTime; or
(ii) not attempt this, but return the time units as a string and the time as a double, and leave the conversion to the client.

I would like to know:

- whether there is support for this idea in general (time in GDAL),
- whether the approach sketched above makes sense,
- what I've overlooked, and what the potential road blocks are.

[1] http://openeo.org/
[2] http://r-spatial.org/2016/11/29/openeo.html
[3] http://www.gdal.org/classOGRFeature.html#a6c5d2444407b07e07b79863c42ee7a49
[4] https://www.unidata.ucar.edu/software/udunits/

--
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster
Heisenbergstraße 2, 48149 Münster, Germany; +49 251 83 33081
Journal of Statistical Software: http://www.jstatsoft.org/
Computers & Geosciences: http://elsevier.com/locate/cageo/
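As a footnote to the NetCDF/udunits question: a minimal sketch in Python of what a client would have to do itself under option (ii), given the units string and the numeric value. The function name `decode_time` is hypothetical, only the fixed-length units seconds, minutes, hours and days are handled, and the origin is assumed to be UTC when no time zone is given. udunits2 accepts far more than this, which is exactly the argument for option (i).

```python
import re
from datetime import datetime, timedelta, timezone

# Only fixed-length units are handled; udunits2 understands many more.
_SECONDS_PER_UNIT = {
    "seconds": 1.0,
    "minutes": 60.0,
    "hours": 3600.0,
    "days": 86400.0,
}

# Matches e.g. "days since 1978-01-01 00:00:00" or "hours since 2000-01-01".
_UNITS_RE = re.compile(
    r"^\s*(\w+)\s+since\s+(\d{4}-\d{2}-\d{2})(?:[ T](\d{2}:\d{2}:\d{2}))?\s*$"
)


def decode_time(units: str, value: float) -> datetime:
    """Convert a '<unit> since <origin>' string plus an offset to a datetime.

    Assumption: the origin is UTC, since the string carries no time zone.
    """
    match = _UNITS_RE.match(units)
    if match is None:
        raise ValueError(f"cannot parse time units: {units!r}")
    unit, date_part, time_part = match.groups()
    if unit not in _SECONDS_PER_UNIT:
        raise ValueError(f"unsupported time unit: {unit!r}")
    origin = datetime.strptime(
        f"{date_part} {time_part or '00:00:00'}", "%Y-%m-%d %H:%M:%S"
    ).replace(tzinfo=timezone.utc)
    return origin + timedelta(seconds=value * _SECONDS_PER_UNIT[unit])


# The band metadata example from above.
band_time = decode_time("days since 1978-01-01 00:00:00", 1339)
```

For the metadata example given above, `band_time` comes out as 1981-09-01 00:00:00 UTC. Units such as months or years, or non-standard calendars, would raise `ValueError` here.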
