Re: [gdal-dev] Passing open options along dataset name in a string ?
Hi Jukka,

On Tue, Nov 17, 2020 at 2:57 AM jratike80 <jukka.rahko...@maanmittauslaitos.fi> wrote:

> Hi,
>
> I have done some helpdesk work within the GDAL community and I know well that the open options and config options are confusing. I also know that they exist for a reason, but a simplified and uniform way to use them would be nice.

I'm glad to read that you're interested! I believe there is tension between simple and uniform and we'll need wide input to find a good balance. I've been using QGIS today for the first time in a while and I think that application could benefit from this as well. A "URL" for datasets could remove some of the complexity of QGIS dialog boxes for creating or configuring connections.

> Some comments on comments:
>
> >> gdalinfo my.tif -oo GEOREF_SOURCES=WORLDFILE,PAM
>
> > Ideally this would be baked into the format, but, yes, I think we've got a bead on dataset open options.
>
> I don't know how it could be baked into the format. The option gives the user a way to override wrong GeoTIFF georeferencing with a world file, for example.

Yes.

> >> gdalinfo BAG:"data/test_vr.bag":supergrid:0:1
>
> > DRIVER:"file":something
> > Right. This will require some work because of multiple colons. Though I've never seen BAG driver data in the wild. Is this a real live format?
>
> As far as I know BAG is the HDF of bathymetry and widely used in that context.
>
> >> gdalinfo data/test_vr.bag -oo MODE=RESAMPLED_GRID -oo SUPERGRIDS_MASK=YES
> >> gdalinfo HDF5:"d:\foo.he5"://HDFEOS/SWATHS/foo/bar
>
> > HDF5 driver, filename using Windows drive, and UNC path within it. This is marginal, right?
>
> The part beginning with // is not a UNC path but the name of the subdataset within the HDF5 file (https://gdal.org/drivers/raster/hdf5.html). Not more marginal than HDF5 itself.

Thank you for explaining.

> >> ogrinfo OCI:warmerda/password@.dreadfest
>
> > Wat?
>
> The text has just been turned into an email link because of the @ sign that belongs to the Oracle connection string: "username" / "password" @ "the name of the Oracle database as it appears in the tnsnames.ora file". Let's see if the formatting happens again when I send this from Nabble:
>
> OCI:warmerda/passw...@gdal800.dreadfest.com abc.shp

Thanks.

> -Jukka Rahkonen-
Re: [gdal-dev] Passing open options along dataset name in a string ?
> The variation of subdataset syntax among drivers is a bug, let's try to fix this.
>
> It seems to me that the internet way to address subdatasets would be to use a # URL fragment. But since most of our formats and the servers that serve files of these formats are not aware, we may have to come up with something different. We may need to consider making subdatasets a layer opening option?

Hum, I'm a bit confused. Isn't the purpose to have a single string covering both the subdataset specification and the open options? You could have use cases where you open a "container" dataset with its name and open options (not selecting a particular subdataset), and GetMetadata("SUBDATASETS") should then return, for each specific subdataset, a GDN that carries the same open options plus the additions that select that subdataset.

Let's say "gdalinfo my.hdf5 -oo FOO=BAR" would return a list of subdatasets:

gdn:HDF5:my.hdf5+encoding_FOO=BAR+encoding_VARIABLE=temperature
gdn:HDF5:my.hdf5+encoding_FOO=BAR+encoding_VARIABLE=pressure

Some additions to Jukka's answer:

> > gdalinfo GTIFF_DIR:0:d:\my.tif
>
> WTF is this? :)

https://gdal.org/drivers/raster/gtiff.html :
"""
Multi-page TIFF files are exposed as subdatasets. On opening, a subdataset name is GTIFF_DIR:{index}:filename.tif, where {index} starts at 1.
"""
(ok, so my example was wrong :-) it should have been GTIFF_DIR:1:d:\my.tif)

> > gdalinfo EEDAI:my/asset
> > gdalinfo EEDAI: -oo ASSET=my/asset
> > gdalinfo EEDAI:my/asset:band1, band2
> > gdalinfo EEDAI: -oo ASSET=my/asset -oo BANDS=band1,band2
>
> Never seen these.

Cf https://gdal.org/drivers/raster/eedai.html

This driver shows a case where we handle both worlds. The specification of a dataset can be in the dataset name ("EEDAI:my/asset") or given as a dataset name ("EEDAI:") + open options (ASSET=my/asset). This was my attempt to use open options as a way of being more explicit about how to specify the subparts of a subdataset, but perhaps this wasn't a good idea. But this is a case where the border between what is the dataset/subdataset name and what is an open option is fuzzy. "gdalinfo EEDAI:" without any option will not work: you can't reasonably list all datasets hosted on Earth Engine... But in some circumstances (when not all bands have the same georeferencing, resolution, CRS or image dimensions), whether you open with the dataset name ("EEDAI:my/asset") or with a dataset name ("EEDAI:") + open options (ASSET=my/asset), you may get a list of subdatasets.

> > GDALOpen() is not even aware that HDF5:bla means that the dataset will be
> > recognized by the HDF5 driver
>
> Wait what?

GDALOpen() just iterates over drivers and passes the dataset name and open options to them until one says "yes, that's for me". The use of "DRIVER_NAME:bla" is mostly a convention, but in no way a core mechanism. If you use DRIVER_NAME:bla for a driver that doesn't recognize the DRIVER_NAME: prefix, that won't work.

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com
___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev
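The probing loop described in the last paragraph of this message can be sketched in plain Python. This is only an illustration of the idea, not GDAL code: the two driver classes, their `identify()` method and the `open_dataset()` function are hypothetical names, and the matching rules are deliberately simplistic.

```python
class Hdf5LikeDriver:
    """Hypothetical driver that happens to understand an 'HDF5:' prefix."""
    name = "HDF5"

    def identify(self, dataset_name, open_options):
        return dataset_name.startswith("HDF5:") or dataset_name.endswith(".he5")


class GTiffLikeDriver:
    """Hypothetical driver that knows 'GTIFF_DIR:' but no 'GTiff:' prefix."""
    name = "GTiff"

    def identify(self, dataset_name, open_options):
        return dataset_name.startswith("GTIFF_DIR:") or dataset_name.endswith(".tif")


DRIVERS = [Hdf5LikeDriver(), GTiffLikeDriver()]


def open_dataset(dataset_name, open_options=()):
    """Hand the name and options to each driver until one accepts them.

    The opener itself never interprets the 'DRIVER_NAME:' prefix; only the
    drivers do. Returns the accepting driver's name, or None.
    """
    for driver in DRIVERS:
        if driver.identify(dataset_name, open_options):
            return driver.name
    return None


print(open_dataset(r'HDF5:"d:\foo.he5"://HDFEOS/SWATHS/foo/bar'))
print(open_dataset("my.tif", ["GEOREF_SOURCES=WORLDFILE,PAM"]))
print(open_dataset("GTiff:bla"))  # no driver in this sketch recognizes this prefix
```

In this sketch the last call finds no driver, which mirrors the point that a `DRIVER_NAME:` prefix only works when the driver itself chooses to recognize it.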
Re: [gdal-dev] Passing open options along dataset name in a string ?
Hi,

I have done some helpdesk work within the GDAL community and I know well that the open options and config options are confusing. I also know that they exist for a reason, but a simplified and uniform way to use them would be nice.

Some comments on comments:

>> gdalinfo my.tif -oo GEOREF_SOURCES=WORLDFILE,PAM

> Ideally this would be baked into the format, but, yes, I think we've got a bead on dataset open options.

I don't know how it could be baked into the format. The option gives the user a way to override wrong GeoTIFF georeferencing with a world file, for example.

>> gdalinfo BAG:"data/test_vr.bag":supergrid:0:1

> DRIVER:"file":something
> Right. This will require some work because of multiple colons. Though I've never seen BAG driver data in the wild. Is this a real live format?

As far as I know BAG is the HDF of bathymetry and widely used in that context.

>> gdalinfo data/test_vr.bag -oo MODE=RESAMPLED_GRID -oo SUPERGRIDS_MASK=YES
>> gdalinfo HDF5:"d:\foo.he5"://HDFEOS/SWATHS/foo/bar

> HDF5 driver, filename using Windows drive, and UNC path within it. This is marginal, right?

The part beginning with // is not a UNC path but the name of the subdataset within the HDF5 file (https://gdal.org/drivers/raster/hdf5.html). Not more marginal than HDF5 itself.

>> ogrinfo OCI:warmerda/password@.dreadfest

> Wat?

The text has just been turned into an email link because of the @ sign that belongs to the Oracle connection string: "username" / "password" @ "the name of the Oracle database as it appears in the tnsnames.ora file". Let's see if the formatting happens again when I send this from Nabble:

OCI:warmerda/passw...@gdal800.dreadfest.com abc.shp

-Jukka Rahkonen-
Re: [gdal-dev] Passing open options along dataset name in a string ?
Hi Even,

On Wed, Nov 4, 2020 at 9:01 AM Even Rouault wrote:

> > > Another particularity we have in GDAL is that the dataset name might be almost anything. Most of the time, it is a regular file path, or some /vsi path. But sometimes, it can be JSON content (the GeoJSON driver accepts the content to be directly provided as the dataset name), or XML (VRT, WMS drivers).
> > > We have also the subdataset syntax "HDF5:foo.hdf:my_variable"
> >
> > Could VRT XML and JSON be exempted? We already have a way to embed open options in the XML.
>
> If the gdn: mechanism is a new possibility offered that doesn't exclude existing ones (otherwise that would be a pretty big breaking change), we could possibly exempt the odd cases I mentioned (or have some quoting/escaping rules to enable that payload to be seen as a file), which generally don't need a "permanent" way of referring to the dataset like gdn: would offer, since this is content often generated programmatically or retrieved dynamically.
>
> Covering subdataset would be a more important use case. Something that would have to be decided if the way we express subdatasets would be somehow standardized or if it would be a black-box string for the gdn: encapsulation.
> For a black-box approach, we would have to define some escaping/quoting rules to avoid any potential issue with separators of the gdn syntax. If we decide that the subdataset syntax is part of what is standardized by GDN that would be a more challenging exercise, because the subdataset syntax varies from driver to driver.

The variation of subdataset syntax among drivers is a bug, let's try to fix this.

It seems to me that the internet way to address subdatasets would be to use a # URL fragment. But since most of our formats and the servers that serve files of these formats are not aware, we may have to come up with something different. We may need to consider making subdatasets a layer opening option?

> Depending on how we design things, that might impact between:
> - just GDALOpen() generic code if GDALOpen() decodes the gdn: string to decompose it into 'classic' dataset names and open options
> - all drivers if the gdn: string would be passed to each GDALDriver::pfnOpen() implementation
> - intermediate situation if we decide to drop (at least for future drivers) per-driver subdataset syntax (which has deficiencies as the quoting rules to separate the filename from the non-filename component vary from driver to driver, and are most of the time not defined) to come up with something more standardized
>
> To help brainstorming, a non-exhaustive overview of a few situations mixing driver prefixing, subdataset syntax and open options:
>
> gdalinfo my.tif

Yes. We have to handle bare paths to local dataset files.

> gdalinfo my.tif -oo GEOREF_SOURCES=WORLDFILE,PAM

Ideally this would be baked into the format, but, yes, I think we've got a bead on dataset open options.

> gdalinfo GTIFF_DIR:0:d:\my.tif

WTF is this? :)

> gdalinfo EEDAI:my/asset
> gdalinfo EEDAI: -oo ASSET=my/asset
> gdalinfo EEDAI:my/asset:band1, band2
> gdalinfo EEDAI: -oo ASSET=my/asset -oo BANDS=band1,band2

Never seen these.

> gdalinfo BAG:"data/test_vr.bag":supergrid:0:1

DRIVER:"file":something

Right. This will require some work because of multiple colons. Though I've never seen BAG driver data in the wild. Is this a real live format?

> gdalinfo data/test_vr.bag -oo MODE=RESAMPLED_GRID -oo SUPERGRIDS_MASK=YES
> gdalinfo HDF5:"d:\foo.he5"://HDFEOS/SWATHS/foo/bar

HDF5 driver, filename using Windows drive, and UNC path within it. This is marginal, right?

> gdalinfo netCDF:"/vsicurl/http://example.com/my.nc":my_var

This looks less complicated than some of the examples above.

> ogrinfo "PG:dbname=testdb user=foo"
> ogrinfo "mySQL:testdb,user=foo"

These seem like they could be driver specific, but generalized key-value parameters.

> ogrinfo OCI:warmerda/passw...@gdal800.dreadfest.com

Wat?

> GDALOpen() is not even aware that HDF5:bla means that the dataset will be
> recognized by the HDF5 driver

Wait what?

--
Sean Gillies
Re: [gdal-dev] Passing open options along dataset name in a string ?
Even,

On Wed, Nov 4, 2020 at 3:40 AM Even Rouault wrote:

> Sean,
>
> What GDN stands for: GDAL Dataset Name ?

Yes. I just made that up on the spot. Think of it as a GDAL or FOSS4G specific namespace. Until now, GDAL has been using symbols like WFS: and HDF5: in the global namespace, which causes problems for interoperability.

> > The URN or GDN version might look something like the thing below, using ?+
> > and ?= [3] to identify vsi and driver option sections
> >
> > gdn:curl:csv: example.com/foo.csv?a=1=2?+max_retry=5?=autodetect_type=yes&keep_geom_columns=no
>
> The http or https protocol should be captured too.
>
> > Bringing a little more order to how we name and address datasets was on my todo list at the start of the year, but then 2020 went into a spiral. I don't think rasterio's "zip+s3" etc approach is the best.
>
> I see fsspec has a syntax for chaining filesystems in
> https://filesystem-spec.readthedocs.io/en/latest/features.html#url-chaining

Yes, I recognized this as being a different approach to a similar problem we have in GDAL: data files that can be accessed with different layers of optional protocols.

> Another particularity we have in GDAL is that the dataset name might be almost anything. Most of the time, it is a regular file path, or some /vsi path. But sometimes, it can be JSON content (the GeoJSON driver accepts the content to be directly provided as the dataset name), or XML (VRT, WMS drivers).
> We have also the subdataset syntax "HDF5:foo.hdf:my_variable"

Could VRT XML and JSON be exempted? We already have a way to embed open options in the XML.

--
Sean Gillies
Re: [gdal-dev] Passing open options along dataset name in a string ?
> > Another particularity we have in GDAL is that the dataset name might be almost anything. Most of the time, it is a regular file path, or some /vsi path. But sometimes, it can be JSON content (the GeoJSON driver accepts the content to be directly provided as the dataset name), or XML (VRT, WMS drivers).
> > We have also the subdataset syntax "HDF5:foo.hdf:my_variable"
>
> Could VRT XML and JSON be exempted? We already have a way to embed open options in the XML.

If the gdn: mechanism is a new possibility that doesn't exclude the existing ones (otherwise that would be a pretty big breaking change), we could possibly exempt the odd cases I mentioned (or have some quoting/escaping rules so that such a payload can be seen as a file). Those cases generally don't need a "permanent" way of referring to the dataset like gdn: would offer, since this is content that is often generated programmatically or retrieved dynamically.

Covering subdatasets would be a more important use case. We would have to decide whether the way we express subdatasets is somehow standardized, or whether it is a black-box string for the gdn: encapsulation. For a black-box approach, we would have to define some escaping/quoting rules to avoid any potential issue with the separators of the gdn syntax. If we decide that the subdataset syntax is part of what is standardized by GDN, that would be a more challenging exercise, because the subdataset syntax varies from driver to driver.

Depending on how we design things, the impact could be on:
- just the generic GDALOpen() code, if GDALOpen() decodes the gdn: string and decomposes it into 'classic' dataset names and open options
- all drivers, if the gdn: string is passed to each GDALDriver::pfnOpen() implementation
- something in between, if we decide to drop (at least for future drivers) the per-driver subdataset syntax (which has deficiencies, as the quoting rules to separate the filename from the non-filename component vary from driver to driver, and are most of the time not defined) and come up with something more standardized

To help brainstorming, a non-exhaustive overview of a few situations mixing driver prefixing, subdataset syntax and open options:

gdalinfo my.tif
gdalinfo my.tif -oo GEOREF_SOURCES=WORLDFILE,PAM
gdalinfo GTIFF_DIR:0:d:\my.tif
gdalinfo EEDAI:my/asset
gdalinfo EEDAI: -oo ASSET=my/asset
gdalinfo EEDAI:my/asset:band1, band2
gdalinfo EEDAI: -oo ASSET=my/asset -oo BANDS=band1,band2
gdalinfo BAG:"data/test_vr.bag":supergrid:0:1
gdalinfo data/test_vr.bag -oo MODE=RESAMPLED_GRID -oo SUPERGRIDS_MASK=YES
gdalinfo HDF5:"d:\foo.he5"://HDFEOS/SWATHS/foo/bar
gdalinfo netCDF:"/vsicurl/http://example.com/my.nc":my_var
ogrinfo "PG:dbname=testdb user=foo"
ogrinfo "mySQL:testdb,user=foo"
ogrinfo OCI:warmerda/passw...@gdal800.dreadfest.com

GDALOpen() is not even aware that HDF5:bla means that the dataset will be recognized by the HDF5 driver.

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com
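To make the remark about per-driver quoting rules concrete, here is a small Python sketch that applies one naive, driver-agnostic rule (split on every colon) to a few of the dataset names listed in this message. The `colon_fields()` helper is hypothetical and exists only for this illustration; GDAL does not parse names this way.

```python
# Dataset names copied from the overview in this message; raw strings are
# used only so that the Windows backslashes survive Python escaping.
NAMES = [
    r'GTIFF_DIR:0:d:\my.tif',
    r'BAG:"data/test_vr.bag":supergrid:0:1',
    r'HDF5:"d:\foo.he5"://HDFEOS/SWATHS/foo/bar',
    r'netCDF:"/vsicurl/http://example.com/my.nc":my_var',
]


def colon_fields(name):
    """Naive split on ':' with no knowledge of quoting or drive letters."""
    return name.split(":")


for name in NAMES:
    fields = colon_fields(name)
    # The field count differs per driver, and quoted file names, drive
    # letters and embedded URLs are all cut in the middle.
    print(len(fields), fields)
```

The quoted file names, the `d:` drive letter and the embedded `http://` URL are each broken apart, which is why either per-driver knowledge or a shared escaping rule would be needed.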
Re: [gdal-dev] Passing open options along dataset name in a string ?
Sean,

What does GDN stand for: GDAL Dataset Name?

> The URN or GDN version might look something like the thing below, using ?+
> and ?= [3] to identify vsi and driver option sections
>
> gdn:curl:csv: example.com/foo.csv?a=1=2?+max_retry=5?=autodetect_type=yes&keep_geom_columns=no

The http or https protocol should be captured too.

> Bringing a little more order to how we name and address datasets was on my todo list at the start of the year, but then 2020 went into a spiral. I don't think rasterio's "zip+s3" etc approach is the best.

I see fsspec has a syntax for chaining filesystems in https://filesystem-spec.readthedocs.io/en/latest/features.html#url-chaining

Another particularity we have in GDAL is that the dataset name might be almost anything. Most of the time, it is a regular file path, or some /vsi path. But sometimes, it can be JSON content (the GeoJSON driver accepts the content to be directly provided as the dataset name), or XML (VRT, WMS drivers). We also have the subdataset syntax "HDF5:foo.hdf:my_variable"

--
Spatialys - Geospatial professional services
http://www.spatialys.com
Re: [gdal-dev] Passing open options along dataset name in a string ?
Even,

On Mon, Nov 2, 2020 at 1:16 PM Even Rouault wrote:

> Sean,
>
> > We already have a way of passing "open" options for vsicurl:
> > https://gdal.org/user/virtual_file_systems.html#vsicurl-http-https-ftp-files-random-access.
> > What about reusing that conceptual framework and syntax?
> >
> > For example:
> >
> > "foo.csv?AUTODETECT_TYPE=YES&KEEP_GEOM_COLUMNS=NO"
>
> I actually considered that, but realized that things would get messy if you want to use that vsicurl syntax and open options...
>
> You would then have strings like
>
> /vsicurl?max_retry=5&url=http://example.com/foo.csv&AUTODETECT_TYPE=YES&KEEP_GEOM_COLUMNS=NO
>
> and the GDALOpen() logic would have to figure out what is the /vsicurl part and the open option part.
>
> Or we would have to URL-escape the "/vsicurl?max_retry=5&url=http://example.com/foo.csv" part to avoid using '?' and '&', like:
>
> /vsicurl%3Fmax_retry=5%26url=http://example.com/foo.csv?AUTODETECT_TYPE=YES&KEEP_GEOM_COLUMNS=NO
>
> Another issue is we have connection strings like "WFS:http://example.com/wfs?SERVICE=WFS&VERSION=2.0.0" (or actually just the "/vsicurl?max_retry=5&url=http://example.com/foo.csv" string mentioned above).
> GDALOpen() would then mis-interpret this as dataset name = "WFS:http://example.com/wfs" with open options SERVICE=WFS and VERSION=2.0.0

I see. I wish our data formats were more standard and less slippery and didn't need these open options. But it's true that some files are very different without the proper combination of opening options and there's a benefit to helping applications use the right combination.

I'm not a fan of the mix of JSON and not-JSON elements in the syntax you proposed. I think a good solution for naming datasets and including all the driver options and vsi options looks more like a URN [1] and I think we should write a GDAL RFC to standardize it. I also think that we should get some people outside of GDAL involved. Like folks from the Dask community, who might share some lessons learned from writing fsspec [2].

The URN or GDN version might look something like the thing below, using ?+ and ?= [3] to identify vsi and driver option sections

gdn:curl:csv: example.com/foo.csv?a=1=2?+max_retry=5?=autodetect_type=yes&keep_geom_columns=no

Bringing a little more order to how we name and address datasets was on my todo list at the start of the year, but then 2020 went into a spiral. I don't think rasterio's "zip+s3" etc approach is the best. We should start from scratch and come up with something excellent and expressive and broadly supported in GDAL, QGIS, rasterio, GeoPandas, GeoTrellis etc.

[1] https://en.wikipedia.org/wiki/Uniform_Resource_Name
[2] https://filesystem-spec.readthedocs.io/en/latest/index.html
[3] https://tools.ietf.org/html/rfc8141#section-2.3

--
Sean Gillies
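As a rough illustration of the "?+" and "?=" idea, the sketch below splits a gdn: string into its name part, a vsi option section and a driver option section. It relies on the ordering RFC 8141 gives for URNs (the "?+" component comes before the "?=" component). Everything else is an assumption made for this example: the gdn: scheme is only a proposal in this thread, the `split_gdn()` helper is hypothetical, and the sample string is a simplified variant of the one in the message.

```python
def _pairs(component):
    """Turn 'k1=v1&k2=v2' into a dict; an empty component gives {}."""
    return {
        key: value
        for key, _, value in (item.partition("=") for item in component.split("&") if item)
    }


def split_gdn(name):
    """Split 'gdn:<name>?+<vsi options>?=<driver options>' into three parts.

    Hypothetical helper: "?+" is read as the vsi option section and "?=" as
    the driver option section, as suggested in the message above.
    """
    if not name.lower().startswith("gdn:"):
        raise ValueError("not a gdn: name")
    body = name[len("gdn:"):]
    body, _, driver_part = body.partition("?=")  # "?=" section comes last
    body, _, vsi_part = body.partition("?+")     # "?+" section precedes it
    return body, _pairs(vsi_part), _pairs(driver_part)


# Simplified variant of the example in the message (assumed, not normative).
example = "gdn:curl:csv:example.com/foo.csv?+max_retry=5?=autodetect_type=yes&keep_geom_columns=no"
print(split_gdn(example))
```

Because the two markers are distinct from a plain "?", a query string that belongs to the resource itself can stay inside the name part without being mistaken for options.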
Re: [gdal-dev] Passing open options along dataset name in a string ?
Hi Even,

On Mon, Nov 2, 2020 at 3:10 AM Even Rouault wrote:

> Hi,
>
> I've heard interest in having the capability of passing a GDAL dataset name and its open options in a single string, since this is easier for storing.
>
> The syntax could be a JSON serialized string prefixed by GDAL_JSON: to avoid any ambiguity with drivers that would accept JSON as a connection string/dataset name.
>
> So something like:
>
> GDAL_JSON:{"dataset":"foo.csv","open_options":["AUTODETECT_TYPE=YES","KEEP_GEOM_COLUMNS=NO"]}
>
> An "allowed_drivers" member could also be added to reflect the corresponding argument of GDALOpenEx()
>
> GDALOpen()/GDALOpenEx() would parse this, and process that exactly as if it was called with the dataset name, open options and allowed drivers put in the dedicated C arguments. So no change in drivers, just in GDALOpenEx().
>
> If using that syntax, it wouldn't make sense to have both serialized options and options passed as C-argument together, so a warning would be emitted if that happened
>
> Thoughts ?
>
> Even

We already have a way of passing "open" options for vsicurl: https://gdal.org/user/virtual_file_systems.html#vsicurl-http-https-ftp-files-random-access. What about reusing that conceptual framework and syntax?

For example:

"foo.csv?AUTODETECT_TYPE=YES&KEEP_GEOM_COLUMNS=NO"

--
Sean Gillies
Re: [gdal-dev] Passing open options along dataset name in a string ?
Sean,

> We already have a way of passing "open" options for vsicurl:
> https://gdal.org/user/virtual_file_systems.html#vsicurl-http-https-ftp-files-random-access.
> What about reusing that conceptual framework and syntax?
>
> For example:
>
> "foo.csv?AUTODETECT_TYPE=YES&KEEP_GEOM_COLUMNS=NO"

I actually considered that, but realized that things would get messy if you want to use that vsicurl syntax and open options...

You would then have strings like

/vsicurl?max_retry=5&url=http://example.com/foo.csv&AUTODETECT_TYPE=YES&KEEP_GEOM_COLUMNS=NO

and the GDALOpen() logic would have to figure out what is the /vsicurl part and what is the open option part.

Or we would have to URL-escape the "/vsicurl?max_retry=5&url=http://example.com/foo.csv" part to avoid using '?' and '&', like:

/vsicurl%3Fmax_retry=5%26url=http://example.com/foo.csv?AUTODETECT_TYPE=YES&KEEP_GEOM_COLUMNS=NO

Another issue is that we have connection strings like "WFS:http://example.com/wfs?SERVICE=WFS&VERSION=2.0.0" (or actually just the "/vsicurl?max_retry=5&url=http://example.com/foo.csv" string mentioned above). GDALOpen() would then misinterpret this as dataset name = "WFS:http://example.com/wfs" with open options SERVICE=WFS and VERSION=2.0.0.

--
Spatialys - Geospatial professional services
http://www.spatialys.com
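The misinterpretation described at the end of this message is easy to reproduce with a few lines of Python. The `naive_split()` function is a hypothetical stand-in for "treat everything after the first '?' as open options"; it is the strategy being warned against, not something GDAL does.

```python
def naive_split(name):
    """Treat the text after the first '?' as '&'-separated open options."""
    dataset, sep, query = name.partition("?")
    options = query.split("&") if sep else []
    return dataset, options


# Intended use: a file name followed by open options.
print(naive_split("foo.csv?AUTODETECT_TYPE=YES&KEEP_GEOM_COLUMNS=NO"))

# A WFS connection string: its own query parameters are taken for open options.
print(naive_split("WFS:http://example.com/wfs?SERVICE=WFS&VERSION=2.0.0"))

# A /vsicurl name with its own options: the URL ends up among the "open options".
print(naive_split("/vsicurl?max_retry=5&url=http://example.com/foo.csv"))
```

The second and third calls show the two failure modes from the message: the dataset name is truncated, and parameters that belong to the connection string or to /vsicurl are reported as open options.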
[gdal-dev] Passing open options along dataset name in a string ?
Hi,

I've heard interest in being able to pass a GDAL dataset name and its open options in a single string, since that is easier to store.

The syntax could be a JSON-serialized string prefixed by GDAL_JSON: to avoid any ambiguity with drivers that accept JSON as a connection string or dataset name.

So something like:

GDAL_JSON:{"dataset":"foo.csv","open_options":["AUTODETECT_TYPE=YES","KEEP_GEOM_COLUMNS=NO"]}

An "allowed_drivers" member could also be added to mirror the corresponding argument of GDALOpenEx().

GDALOpen()/GDALOpenEx() would parse this and process it exactly as if it had been called with the dataset name, open options and allowed drivers passed in the dedicated C arguments. So no change in drivers, just in GDALOpenEx().

With that syntax, it wouldn't make sense to supply both serialized options and options passed as C arguments, so a warning would be emitted if that happened.

Thoughts?

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com
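The decoding step proposed in this message can be sketched in plain Python. This is a minimal illustration under stated assumptions, not GDAL code: `decode_gdal_json()` is a hypothetical helper, and letting the serialized options win when both sources are present is a choice made here (the message only says a warning would be emitted).

```python
import json
import warnings

PREFIX = "GDAL_JSON:"  # prefix proposed in the message above


def decode_gdal_json(name, open_options=None, allowed_drivers=None):
    """Return (dataset name, open options, allowed drivers).

    Names without the prefix pass through unchanged, so existing dataset
    names (including raw JSON given to the GeoJSON driver) are unaffected.
    """
    if not name.startswith(PREFIX):
        return name, list(open_options or []), list(allowed_drivers or [])
    payload = json.loads(name[len(PREFIX):])
    if open_options:
        # Assumption of this sketch: warn, then use the serialized options.
        warnings.warn("open options given both in the string and as arguments")
    return (
        payload["dataset"],
        list(payload.get("open_options", [])),
        list(payload.get("allowed_drivers", [])),
    )


serialized = (
    'GDAL_JSON:{"dataset":"foo.csv",'
    '"open_options":["AUTODETECT_TYPE=YES","KEEP_GEOM_COLUMNS=NO"]}'
)
print(decode_gdal_json(serialized))
print(decode_gdal_json("foo.csv", ["AUTODETECT_TYPE=YES"]))
```

The three returned values correspond to the dataset name, open options and allowed drivers arguments mentioned in the message, so a caller could forward them to the usual open call without any driver changes.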