Re: [gdal-dev] Formalizing GDAL "file name" syntax in an RFC?

2023-09-15 Thread Even Rouault


> I agree that there is an untidy legacy. And it's probably not worth
> formalizing the marginal filenames and connection strings. How about
> an RFC and formal syntax only for hierarchical datasets then? It seems
> like this is the direction the industry is growing.

Hierarchical gridded datasets are well covered by the multidimensional
API, which has facilities like
GDALGroupOpenMDArrayFromFullname(hRootGroup,
"/group/subgroup/.../array_name"). What you provide to GDALOpen() when
using the API is the plain filename (or ZARR:{filename}, since it is
sometimes hard for the Zarr driver to recognize datasets otherwise).


> Yes, I think a formal syntax for /vsi filenames would be useful. It's
> almost done already, right?


There are some differences among the /vsi file systems in how options are provided:

/vsisubfile/,

/vsisubfile/_,

/vsizip//

/vsizip/{}/   (the outer { } are real characters; an example of this is
/vsizip/{my.apk}/file.bin)


/vsicurl?==

I've a PR sitting at https://github.com/OSGeo/gdal/pull/8351 to propose 
/vsicached?==
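
As a rough illustration of the two /vsizip forms listed above, here is a small Python sketch that builds such names as plain strings. The helper name vsizip_path is hypothetical and not part of GDAL; it does no escaping or validation and only shows how the braces delimit an archive whose name does not end in .zip.

```python
def vsizip_path(archive: str, inner: str, explicit: bool = False) -> str:
    """Build a /vsizip/ name (hypothetical helper, plain string handling only)."""
    inner = inner.lstrip("/")
    if explicit:
        # Braces are literal characters that delimit the archive name.
        return "/vsizip/{" + archive + "}/" + inner
    return "/vsizip/" + archive + "/" + inner


print(vsizip_path("my.apk", "file.bin", explicit=True))
print(vsizip_path("data.zip", "file.bin"))
```

The first call reproduces the /vsizip/{my.apk}/file.bin example from the message; data.zip is a made-up archive name.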


--
http://www.spatialys.com
My software is free, but my time generally not.

___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev


Re: [gdal-dev] Formalizing GDAL "file name" syntax in an RFC?

2023-09-15 Thread Sean Gillies
Hi Alessandro,

On Fri, Sep 15, 2023 at 8:31 AM ElPaso wrote:

> Hi Sean,
>
>
> I totally agree that it would be useful to have a formally defined
> syntax to describe data sources. We have the same problem with QGIS
> QgsDataSourceUri, and we are treating the URIs as a private
> implementation detail, but in QGIS we have a GUI to set the data source
> strings that partially mitigates the issue.
>
>
> We discussed a few times what the best format to encode a data source
> could be, and URL encoding is probably a good candidate because it's
> well known and well supported by libraries.
>
>
> Maybe something like:
>
>
> :///
>
> mssql://username:password@hostname:port/?table=table_name&arg1=arg1...
>
> gpkg:///path/to.gpkg?table=table_name&arg1=arg1...
>
> postgis://username:password@hostname:port/?table=table_name&arg1=arg1...
>
> shapefile:///path/to.shp/?arg1=arg1...
>

Yes, something like this is my dream :) I started working on such a
proposal a while back and then was derailed by a job change. I might try to
dust it off and start writing on it again.

I think the "vrt://" URI scheme was a mistake, though, and that we
shouldn't extend it to dozens of other formats. At least not without
consulting with other software communities. Ideally these schemes would be
universal across the internet.

-- 
Sean Gillies


Re: [gdal-dev] Formalizing GDAL "file name" syntax in an RFC?

2023-09-15 Thread Sean Gillies
Even,

I agree that there is an untidy legacy. And it's probably not worth
formalizing the marginal filenames and connection strings. How about an RFC
and formal syntax only for hierarchical datasets then? It seems like this
is the direction the industry is growing.

Yes, I think a formal syntax for /vsi filenames would be useful. It's
almost done already, right? Only in prose, not ABNF (or whatever), but
that's a good start.

-- 
Sean Gillies


Re: [gdal-dev] Formalizing GDAL "file name" syntax in an RFC?

2023-09-15 Thread ElPaso

Hi Sean,


I totally agree that it would be useful to have a formally defined
syntax to describe data sources. We have the same problem with QGIS
QgsDataSourceUri, and we are treating the URIs as a private
implementation detail, but in QGIS we have a GUI to set the data source
strings that partially mitigates the issue.


We discussed a few times what the best format to encode a data source
could be, and URL encoding is probably a good candidate because it's
well known and well supported by libraries.



Maybe something like:


:///

mssql://username:password@hostname:port/?table=table_name&arg1=arg1...

gpkg:///path/to.gpkg?table=table_name&arg1=arg1...

postgis://username:password@hostname:port/?table=table_name&arg1=arg1...

shapefile:///path/to.shp/?arg1=arg1...
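
One attraction of URL-style strings like the ones above is that generic libraries can already take them apart. The following is only a sketch using Python's standard urllib.parse; the describe helper is hypothetical, and the numeric port 5432 is an assumed value standing in for the "port" placeholder.

```python
from urllib.parse import urlsplit, parse_qs


def describe(uri: str) -> dict:
    """Split a URL-style data source string into its parts."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        # parse_qs returns a list of values for each key.
        "options": parse_qs(parts.query),
    }


print(describe("postgis://username:password@hostname:5432/?table=table_name"))
print(describe("gpkg:///path/to.gpkg?table=table_name"))
```

For the file-based gpkg example, the authority part is empty, so user, host and port come back as None and the file location ends up in the path component.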


--
Alessandro Pasotti
w3: www.itopen.it



Re: [gdal-dev] Formalizing GDAL "file name" syntax in an RFC?

2023-09-14 Thread Even Rouault

Sean,

It is far from obvious to me that we can find a universal pattern that
would fit existing and future use cases. If we did find a universal
syntax, there would be the problem of deciding what to do with code that
doesn't match it: support only the new way (breaking external code), or
support both old and new ways (additional complexity in our code base).


How would we deal with the existing practices, which are very diverse?
Examples:


- True filenames (most of the drivers)

- Database-based drivers:

PG:"dbname='databasename' host='addr' port='5432' user='x' password='y'"
(PostGIS vector)

PG:"[host=''] [port=''] [dbname='' [user=''] [password=''] [schema=''] [table=''] [column=''] [where=''] [mode=''] [outdb_resolution='']"
(PostGIS raster, consistent with the previous one)

MYSQL:dbname,user="userid",password="password",host=x,port=y

OCI:username/password@host_name:port_number/service_name:MY_SCHEMA.MY_VIEW

georaster:{,/}{,@}[db],[schema.][table],[column],[where]

georaster:{,/}{,@}[db],,
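
To give a feel for what client code has to do with the first PG: form above, here is a minimal Python sketch that turns it into a dictionary. It is not how the GDAL drivers parse the string; parse_pg is a hypothetical helper and it relies on shlex to strip the single quotes around values.

```python
import shlex


def parse_pg(name: str) -> dict:
    """Parse a 'PG:key=value ...' string into a dict (illustrative only)."""
    prefix, _, rest = name.partition(":")
    if prefix.upper() != "PG":
        raise ValueError("not a PG: connection string")
    # Drop optional surrounding double quotes, then let shlex handle 'quoted' values.
    rest = rest.strip().strip('"')
    return dict(token.split("=", 1) for token in shlex.split(rest))


print(parse_pg("PG:\"dbname='databasename' host='addr' port='5432' user='x' password='y'\""))
```

The MYSQL:, OCI: and georaster: forms each use different separators, so a helper like this would not carry over to them.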

- Web services (some of them also accept plain filenames that are .xml 
files describing the service and parameters)


WMS:http://demo.opengeo.org/geoserver/gwc/service/wms?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap&LAYERS=og%3Abugsites&SRS=EPSG:900913&BBOX=-1.15841845090625E7,5479006.186718751,-1.1505912992109375E7,5557277.703671876&FORMAT=image/png&TILESIZE=256&OVERVIEWCOUNT=25&MINRESOLUTION=0.0046653459640220&TILED=true
(note the mix of pure WMS GetMap request query parameters + GDAL
specific properties TILESIZE, OVERVIEWCOUNT, MINRESOLUTION)


WCS:http://194.66.252.155/cgi-bin/BGS_EMODnet_bathymetry/ows?VERSION=1.1.0=BGS_EMODNET_CentralMed-MCol
WFS:http://www2.dmsolutions.ca/cgi-bin/mswfs_gmap 
OAPIF:https://www.ldproxy.nrw.de/rest/services/kataster
WMTS:http://maps.wien.gv.at/wmts/1.0.0/WMTSCapabilities.xml,layer=lb
( layer= is GDAL specific)
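
The mix of service parameters and GDAL-specific properties in the WMS: form can be illustrated with a short Python sketch. Everything here is an assumption for illustration: split_wms is a hypothetical helper, the example.com URL is made up, and the set of GDAL-specific keys contains only the three named in the note above.

```python
from urllib.parse import urlsplit, parse_qsl

# Keys named in the message as GDAL specific; the real driver may accept others.
GDAL_KEYS = {"TILESIZE", "OVERVIEWCOUNT", "MINRESOLUTION"}


def split_wms(name: str):
    """Return (wms_params, gdal_params) for a 'WMS:<url>' string."""
    if not name.startswith("WMS:"):
        raise ValueError("not a WMS: string")
    query = urlsplit(name[len("WMS:"):]).query
    wms, gdal = {}, {}
    for key, value in parse_qsl(query):
        # Route each pair to the GDAL or the WMS dictionary.
        (gdal if key.upper() in GDAL_KEYS else wms)[key] = value
    return wms, gdal


wms, gdal = split_wms(
    "WMS:http://example.com/wms?SERVICE=WMS&REQUEST=GetMap&TILESIZE=256&OVERVIEWCOUNT=25"
)
print(wms)
print(gdal)
```

A formal syntax would have to say which of the two groups a key belongs to, since nothing in the string itself distinguishes them.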


- Raster subdatasets (whether to surround filename with double quotes is 
driver specific)


NITF_IM:{image_number}:{filename}
NITF_TOC_ENTRY:{product_type}_{chartname}_{scale}_{somenumber}_{anothernumber}:GNCJNCN/rpf/a.toc
ECRG_TOC_ENTRY:{product_name}:{disk_name}:{scale}:{filename}
GTIFF:{directory_number}:{filename}
GTIFF:off:{directory_offset}:{filename}
GPKG:{filename}:{tablename}
netCDF:{filename}:{variable_name}
HDF5:{filename}:{dataset_path}
ZARR:{filename}  (mostly when filename is /vsicurl/ and we don't have a 
ReadDirectory() API)

ZARR:{filename}:{path_to_array}
ZARR:{filename}:{path_to_array}:{third_dim_index}[:{fourth_dim_index}[:{fifth_dim_index}...]]
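
The subdataset forms above mostly follow a PREFIX:filename:component shape, but the colon separator is ambiguous when the filename itself contains one. The Python sketch below is a naive splitter written only to show that ambiguity; split_subdataset is a hypothetical name, the example paths are made up, and it does not reflect how any GDAL driver actually parses these strings.

```python
def split_subdataset(name: str):
    """Split 'PREFIX:filename:component' (naive sketch, not GDAL's own parser)."""
    prefix, sep, rest = name.partition(":")
    if not sep:
        raise ValueError("no driver prefix found")
    if rest.startswith('"'):
        # Quoted filename: everything up to the closing quote is the filename.
        end = rest.index('"', 1)
        filename, tail = rest[1:end], rest[end + 1:]
        component = tail[1:] if tail.startswith(":") else None
    else:
        # Unquoted: assume the last colon separates the component, which
        # breaks for filenames that themselves contain a colon.
        filename, sep, component = rest.rpartition(":")
        if not sep:
            filename, component = rest, None
    return prefix, filename, component


print(split_subdataset('netCDF:"/vsicurl/http://example.com/a.nc":temperature'))
print(split_subdataset("GPKG:/data/my.gpkg:tablename"))
print(split_subdataset("ZARR:/data/store.zarr"))
```

Without the double quotes, the netCDF example would be split at the wrong colon, which is one reason quoting rules matter in any formal grammar.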

- Other syntax

vrt:// connection string
(https://gdal.org/drivers/raster/vrt.html#vrt-connection-string),
e.g. vrt://my.tif?bands=3,2,1
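
For comparison with the other forms, the vrt:// example can be split with a few lines of Python. This is a sketch under the assumption that the part before "?" is the source dataset name and the rest is an ordinary query string; parse_vrt is a hypothetical helper, not a GDAL function.

```python
from urllib.parse import parse_qs


def parse_vrt(name: str):
    """Split 'vrt://source?key=value&...' into (source, options)."""
    if not name.startswith("vrt://"):
        raise ValueError("not a vrt:// connection string")
    source, _, query = name[len("vrt://"):].partition("?")
    # Keep a single value per key; 'bands' is split into a list of strings.
    options = {k: v[0] for k, v in parse_qs(query).items()}
    if "bands" in options:
        options["bands"] = options["bands"].split(",")
    return source, options


print(parse_vrt("vrt://my.tif?bands=3,2,1"))
```

The prefix is stripped by hand here because a generic URL parser would treat my.tif as a host name rather than a file.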


Do you also have /vsi syntax in mind?

Even



--
http://www.spatialys.com
My software is free, but my time generally not.


[gdal-dev] Formalizing GDAL "file name" syntax in an RFC?

2023-09-14 Thread Sean Gillies
Hi all,

In https://github.com/OSGeo/gdal/pull/8155#issuecomment-1704923263 I think
I see (for the first time?) the beginning of a specification of the syntax
for GDAL "file names". I think it would be helpful if there was an RFC for
this.

I'm sure a lot of applications construct GDAL file names without much
understanding of what's correct or incorrect. A formal spec could help make
it more likely that anyone can construct a valid filename on the first try.

A stretch goal for the RFC could be to come up with a syntax that is
sufficiently general that authors of new format drivers don't have to
create their own new idiosyncratic file names.

-- 
Sean Gillies