Hi everyone,

I’m looking for some clarification (and potentially a small spec update)
regarding the Geospatial Physical Types documentation -
https://parquet.apache.org/docs/file-format/types/geospatial/, specifically
the CRS Customization section.

1) The Confusion

Currently, the spec states that custom CRS values should follow the
`type:identifier` format, where `type` is either `srid` or `projjson`
(e.g., `srid:4326` or `projjson:property_name`). The spec also defines the
default CRS as `OGC:CRS84`.

Depending on how the specification is read, a reader may conclude that the
only valid CRS definitions are strings of the form `srid:<some number>` or
`projjson:<property name>`, which implies that the default `OGC:CRS84` does
not adhere to the rules defined in the customization section. This creates
confusion for implementers: should the CRS string always be parsed as a
strict "custom" format that requires the `srid:` or `projjson:` prefix?
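
To make the ambiguity concrete, here is a minimal Python sketch of what a
reader following the strict interpretation might implement. The regex and the
`is_valid_strict` helper are hypothetical and only illustrate that reading;
they are not taken from the spec or from any existing implementation.

```python
import re

# Assumption: the strict reading accepts only `srid:<number>` or
# `projjson:<property name>`, matched case-sensitively.
STRICT_CRS = re.compile(r"srid:\d+|projjson:.+")


def is_valid_strict(crs: str) -> bool:
    """Return True if `crs` matches the strict 'custom only' reading."""
    return STRICT_CRS.fullmatch(crs) is not None


for value in ("srid:4326", "projjson:property_name", "OGC:CRS84"):
    print(value, is_valid_strict(value))
```

Under this reading the spec's own default, `OGC:CRS84`, is rejected.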

2) The Suggestion

I suggest we update the language to be explicit about the allowed CRS
formats, broken down like this:
   - Standard CRS: any string from a known authority in the format
`<authority>:<identifier>` (e.g., `EPSG:4326`, `OGC:CRS84`, `ESRI:102100`)
is accepted.
   - Custom CRS: a string in the format `type:identifier`, where:
         - `srid:1234`: The definition resides in a local/database spatial
reference table.
         - `projjson:key`: The definition is stored in Parquet file/table
metadata.

This would validate `OGC:CRS84` as a first-class string while providing a
clear "escape hatch" for custom definitions.
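
As a rough sketch of how a reader could apply the proposed rules (Python; the
`CrsRef` type and `classify_crs` function are hypothetical names, and the
sketch assumes case-sensitive prefixes, a numeric `srid` identifier, and no
lookup against an actual list of known authorities):

```python
from typing import NamedTuple

# Assumption: these two prefixes are the only "custom" types.
CUSTOM_TYPES = {"srid", "projjson"}


class CrsRef(NamedTuple):
    kind: str        # "custom" or "standard"
    prefix: str      # custom type (`srid`, `projjson`) or authority (`OGC`, ...)
    identifier: str  # SRID, metadata key, or authority code


def classify_crs(value: str) -> CrsRef:
    """Split a CRS string on the first ':' and classify it."""
    prefix, sep, identifier = value.partition(":")
    if not sep or not prefix or not identifier:
        raise ValueError(f"expected '<prefix>:<identifier>', got {value!r}")
    if prefix in CUSTOM_TYPES:
        # Assumption: an SRID is a plain ASCII decimal number.
        if prefix == "srid" and not (identifier.isascii() and identifier.isdigit()):
            raise ValueError(f"srid identifier must be numeric, got {identifier!r}")
        return CrsRef("custom", prefix, identifier)
    # Anything else is treated as `<authority>:<identifier>`; a real reader
    # would likely also check the authority against a known list.
    return CrsRef("standard", prefix, identifier)


for value in ("OGC:CRS84", "EPSG:4326", "srid:1234", "projjson:key"):
    print(value, "->", classify_crs(value))
```

With this split, `OGC:CRS84` is classified as a standard CRS rather than
rejected, while `srid:` and `projjson:` keep their custom meaning.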

What are your thoughts?

Kind regards,
Milan
