jornfranke commented on issue #2586:
URL: https://github.com/apache/iceberg/issues/2586#issuecomment-1707233081

   I think it is a good start. I would like your opinion on the following:
   * From my point of view we need to support some spatial metadata, 
especially the CRS. I propose to reuse the same metadata as in the geoparquet 
definition: https://geoparquet.org/releases/v1.0.0-rc.1/
   * What do you propose as an underlying storage format? You mention three: 
geoparquet, spatialparquet and Geolake parquet and you implemented geoparquet, 
geoparquet (bbox), Geolake parquet. I propose to reduce this to one. At the 
moment geoparquet appears to have the largest community and is also supported 
in other systems (e.g. geopandas), which may make it easier to use in the 
Iceberg ecosystem (e.g. https://py.iceberg.apache.org/)
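
   As a rough illustration of the spatial metadata meant in the first point, 
the sketch below builds a geoparquet-style `geo` metadata document for one 
geometry column. The field names follow the geoparquet definition as I 
understand it; the helper name `build_geo_metadata`, the default version 
string and the abbreviated CRS object are assumptions made for the example, 
not part of any existing Iceberg or Sedona API.

```python
import json


def build_geo_metadata(column, geometry_types, crs=None, bbox=None, version="1.0.0"):
    """Return a JSON string shaped like geoparquet file-level metadata.

    Hypothetical helper for illustration only.
    """
    col_meta = {
        "encoding": "WKB",                 # geometries stored as well-known binary
        "geometry_types": geometry_types,  # e.g. ["Point"]; may be an empty list
    }
    if crs is not None:
        col_meta["crs"] = crs              # expected to be a PROJJSON object
    if bbox is not None:
        col_meta["bbox"] = list(bbox)      # [xmin, ymin, xmax, ymax]
    return json.dumps(
        {
            "version": version,
            "primary_column": column,
            "columns": {column: col_meta},
        }
    )


# Abbreviated stand-in for a CRS; a real file would carry a full PROJJSON object.
example_crs = {"type": "GeographicCRS", "name": "WGS 84"}

geo_json = build_geo_metadata(
    "geometry", ["Point"], crs=example_crs, bbox=(-180.0, -90.0, 180.0, 90.0)
)
print(geo_json)
```

   In geoparquet this JSON string is stored under the file-level key `geo` in 
the Parquet metadata, so an Iceberg integration would need to decide where the 
same information lives at table level.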
   
   Generally, I propose to go with a roadmap that starts with a simple first 
release, to make it easier for people from the Iceberg project to review and 
to get initial feedback from the Iceberg community, e.g.:
   * First release: Storage backend geoparquet (and also include geoparquet 
metadata). Supported Ecosystem: Apache Sedona - Spark
   * Second release: Add XZ partitioning. Supported Ecosystem: Apache Sedona - 
Spark and Flink, and PyIceberg.
   * Third release: Include raster data (here the challenge is to split a big 
raster into multiple tiles that are transparently read as one, cf. 
https://sedona.apache.org/1.4.1/tutorial/storing-blobs-in-parquet/)...?
   
   This is just an example; the details can be changed.
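
   To make the raster challenge from the third release more concrete, here is 
a minimal sketch of splitting a 2D raster into fixed-size tiles that carry 
their offsets, and of reading them back as one raster. It uses plain Python 
lists; the function names and the tile layout are assumptions for 
illustration and do not reflect how Sedona or Iceberg store rasters.

```python
def split_into_tiles(raster, tile_h, tile_w):
    """Cut a 2D raster (list of equal-length rows) into tiles with offsets."""
    rows = len(raster)
    cols = len(raster[0]) if rows else 0
    tiles = []
    for r0 in range(0, rows, tile_h):
        for c0 in range(0, cols, tile_w):
            # Edge tiles may be smaller than tile_h x tile_w.
            block = [row[c0:c0 + tile_w] for row in raster[r0:r0 + tile_h]]
            tiles.append({"row_off": r0, "col_off": c0, "data": block})
    return tiles


def read_as_one(tiles):
    """Reassemble tiles (in any order) into a single 2D raster."""
    rows = max(t["row_off"] + len(t["data"]) for t in tiles)
    cols = max(t["col_off"] + len(t["data"][0]) for t in tiles)
    out = [[None] * cols for _ in range(rows)]
    for t in tiles:
        for dr, row in enumerate(t["data"]):
            c0 = t["col_off"]
            out[t["row_off"] + dr][c0:c0 + len(row)] = row
    return out


# A 5 x 7 raster cut into tiles of at most 2 x 3 cells.
raster = [[r * 10 + c for c in range(7)] for r in range(5)]
tiles = split_into_tiles(raster, tile_h=2, tile_w=3)
restored = read_as_one(list(reversed(tiles)))
print(len(tiles), restored == raster)
```

   Each tile would then be one row (or blob) in the table, and the offsets are 
what allow a reader to present the tiles transparently as a single raster.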


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]