Thomas, thanks for taking the time to put this together! I've always wanted geospatial support in the format, but thought that it would be best to have an expert design and build it with us so we don't get it wrong.
I think Walaa is right about the approach. We want to use partition transforms to do the heavy lifting of finding the right files for a query. That means we'd need a clear but generic definition of geospatial objects in the data, along with more specific attributes. At a high level, I think that's probably done by storing each object using a standard envelope definition (bbox?) that we can use in partition transforms, and then a WKB column for the actual object.

What do you think?

Ryan

On Thu, Oct 27, 2022 at 4:03 AM Walaa Eldin Moustafa <wa.moust...@gmail.com> wrote:

>
> Hi Thomas,
>
> It sounds like what you are trying to achieve is to provide a custom partition
> function? There is some discussion here
> https://github.com/apache/iceberg/issues/1482. I guess supporting geometry
> through this framework makes more sense since it does not require extending
> the Iceberg type system, yet is general enough to support other applications.
>
> Thanks,
> Walaa.
>
> On Thu, Oct 27, 2022 at 12:33 AM Thomas Fredriksen
> <thomas.fredriksen@oceandata.earth> wrote:
>>
>> Hello everyone,
>>
>> I am working with big geospatial data and trying to handle very large tables in object
>> storage. Iceberg appears to be the ideal solution but unfortunately does not
>> appear to support geometry columns.
>>
>> The way that Iceberg is structured, it appears to be a good fit with the
>> GeoParquet standard
>> (https://github.com/opengeospatial/geoparquet/blob/main/format-specs/geoparquet.md),
>> so I created a pull request where I attempt to add this support:
>> https://github.com/apache/iceberg/pull/6062
>>
>> The PR deviates from GeoParquet in the CRS field of the column metadata.
>> GeoParquet requires the CRS to be defined as a PROJJSON JSON object, while
>> the PR simply asks the user to specify an EPSG ID, where EPSG:4326 (WGS84 -
>> latitude/longitude) is considered default.
>>
>> I would love feedback on the PR and welcome the discussion on whether
>> geospatial/geometry belongs in the iceberg standard.
>>
>> Thomas Li Fredriksen
>> Lead Solution Architect
>>
>> p +47 452 21 055
>>
>>
>> –––––
>>
>> www.hubocean.earth

--
Ryan Blue
Tabular
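
For illustration only, here is a minimal sketch of the envelope-plus-WKB idea from the reply at the top of this message: each object carries a bounding box that can be compared cheaply, while the full geometry is kept as WKB. Nothing here is an Iceberg API; `Envelope`, `envelope_of`, `intersects`, and `linestring_to_wkb` are hypothetical names, and how the envelope would actually feed a partition transform is not settled in this thread.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Axis-aligned bounding box (hypothetical helper, not an Iceberg type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def envelope_of(coords):
    """Compute the bounding box of a sequence of (x, y) pairs."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return Envelope(min(xs), min(ys), max(xs), max(ys))


def intersects(a, b):
    """True if two envelopes overlap or touch on both axes."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def linestring_to_wkb(coords):
    """Encode (x, y) pairs as a little-endian WKB LineString (geometry type 2)."""
    # 1 byte for byte order (1 = little-endian), uint32 type, uint32 point count
    header = struct.pack("<BII", 1, 2, len(coords))
    body = b"".join(struct.pack("<dd", x, y) for x, y in coords)
    return header + body


# One row: the envelope is what a reader would compare against a query
# window; the WKB holds the actual object.
coords = [(10.0, 59.0), (11.5, 60.25)]
row = {"envelope": envelope_of(coords), "geom_wkb": linestring_to_wkb(coords)}

query_window = Envelope(11.0, 60.0, 12.0, 61.0)
candidate = intersects(row["envelope"], query_window)
```

The envelope test only narrows down candidates; an exact check against the decoded WKB geometry would still be needed afterwards.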