Hi Dylan,

Thanks for the detailed review. I clarified these points in the proposal.

Q1: The core proposal of this FLIP is the native `GEOGRAPHY` type and its interoperability contract in Flink: SQL/planner type support, runtime representation, serialization/state, SQL Gateway/compiled plan support, and connector/format interoperability with Iceberg and Parquet. The FLIP does not intend for Flink core to replace dedicated geospatial projects such as Sedona. I added a clarification of the function scope to the proposal doc. The v1 function surface is intentionally small and is not meant to become a full geospatial function catalog. Constructor and conversion functions are included because they make the native type usable for interoperability. Richer geospatial processing, spatial joins, indexing, reprojection, and larger PostGIS/Sedona-like function coverage remain outside the scope of this FLIP.

Q2: Regarding Sedona, I added a clarification in the proposal doc that `GeographyData` is a Flink internal runtime representation, not a required computation API for Sedona. The stable interoperability path for Sedona and similar projects is ISO WKB via `GeographyData.toBytes()`, `ST_ASWKB`, and `ST_GEOGFROMWKB`. Sedona may still convert values into its own internal representation for computation and produce Flink `GEOGRAPHY` values back through WKB.

Q3: For display and casts, I clarified in the proposal that display rendering is separate from SQL casts. SQL Client, `TableResult.print()`, and SQL Gateway textual results should render `GEOGRAPHY` values as WKT, equivalent to `ST_ASTEXT`, for human-readable output. This display path does not imply an implicit cast to `STRING`. I also clarified that Java `toString()` should not be treated as the SQL conversion contract; explicit textual conversion remains `ST_ASTEXT`.

Q4: For SQL semantics, I added explicit wording for equality, grouping, and comparability.
The proposed v1 semantics are representation-based: equality and hashing are based on the stored/canonical WKB representation, not topological geospatial equality. `GROUP BY` and `DISTINCT` use the same equality/hash semantics. Topological equality, if exposed, is a separate geospatial predicate and should not be confused with SQL grouping semantics. `GEOGRAPHY` is not order-comparable, so ordering comparisons such as `<`, `<=`, `>`, `>=`, as well as `MIN` and `MAX`, are not defined for this type unless users explicitly convert the value to another representation, such as WKT or WKB.

For the minor issues:

- You are right about the JTS license. I corrected the license note. JTS would be included as a dependency (similar to how it’s included in Apache Sedona [1]), following Apache licensing guidelines for Eclipse Distribution License 1.0 [2].
- I fixed the `GEOMETRYCOLLECTION` inconsistency. It remains listed as a supported standard 2D WKB subtype and was removed from Future Work.

I hope this clarifies the intent and addresses your questions.

Best,
David

[1] - https://github.com/apache/sedona/blob/master/pom.xml#L155-L159
[2] - https://www.apache.org/legal/resolved.html

On Thu, Jun 11, 2026 at 11:25 AM dylanhz <[email protected]> wrote:

> Hi David,
>
> Thanks for the proposal. The FLIP is very detailed and explains the
> motivation clearly. I have a few questions about the scope and semantics.
>
> 1. What is the expected scope of Flink in this area? Should Flink only
> provide the native type and interoperability with formats/connectors, or
> also maintain geospatial computation functions? I can understand having
> constructor/conversion functions in core, but the supported scope and
> long-term maintenance cost of domain-specific computation functions seem to
> be an important concern. I am not sure whether those functions should be
> maintained by Flink rather than a dedicated geospatial project such as
> Sedona.
>
> 2. Have we considered how Sedona would adapt to this new type? Is the
> proposed GeographyData interface enough for Sedona to consume and produce
> GEOGRAPHY values efficiently? Or would Sedona not be able to directly use
> Flink’s internal interface for computation and still need to convert to its
> own internal representation?
>
> 3. If GEOGRAPHY does not support cast to STRING, how will values be
> displayed in SQL Client, TableResult.print(), and SQL Gateway results? Do
> we plan to add a display-specific converter, or should explicit cast to
> STRING be supported?
>
> 4. Could the FLIP define the SQL semantics of GEOGRAPHY more explicitly?
> For example, equality, comparability, GROUP BY and DISTINCT.
>
> A couple of minor issues:
>
> a. The JTS license in the document seems inaccurate.
>
> b. GEOMETRYCOLLECTION is listed as supported, but also appears in future
> work.
>
> ----------
> Best regards,
> dylanhz
>
> > On Jun 9, 2026, at 02:32, David Chaava via dev <[email protected]> wrote:
> >
> > Hi Martijn,
> >
> > Thanks for raising this point. I updated the proposal with a dedicated
> > Calcite / Planner Integration section to clarify how `GEOGRAPHY` and the
> > proposed `ST_*` functions fit into Flink's SQL/planner path.
> >
> > Please let us know if this addresses the question or if there are any
> > additional planner details you would like us to cover.
> >
> > Thanks
> >
> > On Mon, Jun 8, 2026 at 11:46 AM Martijn Visser <[email protected]>
> > wrote:
> >
> >> Hi David,
> >>
> >> Thanks for the FLIP. It doesn't contain any information/reference to
> >> Calcite though. Are you not planning to leverage Calcite at all for
> >> this?
> >>
> >> Best regards,
> >>
> >> Martijn
> >>
> >> On Fri, Jun 5, 2026 at 10:19 AM David Chaava via dev
> >> <[email protected]> wrote:
> >>>
> >>> Hi everyone,
> >>>
> >>> I would like to start a discussion on FLIP-XXX: GEOGRAPHY type in Flink
> >>> SQL and Table API [1].
> >>>
> >>> Flink currently has no first-class geospatial type. Users working with
> >>> geographic data are forced into unsatisfying workarounds: encoding
> >>> geometries as raw strings, storing binary blobs, or pulling in external
> >>> libraries with no SQL-level integration. None of these options are
> >>> ergonomic, interoperable, or type-safe.
> >>>
> >>> We propose introducing a native GEOGRAPHY type to Flink SQL and the
> >>> Table API, bringing first-class geospatial support to streaming and
> >>> batch pipelines. The key changes are:
> >>>
> >>> 1. New GEOGRAPHY Type - A dedicated logical type representing
> >>> geospatial values (points, lines, polygons, etc.) following the WKT/WKB
> >>> standard, with proper serialization and catalog integration.
> >>>
> >>> 2. Built-in Geospatial Functions - A set of SQL functions (e.g.
> >>> ST_Distance, ST_Contains, ST_AsText) enabling spatial predicates and
> >>> transformations directly in SQL queries.
> >>>
> >>> 3. Connector & Format Support - Pluggable encoding support so
> >>> connectors can read and write GEOGRAPHY values in standard formats
> >>> (WKT, WKB, GeoJSON).
> >>>
> >>> Looking forward to your feedback!
> >>>
> >>> Best regards,
> >>> David Chaava
> >>>
> >>> [1]
> >>> https://docs.google.com/document/d/1rpOTETT_Ui3TlEGioUr2NKJ1p1dlxjJQudXHndxBpO0/edit?usp=sharing
