Hi Sergey,

Thanks for raising this question.

I did consider Calcite's existing GEOMETRY type. The key reason not to use
it is that GEOMETRY and GEOGRAPHY represent different user-facing models.

GEOMETRY is planar: coordinates are interpreted in a flat coordinate space,
and operations such as distance, area, intersection, buffering, and
containment are modeled in that planar space. This is the right model for
projected local data, CAD/GIS-style datasets, indoor maps, building
footprints, parcels, or any workload where the user intentionally chooses a
planar CRS.

GEOGRAPHY is different: coordinates are longitude/latitude, the CRS is part
of the type contract, and operations are expected to use spherical/geodesic
semantics. This is the right model for GPS points, routing, and other
workloads involving the Earth’s surface.

If Flink represented GEOGRAPHY internally as Calcite GEOMETRY, we would
lose that distinction at the planner boundary. That would make it harder to
add a future Flink GEOMETRY type cleanly, and it would make function typing
ambiguous. For example, ST_DISTANCE over GEOGRAPHY should produce a
geodesic distance over the Earth's surface, while ST_DISTANCE over GEOMETRY
should produce a planar distance in the coordinate system chosen by the
user. These should be different overloads with different semantics, not the
same planner type with out-of-band interpretation.
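
To make the difference concrete, here is a minimal sketch (illustration
only, not part of the FLIP). The function names are hypothetical, and the
haversine formula on a sphere of mean Earth radius is my assumption for
"geodesic"; a real implementation could use an ellipsoidal model instead.
The same pair of coordinates yields two unrelated numbers depending on
which model is applied:

```python
import math

# Assumption: spherical Earth with the mean radius in meters.
EARTH_RADIUS_M = 6_371_008.8


def planar_distance(x1, y1, x2, y2):
    """GEOMETRY-style distance: Euclidean, in the units of the input CRS."""
    return math.hypot(x2 - x1, y2 - y1)


def geodesic_distance_m(lon1, lat1, lon2, lat2):
    """GEOGRAPHY-style distance: great-circle (haversine), in meters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Two points one degree of longitude apart at latitude 60N, as (lon, lat).
p = (10.0, 60.0)
q = (11.0, 60.0)

planar = planar_distance(*p, *q)        # 1.0, measured in degrees
geodesic = geodesic_distance_m(*p, *q)  # roughly 55.6 km, measured in meters
print(planar, geodesic)
```

If both results came back under a single planner type, nothing in the type
system would tell a caller which of the two interpretations was applied.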

Regarding contributing GEOGRAPHY to Calcite first: I think that’s a
reasonable long-term direction. For this FLIP, however, introducing
GEOGRAPHY as a native Flink logical type allows us to move forward
independently of the Calcite contribution and release process. An upstream
Calcite contribution could still be pursued as a possible future follow-up.

The plan proposed in this FLIP is to: (1) introduce GEOGRAPHY as a native
Flink logical type; (2) bridge it through Flink's Calcite integration
layer, following the pattern used for Flink’s BITMAP data type; and
(3) leave an upstream Calcite GEOGRAPHY contribution as a possible future
follow-up if the Calcite community shows interest.

I've updated the FLIP to make the Calcite boundary explicit and noted
Calcite GEOMETRY as a considered alternative, with an explanation that it
would collapse planar and spherical/geodesic use cases into one planner
type and make geography-specific function modeling ambiguous.

Best,

David


On Sat, Jun 20, 2026 at 11:38 PM Sergey Nuyanzin <[email protected]>
wrote:

> or a follow up question: have you considered first adding GEOGRAPHY
> type to Calcite and then using it from Flink?
>
> On Sat, Jun 20, 2026 at 11:33 PM Sergey Nuyanzin <[email protected]>
> wrote:
> >
> > Hi David
> >
> > since you mentioned Calcite and mentioned type name GEOGRAPHY
> >
> > why do we need a new type instead of using existing Calcite's type
> > which is called GEOMETRY[1]?
> >
> > [1]
> https://github.com/apache/calcite/blob/00aec5236cf9c57ce90eb4a30f777798c3b52abb/core/src/main/java/org/apache/calcite/sql/type/SqlTypeName.java#L137
> >
> > On Fri, Jun 12, 2026 at 11:17 PM David Chaava via dev
> > <[email protected]> wrote:
> > >
> > > Hi Dylan,
> > >
> > >
> > >
> > > > Thanks for the detailed review. I clarified these points in the
> > > > proposal.
> > >
> > >
> > >
> > > > Q1: The core proposal of this FLIP is the native `GEOGRAPHY` type
> > > > and the interoperability contract support in Flink: SQL/planner
> > > > type support, runtime representation, serialization/state, SQL
> > > > Gateway/compiled plan support, and connector/format
> > > > interoperability with Iceberg and Parquet. The FLIP does not intend
> > > > that Flink core will become a replacement for dedicated geospatial
> > > > projects such as Sedona.
> > >
> > >
> > >
> > > > I added clarification of the function scope in the proposal doc.
> > > > The v1 function surface is intentionally small and is not meant to
> > > > become a full geospatial function catalog. Constructor and
> > > > conversion functions are part of making the native type usable for
> > > > interoperability purposes. Richer geospatial processing, spatial
> > > > joins, indexing, reprojection, and larger PostGIS/Sedona-like
> > > > function coverage remain outside the scope of this FLIP.
> > >
> > >
> > >
> > > > Q2: Regarding Sedona, I added clarification in the proposal doc
> > > > that `GeographyData` is a Flink internal runtime representation,
> > > > not a required computation API for Sedona. The stable
> > > > interoperability path for Sedona and similar projects is ISO WKB
> > > > via `GeographyData.toBytes()`, `ST_ASWKB`, and `ST_GEOGFROMWKB`.
> > > > Sedona may still convert values into its own internal
> > > > representation for computation and produce Flink `GEOGRAPHY` values
> > > > back through WKB.
> > >
> > >
> > >
> > > > Q3: For display and casts, I added clarification in the proposal
> > > > that display rendering is separate from SQL casts. SQL Client,
> > > > `TableResult.print()`, and SQL Gateway textual results should
> > > > render `GEOGRAPHY` values as WKT, equivalent to `ST_ASTEXT`, for
> > > > human-readable output. This display path does not imply an implicit
> > > > cast to `STRING`. I also clarified that Java `toString()` should
> > > > not be treated as the SQL conversion contract; explicit textual
> > > > conversion remains `ST_ASTEXT`.
> > >
> > >
> > >
> > > > Q4: For SQL semantics, I added explicit wording for equality,
> > > > grouping, and comparability. The proposed v1 semantics are
> > > > representation-based: equality and hashing are based on the
> > > > stored/canonical WKB representation, not topological geospatial
> > > > equality. `GROUP BY` and `DISTINCT` use the same equality/hash
> > > > semantics. Topological equality, if exposed, is a separate
> > > > geospatial predicate and should not be confused with SQL grouping
> > > > semantics. `GEOGRAPHY` is not order-comparable, so ordering
> > > > comparisons such as `<`, `<=`, `>`, `>=`, as well as `MIN` and
> > > > `MAX`, are not defined for this type unless users explicitly
> > > > convert the value to another representation, such as WKT or WKB.
> > >
> > >
> > >
> > > For the minor issues:
> > >
> > > > - You are right about the JTS license. I corrected the license
> > > > note. JTS would be included as a dependency (similar to how it’s
> > > > included in Apache Sedona [1]), following Apache licensing
> > > > guidelines for Eclipse Distribution License 1.0 [2].
> > > >
> > > > - I fixed the `GEOMETRYCOLLECTION` inconsistency. It remains listed
> > > > as a supported standard 2D WKB subtype and was removed from Future
> > > > Work.
> > >
> > >
> > >
> > > I hope this clarifies the intent and addresses your questions.
> > >
> > >
> > >
> > > Best,
> > >
> > > David
> > >
> > > [1] - https://github.com/apache/sedona/blob/master/pom.xml#L155-L159
> > >
> > > [2] - https://www.apache.org/legal/resolved.html
> > >
> > >
> > > On Thu, Jun 11, 2026 at 11:25 AM dylanhz <[email protected]> wrote:
> > >
> > > > Hi David,
> > > >
> > > > > Thanks for the proposal. The FLIP is very detailed and explains
> > > > > the motivation clearly. I have a few questions about the scope
> > > > > and semantics.
> > > >
> > > > > 1. What is the expected scope of Flink in this area? Should Flink
> > > > > only provide the native type and interoperability with
> > > > > formats/connectors, or also maintain geospatial computation
> > > > > functions? I can understand having constructor/conversion
> > > > > functions in core, but the supported scope and long-term
> > > > > maintenance cost of domain-specific computation functions seem to
> > > > > be an important concern. I am not sure whether those functions
> > > > > should be maintained by Flink rather than a dedicated geospatial
> > > > > project such as Sedona.
> > > >
> > > > > 2. Have we considered how Sedona would adapt to this new type? Is
> > > > > the proposed GeographyData interface enough for Sedona to consume
> > > > > and produce GEOGRAPHY values efficiently? Or would Sedona not be
> > > > > able to directly use Flink’s internal interface for computation
> > > > > and still need to convert to its own internal representation?
> > > >
> > > > > 3. If GEOGRAPHY does not support cast to STRING, how will values
> > > > > be displayed in SQL Client, TableResult.print(), and SQL Gateway
> > > > > results? Do we plan to add a display-specific converter, or
> > > > > should explicit cast to STRING be supported?
> > > >
> > > > > 4. Could the FLIP define the SQL semantics of GEOGRAPHY more
> > > > > explicitly? For example, equality, comparability, GROUP BY, and
> > > > > DISTINCT.
> > > >
> > > > A couple of minor issues:
> > > >
> > > > a. The JTS license in the document seems inaccurate.
> > > >
> > > > > b. GEOMETRYCOLLECTION is listed as supported, but also appears in
> > > > > future work.
> > > >
> > > >
> > > > ----------
> > > > Best regards,
> > > > dylanhz
> > > >
> > > >
> > > >
> > > >
> > > > > 2026年6月9日 02:32,David Chaava via dev <[email protected]> 写道:
> > > > >
> > > > > Hi Martijn,
> > > > >
> > > > >
> > > > >
> > > > > > Thanks for raising this point. I updated the proposal with a
> > > > > > dedicated Calcite / Planner Integration section to clarify how
> > > > > > `GEOGRAPHY` and the proposed `ST_*` functions fit into Flink's
> > > > > > SQL/planner path.
> > > > >
> > > > >
> > > > >
> > > > > > Please let us know if this addresses the question or if there
> > > > > > are any additional planner details you would like us to cover.
> > > > >
> > > > >
> > > > >
> > > > > Thanks
> > > > >
> > > > > > On Mon, Jun 8, 2026 at 11:46 AM Martijn Visser <[email protected]>
> > > > > > wrote:
> > > > >
> > > > >> Hi David,
> > > > >>
> > > > > >> Thanks for the FLIP. It doesn't contain any information/reference
> > > > > >> to Calcite though. Are you not planning to leverage Calcite at
> > > > > >> all for this?
> > > > >>
> > > > >> Best regards,
> > > > >>
> > > > >> Martijn
> > > > >>
> > > > > >> On Fri, Jun 5, 2026 at 10:19, David Chaava via dev <
> > > > > >> [email protected]> wrote:
> > > > >>>
> > > > > >>> Hi everyone,
> > > > >>>
> > > > > >>> I would like to start a discussion on FLIP-XXX: GEOGRAPHY type
> > > > > >>> in Flink SQL and Table API [1].
> > > > >>>
> > > > > >>> Flink currently has no first-class geospatial type. Users
> > > > > >>> working with geographic data are forced into unsatisfying
> > > > > >>> workarounds: encoding geometries as raw strings, storing binary
> > > > > >>> blobs, or pulling in external libraries with no SQL-level
> > > > > >>> integration. None of these options are ergonomic, interoperable,
> > > > > >>> or type-safe.
> > > > >>>
> > > > > >>> We propose introducing a native GEOGRAPHY type to Flink SQL and
> > > > > >>> the Table API, bringing first-class geospatial support to
> > > > > >>> streaming and batch pipelines. The key changes are:
> > > > >>>
> > > > > >>> 1. New GEOGRAPHY Type - A dedicated logical type representing
> > > > > >>> geospatial values (points, lines, polygons, etc.) following the
> > > > > >>> WKT/WKB standard, with proper serialization and catalog
> > > > > >>> integration.
> > > > > >>>
> > > > > >>> 2. Built-in Geospatial Functions - A set of SQL functions (e.g.
> > > > > >>> ST_Distance, ST_Contains, ST_AsText) enabling spatial predicates
> > > > > >>> and transformations directly in SQL queries.
> > > > > >>>
> > > > > >>> 3. Connector & Format Support - Pluggable encoding support so
> > > > > >>> connectors can read and write GEOGRAPHY values in standard
> > > > > >>> formats (WKT, WKB, GeoJSON).
> > > > >>>
> > > > >>> Looking forward to your feedback!
> > > > >>>
> > > > >>> Best regards,
> > > > >>> David Chaava
> > > > >>>
> > > > >>> [1]
> > > > >>>
> > > > >>
> > > >
> https://docs.google.com/document/d/1rpOTETT_Ui3TlEGioUr2NKJ1p1dlxjJQudXHndxBpO0/edit?usp=sharing
> > > > >>
> > > >
> > > >
> >
> >
> >
> > --
> > Best regards,
> > Sergey
>
>
>
> --
> Best regards,
> Sergey
>
