Hi all,

I would like to start a discussion about *improving the type support
consistency of built-in aggregate functions in Flink SQL and aligning them
more systematically with the ANSI SQL standard*.

Background

Currently, Flink SQL provides a rich set of built-in aggregate functions
(e.g., SUM, AVG, MIN, MAX, COUNT, and STDDEV). However, the supported
input types for these functions are not fully documented in a structured
way, and in some cases they appear to be inconsistent with ANSI SQL
expectations or with common database systems.

For example:

   - Some aggregate functions do not support certain string types such as
     CHAR.
   - Numeric aggregates may have limitations or implicit behaviors around
     DECIMAL precision/scale inference.
   - Support for INTERVAL, BOOLEAN, or time-related types is not always
     clearly defined or consistent.
   - There is no centralized “type support matrix” describing which aggregate
     function supports which logical types.

This makes it harder for users to reason about SQL portability and standard
compliance.

Proposal

I propose the following steps:

   1. Define a built-in aggregate function × data type support matrix.
      - Cover all built-in aggregate functions.
      - Cover major logical types (CHAR/VARCHAR, numeric types, DECIMAL,
        DATE/TIME/TIMESTAMP, INTERVAL, BOOLEAN, etc.).
      - Explicitly document current support status.
   2. Compare the current behavior against:
      - ANSI SQL standard expectations
      - Widely adopted database behaviors (for reference)
   3. Identify gaps and inconsistencies, and prioritize incremental
      improvements.
      - For example: enabling MIN/MAX on CHAR, clarifying DECIMAL inference
        rules for AVG, etc.
      - Ensure backward compatibility and avoid breaking changes.
   4. Add corresponding validation tests and documentation updates to make the
      behavior explicit and predictable.
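
To make steps 1 to 3 above a bit more concrete, here is a minimal Python
sketch of how such a matrix and the gap check could be represented. It is
only an illustration: the names (Support, CURRENT, EXPECTED, find_gaps,
render) are made up for this email, and every entry is a hypothetical
placeholder, not a statement of what Flink or the ANSI standard actually
supports.

```python
from enum import Enum


class Support(Enum):
    SUPPORTED = "yes"
    UNSUPPORTED = "no"
    UNKNOWN = "?"


# Hypothetical placeholder data keyed by (aggregate function, logical type).
# These values are NOT verified against Flink or the SQL standard.
CURRENT = {
    ("MIN", "CHAR"): Support.UNSUPPORTED,
    ("MIN", "INT"): Support.SUPPORTED,
    ("AVG", "DECIMAL"): Support.SUPPORTED,
}

# Hypothetical reference expectations, same caveat as above.
EXPECTED = {
    ("MIN", "CHAR"): Support.SUPPORTED,
    ("MIN", "INT"): Support.SUPPORTED,
    ("AVG", "DECIMAL"): Support.SUPPORTED,
    ("SUM", "BOOLEAN"): Support.UNSUPPORTED,
}


def find_gaps(current, expected):
    """Return sorted (function, type) pairs that the reference expects to be
    supported but that are not marked as supported in `current`."""
    gaps = []
    for key, want in expected.items():
        have = current.get(key, Support.UNKNOWN)
        if want is Support.SUPPORTED and have is not Support.SUPPORTED:
            gaps.append(key)
    return sorted(gaps)


def render(matrix):
    """Render the matrix as a fixed-width text table; missing cells show '?'."""
    functions = sorted({f for f, _ in matrix})
    types = sorted({t for _, t in matrix})
    rows = ["function".ljust(10) + "".join(t.ljust(10) for t in types)]
    for f in functions:
        cells = "".join(
            matrix.get((f, t), Support.UNKNOWN).value.ljust(10) for t in types
        )
        rows.append(f.ljust(10) + cells)
    return "\n".join(rows)


if __name__ == "__main__":
    print(render(CURRENT))
    print("gaps:", find_gaps(CURRENT, EXPECTED))
```

Keying the matrix by (function, type) pairs keeps unknown combinations
explicit instead of silently treating them as unsupported, which seems
useful while the real matrix is still being filled in.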

Scope

This discussion is limited to:

   - Built-in aggregate functions in the Table/SQL planner.
   - Type inference, validation, and return type determination.
   - No changes to runtime semantics beyond enabling or clarifying type
     support.

Compatibility

All changes should:

   - Preserve existing semantics where possible.
   - Avoid breaking existing queries.
   - Be introduced incrementally through small, reviewable improvements.

Questions for the community

   1. Do we agree that defining a formal type support matrix for built-in
      aggregates would improve clarity and standard alignment?
   2. Are there known historical design decisions or constraints around
      aggregate type support that we should consider?
   3. Would this effort require a FLIP, or can we proceed incrementally under
      a series of improvement JIRAs?

If there is consensus, I can start by drafting an initial type support
matrix based on the current implementation and share it for review.

Looking forward to your feedback.


FLIP-XXX: Align Built-in Aggregate Function Type Support with ANSI SQL
<https://drive.google.com/open?id=1BWAU0ms6c5E1VkxplD9MwjOPL_V-ptexIu3CvWARa8g>

Best regards,
Feat Zhang
