Thanks for working on these clarifying PRs, Kevin!

My vote is +1 to SHOULD. I left the detailed analysis/reasoning on #16798.

I noticed PyIceberg is a bit stricter than the other language libraries; here's 
a PR to bring it in line: https://github.com/apache/iceberg-python/pull/3530
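
To illustrate the behavior I have in mind, here is a rough sketch. It is
not the actual PyIceberg code: the names `parse_decimal` and
`format_decimal` and the exact regex are hypothetical, and I'm assuming
"optional whitespace" means whitespace around the two parameters. The
reader is lenient and the writer always emits the canonical form.

```python
import re
from typing import Tuple

# Assumed pattern (not taken from any library): optional whitespace is
# tolerated around both parameters, so "decimal(9,2)" and "decimal(9, 2)"
# are both accepted.
_DECIMAL_RE = re.compile(r"decimal\(\s*(\d+)\s*,\s*(\d+)\s*\)")


def parse_decimal(type_str: str) -> Tuple[int, int]:
    """Lenient reader: return (precision, scale) for either spacing variant."""
    match = _DECIMAL_RE.fullmatch(type_str)
    if match is None:
        raise ValueError(f"Not a decimal type string: {type_str!r}")
    return int(match.group(1)), int(match.group(2))


def format_decimal(precision: int, scale: int) -> str:
    """Strict writer: always emit the canonical form with a space after the comma."""
    return f"decimal({precision}, {scale})"
```

With this shape, metadata written as `decimal(9,2)` still loads, and
anything re-serialized comes out as `decimal(9, 2)`.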

Cheers,
Sung

On 2026/06/19 17:43:54 Kevin Liu wrote:
> Going to merge "Spec: Clarify geography type serialization #16799" since
> it's a minor formatting fix. Thanks everyone for reviewing and approving
> the PR.
> 
> I'm looking for more feedback on "Spec: Clarify decimal type
> serialization #16798". It is a minor formatting fix and also contains a
> sentence clarifying the spec.
> 
> Best,
> Kevin Liu
> 
> On Wed, Jun 17, 2026 at 9:57 AM Kevin Liu <[email protected]> wrote:
> 
> > Hi everyone,
> >
> > I’d like to get feedback on two small spec clarification PRs that update
> > the schema JSON type string serialization table:
> >
> > * https://github.com/apache/iceberg/pull/16798
> >   Clarifies that the canonical schema JSON decimal type string is
> > `decimal(P, S)`, matching current writer output, and notes that readers
> > should accept optional whitespace for compatibility with non-canonical type
> > strings such as `decimal(9,2)`.
> >
> > * https://github.com/apache/iceberg/pull/16799
> >   Clarifies that the canonical schema JSON geography type string is
> > `geography(C, A)`, including the space after the comma and the parameter
> > name `A`, matching current writer output.
> >
> > The motivation came from
> > https://github.com/apache/iceberg-rust/issues/2534. iceberg-rust was
> > writing decimal types as `decimal(P,S)` (without space), while Java and
> > Python write `decimal(P, S)` (with space). That exposed ambiguity in the
> > spec because strict downstream parsers may accept only one form. The
> > intended behavior is that writers produce the canonical form, while readers
> > remain compatible with existing metadata by accepting both spacing variants.
> >
> > These are intended as spec clarifications only. The goal is to document
> > the canonical serialized form produced by implementations while preserving
> > reader-side compatibility for existing metadata.
> >
> > One point I’d like explicit feedback on is the reader compatibility
> > wording in #16798. The PR currently uses “should” for accepting optional
> > whitespace. I think this should use “should” rather than “must”. “Must”
> > would make accepting optional whitespace a hard conformance rule, while
> > this is intended as reader-side compatibility for non-canonical type
> > strings.
> >
> > Would love to hear your thoughts!
> >
> > Thanks,
> > Kevin
> >
> 
