I'm a little concerned we're making a lot of unnecessary updates to the
spec.

The sections that are not specifically about serialization should be fine
as they are.  Those sections document the requirements and behavior, but
implementers shouldn't be looking at them for the representation (that's the
purpose of the serialization section).

I'd isolate these changes to just the section on serialization.

-Dan

On Fri, Jun 19, 2026 at 12:01 PM Sung Yun <[email protected]> wrote:

> Thanks for working on these clarifying PRs, Kevin!
>
> My vote is +1 to SHOULD. I left the detailed analysis/reasoning on #16798.
>
> I noticed PyIceberg is a bit stricter than the other language libraries;
> here's a PR to bring it in line:
> https://github.com/apache/iceberg-python/pull/3530
>
> Cheers,
> Sung
>
> On 2026/06/19 17:43:54 Kevin Liu wrote:
> > Going to merge "Spec: Clarify geography type serialization #16799" since
> > it's a minor formatting fix. Thanks everyone for reviewing and approving
> > the PR.
> >
> > I'm looking for more feedback on "Spec: Clarify decimal type
> > serialization #16798", which is a minor formatting fix and also contains
> > a sentence clarifying the spec.
> >
> > Best,
> > Kevin Liu
> >
> > On Wed, Jun 17, 2026 at 9:57 AM Kevin Liu <[email protected]> wrote:
> >
> > > Hi everyone,
> > >
> > > I’d like to get feedback on two small spec clarification PRs that update
> > > the schema JSON type string serialization table:
> > >
> > > * https://github.com/apache/iceberg/pull/16798
> > >   Clarifies that the canonical schema JSON decimal type string is
> > > `decimal(P, S)`, matching current writer output, and notes that readers
> > > should accept optional whitespace for compatibility with non-canonical type
> > > strings such as `decimal(9,2)`.
> > >
> > > * https://github.com/apache/iceberg/pull/16799
> > >   Clarifies that the canonical schema JSON geography type string is
> > > `geography(C, A)`, including the space after the comma and the parameter
> > > name `A`, matching current writer output.
> > >
> > > The motivation came from
> > > https://github.com/apache/iceberg-rust/issues/2534. iceberg-rust was
> > > writing decimal types as `decimal(P,S)` (without space), while Java and
> > > Python write `decimal(P, S)` (with space). That exposed ambiguity in the
> > > spec because strict downstream parsers may accept only one form. The
> > > intended behavior is that writers produce the canonical form, while readers
> > > remain compatible with existing metadata by accepting both spacing variants.
> > >
> > > These are intended as spec clarifications only. The goal is to document
> > > the canonical serialized form produced by implementations while preserving
> > > reader-side compatibility for existing metadata.
> > >
> > > One point I’d like explicit feedback on is the reader compatibility
> > > wording in #16798. The PR currently uses “should” for accepting optional
> > > whitespace. I think this should use “should” rather than “must”. “Must”
> > > would make accepting optional whitespace a hard conformance rule, while
> > > this is intended as reader-side compatibility for non-canonical type
> > > strings.
> > >
> > > Would love to hear your thoughts!
> > >
> > > Thanks,
> > > Kevin
> > >
> >
>
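
For anyone following along, the reader/writer split described in the thread
(writers emit the canonical `decimal(P, S)`, readers also accept
`decimal(9,2)`) could be sketched in Python roughly as below. This is only an
illustration, not code from any Iceberg library: the names `parse_decimal`
and `format_decimal` are hypothetical, and exactly where whitespace is
tolerated (around the comma and just inside the parentheses) is my assumption,
since the thread only says "optional whitespace".

```python
import re
from typing import Tuple

# Assumption: whitespace is tolerated around the comma and just inside the
# parentheses. The thread does not pin down the exact positions.
_DECIMAL_RE = re.compile(r"decimal\(\s*(\d+)\s*,\s*(\d+)\s*\)")


def parse_decimal(type_str: str) -> Tuple[int, int]:
    """Lenient reader (hypothetical helper): returns (precision, scale)."""
    match = _DECIMAL_RE.fullmatch(type_str)
    if match is None:
        raise ValueError(f"not a decimal type string: {type_str!r}")
    return int(match.group(1)), int(match.group(2))


def format_decimal(precision: int, scale: int) -> str:
    """Writer (hypothetical helper): always emits the canonical form,
    with a single space after the comma."""
    return f"decimal({precision}, {scale})"
```

With this shape, reading either spacing variant and writing it back always
yields the canonical string, so existing metadata stays readable while new
metadata converges on one form.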
