Thanks for working on these clarifying PRs, Kevin! My vote is +1 to SHOULD. I left the detailed analysis/reasoning on #16798.
I noticed PyIceberg is a bit stricter than the other language libraries; here's a PR to bring it in line: https://github.com/apache/iceberg-python/pull/3530

Cheers,
Sung

On 2026/06/19 17:43:54 Kevin Liu wrote:
> Going to merge "Spec: Clarify geography type serialization #16799" since
> it's a minor formatting fix. Thanks everyone for reviewing and approving
> the PR.
>
> I'm looking for more feedback on "Spec: Clarify decimal type
> serialization #16798",
> it is a minor formatting fix and also contains a sentence clarifying the
> spec.
>
> Best,
> Kevin Liu
>
> On Wed, Jun 17, 2026 at 9:57 AM Kevin Liu <[email protected]> wrote:
>
> > Hi everyone,
> >
> > I’d like to get feedback on two small spec clarification PRs that update
> > the schema JSON type string serialization table:
> >
> > * https://github.com/apache/iceberg/pull/16798
> > Clarifies that the canonical schema JSON decimal type string is
> > `decimal(P, S)`, matching current writer output, and notes that readers
> > should accept optional whitespace for compatibility with non-canonical type
> > strings such as `decimal(9,2)`.
> >
> > * https://github.com/apache/iceberg/pull/16799
> > Clarifies that the canonical schema JSON geography type string is
> > `geography(C, A)`, including the space after the comma and the parameter
> > name `A`, matching current writer output.
> >
> > The motivation came from
> > https://github.com/apache/iceberg-rust/issues/2534. iceberg-rust was
> > writing decimal types as `decimal(P,S)` (without space), while Java and
> > Python write `decimal(P, S)` (with space). That exposed ambiguity in the
> > spec because strict downstream parsers may accept only one form. The
> > intended behavior is that writers produce the canonical form, while readers
> > remain compatible with existing metadata by accepting both spacing variants.
> >
> > These are intended as spec clarifications only. The goal is to document
> > the canonical serialized form produced by implementations while preserving
> > reader-side compatibility for existing metadata.
> >
> > One point I’d like explicit feedback on is the reader compatibility
> > wording in #16798. The PR currently uses “should” for accepting optional
> > whitespace. I think this should use “should” rather than “must”. “Must”
> > would make accepting optional whitespace a hard conformance rule, while
> > this is intended as reader-side compatibility for non-canonical type
> > strings.
> >
> > Would love to hear your thoughts!
> >
> > Thanks,
> > Kevin
> >
>
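
For anyone skimming the thread, the "canonical writer, lenient reader" behavior discussed above could look roughly like the sketch below. This is only an illustration, not the code in PyIceberg, Java, or iceberg-rust: the names `DECIMAL_RE`, `parse_decimal`, and `format_decimal` are hypothetical, and exactly which whitespace positions a reader tolerates is an assumption on my part.

```python
import re
from typing import Tuple

# Hypothetical reader-side pattern: tolerates optional whitespace around
# the precision and scale, so both "decimal(9,2)" and "decimal(9, 2)" match.
DECIMAL_RE = re.compile(r"^decimal\(\s*(\d+)\s*,\s*(\d+)\s*\)$")


def parse_decimal(type_str: str) -> Tuple[int, int]:
    """Return (precision, scale) from a decimal type string."""
    match = DECIMAL_RE.match(type_str)
    if match is None:
        raise ValueError(f"Not a decimal type string: {type_str!r}")
    return int(match.group(1)), int(match.group(2))


def format_decimal(precision: int, scale: int) -> str:
    """Writer side: always emit the canonical form with a space after the comma."""
    return f"decimal({precision}, {scale})"
```

With this shape, reading a non-canonical string and writing it back normalizes it to the canonical form, which is the round trip the clarification is meant to allow.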
