Hi everyone, I’d like to get feedback on two small spec clarification PRs that update the schema JSON type string serialization table:
* https://github.com/apache/iceberg/pull/16798 Clarifies that the canonical schema JSON decimal type string is `decimal(P, S)`, matching current writer output, and notes that readers should accept optional whitespace for compatibility with non-canonical type strings such as `decimal(9,2)`.
* https://github.com/apache/iceberg/pull/16799 Clarifies that the canonical schema JSON geography type string is `geography(C, A)`, including the space after the comma and the parameter name `A`, matching current writer output.

The motivation came from https://github.com/apache/iceberg-rust/issues/2534. iceberg-rust was writing decimal types as `decimal(P,S)` (without space), while Java and Python write `decimal(P, S)` (with space). That exposed ambiguity in the spec because strict downstream parsers may accept only one form. The intended behavior is that writers produce the canonical form, while readers remain compatible with existing metadata by accepting both spacing variants.

These are intended as spec clarifications only. The goal is to document the canonical serialized form produced by implementations while preserving reader-side compatibility for existing metadata.

One point I’d like explicit feedback on is the reader compatibility wording in #16798. The PR currently uses “should” for accepting optional whitespace. I think this should use “should” rather than “must”. “Must” would make accepting optional whitespace a hard conformance rule, while this is intended as reader-side compatibility for non-canonical type strings.

Would love to hear your thoughts!

Thanks,
Kevin
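
P.S. To make the intended writer/reader split concrete, here is a minimal Python sketch of the decimal case. It is illustrative only and is not taken from any Iceberg implementation: the names `parse_decimal` and `format_decimal` and the regular expression are hypothetical, and exactly where whitespace is tolerated is my assumption rather than wording from the PR.

```python
import re

# Hypothetical lenient reader pattern: whitespace around the precision and
# scale is optional, so both "decimal(9,2)" and "decimal(9, 2)" match.
_DECIMAL_PATTERN = re.compile(r"decimal\(\s*(\d+)\s*,\s*(\d+)\s*\)")


def parse_decimal(type_str: str) -> tuple[int, int]:
    """Reader side: accept canonical and non-canonical spacing."""
    match = _DECIMAL_PATTERN.fullmatch(type_str)
    if match is None:
        raise ValueError(f"not a decimal type string: {type_str!r}")
    return int(match.group(1)), int(match.group(2))


def format_decimal(precision: int, scale: int) -> str:
    """Writer side: always emit the canonical form, with a space after the comma."""
    return f"decimal({precision}, {scale})"
```

With this shape, metadata written as `decimal(9,2)` still reads back correctly, and re-serializing it produces the canonical `decimal(9, 2)`.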
