Hi all, I wanted to revive this thread after the same expression problem came up again in a recent issue (https://github.com/apache/iceberg/issues/15072), this time while using the expressions model for the reportMetrics API.
We've been writing literal booleans for a while, but the spec says a boolean expression should be an object. Clients may have implemented the spec, so they expect the object form. The previous consensus was to accept both forms when reading and to keep writing boolean literals. Either way, we know some clients will need updates. Would love to hear thoughts on whether this approach still makes sense.

PR: https://github.com/apache/iceberg/pull/14677

Drew

On Wed, Dec 3, 2025 at 7:31 PM Drew <[email protected]> wrote:

> Hey Everyone,
>
> Quick update on the boolean expression issue in PR 14677
> <https://github.com/apache/iceberg/pull/14677>.
>
> This showed up while working on scan planning, but expressions are used in
> other areas of the REST spec as well. Since the expression parser has
> always written boolean literals, there are some users who have relied on
> that behavior without ever using the REST models. Given that, I don't think
> there's a path here that avoids breaking someone.
>
> Even if we start accepting the object form from the spec, users still need
> to update their models to handle the parser's current behavior. And if we
> flip the wire format to match the spec, then any client that's been
> deserializing booleans will need to update. Either direction comes with a
> breaking change.
>
> Here are some paths forward:
>
> *Option 1*: Align the spec with what's actually been written.
>
> Pros: Matches existing behavior that clients already rely on
> Cons: Clients that implemented the object form from the spec will need a
> small update
>
> *Option 2*: Align the implementation with the published spec.
>
> Pros: Matches the current wording in the spec
> Cons: Breaks clients that only expect boolean literals today
>
> *Option 3*: Keep writing boolean literals, but read both.
>
> Pros: No change to what we write. Accepts both the spec object form
> {"type":"true"} and true
> Cons: Still requires a spec update for reads from the client, and adds
> extra logic to the parser.
>
> I'm leaning more towards option one, as booleans are what we have been
> writing since the beginning and any client that has been following the spec
> today can update their models to follow this behavior.
>
> Let me know what you all think!
>
> Thanks,
> Drew
>
> On Tue, Nov 25, 2025 at 1:09 AM Fokko Driesprong <[email protected]> wrote:
>
>> Thanks, Drew, for finding and fixing this! We should definitely remove
>> this discrepancy. I've replied to the PR.
>>
>> Kind regards,
>> Fokko
>>
>> On Tue, Nov 25, 2025 at 08:23, Drew <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I ran into an issue using the REST scan planning APIs where filters
>>> containing boolean expressions were failing to be parsed. The REST spec
>>> defines these models as an object wrapping the string representation, like
>>> {"type": "true"}, but the ExpressionParser actually reads and writes them as
>>> plain booleans. That mismatch causes the parser to reject filters that
>>> follow the current spec.
>>>
>>> I opened a PR to update the REST spec to align with how the expression
>>> parser is used.
>>>
>>> If anyone has any concerns with the spec change, or thinks we should
>>> handle it differently (for example by changing the Java representation
>>> instead), I’d appreciate any feedback.
>>>
>>> PR: https://github.com/apache/iceberg/pull/14677
>>>
>>> - Drew
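
For anyone updating a client, here is a rough sketch of the "read both forms, keep writing boolean literals" behavior discussed in this thread. It is a Python illustration only, not the Java ExpressionParser or the code in the PR. The function names are hypothetical, and it assumes the false case is written as {"type": "false"} by analogy with the {"type": "true"} form quoted above.

```python
import json


def read_boolean_expression(node):
    """Return True/False if node is a boolean expression in either wire form.

    Accepts the plain literal (true / false) and the object form
    ({"type": "true"}; {"type": "false"} is assumed by analogy).
    Returns None for anything else so the caller can try other expression types.
    """
    # Plain boolean literal, which is what has been written so far.
    if isinstance(node, bool):
        return node
    # Object form from the spec: a single "type" key holding "true" or "false".
    if isinstance(node, dict) and set(node) == {"type"}:
        kind = node["type"]
        if kind == "true":
            return True
        if kind == "false":
            return False
    return None


def write_boolean_expression(value):
    """Serialize a boolean expression as a plain JSON literal (writes stay unchanged)."""
    return json.dumps(bool(value))


if __name__ == "__main__":
    for payload in ('true', '{"type": "true"}', '{"type": "false"}'):
        print(payload, "->", read_boolean_expression(json.loads(payload)))
```

A client that only understands one form today would need just the extra branch in the reader; what it writes does not change.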
