Perhaps, for now, we can decouple the two discussions? Let’s first add
support for pagination, and only then consider removing support for
non-paginated requests.

—EM

On Fri, Oct 24, 2025 at 9:13 AM Dmitri Bourlatchkov <[email protected]>
wrote:

> Hi Yufei,
>
> In the proposed NoSQL persistence [1189] pagination consistency across
> requests will be guaranteed since the data model tracks changes across the
> whole catalog. Specifically, a pointer to the "state" of the catalog data
> can be included into the pagination token.
>
> As for the JDBC persistence, I'm not sure even single request full entity
> lists are completely free from the side effects of concurrent transactions.
> I suppose that depends on the transaction isolation level at the database,
> which is not strictly controlled by Polaris ATM.
>
> [1189] https://github.com/apache/polaris/pull/1189
>
> Cheers,
> Dmitri.
>
> On Thu, Oct 23, 2025 at 5:38 PM Yufei Gu <[email protected]> wrote:
>
> > +1 on introducing pagination to avoid unbounded collections, that’s
> > definitely the right direction.
> >
> > That said, I’d be cautious about completely removing non-paginated
> > behavior. There are valid scenarios that rely on retrieving a consistent,
> > point-in-time view of data. Pagination across a live dataset can introduce
> > drift between pages unless the server supports some form of snapshot
> > pinning (e.g., a read timestamp, revision ID, or snapshot token).
> >
> > It might be worth discussing how we can support these point-in-time
> > correctness requirements alongside pagination.
> >
> > Yufei
> >
> >
> > On Thu, Oct 23, 2025 at 10:54 AM Dmitri Bourlatchkov <[email protected]>
> > wrote:
> >
> > > Hi All,
> > >
> > > Supporting the "old" (full list, non-paginated) behaviour with a feature
> > > flag sounds reasonable to me.
> > >
> > > I believe the default should still be "off" (i.e. all requests are
> > > paginated). Affected deployments will be able to set the flag proactively
> > > before upgrading to maintain compatibility.
> > >
> > > Cheers,
> > > Dmitri.
> > >
> > >
> > > On Thu, Oct 23, 2025 at 12:45 PM Andrew Guterman <[email protected]> wrote:
> > >
> > > > I understand the risks of non-paginated APIs.
> > > >
> > > > The problem is that coupling deprecation of features to release versions
> > > > means that downstream projects are blocked from upgrading Polaris unless
> > > > they perform every migration for every breaking change. Feature flags allow
> > > > downstream projects to pick their own upgrade path.
> > > >
> > > > Best,
> > > > Andrew
> > > >
> > > > On Thu, Oct 23, 2025 at 4:22 AM Alexandre Dutra <[email protected]> wrote:
> > > >
> > > > > +1 to Robert's proposal of deprecating for removal all non-paginated
> > > > > requests to Polaris's own APIs.
> > > > >
> > > > > For IRC APIs, I'll note that the ObjectMapper that we use already has
> > > > > a stream read length protection, see
> > > > > PolarisIcebergObjectMapperCustomizer [1]. We could add a stream write
> > > > > length protection as well.
> > > > >
> > > > > Thanks,
> > > > > Alex
> > > > >
> > > > > [1]:
> > > > > https://github.com/apache/polaris/blob/20febdaede19fb7c46e120652fdd1a262c2138e4/runtime/service/src/main/java/org/apache/polaris/service/config/PolarisIcebergObjectMapperCustomizer.java#L61-L64
> > > > >
> > > > > On Thu, Oct 23, 2025 at 12:31 PM Robert Stupp <[email protected]> wrote:
> > > > > >
> > > > > > Returning full lists, which can be extremely large, can let requests
> > > > > > fail on the client or the server, cause excessive resource
> > > > > > usage or even bring down clients and servers (OOM). That's why most
> > > > > > listing endpoints have limits on the response size (# of bytes or
> > > > > > elements) and support paging as a 1st class citizen. I think this is
> > > > > > what Polaris should do as well.
> > > > > >
> > > > > > Considering the risks that come with large responses, I think having
> > > > > > paging always enabled is the safer approach.
> > > > > > I propose to deprecate the ability to return "full response lists" at
> > > > > > least for Polaris' own APIs and require pagination after 1 or 2 minor
> > > > > > releases.
> > > > > >
> > > > > > For IRC, if we agree that overly large responses are a risk, we can
> > > > > > let requests that would yield too large responses (w/o pagination)
> > > > > > fail early and protect both the server and the client.
> > > > > >
> > > > > > On Mon, Oct 20, 2025 at 7:18 PM Andrew Guterman
> > > > > > <[email protected]> wrote:
> > > > > > >
> > > > > > > Returning the full list when no pageToken is specified would be necessary
> > > > > > > for backward compatibility, but a feature flag as you mentioned above makes
> > > > > > > sense to me.
> > > > > > >
> > > > > > > Best,
> > > > > > > Andrew
> > > > > > >
> > > > > > > On Wed, Oct 15, 2025 at 7:08 AM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi Eric,
> > > > > > > >
> > > > > > > > I agree with your points.
> > > > > > > >
> > > > > > > > What worries me in the Iceberg spec is this statement:
> > > > > > > >
> > > > > > > > "Clients may initiate the first paginated request by sending an empty query
> > > > > > > > parameter `pageToken` to the server."
> > > > > > > >
> > > > > > > > I think it implies that a client that does not send a pageToken parameter
> > > > > > > > can expect to get the full response (not paginated).
> > > > > > > >
> > > > > > > > This is probably not the right forum to discuss the Iceberg spec, but I'd
> > > > > > > > like to avoid this kind of ambiguity in APIs owned by Polaris.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Dmitri.
> > > > > > > >
> > > > > > > > On Tue, Oct 14, 2025 at 7:45 PM Eric Maynard <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hey Dmitri,
> > > > > > > > >
> > > > > > > > > This actually matches my interpretation of the IRC spec. It says
> > > > > > > > > <https://github.com/apache/iceberg/blob/c7df5200df462764ba0b3e81484243532c941caf/open-api/rest-catalog-open-api.yaml#L2024>:
> > > > > > > > >
> > > > > > > > > > Servers that support pagination should identify the `pageToken` parameter
> > > > > > > > > > and return a `next-page-token` in the response if there are more results
> > > > > > > > > > available.
> > > > > > > > >
> > > > > > > > > My interpretation of the above is that next-page-token uniquely describes
> > > > > > > > > whether or not more results are available. Not the size of the response. In
> > > > > > > > > fact, the spec defines page-size as "an *upper bound* of the number of
> > > > > > > > > results that a client will receive". Tangentially, I would prefer if the
> > > > > > > > > spec described this in looser terms, such as a "requested upper bound".
> > > > > > > > >
> > > > > > > > > What the spec does *not* say is that a client can safely assume there are
> > > > > > > > > no more results if it receives less than page-size elements. I think that
> > > > > > > > > you are probably right that a client exists which makes an incorrect
> > > > > > > > > assumption here though :)
> > > > > > > > >
> > > > > > > > > --EM
> > > > > > > > >
> > > > > > > > > On Tue, Oct 14, 2025 at 4:37 PM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Andrew and everyone,
> > > > > > > > > >
> > > > > > > > > > Adding pagination to the Management API would be very helpful.
> > > > > > > > > >
> > > > > > > > > > As to reusing the pagination parameter semantics of the Iceberg REST
> > > > > > > > > > spec... I'm not so sure.
> > > > > > > > > >
> > > > > > > > > > I do believe that servers should have ultimate control over page sizes. So
> > > > > > > > > > any client-side "size" parameters should be suggestions or hints at most.
> > > > > > > > > >
> > > > > > > > > > As a continuation of that approach, the server should always be able to
> > > > > > > > > > produce a partial response (with a next page token) even if the client did
> > > > > > > > > > not provide any explicit pagination parameters.
> > > > > > > > > >
> > > > > > > > > > That said, given that existing clients may expect to get "full" results
> > > > > > > > > > from the Management API when they do not use pagination parameters, I think
> > > > > > > > > > it should be fine to enable that behaviour with a feature flag.
> > > > > > > > > >
> > > > > > > > > > WDYT?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Dmitri.
> > > > > > > > > >
> > > > > > > > > > On Fri, Oct 10, 2025 at 8:08 PM Andrew Guterman <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hey folks,
> > > > > > > > > > >
> > > > > > > > > > > I wanted to gauge sentiment on adding pagination to non-IRC APIs, such as
> > > > > > > > > > > the management APIs, as the number of management entities (catalogs,
> > > > > > > > > > > principals, etc) can grow large and become un-listable all at once.
> > > > > > > > > > >
> > > > > > > > > > > I'm not sure if this has been discussed previously but I couldn't find a
> > > > > > > > > > > thread nor PRs related to it.
> > > > > > > > > > >
> > > > > > > > > > > My proposal is to not reinvent the wheel and just re-use the spec and
> > > > > > > > > > > implementation of the IRC APIs, where requests contain a "page-token" and
> > > > > > > > > > > "page-size" param, and responses return a "next-page-token".
> > > > > > > > > > >
> > > > > > > > > > > Let me know what you think.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Andrew
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > >
> > > >
> > >
> >
>
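
For reference, the pagination semantics discussed in this thread can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not Polaris or Iceberg code: ENTITIES, SERVER_MAX_PAGE,
list_page and list_all are hypothetical names, and encoding the token as a
plain offset is a simplification rather than how any real server builds its
tokens. The server caps the page size no matter what the client asks for, and
the client stops only when no next-page-token comes back, never because a
page was shorter than the requested size.

```python
from typing import Optional

# Hypothetical in-memory stand-in for a paginated list endpoint.
ENTITIES = [f"catalog-{i}" for i in range(7)]
SERVER_MAX_PAGE = 3  # assumed server-side cap; the client's page size is only a hint


def list_page(page_token: Optional[str] = None,
              page_size: Optional[int] = None) -> dict:
    """Return one page; the token is just a stringified offset (illustrative only)."""
    start = int(page_token) if page_token else 0
    # The server keeps ultimate control: never return more than its own cap.
    limit = min(page_size or SERVER_MAX_PAGE, SERVER_MAX_PAGE)
    end = start + limit
    page = {"items": ENTITIES[start:end]}
    if end < len(ENTITIES):
        # A next-page-token is present only when more results remain.
        page["next-page-token"] = str(end)
    return page


def list_all(page_size: Optional[int] = None) -> list:
    """Drain every page, stopping only when no next-page-token is returned."""
    items, token = [], None
    while True:
        page = list_page(token, page_size)
        items.extend(page["items"])
        token = page.get("next-page-token")
        if token is None:
            return items
```

Calling list_all(5) still returns every entity even though each page holds
fewer items than requested, which is the case a client gets wrong if it
treats a short page as the end of the listing.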
