+1 on introducing pagination to avoid unbounded collections. That's
definitely the right direction.

That said, I’d be cautious about completely removing non-paginated
behavior. There are valid scenarios that rely on retrieving a consistent,
point-in-time view of data. Pagination across a live dataset can introduce
drift between pages unless the server supports some form of snapshot
pinning (e.g., a read timestamp, revision ID, or snapshot token).

It might be worth discussing how we can support these point-in-time
correctness requirements alongside pagination.
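
To make the snapshot-pinning idea concrete, here is a minimal sketch in
Python. It is not Polaris code: `ListingServer`, `list_page`, `list_all`, and
the token layout (a base64-encoded snapshot ID plus offset) are all
hypothetical, made up for illustration. The server freezes a copy of the data
on the first request, and every later page reads from that copy, so writes
that land between pages cannot cause drift.

```python
import base64
import json


class ListingServer:
    """Hypothetical in-memory listing server (illustration only, not Polaris)."""

    def __init__(self, names):
        self._live = sorted(names)   # the live, mutable dataset
        self._snapshots = {}         # snapshot id -> frozen tuple of names
        self._next_id = 0

    def add(self, name):
        """Simulate a concurrent write to the live dataset."""
        self._live = sorted(self._live + [name])

    def list_page(self, page_size, page_token=None):
        """Return (page, next_page_token); the token is None on the last page."""
        if page_token is None:
            # First request: pin a point-in-time copy of the live data.
            snap_id = self._next_id
            self._next_id += 1
            self._snapshots[snap_id] = tuple(self._live)
            offset = 0
        else:
            # Later requests: the opaque token carries the pinned snapshot.
            state = json.loads(base64.urlsafe_b64decode(page_token))
            snap_id, offset = state["snap"], state["offset"]

        items = self._snapshots[snap_id]
        page = list(items[offset:offset + page_size])
        end = offset + len(page)
        if end < len(items):
            raw = json.dumps({"snap": snap_id, "offset": end}).encode()
            return page, base64.urlsafe_b64encode(raw).decode()
        # Last page: release the snapshot and signal the end with no token.
        del self._snapshots[snap_id]
        return page, None


def list_all(server, page_size):
    """Client loop: keep going until the server stops returning a token."""
    results, token = [], None
    while True:
        page, token = server.list_page(page_size, token)
        results.extend(page)
        if token is None:
            return results
```

A real server would more likely pin a read timestamp or revision ID than copy
the data, and it would need to expire abandoned snapshots. The sketch ignores
both.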

Yufei


On Thu, Oct 23, 2025 at 10:54 AM Dmitri Bourlatchkov <[email protected]>
wrote:

> Hi All,
>
> Supporting the "old" (full list, non-paginated) behaviour with a feature
> flag sounds reasonable to me.
>
> I believe the default should still be "off" (i.e. all requests are
> paginated). Affected deployments will be able to set the flag proactively
> before upgrading to maintain compatibility.
>
> Cheers,
> Dmitri.
>
>
> On Thu, Oct 23, 2025 at 12:45 PM Andrew Guterman <[email protected]>
> wrote:
>
> > I understand the risks of non-paginated APIs.
> >
> > The problem is that coupling deprecation of features to release versions
> > means that downstream projects are blocked from upgrading Polaris unless
> > they perform every migration for every breaking change. Feature flags
> > allow downstream projects to pick their own upgrade path.
> >
> > Best,
> > Andrew
> >
> > On Thu, Oct 23, 2025 at 4:22 AM Alexandre Dutra <[email protected]> wrote:
> >
> > > +1 to Robert's proposal of deprecating for removal all non-paginated
> > > requests to Polaris's own APIs.
> > >
> > > For IRC APIs, I'll note that the ObjectMapper that we use already has
> > > a stream read length protection, see
> > > PolarisIcebergObjectMapperCustomizer [1]. We could add a stream write
> > > length protection as well.
> > >
> > > Thanks,
> > > Alex
> > >
> > > [1]: https://github.com/apache/polaris/blob/20febdaede19fb7c46e120652fdd1a262c2138e4/runtime/service/src/main/java/org/apache/polaris/service/config/PolarisIcebergObjectMapperCustomizer.java#L61-L64
> > >
> > > On Thu, Oct 23, 2025 at 12:31 PM Robert Stupp <[email protected]> wrote:
> > > >
> > > > Returning full lists, which can be extremely large, can cause requests
> > > > to fail on the client or the server, cause excessive resource usage,
> > > > or even bring down clients and servers (OOM). That's why most
> > > > listing endpoints have limits on the response size (# of bytes or
> > > > elements) and support paging as a 1st class citizen. I think this is
> > > > what Polaris should do as well.
> > > >
> > > > Considering the risks that come with large responses, I think having
> > > > paging always enabled is the safer approach.
> > > > I propose to deprecate the ability to return "full response lists" at
> > > > least for Polaris' own APIs and require pagination after 1 or 2 minor
> > > > releases.
> > > >
> > > > For IRC, if we agree that overly large responses are a risk, we can
> > > > let requests that would yield too large responses (w/o pagination)
> > > > fail early and protect both the server and the client.
> > > >
> > > > On Mon, Oct 20, 2025 at 7:18 PM Andrew Guterman
> > > > <[email protected]> wrote:
> > > > >
> > > > > Returning the full list when no pageToken is specified would be
> > > > > necessary for backward compatibility, but a feature flag as you
> > > > > mentioned above makes sense to me.
> > > > >
> > > > > Best,
> > > > > Andrew
> > > > >
> > > > > On Wed, Oct 15, 2025 at 7:08 AM Dmitri Bourlatchkov <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > Hi Eric,
> > > > > >
> > > > > > I agree with your points.
> > > > > >
> > > > > > What worries me in the Iceberg spec is this statement:
> > > > > >
> > > > > > "Clients may initiate the first paginated request by sending an
> > > > > > empty query parameter `pageToken` to the server."
> > > > > >
> > > > > > I think it implies that a client that does not send a pageToken
> > > > > > parameter can expect to get the full response (not paginated).
> > > > > >
> > > > > > This is probably not the right forum to discuss the Iceberg spec,
> > > > > > but I'd like to avoid this kind of ambiguity in APIs owned by Polaris.
> > > > > >
> > > > > > Cheers,
> > > > > > Dmitri.
> > > > > >
> > > > > > On Tue, Oct 14, 2025 at 7:45 PM Eric Maynard <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Hey Dmitri,
> > > > > > >
> > > > > > > This actually matches my interpretation of the IRC spec. It says
> > > > > > > <https://github.com/apache/iceberg/blob/c7df5200df462764ba0b3e81484243532c941caf/open-api/rest-catalog-open-api.yaml#L2024>:
> > > > > > >
> > > > > > > > Servers that support pagination should identify the `pageToken`
> > > > > > > > parameter and return a `next-page-token` in the response if there
> > > > > > > > are more results available.
> > > > > > >
> > > > > > > My interpretation of the above is that next-page-token uniquely
> > > > > > > describes whether or not more results are available. Not the size
> > > > > > > of the response. In fact, the spec defines page-size as
> > > > > > > "an *upper bound* of the number of results that a client will
> > > > > > > receive". Tangentially, I would prefer if the spec described this
> > > > > > > in looser terms, such as a "requested upper bound".
> > > > > > >
> > > > > > > What the spec does *not* say is that a client can safely assume
> > > > > > > there are no more results if it receives less than page-size
> > > > > > > elements. I think that you are probably right that a client exists
> > > > > > > which makes an incorrect assumption here though :)
> > > > > > >
> > > > > > > --EM
> > > > > > >
> > > > > > > On Tue, Oct 14, 2025 at 4:37 PM Dmitri Bourlatchkov <[email protected]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Andrew and everyone,
> > > > > > > >
> > > > > > > > Adding pagination to the Management API would be very helpful.
> > > > > > > >
> > > > > > > > As to reusing the pagination parameter semantics of the Iceberg
> > > > > > > > REST spec... I'm not so sure.
> > > > > > > >
> > > > > > > > I do believe that servers should have ultimate control over page
> > > > > > > > sizes. So any client-side "size" parameters should be suggestions
> > > > > > > > or hints at most.
> > > > > > > >
> > > > > > > > As a continuation of that approach, the server should always be
> > > > > > > > able to produce a partial response (with a next page token) even
> > > > > > > > if the client did not provide any explicit pagination parameters.
> > > > > > > >
> > > > > > > > That said, given that existing clients may expect to get "full"
> > > > > > > > results from the Management API when they do not use pagination
> > > > > > > > parameters, I think it should be fine to enable that behaviour
> > > > > > > > with a feature flag.
> > > > > > > >
> > > > > > > > WDYT?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Dmitri.
> > > > > > > >
> > > > > > > > On Fri, Oct 10, 2025 at 8:08 PM Andrew Guterman <[email protected]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hey folks,
> > > > > > > > >
> > > > > > > > > I wanted to gauge sentiment on adding pagination to non-IRC APIs,
> > > > > > > > > such as the management APIs, as the number of management entities
> > > > > > > > > (catalogs, principals, etc) can grow large and become un-listable
> > > > > > > > > all at once.
> > > > > > > > >
> > > > > > > > > I'm not sure if this has been discussed previously but I couldn't
> > > > > > > > > find a thread nor PRs related to it.
> > > > > > > > >
> > > > > > > > > My proposal is to not reinvent the wheel and just re-use the spec
> > > > > > > > > and implementation of the IRC APIs, where requests contain a
> > > > > > > > > "page-token" and "page-size" param, and responses return a
> > > > > > > > > "next-page-token".
> > > > > > > > >
> > > > > > > > > Let me know what you think.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Andrew
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > >
> >
>
