+1 on introducing pagination to avoid unbounded collections; that’s definitely the right direction.
That said, I’d be cautious about completely removing non-paginated behavior. There are valid scenarios that rely on retrieving a consistent, point-in-time view of data. Pagination across a live dataset can introduce drift between pages unless the server supports some form of snapshot pinning (e.g., a read timestamp, revision ID, or snapshot token). It might be worth discussing how we can support these point-in-time correctness requirements alongside pagination.

Yufei

On Thu, Oct 23, 2025 at 10:54 AM Dmitri Bourlatchkov <[email protected]> wrote:

> Hi All,
>
> Supporting the "old" (full list, non-paginated) behaviour with a feature
> flag sounds reasonable to me.
>
> I believe the default should still be "off" (i.e. all requests are
> paginated). Affected deployments will be able to set the flag proactively
> before upgrading to maintain compatibility.
>
> Cheers,
> Dmitri.
>
>
> On Thu, Oct 23, 2025 at 12:45 PM Andrew Guterman <[email protected]> wrote:
>
> > I understand the risks of non-paginated APIs.
> >
> > The problem is that coupling deprecation of features to release versions
> > means that downstream projects are blocked from upgrading Polaris unless
> > they perform every migration for every breaking change. Feature flags allow
> > downstream projects to pick their own upgrade path.
> >
> > Best,
> > Andrew
> >
> > On Thu, Oct 23, 2025 at 4:22 AM Alexandre Dutra <[email protected]> wrote:
> >
> > > +1 to Robert's proposal of deprecating for removal all non-paginated
> > > requests to Polaris's own APIs.
> > >
> > > For IRC APIs, I'll note that the ObjectMapper that we use already has
> > > a stream read length protection, see
> > > PolarisIcebergObjectMapperCustomizer [1]. We could add a stream write
> > > length protection as well.
> > >
> > > Thanks,
> > > Alex
> > >
> > > [1]: https://github.com/apache/polaris/blob/20febdaede19fb7c46e120652fdd1a262c2138e4/runtime/service/src/main/java/org/apache/polaris/service/config/PolarisIcebergObjectMapperCustomizer.java#L61-L64
> > >
> > > On Thu, Oct 23, 2025 at 12:31 PM Robert Stupp <[email protected]> wrote:
> > > >
> > > > Returning full lists, which can be extremely large, can let requests
> > > > fail on the client or the server, cause overly excessive resource
> > > > usage or even bring down clients and servers (OOM). That's why most
> > > > listing endpoints have limits on the response size (# of bytes or
> > > > elements) and support paging as a 1st class citizen. I think this is
> > > > what Polaris should do as well.
> > > >
> > > > Considering the risks that come with large responses, I think having
> > > > paging always enabled is the safer approach.
> > > > I propose to deprecate the ability to return "full response lists" at
> > > > least for Polaris' own APIs and require pagination after 1 or 2 minor
> > > > releases.
> > > >
> > > > For IRC, if we agree that overly large responses are a risk, we can
> > > > let requests that would yield too large responses (w/o pagination)
> > > > fail early and protect both the server and the client.
> > > >
> > > > On Mon, Oct 20, 2025 at 7:18 PM Andrew Guterman <[email protected]> wrote:
> > > > >
> > > > > Returning the full list when no pageToken is specified would be necessary
> > > > > for backward compatibility, but a feature flag as you mentioned above makes
> > > > > sense to me.
> > > > >
> > > > > Best,
> > > > > Andrew
> > > > >
> > > > > On Wed, Oct 15, 2025 at 7:08 AM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > >
> > > > > > Hi Eric,
> > > > > >
> > > > > > I agree with your points.
> > > > > >
> > > > > > What worries me in the Iceberg spec is this statement:
> > > > > >
> > > > > > "Clients may initiate the first paginated request by sending an empty query
> > > > > > parameter `pageToken` to the server."
> > > > > >
> > > > > > I think it implies that a client that does not send a pageToken parameter
> > > > > > can expect to get the full response (not paginated).
> > > > > >
> > > > > > This is probably not the right forum to discuss the Iceberg spec, but I'd
> > > > > > like to avoid this kind of ambiguity in APIs owned by Polaris.
> > > > > >
> > > > > > Cheers,
> > > > > > Dmitri.
> > > > > >
> > > > > > On Tue, Oct 14, 2025 at 7:45 PM Eric Maynard <[email protected]> wrote:
> > > > > >
> > > > > > > Hey Dmitri,
> > > > > > >
> > > > > > > This actually matches my interpretation of the IRC spec. It says
> > > > > > > <https://github.com/apache/iceberg/blob/c7df5200df462764ba0b3e81484243532c941caf/open-api/rest-catalog-open-api.yaml#L2024>:
> > > > > > >
> > > > > > > > Servers that support pagination should identify the `pageToken` parameter
> > > > > > > > and return a `next-page-token` in the response if there are more results
> > > > > > > > available.
> > > > > > >
> > > > > > > My interpretation of the above is that next-page-token uniquely describes
> > > > > > > whether or not more results are available. Not the size of the response. In
> > > > > > > fact, the spec defines page-size as "an *upper bound* of the number of
> > > > > > > results that a client will receive". Tangentially, I would prefer if the
> > > > > > > spec described this in looser terms, such as a "requested upper bound".
> > > > > > >
> > > > > > > What the spec does *not* say is that a client can safely assume there are
> > > > > > > no more results if it receives less than page-size elements. I think that
> > > > > > > you are probably right that a client exists which makes an incorrect
> > > > > > > assumption here though :)
> > > > > > >
> > > > > > > --EM
> > > > > > >
> > > > > > > On Tue, Oct 14, 2025 at 4:37 PM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi Andrew and everyone,
> > > > > > > >
> > > > > > > > Adding pagination to the Management API would be very helpful.
> > > > > > > >
> > > > > > > > As to reusing the pagination parameter semantics of the Iceberg REST
> > > > > > > > spec... I'm not so sure.
> > > > > > > >
> > > > > > > > I do believe that servers should have ultimate control over page sizes. So
> > > > > > > > any client-side "size" parameters should be suggestions or hints at most.
> > > > > > > >
> > > > > > > > As a continuation of that approach, the server should always be able to
> > > > > > > > produce a partial response (with a next page token) even if the client did
> > > > > > > > not provide any explicit pagination parameters.
> > > > > > > >
> > > > > > > > That said, given that existing clients may expect to get "full" results
> > > > > > > > from the Management API when they do not use pagination parameters, I think
> > > > > > > > it should be fine to enable that behaviour with a feature flag.
> > > > > > > >
> > > > > > > > WDYT?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Dmitri.
> > > > > > > >
> > > > > > > > On Fri, Oct 10, 2025 at 8:08 PM Andrew Guterman <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hey folks,
> > > > > > > > >
> > > > > > > > > I wanted to gauge sentiment on adding pagination to non-IRC APIs, such as
> > > > > > > > > the management APIs, as the number of management entities (catalogs,
> > > > > > > > > principals, etc) can grow large and become un-listable all at once.
> > > > > > > > >
> > > > > > > > > I'm not sure if this has been discussed previously but I couldn't find a
> > > > > > > > > thread nor PRs related to it.
> > > > > > > > >
> > > > > > > > > My proposal is to not reinvent the wheel and just re-use the spec and
> > > > > > > > > implementation of the IRC APIs, where requests contain a "page-token" and
> > > > > > > > > "page-size" param, and responses return a "next-page-token".
> > > > > > > > >
> > > > > > > > > Let me know what you think.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Andrew
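
To make the points in this thread concrete, here is a rough Python sketch (not Polaris code) that combines three of them: the server caps the page size regardless of what the client asks for, the client stops only when no next-page-token comes back, and the page token pins every later page to the revision seen by the first page. The names CatalogStore, list_page, list_all and MAX_PAGE_SIZE, the per-revision in-memory snapshots, and the base64/JSON token layout are all assumptions made up for illustration.

```python
import base64
import json


class CatalogStore:
    """Hypothetical in-memory store that keeps an immutable snapshot per revision."""

    MAX_PAGE_SIZE = 2  # server-side cap; a client's page size is only a hint

    def __init__(self):
        self._snapshots = {0: ()}  # revision -> sorted tuple of entity names
        self._revision = 0

    def add(self, name):
        # Each write creates a new revision; older snapshots stay readable.
        current = self._snapshots[self._revision]
        self._revision += 1
        self._snapshots[self._revision] = tuple(sorted(current + (name,)))

    def list_page(self, page_token=None, page_size=None):
        if page_token:
            # Later pages stay pinned to the revision recorded in the token.
            state = json.loads(base64.urlsafe_b64decode(page_token))
            revision, offset = state["rev"], state["off"]
        else:
            # First page: pin the listing to the latest revision.
            revision, offset = self._revision, 0
        limit = min(page_size or self.MAX_PAGE_SIZE, self.MAX_PAGE_SIZE)
        items = self._snapshots[revision]
        response = {"items": list(items[offset:offset + limit])}
        if offset + limit < len(items):
            # Only emit a token when more results exist in this snapshot.
            state = {"rev": revision, "off": offset + limit}
            response["next-page-token"] = base64.urlsafe_b64encode(
                json.dumps(state).encode()
            ).decode()
        return response


def list_all(store, page_size=None, on_page=None):
    """Client loop: stop only when the response carries no next-page-token."""
    results, token = [], None
    while True:
        response = store.list_page(page_token=token, page_size=page_size)
        results.extend(response["items"])
        if on_page:
            on_page()  # hook used here to simulate a concurrent write
        token = response.get("next-page-token")
        if token is None:
            return results
```

In this sketch a write that lands between two page requests does not leak into the listing already in progress, because the token carries the pinned revision. A real server would also need a retention policy for old snapshots and an opaque, tamper-proof token, neither of which is shown here.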
