I understand the risks of non-paginated APIs. The problem is that coupling deprecation of features to release versions means that downstream projects are blocked from upgrading Polaris unless they perform every migration for every breaking change. Feature flags allow downstream projects to pick their own upgrade path.
Best,
Andrew

On Thu, Oct 23, 2025 at 4:22 AM Alexandre Dutra <[email protected]> wrote:

> +1 to Robert's proposal of deprecating for removal all non-paginated
> requests to Polaris's own APIs.
>
> For IRC APIs, I'll note that the ObjectMapper that we use already has
> a stream read length protection, see
> PolarisIcebergObjectMapperCustomizer [1]. We could add a stream write
> length protection as well.
>
> Thanks,
> Alex
>
> [1]:
> https://github.com/apache/polaris/blob/20febdaede19fb7c46e120652fdd1a262c2138e4/runtime/service/src/main/java/org/apache/polaris/service/config/PolarisIcebergObjectMapperCustomizer.java#L61-L64
>
> On Thu, Oct 23, 2025 at 12:31 PM Robert Stupp <[email protected]> wrote:
> >
> > Returning full lists, which can be extremely large, can let requests
> > fail on the client or the server, cause overly excessive resource
> > usage or even bring down clients and servers (OOM). That's why most
> > listing endpoints have limits on the response size (# of bytes or
> > elements) and support paging as a 1st class citizen. I think this is
> > what Polaris should do as well.
> >
> > Considering the risks that come with large responses, I think having
> > paging always enabled is the safer approach.
> > I propose to deprecate the ability to return "full response lists" at
> > least for Polaris' own APIs and require pagination after 1 or 2 minor
> > releases.
> >
> > For IRC, if we agree that overly large responses are a risk, we can
> > let requests that would yield too large responses (w/o pagination)
> > fail early and protect both the server and the client.
> >
> > On Mon, Oct 20, 2025 at 7:18 PM Andrew Guterman
> > <[email protected]> wrote:
> > >
> > > Returning the full list when no pageToken is specified would be necessary
> > > for backward compatibility, but a feature flag as you mentioned above makes
> > > sense to me.
> > >
> > > Best,
> > > Andrew
> > >
> > > On Wed, Oct 15, 2025 at 7:08 AM Dmitri Bourlatchkov <[email protected]>
> > > wrote:
> > >
> > > > Hi Eric,
> > > >
> > > > I agree with your points.
> > > >
> > > > What worries me in the Iceberg spec is this statement:
> > > >
> > > > "Clients may initiate the first paginated request by sending an empty query
> > > > parameter `pageToken` to the server."
> > > >
> > > > I think it implies that a client that does not send a pageToken parameter
> > > > can expect to get the full response (not paginated).
> > > >
> > > > This is probably not the right forum to discuss the Iceberg spec, but I'd
> > > > like to avoid this kind of ambiguity in APIs owned by Polaris.
> > > >
> > > > Cheers,
> > > > Dmitri.
> > > >
> > > > On Tue, Oct 14, 2025 at 7:45 PM Eric Maynard <[email protected]>
> > > > wrote:
> > > >
> > > > > Hey Dmitri,
> > > > >
> > > > > This actually matches my interpretation of the IRC spec. It says
> > > > > <https://github.com/apache/iceberg/blob/c7df5200df462764ba0b3e81484243532c941caf/open-api/rest-catalog-open-api.yaml#L2024>:
> > > > >
> > > > > > Servers that support pagination should identify the `pageToken` parameter
> > > > > > and return a `next-page-token` in the response if there are more results
> > > > > > available.
> > > > >
> > > > > My interpretation of the above is that next-page-token uniquely describes
> > > > > whether or not more results are available. Not the size of the response. In
> > > > > fact, the spec defines page-size as "an *upper bound* of the number of
> > > > > results that a client will receive". Tangentially, I would prefer if the
> > > > > spec described this in looser terms, such as a "requested upper bound".
> > > > >
> > > > > What the spec does *not* say is that a client can safely assume there are
> > > > > no more results if it receives less than page-size elements.
> > > > > I think that you are probably right that a client exists which makes an
> > > > > incorrect assumption here though :)
> > > > >
> > > > > --EM
> > > > >
> > > > > On Tue, Oct 14, 2025 at 4:37 PM Dmitri Bourlatchkov <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > Hi Andrew and everyone,
> > > > > >
> > > > > > Adding pagination to the Management API would be very helpful.
> > > > > >
> > > > > > As to reusing the pagination parameter semantics of the Iceberg REST
> > > > > > spec... I'm not so sure.
> > > > > >
> > > > > > I do believe that servers should have ultimate control over page sizes. So
> > > > > > any client-side "size" parameters should be suggestions or hints at most.
> > > > > >
> > > > > > As a continuation of that approach, the server should always be able to
> > > > > > produce a partial response (with a next page token) even if the client did
> > > > > > not provide any explicit pagination parameters.
> > > > > >
> > > > > > That said, given that existing clients may expect to get "full" results
> > > > > > from the Management API when they do not use pagination parameters, I think
> > > > > > it should be fine to enable that behaviour with a feature flag.
> > > > > >
> > > > > > WDYT?
> > > > > >
> > > > > > Thanks,
> > > > > > Dmitri.
> > > > > >
> > > > > > On Fri, Oct 10, 2025 at 8:08 PM Andrew Guterman <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Hey folks,
> > > > > > >
> > > > > > > I wanted to gauge sentiment on adding pagination to non-IRC APIs, such as
> > > > > > > the management APIs, as the number of management entities (catalogs,
> > > > > > > principals, etc) can grow large and become un-listable all at once.
> > > > > > >
> > > > > > > I'm not sure if this has been discussed previously but I couldn't find a
> > > > > > > thread nor PRs related to it.
> > > > > > >
> > > > > > > My proposal is to not reinvent the wheel and just re-use the spec and
> > > > > > > implementation of the IRC APIs, where requests contain a "page-token" and
> > > > > > > "page-size" param, and responses return a "next-page-token".
> > > > > > >
> > > > > > > Let me know what you think.
> > > > > > >
> > > > > > > Best,
> > > > > > > Andrew
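
For readers following the thread, here is a minimal Python sketch of the pagination contract discussed above: the server caps the page size regardless of what the client asks for, and the client stops only when no next-page-token is returned. Everything in it is an assumption made for illustration and is not Polaris code: the in-memory `CATALOGS` list, the `list_catalogs` and `list_all_catalogs` functions, the `SERVER_MAX_PAGE_SIZE` limit, and the `ALLOW_UNPAGINATED_LISTING` feature flag are all hypothetical names.

```python
from typing import Optional

# Hypothetical server-side settings; none of these names come from Polaris.
SERVER_MAX_PAGE_SIZE = 3
ALLOW_UNPAGINATED_LISTING = False  # hypothetical feature flag for legacy clients

# Toy data standing in for management entities such as catalogs.
CATALOGS = [f"catalog-{i}" for i in range(8)]


def list_catalogs(page_token: Optional[str] = None,
                  page_size: Optional[int] = None) -> dict:
    """Toy listing endpoint: the server, not the client, decides the page size."""
    if page_token is None and page_size is None and ALLOW_UNPAGINATED_LISTING:
        # Legacy behaviour kept behind the flag: return everything at once.
        return {"catalogs": list(CATALOGS), "next-page-token": None}

    # In this sketch the token is simply the offset of the next element.
    start = int(page_token) if page_token else 0
    # Treat the client's page_size as a hint, capped by the server limit.
    limit = min(page_size or SERVER_MAX_PAGE_SIZE, SERVER_MAX_PAGE_SIZE)
    end = start + limit
    next_token = str(end) if end < len(CATALOGS) else None
    return {"catalogs": CATALOGS[start:end], "next-page-token": next_token}


def list_all_catalogs(page_size: int = 5) -> list:
    """Client loop: stop only when next-page-token is absent, never on a short page."""
    results = []
    token = ""  # an empty pageToken starts a paginated listing
    while True:
        page = list_catalogs(page_token=token, page_size=page_size)
        results.extend(page["catalogs"])
        token = page["next-page-token"]
        if token is None:
            return results
```

In this sketch the client requests 5 elements per page but the server returns at most 3, so every page is "short". A client that treated a short page as the end of the listing would stop after the first page and miss the remaining entries, which is the incorrect assumption Eric describes.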
