+1 to Robert's proposal of deprecating for removal all non-paginated requests to Polaris's own APIs.
For IRC APIs, I'll note that the ObjectMapper that we use already has a stream read length protection, see PolarisIcebergObjectMapperCustomizer [1]. We could add a stream write length protection as well.

Thanks,
Alex

[1]: https://github.com/apache/polaris/blob/20febdaede19fb7c46e120652fdd1a262c2138e4/runtime/service/src/main/java/org/apache/polaris/service/config/PolarisIcebergObjectMapperCustomizer.java#L61-L64

On Thu, Oct 23, 2025 at 12:31 PM Robert Stupp wrote:
>
> Returning full lists, which can be extremely large, can let requests
> fail on the client or the server, cause overly excessive resource
> usage or even bring down clients and servers (OOM). That's why most
> listing endpoints have limits on the response size (# of bytes or
> elements) and support paging as a 1st class citizen. I think this is
> what Polaris should do as well.
>
> Considering the risks that come with large responses, I think having
> paging always enabled is the safer approach.
> I propose to deprecate the ability to return "full response lists" at
> least for Polaris' own APIs and require pagination after 1 or 2 minor
> releases.
>
> For IRC, if we agree that overly large responses are a risk, we can
> let requests that would yield too large responses (w/o pagination)
> fail early and protect both the server and the client.
>
> On Mon, Oct 20, 2025 at 7:18 PM Andrew Guterman wrote:
> >
> > Returning the full list when no pageToken is specified would be necessary
> > for backward compatibility, but a feature flag as you mentioned above makes
> > sense to me.
> >
> > Best,
> > Andrew
> >
> > On Wed, Oct 15, 2025 at 7:08 AM Dmitri Bourlatchkov wrote:
> > >
> > > Hi Eric,
> > >
> > > I agree with your points.
> > >
> > > What worries me in the Iceberg spec is this statement:
> > >
> > > "Clients may initiate the first paginated request by sending an empty
> > > query parameter `pageToken` to the server."
> > >
> > > I think it implies that a client that does not send a pageToken parameter
> > > can expect to get the full response (not paginated).
> > >
> > > This is probably not the right forum to discuss the Iceberg spec, but I'd
> > > like to avoid this kind of ambiguity in APIs owned by Polaris.
> > >
> > > Cheers,
> > > Dmitri.
> > >
> > > On Tue, Oct 14, 2025 at 7:45 PM Eric Maynard wrote:
> > > >
> > > > Hey Dmitri,
> > > >
> > > > This actually matches my interpretation of the IRC spec. It says
> > > > <https://github.com/apache/iceberg/blob/c7df5200df462764ba0b3e81484243532c941caf/open-api/rest-catalog-open-api.yaml#L2024>:
> > > >
> > > > > Servers that support pagination should identify the `pageToken` parameter
> > > > > and return a `next-page-token` in the response if there are more results
> > > > > available.
> > > >
> > > > My interpretation of the above is that next-page-token uniquely describes
> > > > whether or not more results are available. Not the size of the response. In
> > > > fact, the spec defines page-size as "an *upper bound* of the number of
> > > > results that a client will receive". Tangentially, I would prefer if the
> > > > spec described this in looser terms, such as a "requested upper bound".
> > > >
> > > > What the spec does *not* say is that a client can safely assume there are
> > > > no more results if it receives fewer than page-size elements. I think that
> > > > you are probably right that a client exists which makes an incorrect
> > > > assumption here though :)
> > > >
> > > > --EM
> > > >
> > > > On Tue, Oct 14, 2025 at 4:37 PM Dmitri Bourlatchkov wrote:
> > > > >
> > > > > Hi Andrew and everyone,
> > > > >
> > > > > Adding pagination to the Management API would be very helpful.
> > > > >
> > > > > As to reusing the pagination parameter semantics of the Iceberg REST
> > > > > spec... I'm not so sure.
> > > > >
> > > > > I do believe that servers should have ultimate control over page sizes. So
> > > > > any client-side "size" parameters should be suggestions or hints at most.
> > > > >
> > > > > As a continuation of that approach, the server should always be able to
> > > > > produce a partial response (with a next page token) even if the client did
> > > > > not provide any explicit pagination parameters.
> > > > >
> > > > > That said, given that existing clients may expect to get "full" results
> > > > > from the Management API when they do not use pagination parameters, I think
> > > > > it should be fine to enable that behaviour with a feature flag.
> > > > >
> > > > > WDYT?
> > > > >
> > > > > Thanks,
> > > > > Dmitri.
> > > > >
> > > > > On Fri, Oct 10, 2025 at 8:08 PM Andrew Guterman wrote:
> > > > > >
> > > > > > Hey folks,
> > > > > >
> > > > > > I wanted to gauge sentiment on adding pagination to non-IRC APIs, such as
> > > > > > the management APIs, as the number of management entities (catalogs,
> > > > > > principals, etc) can grow large and become un-listable all at once.
> > > > > >
> > > > > > I'm not sure if this has been discussed previously but I couldn't find a
> > > > > > thread nor PRs related to it.
> > > > > >
> > > > > > My proposal is to not reinvent the wheel and just re-use the spec and
> > > > > > implementation of the IRC APIs, where requests contain a "page-token" and
> > > > > > "page-size" param, and responses return a "next-page-token".
> > > > > >
> > > > > > Let me know what you think.
> > > > > >
> > > > > > Best,
> > > > > > Andrew
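
For reference, the client-side contract discussed in this thread can be sketched roughly as below. This is only an illustration, not Polaris or Iceberg client code: `list_all`, `make_fake_server`, the `"items"` key, and the integer-offset token format are all hypothetical names and assumptions made for the example. The only pieces taken from the thread are the "page-size" hint, the page token, and the "next-page-token" in the response. The point it shows is that the client keeps going until no next-page-token comes back, and never infers "done" from a page that is shorter than the requested size, since the server may cap pages on its own.

```python
from typing import Callable, Iterator, Optional

# Hypothetical response shape: {"items": [...], "next-page-token": "<opaque>"}
# The "next-page-token" key is absent when there are no more results.
Page = dict


def list_all(
    fetch_page: Callable[[Optional[str], int], Page],
    page_size: int = 100,
) -> Iterator[object]:
    """Yield every entity, following next-page-token until it is absent."""
    token: Optional[str] = None  # no token on the first request
    while True:
        page = fetch_page(token, page_size)
        yield from page.get("items", [])
        token = page.get("next-page-token")
        if not token:
            # Only a missing token signals the end, not a short page.
            return


def make_fake_server(entities: list, server_max: int = 3):
    """Toy in-memory server that treats page_size as a hint and caps it."""

    def fetch_page(page_token: Optional[str], page_size: int) -> Page:
        # Assumption for this sketch: the token is just a stringified offset.
        start = int(page_token) if page_token else 0
        end = start + min(page_size, server_max)
        page: Page = {"items": entities[start:end]}
        if end < len(entities):
            page["next-page-token"] = str(end)
        return page

    return fetch_page


if __name__ == "__main__":
    fetch = make_fake_server(list(range(7)), server_max=3)
    # The client asks for 100 per page but the server returns at most 3.
    print(list(list_all(fetch, page_size=100)))
```

A client that stopped after the first page because it held fewer than 100 elements would miss most of the results here, which is the incorrect assumption Eric mentions.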
