+1 to Robert's proposal of deprecating for removal all non-paginated
requests to Polaris's own APIs.

For IRC APIs, I'll note that the ObjectMapper that we use already has
a stream read length protection, see
PolarisIcebergObjectMapperCustomizer [1]. We could add a stream write
length protection as well.
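
To illustrate what a write-side limit could look like, here is a minimal
sketch that uses only java.io. BoundedOutputStream is a hypothetical name,
not existing Polaris or Jackson code, and the byte limit is an assumed
configuration value. The idea is to wrap the stream the serializer writes
to and fail as soon as the limit would be exceeded:

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical helper: rejects writes once more than maxBytes would be written. */
final class BoundedOutputStream extends FilterOutputStream {
  private final long maxBytes; // assumed to come from server configuration
  private long written;

  BoundedOutputStream(OutputStream delegate, long maxBytes) {
    super(delegate);
    this.maxBytes = maxBytes;
  }

  @Override
  public void write(int b) throws IOException {
    ensureCapacity(1);
    out.write(b);
    written++;
  }

  @Override
  public void write(byte[] buf, int off, int len) throws IOException {
    ensureCapacity(len);
    out.write(buf, off, len);
    written += len;
  }

  // Fails before any of the offending bytes reach the delegate.
  private void ensureCapacity(int len) throws IOException {
    if (written + len > maxBytes) {
      throw new IOException("Response exceeds limit of " + maxBytes + " bytes");
    }
  }
}
```

One caveat with this kind of guard: if part of the response has already been
sent to the client when the limit trips, the client would likely see a
truncated response rather than a clean error status, so some buffering or an
up-front size estimate may still be needed.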

Thanks,
Alex

[1]: https://github.com/apache/polaris/blob/20febdaede19fb7c46e120652fdd1a262c2138e4/runtime/service/src/main/java/org/apache/polaris/service/config/PolarisIcebergObjectMapperCustomizer.java#L61-L64

On Thu, Oct 23, 2025 at 12:31 PM Robert Stupp <[email protected]> wrote:
>
> Returning full lists, which can be extremely large, can cause requests
> to fail on the client or the server, lead to excessive resource usage,
> or even bring down clients and servers (OOM). That's why most listing
> endpoints limit the response size (# of bytes or elements) and support
> paging as a first-class citizen. I think this is what Polaris should do
> as well.
>
> Considering the risks that come with large responses, I think having
> paging always enabled is the safer approach.
> I propose to deprecate the ability to return "full response lists" at
> least for Polaris' own APIs and require pagination after 1 or 2 minor
> releases.
>
> For IRC, if we agree that overly large responses are a risk, we can
> let requests that would yield too large responses (w/o pagination)
> fail early and protect both the server and the client.
>
> On Mon, Oct 20, 2025 at 7:18 PM Andrew Guterman
> <[email protected]> wrote:
> >
> > Returning the full list when no pageToken is specified would be necessary
> > for backward compatibility, but a feature flag as you mentioned above makes
> > sense to me.
> >
> > Best,
> > Andrew
> >
> > On Wed, Oct 15, 2025 at 7:08 AM Dmitri Bourlatchkov <[email protected]>
> > wrote:
> >
> > > Hi Eric,
> > >
> > > I agree with your points.
> > >
> > > What worries me in the Iceberg spec is this statement:
> > >
> > > "Clients may initiate the first paginated request by sending an empty query
> > > parameter `pageToken` to the server."
> > >
> > > I think it implies that a client that does not send a pageToken parameter
> > > can expect to get the full response (not paginated).
> > >
> > > This is probably not the right forum to discuss the Iceberg spec, but I'd
> > > like to avoid this kind of ambiguity in APIs owned by Polaris.
> > >
> > > Cheers,
> > > Dmitri.
> > >
> > > On Tue, Oct 14, 2025 at 7:45 PM Eric Maynard <[email protected]>
> > > wrote:
> > >
> > > > Hey Dmitri,
> > > >
> > > > This actually matches my interpretation of the IRC spec. It says
> > > > <
> > > >
> > > https://github.com/apache/iceberg/blob/c7df5200df462764ba0b3e81484243532c941caf/open-api/rest-catalog-open-api.yaml#L2024
> > > > >
> > > > :
> > > >
> > > > > Servers that support pagination should identify the `pageToken` parameter
> > > > > and return a `next-page-token` in the response if there are more results
> > > > > available.
> > > >
> > > > My interpretation of the above is that next-page-token uniquely describes
> > > > whether or not more results are available, not the size of the response. In
> > > > fact, the spec defines page-size as "an *upper bound* of the number of
> > > > results that a client will receive". Tangentially, I would prefer if the
> > > > spec described this in looser terms, such as a "requested upper bound".
> > > >
> > > > What the spec does *not* say is that a client can safely assume there are
> > > > no more results if it receives fewer than page-size elements. I think that
> > > > you are probably right that a client exists which makes an incorrect
> > > > assumption here though :)
> > > >
> > > > --EM
> > > >
> > > > On Tue, Oct 14, 2025 at 4:37 PM Dmitri Bourlatchkov <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi Andrew and everyone,
> > > > >
> > > > > Adding pagination to the Management API would be very helpful.
> > > > >
> > > > > As to reusing the pagination parameter semantics of the Iceberg REST
> > > > > spec... I'm not so sure.
> > > > >
> > > > > I do believe that servers should have ultimate control over page sizes. So
> > > > > any client-side "size" parameters should be suggestions or hints at most.
> > > > >
> > > > > As a continuation of that approach, the server should always be able to
> > > > > produce a partial response (with a next page token) even if the client did
> > > > > not provide any explicit pagination parameters.
> > > > >
> > > > > That said, given that existing clients may expect to get "full" results
> > > > > from the Management API when they do not use pagination parameters, I think
> > > > > it should be fine to enable that behaviour with a feature flag.
> > > > >
> > > > > WDYT?
> > > > >
> > > > > Thanks,
> > > > > Dmitri.
> > > > >
> > > > > On Fri, Oct 10, 2025 at 8:08 PM Andrew Guterman <
> > > > > [email protected]>
> > > > > wrote:
> > > > >
> > > > > > Hey folks,
> > > > > >
> > > > > > I wanted to gauge sentiment on adding pagination to non-IRC APIs, such as
> > > > > > the management APIs, as the number of management entities (catalogs,
> > > > > > principals, etc.) can grow large and become un-listable all at once.
> > > > > >
> > > > > > I'm not sure if this has been discussed previously, but I couldn't find a
> > > > > > thread or PRs related to it.
> > > > > >
> > > > > > My proposal is to not reinvent the wheel and just re-use the spec and
> > > > > > implementation of the IRC APIs, where requests contain a "page-token" and
> > > > > > "page-size" param, and responses return a "next-page-token".
> > > > > >
> > > > > > Let me know what you think.
> > > > > >
> > > > > > Best,
> > > > > > Andrew
> > > > > >
> > > > >
> > > >
> > >
