Thank you Alex for all the hard work! I appreciate you breaking this down into Milestone 1 and Milestone 2 to make incremental progress [1].

I wanted to share my concerns and considerations:

1. *Perf*: Every file (easily millions) will now have to go through the catalog to get a signed URL, and each request looks up entities, grants, and table metadata. This concern is partly addressed by the prod readiness check and the feature flag that enables the feature. We plan to fully address it in Milestone 2, where Polaris sends cryptographically signed metadata as part of the signed URL (i.e. table name, required table metadata, other auxiliary info). Signing then becomes lightweight, because that metadata is decrypted in Polaris and Polaris avoids the additional lookups.

2. *Introducing new endpoints in Milestone 1 only to introduce yet more new ones in Milestone 2 shortly after*: I am fairly concerned about this and mentioned the same here [2]. We are introducing a set of public-facing APIs only to deprecate them once M2 is achieved, and we already have folks volunteering to work on M2 [3]. So why not wait until M2 is ready, given that we plan to deprecate the M1 endpoints in favour of M2? If we want to merge the M1 changes to unblock the M2 changes (given other folks are on the same page), I think that's reasonable, but IMHO it will become a *blocker* for the next release unless M2 is completed.

*Note*: These endpoints are custom to Polaris and not what the signer spec dictates, but that's fine since the spec gives wiggle room for that.

For folks who run their own Polaris deployment this will be a significant burden: to support the endpoints introduced in M1, they will build telemetry, alerting, and capacity planning around them, and all of that will be in vain when M2 arrives *very* shortly (since we are actively working on it).

I understand that APIs evolve and one should be prepared for that, but in this scenario we already know and plan that we want to do M2. I would recommend going with only one set of public Polaris APIs which we as a community can stand behind for a reasonable amount of time.

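A rough sketch of the lightweight signing path described in point 1, to make the idea concrete. Everything here is an assumption for illustration and is not taken from the PR or the design doc: the names `SECRET_KEY`, `issue_token`, and `authorize_sign_request`, the token layout, and the claim names are all hypothetical. It also uses a plain HMAC tag rather than encryption, so the payload stays readable by the client; a real design may well want to encrypt it.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical catalog-side secret; a real deployment would load and rotate it securely.
SECRET_KEY = b"example-secret-known-only-to-the-catalog"


def issue_token(table: str, location_prefix: str, expires_at: int) -> str:
    """Pack the table context into a payload and append an HMAC-SHA256 tag."""
    payload = json.dumps(
        {"table": table, "prefix": location_prefix, "exp": expires_at},
        sort_keys=True,
    ).encode("utf-8")
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode("ascii")
        + "."
        + base64.urlsafe_b64encode(tag).decode("ascii")
    )


def authorize_sign_request(token: str, object_uri: str, now: int) -> dict:
    """Verify the token and check the object URI without any catalog lookup."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError as exc:  # wrong number of parts or invalid base64
        raise PermissionError("malformed token") from exc

    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise PermissionError("token signature mismatch")

    claims = json.loads(payload)
    if now >= claims["exp"]:
        raise PermissionError("token expired")
    # Simple prefix check; the prefix is assumed to end with "/".
    if not object_uri.startswith(claims["prefix"]):
        raise PermissionError("object is outside the table location")
    return claims
```

With something along these lines, the token would be issued once when the table is loaded, and each per-file sign request would only cost an HMAC verification plus a string prefix check instead of entity, grant, and table-metadata lookups.
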
Would love to know what others have to say in this context.

[1] https://github.com/apache/polaris/pull/2280#issuecomment-3487401124
[2] https://github.com/apache/polaris/pull/2280#issuecomment-3504202125
[3] https://github.com/apache/polaris/pull/2280#issuecomment-3553234640

Best,
Prashant Singh

On Sat, Dec 13, 2025 at 8:02 AM Artur Rakhmatulin <[email protected]> wrote:

> Hello everyone. Thanks Alex for reviving the discussion.
> I see this PR is struggling a bit with ongoing conflicts, and I'd like
> to offer my help and share my thought on it.
>
> If we decide this feature should move forward, I suggest splitting the
> PR into 3 smaller parts:
>
> **Introduce the API and scaffolding**
> Add a new s3-sign-service module with the required interfaces/DTOs, but
> without wiring it into the build/runtime yet. This provides a clean
> contract for further work.
>
> Deliver:
> - api/s3-sign-service
> - specs
> - put s3-sign-service build under a feature flag
>
> **Add core changes (config + auth)**
> Deliver the polaris-core updates needed for storage configuration and
> authorization, keeping the feature fully opt-in (e.g.,
> storage.s3.signing.enabled=false). No functional signing yet.
>
> Deliver:
> - polaris-core
>
> **Add the actual implementation**
> Provide the concrete S3 signing implementation, register it, and add
> integration/e2e tests. Enable it only when the feature flag is turned on.
>
> Deliver:
> - runtime
> - and rest of the tests
> - weak a feature flag
>
> What do you think about splitting the delivery process this way?
> Do you have alternative suggestions, or do you see this feature being
> delivered more effectively as a single PR instead?
>
> I'd be glad to hear your thoughts.
>
>
> On 12/12/2025 21:20, Alexandre Dutra wrote:
> > Hi all,
> >
> > I'm reviving the discussion regarding remote S3 signing because the PR
> > [1] is now more than 4 months old, and it's been quite a pain to
> > rebase it regularly.
> >
> > I would like to thank Prashant for his thorough review of the PR so
> > far; his feedback did uncover a few issues around table locations that
> > led to [2], but the resulting PR now aligns with Milestone 1 (M1). And
> > by the way, Milestone 2 is already underway. As a reminder, in the M1
> > PR, remote signing is clearly labeled as beta and disabled by default.
> >
> > What is the community's interest and appetite for this, and what is
> > the desired timeline? Do we have any outstanding blockers? I know the
> > PR is big, but maybe it could benefit from more reviews as well.
> >
> > Anyways, let me know what's the best way to move forward with remote signing.
> >
> > Thanks,
> > Alex
> >
> > [1]: https://github.com/apache/polaris/pull/2280
> > [2]: https://github.com/apache/polaris/pull/3226
> >
> > On Tue, Aug 26, 2025 at 3:42 AM Alexandre Dutra <[email protected]> wrote:
> >> Hi all,
> >>
> >> I'm starting a new thread on S3 remote signing to avoid hijacking the
> >> existing one [1].
> >>
> >> To summarize our current progress: we have a design document [2], a
> >> Github issue [3] and an initial PR [4].
> >>
> >> This initial PR establishes the foundation for the feature. In that
> >> PR, remote signing is marked experimental, due to suboptimal
> >> authorization checks and potential performance bottlenecks. However, a
> >> clear path for improvements in both areas has been identified.
> >>
> >> How should we proceed? Is the community in agreement with the general
> >> implementation guidelines and the current PR?
> >>
> >> Thanks,
> >> Alex
> >>
> >> [1]: https://lists.apache.org/thread/qvzwc3qxlfrk9vr7yfbx6zxfhz9lhlbc
> >> [2]: https://docs.google.com/document/d/1ygdia7u4bUHUt6n8XhZo48aKoIyyrCvKqan3XP25iB8/edit?usp=sharing
> >> [3]: https://github.com/apache/polaris/issues/32
> >> [4]: https://github.com/apache/polaris/pull/2280
>
