Hey Alex,

Thanks for the update and I'm glad to see the reduction in overall size.

I feel like the discussion we had focused more on integrating and having a single OAuth2Manager solution as opposed to the direction of Option 1 (separate module), which would create two different flavors of OAuth2.

I took a quick look at the latest from the project and there are a lot of parallels between the proposal and what's currently in the baseline from an implementation perspective.

I feel like there are two things we need to address:

1) What flows do we want to support (e.g. ROPC is discouraged)? We probably also need to document the use cases (U2M vs. M2M flows).
2) How can we go about integrating these new flows into the current implementation?

I'm not sure I follow some of the items you have listed above (like the Spark/Flink runtimes), but maybe these are addressed by having the implementation included in core.

We can follow up in the next sync as well,
-Dan

On Mon, Sep 1, 2025 at 8:15 AM Alexandre Dutra <adu...@apache.org> wrote:

> Hi all,
>
> Bumping this thread again with a few updates from the AuthManager project:
>
> The project recently adopted the Nimbus OIDC Java SDK. This decision was made after carefully considering its pros and cons, ultimately concluding it was the best choice for the project's continued growth.
>
> This move addresses two prior concerns about the donation:
>
> - The absence of a recognized OAuth2 library as the project's foundation.
> - The volume of code to be donated: Nimbus incidentally reduced the number of Java production classes by half (from about 90 to 45).
>
> With the above in mind, I've simplified my previous PR breakdown proposal to align with the current codebase, and the updated version is as follows:
>
> 1. Project setup
> 2. OAuth2Manager
> 3. Token Exchange
> 4. Authorization Code
> 5. Device Code
> 6. Resource Owner Password
> 7. Integration & Stress tests
> 8. Spark & Flink runtimes
> 9. Advanced authentication
> 10. Documentation generator tool
>
> Note: the above plan assumes Option 1 from my previous email (donation within the Apache Iceberg repository, as a separate module).
>
> Let me know what you all think of this revised plan.
>
> Thanks,
> Alex
>
> On Thu, Jul 31, 2025 at 7:29 PM Alexandre Dutra <adu...@apache.org> wrote:
> >
> > Hi all,
> >
> > Thanks for the productive discussion yesterday! Since there isn't a recording available yet, I wanted to summarize the key outcomes and next steps:
> >
> > Our main questions revolved around "where" and "what": where to host the donated code, and what features to accept.
> >
> > I believe we should start by focusing on the first question: the code's location. During our discussion, we explored a few options:
> >
> > 1) Within the Apache Iceberg repository, as a separate module
> > 2) In a new repository under Apache Iceberg governance
> > 3) In a new repository under Apache Polaris governance
> >
> > Each option has its pros and cons:
> >
> > - Option 1: This offers a better user experience, as it makes the new manager readily available for Spark and Flink, and simplifies its integration into the Iceberg connector for Trino. The main drawback is the increased maintenance burden.
> >
> > - Options 2 and 3: These options would require users to adjust their habits by adding more JARs or packages. Releases would also follow a separate cadence. However, the code would be better confined within its own repository, which facilitates maintenance.
> >
> > I already voiced my preference for Option 1, but I don't have any strong opinions against the others.
> >
> > I would love to hear the opinions of all those involved.
> >
> > Thanks,
> > Alex
> >
> > On Wed, Jul 30, 2025 at 2:41 PM Alexandre Dutra <adu...@apache.org> wrote:
> > >
> > > Hi Ryan,
> > >
> > > Great idea! I will add this topic to the agenda today.
> > >
> > > I also prepared a proposal document to facilitate the discussion:
> > >
> > > https://docs.google.com/document/d/1ZcZ5VrXZZOgYllPI9-HTZt8986kBJTMQwFHT_-ASgj0/edit?usp=sharing
> > >
> > > Thanks,
> > > Alex
> > >
> > > On Wed, Jul 30, 2025 at 1:23 AM Ryan Blue <rdb...@gmail.com> wrote:
> > > >
> > > > Hi Alex, I think it's a great idea to break down contributions like this into smaller PRs. It's probably good to discuss this at tomorrow's catalog sync to prioritize the functionality you want to add and figure out the best way to fit it in.
> > > >
> > > > On Tue, Jul 29, 2025 at 11:33 AM Alex Dutra <alex.du...@dremio.com.invalid> wrote:
> > > >>
> > > >> Dear Community,
> > > >>
> > > >> I would like to revive this discussion regarding the potential donation of Dremio's Auth Manager.
> > > >>
> > > >> Over the past few days, I have explored the suggestion of dividing the contribution into smaller parts. I am pleased to report that I have successfully broken down the features into approximately 15 pull requests, targeting the main Iceberg repository.
> > > >>
> > > >> While these pull requests are all rather substantial, I think that they remain within a manageable size for reviewers.
> > > >>
> > > >> Would this approach be a good path forward? If so, I can share more details about the timeline and roadmap I have in mind, and of course, I am prepared to begin the donation as soon as I have the Community's green light.
> > > >>
> > > >> Thanks,
> > > >> Alex Dutra
> > > >>
> > > >>
> > > >> On Wed, Jun 25, 2025 at 9:57 AM Alex Dutra <alex.du...@dremio.com> wrote:
> > > >>>
> > > >>> Hi Daniel, hi all,
> > > >>>
> > > >>> Sorry for the late reply. Here are some answers to your questions:
> > > >>>
> > > >>> > I was under the impression that the AuthManager implementation was relatively small (based on the recent work for the GCP AuthManager)
> > > >>>
> > > >>> These are not comparable. The GCP AuthManager is small because it only works for GCP, and thus can leverage Google auth libraries (more specifically, it uses the google-auth-library-oauth2-http artifact; and since this artifact is already a required dependency for iceberg-gcp, it doesn't bring in any extra dependency).
> > > >>>
> > > >>> Conversely, this AuthManager is a general-purpose AuthManager that can work with any IDP.
> > > >>>
> > > >>> > The broader community wasn't involved in decisions made about the implementation
> > > >>>
> > > >>> That’s exactly the purpose of this donation.
> > > >>>
> > > >>> > "impersonation flow" which I'm not familiar with
> > > >>>
> > > >>> This is a feature where the manager can dynamically fetch the subject token for a token exchange, thus managing both the catalog's token and the user's token, facilitating impersonation (and delegation) use cases. Hence the name (admittedly a bit confusing). This feature is still evolving, but we received positive feedback from users and we believe it brings a lot of value – and is not something that a third-party library could do.
> > > >>>
> > > >>> > we need to break it into smaller contributions and figure out the appropriate way to review and assimilate the functionality
> > > >>>
> > > >>> While we are open to this option, we are concerned about the potential duration of its completion. In the interim, users have expressed a need for improved OAuth2 support. Would it be possible to gain some clarity regarding the timeline for a review of this initiative? Perhaps an initial review of the current codebase could help identify and address any potential roadblocks? I can also schedule a demo of the new auth manager, if that helps.
> > > >>>
> > > >>> > how well the community understands the behaviors.
> > > >>>
> > > >>> While OAuth2 may not be familiar or palatable to most Iceberg contributors, I am confident that some of them possess the expertise to effectively review and assess the donation.
> > > >>>
> > > >>> > The main competency of this project isn't to implement security protocols
> > > >>>
> > > >>> This may be true for the GCP auth manager or for the SigV4 one – these are vendor-specific and can leverage the respective vendor's SDK. But how would we support OAuth2 in a generic way otherwise? Or Kerberos? Whether this is a competency of the project or not is debatable. Managing HTTP requests is not a main competency of this project either, and yet we have one RESTClient interface and one HTTPClient implementation, and lots of JSON parsers.
> > > >>>
> > > >>> The RESTClient in its current form already implies using some authentication protocol. The simple case of using static (provided via configuration) tokens does not cover real-world cases that users have expressed interest in. Accepting the Auth Manager will certainly require some extra attention to security protocols from Iceberg maintainers, but it will allow the project to support more advanced use cases. Additionally, the Auth Manager provides a path for users of the existing, deprecated “/token” endpoint to migrate to standard RFC-based OAuth flows.
> > > >>>
> > > >>> > Was there any exploration of leveraging other standard implementations like Apache Oltu, Nimbus, etc. to build the implementation off of?
> > > >>>
> > > >>> Yes, we considered that and decided not to go down that route. For a few reasons:
> > > >>>
> > > >>> 1. Most OAuth libraries provide building blocks to create clients, but they are not fully-fledged clients; you still need to write code in order to glue things together [1].
> > > >>>
> > > >>> 2. These libraries usually have (too?) many dependencies [2]; some of them have not been maintained for a while. And Apache Oltu is retired. In contrast, our Auth Manager only has one small dependency: auth0-jwt.
> > > >>>
> > > >>> 3. If you delegate to a third-party library, then you cannot share the catalog's RESTClient or Executor. The library is going to maintain its own HTTP client and executor, leading to increased resource consumption.
> > > >>>
> > > >>> 4. Nothing precludes us from switching to a third-party library later on (it's an implementation detail). We thought it's best to start with a self-contained project.
> > > >>>
> > > >>> Thanks,
> > > >>> Alex
> > > >>>
> > > >>> [1]: https://connect2id.com/products/nimbus-oauth-openid-connect-sdk/guides/oauth-client-server-development
> > > >>> [2] For Nimbus: https://central.sonatype.com/artifact/com.nimbusds/oauth2-oidc-sdk/11.26/dependencies
> > > >>>
> > > >>> On Thu, Jun 19, 2025 at 5:58 PM Daniel Weeks <dwe...@apache.org> wrote:
> > > >>> >
> > > >>> > I hadn't seen this thread before we discussed it yesterday, but since then I've taken a look and have some reservations.
> > > >>> >
> > > >>> > I was under the impression that the AuthManager implementation was relatively small (based on the recent work for the GCP AuthManager), but after taking a look at the repo, this is far from a small contribution.
> > > >>> >
> > > >>> > I strongly support more robust security support (especially for OAuth2/OIDC), but I don't feel this is going to be a small effort to introduce. The broader community wasn't involved in decisions made about the implementation and I see elements that give me pause (like "impersonation flow" which I'm not familiar with and implementation details like extensions to immutables that aren't consistent with the broader codebase).
> > > >>> >
> > > >>> > If we decide that we want to take this on, I feel like we need to break it into smaller contributions and figure out the appropriate way to review and assimilate the functionality in a way that's consistent with the rest of the project. Due to this being security related, we should take extra precautions around what this introduces and how well the community understands the behaviors.
> > > >>> >
> > > >>> > However, looking at the complexity here relative to the approach with the GCP, I have to question whether this is the right path overall. The main competency of this project isn't to implement security protocols, so it's a lot to say we want a full and complete (possibly with extensions) native implementation of the OAuth2 specification (there are whole projects built around that alone).
> > > >>> >
> > > >>> > Was there any exploration of leveraging other standard implementations like Apache Oltu, Nimbus, etc. to build the implementation off of?
> > > >>> >
> > > >>> > -Dan
> > > >>> >
> > > >>> > On Thu, Jun 19, 2025 at 5:33 AM Alex Dutra <alex.du...@dremio.com.invalid> wrote:
> > > >>> >>
> > > >>> >> Hi Ryan & JB, hi all,
> > > >>> >>
> > > >>> >> I think it would be easier to introduce this new manager as an alternative manager. This would make the migration smoother as it would give users time to migrate at their convenience. Besides, the new manager has the notion of "dialects", and can be configured to behave exactly like the current one (honoring the same config options), making the migration even easier.
> > > >>> >>
> > > >>> >> > Why not contribute the functionality directly to the AuthManager already in Iceberg? Is this incompatible or is there a reason the current one can't be extended through contributions?
> > > >>> >>
> > > >>> >> There are a few reasons why I believe it's not possible to extend the current manager indefinitely:
> > > >>> >>
> > > >>> >> 1. The current auth manager lives in iceberg-core; as we introduce more features, it will become impractical to keep it there, especially since some of the features will require third-party dependencies. As a data point: the new manager contains almost 100 Java production classes (not counting test classes and build scripts).
> > > >>> >> 2. The current auth manager has some well known shortcomings, notably around token refreshes. It's not possible to fix that without introducing regressions and potentially breaking many catalog clients already in production.
> > > >>> >> 3. As we introduce features like Authorization Code grant support, interactions with the IDP will become more complex than just a request-response cycle. Since most of the current logic resides in the OAuth2Util class, which is entirely public, it won't be an easy task to introduce support for such complex flows while avoiding binary incompatibilities.
> > > >>> >>
> > > >>> >> Thanks,
> > > >>> >> Alex
> > > >>> >>
> > > >>> >>
> > > >>> >> On Wed, Jun 18, 2025 at 11:35 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> > > >>> >> >
> > > >>> >> > Hi
> > > >>> >> >
> > > >>> >> > I think it makes sense to directly add in AuthManager. I don't see blockers (with some adaptations). Alex ?
> > > >>> >> >
> > > >>> >> > From a donation process standpoint (if accepted), I'm happy to help with the SGA and IP Clearance.
> > > >>> >> >
> > > >>> >> > Regards
> > > >>> >> > JB
> > > >>> >> >
> > > >>> >> > On Wed, Jun 18, 2025 at 9:15 PM Ryan Blue <rdb...@gmail.com> wrote:
> > > >>> >> > >
> > > >>> >> > > I think it would be great to bring this functionality into Iceberg. I'm curious about your plan for getting it in. It sounds like you're suggesting adding the Dremio project to the Iceberg repo and making it optional. Why not contribute the functionality directly to the AuthManager already in Iceberg? Is this incompatible or is there a reason the current one can't be extended through contributions?
> > > >>> >> > >
> > > >>> >> > > On Tue, Jun 17, 2025 at 11:23 AM Christian Thiel <christian.t.b...@gmail.com> wrote:
> > > >>> >> > >>
> > > >>> >> > >> Hey Alex,
> > > >>> >> > >>
> > > >>> >> > >> Thanks for the Initiative — I really appreciate the effort here!
> > > >>> >> > >>
> > > >>> >> > >> Having good auth compatibility in the Catalog ecosystem is key to establish secure standards by making them easy to use. While Iceberg should stay open to other means of Authentication, OAuth2 is the most widely adopted interoperable auth standard, and its role in Iceberg REST reflects that. But with human-centric flows like Auth Code (with PKCE 😉) and Device Code missing from most standard clients, users often default to handing out personal Client ID/secret pairs—which is really bad from a security perspective.
> > > >>> >> > >>
> > > >>> >> > >> While I can’t speak to the Java details, I fully support bringing the functionality into Iceberg. I have tested the proposed code successfully with Spark and different IdPs, including Auth & Device Code flows with token refresh, as well as token refresh for Client Credential flows.
> > > >>> >> > >>
> > > >>> >> > >> Thanks!
> > > >>> >> > >>
> > > >>> >> > >> Christian
> > > >>> >> > >>
> > > >>> >> > >>
> > > >>> >> > >>
> > > >>> >> > >> On Mon, 16 Jun 2025 at 20:33, Alex Dutra <alex.du...@dremio.com.invalid> wrote:
> > > >>> >> > >>>
> > > >>> >> > >>> Hi all,
> > > >>> >> > >>>
> > > >>> >> > >>> Dremio recently open-sourced a new implementation of the Auth Manager API for OAuth2:
> > > >>> >> > >>>
> > > >>> >> > >>> https://github.com/dremio/iceberg-auth-manager
> > > >>> >> > >>>
> > > >>> >> > >>> I wrote a blog post about it a while ago [1].
> > > >>> >> > >>>
> > > >>> >> > >>> Built on top of the Auth Manager API introduced in Iceberg 1.9.0, this project provides a more flexible and extensible OAuth2 manager compared to the built-in equivalent in Iceberg Core. It follows OAuth2 standards strictly, but also provides compatibility with any existing Apache Iceberg REST catalog, and contains no Dremio-specific functionality. To date, this is the only OAuth2 manager fully compliant with external identity providers.
> > > >>> >> > >>>
> > > >>> >> > >>> Dremio would like to contribute this code to the Apache Iceberg project. I am therefore initiating this discussion to determine the community's interest in accepting this donation.
> > > >>> >> > >>>
> > > >>> >> > >>> This project is beneficial to the community because it addresses well-known limitations, such as token refresh problems [2][3][4], and also because it introduces highly anticipated features like the Authorization Code grant support [5]. Fixing these limitations or adding support for such large features in the built-in manager, while avoiding any risk of regressions, would have been a lot harder.
> > > >>> >> > >>>
> > > >>> >> > >>> Also worth mentioning: this project adheres to the "Iceberg OAuth2 Client Authentication Guide", proposed by Christian Thiel [6].
> > > >>> >> > >>>
> > > >>> >> > >>> This project could initially serve as a runtime-selectable alternative to the current built-in implementation. Upon reaching sufficient maturity however, it could potentially replace the existing manager.
> > > >>> >> > >>>
> > > >>> >> > >>> Please share your thoughts by replying to this email. Alternatively, we can discuss this topic at the Catalog Sync meeting this Wednesday, June 18th, if that is a more comfortable option to everyone.
> > > >>> >> > >>>
> > > >>> >> > >>> Thanks,
> > > >>> >> > >>>
> > > >>> >> > >>> Alex
> > > >>> >> > >>>
> > > >>> >> > >>> [1] https://medium.com/data-engineering-with-dremio/introducing-dremio-auth-manager-for-apache-iceberg-223827342d19
> > > >>> >> > >>> [2]: https://github.com/apache/iceberg/issues/12196
> > > >>> >> > >>> [3]: https://github.com/apache/iceberg/issues/12363
> > > >>> >> > >>> [4]: https://github.com/apache/iceberg/issues/13030
> > > >>> >> > >>> [5]: https://github.com/apache/iceberg/issues/10677
> > > >>> >> > >>> [6]: https://docs.google.com/document/d/1buW9PCNoHPeP7Br5_vZRTU-_3TExwLx6bs075gi94xc/edit?tab=t.0#heading=h.hufqidg1ij89