Hi Ryan,

Great idea! I will add this topic to the agenda today.

I also prepared a proposal document to facilitate the discussion:

https://docs.google.com/document/d/1ZcZ5VrXZZOgYllPI9-HTZt8986kBJTMQwFHT_-ASgj0/edit?usp=sharing

Thanks,
Alex

On Wed, Jul 30, 2025 at 1:23 AM Ryan Blue <rdb...@gmail.com> wrote:
>
> Hi Alex, I think it's a great idea to break down contributions like this into 
> smaller PRs. It's probably good to discuss this at tomorrow's catalog sync to 
> prioritize the functionality you want to add and figure out the best way to 
> fit it in.
>
> On Tue, Jul 29, 2025 at 11:33 AM Alex Dutra <alex.du...@dremio.com.invalid> wrote:
>>
>> Dear Community,
>>
>> I would like to revive this discussion regarding the potential donation of 
>> Dremio's Auth Manager.
>>
>> Over the past few days, I have explored the suggestion of dividing the 
>> contribution into smaller parts. I am pleased to report that I have 
>> successfully broken down the features into approximately 15 pull requests, 
>> targeting the main Iceberg repository.
>>
>> While these pull requests are all rather substantial, I think that they 
>> remain within a manageable size for reviewers.
>>
>> Would this approach be a good path forward? If so, I can share more details 
>> about the timeline and roadmap I have in mind, and of course, I am prepared 
>> to begin the donation as soon as I have the Community's green light.
>>
>> Thanks,
>> Alex Dutra
>>
>>
>> On Wed, Jun 25, 2025 at 9:57 AM Alex Dutra <alex.du...@dremio.com> wrote:
>>>
>>> Hi Daniel, hi all,
>>>
>>> Sorry for the late reply. Here are some answers to your questions:
>>>
>>> > I was under the impression that the AuthManager implementation was 
>>> > relatively small (based on the recent work for the GCP AuthManager)
>>>
>>> These are not comparable. The GCP AuthManager is small because it only
>>> works for GCP, and thus can leverage Google auth libraries (more
>>> specifically, it uses the google-auth-library-oauth2-http artifact;
>>> and since this artifact is already a required dependency for
>>> iceberg-gcp, it doesn't bring in any extra dependency).
>>>
>>> Conversely, this AuthManager is a general-purpose AuthManager that can
>>> work with any IDP.
>>>
>>> > The broader community wasn't involved in decisions made about the 
>>> > implementation
>>>
>>> That’s exactly the purpose of this donation.
>>>
>>> > "impersonation flow" which I'm not familiar with
>>>
>>> This is a feature where the manager can dynamically fetch the subject
>>> token for a token exchange, thus managing both the catalog's token and
>>> the user's token, facilitating impersonation (and delegation) use
>>> cases. Hence the name (admittedly a bit confusing). This feature is
>>> still evolving, but we received positive feedback from users and we
>>> believe it brings a lot of value – and is not something that a
>>> third-party library could do.
>>>
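As a rough illustration of the token exchange described above, the form body of an RFC 8693 request could be built as in the sketch below. The class and method names are hypothetical and are not taken from the Auth Manager codebase; which token acts as subject and which as actor is likewise an assumption made for illustration. Only the parameter names and URN values come from RFC 8693.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
public class TokenExchangeSketch {

  // Grant type and token type identifiers defined by RFC 8693.
  static final String GRANT_TYPE = "urn:ietf:params:oauth:grant-type:token-exchange";
  static final String ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token";

  /**
   * Builds the application/x-www-form-urlencoded body of a token exchange request.
   * With only a subject token the request expresses impersonation; adding an
   * actor token expresses delegation.
   */
  static String buildForm(String subjectToken, String actorToken) {
    Map<String, String> params = new LinkedHashMap<>();
    params.put("grant_type", GRANT_TYPE);
    params.put("subject_token", subjectToken);
    params.put("subject_token_type", ACCESS_TOKEN_TYPE);
    if (actorToken != null) {
      params.put("actor_token", actorToken);
      params.put("actor_token_type", ACCESS_TOKEN_TYPE);
    }
    // Percent-encode each key and value, then join the pairs with '&'.
    return params.entrySet().stream()
        .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
        .collect(Collectors.joining("&"));
  }

  private static String encode(String value) {
    return URLEncoder.encode(value, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Assumed roles: the user's token is the subject, the catalog's token the actor.
    System.out.println(buildForm("user-token", "catalog-token"));
  }
}
```

In a real client the resulting string would be sent as the body of a POST to the identity provider's token endpoint; fetching the subject token dynamically, as the message describes, is outside the scope of this sketch.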
>>> > we need to break it into smaller contributions and figure out the 
>>> > appropriate way to review and assimilate the functionality
>>>
>>> While we are open to this option, we are concerned about the potential
>>> duration of its completion. In the interim, users have expressed a
>>> need for improved OAuth2 support. Would it be possible to gain some
>>> clarity regarding the timeline for a review of this initiative?
>>> Perhaps an initial review of the current codebase could help identify
>>> and address any potential roadblocks? I can also schedule a demo of
>>> the new auth manager, if that helps.
>>>
>>> > how well the community understands the behaviors.
>>>
>>> While OAuth2 may not be familiar or palatable to most Iceberg
>>> contributors, I am confident that some of them possess the expertise
>>> to effectively review and assess the donation.
>>>
>>> > The main competency of this project isn't to implement security protocols
>>>
>>> This may be true for the GCP auth manager or for the SigV4 one – these
>>> are vendor-specific and can leverage the respective vendor's SDK. But
>>> how would we support OAuth2 in a generic way otherwise? Or Kerberos?
>>> Whether this is a competency of the project or not is debatable.
>>> Managing HTTP requests is not a main competency of this project
>>> either, and yet we have one RESTClient interface and one HTTPClient
>>> implementation, and lots of JSON parsers.
>>>
>>> The RESTClient in its current form already implies using some
>>> authentication protocol. The simple case of using static (provided via
>>> configuration) tokens does not cover real-world cases that users have
>>> expressed interest in. Accepting the Auth Manager will certainly
>>> require some extra attention to security protocols from Iceberg
>>> maintainers, but it will allow the project to support more advanced
>>> use cases. Additionally, the Auth Manager provides a path for users of
>>> the existing, deprecated “/token” endpoint to migrate to standard
>>> RFC-based OAuth flows.
>>>
>>> > Was there any exploration of leveraging other standard implementations 
>>> > like Apache Oltu, Nimbus, etc. to build the implementation off of?
>>>
>>> Yes, we considered that and decided not to go down that route. For a
>>> few reasons:
>>>
>>> 1. Most OAuth libraries provide building blocks to create clients, but
>>> they are not fully-fledged clients; you still need to write code in
>>> order to glue things together [1].
>>>
>>> 2. These libraries usually have (too?) many dependencies [2]; some of
>>> them have not been maintained for a while. And Apache Oltu is retired.
>>> In contrast, our Auth Manager only has one small dependency:
>>> auth0-jwt.
>>>
>>> 3. If you delegate to a third-party library, then you cannot share the
>>> catalog's RESTClient or Executor. The library is going to maintain its
>>> own HTTP client and executor, leading to increased resource
>>> consumption.
>>>
>>> 4. Nothing precludes us from switching to a third-party library later
>>> on (it's an implementation detail). We thought it's best to start with
>>> a self-contained project.
>>>
>>> Thanks,
>>> Alex
>>>
>>> [1]: https://connect2id.com/products/nimbus-oauth-openid-connect-sdk/guides/oauth-client-server-development
>>> [2]: For Nimbus: https://central.sonatype.com/artifact/com.nimbusds/oauth2-oidc-sdk/11.26/dependencies
>>>
>>> On Thu, Jun 19, 2025 at 5:58 PM Daniel Weeks <dwe...@apache.org> wrote:
>>> >
>>> > I hadn't seen this thread before we discussed it yesterday, but since 
>>> > then I've taken a look and have some reservations.
>>> >
>>> > I was under the impression that the AuthManager implementation was 
>>> > relatively small (based on the recent work for the GCP AuthManager), but 
>>> > after taking a look at the repo, this is far from a small contribution.
>>> >
>>> > I strongly support more robust security support (especially for 
>>> > OAuth2/OIDC), but I don't feel this is going to be a small effort to 
>>> > introduce.  The broader community wasn't involved in decisions made about 
>>> > the implementation and I see elements that give me pause (like 
>>> > "impersonation flow" which I'm not familiar with and implementation 
>>> > details like extensions to immutables that aren't consistent with the 
>>> > broader codebase).
>>> >
>>> > If we decide that we want to take this on, I feel like we need to break 
>>> > it into smaller contributions and figure out the appropriate way to 
>>> > review and assimilate the functionality in a way that's consistent with 
>>> > the rest of the project.  Due to this being security related, we should 
>>> > take extra precautions around what this introduces and how well the 
>>> > community understands the behaviors.
>>> >
>>> > However, looking at the complexity here relative to the approach with the 
>>> > GCP, I have to question whether this is the right path overall.  The main 
>>> > competency of this project isn't to implement security protocols, so it's 
>>> > a lot to say we want a full and complete (possibly with extensions) 
>>> > native implementation of the OAuth2 specification (there are whole 
>>> > projects built around that alone).
>>> >
>>> > Was there any exploration of leveraging other standard implementations 
>>> > like Apache Oltu, Nimbus, etc. to build the implementation off of?
>>> >
>>> > -Dan
>>> >
>>> > On Thu, Jun 19, 2025 at 5:33 AM Alex Dutra <alex.du...@dremio.com.invalid> wrote:
>>> >>
>>> >> Hi Ryan & JB, hi all,
>>> >>
>>> >> I think it would be easier to introduce this new manager as an
>>> >> alternative manager. This would make the migration smoother as it
>>> >> would give users time to migrate at their convenience. Besides, the
>>> >> new manager has the notion of "dialects", and can be configured to
>>> >> behave exactly like the current one (honoring the same config
>>> >> options), making the migration even easier.
>>> >>
>>> >> > Why not contribute the functionality directly to the AuthManager 
>>> >> > already in Iceberg? Is this incompatible or is there a reason the 
>>> >> > current one can't be extended through contributions?
>>> >>
>>> >> There are a few reasons why I believe it's not possible to extend the
>>> >> current manager indefinitely:
>>> >>
>>> >> 1. The current auth manager lives in iceberg-core; as we introduce
>>> >> more features, it will become impractical to keep it there, especially
>>> >> since some of the features will require third-party dependencies. As a
>>> >> data point: the new manager contains almost 100 Java production
>>> >> classes (not counting test classes and build scripts).
>>> >> 2. The current auth manager has some well known shortcomings, notably
>>> >> around token refreshes. It's not possible to fix that without
>>> >> introducing regressions and potentially breaking many catalog clients
>>> >> already in production.
>>> >> 3. As we introduce features like Authorization Code grant support,
>>> >> interactions with the IDP will become more complex than just a
>>> >> request-response cycle. Since most of the current logic resides in the
>>> >> OAuth2Util class, which is entirely public, it won't be an easy task
>>> >> to introduce support for such complex flows while avoiding binary
>>> >> incompatibilities.
>>> >>
>>> >> Thanks,
>>> >> Alex
>>> >>
>>> >>
>>> >> On Wed, Jun 18, 2025 at 11:35 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>> >> >
>>> >> > Hi
>>> >> >
>>> >> > I think it makes sense to directly add in AuthManager. I don't see
>>> >> > blockers (with some adaptations). Alex ?
>>> >> >
>>> >> > From a donation process standpoint (if accepted), I'm happy to help
>>> >> > with the SGA and IP Clearance.
>>> >> >
>>> >> > Regards
>>> >> > JB
>>> >> >
>>> >> > On Wed, Jun 18, 2025 at 9:15 PM Ryan Blue <rdb...@gmail.com> wrote:
>>> >> > >
>>> >> > > I think it would be great to bring this functionality into Iceberg. 
>>> >> > > I'm curious about your plan for getting it in. It sounds like you're 
>>> >> > > suggesting adding the Dremio project to the Iceberg repo and making 
>>> >> > > it optional. Why not contribute the functionality directly to the 
>>> >> > > AuthManager already in Iceberg? Is this incompatible or is there a 
>>> >> > > reason the current one can't be extended through contributions?
>>> >> > >
>>> >> > > On Tue, Jun 17, 2025 at 11:23 AM Christian Thiel <christian.t.b...@gmail.com> wrote:
>>> >> > >>
>>> >> > >> Hey Alex,
>>> >> > >>
>>> >> > >> Thanks for the Initiative — I really appreciate the effort here!
>>> >> > >>
>>> >> > >> Having good auth compatibility in the Catalog ecosystem is key to 
>>> >> > >> establishing secure standards by making them easy to use. While 
>>> >> > >> Iceberg should stay open to other means of Authentication, OAuth2 
>>> >> > >> is the most widely adopted interoperable auth standard, and its 
>>> >> > >> role in Iceberg REST reflects that. But with human-centric flows 
>>> >> > >> like Auth Code (with PKCE 😉) and Device Code missing from most 
>>> >> > >> standard clients, users often default to handing out personal 
>>> >> > >> Client ID/secret pairs—which is really bad from a security 
>>> >> > >> perspective.
>>> >> > >>
>>> >> > >> While I can’t speak to the Java details, I fully support bringing 
>>> >> > >> the functionality into Iceberg. I have tested the proposed code 
>>> >> > >> successfully with Spark and different IdPs, including Auth & Device 
>>> >> > >> Code flows with token refresh, as well as token refresh for Client 
>>> >> > >> Credential flows.
>>> >> > >>
>>> >> > >> Thanks!
>>> >> > >>
>>> >> > >> Christian
>>> >> > >>
>>> >> > >>
>>> >> > >>
>>> >> > >> On Mon, 16 Jun 2025 at 20:33, Alex Dutra <alex.du...@dremio.com.invalid> wrote:
>>> >> > >>>
>>> >> > >>> Hi all,
>>> >> > >>>
>>> >> > >>> Dremio recently open-sourced a new implementation of the Auth Manager
>>> >> > >>> API for OAuth2:
>>> >> > >>>
>>> >> > >>> https://github.com/dremio/iceberg-auth-manager
>>> >> > >>>
>>> >> > >>> I wrote a blog post about it a while ago [1].
>>> >> > >>>
>>> >> > >>> Built on top of the Auth Manager API introduced in Iceberg 1.9.0, this
>>> >> > >>> project provides a more flexible and extensible OAuth2 manager
>>> >> > >>> compared to the built-in equivalent in Iceberg Core. It follows OAuth2
>>> >> > >>> standards strictly, but also provides compatibility with any existing
>>> >> > >>> Apache Iceberg REST catalog, and contains no Dremio-specific
>>> >> > >>> functionality. To date, this is the only OAuth2 manager fully
>>> >> > >>> compliant with external identity providers.
>>> >> > >>>
>>> >> > >>> Dremio would like to contribute this code to the Apache Iceberg
>>> >> > >>> project. I am therefore initiating this discussion to determine the
>>> >> > >>> community's interest in accepting this donation.
>>> >> > >>>
>>> >> > >>> This project is beneficial to the community because it addresses
>>> >> > >>> well-known limitations, such as token refresh problems [2][3][4], and
>>> >> > >>> also because it introduces highly anticipated features like the
>>> >> > >>> Authorization Code grant support [5]. Fixing these limitations or
>>> >> > >>> adding support for such large features in the built-in manager, while
>>> >> > >>> avoiding any risk of regressions, would have been a lot harder.
>>> >> > >>>
>>> >> > >>> Also worth mentioning: this project adheres to the "Iceberg OAuth2
>>> >> > >>> Client Authentication Guide", proposed by Christian Thiel [6].
>>> >> > >>>
>>> >> > >>> This project could initially serve as a runtime-selectable alternative
>>> >> > >>> to the current built-in implementation. Upon reaching sufficient
>>> >> > >>> maturity however, it could potentially replace the existing manager.
>>> >> > >>>
>>> >> > >>> Please share your thoughts by replying to this email. Alternatively,
>>> >> > >>> we can discuss this topic at the Catalog Sync meeting this Wednesday,
>>> >> > >>> June 18th, if that is a more comfortable option for everyone.
>>> >> > >>>
>>> >> > >>> Thanks,
>>> >> > >>>
>>> >> > >>> Alex
>>> >> > >>>
>>> >> > >>> [1]: https://medium.com/data-engineering-with-dremio/introducing-dremio-auth-manager-for-apache-iceberg-223827342d19
>>> >> > >>> [2]: https://github.com/apache/iceberg/issues/12196
>>> >> > >>> [3]: https://github.com/apache/iceberg/issues/12363
>>> >> > >>> [4]: https://github.com/apache/iceberg/issues/13030
>>> >> > >>> [5]: https://github.com/apache/iceberg/issues/10677
>>> >> > >>> [6]: https://docs.google.com/document/d/1buW9PCNoHPeP7Br5_vZRTU-_3TExwLx6bs075gi94xc/edit?tab=t.0#heading=h.hufqidg1ij89
