Hi Dennis,

Thanks for stating the concerns (A, B, C).

I'm planning to work in that area for [2207]. I propose to have an
in-depth review of that code under that PR (still WIP on my part).

However, I'm kind of lost about how that relates to making roleArn
optional (which is the main topic of this thread). Is roleArn being
optional detrimental?

From my POV, it enables nicer integration with MinIO use cases in the
current codebase (not setting roleArn), while AWS use cases are not
affected. The only remote problem might be that users of AWS S3 may forget
to set roleArn in the config. However, that will be caught at runtime
(failures to Assume Role).

WDYT?

[2207] https://github.com/apache/polaris/issues/2207

Thanks,
Dmitri.

On Fri, Aug 22, 2025 at 1:38 AM Dennis Huo <huoi...@gmail.com> wrote:

> Yeah excellent point, and that definitely highlights the need for a more
> comprehensive design for non-AWS S3-compat storage.
>
> Using the removal of roleArn as an "incidental" fix for a fuzzy subset of
> scenarios is probably not how we want to get entrenched for the first
> introduction of those features, especially when we didn't even make it
> clear in the github issue or the committed code how we expect optional
> roleArn to interact with session-token exchange.
>
> IMO the ability to "assumeRole(null /* roleArn */, sessionPolicy)" should
> itself be treated as idiosyncratic to specific storage providers and paired
> with some explicit expression of intent both for Polaris internally as well
> as for the user.
>
> From what I can tell, "null assumeRole" in MinIO is more analogous to
> "getSessionPolicy" from AWS, though I'm not too familiar with MinIO so we
> should invite some expert opinions on this.
>
> Right now there are several different concerns rolled up into the single
> "getSubscopedCredential" in Polaris:
>
> A. Indirection between root "service identities" (owned by the Polaris
>    service owner) and per-Catalog storage-actor identities (owned by the
>    Catalog administrative user)
>    - This indirection *in itself* is an important element of the Polaris
>      security model, where service identities do *not* generally have
>      latent direct storage-access permissions, but instead hold "actAs"
>      or "assumeRole" types of permissions
> B. Applying a "subscoping policy" that restricts the blast radius of any
>    storage credentials that may be used, both in terms of "path prefix"
>    and in "duration"
>    - It's intentional to make Polaris "internal" FileIO go through the
>      same subscoping flow as much as possible, so that even when it's
>      Polaris writing/reading metadata files, the blast radius matches
>      what would be vended out to a sufficiently privileged principal
> C. Applying "configuration overrides" related to endpoints, region, etc.
>    These crept into getSubscopedCredentials due to being "convenient",
>    but are substantially a different action than credential-minting,
>    though are closely related because of needing to determine STS
>    endpoints from the config
>
> I guess we probably want to refactor so that (C) will *always* happen
> correctly, so we'd need to split out some kind of "getDynamicConfig" that
> is separate from injecting the *credentials* into the config map.
>
> It sounds like we have potential use cases for any mix of (A) and (B).
>
> - Single-tenant use cases may not need "indirection" but may still want
>   subscoping both for internal blast-radius management and for
>   credential-vending
> - Other single-tenant use cases might be okay with neither
>   identity-indirection nor subscoping
> - I think we've had some discussion about whether to ever allow
>   credential-vending without subscoping (i.e. vending long-lived
>   credentials)
>
> On Thu, Aug 21, 2025 at 3:53 AM Alexandre Dutra <adu...@apache.org> wrote:
>
> > Hi,
> >
> > We just had an issue created by a user who was attempting to do use
> > case #2 in Dennis' categorization ("Using DefaultCredentialsProvider
> > directly without subscoping to access non-AWS s3-compat storage"):
> >
> > https://github.com/apache/polaris/issues/2398
> >
> > This uncovered some interesting findings (at least for me), which
> > lead me to think that setting
> > SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION=true is actually not enough,
> > and not even recommended in that case. When credentials subscoping is
> > disabled, the table config returned to the client not only omits S3
> > credentials, which is expected, but also omits some otherwise very
> > important S3 settings, such as: s3.endpoint, s3.path-style-access or
> > client.region, *even if these were properly configured at the catalog
> > level*. As a result, the client is unable to access the MinIO storage
> > properly.
> >
> > For me, use case #2 is just not achievable right now in Polaris.
> > Enabling credentials subscoping solves the issue of course, but also
> > creates a somewhat artificial link between credentials vending and
> > "generic" storage configuration.
> >
> > Thanks,
> > Alex
> >
> > On Thu, Aug 21, 2025 at 6:18 AM Dennis Huo <huoi...@gmail.com> wrote:
> >
> > > Reposting my comment from the github issue here for further
> > > discussion:
> > >
> > > It seems like there are three distinct "new" use cases:
> > >
> > > 1. Using DefaultCredentialsProvider directly without subscoping to
> > >    access storage when running on AWS and using AWS S3
> > > 2. Using DefaultCredentialsProvider directly without subscoping to
> > >    access non-AWS s3-compat storage
> > > 3. Using DefaultCredentialsProvider directly with subscoping to
> > >    access non-AWS s3-compat storage
> > >
> > > These are all different from the "normal" flow:
> > >
> > > 4. Using DefaultCredentialsProvider as the super-root to assumeRole
> > >    on a provided role with subscoping to access storage on S3
> > >
> > > For (1) and (2), setting SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION=true
> > > is explicitly intended for that use case, though looking at the code
> > > it seems we still need to remove "validate" checks for roleArn,
> > > otherwise parsing-validation fails at createCatalog time.
> > >
> > > We should verify that a "dummy" syntactically valid roleArn such as
> > > "arn:aws:iam::123456789012:role/my-role" already works for the stated
> > > use case even without https://github.com/apache/polaris/pull/2329
> > > making roleArn optional if the following is set in
> > > application.properties:
> > >
> > > polaris.features."SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION"=true
> > >
> > > Looking at MinIO, it's certainly very interesting that
> > > AssumeRoleWithWebIdentity makes roleArn optional -- it's not 100%
> > > clear whether the provided Policy is still applied to the returned
> > > token. I'm also not 100% clear on how we map the stsClient to point
> > > at WebIdentity vs CustomToken flows for MinIO - for example
> > > AssumeRoleWithCustomToken still requires roleArn:
> > >
> > > https://docs.min.io/enterprise/aistor-object-store/developers/security-token-service/assumerolewithcustomtoken/
> > >
> > > But assuming the subscoping does work, then (3) is a substantially
> > > new flow where the assumeRole indirection is applied, but yet the
> > > identity is the service-wide default credentials provider where
> > > SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION=false is used despite no
> > > roleArn being provided. This new use case would need a separate
> > > FeatureConfiguration to prevent multi-tenant deployments from
> > > "accidentally" exposing the service identity through vended
> > > credentials.
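To make the four flows above easier to compare, here is a rough Java sketch of how they might be told apart from the catalog settings. Everything in it is hypothetical and not taken from the Polaris codebase: the class, the enum, the `awsStorage` input, and in particular `allowServiceIdentitySubscoping`, which only stands in for the separate FeatureConfiguration suggested for flow (3). Rejecting a missing roleArn on AWS up front is also just an assumption of this sketch.

```java
/** Illustration only: all names here are hypothetical, not Polaris APIs. */
final class StorageAccessFlows {

  /** The four flows enumerated in the quoted message. */
  enum Flow {
    DIRECT_AWS_S3,               // (1) default credentials, no subscoping, AWS S3
    DIRECT_S3_COMPAT,            // (2) default credentials, no subscoping, non-AWS
    SUBSCOPED_SERVICE_IDENTITY,  // (3) subscoping applied, but no role to assume
    ASSUME_ROLE_WITH_SUBSCOPING  // (4) the "normal" assumeRole flow
  }

  /**
   * @param skipSubscopingIndirection value of SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION
   * @param roleArn the catalog's roleArn, or null when it was not provided
   * @param awsStorage true for AWS S3, false for S3-compatible storage (assumed input)
   * @param allowServiceIdentitySubscoping hypothetical opt-in flag for flow (3)
   */
  static Flow classify(
      boolean skipSubscopingIndirection,
      String roleArn,
      boolean awsStorage,
      boolean allowServiceIdentitySubscoping) {
    if (skipSubscopingIndirection) {
      // Flows (1) and (2): no credential minting at all.
      return awsStorage ? Flow.DIRECT_AWS_S3 : Flow.DIRECT_S3_COMPAT;
    }
    if (roleArn != null) {
      // Flow (4): a role is available, so the usual indirection applies.
      return Flow.ASSUME_ROLE_WITH_SUBSCOPING;
    }
    if (awsStorage) {
      // Assumption of this sketch: fail early instead of at AssumeRole time.
      throw new IllegalArgumentException("roleArn is required for subscoping on AWS S3");
    }
    if (!allowServiceIdentitySubscoping) {
      // Guard against exposing the service identity by accident.
      throw new IllegalStateException(
          "Subscoping without roleArn must be enabled explicitly");
    }
    return Flow.SUBSCOPED_SERVICE_IDENTITY;
  }
}
```

With a guard like this, a multi-tenant deployment that leaves the opt-in flag off would get an error for flow (3) rather than silently vending credentials derived from the service identity.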
> > >
> > > On Tue, Aug 12, 2025 at 9:43 AM Dmitri Bourlatchkov <di...@apache.org>
> > > wrote:
> > >
> > > > Making roleArn optional in the REST API is backward compatible and
> > > > allows for better UX with non-AWS S3-compatible storage.
> > > >
> > > > This change looks good to me.
> > > >
> > > > Cheers,
> > > > Dmitri.
> > > >
> > > > On Tue, Aug 12, 2025 at 5:46 AM Robert Stupp <sn...@snazy.de> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Description of the PR: Having the role-arn parameter required for
> > > > > a catalog is redundant in many cases and requires the generation
> > > > > of an extra role in cases when IRSA (for AWS) is being used. Other
> > > > > S3 implementations (MinIO, Ceph, many of the appliances) also do
> > > > > not all require a role-ARN.
> > > > >
> > > > > See issue [1] and PR [2] to fix the issue.
> > > > >
> > > > > Robert
> > > > >
> > > > > [1] https://github.com/apache/polaris/issues/2325
> > > > > [2] https://github.com/apache/polaris/pull/2329
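
For reference, "optional" validation of roleArn at createCatalog time could look roughly like the Java sketch below: a missing value is accepted, while a value that is present still has to look like an IAM role ARN. The class name, the method name, the regular expression, and the choice to treat a blank string as "not set" are all assumptions made for illustration; the actual check in Polaris may differ.

```java
import java.util.regex.Pattern;

/** Illustration only: hypothetical helper, not the actual Polaris validation. */
final class RoleArnValidation {

  // Assumed pattern for an IAM role ARN (partition, 12-digit account, role path).
  private static final Pattern ROLE_ARN =
      Pattern.compile("^arn:aws[a-z-]*:iam::\\d{12}:role/.+$");

  /**
   * Returns the roleArn unchanged when it is well formed, or null when it is
   * absent. Blank input is treated as absent (an assumption of this sketch).
   */
  static String validateOptionalRoleArn(String roleArn) {
    if (roleArn == null || roleArn.trim().isEmpty()) {
      return null; // nothing to validate: roleArn is optional
    }
    if (!ROLE_ARN.matcher(roleArn).matches()) {
      throw new IllegalArgumentException("Invalid roleArn: " + roleArn);
    }
    return roleArn;
  }
}
```

Under this sketch a MinIO catalog created without roleArn passes validation, and a malformed roleArn on an AWS catalog is still rejected early.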