Hi Dmitri,

I need the SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION option for my self-managed
Polaris deployment. Company policy does not allow me to use credential
vending, so I have to supply credentials through environment variables. I
would prefer credential vending if it were allowed.
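
To make the use case concrete, here is a minimal sketch of the behaviour I
rely on. It is plain Python for illustration, not Polaris code:
`StaticCredentials`, `credentials_from_env`, `resolve_credentials` and the
`vend` callback are hypothetical names, and I am assuming the flag simply
bypasses vending. Only AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and
AWS_SESSION_TOKEN are the standard AWS variable names.

```python
import os
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass(frozen=True)
class StaticCredentials:
    """Hypothetical holder for credentials, however they were obtained."""
    access_key_id: str
    secret_access_key: str
    session_token: Optional[str] = None


def credentials_from_env(env: Mapping[str, str] = os.environ) -> StaticCredentials:
    """Read the standard AWS credential variables from the given environment."""
    try:
        return StaticCredentials(
            access_key_id=env["AWS_ACCESS_KEY_ID"],
            secret_access_key=env["AWS_SECRET_ACCESS_KEY"],
            session_token=env.get("AWS_SESSION_TOKEN"),  # optional
        )
    except KeyError as missing:
        raise RuntimeError(
            f"missing environment variable: {missing.args[0]}"
        ) from None


def resolve_credentials(
    skip_subscoping: bool,
    vend: Callable[[], StaticCredentials],
    env: Mapping[str, str] = os.environ,
) -> StaticCredentials:
    """Assumed behaviour: with the flag set, never call the vending path."""
    if skip_subscoping:
        return credentials_from_env(env)
    return vend()
```

With the flag set, the vending callback is never invoked, which is what my
company policy requires.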

Best,
CG

> On 6 May 2025, at 06:11, Dmitri Bourlatchkov <di...@apache.org> wrote:
> 
> SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION
> 
> 
> I do not think that this option is a solution for self-managed deployments
> at all.
> 
> It effectively disables credential vending, which is still a valuable
> feature for self-managed cases.
> 
> So basically we'd have 2 modes of running Polaris [...]
> 
> 
> I'd really like to avoid having "running modes" in the sense of having this
> "mode" as a code-level config or flag.
> 
> I believe configuration options should provide enough controls to the admin
> user to make Polaris behave in a certain way, but I believe those configs
> should apply to specific aspects of Polaris behaviour as opposed to
> defining an overarching "mode".
> 
> For example, subscoping for vended credentials is valuable, IMHO, even in
> single-tenant deployments with a plain key/secret pair for authenticating
> STS connections.
> 
> Cheers,
> Dmitri.
> 
>> On Mon, May 5, 2025 at 2:36 PM Dennis Huo <huoi...@gmail.com> wrote:
>> 
>> In general this sigv4 indirection control-flow should mirror the analogous
>> patterns we apply on the StorageConfigInfo side (and perhaps long-term we
>> can better consolidate the STS logic for the two), so I'd agree it's not
>> even necessarily federation-specific.
>> 
>> There's some precedent for the use-case of a "self-run Polaris" user
>> wanting to just use simple server-wide configuration for StorageConfigInfo
>> already: SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION
>> 
>> 
>> https://github.com/apache/polaris/blob/4db7998381a61e9cab82cdc4fded6867b0bca464/service/common/src/main/java/org/apache/polaris/service/catalog/io/FileIOUtil.java#L92
>> 
>> For this Catalog Federation sigv4 case we could introduce a similar feature
>> configuration; whether or not this feature configuration is the exact way
>> we want to do it long-term, it would make sense to refactor both the
>> ConnectionConfig and StorageConfig parts together in the future.
>> 
>> One important concept for this simple approach is that instead of getting
>> into the business of having Polaris actually try to juggle long-lived
>> credentials for IAM Users explicitly, this "simple case" can just inherit
>> "environment-provided" credentials and let low-level SDK libraries use
>> their default "credential chain" logic.
>> 
>> So basically we'd have 2 modes of running Polaris:
>> 
>> 1. Secure multi-tenant - Polaris will have opinionated/constrained
>> scaffolding via layers of credential indirection, subscoping,
>> secrets-management, etc.
>> 2. Single-tenant - Polaris will be more hands-off in terms of secrets
>> management, instead allowing thick clients to use typical
>> "environment-provided" credentials (e.g. environment variables, EC2
>> instance-metadata endpoint, local credential files, etc)
>> 
>> On Fri, May 2, 2025 at 4:28 PM Dmitri Bourlatchkov <di...@apache.org>
>> wrote:
>> 
>>> I think this discussion moves slightly out of the scope of catalog
>>> federation and into handling secrets :) ... but the points you're making
>>> are quite valid.
>>> 
>>> Let's keep them in mind when we reopen the secrets handling discussion.
>>> 
>>> Cheers,
>>> Dmitri.
>>> 
>>>> On Fri, May 2, 2025 at 7:04 PM Rulin Xing <ru...@apache.org> wrote:
>>> 
>>>> Hi Dmitri,
>>>> 
>>>> Totally agree that we need to recognize the self-managed deployment case
>>>> as a first-class scenario. That means we should provide a way to configure
>>>> Polaris with long-lived credentials.
>>>> 
>>>> I see a couple of options for supporting this:
>>>> 1. From env vars or server config, e.g.:
>>>>  * POLARIS_IAM_USER_AWS_ACCESS_KEY_ID
>>>>  * POLARIS_IAM_USER_AWS_SECRET_ACCESS_KEY
>>>>  * POLARIS_IAM_USER_ARN
>>>> In this case, `roleArn` would not be required.
>>>> 
>>>> 2. Configured via the Polaris Management API: Stick to
>>>> `SigV4AuthenticationParameters`
>>>> 
>>>> If we stick with the existing `SigV4AuthenticationParameters` type, we
>>>> could:
>>>> * Make roleArn optional
>>>> * Add `iamUserAwsAccessKeyId` and `iamUserAwsSecretAccessKey` as optional
>>>>   fields
>>>> 
>>>> 3. Configured via the Polaris Management API: Add new auth type
>>>> 
>>>> We could create a new type to distinguish clearly:
>>>> * New AuthenticationType enum: SIGV4_STS, SIGV4_STATIC_CREDS
>>>> 
>>>> 4. Configured via the Polaris Management API: Add new auth subtypes
>>>> 
>>>> We could create a new sub type to distinguish clearly:
>>>> e.g. new subtype under SigV4AuthenticationParameters: STS, CREDS
>>>> 
>>>> Personally, I would prefer option 4. WDYT?
>>>> 
>>>> I'll include these options in my PR as well for discussion.
>>>> 
>>>> Best,
>>>> Rulin
>>>> 
>>>> 
>>>> On 2025/05/02 17:16:44 Dmitri Bourlatchkov wrote:
>>>>> Thanks for your message, Rulin! You made good points and I agree with
>>>>> them.
>>>>> 
>>>>> I'm planning to introduce a `PolarisConnectionCredentialVendor`
>>>>> 
>>>>> 
>>>>> Looking forward to this proposal!
>>>>> 
>>>>> 
>>>>> The goal is to draw a clear boundary between user-provided input and
>>>>> Polaris-generated service info [...]
>>>>> 
>>>>> 
>>>>> I support this goal, however, I'd like to emphasise that there may be
>>>>> some skew in different deployment models.
>>>>> 
>>>>> Traditionally Polaris was envisioned as a service running for multiple
>>>>> users from distinct organisations, I guess. However, when Apache Polaris
>>>>> releases binary artifacts users will be able to run their own
>>>>> deployments. In that situation, the boundary between what is configured
>>>>> at the deployment level and what is configured via the Polaris
>>>>> Management API may not be as sharp.
>>>>> 
>>>>> I believe we need to recognise the self-managed deployment case and
>>>>> consider it as a mainstream case. I'm sure we're going to have some real
>>>>> users behind this use case soon.
>>>>> 
>>>>> Specifically for the SigV4 authentication option in Federated Catalogs,
>>>>> I guess this means that users may want to use simpler key/secret pairs
>>>>> as input for secure connections to AWS services like Glue. In
>>>>> self-managed deployments this is not a security risk, from my POV.
>>>>> 
>>>>> Would you consider it as a possible future enhancement?
>>>>> 
>>>>> If yes, do you think it would fall under the proposed
>>>>> SigV4AuthenticationParameters
>>>>> (as a set of new optional attributes perhaps)?.. or maybe be a different
>>>>> config type altogether? (this is related to my GH comment about type
>>>>> names, but the problem is bigger than just naming, I think).
>>>>> 
>>>>> I do not question that the STS / assume role path offers better security
>>>>> guarantees. My point is that it may still be valuable for OSS users to
>>>>> have simpler connection options.
>>>>> 
>>>>> Thanks,
>>>>> Dmitri.
>>>>> 
>>>>> On Thu, May 1, 2025 at 9:54 PM Rulin Xing <ru...@apache.org> wrote:
>>>>> 
>>>>>> Hi Dmitri,
>>>>>> 
>>>>>> Thanks for the thoughtful questions!
>>>>>> 
>>>>>> 1. Does this assume the use of STS?
>>>>>> 
>>>>>> Yes, the current spec changes assume the use of STS. Polaris acts as a
>>>>>> service provider and assumes IAM roles provided by users to access AWS
>>>>>> resources like Glue Catalogs. This model avoids long-lived credentials
>>>>>> and enables secure, temporary access via STS-issued credentials.
>>>>>> 
>>>>>> 2. Why is plain key/secret SigV4 not an option?
>>>>>> 
>>>>>> We can support plain key/secret credentials for SigV4, particularly in
>>>>>> self-managed deployments where users own both the Polaris deployment
>>>>>> and AWS accounts. However, to reduce security risks, we don't want to
>>>>>> store long-lived credentials directly in the catalog entity. A more
>>>>>> secure approach is to reference them using `UserSecretReference` (added
>>>>>> by @dennishuo) and retrieve them through `UserSecretsManager`.
>>>>>> 
>>>>>> 3. Where is Polaris expected to get credentials for STS requests?
>>>>>> 
>>>>>> Polaris obtains credentials for STS calls from its own runtime
>>>>>> environment, such as server config, environment variables, or
>>>>>> cloud-native options like instance profiles. These are used to call
>>>>>> AssumeRole on the user-provided IAM role.
>>>>>> 
>>>>>> To support both temporary and static credential workflows, I'm planning
>>>>>> to introduce a `PolarisConnectionCredentialVendor` (or
>>>>>> `PolarisCredentialManager`) interface. This class will:
>>>>>> * Provide Polaris-generated service info (what we call vendor info)
>>>>>> such as `userArn`, `externalId`, `consentUrl`, or `gcsServiceAccount`,
>>>>>> which will be injected into the catalog entity's connection config /
>>>>>> storage config. This info is exposed to users when they load the
>>>>>> catalog entity and is needed for setting up the appropriate permissions
>>>>>> (e.g., allowing Polaris to assume roles).
>>>>>> * Retrieve temporary credentials from cloud providers (e.g., AWS STS,
>>>>>> Azure identity services) when needed to perform authenticated
>>>>>> operations.
>>>>>> 
>>>>>> The goal is to draw a clear boundary between user-provided input and
>>>>>> Polaris-generated service info (something that's currently unclear in
>>>>>> storage configs). In the long term, we're aiming to unify both
>>>>>> connection and storage credential handling in this interface to
>>>>>> simplify the overall architecture and improve security.
>>>>>> 
>>>>>> Best,
>>>>>> Rulin
>>>>>> 
>>>>>> On 2025/05/01 22:02:32 Dmitri Bourlatchkov wrote:
>>>>>>> Hi Rulin,
>>>>>>> 
>>>>>>> Thanks for the informative description in the PR!
>>>>>>> 
>>>>>>> It looks like the authentication method relies on STS. As such it is a
>>>>>>> sub-case of SigV4, I believe, because SigV4 can be used with plain
>>>>>>> key/secret credentials without assuming a role.
>>>>>>> 
>>>>>>> If that is so, could you clarify that in the description?
>>>>>>> 
>>>>>>> Is there any particular reason for not supporting plain key/secret
>>>>>>> credentials?
>>>>>>> 
>>>>>>> When STS is in use, where is Polaris expected to get credentials for
>>>>>>> STS requests?
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Dmitri.
>>>>>>> 
>>>>>>> On Thu, May 1, 2025 at 5:37 PM Rulin Xing <ru...@apache.org> wrote:
>>>>>>> 
>>>>>>>> Hi folks,
>>>>>>>> 
>>>>>>>> Just wanted to surface a new API spec update proposal related to
>>>>>>>> Catalog Federation:
>>>>>>>> 
>>>>>>>> https://github.com/apache/polaris/pull/1506
>>>>>>>> 
>>>>>>>> This adds support for AWS SigV4 authentication, enabling Polaris to
>>>>>>>> federate to external Iceberg REST catalogs hosted behind services like
>>>>>>>> AWS Glue, S3Tables, or API Gateway.
>>>>>>>> 
>>>>>>>> It builds on earlier federation work and introduces a set of
>>>>>>>> properties to support role assumption and request signing via SigV4.
>>>>>>>> 
>>>>>>>> Feedback on the spec or implementation is welcome!
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Rulin
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
