XJDKC commented on code in PR #2523: URL: https://github.com/apache/polaris/pull/2523#discussion_r2383750881
########## polaris-core/src/main/java/org/apache/polaris/core/identity/registry/ServiceIdentityRegistry.java: ########## @@ -0,0 +1,58 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.polaris.core.identity.registry; + +import java.util.Optional; +import org.apache.polaris.core.identity.ServiceIdentityType; +import org.apache.polaris.core.identity.dpo.ServiceIdentityInfoDpo; +import org.apache.polaris.core.identity.resolved.ResolvedServiceIdentity; + +/** + * A registry interface for managing and resolving service identities in Polaris. + * + * <p>In a multi-tenant Polaris deployment, each catalog or tenant may be associated with a distinct + * service identity that represents the Polaris service itself when accessing external systems + * (e.g., cloud services like AWS or GCP). This registry provides a central mechanism to manage + * those identities and resolve them at runtime. + * + * <p>The registry helps abstract the configuration and retrieval of service-managed credentials + * from the logic that uses them. It ensures a consistent and secure way to handle identity + * resolution across different deployment models, including SaaS and self-managed environments. + */ +public interface ServiceIdentityRegistry { + /** + * Discover a new {@link ServiceIdentityInfoDpo} for the given service identity type. Typically + * used during entity creation to associate a default or generated identity. + * + * @param serviceIdentityType The type of service identity (e.g., AWS_IAM). + * @return A new {@link ServiceIdentityInfoDpo} representing the discovered service identity. + */ + Optional<ServiceIdentityInfoDpo> discoverServiceIdentity(ServiceIdentityType serviceIdentityType); + + /** + * Resolves the given service identity by retrieving the actual credential or secret referenced by + * it, typically from a secret manager or internal credential store. + * + * @param serviceIdentityInfo The service identity metadata to resolve. + * @return A {@link ResolvedServiceIdentity} including credentials and other resolved data. + */ + Optional<ResolvedServiceIdentity> resolveServiceIdentity( + ServiceIdentityInfoDpo serviceIdentityInfo); Review Comment: For the name, I think using a verb to explicitly express the intention of `allocating` a service identity is good in general. So the final version on my side of this method is: ```java interface ServiceIdentityProvider { Optional<ServiceIdentityInfoDpo> allocateServiceIdentity(ConnectionConfigInfo connectionConfig); } ``` > Is it necessary to couple the "identity" (e.g. role ARN) with credentials? Not strictly, but in practice we usually configure them together. I agree there are cases where we just want to expose the service identity (e.g. in the loadCatalog response) without fetching creds, but since we cache service identity, the overhead isn’t significant. Or you just want to get the creds and don't need to know the `user arn` of the identity (in `Self-Managed Deployment`), the service identity is optional and the current implementation also supports this. > I can easily imagine deployments where the Polaris service identity is configured statically, but credentials are acquired based on "workload identity". I assume we are talking about the aws credential chain: https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/credentials-chain.html#credentials-default. The current implementation also supports this case, if we just want to rely on the cloud provider's sdk to infer the credentials, all you need is just configure the `user arn` in the `application.properties` file without specifying the actual aws credentials. Then `ResolvedServiceIdentity` can contain nothing but just a class like `DefaultAwsCredentialsProvider`, so it can infer the credentials based on the workload identity (`Amazon ECS container credentials`). For in-house env / testing, we don't need to specify / configure anything, and it should just work by inferring from env. For real prod, we need to explicitly know which credentials we use, we should not relegate to `DefaultAwsCredentialsProvider`, that's why we don't recommend using `IMPLICIT` auth type. If we want to decouple them, we need **another class** to hold the `role arn` only, we already have many classes for service identity, don't want to make it more complicated. I can't come up with a clean solution for it without including the `role arn` to in the `ResolvedAwsIamServiceIdentity` or we add a new class to hold this info. The architecture is pretty clear now: 1. `ServiceIdentityInfoModel`: generate based on open-api spec, it's an user facing class, used to tell users the identity polaris uses to access their resources (storage / catalog) 2. `ServiceIdentityInfoDpo`: holds the reference to the service identity: will be persisted in metastore. 3. `ResolvedServiceIdentity`: contains all the info of a resolved service identity, it can be converted to `ServiceIdentityInfoModel` to surface service identity info. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
