XJDKC commented on code in PR #2523:
URL: https://github.com/apache/polaris/pull/2523#discussion_r2388629006


##########
polaris-core/src/main/java/org/apache/polaris/core/identity/registry/ServiceIdentityRegistry.java:
##########
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.polaris.core.identity.registry;
+
+import java.util.Optional;
+import org.apache.polaris.core.identity.ServiceIdentityType;
+import org.apache.polaris.core.identity.dpo.ServiceIdentityInfoDpo;
+import org.apache.polaris.core.identity.resolved.ResolvedServiceIdentity;
+
+/**
+ * A registry interface for managing and resolving service identities in Polaris.
+ *
+ * <p>In a multi-tenant Polaris deployment, each catalog or tenant may be associated with a distinct
+ * service identity that represents the Polaris service itself when accessing external systems
+ * (e.g., cloud services like AWS or GCP). This registry provides a central mechanism to manage
+ * those identities and resolve them at runtime.
+ *
+ * <p>The registry helps abstract the configuration and retrieval of service-managed credentials
+ * from the logic that uses them. It ensures a consistent and secure way to handle identity
+ * resolution across different deployment models, including SaaS and self-managed environments.
+ */
+public interface ServiceIdentityRegistry {
+  /**
+   * Discover a new {@link ServiceIdentityInfoDpo} for the given service identity type. Typically
+   * used during entity creation to associate a default or generated identity.
+   *
+   * @param serviceIdentityType The type of service identity (e.g., AWS_IAM).
+   * @return A new {@link ServiceIdentityInfoDpo} representing the discovered service identity.
+   */
+  Optional<ServiceIdentityInfoDpo> discoverServiceIdentity(ServiceIdentityType serviceIdentityType);
+
+  /**
+   * Resolves the given service identity by retrieving the actual credential or secret referenced by
+   * it, typically from a secret manager or internal credential store.
+   *
+   * @param serviceIdentityInfo The service identity metadata to resolve.
+   * @return A {@link ResolvedServiceIdentity} including credentials and other resolved data.
+   */
+  Optional<ResolvedServiceIdentity> resolveServiceIdentity(
+      ServiceIdentityInfoDpo serviceIdentityInfo);

Review Comment:
   Yeah, I fully agree with this: `PolarisAdminService.listCatalogs() only 
needs to “know” the identity, but right now the call path also resolves 
credentials.`
   
   The tricky part is that I haven't found a clean way to load just the identity 
info without also pulling in the credentials. I'm open to ideas if there's a 
simpler solution that doesn't complicate things too much. I went with this 
pattern because we will cache the `ResolvedServiceIdentity`, so even if we 
fetch the identity and the credentials together, the overhead is minimal. And 
since `ResolvedServiceIdentity` is an internal concept (not exposed in 
persistence APIs or user-facing models), having both the credential and the 
user ARN in one class seems fine for now. We can revisit this later.
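   
   To illustrate the caching argument, here is a rough sketch. It assumes the 
cache is a simple in-memory map keyed by some identity reference. 
`CachingIdentityResolver` and the `Function` delegate are hypothetical names 
for illustration only and are not part of this PR.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/**
 * Hypothetical sketch: memoizes resolved identities so that repeated lookups
 * for the same reference do not hit the secret store again.
 */
final class CachingIdentityResolver<K, V> {
  private final Map<K, Optional<V>> cache = new ConcurrentHashMap<>();
  private final Function<K, Optional<V>> delegate;

  CachingIdentityResolver(Function<K, Optional<V>> delegate) {
    this.delegate = delegate;
  }

  /** Calls the delegate on the first lookup only; empty results are cached too. */
  Optional<V> resolve(K identityRef) {
    return cache.computeIfAbsent(identityRef, delegate);
  }
}

final class CachingDemo {
  public static void main(String[] args) {
    AtomicInteger storeHits = new AtomicInteger();
    // The lambda stands in for a call to a secret manager (assumption).
    CachingIdentityResolver<String, String> resolver =
        new CachingIdentityResolver<>(
            ref -> {
              storeHits.incrementAndGet();
              return Optional.of("secret-for-" + ref);
            });
    resolver.resolve("catalog-a");
    resolver.resolve("catalog-a");
    System.out.println("secret store hits: " + storeHits.get());
  }
}
```

   With something like this in place, a path such as `listCatalogs()` pays for 
credential resolution at most once per identity. A real cache would also need 
expiry to handle credential rotation, which this sketch leaves out.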
   
   On `obtainServiceCredentials(ServiceIdentityInfoDpo) return 
EnumMap<ConnectionCredentialProperty, String>`, I just want to highlight that 
service identity credentials are not the same as the connection credentials 
used to access remote services. If we look at storage creds, there's always an 
intermediate step:
   * AWS: Polaris uses an IAM user's credentials (managed by the vendor) to 
assume the user's IAM role and get temporary AWS creds.
   * Azure: Polaris uses OAuth creds tied to an Azure App (managed by the 
Polaris vendor) to get a sub-scoped SAS token, which is then used to access 
storage.
   
   Because of this, we shouldn't treat service identity creds as 
`ConnectionCredential`, and I'm not a fan of returning creds in an EnumMap. 
That's why I'd prefer introducing a `ResolvedServiceIdentity` class. It:
   1. Centralizes the case handling in one place (e.g., 
`PolarisCredentialManager` in part 4): we only need a single switch in one 
class.
   2. Explicitly stores creds in the cloud provider's own type (e.g., 
`AwsCredentialsProvider`).
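   
   A rough sketch of the shape I have in mind. Everything here is a stand-in: 
`IdentityType`, `FakeAwsCredentialsProvider`, `AwsIamResolvedIdentity`, and 
`CredentialManagerSketch` are hypothetical names so that the snippet compiles 
without the AWS SDK, and the actual classes in the PR may differ.

```java
import java.util.Objects;

/** Stand-in for ServiceIdentityType; only AWS_IAM is mentioned in the PR. */
enum IdentityType {
  AWS_IAM
}

/** Hypothetical stand-in for a provider SDK type such as AwsCredentialsProvider. */
interface FakeAwsCredentialsProvider {
  String accessKeyId();
}

/** Base type: carries the identity type so callers can dispatch on it. */
abstract class ResolvedIdentity {
  abstract IdentityType type();
}

/** AWS variant: keeps the user ARN and the credentials in the provider's own type. */
final class AwsIamResolvedIdentity extends ResolvedIdentity {
  private final String iamArn;
  private final FakeAwsCredentialsProvider credentialsProvider;

  AwsIamResolvedIdentity(String iamArn, FakeAwsCredentialsProvider credentialsProvider) {
    this.iamArn = Objects.requireNonNull(iamArn);
    this.credentialsProvider = Objects.requireNonNull(credentialsProvider);
  }

  @Override
  IdentityType type() {
    return IdentityType.AWS_IAM;
  }

  String iamArn() {
    return iamArn;
  }

  FakeAwsCredentialsProvider credentialsProvider() {
    return credentialsProvider;
  }
}

/** The single place that switches on the identity type. */
final class CredentialManagerSketch {
  static String describeNextStep(ResolvedIdentity identity) {
    switch (identity.type()) {
      case AWS_IAM:
        AwsIamResolvedIdentity aws = (AwsIamResolvedIdentity) identity;
        // Intermediate step: the service identity is used to assume the user's
        // role; it is not itself the connection credential.
        return "sts:AssumeRole as " + aws.iamArn();
      default:
        throw new IllegalArgumentException("Unsupported type: " + identity.type());
    }
  }

  public static void main(String[] args) {
    ResolvedIdentity identity =
        new AwsIamResolvedIdentity(
            "arn:aws:iam::123456789012:user/polaris", () -> "AKIAEXAMPLE");
    System.out.println(describeNextStep(identity));
  }
}
```

   The point is that callers never see a loosely typed map: the provider 
specific cast happens in one switch, and the credential keeps its SDK type 
until the intermediate step produces the actual connection credentials.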
   
   Looking forward to your specific suggestions!



