dimas-b commented on code in PR #2523:
URL: https://github.com/apache/polaris/pull/2523#discussion_r2383624473


##########
polaris-core/src/main/java/org/apache/polaris/core/identity/registry/ServiceIdentityRegistry.java:
##########
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.polaris.core.identity.registry;
+
+import java.util.Optional;
+import org.apache.polaris.core.identity.ServiceIdentityType;
+import org.apache.polaris.core.identity.dpo.ServiceIdentityInfoDpo;
+import org.apache.polaris.core.identity.resolved.ResolvedServiceIdentity;
+
+/**
+ * A registry interface for managing and resolving service identities in Polaris.
+ *
+ * <p>In a multi-tenant Polaris deployment, each catalog or tenant may be associated with a distinct
+ * service identity that represents the Polaris service itself when accessing external systems
+ * (e.g., cloud services like AWS or GCP). This registry provides a central mechanism to manage
+ * those identities and resolve them at runtime.
+ *
+ * <p>The registry helps abstract the configuration and retrieval of service-managed credentials
+ * from the logic that uses them. It ensures a consistent and secure way to handle identity
+ * resolution across different deployment models, including SaaS and self-managed environments.
+ */
+public interface ServiceIdentityRegistry {
+  /**
+   * Discover a new {@link ServiceIdentityInfoDpo} for the given service identity type. Typically
+   * used during entity creation to associate a default or generated identity.
+   *
+   * @param serviceIdentityType The type of service identity (e.g., AWS_IAM).
+   * @return A new {@link ServiceIdentityInfoDpo} representing the discovered service identity.
+   */
+  Optional<ServiceIdentityInfoDpo> discoverServiceIdentity(ServiceIdentityType serviceIdentityType);
+
+  /**
+   * Resolves the given service identity by retrieving the actual credential or secret referenced by
+   * it, typically from a secret manager or internal credential store.
+   *
+   * @param serviceIdentityInfo The service identity metadata to resolve.
+   * @return A {@link ResolvedServiceIdentity} including credentials and other resolved data.
+   */
+  Optional<ResolvedServiceIdentity> resolveServiceIdentity(
+      ServiceIdentityInfoDpo serviceIdentityInfo);

Review Comment:
   > [...] each catalog or tenant may be associated with a distinct service 
identity [...]
   
   This is not quite what I meant. This describes a possible state of the 
system (i.e. Polaris). Valid / possible states of the system depend on the 
implementation. Polaris has multiple plugin points, whose implementations may 
alter what is possible or intended in a particular deployment.
   
   I'd prefer the javadoc to stay close to what implementations of this 
particular API should / could do when invoked by other components so that 
code-level behaviour expectations are clear.
   
   > Is there a reason we'd want to pass in ConnectionConfigInfo specifically? 
   
   In my mind this gives more weight to the idea that the returned identity 
object is a function of the catalog.
   
   Ideally, I'd prefer the input to be a catalog entity, but it's not available 
at call sites :shrug: 
   
   Having `ServiceIdentityType` as the parameter type is ok too. It limits the 
implementation to making decisions based only on the type (and not other 
properties), but ATM I do not have a use case for using something other than the 
type.
   
   > For discoverServiceIdentity, previously we use assignServiceIdentity which 
is almost the same as the allocateServiceIdentity.
   
   Yes, the distinction is pretty subtle, I agree. In my mind "assign" means we 
connect or link something that already exists. "Discover" is when we find 
something preconfigured and return it based on the call arguments. However, the 
idea is (if I understand correctly) that the implementation may actively 
"generate" a new service identity on demand, hence "allocate". If you have other 
suggestions, please share :slightly_smiling_face: 
   
   > I prefer to use ServiceIdentityProvider
   
   WDYT about `ServiceIdentityProvider.forCatalog(ServiceIdentityType)`? Not 
using a verb in the method name sets fewer behaviour expectations, so the onus 
would be on the javadoc to clearly describe what is expected.
   
   > ServiceIdentityRegistry should own both allocation and resolution of 
service identities. The main duty of it is to interact with remote secret 
manager to get the service identity info and its credential.
   
   Yeah, I guess this is where we have different visions :sweat_smile: 
   
   Is it necessary to couple the "identity" (e.g. role ARN) with credentials? I 
can easily imagine deployments where the Polaris service identity is configured 
statically, but credentials are acquired based on "workload identity". These 
would be totally different and unrelated flows at runtime, so why force them 
into the same Java interface?
   
   Also, I do not think that the remote secret manager is a required component. 
In the workload identity case, for example, it is not necessary.
   
   In my proposal, implementations of `ServiceCredentialsResolver` may use the 
secret manager, if appropriate, or other means depending on the deployment.
   
   > The ResolvedServiceIdentity could contain empty credential and delegate it 
to PolarisCredentialManager to meet this use case right?
   
   I'm trying to structure the interfaces such that alternative use cases could 
be implemented without passing "fake" or "null" objects through call paths that 
are not necessary in those particular situations. Basically, if an 
implementation does not need to load secrets, then the secret manager will not 
even be injected into the runtime objects.
   
   > PolarisCredentialManager
   
   This interface from your message looks reasonable to me ATM :+1: 
   
   So, to recap: from my POV I would like to separate "identity" (as in "role 
ARN") from credential lookup/resolution in this PR. Does this sound reasonable 
to you? The other matters we can refine later.
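   
   To make the proposed split concrete, here is a rough sketch of how the two 
concerns could be separated. Only the names `ServiceIdentityProvider`, 
`forCatalog(ServiceIdentityType)` and `ServiceCredentialsResolver` come from 
this thread. Everything else (the `ServiceIdentity` and `ServiceCredentials` 
records, the `resolve` method, both implementation classes, the example ARN and 
token string) is hypothetical and is not existing Polaris API.
   
   ```java
   import java.util.Map;
   import java.util.Optional;

   /** Illustrative sketch only; none of these types are actual Polaris classes. */
   public class IdentitySplitSketch {
     /** Stand-in for Polaris' ServiceIdentityType (AWS_IAM is the javadoc's example). */
     public enum ServiceIdentityType { AWS_IAM }

     /** Hypothetical identity value, e.g. a role ARN. It carries no secrets. */
     public record ServiceIdentity(ServiceIdentityType type, String principal) {}

     /** Hypothetical credentials value. */
     public record ServiceCredentials(String token) {}

     /** Supplies the identity to use for a catalog; preconfigured or generated. */
     public interface ServiceIdentityProvider {
       Optional<ServiceIdentity> forCatalog(ServiceIdentityType type);
     }

     /** Obtains credentials for an identity; a secret manager is optional here. */
     public interface ServiceCredentialsResolver {
       ServiceCredentials resolve(ServiceIdentity identity);
     }

     /** Statically configured identities: no secret manager is involved. */
     public static final class StaticIdentityProvider implements ServiceIdentityProvider {
       private final Map<ServiceIdentityType, String> principals;

       public StaticIdentityProvider(Map<ServiceIdentityType, String> principals) {
         this.principals = Map.copyOf(principals);
       }

       @Override
       public Optional<ServiceIdentity> forCatalog(ServiceIdentityType type) {
         return Optional.ofNullable(principals.get(type))
             .map(principal -> new ServiceIdentity(type, principal));
       }
     }

     /** Placeholder for a workload-identity flow; a real one would call a cloud SDK. */
     public static final class WorkloadIdentityResolver implements ServiceCredentialsResolver {
       @Override
       public ServiceCredentials resolve(ServiceIdentity identity) {
         // Assumption: the token is derived from the runtime environment, not a secret store.
         return new ServiceCredentials("ambient-token-for:" + identity.principal());
       }
     }

     public static void main(String[] args) {
       ServiceIdentityProvider identities =
           new StaticIdentityProvider(
               Map.of(ServiceIdentityType.AWS_IAM, "arn:aws:iam::123456789012:role/polaris"));
       ServiceCredentialsResolver credentials = new WorkloadIdentityResolver();

       identities
           .forCatalog(ServiceIdentityType.AWS_IAM)
           .map(credentials::resolve)
           .ifPresent(c -> System.out.println(c.token()));
     }
   }
   ```
   
   With this shape, a deployment that relies on workload identity never has to 
construct or inject a secret manager, and a deployment that does need one can 
hide it behind its own `ServiceCredentialsResolver` implementation.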



