obelix74 commented on code in PR #4707:
URL: https://github.com/apache/polaris/pull/4707#discussion_r3433036133


##########
polaris-core/src/main/java/org/apache/polaris/core/storage/gcp/GcpCredentialsStorageIntegration.java:
##########
@@ -246,6 +333,40 @@ static StorageAccessConfig 
compute(GcpStorageCredentialCacheKey key) {
     return accessConfig.build();
   }
 
+  /**
+   * Returns the credential to be used as the source for downscoping.
+   *
+   * <p>When GCS principal attribution is configured and a principal is 
present (so the cache key
+   * carries pre-computed {@link GcpAttributionParams}), the impersonation 
source is a federated
+   * identity whose subject is {@code <realm>/<principal>}, which surfaces the 
principal in {@code
+   * serviceAccountDelegationInfo} of GCS Data Access audit logs. Otherwise 
this is the standard
+   * path: impersonate the configured service account from the ambient source 
credentials, or use
+   * those credentials directly.
+   */
+  private static GoogleCredentials baseCredentialsForVending(

Review Comment:
   Renamed to `resolveSourceCredentials` — it now reads clearly as "pick the 
right source credential for downscoping" regardless of whether attribution is 
active.



##########
polaris-core/src/main/java/org/apache/polaris/core/storage/gcp/GcpCredentialsStorageIntegration.java:
##########
@@ -246,6 +333,40 @@ static StorageAccessConfig 
compute(GcpStorageCredentialCacheKey key) {
     return accessConfig.build();
   }
 
+  /**
+   * Returns the credential to be used as the source for downscoping.
+   *
+   * <p>When GCS principal attribution is configured and a principal is 
present (so the cache key
+   * carries pre-computed {@link GcpAttributionParams}), the impersonation 
source is a federated
+   * identity whose subject is {@code <realm>/<principal>}, which surfaces the 
principal in {@code
+   * serviceAccountDelegationInfo} of GCS Data Access audit logs. Otherwise 
this is the standard
+   * path: impersonate the configured service account from the ambient source 
credentials, or use
+   * those credentials directly.
+   */
+  private static GoogleCredentials baseCredentialsForVending(
+      GcpStorageCredentialCacheKey key,
+      GcpStorageConfigurationInfo storageConfig,
+      GoogleCredentials sourceCredentials,
+      GcpCredentialOps credentialOps) {
+    Optional<GcpAttributionParams> attributionParams = key.attributionParams();
+    if (attributionParams.isPresent()) {
+      GcpAttributionParams params = attributionParams.get();
+      String subject =
+          GcpAttributionSubjectBuilder.buildSubject(key.realmId(), 
key.principalName());
+      GcpFederatedCredentialsExchanger exchanger =
+          new GcpFederatedCredentialsExchanger(
+              params.tokenIssuer(),
+              params.wifAudience(),
+              Path.of(params.signingKeyFile()),

Review Comment:
   Moved the validation to `GcpAttributionParams` itself via a `@Value.Derived 
signingKeyPath()` field that calls `Path.of(signingKeyFile())`. Immutables 
evaluates derived fields at construction time, so an invalid path string now 
throws `InvalidPathException` in `resolveAttributionParams` (at integration 
creation) rather than on the first cache miss.



##########
polaris-core/src/main/java/org/apache/polaris/core/storage/gcp/GcpStorageCredentialCacheKey.java:
##########
@@ -55,25 +61,43 @@ public interface GcpStorageCredentialCacheKey extends 
StorageCredentialCacheKey
   @Value.Parameter(order = 6)
   Optional<String> refreshCredentialsEndpoint();
 
+  /**
+   * The requesting principal name, included in the cache key only when GCS 
principal attribution is
+   * enabled (otherwise empty). When attribution is active the vended 
credential carries this
+   * principal's identity, so it must participate in cache identity to avoid 
serving one principal's
+   * attributed credentials to another.
+   */
+  @Value.Parameter(order = 7)
+  String principalName();

Review Comment:
   Changed to `Optional<String>`. Also updated `buildCacheKey` to use 
`Optional.empty()` as the absent-principal sentinel and 
`context.principalName().filter(n -> !n.isEmpty())` to populate it, keeping the 
logic consistent with the rest of the file.



##########
polaris-core/src/main/java/org/apache/polaris/core/config/FeatureConfiguration.java:
##########
@@ -209,6 +209,64 @@ public static void enforceFeatureEnabledOrThrow(
               .defaultValue(List.<String>of())
               .buildFeatureConfiguration();
 
+  // 
---------------------------------------------------------------------------
+  // GCS principal attribution via Workload Identity Federation
+  //
+  // GCP downscoped credentials have no session-tag mechanism (unlike AWS 
STS), and custom audit
+  // headers only reach GCS audit logs if the client forwards them. To 
attribute GCS data access
+  // to the Polaris principal for ANY client, credential vending can chain
+  // catalog-signed JWT -> STS token exchange -> per-catalog service-account 
impersonation, so the
+  // principal appears in serviceAccountDelegationInfo of every GCS Data 
Access audit log entry.
+  //
+  // Attribution activates automatically once the audience, issuer, and 
signing key file are all
+  // set (no on/off flag); it additionally requires a gcpServiceAccount on the 
storage config.
+  // 
---------------------------------------------------------------------------
+
+  public static final FeatureConfiguration<String> 
GCS_PRINCIPAL_ATTRIBUTION_WIF_AUDIENCE =

Review Comment:
   RealmConfig settings are read-only once the server is up — there is no admin 
API to mutate them in-flight. The validation in `resolveAttributionParams` 
fires once per `GcpCredentialsStorageIntegration` construction (which happens 
once per realm at startup via `PolarisStorageIntegrationProviderImpl`), so in 
practice it behaves like a startup check. Moving it to an explicit lifecycle 
hook would be cleaner but requires a larger refactor across storage providers; 
happy to file a follow-up if the project wants a uniform "validate on start" 
contract.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to