dimas-b commented on code in PR #3414:
URL: https://github.com/apache/polaris/pull/3414#discussion_r2692150954


##########
runtime/service/src/main/java/org/apache/polaris/service/catalog/io/StorageAccessConfigProvider.java:
##########
@@ -187,6 +193,62 @@ private CredentialVendingContext 
buildCredentialVendingContext(
       builder.activatedRoles(Optional.of(rolesString));
     }
 
+    // Only include trace ID when the feature flag is enabled.
+    // When enabled, trace IDs are included in AWS STS session tags and become 
part of the
+    // credential cache key (since they affect the vended credentials).
+    // When disabled (default), trace IDs are not included, allowing efficient 
credential
+    // caching across requests with different trace IDs.
+    boolean includeTraceIdInSessionTags =
+        storageCredentialsVendor
+            .getRealmConfig()
+            .getConfig(FeatureConfiguration.INCLUDE_TRACE_ID_IN_SESSION_TAGS);
+    if (includeTraceIdInSessionTags) {
+      builder.traceId(getCurrentTraceId());
+    }
+
+    // Only include request ID when the feature flag is enabled.
+    // When enabled, request IDs are included in AWS STS session tags and 
become part of the
+    // credential cache key (since they affect the vended credentials).
+    // When disabled (default), request IDs are not included, allowing 
efficient credential
+    // caching across requests with different request IDs.
+    boolean includeRequestIdInSessionTags =
+        storageCredentialsVendor
+            .getRealmConfig()
+            
.getConfig(FeatureConfiguration.INCLUDE_REQUEST_ID_IN_SESSION_TAGS);
+    if (includeRequestIdInSessionTags) {
+      builder.requestId(getCurrentRequestId());
+    }
+
     return builder.build();
   }
+
+  /**
+   * Extracts the current OpenTelemetry trace ID from the active span context.
+   *
+   * @return the trace ID if a valid span context exists, empty otherwise
+   */
+  private Optional<String> getCurrentTraceId() {
+    SpanContext spanContext = Span.current().getSpanContext();
+    if (spanContext.isValid()) {
+      return Optional.of(spanContext.getTraceId());
+    }
+    return Optional.empty();
+  }
+
+  /**
+   * Extracts the current request ID from the active RESTEasy Reactive request 
context.
+   *
+   * <p>Note: avoid injecting ContainerRequestContext here because some tests 
may run with no active
+   * request scope.
+   */
+  private Optional<String> getCurrentRequestId() {
+    // See org.jboss.resteasy.reactive.server.injection.ContextProducers
+    ResteasyReactiveRequestContext context = CurrentRequestManager.get();

Review Comment:
   Sorry, but I do not think this is a totally correct way to get request IDs. 
IIRC, `CurrentRequestManager.get()` will fail in runtime if the this method is 
called outside an HTTP (REST) request context, which is possible in async tasks.
   
   Please see how `TaskExecutorImpl` deals with realm IDs. A similar CDI 
patterns is probably necessary for request IDs, except that we may not have to 
produce a request ID for background tasks. I do not have an opinion on whether 
request IDs should be propagated from REST requests to related async tasks or 
not. However, CDI must be solid and not cause runtime exceptions even if the 
request ID is not propagated.



##########
polaris-core/src/main/java/org/apache/polaris/core/storage/aws/AwsSessionTagsBuilder.java:
##########
@@ -78,24 +98,68 @@ public static List<Tag> buildSessionTags(String 
principalName, CredentialVending
             
.value(truncateTagValue(context.tableName().orElse(TAG_VALUE_UNKNOWN)))
             .build());
 
+    // Only include trace ID if it's present in the context.
+    // The context's traceId is only populated when 
INCLUDE_TRACE_ID_IN_SESSION_TAGS is enabled.
+    // This allows efficient credential caching when trace IDs are not needed 
in session tags.
+    context
+        .traceId()
+        .ifPresent(
+            traceId ->
+                tags.add(
+                    Tag.builder()
+                        .key(CredentialVendingContext.TAG_KEY_TRACE_ID)
+                        .value(truncateTagValue(traceId))
+                        .build()));
+
+    // Only include request ID if it's present in the context.
+    // The context's requestId is only populated when 
INCLUDE_REQUEST_ID_IN_SESSION_TAGS is enabled.
+    // This allows efficient credential caching when request IDs are not 
needed in session tags.
+    context
+        .requestId()
+        .ifPresent(
+            requestId ->
+                tags.add(
+                    Tag.builder()
+                        .key(CredentialVendingContext.TAG_KEY_REQUEST_ID)
+                        .value(truncateTagValue(requestId))
+                        .build()));
+
     return tags;
   }
 
   /**
-   * Truncates a tag value to fit within AWS STS limits. AWS limits session 
tag values to 256
-   * characters. Returns "unknown" placeholder for null or empty values to 
ensure consistent tag
-   * presence in CloudTrail.
+   * Prepares a tag value for AWS STS by sanitizing invalid characters and 
truncating to fit within
+   * AWS limits. AWS limits session tag values to 256 characters and restricts 
the allowed character
+   * set to Unicode letters, whitespace, numbers, and specific special 
characters (_.:/=+\-@).
+   *
+   * <p>Invalid characters are replaced with underscores to maintain value 
readability while
+   * ensuring STS acceptance. Returns "unknown" placeholder for null or empty 
values to ensure
+   * consistent tag presence in CloudTrail.
    *
-   * @param value the value to potentially truncate
-   * @return the truncated value, or "unknown" if value is null or empty
+   * @param value the value to sanitize and potentially truncate
+   * @return the sanitized and truncated value, or "unknown" if value is null 
or empty
    */
   static String truncateTagValue(String value) {
     if (value == null || value.isEmpty()) {
       return TAG_VALUE_UNKNOWN;
     }
-    if (value.length() <= MAX_TAG_VALUE_LENGTH) {
-      return value;
+    // Sanitize invalid characters first, then truncate
+    String sanitized = sanitizeTagValue(value);
+    if (sanitized.length() <= MAX_TAG_VALUE_LENGTH) {
+      return sanitized;
     }
-    return value.substring(0, MAX_TAG_VALUE_LENGTH);
+    return sanitized.substring(0, MAX_TAG_VALUE_LENGTH);
+  }
+
+  /**
+   * Sanitizes a tag value by replacing characters that are not allowed in AWS 
STS session tag
+   * values. AWS STS allows: Unicode letters, whitespace, numbers, and 
specific special characters
+   * (_.:/=+\-@). Invalid characters are replaced with underscores.
+   *
+   * @param value the value to sanitize (must not be null)
+   * @return the sanitized value with invalid characters replaced by 
underscores
+   */
+  static String sanitizeTagValue(String value) {
+    return INVALID_TAG_VALUE_CHARS.matcher(value).replaceAll("_");

Review Comment:
   What's the point in including trace ID / request ID if their values are 
modified? They will become useless for automated processing / correlation, I 
think 🤔 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to