From Ritik Raj <[email protected]>:

Ritik Raj has uploaded this change for review. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/21198?usp=email )


Change subject: [ASTERIXDB-3768][CLOUD] Add configurable S3 checksum behavior for S3-compatible storage
......................................................................

[ASTERIXDB-3768][CLOUD] Add configurable S3 checksum behavior for S3-compatible storage

- user model changes: no
- storage format changes: no
- interface changes: no

AWS SDK Java v2 >= 2.30.0 introduced new cross-SDK checksum defaults
(WHEN_SUPPORTED) that break S3-compatible storage solutions (e.g. OCI)
which do not support the newer checksum APIs.

Add a new CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR configuration option
(values: when_required | when_supported | auto) with a smart default:
- when_required: when a custom endpoint is configured (S3-compatible)
- auto (SDK default): when using native AWS S3
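
For illustration, the default selection described above can be sketched in plain Java. This is a minimal, self-contained sketch under stated assumptions, not the code in this change: the class `ChecksumBehaviorSketch` and the method `resolve` are hypothetical names, the nested enum only mirrors the three documented values, and the AWS SDK builder calls are left out so the snippet runs on the JDK alone.

```java
import java.util.Locale;

// Hypothetical illustration only; not one of the classes touched by this change.
final class ChecksumBehaviorSketch {

    // Mirrors the three documented values of the new option.
    enum S3ChecksumBehavior {
        WHEN_REQUIRED,
        WHEN_SUPPORTED,
        AUTO;

        // Case-insensitive parse of a configured value such as "when_required".
        static S3ChecksumBehavior fromString(String value) {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        }
    }

    // An explicit setting wins; otherwise the default depends on whether a
    // custom endpoint (i.e. an S3-compatible store) is configured.
    static S3ChecksumBehavior resolve(String configured, String endpoint) {
        if (configured != null && !configured.isBlank()) {
            return S3ChecksumBehavior.fromString(configured);
        }
        boolean nativeAws = endpoint == null || endpoint.isEmpty();
        return nativeAws ? S3ChecksumBehavior.AUTO : S3ChecksumBehavior.WHEN_REQUIRED;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, ""));
        System.out.println(resolve(null, "https://objectstorage.example.com"));
        System.out.println(resolve("when_supported", "https://objectstorage.example.com"));
    }
}
```

In this sketch an unrecognized value makes `fromString` throw `IllegalArgumentException`; whether the real option rejects or tolerates unknown values is not stated in this message.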

ext-ref: MB-71732
Change-Id: If6618d3a336e9bf134efb1f219660421edc27c43
---
M asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1/cluster_state_1.1.regexadm
M asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_full/cluster_state_1_full.1.regexadm
M asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_less/cluster_state_1_less.1.regexadm
M asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3ClientConfig.java
M asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3CloudClient.java
M asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/CloudProperties.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/aws/s3/S3Utils.java
A docs/superpowers/plans/2026-05-05-s3-checksum-behavior.md
A docs/superpowers/specs/2026-05-05-s3-checksum-behavior-design.md
M hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/io/ICloudProperties.java
10 files changed, 590 insertions(+), 19 deletions(-)



  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/98/21198/1

diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1/cluster_state_1.1.regexadm b/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1/cluster_state_1.1.regexadm
index 10c1856..cffa1fa 100644
--- a/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1/cluster_state_1.1.regexadm
+++ b/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1/cluster_state_1.1.regexadm
@@ -40,6 +40,7 @@
     "cloud.storage.prefix" : "",
     "cloud.storage.region" : "",
     "cloud.storage.s3.access.key.id" : null,
+    "cloud.storage.s3.checksum.behavior" : "auto",
     "cloud.storage.s3.client.read.timeout" : -1,
     "cloud.storage.s3.parallel.downloader.client.type" : "crt",
     "cloud.storage.s3.secret.access.key" : null,
diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_full/cluster_state_1_full.1.regexadm b/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_full/cluster_state_1_full.1.regexadm
index 22c4fc8..6f93933 100644
--- a/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_full/cluster_state_1_full.1.regexadm
+++ b/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_full/cluster_state_1_full.1.regexadm
@@ -40,6 +40,7 @@
     "cloud.storage.prefix" : "",
     "cloud.storage.region" : "",
     "cloud.storage.s3.access.key.id" : null,
+    "cloud.storage.s3.checksum.behavior" : "auto",
     "cloud.storage.s3.client.read.timeout" : -1,
     "cloud.storage.s3.parallel.downloader.client.type" : "crt",
     "cloud.storage.s3.secret.access.key" : null,
diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_less/cluster_state_1_less.1.regexadm b/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_less/cluster_state_1_less.1.regexadm
index a36d3b5..b01ff20 100644
--- a/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_less/cluster_state_1_less.1.regexadm
+++ b/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_less/cluster_state_1_less.1.regexadm
@@ -40,6 +40,7 @@
     "cloud.storage.prefix" : "",
     "cloud.storage.region" : "",
     "cloud.storage.s3.access.key.id" : null,
+    "cloud.storage.s3.checksum.behavior" : "auto",
     "cloud.storage.s3.client.read.timeout" : -1,
     "cloud.storage.s3.parallel.downloader.client.type" : "crt",
     "cloud.storage.s3.secret.access.key" : null,
diff --git a/asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3ClientConfig.java b/asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3ClientConfig.java
index dd8485c..afca2e81 100644
--- a/asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3ClientConfig.java
+++ b/asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3ClientConfig.java
@@ -56,12 +56,14 @@
     private final boolean roundRobinDnsResolver;
     private final String accessKeyId;
     private final String secretAccessKey;
+    private final ICloudProperties.S3ChecksumBehavior checksumBehavior;

     public S3ClientConfig(String region, String endpoint, String prefix, boolean anonymousAuth,
             Collection<String> certificates, long profilerLogInterval, int writeBufferSize,
             S3ParallelDownloaderClientType parallelDownloaderClientType, boolean roundRobinDnsResolver) {
         this(region, endpoint, prefix, anonymousAuth, certificates, profilerLogInterval, writeBufferSize, 1, 0, 0, 0,
-                false, false, 0, 0, -1, parallelDownloaderClientType, roundRobinDnsResolver, "", "");
+                false, false, 0, 0, -1, parallelDownloaderClientType, roundRobinDnsResolver, "", "",
+                ICloudProperties.S3ChecksumBehavior.WHEN_REQUIRED);
     }

     private S3ClientConfig(String region, String endpoint, String prefix, boolean anonymousAuth,
@@ -70,7 +72,7 @@
             boolean forcePathStyle, boolean disableSslVerify, int requestsMaxPendingHttpConnections,
             int requestsHttpConnectionAcquireTimeout, int s3ReadTimeoutInSeconds,
             S3ParallelDownloaderClientType parallelDownloaderClientType, boolean roundRobinDnsResolver,
-            String accessKeyId, String secretAccessKey) {
+            String accessKeyId, String secretAccessKey, ICloudProperties.S3ChecksumBehavior checksumBehavior) {
         this.region = Objects.requireNonNull(region, "region");
         this.endpoint = endpoint;
         this.prefix = Objects.requireNonNull(prefix, "prefix");
@@ -91,6 +93,7 @@
         this.roundRobinDnsResolver = roundRobinDnsResolver;
         this.accessKeyId = accessKeyId;
         this.secretAccessKey = secretAccessKey;
+        this.checksumBehavior = Objects.requireNonNull(checksumBehavior, "checksumBehavior");
     }

     public static S3ClientConfig of(ICloudProperties cloudProperties) {
@@ -104,7 +107,7 @@
                 cloudProperties.getRequestsHttpConnectionAcquireTimeout(), cloudProperties.getS3ReadTimeoutInSeconds(),
                 S3ParallelDownloaderClientType.valueOf(cloudProperties.getS3ParallelDownloaderClientType()),
                 cloudProperties.useRoundRobinDnsResolver(), cloudProperties.getS3AccessKeyId(),
-                cloudProperties.getS3SecretAccessKey());
+                cloudProperties.getS3SecretAccessKey(), cloudProperties.getS3ChecksumBehavior());
     }

     public enum S3ParallelDownloaderClientType {
@@ -126,7 +129,9 @@
     }

     public static S3ClientConfig of(Map<String, String> configuration, int writeBufferSize) {
-        // Used to determine local vs. actual S3
+        // Used to determine local vs. actual S3.
+        // checksumBehavior defaults to "when_required" via the convenience constructor,
+        // appropriate here since a custom endpoint is always present.
         String endPoint = configuration.getOrDefault(AwsConstants.SERVICE_END_POINT_FIELD_NAME, "");
         // Disabled
         long profilerLogInterval = 0;
@@ -227,4 +232,8 @@
     public boolean useRoundRobinDnsResolver() {
         return roundRobinDnsResolver;
     }
+
+    public ICloudProperties.S3ChecksumBehavior getChecksumBehavior() {
+        return checksumBehavior;
+    }
 }
diff --git a/asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3CloudClient.java b/asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3CloudClient.java
index de69617..e1fa4e9 100644
--- a/asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3CloudClient.java
+++ b/asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3CloudClient.java
@@ -52,6 +52,7 @@
 import org.apache.asterix.common.exceptions.RuntimeDataException;
 import org.apache.asterix.external.util.aws.AwsUtils;
 import org.apache.asterix.external.util.aws.AwsUtils.CloseableAwsClients;
+import org.apache.asterix.external.util.aws.s3.S3Utils;
 import org.apache.hyracks.api.exceptions.HyracksDataException;
 import org.apache.hyracks.api.io.FileReference;
 import org.apache.hyracks.api.util.IoUtil;
@@ -382,6 +383,7 @@
         builder.credentialsProvider(credentialsProvider);
         builder.region(Region.of(config.getRegion()));
         builder.forcePathStyle(config.isForcePathStyle());
+        S3Utils.applyChecksumBehavior(builder, config.getChecksumBehavior());

         AttributeMap.Builder customHttpConfigBuilder = AttributeMap.builder();
         if (config.getRequestsMaxHttpConnections() > 0) {
diff --git a/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/CloudProperties.java b/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/CloudProperties.java
index 98c54f3..c4d040f 100644
--- a/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/CloudProperties.java
+++ b/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/CloudProperties.java
@@ -100,7 +100,12 @@
         CLOUD_STORAGE_S3_USE_ROUND_ROBIN_DNS_RESOLVER(BOOLEAN, false),
         CLOUD_STORAGE_S3_ACCESS_KEY_ID(STRING, (String) null),
         CLOUD_STORAGE_S3_SECRET_ACCESS_KEY(STRING, (String) null),
-        CLOUD_STORAGE_AZURE_CLIENT_ID(STRING, (String) null),;
+        CLOUD_STORAGE_AZURE_CLIENT_ID(STRING, (String) null),
+        CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR(STRING, (Function<IApplicationConfig, String>) app -> {
+            String endpoint = app.getString(CLOUD_STORAGE_ENDPOINT);
+            return (endpoint == null || endpoint.isEmpty()) ? ICloudProperties.S3ChecksumBehavior.AUTO.name()
+                    : ICloudProperties.S3ChecksumBehavior.WHEN_REQUIRED.name();
+        });

         private final IOptionType interpreter;
         private final Object defaultValue;
@@ -151,6 +156,7 @@
                 case CLOUD_STORAGE_S3_ACCESS_KEY_ID:
                 case CLOUD_STORAGE_S3_SECRET_ACCESS_KEY:
                 case CLOUD_STORAGE_AZURE_CLIENT_ID:
+                case CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR:
                     return Section.COMMON;
                 default:
                     throw new IllegalStateException("NYI: " + this);
@@ -240,6 +246,13 @@
                     return "The S3 secret access key for static credential authentication (defaults to null, which indicates to use default credential chain)";
                 case CLOUD_STORAGE_AZURE_CLIENT_ID:
                     return "The Azure user managed identity client ID (defaults to null, which takes the system managed identity client ID)";
+                case CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR:
+                    return "The checksum behavior for S3 requests and responses. Accepted values: "
+                            + "'when_required' (only checksums mandated by the operation), "
+                            + "'when_supported' (checksums on all eligible operations, SDK >= 2.30 default), "
+                            + "'auto' (no explicit override, defer to SDK default). "
+                            + "Defaults to 'when_required' when a custom endpoint is configured "
+                            + "(S3-compatible stores), 'auto' for native AWS S3.";
                 default:
                     throw new IllegalStateException("NYI: " + this);
             }
@@ -260,6 +273,9 @@
             if (this == CLOUD_STORAGE_S3_PARALLEL_DOWNLOADER_CLIENT_TYPE) {
                 return "crt if no custom endpoint is set; async otherwise";
             }
+            if (this == CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR) {
+                return "when_required if a custom endpoint is set; auto otherwise";
+            }
             return IOption.super.usageDefaultOverride(accessor, optionPrinter);
         }

@@ -406,4 +422,10 @@
     public String getAzureClientId() {
         return accessor.getString(Option.CLOUD_STORAGE_AZURE_CLIENT_ID);
     }
+
+    // Parses the stored string value to the S3ChecksumBehavior enum
+    public ICloudProperties.S3ChecksumBehavior getS3ChecksumBehavior() {
+        return ICloudProperties.S3ChecksumBehavior
+                .fromString(accessor.getString(Option.CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR));
+    }
 }
diff --git a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/aws/s3/S3Utils.java b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/aws/s3/S3Utils.java
index 07424b1..262f9bf 100644
--- a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/aws/s3/S3Utils.java
+++ b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/aws/s3/S3Utils.java
@@ -106,7 +106,7 @@
 import org.apache.hyracks.api.exceptions.IWarningCollector;
 import org.apache.hyracks.api.exceptions.SourceLocation;
 import org.apache.hyracks.api.exceptions.Warning;
-import org.apache.hyracks.util.annotations.AiProvenance;
+import org.apache.hyracks.cloud.io.ICloudProperties;
 import org.apache.logging.log4j.LogManager;
 import org.apache.logging.log4j.Logger;

@@ -185,24 +185,29 @@
         } else if (certificates != null && !certificates.isBlank()) {
             builder.httpClient(createHttpClient(certificates));
         }
-        if (serviceEndpoint != null) {
-            configureS3CompatibleSettings(serviceEndpoint, builder);
-        }
+        applyChecksumBehavior(builder, appCtx.getCloudProperties().getS3ChecksumBehavior());
         awsClients.setConsumingClient(builder.build());
         return awsClients;

     }

-    @AiProvenance(agent = AiProvenance.Agent.CLAUDE_SONNET_4_6, tool = AiProvenance.Tool.GITHUB_COPILOT)
-    private static void configureS3CompatibleSettings(String serviceEndpoint, S3ClientBuilder builder) {
-        // AWS SDK 2.43+ sends CRC64NVME request checksums by default for all eligible operations.
-        // S3-compatible endpoints (non-AWS) and older mock servers do not understand this header and
-        // may reject or mishandle requests, returning empty or error responses. When a custom endpoint
-        // is configured (i.e. not talking to real AWS S3), disable automatic checksum calculation so
-        // only operations that explicitly require a checksum will include one.
-        if (serviceEndpoint != null) {
-            builder.requestChecksumCalculation(RequestChecksumCalculation.WHEN_REQUIRED);
-            builder.responseChecksumValidation(ResponseChecksumValidation.WHEN_REQUIRED);
+    public static void applyChecksumBehavior(S3ClientBuilder builder, ICloudProperties.S3ChecksumBehavior behavior) {
+        if (behavior == null) {
+            LOGGER.warn("checksumBehavior is null; falling back to SDK defaults.");
+            return;
+        }
+        switch (behavior) {
+            case WHEN_REQUIRED:
+                builder.requestChecksumCalculation(RequestChecksumCalculation.WHEN_REQUIRED);
+                builder.responseChecksumValidation(ResponseChecksumValidation.WHEN_REQUIRED);
+                break;
+            case WHEN_SUPPORTED:
+                builder.requestChecksumCalculation(RequestChecksumCalculation.WHEN_SUPPORTED);
+                builder.responseChecksumValidation(ResponseChecksumValidation.WHEN_SUPPORTED);
+                break;
+            case AUTO:
+                // leave SDK defaults untouched
+                break;
         }
     }

diff --git a/docs/superpowers/plans/2026-05-05-s3-checksum-behavior.md b/docs/superpowers/plans/2026-05-05-s3-checksum-behavior.md
new file mode 100644
index 0000000..dab1945
--- /dev/null
+++ b/docs/superpowers/plans/2026-05-05-s3-checksum-behavior.md
@@ -0,0 +1,384 @@
+# S3 Checksum Behavior Configuration Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Add a configurable `CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR` option (values: `when_required` | `when_supported` | `auto`) to `CloudProperties` that controls AWS SDK checksum behavior for both the blob storage and external-links S3 client, defaulting to `when_required` when a custom endpoint is set and `auto` otherwise.
+
+**Architecture:** The option lives in `CloudProperties` and is surfaced through the `ICloudProperties` interface. For blob storage it flows through `S3ClientConfig` → `S3CloudClient.buildClient`. For external links it is read via `appCtx.getCloudProperties()` inside `S3Utils.buildClient`, replacing the current hardcoded `configureS3CompatibleSettings` method.
+
+**Tech Stack:** Java 17, AWS SDK v2 (`software.amazon.awssdk.core.checksums.RequestChecksumCalculation`, `ResponseChecksumValidation`), JUnit 4
+
+---
+
+## Files to Change
+
+| File | Action |
+|---|---|
+| `hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/io/ICloudProperties.java` | Add `getS3ChecksumBehavior()` method |
+| `asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/CloudProperties.java` | Add `CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR` option + accessor |
+| `asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3ClientConfig.java` | Add `checksumBehavior` field + getter; wire into constructors + `of(ICloudProperties)` |
+| `asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3CloudClient.java` | Add `applyChecksumBehavior` helper; call it in `buildClient` |
+| `asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/aws/s3/S3Utils.java` | Replace `configureS3CompatibleSettings` with `applyChecksumBehavior` helper; read setting from `appCtx.getCloudProperties()` |
+
+---
+
+## Task 1: Add `getS3ChecksumBehavior()` to `ICloudProperties`
+
+**Files:**
+- Modify: `hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/io/ICloudProperties.java`
+
+- [ ] **Step 1: Add the method to the interface**
+
+Open `ICloudProperties.java`. After the existing `getAzureClientId()` method, add:
+
+```java
+    String getS3ChecksumBehavior();
+```
+
+The interface tail should look like:
+
+```java
+    String getS3AccessKeyId();
+
+    String getS3SecretAccessKey();
+
+    String getAzureClientId();
+
+    String getS3ChecksumBehavior();
+}
+```
+
+- [ ] **Step 2: Build to confirm no compilation errors**
+
+```bash
+cd asterixdb
+mvn compile -pl hyracks-fullstack/hyracks/hyracks-cloud -am -q
+```
+
+Expected: `BUILD SUCCESS` (failures are acceptable here because the implementors have not been updated yet; this step only checks syntax)
+
+---
+
+## Task 2: Add `CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR` to `CloudProperties`
+
+**Files:**
+- Modify: `asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/CloudProperties.java`
+
+- [ ] **Step 1: Add the enum constant**
+
+In the `Option` enum, change the last three constants from:
+
+```java
+        CLOUD_STORAGE_S3_ACCESS_KEY_ID(STRING, (String) null),
+        CLOUD_STORAGE_S3_SECRET_ACCESS_KEY(STRING, (String) null),
+        CLOUD_STORAGE_AZURE_CLIENT_ID(STRING, (String) null);
+```
+
+To:
+
+```java
+        CLOUD_STORAGE_S3_ACCESS_KEY_ID(STRING, (String) null),
+        CLOUD_STORAGE_S3_SECRET_ACCESS_KEY(STRING, (String) null),
+        CLOUD_STORAGE_AZURE_CLIENT_ID(STRING, (String) null),
+        CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR(STRING, (Function<IApplicationConfig, String>) app -> {
+            String endpoint = app.getString(CLOUD_STORAGE_ENDPOINT);
+            return (endpoint == null || endpoint.isEmpty()) ? "auto" : "when_required";
+        });
+```
+
+- [ ] **Step 2: Add the case to `section()`**
+
+In the `section()` switch statement, add before `default`:
+
+```java
+                case CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR:
+                    return Section.COMMON;
+```
+
+- [ ] **Step 3: Add the case to `description()`**
+
+In the `description()` switch statement, add before `default`:
+
+```java
+                case CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR:
+                    return "The checksum behavior for S3 requests and responses. Accepted values: "
+                            + "'when_required' (only checksums mandated by the operation), "
+                            + "'when_supported' (checksums on all eligible operations, SDK >= 2.30 default), "
+                            + "'auto' (no explicit override, defer to SDK default). "
+                            + "Defaults to 'when_required' when a custom endpoint is configured "
+                            + "(S3-compatible stores), 'auto' for native AWS S3.";
+```
+
+- [ ] **Step 4: Add `usageDefaultOverride`**
+
+In the `usageDefaultOverride` method, add alongside the existing check:
+
+```java
+        @Override
+        public String usageDefaultOverride(IApplicationConfig accessor, Function<IOption, String> optionPrinter) {
+            if (this == CLOUD_STORAGE_S3_PARALLEL_DOWNLOADER_CLIENT_TYPE) {
+                return "crt if no custom endpoint is set; async otherwise";
+            }
+            if (this == CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR) {
+                return "when_required if a custom endpoint is set; auto otherwise";
+            }
+            return IOption.super.usageDefaultOverride(accessor, optionPrinter);
+        }
+```
+
+- [ ] **Step 5: Add the accessor method**
+
+At the end of the class body (after `getAzureClientId()`), add:
+
+```java
+    public String getS3ChecksumBehavior() {
+        return accessor.getString(Option.CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR);
+    }
+```
+
+- [ ] **Step 6: Build to confirm no compilation errors**
+
+```bash
+cd asterixdb
+mvn compile -pl asterixdb/asterix-common -am -q
+```
+
+Expected: `BUILD SUCCESS`
+
+---
+
+## Task 3: Add `checksumBehavior` to `S3ClientConfig`
+
+**Files:**
+- Modify: `asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3ClientConfig.java`
+
+- [ ] **Step 1: Add the field**
+
+After the `secretAccessKey` field, add:
+
+```java
+    private final String checksumBehavior;
+```
+
+- [ ] **Step 2: Update the private all-args constructor**
+
+Change the constructor signature to add `String checksumBehavior` as the last parameter, and assign it in the body:
+
+```java
+    private S3ClientConfig(String region, String endpoint, String prefix, boolean anonymousAuth,
+            Collection<String> certificates, long profilerLogInterval, int writeBufferSize, long tokenAcquireTimeout,
+            int writeMaxRequestsPerSeconds, int readMaxRequestsPerSeconds, int requestsMaxHttpConnections,
+            boolean forcePathStyle, boolean disableSslVerify, int requestsMaxPendingHttpConnections,
+            int requestsHttpConnectionAcquireTimeout, int s3ReadTimeoutInSeconds,
+            S3ParallelDownloaderClientType parallelDownloaderClientType, boolean roundRobinDnsResolver,
+            String accessKeyId, String secretAccessKey, String checksumBehavior) {
+        // ... existing assignments ...
+        this.accessKeyId = accessKeyId;
+        this.secretAccessKey = secretAccessKey;
+        this.checksumBehavior = Objects.requireNonNull(checksumBehavior, "checksumBehavior");
+    }
+```
+
+- [ ] **Step 3: Update the public convenience constructor**
+
+The public constructor delegates to the private one, passing `"", ""` as the final arguments. Add `"when_required"` as the `checksumBehavior` default (this constructor is used where an endpoint is always passed):
+
+```java
+    public S3ClientConfig(String region, String endpoint, String prefix, boolean anonymousAuth,
+            Collection<String> certificates, long profilerLogInterval, int writeBufferSize,
+            S3ParallelDownloaderClientType parallelDownloaderClientType, boolean roundRobinDnsResolver) {
+        this(region, endpoint, prefix, anonymousAuth, certificates, profilerLogInterval, writeBufferSize, 1, 0, 0, 0,
+                false, false, 0, 0, -1, parallelDownloaderClientType, roundRobinDnsResolver, "", "",
+                "when_required");
+    }
+```
+
+- [ ] **Step 4: Update `of(ICloudProperties)`**
+
+Pass `cloudProperties.getS3ChecksumBehavior()` as the last argument:
+
+```java
+    public static S3ClientConfig of(ICloudProperties cloudProperties) {
+        return new S3ClientConfig(cloudProperties.getStorageRegion(), cloudProperties.getStorageEndpoint(),
+                cloudProperties.getStoragePrefix(), cloudProperties.isStorageAnonymousAuth(),
+                cloudProperties.getStorageCertificates(), cloudProperties.getProfilerLogInterval(),
+                cloudProperties.getWriteBufferSize(), cloudProperties.getTokenAcquireTimeout(),
+                cloudProperties.getWriteMaxRequestsPerSecond(), cloudProperties.getReadMaxRequestsPerSecond(),
+                cloudProperties.getRequestsMaxHttpConnections(), cloudProperties.isStorageForcePathStyle(),
+                cloudProperties.isStorageDisableSSLVerify(), cloudProperties.getRequestsMaxPendingHttpConnections(),
+                cloudProperties.getRequestsHttpConnectionAcquireTimeout(), cloudProperties.getS3ReadTimeoutInSeconds(),
+                S3ParallelDownloaderClientType.valueOf(cloudProperties.getS3ParallelDownloaderClientType()),
+                cloudProperties.useRoundRobinDnsResolver(), cloudProperties.getS3AccessKeyId(),
+                cloudProperties.getS3SecretAccessKey(), cloudProperties.getS3ChecksumBehavior());
+    }
+```
+
+- [ ] **Step 5: Add the getter**
+
+After `useRoundRobinDnsResolver()`, add:
+
+```java
+    public String getChecksumBehavior() {
+        return checksumBehavior;
+    }
+```
+
+- [ ] **Step 6: Build to confirm no compilation errors**
+
+```bash
+cd asterixdb
+mvn compile -pl asterixdb/asterix-cloud -am -q
+```
+
+Expected: `BUILD SUCCESS`
+
+---
+
+## Task 4: Apply checksum behavior in `S3CloudClient.buildClient`
+
+**Files:**
+- Modify: `asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3CloudClient.java`
+
+- [ ] **Step 1: Add SDK checksum imports**
+
+Add alongside existing AWS SDK imports:
+
+```java
+import software.amazon.awssdk.core.checksums.RequestChecksumCalculation;
+import software.amazon.awssdk.core.checksums.ResponseChecksumValidation;
+```
+
+- [ ] **Step 2: Call `applyChecksumBehavior` in `buildClient`**
+
+In `buildClient`, immediately after `builder.forcePathStyle(config.isForcePathStyle());`, add:
+
+```java
+        applyChecksumBehavior(builder, config.getChecksumBehavior());
+```
+
+- [ ] **Step 3: Add the helper method**
+
+Add as a private static method near `buildClient`:
+
+```java
+    private static void applyChecksumBehavior(S3ClientBuilder builder, String behavior) {
+        switch (behavior.toLowerCase(java.util.Locale.ROOT)) {
+            case "when_required":
+                builder.requestChecksumCalculation(RequestChecksumCalculation.WHEN_REQUIRED);
+                builder.responseChecksumValidation(ResponseChecksumValidation.WHEN_REQUIRED);
+                break;
+            case "when_supported":
+                builder.requestChecksumCalculation(RequestChecksumCalculation.WHEN_SUPPORTED);
+                builder.responseChecksumValidation(ResponseChecksumValidation.WHEN_SUPPORTED);
+                break;
+            case "auto":
+            default:
+                // leave SDK defaults untouched
+                break;
+        }
+    }
+```
+
+- [ ] **Step 4: Build and run the existing cloud S3 tests**
+
+```bash
+cd asterixdb
+mvn test -pl asterixdb/asterix-cloud -Dtest=LSMS3Test -q
+```
+
+Expected: `BUILD SUCCESS` and all tests pass.
+
+---
+
+## Task 5: Replace hardcoded checksum logic in `S3Utils`
+
+**Files:**
+- Modify: `asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/aws/s3/S3Utils.java`
+
+- [ ] **Step 1: Replace the call site in `buildClient`**
+
+Remove this block:
+
+```java
+        if (serviceEndpoint != null) {
+            configureS3CompatibleSettings(serviceEndpoint, builder);
+        }
+```
+
+Add in its place:
+
+```java
+        applyChecksumBehavior(builder, appCtx.getCloudProperties().getS3ChecksumBehavior());
+```
+
+- [ ] **Step 2: Remove `configureS3CompatibleSettings`**
+
+Delete the entire `configureS3CompatibleSettings` private static method (including its `@AiProvenance` annotation).
+
+- [ ] **Step 3: Add `applyChecksumBehavior` helper**
+
+Add as a new private static method:
+
+```java
+    private static void applyChecksumBehavior(S3ClientBuilder builder, String behavior) {
+        switch (behavior.toLowerCase(java.util.Locale.ROOT)) {
+            case "when_required":
+                builder.requestChecksumCalculation(RequestChecksumCalculation.WHEN_REQUIRED);
+                builder.responseChecksumValidation(ResponseChecksumValidation.WHEN_REQUIRED);
+                break;
+            case "when_supported":
+                builder.requestChecksumCalculation(RequestChecksumCalculation.WHEN_SUPPORTED);
+                builder.responseChecksumValidation(ResponseChecksumValidation.WHEN_SUPPORTED);
+                break;
+            case "auto":
+            default:
+                // leave SDK defaults untouched
+                break;
+        }
+    }
+```
+
+- [ ] **Step 4: Verify `ResponseChecksumValidation` import**
+
+Confirm these imports exist at the top of `S3Utils.java` (they should already be present):
+
+```java
+import software.amazon.awssdk.core.checksums.RequestChecksumCalculation;
+import software.amazon.awssdk.core.checksums.ResponseChecksumValidation;
+```
+
+- [ ] **Step 5: Build to confirm no compilation errors**
+
+```bash
+cd asterixdb
+mvn compile -pl asterixdb/asterix-external-data -am -q
+```
+
+Expected: `BUILD SUCCESS`
+
+---
+
+## Task 6: Final verification
+
+- [ ] **Step 1: Full build of all affected modules**
+
+```bash
+cd asterixdb
+mvn compile \
+  -pl hyracks-fullstack/hyracks/hyracks-cloud \
+  -pl asterixdb/asterix-common \
+  -pl asterixdb/asterix-cloud \
+  -pl asterixdb/asterix-external-data \
+  -am -q
+```
+
+Expected: `BUILD SUCCESS`
+
+- [ ] **Step 2: Run tests across all affected modules**
+
+```bash
+cd asterixdb
+mvn test -pl asterixdb/asterix-cloud,asterixdb/asterix-external-data -q
+```
+
+Expected: `BUILD SUCCESS` and all tests pass.
diff --git a/docs/superpowers/specs/2026-05-05-s3-checksum-behavior-design.md b/docs/superpowers/specs/2026-05-05-s3-checksum-behavior-design.md
new file mode 100644
index 0000000..7aa2567
--- /dev/null
+++ b/docs/superpowers/specs/2026-05-05-s3-checksum-behavior-design.md
@@ -0,0 +1,122 @@
+# S3 Checksum Behavior Configuration
+
+**Date:** 2026-05-05
+**Status:** Approved
+
+## Problem
+
+AWS SDK Java v2 ≥ 2.30.0 changed its default checksum behavior:
+
+- `requestChecksumCalculation = WHEN_SUPPORTED`: the SDK sends checksums on all eligible S3 operations
+- `responseChecksumValidation = WHEN_SUPPORTED`: the SDK validates response checksums when present
+
+This is the correct behavior for native AWS S3 but breaks S3-compatible storage solutions (e.g. OCI) that do not understand or support these checksum headers, causing request failures.
+
+AsterixDB uses S3 in two code paths:
+1. **Blob storage**: `S3CloudClient` / `S3ClientConfig` (configured via `CloudProperties`)
+2. **S3 external links**: `S3Utils.buildClient` (configured per-query via `Map<String, String>`, but `appCtx.getCloudProperties()` is available at build time)
+
+The external-data path already has a hardcoded workaround (`configureS3CompatibleSettings`) that sets `WHEN_REQUIRED` when a custom endpoint is present. The blob storage path has no checksum configuration at all. Neither path is user-configurable.
+
+## Goal
+
+Add a single, configurable option `CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR` to `CloudProperties` that controls checksum behavior for both code paths, with a smart default that is safe for S3-compatible stores out of the box.
+
+## Design
+
+### Config Option (`CloudProperties`)
+
+```java
+CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR(STRING, (Function<IApplicationConfig, String>) app -> {
+    String endpoint = app.getString(CLOUD_STORAGE_ENDPOINT);
+    return (endpoint == null || endpoint.isEmpty()) ? "auto" : "when_required";
+})
+```
+
+**Accepted values:**
+
+| Value | Request checksum | Response validation | Use case |
+|---|---|---|---|
+| `when_required` | `WHEN_REQUIRED` | `WHEN_REQUIRED` | S3-compatible stores (OCI, MinIO, etc.) |
+| `when_supported` | `WHEN_SUPPORTED` | `WHEN_SUPPORTED` | Explicitly enable SDK ≥ 2.30 default |
+| `auto` | *(SDK default)* | *(SDK default)* | Native AWS S3 without explicit override |
+
+**Default logic:** `when_required` if `CLOUD_STORAGE_ENDPOINT` is non-empty (S3-compatible), `auto` otherwise (native AWS). This matches the existing hardcoded behavior in `S3Utils` and makes it configurable.
+
+**Usage override string:** `when_required if a custom endpoint is set; auto otherwise`
+
+### Interface (`ICloudProperties`)
+
+Add one method:
+```java
+String getS3ChecksumBehavior();
+```
+
+### Config accessor (`CloudProperties`)
+
+```java
+public String getS3ChecksumBehavior() {
+    return accessor.getString(Option.CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR);
+}
+```
+
+### Config carrier (`S3ClientConfig`)
+
+- Add `String checksumBehavior` field
+- Add to the private all-args constructor
+- Add getter `getChecksumBehavior()`
+- `of(ICloudProperties)` factory reads `cloudProperties.getS3ChecksumBehavior()`
+- `of(Map<String, String>, int)` overload (external writer) does **not** set this field; the external-data path reads it directly from `appCtx.getCloudProperties()` at build time
+
+### Client build — blob storage (`S3CloudClient.buildClient`)
+
+After `builder.forcePathStyle(...)`, add:
+```java
+applyChecksumBehavior(builder, config.getChecksumBehavior());
+```
+
+Add private static helper:
+```java
+private static void applyChecksumBehavior(S3ClientBuilder builder, String behavior) {
+    switch (behavior.toLowerCase()) {
+        case "when_required":
+            builder.requestChecksumCalculation(RequestChecksumCalculation.WHEN_REQUIRED);
+            builder.responseChecksumValidation(ResponseChecksumValidation.WHEN_REQUIRED);
+            break;
+        case "when_supported":
+            builder.requestChecksumCalculation(RequestChecksumCalculation.WHEN_SUPPORTED);
+            builder.responseChecksumValidation(ResponseChecksumValidation.WHEN_SUPPORTED);
+            break;
+        case "auto":
+        default:
+            // leave SDK defaults untouched
+            break;
+    }
+}
+```
+
+### Client build — external links (`S3Utils.buildClient`)
+
+Remove `configureS3CompatibleSettings(serviceEndpoint, builder)` and replace with:
+```java
+String checksumBehavior = appCtx.getCloudProperties().getS3ChecksumBehavior();
+applyChecksumBehavior(builder, checksumBehavior);
+```
+
+Add the same `applyChecksumBehavior` helper to `S3Utils`.
+Remove the now-dead `configureS3CompatibleSettings` method.
+
+## Files Changed
+
+| File | Change |
+|---|---|
+| `asterix-common/.../config/CloudProperties.java` | Add `CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR` option + accessor |
+| `hyracks-cloud/.../io/ICloudProperties.java` | Add `getS3ChecksumBehavior()` |
+| `asterix-cloud/.../aws/s3/S3ClientConfig.java` | Add `checksumBehavior` field + getter + wire into `of(ICloudProperties)` |
+| `asterix-cloud/.../aws/s3/S3CloudClient.java` | Call `applyChecksumBehavior` in `buildClient` |
+| `asterix-external-data/.../aws/s3/S3Utils.java` | Replace `configureS3CompatibleSettings` with `applyChecksumBehavior` |
+
+## Non-Goals
+
+- Per-dataset checksum override (each external dataset specifying its own behavior)
+- Configuring request and response checksums independently
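
The default logic described in the spec above (an explicit value wins; otherwise `when_required` for a custom endpoint and `auto` for native AWS) can be sketched in isolation. This is only an illustration under stated assumptions: `ChecksumDefaultSketch` and `resolve` are hypothetical names that do not appear in the change, and the real option is resolved through `CloudProperties`. The sketch also normalizes with `Locale.ROOT`, since a plain `toLowerCase()` uses the default locale and, under a Turkish locale, would turn the `I` in `WHEN_REQUIRED` into a dotless `ı` that matches none of the `case` labels.

```java
import java.util.Locale;

/** Hypothetical stand-in for illustration; not part of the change under review. */
final class ChecksumDefaultSketch {
    private ChecksumDefaultSketch() {
    }

    /**
     * @param configured value set by the operator, or null/blank when unset
     * @param endpoint   custom storage endpoint, or null/empty for native AWS S3
     * @return the normalized behavior string
     */
    static String resolve(String configured, String endpoint) {
        if (configured != null && !configured.trim().isEmpty()) {
            // Locale.ROOT keeps the result independent of the JVM's default locale.
            return configured.trim().toLowerCase(Locale.ROOT);
        }
        // No explicit value: fall back to the endpoint-based default from the spec.
        boolean customEndpoint = endpoint != null && !endpoint.isEmpty();
        return customEndpoint ? "when_required" : "auto";
    }

    public static void main(String[] args) {
        // The endpoint below is a placeholder, not a real service URL.
        System.out.println(resolve(null, null));
        System.out.println(resolve(null, "https://objectstorage.example.invalid"));
        System.out.println(resolve("WHEN_SUPPORTED", "https://objectstorage.example.invalid"));
    }
}
```

Run as written, `main` prints `auto`, `when_required` and `when_supported`, one per line.
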
diff --git a/hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/io/ICloudProperties.java b/hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/io/ICloudProperties.java
index 103c1b3..33a860d 100644
--- a/hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/io/ICloudProperties.java
+++ b/hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/io/ICloudProperties.java
@@ -94,5 +94,29 @@

     String getS3SecretAccessKey();
 
+    /**
+     * Valid values for {@link #getS3ChecksumBehavior()}.
+     */
+    enum S3ChecksumBehavior {
+        WHEN_REQUIRED,
+        WHEN_SUPPORTED,
+        AUTO;
+
+        /** Parses the config string (case-insensitive). Returns {@code null} if unrecognized. */
+        public static S3ChecksumBehavior fromString(String s) {
+            if (s == null) {
+                return null;
+            }
+            for (S3ChecksumBehavior b : values()) {
+                if (b.name().equalsIgnoreCase(s)) {
+                    return b;
+                }
+            }
+            return null;
+        }
+    }
+
+    S3ChecksumBehavior getS3ChecksumBehavior();
+
     String getAzureClientId();
 }
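
Because `fromString` returns `null` for an unrecognized value, callers need to decide what that means before switching on the result. The sketch below shows one option, assuming that an unrecognized value should behave like `auto`, which mirrors the `default:` branch of the string-based switch in the design doc. `ChecksumEnumSketch`, `parseOrAuto` and `sdkSetting` are hypothetical names; the enum is a local copy of the one in this hunk, and the returned strings merely stand in for the SDK's `WHEN_REQUIRED` / `WHEN_SUPPORTED` constants so that the example compiles without the AWS SDK on the classpath.

```java
import java.util.Optional;

/** Hypothetical stand-in for illustration; not part of the change under review. */
final class ChecksumEnumSketch {
    private ChecksumEnumSketch() {
    }

    /** Local copy of the enum added to ICloudProperties in this change. */
    enum S3ChecksumBehavior {
        WHEN_REQUIRED,
        WHEN_SUPPORTED,
        AUTO;

        static S3ChecksumBehavior fromString(String s) {
            if (s == null) {
                return null;
            }
            for (S3ChecksumBehavior b : values()) {
                // equalsIgnoreCase compares per character and ignores the default locale.
                if (b.name().equalsIgnoreCase(s)) {
                    return b;
                }
            }
            return null;
        }
    }

    /** Assumption: an unrecognized or missing value falls back to AUTO. */
    static S3ChecksumBehavior parseOrAuto(String raw) {
        S3ChecksumBehavior parsed = S3ChecksumBehavior.fromString(raw);
        return parsed != null ? parsed : S3ChecksumBehavior.AUTO;
    }

    /**
     * Name of the SDK constant to apply to both request calculation and response
     * validation, or empty when the SDK defaults should be left untouched.
     */
    static Optional<String> sdkSetting(S3ChecksumBehavior behavior) {
        switch (behavior) {
            case WHEN_REQUIRED:
                return Optional.of("WHEN_REQUIRED");
            case WHEN_SUPPORTED:
                return Optional.of("WHEN_SUPPORTED");
            case AUTO:
            default:
                return Optional.empty();
        }
    }
}
```

Switching on the enum rather than on a lowercased string keeps the accepted values in one place and lets the compiler flag misspelled constant names.
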

--
To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/21198?usp=email

Gerrit-MessageType: newchange
Gerrit-Project: asterixdb
Gerrit-Branch: lumina
Gerrit-Change-Id: If6618d3a336e9bf134efb1f219660421edc27c43
Gerrit-Change-Number: 21198
Gerrit-PatchSet: 1
Gerrit-Owner: Ritik Raj <[email protected]>
