>From Michael Blow <[email protected]>: Michael Blow has submitted this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/21312?usp=email )
Change subject: [NO ISSUE][CLOUD] S3 Checksum Behavior Updates ...................................................................... [NO ISSUE][CLOUD] S3 Checksum Behavior Updates - default S3 checksum behavior to 'sdk_default' Ext-ref: MB-72035 Change-Id: I6802f9769e553825886e255b1812758d321d00f7 Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/21312 Reviewed-by: Michael Blow <[email protected]> Tested-by: Jenkins <[email protected]> Integration-Tests: Jenkins <[email protected]> Reviewed-by: Hussain Towaileb <[email protected]> --- M asterixdb/asterix-app/src/test/java/org/apache/asterix/test/common/TestExecutor.java M asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1/cluster_state_1.1.regexadm M asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_full/cluster_state_1_full.1.regexadm M asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_less/cluster_state_1_less.1.regexadm M asterixdb/asterix-app/src/test/resources/runtimets/testsuite_external_dataset_s3.xml M asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3ClientConfig.java M asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/CloudProperties.java M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/aws/s3/S3Utils.java M hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/io/S3ChecksumBehavior.java 9 files changed, 24 insertions(+), 35 deletions(-) Approvals: Hussain Towaileb: Looks good to me, approved Jenkins: Verified; Verified Michael Blow: Looks good to me, but someone else must approve diff --git a/asterixdb/asterix-app/src/test/java/org/apache/asterix/test/common/TestExecutor.java b/asterixdb/asterix-app/src/test/java/org/apache/asterix/test/common/TestExecutor.java index 54c4246..923ee7b 100644 --- a/asterixdb/asterix-app/src/test/java/org/apache/asterix/test/common/TestExecutor.java +++ b/asterixdb/asterix-app/src/test/java/org/apache/asterix/test/common/TestExecutor.java @@ -608,8 +608,8 @@ StringWriter expected = new StringWriter(); IOUtils.copy(actualFile, actual); IOUtils.copy(expectedFile, expected); - Pattern pattern = Pattern.compile(expected.toString(), Pattern.DOTALL | Pattern.MULTILINE); - if (!pattern.matcher(actual.toString()).matches()) { + Pattern pattern = Pattern.compile(expected.toString().trim(), Pattern.DOTALL | Pattern.MULTILINE); + if (!pattern.matcher(actual.toString().trim()).matches()) { // figure out where the problem first occurs... StringBuilder builder = new StringBuilder(); String[] lines = expected.toString().split("\\n"); diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1/cluster_state_1.1.regexadm b/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1/cluster_state_1.1.regexadm index cffa1fa..c35a1df 100644 --- a/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1/cluster_state_1.1.regexadm +++ b/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1/cluster_state_1.1.regexadm @@ -40,7 +40,7 @@ "cloud.storage.prefix" : "", "cloud.storage.region" : "", "cloud.storage.s3.access.key.id" : null, - "cloud.storage.s3.checksum.behavior" : "auto", + "cloud.storage.s3.checksum.behavior" : "sdk_default", "cloud.storage.s3.client.read.timeout" : -1, "cloud.storage.s3.parallel.downloader.client.type" : "crt", "cloud.storage.s3.secret.access.key" : null, diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_full/cluster_state_1_full.1.regexadm b/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_full/cluster_state_1_full.1.regexadm index 6f93933..95a20a9 100644 --- a/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_full/cluster_state_1_full.1.regexadm +++ b/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_full/cluster_state_1_full.1.regexadm @@ -40,7 +40,7 @@ "cloud.storage.prefix" : "", "cloud.storage.region" : "", "cloud.storage.s3.access.key.id" : null, - "cloud.storage.s3.checksum.behavior" : "auto", + "cloud.storage.s3.checksum.behavior" : "sdk_default", "cloud.storage.s3.client.read.timeout" : -1, "cloud.storage.s3.parallel.downloader.client.type" : "crt", "cloud.storage.s3.secret.access.key" : null, diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_less/cluster_state_1_less.1.regexadm b/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_less/cluster_state_1_less.1.regexadm index b01ff20..6e23922 100644 --- a/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_less/cluster_state_1_less.1.regexadm +++ b/asterixdb/asterix-app/src/test/resources/runtimets/results/api/cluster_state_1_less/cluster_state_1_less.1.regexadm @@ -40,7 +40,7 @@ "cloud.storage.prefix" : "", "cloud.storage.region" : "", "cloud.storage.s3.access.key.id" : null, - "cloud.storage.s3.checksum.behavior" : "auto", + "cloud.storage.s3.checksum.behavior" : "sdk_default", "cloud.storage.s3.client.read.timeout" : -1, "cloud.storage.s3.parallel.downloader.client.type" : "crt", "cloud.storage.s3.secret.access.key" : null, diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/testsuite_external_dataset_s3.xml b/asterixdb/asterix-app/src/test/resources/runtimets/testsuite_external_dataset_s3.xml index 2e4987a..f8ebbbe 100644 --- a/asterixdb/asterix-app/src/test/resources/runtimets/testsuite_external_dataset_s3.xml +++ b/asterixdb/asterix-app/src/test/resources/runtimets/testsuite_external_dataset_s3.xml @@ -1297,7 +1297,7 @@ <test-case FilePath="external-dataset/s3"> <compilation-unit name="checksum-behavior/invalid-value"> <output-dir compare="Text">checksum-behavior/invalid-value</output-dir> - <expected-error>ASX1173: Invalid value for parameter 'checksumBehavior', allowed value(s): when_required, when_supported, auto</expected-error> + <expected-error>ASX1173: Invalid value for parameter 'checksumBehavior', allowed value(s): when_required, when_supported, sdk_default</expected-error> </compilation-unit> </test-case> <test-case FilePath="external-dataset/s3"> diff --git a/asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3ClientConfig.java b/asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3ClientConfig.java index 26c349b..2e9caea 100644 --- a/asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3ClientConfig.java +++ b/asterixdb/asterix-cloud/src/main/java/org/apache/asterix/cloud/clients/aws/s3/S3ClientConfig.java @@ -66,7 +66,7 @@ S3ParallelDownloaderClientType parallelDownloaderClientType, boolean roundRobinDnsResolver) { this(region, endpoint, prefix, anonymousAuth, certificates, profilerLogInterval, writeBufferSize, 1, 0, 0, 0, false, false, 0, 0, 0, 0, -1, parallelDownloaderClientType, roundRobinDnsResolver, "", "", - S3ChecksumBehavior.defaultForEndpoint(endpoint)); + S3ChecksumBehavior.SDK_DEFAULT); } private S3ClientConfig(String region, String endpoint, String prefix, boolean anonymousAuth, diff --git a/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/CloudProperties.java b/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/CloudProperties.java index 2de95f6..1fb9901 100644 --- a/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/CloudProperties.java +++ b/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/CloudProperties.java @@ -102,10 +102,7 @@ CLOUD_STORAGE_S3_ACCESS_KEY_ID(STRING, (String) null), CLOUD_STORAGE_S3_SECRET_ACCESS_KEY(STRING, (String) null), CLOUD_STORAGE_AZURE_CLIENT_ID(STRING, (String) null), - CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR(STRING, (Function<IApplicationConfig, String>) app -> { - String endpoint = app.getString(CLOUD_STORAGE_ENDPOINT); - return S3ChecksumBehavior.defaultForEndpoint(endpoint).name().toLowerCase(); - }); + CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR(STRING, S3ChecksumBehavior.SDK_DEFAULT.stringValue()); private final IOptionType interpreter; private final Object defaultValue; @@ -250,9 +247,7 @@ return "The checksum behavior for S3 requests and responses. Accepted values: " + "'when_required' (only checksums mandated by the operation), " + "'when_supported' (checksums on all eligible operations, SDK >= 2.30 default), " - + "'auto' (no explicit override, defer to SDK default). " - + "Defaults to 'when_required' when a custom endpoint is configured " - + "(S3-compatible stores), 'auto' for native AWS S3."; + + "'sdk_default' (defer to SDK default. " + "Defaults to 'sdk_default'"; default: throw new IllegalStateException("NYI: " + this); } @@ -273,9 +268,6 @@ if (this == CLOUD_STORAGE_S3_PARALLEL_DOWNLOADER_CLIENT_TYPE) { return "crt if no custom endpoint is set; async otherwise"; } - if (this == CLOUD_STORAGE_S3_CHECKSUM_BEHAVIOR) { - return "when_required if a custom endpoint is set; auto otherwise"; - } return IOption.super.usageDefaultOverride(accessor, optionPrinter); } diff --git a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/aws/s3/S3Utils.java b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/aws/s3/S3Utils.java index 81bd942..db91300 100644 --- a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/aws/s3/S3Utils.java +++ b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/aws/s3/S3Utils.java @@ -202,7 +202,7 @@ } else if (certificates != null && !certificates.isBlank()) { builder.httpClient(createHttpClient(certificates)); } - applyChecksumBehavior(builder, resolveChecksumBehavior(configuration, serviceEndpoint)); + applyChecksumBehavior(builder, resolveChecksumBehavior(configuration)); awsClients.setConsumingClient(builder.build()); return awsClients; } @@ -218,7 +218,7 @@ builder.requestChecksumCalculation(RequestChecksumCalculation.WHEN_SUPPORTED); builder.responseChecksumValidation(ResponseChecksumValidation.WHEN_SUPPORTED); break; - case AUTO: + case SDK_DEFAULT: // leave SDK defaults untouched break; } @@ -788,14 +788,14 @@ * Resolves the effective checksum behavior for external data (link) operations using a two-level priority chain: * <ol> * <li>Per-link {@code checksumBehavior} parameter in the configuration map (explicit override)</li> - * <li>Endpoint-based default: {@link S3ChecksumBehavior#AUTO} for native AWS S3, + * <li>Endpoint-based default: {@link S3ChecksumBehavior#SDK_DEFAULT} for native AWS S3, * {@link S3ChecksumBehavior#WHEN_REQUIRED} for S3-compatible storage</li> * </ol> * Cloud properties are intentionally not consulted here — this method is for external data (link) operations * only, not for blob storage. */ @AiProvenance(agent = CLAUDE_SONNET_4_6, tool = GITHUB_COPILOT) - public static S3ChecksumBehavior resolveChecksumBehavior(Map<String, String> configuration, String serviceEndpoint) + public static S3ChecksumBehavior resolveChecksumBehavior(Map<String, String> configuration) throws CompilationException { String perLink = configuration.get(CHECKSUM_BEHAVIOR_FIELD_NAME); if (perLink != null) { @@ -806,7 +806,8 @@ CHECKSUM_BEHAVIOR_ALLOWED_VALUES); } } - return S3ChecksumBehavior.defaultForEndpoint(serviceEndpoint); + // TODO(mblow): I don't believe this code is reachable any longer (we will always have a checksum behavior configured) + return S3ChecksumBehavior.SDK_DEFAULT; } /** diff --git a/hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/io/S3ChecksumBehavior.java b/hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/io/S3ChecksumBehavior.java index 0ab7e1b..63969aa 100644 --- a/hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/io/S3ChecksumBehavior.java +++ b/hyracks-fullstack/hyracks/hyracks-cloud/src/main/java/org/apache/hyracks/cloud/io/S3ChecksumBehavior.java @@ -36,28 +36,24 @@ /** Calculate/validate checksums whenever supported — the SDK default since 2.30.0. */ WHEN_SUPPORTED, /** Leave the SDK defaults untouched. Appropriate for native AWS S3. */ - AUTO; + SDK_DEFAULT; - /** Parses the config string (case-insensitive). Returns {@code null} if the input is {@code null}. */ + public String stringValue() { + return name().toLowerCase(); + } + + /** Parses the config string (case-insensitive). Returns {@code SDK_DEFAULT} if the input is {@code null}. */ public static S3ChecksumBehavior fromString(String s) { if (s == null) { - return null; + return SDK_DEFAULT; } for (S3ChecksumBehavior b : values()) { if (b.name().equalsIgnoreCase(s)) { return b; } } - throw new IllegalArgumentException( - "Unrecognized S3 checksum behavior: '" + s + "'. Valid values: when_required, when_supported, auto"); + throw new IllegalArgumentException("Unrecognized S3 checksum behavior: '" + s + + "'. Valid values: when_required, when_supported, sdk_default"); } - /** - * Returns the appropriate default based on whether a custom S3-compatible endpoint is configured. - * When an endpoint is present the SDK's newer checksum defaults may not be supported, so {@link #WHEN_REQUIRED} - * is used. When no custom endpoint is configured (native AWS S3) {@link #AUTO} defers to SDK defaults. - */ - public static S3ChecksumBehavior defaultForEndpoint(String endpoint) { - return (endpoint == null || endpoint.isEmpty()) ? AUTO : WHEN_REQUIRED; - } } -- To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/21312?usp=email To unsubscribe, or for help writing mail filters, visit https://asterix-gerrit.ics.uci.edu/settings?usp=email Gerrit-MessageType: merged Gerrit-Project: asterixdb Gerrit-Branch: lumina Gerrit-Change-Id: I6802f9769e553825886e255b1812758d321d00f7 Gerrit-Change-Number: 21312 Gerrit-PatchSet: 9 Gerrit-Owner: Michael Blow <[email protected]> Gerrit-Reviewer: Ali Alsuliman <[email protected]> Gerrit-Reviewer: Hussain Towaileb <[email protected]> Gerrit-Reviewer: Ian Maxon <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Michael Blow <[email protected]> Gerrit-Reviewer: Ritik Raj <[email protected]> Gerrit-CC: Anon. E. Moose #1000171
