[
https://issues.apache.org/jira/browse/NIFI-15583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18058546#comment-18058546
]
ASF subversion and git services commented on NIFI-15583:
--------------------------------------------------------
Commit 65ed41e6ee2bd9b08609102014edb5bf80c5613e in nifi's branch
refs/heads/main from Pierre Villard
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=65ed41e6ee ]
NIFI-15583 Fixed S3 Processors use global instead of regional endpoint for
us-east-1 (#10887)
Signed-off-by: David Handermann <[email protected]>
> S3 Processors use global endpoint instead of regional endpoint for us-east-1
> ----------------------------------------------------------------------------
>
> Key: NIFI-15583
> URL: https://issues.apache.org/jira/browse/NIFI-15583
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Pierre Villard
> Assignee: Pierre Villard
> Priority: Major
> Time Spent: 40m
> Remaining Estimate: 0h
>
> h2. Problem
> S3 processors fail to connect to S3 buckets in the us-east-1 region in
> environments that require the regional endpoint
> ({{s3.us-east-1.amazonaws.com}}) instead of the global endpoint
> ({{s3.amazonaws.com}}). This is the case in environments with outbound
> PrivateLink, where network rules are configured to allow traffic to
> {{.s3.us-east-1.amazonaws.com}} but not to the global endpoint.
> The error observed is:
> {noformat}
> <bucket-name>.s3.amazonaws.com: Name or service not known
> {noformat}
> h2. Root Cause
> This is caused by a bug in AWS SDK for Java v2 (confirmed in 2.41.26) where
> {{DefaultsMode.STANDARD}} does not correctly configure regional S3 endpoints
> for us-east-1.
> NiFi correctly sets {{DefaultsMode.STANDARD}} on the {{S3Client}} builder in
> {{AbstractS3Processor.createClientBuilder()}}. Per the [AWS
> documentation|https://docs.aws.amazon.com/sdkref/latest/guide/setting-global-aws_defaults_mode.html],
> {{STANDARD}} mode should configure the SDK to use the regional S3 endpoint
> for us-east-1 ({{s3.us-east-1.amazonaws.com}}) instead of the legacy
> global endpoint ({{s3.amazonaws.com}}).
> However, the SDK has a timing bug in its internal initialization sequence:
> During client construction,
> {{DefaultS3BaseClientBuilder.finalizeServiceConfiguration()}} creates a
> {{UseGlobalEndpointResolver}} and stores its result in the client
> configuration as {{AwsClientOption.USE_GLOBAL_ENDPOINT}}.
> {{UseGlobalEndpointResolver}} checks, in order: (a) the environment variable
> / system property {{AWS_S3_US_EAST_1_REGIONAL_ENDPOINT}}, (b) the AWS
> profile configuration, and (c) the {{DEFAULT_S3_US_EAST_1_REGIONAL_ENDPOINT}}
> value from the defaults mode configuration.
> The {{DEFAULT_S3_US_EAST_1_REGIONAL_ENDPOINT}} value (which
> {{DefaultsMode.STANDARD}} correctly maps to {{"regional"}}) is only
> populated later during
> {{AwsDefaultClientBuilder.finalizeAwsConfiguration()}}, which runs after
> {{finalizeServiceConfiguration()}}.
> As a result, {{UseGlobalEndpointResolver}} always reads {{null}} for the
> defaults mode value and falls back to using the global endpoint.
> At request time, {{S3ResolveEndpointInterceptor}} reads the (incorrectly
> resolved) {{USE_GLOBAL_ENDPOINT}} attribute and passes it to the
> {{S3EndpointProvider}} via {{S3EndpointParams.useGlobalEndpoint(true)}},
> which causes the SDK to generate URLs with the global endpoint.
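The ordering problem described above can be illustrated with a small self-contained model. This is only a sketch: the class, the method names, and the map-based configuration are hypothetical stand-ins for the SDK internals named in the description, not the SDK's actual code. It assumes that any setting other than "regional", including a missing one, resolves to the global endpoint.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Simplified model of the initialization order described above.
// All names here are hypothetical stand-ins, not AWS SDK classes.
class EndpointOrderingSketch {
    static final String DEFAULTS_MODE_VALUE = "DEFAULT_S3_US_EAST_1_REGIONAL_ENDPOINT";
    static final String USE_GLOBAL_ENDPOINT = "USE_GLOBAL_ENDPOINT";

    // Precedence: (a) env var / system property, (b) profile, (c) defaults mode value.
    // Assumption: anything other than "regional", including a missing value, means global.
    static boolean resolveUseGlobal(String envOrProperty, String profile, Map<String, Object> config) {
        String setting = Optional.ofNullable(envOrProperty)
                .or(() -> Optional.ofNullable(profile))
                .or(() -> Optional.ofNullable((String) config.get(DEFAULTS_MODE_VALUE)))
                .orElse(null);
        return !"regional".equalsIgnoreCase(setting);
    }

    // Stand-in for finalizeServiceConfiguration(): resolves and stores the flag.
    static void finalizeServiceConfiguration(Map<String, Object> config) {
        config.put(USE_GLOBAL_ENDPOINT, resolveUseGlobal(null, null, config));
    }

    // Stand-in for finalizeAwsConfiguration(): applies the STANDARD defaults mode value.
    static void finalizeAwsConfiguration(Map<String, Object> config) {
        config.put(DEFAULTS_MODE_VALUE, "regional");
    }

    // Returns the stored USE_GLOBAL_ENDPOINT flag for the chosen call order.
    static boolean buildClient(boolean observedOrder) {
        Map<String, Object> config = new HashMap<>();
        if (observedOrder) {
            finalizeServiceConfiguration(config); // reads the defaults value before it exists
            finalizeAwsConfiguration(config);
        } else {
            finalizeAwsConfiguration(config);
            finalizeServiceConfiguration(config);
        }
        return (Boolean) config.get(USE_GLOBAL_ENDPOINT);
    }

    public static void main(String[] args) {
        System.out.println("observed order: useGlobalEndpoint=" + buildClient(true));
        System.out.println("expected order: useGlobalEndpoint=" + buildClient(false));
    }
}
```

In this model the flag is computed once and never revisited, so populating the defaults mode value afterwards has no effect on the stored result.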
> h2. Fix
> The fix wraps the default {{S3EndpointProvider}} with a provider that
> overrides {{S3EndpointParams.useGlobalEndpoint()}} to {{false}} before
> delegating to the default provider. This ensures regional endpoints are
> always used, which is consistent with the behavior that
> {{DefaultsMode.STANDARD}} is supposed to provide.
> This approach:
> * Uses entirely public SDK API ({{S3EndpointProvider}},
> {{S3EndpointParams.toBuilder()}})
> * Does not modify global JVM state (no {{System.setProperty}})
> * Is scoped to NiFi's S3 clients only
> * Works for both the regular {{S3Client}} and {{S3EncryptionClient}} code
> paths
> * Is safe for all regions — {{useGlobalEndpoint}} is only relevant for
> us-east-1; for other regions, the value has no effect
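The wrapping approach in the Fix section can be sketched as a plain decorator. The types below are hypothetical stand-ins so that the example runs without the AWS SDK on the classpath: {{Params}} plays the role of {{S3EndpointParams}} and {{EndpointProvider}} the role of {{S3EndpointProvider}}. The URL logic in the default provider is a deliberate simplification, and this is not the code from the commit.

```java
// Decorator sketch: force useGlobalEndpoint to false before delegating.
// All types are hypothetical stand-ins, not AWS SDK classes.
class RegionalEndpointSketch {

    // Stand-in for S3EndpointParams: immutable, with a copy-and-modify method
    // playing the role of toBuilder().useGlobalEndpoint(...).build().
    static final class Params {
        final String region;
        final boolean useGlobalEndpoint;

        Params(String region, boolean useGlobalEndpoint) {
            this.region = region;
            this.useGlobalEndpoint = useGlobalEndpoint;
        }

        Params withUseGlobalEndpoint(boolean value) {
            return new Params(region, value);
        }
    }

    // Stand-in for S3EndpointProvider.
    interface EndpointProvider {
        String resolve(Params params);
    }

    // Simplified default behavior: the global flag only matters for us-east-1.
    static final EndpointProvider DEFAULT_PROVIDER = params ->
            (params.useGlobalEndpoint && "us-east-1".equals(params.region))
                    ? "https://s3.amazonaws.com"
                    : "https://s3." + params.region + ".amazonaws.com";

    // The wrapper: override the flag, then delegate to the wrapped provider.
    static EndpointProvider regionalOnly(EndpointProvider delegate) {
        return params -> delegate.resolve(params.withUseGlobalEndpoint(false));
    }

    public static void main(String[] args) {
        Params misresolved = new Params("us-east-1", true);
        System.out.println(DEFAULT_PROVIDER.resolve(misresolved));
        System.out.println(regionalOnly(DEFAULT_PROVIDER).resolve(misresolved));
    }
}
```

Because the override happens per call inside the wrapper, nothing outside the wrapped provider changes, which matches the "scoped to NiFi's S3 clients only" point in the list above.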