Kontinuation commented on issue #8601:
URL: https://github.com/apache/iceberg/issues/8601#issuecomment-1732561108
I think I have spotted the problem. It only reproduces when authenticating
with a web identity token file. The
`WebIdentityTokenFileCredentialsProvider` uses an STS client internally to
refresh the credentials, and that STS client was closed by the finalizer of
`S3FileIO`. Here is the stack trace captured when `shutdown()` was called on
the pooling connection manager used by the STS client:
```
java.base/java.lang.Thread.getStackTrace(Unknown Source)
software.amazon.awssdk.thirdparty.org.apache.http.impl.conn.PoolingHttpClientConnectionManager.printStackTrace(PoolingHttpClientConnectionManager.java:553)
software.amazon.awssdk.thirdparty.org.apache.http.impl.conn.PoolingHttpClientConnectionManager.shutdown(PoolingHttpClientConnectionManager.java:422)
software.amazon.awssdk.http.apache.ApacheHttpClient.close(ApacheHttpClient.java:247)
software.amazon.awssdk.utils.IoUtils.closeQuietly(IoUtils.java:70)
software.amazon.awssdk.utils.IoUtils.closeIfCloseable(IoUtils.java:87)
software.amazon.awssdk.utils.AttributeMap.lambda$close$0(AttributeMap.java:87)
java.base/java.util.HashMap$Values.forEach(Unknown Source)
software.amazon.awssdk.utils.AttributeMap.close(AttributeMap.java:87)
software.amazon.awssdk.core.client.config.SdkClientConfiguration.close(SdkClientConfiguration.java:79)
software.amazon.awssdk.core.internal.http.HttpClientDependencies.close(HttpClientDependencies.java:80)
software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient.close(AmazonSyncHttpClient.java:73)
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.close(BaseSyncClientHandler.java:88)
software.amazon.awssdk.services.sts.DefaultStsClient.close(DefaultStsClient.java:1344)
software.amazon.awssdk.utils.IoUtils.closeQuietly(IoUtils.java:70)
software.amazon.awssdk.services.sts.internal.StsWebIdentityCredentialsProviderFactory$StsWebIdentityCredentialsProvider.close(StsWebIdentityCredentialsProviderFactory.java:99)
software.amazon.awssdk.utils.IoUtils.closeQuietly(IoUtils.java:70)
software.amazon.awssdk.utils.IoUtils.closeIfCloseable(IoUtils.java:87)
software.amazon.awssdk.auth.credentials.WebIdentityTokenFileCredentialsProvider.close(WebIdentityTokenFileCredentialsProvider.java:132)
software.amazon.awssdk.utils.IoUtils.closeQuietly(IoUtils.java:70)
software.amazon.awssdk.utils.IoUtils.closeIfCloseable(IoUtils.java:87)
software.amazon.awssdk.auth.credentials.AwsCredentialsProviderChain.lambda$close$2(AwsCredentialsProviderChain.java:122)
java.base/java.util.ArrayList.forEach(Unknown Source)
java.base/java.util.Collections$UnmodifiableCollection.forEach(Unknown Source)
software.amazon.awssdk.auth.credentials.AwsCredentialsProviderChain.close(AwsCredentialsProviderChain.java:122)
software.amazon.awssdk.utils.IoUtils.closeQuietly(IoUtils.java:70)
software.amazon.awssdk.utils.IoUtils.closeIfCloseable(IoUtils.java:87)
software.amazon.awssdk.utils.Lazy.close(Lazy.java:77)
software.amazon.awssdk.utils.IoUtils.closeQuietly(IoUtils.java:70)
software.amazon.awssdk.utils.IoUtils.closeIfCloseable(IoUtils.java:87)
software.amazon.awssdk.auth.credentials.internal.LazyAwsCredentialsProvider.close(LazyAwsCredentialsProvider.java:50)
software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider.close(DefaultCredentialsProvider.java:131)
software.amazon.awssdk.utils.IoUtils.closeQuietly(IoUtils.java:70)
software.amazon.awssdk.utils.IoUtils.closeIfCloseable(IoUtils.java:87)
software.amazon.awssdk.utils.AttributeMap.lambda$close$0(AttributeMap.java:87)
java.base/java.util.HashMap$Values.forEach(Unknown Source)
software.amazon.awssdk.utils.AttributeMap.close(AttributeMap.java:87)
software.amazon.awssdk.core.client.config.SdkClientConfiguration.close(SdkClientConfiguration.java:79)
software.amazon.awssdk.core.internal.http.HttpClientDependencies.close(HttpClientDependencies.java:80)
software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient.close(AmazonSyncHttpClient.java:73)
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.close(BaseSyncClientHandler.java:88)
software.amazon.awssdk.services.s3.DefaultS3Client.close(DefaultS3Client.java:11088)
org.apache.iceberg.aws.s3.S3FileIO.close(S3FileIO.java:405)
org.apache.iceberg.aws.s3.S3FileIO.finalize(S3FileIO.java:415)
java.base/java.lang.System$2.invokeFinalize(Unknown Source)
java.base/java.lang.ref.Finalizer.runFinalizer(Unknown Source)
java.base/java.lang.ref.Finalizer$FinalizerThread.run(Unknown Source)
```
`S3FileIO.close` closes the Apache HTTP client used by both the S3 client and
the STS client of the web identity token file credential provider, which
looks fine. The problem is that **all S3FileIO objects actually share the same
credential provider**. If any one of the S3FileIO objects is finalized, the
shared credential provider is broken for all the others.
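The failure mode can be illustrated without the AWS SDK. The following is a minimal sketch in plain Java, not the real code: `FakeCredentialsProvider` and `FakeFileIO` are hypothetical stand-ins for the shared credential provider and `S3FileIO`, and the explicit `close()` call stands in for the finalizer.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a credential provider that owns a closeable
// resource (in the real case, an STS client and its connection pool).
final class FakeCredentialsProvider implements AutoCloseable {
    private static final FakeCredentialsProvider SHARED = new FakeCredentialsProvider();
    private final AtomicBoolean closed = new AtomicBoolean(false);

    // Singleton-style factory: every caller receives the same object.
    static FakeCredentialsProvider create() {
        return SHARED;
    }

    String resolveCredentials() {
        if (closed.get()) {
            throw new IllegalStateException("provider already closed");
        }
        return "token";
    }

    @Override
    public void close() {
        closed.set(true);
    }
}

// Hypothetical stand-in for S3FileIO: closing it also closes its provider.
final class FakeFileIO implements AutoCloseable {
    private final FakeCredentialsProvider provider = FakeCredentialsProvider.create();

    String read() {
        return provider.resolveCredentials();
    }

    @Override
    public void close() {
        provider.close();
    }
}

final class SharedProviderDemo {
    public static void main(String[] args) {
        FakeFileIO first = new FakeFileIO();
        FakeFileIO second = new FakeFileIO();
        first.close(); // stands in for the finalizer running on one instance
        try {
            second.read();
            System.out.println("second still works");
        } catch (IllegalStateException e) {
            System.out.println("second is broken: " + e.getMessage());
        }
    }
}
```

Because both `FakeFileIO` objects hold the same provider, closing one invalidates the other even though the second object was never closed itself.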
[AwsProperties.java line 1801](https://github.com/apache/iceberg/blob/apache-iceberg-1.3.1/aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java#L1801C31-L1801C31)
creates the credential provider used by each S3FileIO object by calling
`DefaultCredentialsProvider.create()`, which actually returns a singleton:
```java
public final class DefaultCredentialsProvider {
    private static final DefaultCredentialsProvider DEFAULT_CREDENTIALS_PROVIDER =
        new DefaultCredentialsProvider(builder());
    // ...
    public static DefaultCredentialsProvider create() {
        return DEFAULT_CREDENTIALS_PROVIDER;
    }
    // ...
}
```
A workaround is to always create a new `DefaultCredentialsProvider` instance
instead of using the singleton:
```diff
diff --git a/aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java b/aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java
index 9266c83f1..0f182a20c 100644
--- a/aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java
+++ b/aws/src/main/java/org/apache/iceberg/aws/AwsProperties.java
@@ -45,6 +45,7 @@ import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
import software.amazon.awssdk.auth.credentials.AwsCredentialsProvider;
import software.amazon.awssdk.auth.credentials.AwsSessionCredentials;
import software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider;
+import software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider.Builder;
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
import software.amazon.awssdk.awscore.client.builder.AwsClientBuilder;
import software.amazon.awssdk.awscore.client.builder.AwsSyncClientBuilder;
@@ -1798,7 +1799,8 @@ public class AwsProperties implements Serializable {
return credentialsProvider(this.clientCredentialsProvider);
}
- return DefaultCredentialsProvider.create();
+ Builder builder = DefaultCredentialsProvider.builder();
+ return builder.build();
}
private AwsCredentialsProvider credentialsProvider(String credentialsProviderClass) {
```
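The idea behind the patch, one provider per owner rather than one shared instance, can be sketched in plain Java. This is only an illustration under assumptions: `OwnedCredentialsProvider` and `OwnedFileIO` are hypothetical names, and the builder here merely imitates the `builder().build()` call used in the diff.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical provider whose builder returns a new, independently
// closeable instance on every build() call.
final class OwnedCredentialsProvider implements AutoCloseable {
    private final AtomicBoolean closed = new AtomicBoolean(false);

    private OwnedCredentialsProvider() {}

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        OwnedCredentialsProvider build() {
            return new OwnedCredentialsProvider();
        }
    }

    String resolveCredentials() {
        if (closed.get()) {
            throw new IllegalStateException("provider already closed");
        }
        return "token";
    }

    @Override
    public void close() {
        closed.set(true);
    }
}

// Hypothetical stand-in for S3FileIO: each instance owns its own provider.
final class OwnedFileIO implements AutoCloseable {
    private final OwnedCredentialsProvider provider =
        OwnedCredentialsProvider.builder().build();

    String read() {
        return provider.resolveCredentials();
    }

    @Override
    public void close() {
        provider.close();
    }
}

final class OwnedProviderDemo {
    public static void main(String[] args) {
        OwnedFileIO first = new OwnedFileIO();
        OwnedFileIO second = new OwnedFileIO();
        first.close(); // only the first instance's provider is closed
        System.out.println(second.read());
    }
}
```

The trade-off in this sketch is that every owner pays for its own provider and whatever resources it holds, in exchange for `close()` affecting only the object that was closed.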
I'm still testing this patch to see whether it actually resolves the issue.
I'll let you know if the problem goes away.