sachinnn99 commented on code in PR #15869:
URL: https://github.com/apache/iceberg/pull/15869#discussion_r3213324789
##########
azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSFileIO.java:
##########
@@ -134,6 +155,7 @@ DataLakeFileSystemClient client(ADLSLocation location) {
synchronized (this) {
if (clientCache == null) {
clientCache = Maps.newConcurrentMap();
+ scheduleCredentialRefresh();
Review Comment:
Good catch. The guard at line 159 (`if (refreshFuture == null ||
isCancelled || isDone)`) handles the common case, but there was a narrow race:
`refreshStorageCredentials()` called `scheduleCredentialRefresh()` outside the
synchronized block, so a `client()` call interleaving between the cache
invalidation and the schedule could spawn a duplicate timer.
Fixed in the latest push: `scheduleCredentialRefresh()` now runs inside the
synchronized block in `refreshStorageCredentials()`, so the cache clear and the
schedule happen atomically. The test
`noDuplicateRefreshAfterClientCacheRebuild` verifies that no duplicate timer is scheduled.
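
For readers following along, here is a minimal sketch of the locking pattern described above. It is not the code from this PR: `RefreshingClientCache`, `scheduleRefreshIfNeeded`, and the `timersScheduled` counter are hypothetical names, the one-hour delay is arbitrary, and the whole `client()` body is synchronized for brevity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the FileIO; illustrative only.
class RefreshingClientCache implements AutoCloseable {
  // Daemon thread so a pending timer never keeps the JVM alive.
  private final ScheduledExecutorService executor =
      Executors.newSingleThreadScheduledExecutor(
          runnable -> {
            Thread thread = new Thread(runnable, "credential-refresh");
            thread.setDaemon(true);
            return thread;
          });
  // Counts how many timers were created, so the behavior can be checked.
  private final AtomicInteger timersScheduled = new AtomicInteger();

  private Map<String, Object> clientCache; // guarded by "this"
  private ScheduledFuture<?> refreshFuture; // guarded by "this"

  Object client(String location) {
    synchronized (this) {
      if (clientCache == null) {
        clientCache = new ConcurrentHashMap<>();
        scheduleRefreshIfNeeded();
      }
      return clientCache.computeIfAbsent(location, key -> new Object());
    }
  }

  void refresh() {
    // Cache clear and reschedule share one critical section, so client()
    // can never observe a cleared cache without a pending timer.
    synchronized (this) {
      clientCache = null;
      if (refreshFuture != null) {
        // Drop the old timer so the guard below allows exactly one new one.
        refreshFuture.cancel(false);
      }
      scheduleRefreshIfNeeded();
    }
  }

  int timersScheduled() {
    return timersScheduled.get();
  }

  // Callers must hold the lock on "this".
  private void scheduleRefreshIfNeeded() {
    if (refreshFuture == null || refreshFuture.isCancelled() || refreshFuture.isDone()) {
      timersScheduled.incrementAndGet();
      refreshFuture = executor.schedule(this::refresh, 1, TimeUnit.HOURS);
    }
  }

  @Override
  public void close() {
    executor.shutdownNow();
  }
}
```

In this sketch, a `client()` call that rebuilds the cache after `refresh()` finds a pending future and skips scheduling, because both paths take the same lock.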
##########
azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSFileIO.java:
##########
@@ -259,11 +281,93 @@ public void deletePrefix(String prefix) {
}
@Override
- public void close() {
- if (vendedAdlsCredentialProvider != null) {
- vendedAdlsCredentialProvider.close();
+ public void setCredentials(List<StorageCredential> credentials) {
+ Preconditions.checkArgument(credentials != null, "Invalid storage credentials: null");
+ // stop any refresh that might be scheduled
+ if (refreshFuture != null) {
+ refreshFuture.cancel(true);
+ }
+
+ // copy credentials into a modifiable collection for Kryo serde
+ this.storageCredentials = Lists.newArrayList(credentials);
+
+ // if the clients are already initialized, we need to allow them to be recreated
+ synchronized (this) {
+ if (clientCache != null) {
+ this.clientCache = null;
+ }
+ }
+ }
+
+ @Override
+ public List<StorageCredential> credentials() {
+ return ImmutableList.copyOf(storageCredentials);
+ }
+
+ private void scheduleCredentialRefresh() {
Review Comment:
This is safe: `storageCredentials` is initialized to `Lists.newArrayList()`
(never null), and `scheduleCredentialRefresh()` returns early when the list is
null or empty. Calling `client()` before `setCredentials()` therefore just skips
scheduling and builds the client normally.
Added a test, `noExceptionWhenClientCalledWithoutCredentials`, that covers
this path.
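
A rough sketch of that early-return behavior, written as a pure function so it can be checked without an executor. It follows the shape of the quoted diff (earliest expiry minus a 5-minute prefetch window) but is not the PR's code: `RefreshDelaySketch` and `refreshDelay` are hypothetical names, credentials are modeled as plain config maps, and the prefix string is my assumption about the value of `AzureProperties.ADLS_SAS_TOKEN_EXPIRES_AT_MS_PREFIX`.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; illustrative only.
class RefreshDelaySketch {
  // Assumed value of AzureProperties.ADLS_SAS_TOKEN_EXPIRES_AT_MS_PREFIX.
  static final String EXPIRES_AT_MS_PREFIX = "adls.sas-token-expires-at-ms.";
  static final Duration PREFETCH_WINDOW = Duration.ofMinutes(5);

  // Returns the delay until the next refresh, or empty when there is nothing to schedule.
  static Optional<Duration> refreshDelay(List<Map<String, String>> credentialConfigs, Instant now) {
    if (credentialConfigs == null || credentialConfigs.isEmpty()) {
      return Optional.empty(); // no credentials yet: skip scheduling
    }
    return credentialConfigs.stream()
        .flatMap(config -> config.entrySet().stream())
        .filter(entry -> entry.getKey().startsWith(EXPIRES_AT_MS_PREFIX))
        .map(entry -> Instant.ofEpochMilli(Long.parseLong(entry.getValue())))
        .min(Comparator.naturalOrder()) // the earliest expiry drives the refresh
        .map(expiresAt -> Duration.between(now, expiresAt.minus(PREFETCH_WINDOW)));
  }
}
```

The returned delay can be negative when the earliest token expires in under five minutes; `ScheduledExecutorService.schedule` treats a negative delay as a request for immediate execution.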
##########
azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSFileIO.java:
##########
@@ -259,11 +281,93 @@ public void deletePrefix(String prefix) {
}
@Override
- public void close() {
- if (vendedAdlsCredentialProvider != null) {
- vendedAdlsCredentialProvider.close();
+ public void setCredentials(List<StorageCredential> credentials) {
+ Preconditions.checkArgument(credentials != null, "Invalid storage credentials: null");
+ // stop any refresh that might be scheduled
+ if (refreshFuture != null) {
+ refreshFuture.cancel(true);
+ }
+
+ // copy credentials into a modifiable collection for Kryo serde
+ this.storageCredentials = Lists.newArrayList(credentials);
+
+ // if the clients are already initialized, we need to allow them to be recreated
+ synchronized (this) {
+ if (clientCache != null) {
+ this.clientCache = null;
+ }
+ }
+ }
+
+ @Override
+ public List<StorageCredential> credentials() {
+ return ImmutableList.copyOf(storageCredentials);
+ }
+
+ private void scheduleCredentialRefresh() {
+ storageCredentials.stream()
+ .flatMap(cred -> cred.config().entrySet().stream())
Review Comment:
The `startsWith(ROOT_PREFIX)` on line 343 filters credential prefixes (URIs
like `abfss://container@account...`), not config property keys. `ROOT_PREFIX =
"abfs"` matches both `abfs://` and `abfss://` URI schemes, which are the only
two ADLS schemes.
The config key lookup uses exact `Map.get()` with a fully constructed key
(`ADLS_SAS_TOKEN_EXPIRES_AT_MS_PREFIX + storageAccount`), so there's no fuzzy
matching on config properties.
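
To make the distinction concrete, here is a small sketch of the two checks. `CredentialLookupSketch`, `isAdlsPrefix`, and `expiresAtMs` are hypothetical names, and the expiry-key prefix string is an assumed value for `ADLS_SAS_TOKEN_EXPIRES_AT_MS_PREFIX`; only `ROOT_PREFIX = "abfs"` comes from the comment above.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; illustrative only.
class CredentialLookupSketch {
  static final String ROOT_PREFIX = "abfs";
  // Assumed value of AzureProperties.ADLS_SAS_TOKEN_EXPIRES_AT_MS_PREFIX.
  static final String EXPIRES_AT_MS_PREFIX = "adls.sas-token-expires-at-ms.";

  // Prefix match on the credential's URI prefix: accepts abfs:// and abfss://.
  static boolean isAdlsPrefix(String credentialPrefix) {
    return credentialPrefix.startsWith(ROOT_PREFIX);
  }

  // Exact-key lookup on the config map: no partial matching on property names.
  static Optional<Long> expiresAtMs(Map<String, String> config, String storageAccount) {
    return Optional.ofNullable(config.get(EXPIRES_AT_MS_PREFIX + storageAccount))
        .map(Long::parseLong);
  }
}
```

With this shape, a config entry for one storage account is never picked up when looking up a different account, since the key has to match in full.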
##########
azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSFileIO.java:
##########
@@ -259,11 +281,93 @@ public void deletePrefix(String prefix) {
}
@Override
- public void close() {
- if (vendedAdlsCredentialProvider != null) {
- vendedAdlsCredentialProvider.close();
+ public void setCredentials(List<StorageCredential> credentials) {
+ Preconditions.checkArgument(credentials != null, "Invalid storage credentials: null");
+ // stop any refresh that might be scheduled
+ if (refreshFuture != null) {
+ refreshFuture.cancel(true);
+ }
+
+ // copy credentials into a modifiable collection for Kryo serde
+ this.storageCredentials = Lists.newArrayList(credentials);
+
+ // if the clients are already initialized, we need to allow them to be recreated
+ synchronized (this) {
+ if (clientCache != null) {
+ this.clientCache = null;
+ }
+ }
+ }
+
+ @Override
+ public List<StorageCredential> credentials() {
+ return ImmutableList.copyOf(storageCredentials);
+ }
+
+ private void scheduleCredentialRefresh() {
+ storageCredentials.stream()
+ .flatMap(cred -> cred.config().entrySet().stream())
+ .filter(e -> e.getKey().startsWith(AzureProperties.ADLS_SAS_TOKEN_EXPIRES_AT_MS_PREFIX))
+ .map(e -> Instant.ofEpochMilli(Long.parseLong(e.getValue())))
+ .min(Comparator.naturalOrder())
+ .ifPresent(
+ expiresAt -> {
+ Instant prefetchAt = expiresAt.minus(5, ChronoUnit.MINUTES);
+ long delay = Duration.between(Instant.now(), prefetchAt).toMillis();
+ this.refreshFuture =
+ executorService()
+ .schedule(this::refreshStorageCredentials, delay, TimeUnit.MILLISECONDS);
+ });
+ }
+
+ private void refreshStorageCredentials() {
+ if (isResourceClosed.get() || vendedAdlsCredentialProvider == null) {
+ return;
+ }
+
+ try {
+ List<StorageCredential> refreshed =
+ vendedAdlsCredentialProvider.fetchCredentials().credentials().stream()
+ .filter(c -> c.prefix().startsWith(ROOT_PREFIX))
Review Comment:
Thanks!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]