manjum-a11y opened a new issue, #14942:
URL: https://github.com/apache/iceberg/issues/14942

   ### Apache Iceberg version
   
   1.7.2
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   When reading a large Iceberg table from S3 using S3FileIO with S3 Access 
Grants enabled, Spark jobs intermittently fail with a NullPointerException 
inside the AWS SDK v2 AttributeMap$Builder.resolveValue, called from 
S3AccessGrantsIdentityProvider.resolveIdentity.
   
   This only appears under high concurrency / large datasets (e.g., 
spark.read.table(...).count() over many files). Smaller tables or lower 
parallelism may run successfully, but increasing parallelism makes the failure 
reproducible.
   
   The error message from the AWS SDK is:
   
   Encountered a null value when resolving configuration attributes. This is 
commonly caused by concurrent modifications to non-thread-safe types. Ensure 
you're synchronizing access to all non-thread-safe types.
   
   From the Iceberg side we are using S3FileIO with S3 Access Grants configured 
according to the docs, and the S3 client is built via S3Client.builder() with 
S3FileIOProperties.applyS3AccessGrantsConfigurations(...) (or equivalent).
   
   We have already tried the following combinations, and the NPE persists in 
all of them:
   
   **Iceberg versions**
   1.7.2 and upgraded to 1.10.0 → NPE persists in both.
   **AWS SDK v2 versions**
   Tried 2.24.6, 2.30.31, and 2.32.1 → NPE persists in all three.
   **S3 Access Grants plugin versions**
   Tried 2.0.2 and 2.3.0 → NPE persists in both.
   **Spark / JDK combinations**
   Spark 3.5.6 with JDK17 and Spark 4.0.1 (JDK21 inside image) → same NPE in 
both.
   **Parallelism tuning**
   Reduced spark.sql.shuffle.partitions / spark.default.parallelism → this 
changes how often the NPE occurs but does not reliably eliminate it on large 
tables.
   
   Could you please help me understand the issue? Specifically:
   **1. Known issue?**
   Are you aware of any known concurrency problems between Iceberg’s S3FileIO 
S3 Access Grants integration and AWS SDK v2 / aws-s3-accessgrants-java-plugin 
that could cause AttributeMap$Builder.resolveValue to throw an NPE under high 
Spark parallelism?
   **2. Recommended version matrix?**
   Is there a recommended or validated combination of the following for running 
S3 Access Grants with S3FileIO in a high‑concurrency Spark environment?
   - Iceberg version
   - AWS SDK v2 version
   - aws-s3-accessgrants-java-plugin version
   **3. Client factory / configuration guidance?**
   From Iceberg’s side, is there any specific guidance on how the S3 client 
factory should be implemented (or additional S3FileIO / S3AG configuration) to 
avoid shared, non‑thread‑safe state that might trigger this NPE?
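   
   As a point of reference for question 3, the sketch below shows one generic 
way to make sure a shared object is built exactly once and then published 
safely to all threads. It is plain Java with no Iceberg or AWS SDK dependency: 
`LazyShared` is a hypothetical helper name, and the `Supplier` stands in for 
whatever code builds the S3 client (for example, a builder configured with S3 
Access Grants). Whether this kind of guard is relevant to the NPE above is an 
assumption, not a confirmed fix.

```java
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of Iceberg or the AWS SDK): builds a value
 * once on first use and shares that single instance across threads.
 */
final class LazyShared<T> implements Supplier<T> {
    // Stand-in for the code that constructs the shared client.
    private final Supplier<T> factory;

    // volatile so a fully constructed value is visible to every thread.
    private volatile T value;

    LazyShared(Supplier<T> factory) {
        this.factory = factory;
    }

    @Override
    public T get() {
        T local = value;
        if (local == null) {
            // Only one thread at a time may run the factory.
            synchronized (this) {
                local = value;
                if (local == null) {
                    // Note: a factory that returns null would be invoked again
                    // on the next call, because null means "not built yet".
                    local = factory.get();
                    value = local;
                }
            }
        }
        return local;
    }
}
```

   In a client factory, the supplier passed to `LazyShared` would wrap the 
client construction, so that concurrent Spark tasks in one executor JVM share a 
single fully built instance instead of each running the builder code.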
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [ ] I would be willing to contribute a fix for this bug with guidance from 
the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

