impothnis opened a new issue, #14039:
URL: https://github.com/apache/iceberg/issues/14039

   When creating Apache Iceberg tables on ADLSGen2 with Managed Identity 
Authentication, HadoopTableOperations sometimes fails with HTTP 429 errors 
due to throttling. The error occurs in the authentication call that acquires 
the MSI token, a request to the instance metadata endpoint at 169.254.169.254. 
Microsoft has confirmed the following limits for Managed Identity:
   
   - 20 requests per second
   - 5 concurrent requests
   
   Suspected cause: The library may be making more than 5 concurrent requests, 
triggering throttling.
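
To check the suspected cause, the peak number of simultaneous token requests could be measured directly. The following is only a minimal sketch: `ConcurrencyProbe` is a hypothetical helper, not part of Iceberg or hadoop-azure, and it assumes the token acquisition call can be wrapped in a `Callable` at the point where it is issued.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical diagnostic helper; not an Iceberg or hadoop-azure class. */
final class ConcurrencyProbe {
  private final AtomicInteger inFlight = new AtomicInteger();
  private final AtomicInteger peak = new AtomicInteger();

  /** Runs the call and records how many tracked calls overlapped with it. */
  <T> T track(Callable<T> call) throws Exception {
    int now = inFlight.incrementAndGet();
    // Keep the largest in-flight count seen so far.
    peak.accumulateAndGet(now, Math::max);
    try {
      return call.call();
    } finally {
      inFlight.decrementAndGet();
    }
  }

  /** Highest number of calls that were in flight at the same time. */
  int peak() {
    return peak.get();
  }
}
```

If `peak()` reports more than 5 under a workload that reproduces the failure, that would support the theory that the concurrent request limit is being exceeded.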
   
   Stack trace and error details:
   ```
   org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator$HttpException: HTTP Error 429; url='http://169.254.169.254/metadata/identity/oauth2/token' ... {"error":"invalid_request","error_description":"Temporarily throttled, too many requests"}
   org.apache.iceberg.exceptions.RuntimeIOException: Failed to refresh the table
       at org.apache.iceberg.hadoop.HadoopTableOperations.refresh(HadoopTableOperations.java:128)
       ...
   ```
   
   This only happens with Managed Identity Authentication, not with other auth 
methods. The issue is being tracked with Microsoft, whose investigation points 
to library behavior as a potential factor: they suspect the library is 
exceeding the concurrent request limit.
   
   Please investigate the concurrency and request rate of authentication calls 
in HadoopTableOperations when using ADLSGen2 with Managed Identity. Consider 
adding retries, throttling, or connection pooling to respect the documented 
limits.
   
   Additional context:
   - Library: Apache Iceberg
   - Environment: ADLSGen2, Managed Identity Authentication
   - Microsoft MSI limits: 20 req/sec, 5 concurrent
   - Existing support ticket with Microsoft
   
   Logs and error details are available above. 
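
As an illustration of the retry and throttling suggestion above, here is a minimal sketch that caps concurrent token requests and backs off when a request is throttled. Everything in it is an assumption for discussion: `ThrottledTokenFetcher` and `ThrottledException` are hypothetical names, not existing Iceberg or hadoop-azure APIs, and the sketch only addresses the concurrent request limit, not the 20 requests per second limit.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

/** Hypothetical helper; not an Iceberg or hadoop-azure class. */
final class ThrottledTokenFetcher {
  /** Hypothetical marker for an HTTP 429 response from the token endpoint. */
  static final class ThrottledException extends Exception {
    ThrottledException(String message) {
      super(message);
    }
  }

  private final Semaphore permits;
  private final int maxAttempts;
  private final long initialBackoffMillis;

  ThrottledTokenFetcher(int maxConcurrent, int maxAttempts, long initialBackoffMillis) {
    this.permits = new Semaphore(maxConcurrent);
    this.maxAttempts = maxAttempts;
    this.initialBackoffMillis = initialBackoffMillis;
  }

  /** Runs the request under the concurrency cap, retrying when it is throttled. */
  <T> T fetch(Callable<T> request) throws Exception {
    long backoff = initialBackoffMillis;
    for (int attempt = 1; ; attempt++) {
      permits.acquire(); // blocks while maxConcurrent requests are in flight
      try {
        return request.call();
      } catch (ThrottledException e) {
        if (attempt >= maxAttempts) {
          throw e; // give up after the configured number of attempts
        }
      } finally {
        permits.release();
      }
      // Sleep outside the permit so waiting does not occupy a request slot.
      Thread.sleep(backoff);
      backoff *= 2; // exponential backoff
    }
  }
}
```

With the limits quoted in this issue, a shared instance might be created as `new ThrottledTokenFetcher(5, 4, 250L)`; the attempt count and initial backoff here are arbitrary placeholder values, not recommendations from Microsoft.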


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

