fivetran-rahulprakash opened a new pull request, #3113:
URL: https://github.com/apache/polaris/pull/3113
### Problem
The `getAccessToken()` method in `AzureCredentialsStorageIntegration` made
a blocking call with no timeout, so it could hang indefinitely if Azure's
token endpoint was slow or unresponsive. This could lead to:
- Thread pool exhaustion in high-concurrency scenarios
- Cascading failures when Azure AD experiences degraded performance
- Poor user experience with no visibility into token fetch failures
### Solution
This PR adds defensive timeout and retry mechanisms using Project Reactor's
built-in capabilities:
- **15-second timeout** per individual token request attempt to prevent
indefinite blocking
- **Exponential backoff retry** (up to 3 retries with base delays of 2s, 4s,
and 8s) with 50% jitter to prevent a thundering herd during mass failures
- **90-second overall timeout** as a safety net for the complete operation
- **Intelligent retry filtering** for known transient Azure AD errors:
  - `AADSTS50058` - Token endpoint timeout
  - `AADSTS50078` - Service temporarily unavailable
  - `AADSTS700084` - Token refresh required
  - `503` - Service unavailable
  - `429` - Too many requests
- **Enhanced logging** for better observability (warnings on errors, info on
retries)
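
For reviewers who want to check the numbers, here is a minimal, dependency-free sketch of the policy described above. It is not the code in this PR: the class `TokenRetrySketch` and all of its members are hypothetical names, and the real change is expressed with Reactor operators such as `timeout` and `retryWhen(Retry.backoff(...))`. The sketch assumes that "3 attempts" means 3 retries after the initial request, that 50% jitter means each delay is randomized within plus or minus 50% of its base value, that errors are classified by searching the exception message for a known marker, and that a timed-out attempt is itself retried.

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical, dependency-free sketch of the retry policy; not the PR's implementation. */
final class TokenRetrySketch {
  // Error markers listed in the PR description as transient.
  static final List<String> RETRIABLE_MARKERS =
      List.of("AADSTS50058", "AADSTS50078", "AADSTS700084", "503", "429");

  static final int MAX_RETRIES = 3;                                 // assumed: retries after the first attempt
  static final Duration FIRST_BACKOFF = Duration.ofSeconds(2);      // base delay before the first retry
  static final Duration ATTEMPT_TIMEOUT = Duration.ofSeconds(15);   // per-attempt timeout
  static final Duration OVERALL_TIMEOUT = Duration.ofSeconds(90);   // safety net for the whole operation
  static final double JITTER = 0.5;                                 // assumed: +/- 50% of the base delay

  /** Assumption: an error is retriable if its message contains one of the known markers. */
  static boolean isRetriable(Throwable error) {
    String message = error.getMessage();
    if (message == null) {
      return false;
    }
    // Plain substring matching is a simplification; "503" would also match unrelated numbers.
    return RETRIABLE_MARKERS.stream().anyMatch(message::contains);
  }

  /** Base delay before retry number {@code retry} (1-based): 2s, 4s, 8s. */
  static Duration baseDelay(int retry) {
    return FIRST_BACKOFF.multipliedBy(1L << (retry - 1));
  }

  /** Base delay shifted by a random offset within +/- JITTER, so callers do not retry in lockstep. */
  static Duration jitteredDelay(int retry) {
    long baseMillis = baseDelay(retry).toMillis();
    long spread = (long) (baseMillis * JITTER);
    long offset = ThreadLocalRandom.current().nextLong(-spread, spread + 1);
    return Duration.ofMillis(baseMillis + offset);
  }

  /** Upper bound, assuming every attempt times out and every delay takes its maximum jitter. */
  static Duration worstCase() {
    Duration total = ATTEMPT_TIMEOUT.multipliedBy(MAX_RETRIES + 1);
    for (int retry = 1; retry <= MAX_RETRIES; retry++) {
      long maxDelayMillis = (long) (baseDelay(retry).toMillis() * (1 + JITTER));
      total = total.plusMillis(maxDelayMillis);
    }
    return total;
  }

  public static void main(String[] args) {
    for (int retry = 1; retry <= MAX_RETRIES; retry++) {
      System.out.println("retry " + retry + ": base " + baseDelay(retry)
          + ", jittered " + jitteredDelay(retry));
    }
    System.out.println("worst case " + worstCase() + ", overall timeout " + OVERALL_TIMEOUT);
  }
}
```

Under these assumptions the worst case is four 15-second attempts plus at most 3s + 6s + 12s of backoff, or 81 seconds in total. That sits below the 90-second limit, so the overall timeout only fires if something outside the retry loop stalls.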
### Testing
- Verified compilation with `./gradlew :polaris-core:compileJava`
- Code leverages existing reactor dependencies (no new dependencies)
- Follows existing Polaris patterns for reactive error handling
### Benefits
- Improves system resilience to transient Azure service issues
- Prevents indefinite blocking that could cascade to request timeouts
- Provides better observability with structured logging
- Uses well-established retry patterns with exponential backoff and jitter
## Checklist
- [x] Don't disclose security issues! (Not applicable - this is a
resilience improvement)
- [x] Clearly explained why the changes are needed
- [x] Manually tested via compilation; reactive behavior follows
reactor-core semantics
- [x] Added comprehensive Javadoc explaining the retry strategy
- [ ] Updated `CHANGELOG.md` (awaiting maintainer guidance on format)
- [ ] Updated documentation (no user-facing config changes)