yuqi1129 opened a new pull request, #10546:
URL: https://github.com/apache/gravitino/pull/10546

   ### What changes were proposed in this pull request?
   
   This PR eliminates the N+1 query pattern in 
`JcasbinAuthorizer.loadRolePrivilege()` and fixes a DBCP2 connection pool 
misconfiguration.
   
   **Auth N+1 fix:**
   - Added `listSecurableObjectsByRoleIds(List<Long>)` to 
`SecurableObjectMapper` and `SecurableObjectBaseSQLProvider`, which fetches 
securable objects for multiple roles in a single `WHERE role_id IN (...)` query.
   - Added `RoleMetaService.batchListSecurableObjectsForRoles(List<Long>)` that 
issues this single batch query and groups results by role ID. Extracted 
`buildSecurableObjectsFromPOs` helper reused by both the single and batch paths.
   - Refactored `JcasbinAuthorizer.loadRolePrivilege()` to collect all unloaded 
role IDs, call the batch method once, then load policies serially. Removed the 
per-role `CompletableFuture` + `entityStore.get()` pattern.
   
   **DBCP2 connection pool fix:**
   - `minEvictableIdleTimeMillis`: `1000ms → 30000ms` — prevents connections 
from being destroyed after 1 second idle, eliminating constant reconnect churn.
   - `minIdle`: `0 → 5` — keeps a pool of warm connections ready.
   - `maxIdle`: `5 → 10` — allows more idle connections to be retained.
   
   **Query count improvement (cold path):**
   
   | | Before | After |
   |---|---|---|
   | Role stubs | 1 query | 1 query |
   | Role metadata | N queries | 0 (already in stubs) |
   | Securable objects | N queries | **1 query** |
   | Name resolution | T queries | T queries |
   | **Total** | `2 + 2N + T` | `2 + 1 + T` |
   
   ### Why are the changes needed?
   
   Fix: #10545
   
   Each call to `loadRolePrivilege` issued `2N` DB queries (one per role for 
role metadata, one per role for securable objects). With `N=5` roles and `T=3` 
object types, that's 16 queries per authorization check. This is a bottleneck 
under high concurrency or when the cache is cold (e.g., after a restart or TTL 
expiry in HA deployments).
   
   The DBCP2 misconfiguration caused connection destroy-then-reconnect on 
nearly every request (1s idle eviction with `minIdle=0`), adding ~5–20ms 
latency per request.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No API changes. The connection pool defaults change slightly (`minIdle`, 
`maxIdle`, `minEvictableIdleTimeMillis`) which improves performance 
transparently.
   
   ### How was this patch tested?
   
   - `TestRoleMetaService` — all 33 tests pass (H2 and PostgreSQL backends).
   - `TestJcasbinAuthorizer` — all 7 tests pass. Updated test to mock 
`RoleMetaService.batchListSecurableObjectsForRoles` instead of per-role 
`entityStore.get()`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to