The GitHub Actions job "Benchmarks PR Comment" on texera.git/main has succeeded.
Run started by GitHub user aglinxinyuan (triggered by aglinxinyuan).

Head commit for run:
01e070fd83dd7fd9a5574964880fe67d7f8f1257 / Matthew B. <[email protected]>
perf(auth): collapse N+1 in getComputingUnitAccess into one join (#5648)

### What changes were proposed in this PR?
- Replace the two-DAO lookup in
`ComputingUnitAccess.getComputingUnitAccess` (fetch the unit by `cuid`,
then fetch all of the user's access rows by `uid` and filter them in
memory) with a single `LEFT JOIN` of `WORKFLOW_COMPUTING_UNIT` to
`COMPUTING_UNIT_USER_ACCESS` on `(cuid, uid)`, returning the owner uid
and the caller's privilege in one round-trip.
- Resolve the privilege from that single row: the owner gets `WRITE`, a
present access row yields its privilege, and a null join (no grant)
yields `NONE`.
- A missing computing unit now resolves to `NONE` instead of throwing;
the sole caller (`AccessControlResource`) already maps both `NONE` and a
thrown exception to `403 FORBIDDEN`, so the externally observable
behavior is unchanged.

### Performance comparison (previous vs current logic)

Measured with a throwaway benchmark against the same embedded Postgres
the unit tests use (`MockTexeraDB`), exercising the non-owner path (the
only path that previously issued a second query). For each grant count
the benchmark seeds a user holding that many grants, then times 2000
calls per implementation after a 200-call warmup. Microseconds per call,
lower is better:

| Grants held by the user | Previous (2 queries) | Current (1 join) |
Speedup |
| ---: | ---: | ---: | ---: |
| 1 | 4382.5 us | 2395.1 us | 1.83x |
| 10 | 3892.7 us | 1786.2 us | 2.18x |
| 100 | 3260.0 us | 1690.4 us | 1.93x |
| 1000 | 3650.8 us | 1630.0 us | 2.24x |

So the rewrite is roughly 2x faster across the board. The speedup stays
flat rather than growing with the grant count, which is itself
informative: on this loopback setup the dominant cost is the extra
database round-trip (two statements vs one), not the in-memory
transfer-and-filter of the user's grant rows. Against a real database
over a network, where per-statement latency is higher, halving the
round-trips should matter at least as much. This addresses the fair
concern that "more SQL queries do not necessarily mean slower": here the
previous logic issued fewer total joins but paid for an extra
round-trip, and measuring confirms the single join wins.

These numbers come from a local micro-benchmark, not committed to the
suite, and are absolute timings on embedded Postgres; treat the ratio as
the signal, not the raw microseconds.

### Any related issues, documentation, discussions?
Closes: #5645

### How was this PR tested?
- Added `ComputingUnitAccessSpec` covering the owner (`WRITE`),
shared-grant (granted privilege), no-grant (`NONE`), and missing-unit
(`NONE`) cases.
- `sbt Auth/compile` and `sbt Auth/test`, expect a clean compile of the
rewritten jOOQ query and a passing spec.
- Manual, via the access-control route backed by
`AccessControlResource`: request a computing unit as its owner, expect
`WRITE`; as a user holding a shared grant, expect that grant's
privilege; as a user with no grant, or for a non-existent `cuid`, expect
`403 FORBIDDEN`.
- Behavior-preservation check: confirm the only caller maps `NONE` to
`403`, so the missing-unit path (previously a thrown NPE, now `NONE`)
still yields `403`.

### Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.8 (Claude Code), in compliance with the ASF
Generative Tooling Guidance

Report URL: https://github.com/apache/texera/actions/runs/27489767501

With regards,
GitHub Actions via GitBox

Reply via email to