wombatu-kun opened a new issue, #19037:
URL: https://github.com/apache/hudi/issues/19037

   ## Task Description
   
   **What needs to be done:**
   
   PR #19004 de-flaked the three Trino-plugin file-operation tests 
(`TestHudiNoCacheFileOperations`, `TestHudiMemoryCacheFileOperations`, 
`TestHudiAlluxioCacheFileOperations`) by dropping all `METADATA_TABLE` 
operations from `getFileOperations` (and all `Alluxio.*` operations in the 
Alluxio class) before asserting the per-query multiset of filesystem-access 
spans. That removed the per-query flakiness but also removed the assertions' 
ability to detect metadata-table read amplification: a future change that, for 
example, doubles the number of metadata-table reads per query would now pass 
silently because no test counts those reads anymore.
   
   Find a way to restore a regression signal on metadata-table read volume for 
these tests without re-introducing the span-leak flakiness that #19004 (and the 
earlier #18766 / #18995) fought.
   
   **Why this task is needed:**
   
   The metadata-table read count was the main thing these `FileOperations` 
tests pinned down: how many low-level reads each query issues against the 
metadata table. After #19004 the metadata-table dimension is no longer asserted 
at all, so read-amplification regressions on the Trino read path are now 
invisible to CI. (The Alluxio cache-hit dimension is separately re-covered by 
the count-independent `testReadsServedFromAlluxioCache` added in the same PR, 
so only the metadata-table dimension is uncovered.)
   
   ## Background: why the obvious fixes do not work
   
   Trino resets the OpenTelemetry span exporter at the start of each 
`executeWithPlan`, so any span emitted by a Hudi background thread (the shared 
split-loader / split-manager / `ForkJoinPool.commonPool` pools that read the 
metadata table) after the synchronous query returns lands in the *next* 
measurement window. The result is a symmetric off-by-N: one query is 
over-counted and the paired query is under-counted by almost the same amount.
   
   - An **exact-count** assertion on metadata-table spans flakes - this is the 
original failure.
   - A **tolerance / lower-bound** assertion on metadata-table spans still 
flakes, because the leak is bidirectional: a query can be counted short (its 
own spans leaked out) as well as long, and a lower bound is violated by the 
short case. This is the key difference from the Alluxio cache-hit check, where 
leaked spans only ever *add* hits (monotonic), so a lower bound there is safe.
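
   The bidirectional leak can be illustrated with a toy model. This is a 
minimal sketch, not Trino or Hudi code: `WindowedExporter`, `measure`, and 
`violatesLowerBound` are hypothetical names, and the numbers (10 reads per 
query, 3 late spans, tolerance 2) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: WindowedExporter is a hypothetical stand-in, not a Trino class.
final class SpanLeakModel {
    static final class WindowedExporter {
        private final List<String> spans = new ArrayList<>();

        void export(String span) {
            spans.add(span);
        }

        // Returns the spans collected so far and starts a new window.
        List<String> closeWindow() {
            List<String> closed = new ArrayList<>(spans);
            spans.clear();
            return closed;
        }
    }

    // Two queries each issue readsPerQuery metadata-table reads; lateSpans of the
    // first query are emitted by a background thread after its window has closed.
    static int[] measure(int readsPerQuery, int lateSpans) {
        WindowedExporter exporter = new WindowedExporter();
        for (int i = 0; i < readsPerQuery - lateSpans; i++) {
            exporter.export("METADATA_TABLE read (query 1, on time)");
        }
        int first = exporter.closeWindow().size();

        // The late spans of query 1 land in the window that measures query 2.
        for (int i = 0; i < lateSpans; i++) {
            exporter.export("METADATA_TABLE read (query 1, late)");
        }
        for (int i = 0; i < readsPerQuery; i++) {
            exporter.export("METADATA_TABLE read (query 2)");
        }
        int second = exporter.closeWindow().size();
        return new int[] {first, second};
    }

    // True when the observed count falls below expected minus tolerance.
    static boolean violatesLowerBound(int observed, int expected, int tolerance) {
        return observed < expected - tolerance;
    }

    public static void main(String[] args) {
        int[] counts = measure(10, 3);
        System.out.println("query 1 counted: " + counts[0]); // 7
        System.out.println("query 2 counted: " + counts[1]); // 13
        System.out.println("lower bound (10 - 2) violated by query 1: "
                + violatesLowerBound(counts[0], 10, 2)); // true
    }
}
```

   In this model query 1 is counted short (7) and query 2 long (13), so an 
exact count fails on both and a lower bound fails on query 1, even though 
neither query issued more or fewer reads than expected.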
   
   ## Candidate directions (to validate, not decided)
   
   These are hypotheses for the follow-up, not a committed design:
   
   1. **Aggregate / conservation assertion.** The leak shifts spans between 
adjacent windows but does not create or destroy them, so the *total* 
metadata-table read count across the paired measurements (or across the whole 
test class) should be conserved even though the per-query split is not. 
Asserting that aggregate would still catch a 2x amplification (which doubles 
the total) while tolerating the attribution jitter. Needs validation that 
nothing leaks past the chosen aggregation boundary (for example the last 
query's late spans).
   2. **Deterministic drain / quiesce** of the background metadata-table reader 
pools before the measurement window closes, so the metadata-table spans are 
captured inside the synchronous query window and exact counts become 
deterministic again. The obstacle is that those pools are shared / global with 
no clean await hook exposed to the Trino test harness.
   3. **Span-stability poll (recorded as rejected).** The #18766 approach did 
not bound the race and is not a path to revisit.
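
   Direction 1 could look roughly like the sketch below. It is an 
illustration under assumptions, not a proposed implementation: 
`AggregateReadCheck`, `total`, and `assertTotalReads` are hypothetical names, 
the per-window counts stand in for whatever the tests would extract from the 
`METADATA_TABLE` spans, and the expected total of 20 is invented.

```java
import java.util.List;

// Hypothetical helper: asserts the summed metadata-table read count across
// several measurement windows instead of the per-window counts.
final class AggregateReadCheck {
    static long total(List<Integer> perWindowCounts) {
        return perWindowCounts.stream().mapToLong(Integer::longValue).sum();
    }

    static void assertTotalReads(List<Integer> perWindowCounts, long expectedTotal) {
        long actual = total(perWindowCounts);
        if (actual != expectedTotal) {
            throw new AssertionError("expected " + expectedTotal
                    + " metadata-table reads in total but saw " + actual
                    + " (per-window: " + perWindowCounts + ")");
        }
    }

    public static void main(String[] args) {
        // Clean attribution and leak-shifted attribution both pass.
        assertTotalReads(List.of(10, 10), 20);
        assertTotalReads(List.of(7, 13), 20);

        // A 2x amplification doubles the total and is caught,
        // however the spans are split between the windows.
        try {
            assertTotalReads(List.of(17, 23), 20);
            throw new IllegalStateException("amplification was not detected");
        } catch (AssertionError expected) {
            System.out.println("detected: " + expected.getMessage());
        }
    }
}
```

   The sketch assumes every span lands in one of the summed windows. Spans 
that arrive after the last window closes would break the conservation 
argument, which is the open validation point noted in direction 1.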
   
   ## Task Type
   
   Test enhancement
   
   ## Related Issues
   
   - Originating PR: #19004 (prior attempts: #18766, #18995)
   - Reviewer call-out: 
https://github.com/apache/hudi/pull/19004#pullrequestreview-4522612971
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
