Yicong-Huang commented on PR #4527:
URL: https://github.com/apache/texera/pull/4527#issuecomment-4321545571

   Closing this experiment. CI made it clear the change needs more work than 
fits in a single PR.
   
   ### What we learned
   
   Removing `Global / concurrentRestrictions += Tags.limit(Tags.Test, 1)` 
exposes several layered problems:
   
   1. **`OverlappingFileLockException` on `EmbeddedPostgres.prepareBinaries`** 
— every spec mixing `MockTexeraDB` was starting its own embedded Postgres and 
racing on the shared `/tmp/embedded-pg/` binary directory.
      *Mitigated* in this branch by making `MockTexeraDB` a JVM singleton.
   
   2. **Per-suite `BeforeAndAfterAll` ordering was too fragile** — the original 
`SerializedSuite` only acquired its lock if the spec's `beforeAll` actually 
called `super.beforeAll`, and 10 of 14 DB-using specs do not. *Mitigated* by 
overriding `Suite.run` instead.
   
   3. **Data leakage between specs** — even once specs were serialized, rows 
from a previous spec (`uid=1`, `wid=1`, etc.) collided with the next spec's 
inserts because afterAll cleanup is inconsistent. *Mitigated* by `TRUNCATE`-ing 
every `texera_db` table on each spec setup, both for the singleton 
EmbeddedPostgres and the CI's `texera_db_for_test_cases` service.
   
   4. **Cross-module classloader singletons** — sbt's default `Test / fork := 
false` runs all subprojects' tests in one JVM but with separate classloaders, 
so the "JVM-wide" singletons we relied on were actually per-classloader. 
Setting `Test / fork := true` recovers true per-JVM singletons, but the forked 
JVMs default their working directory to the subproject's base directory — which 
breaks every spec that reads `sql/texera_ddl.sql` (and 
`./common/workflow-operator`, and more) by relative path.
   
   ### Why I'm closing
   
   After (1)–(4), tests that were previously passing started aborting with 
`NoSuchFileException` for relative paths embedded in test code and operator 
code. Pinning `Test / baseDirectory` to the repo root via `ThisBuild` did not 
propagate as expected, and there are at least two other relative-path 
conventions in the test sources (e.g. `./common/workflow-operator`) that would 
each need their own fix.
   
   Removing the global serialization is therefore not a localized change — it 
requires:
   
   1. Auditing every test for cwd-relative paths and converting them to 
classpath resources or absolute paths derived from `LocalRootProject / 
baseDirectory`.
   2. Verifying every spec's `beforeAll`/`afterAll` cleanup story is actually 
robust (not just relying on the chain we now have).
   3. Picking a real strategy for cross-module classloader/JVM singletons 
(`Test / fork := true` plus the right `forkOptions / workingDirectory`, or 
alternatively cross-classloader synchronization).
   
   Each of these is a meaningful PR on its own. Trying to land all of them in 
this branch was the wrong shape.
   
   ### Disposition
   
   - The mitigations from this branch (singleton `MockTexeraDB`, 
`TRUNCATE`-on-setup, `SerializedSuite` via `Suite.run`) are useful in their own 
right and could be extracted into smaller follow-up PRs.
   - The actual `Tags.limit` removal stays parked under #4525 until the 
prerequisites land.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to