Yicong-Huang opened a new issue, #4525:
URL: https://github.com/apache/texera/issues/4525

   ### Task Summary
   
   Every Scala module in the build (`amber`, `common/auth`, `common/config`, 
`common/dao`, `common/pybuilder`, `common/workflow-core`, 
`common/workflow-operator`, `access-control-service`, `file-service`, 
`config-service`, `computing-unit-managing-service`, 
`workflow-compiling-service`) carries the same line:
   
   ```sbt
   Global / concurrentRestrictions += Tags.limit(Tags.Test, 1)
   ```
   
   This limits sbt to one running test task at a time, so **all tests across 
all modules run sequentially in a single JVM**. With CI's `Run backend tests` 
step at ~4m30s, this is the largest single contributor to CI time that we have 
not yet touched.
   
   ### Where it came from
   
   `git log -S "concurrentRestrictions in Global"` traces this back to commit 
`6a79a655ca` (2020-12-28), as part of PR #938 (*Extending the runtime error 
raising framework to all actors*). The PR mainly reworked exception handling 
across `Controller`, `Principal`, and `WorkerBase`. The `Tags.limit` line was 
added in that same PR; its only comment is "ensuring no parallel 
execution of multiple tasks", with no specific reason given.
   
   The most likely explanation is that PR #938's actor-error refactor 
introduced some shared mutable state (or singleton) that couldn't tolerate 
parallel test execution, and a global `Tags.limit` was applied as a quick 
workaround. Every module added since then has copy-pasted the line.
   
   It is now five years and many refactors later. The original race may no 
longer exist; if it does, it is worth identifying the specific shared state 
instead of carrying a workaround that costs us several minutes of CI time per 
build.
   
   ### Proposed experiment
   
   1. Pick one low-coupling module (e.g. `common/workflow-core`).
   2. Remove its `Tags.limit` line and add `Test / fork := true` and 
`Test / parallelExecution := true`.
   3. Run that module's tests locally and on CI several times to check for 
flakiness.
   4. If stable, repeat for the next module. If flaky, identify the specific 
shared state and either fix it or scope the restriction to that one module.
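   
   As a minimal sketch of step 2, the module's `build.sbt` might end up with settings like the following. `Test / testForkedParallel` is not part of the proposal above; it is included as an assumption, because sbt runs tests inside a forked JVM sequentially unless it is set. Note also that the restriction is scoped to `Global`, so it may stay in effect for the whole build as long as any other module still declares it.
   
   ```sbt
   // Sketch for one module (e.g. common/workflow-core), after deleting its
   // `Global / concurrentRestrictions += Tags.limit(Tags.Test, 1)` line.

   // Run this module's tests in a forked JVM instead of inside the sbt JVM.
   Test / fork := true

   // Let test classes be scheduled in parallel (already the sbt default).
   Test / parallelExecution := true

   // Assumption, not in the proposal: opt in to parallel execution inside the forked JVM.
   Test / testForkedParallel := true
   ```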
   
   Goal: collapse the global restriction into per-module restrictions where 
actually needed, or remove it entirely.
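   
   If a module does turn out to need serial tests, one possible per-module form of the restriction is sketched below. It assumes that turning off parallel test execution in that module alone is enough to avoid the race, which the experiment would need to confirm.
   
   ```sbt
   // Hypothetical build.sbt for a module whose tests proved flaky in parallel.
   // Run this module's test classes one at a time; other modules keep their own settings.
   Test / parallelExecution := false
   ```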
   
   ### Priority
   
   P2 – Medium
   
   ### Task Type
   
   - [x] Refactor / Cleanup
   - [x] Testing / QA
   - [x] DevOps / Deployment

