The GitHub Actions job "Benchmarks PR Comment" on texera.git/main has failed. Run started by GitHub user wwong0 (triggered by wwong0).
Head commit for run:
da99a359458bb285908752779b4c2dc3f11ecd89 / Xinyuan Lin <[email protected]>
feat(scheduling): reuse output storage across region re-executions (#5707)

### What changes were proposed in this PR?

Adds an opt-in mechanism for an output port to **reuse** its storage when the owning operator's region re-executes, instead of recreating the document each time. Dormant and behavior-preserving: no operator sets the flag in this PR.

- `OutputPort` gains a `reuseStorage: Boolean` proto field (alongside `blocking` / `mode`). It marks a port whose output accumulates across region re-executions, e.g. a Loop End port whose result builds up over the iterations of its own loop.
- `DocumentFactory.createOrReuseDocument(uri, schema, reuseExisting, …)` is the create-or-reuse decision: when reuse is requested and a document already exists it opens and returns that one; otherwise it creates a fresh one. It always returns the document, so the call site does not branch.
- `RegionExecutionCoordinator` reads each output port's `reuseStorage` flag while provisioning that port's result/state documents and routes through `createOrReuseDocument`.

| port flag | region re-run behavior |
|---|---|
| `false` (every operator today) | recreate output/state documents (unchanged) |
| `true` (set by Loop End in the loop PR) | keep and reopen the existing documents |

A runtime guard in `RegionExecutionCoordinator` asserts no port sets `reuseStorage` for now: the flag activates only with the loop operators, which are not yet on `main`. The guard keeps the dormant reuse path from being silently exercised before its consumer exists, and is removed when the loop operators land.

### Any related issues, documentation, discussions?

Resolves #5709 (sub-issue of #4442 "Introduce for loop"). Split out of #5700 to keep that PR reviewable, per @Xiao-zhen-Liu's [review](https://github.com/apache/texera/pull/4206#pullrequestreview-4482667715).

### How was this PR tested?

- `DocumentFactorySpec`: pins the create-or-reuse decision (the reuse × exists matrix plus the "no-reuse never probes existence" short-circuit) with injected document stubs, no iceberg backend.
- `OutputPortReuseFlagSpec`: guards that no registered operator enables `reuseStorage` on any output port.
- `WorkflowCore` / `WorkflowOperator` / `WorkflowExecutionService` compile; scalafmt + scalafix clean.

### Was this PR authored or co-authored using generative AI tooling?

Co-authored with Claude Opus 4.8 in compliance with ASF.

Report URL: https://github.com/apache/texera/actions/runs/27667778105

With regards,
GitHub Actions via GitBox
