aglinxinyuan commented on code in PR #4206:
URL: https://github.com/apache/texera/pull/4206#discussion_r3308566712
##########
common/workflow-core/src/main/scala/org/apache/texera/amber/core/workflow/PhysicalOp.scala:
##########
@@ -198,6 +198,7 @@ case class PhysicalOp(
// schema propagation function
propagateSchema: SchemaPropagationFunc = SchemaPropagationFunc(schemas =>
schemas),
isOneToManyOp: Boolean = false,
+ isLoopEnd: Boolean = false,
Review Comment:
Good point — fixed in 540b7ba274, with the flag renamed in bbec98282e.
The branch had since moved to detecting loop ends by string-matching the
operator id (`startsWith("LoopEnd-operator-")`), which keys the behavior to the
operator type even more tightly than `isLoopEnd` did. Replaced both with a
behavior-named flag:
* `PhysicalOp.reusesOutputStorageOnReExecution: Boolean = false` —
documented as "this operator's output storage should be reused (reopened)
rather than recreated fresh when its region is executed more than once",
explicitly noting any operator can set it, not just Loop End.
* `LoopEndOpDesc` sets it via `.withReusesOutputStorageOnReExecution(true)`.
* `RegionExecutionCoordinator` now checks
`region.getOperators.exists(_.reusesOutputStorageOnReExecution)` instead of the
id prefix.
The name states exactly what the scheduler does: reopen existing output
storage on a region re-run instead of recreating it. The scheduler therefore
reasons about the property rather than the operator type, and a future
operator needing the same treatment just sets the flag.
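
For readers skimming the thread, the shape of the change can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the PR: `PhysicalOp` and `Region` here are stripped-down stand-ins (the real classes carry many more fields and a typed operator identity rather than a `String`), and `StorageDecision.shouldReuseOutputStorage` is a hypothetical helper name for the check that `RegionExecutionCoordinator` performs.

```scala
// Simplified stand-in for PhysicalOp; only the fields needed for the sketch.
final case class PhysicalOp(
    id: String, // assumption: the real class uses a typed identity, not a String
    // When true, this operator's output storage should be reused (reopened)
    // rather than recreated fresh when its region is executed more than once.
    // Any operator can set it, not just Loop End.
    reusesOutputStorageOnReExecution: Boolean = false
) {
  // Builder-style setter, mirroring the method name mentioned in the comment.
  def withReusesOutputStorageOnReExecution(flag: Boolean): PhysicalOp =
    copy(reusesOutputStorageOnReExecution = flag)
}

// Simplified stand-in for a region: just a set of operators.
final case class Region(ops: Set[PhysicalOp]) {
  def getOperators: Set[PhysicalOp] = ops
}

object StorageDecision {
  // Hypothetical helper: the region reuses output storage if any of its
  // operators opts in. No operator-id string matching is involved.
  def shouldReuseOutputStorage(region: Region): Boolean =
    region.getOperators.exists(_.reusesOutputStorageOnReExecution)
}
```

With this shape, an operator whose id does not look like a loop end can still opt in, and an operator whose id does look like one gets no special treatment unless it sets the flag.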