aglinxinyuan commented on code in PR #5720:
URL: https://github.com/apache/texera/pull/5720#discussion_r3450017761
##########
common/workflow-core/src/main/scala/org/apache/texera/amber/core/workflow/PhysicalOp.scala:
##########
@@ -198,6 +198,11 @@ case class PhysicalOp(
// schema propagation function
propagateSchema: SchemaPropagationFunc = SchemaPropagationFunc(schemas =>
schemas),
isOneToManyOp: Boolean = false,
+ // Whether this operator can only run correctly under a fully-materialized
+ // schedule (e.g. a loop operator, whose back-edge is a cross-region
+ // materialized state channel). The schedule generator forces materialized
+ // region boundaries when any operator sets this. Default false.
+ requiresMaterializedExecution: Boolean = false,
Review Comment:
Reworded: the doc now says the scheduler runs the WHOLE workflow fully
materialized (every link, nothing pipelined) when any op sets the flag — not
just that op's region boundaries. To your question: whole-plan materialization
is what loops need today (the back-edge is a cross-region materialized channel
that requires region-based re-execution), and I noted that narrowing it to only
the requiring op's regions is a possible future optimization.
_🤖 Addressed by [Claude Code](https://claude.com/claude-code)_
##########
amber/src/main/scala/org/apache/texera/amber/engine/architecture/scheduling/CostBasedScheduleGenerator.scala:
##########
@@ -304,7 +321,14 @@ class CostBasedScheduleGenerator(
*/
private def createRegionDAG(): DirectedAcyclicGraph[Region, RegionLink] = {
val searchResultFuture: Future[SearchResult] = Future {
Review Comment:
Dropped — the named `effectiveExecutionMode` helper is self-explanatory, and
its scaladoc plus the field doc already explain the behavior.
_🤖 Addressed by [Claude Code](https://claude.com/claude-code)_
##########
amber/src/test/scala/org/apache/texera/amber/engine/architecture/scheduling/CostBasedScheduleGeneratorSpec.scala:
##########
@@ -515,4 +516,40 @@ class CostBasedScheduleGeneratorSpec extends AnyFlatSpec
with MockFactory {
)
}
+ "CostBasedScheduleGenerator.effectiveExecutionMode" should
Review Comment:
Deferring the 'createRegionDAG actually calls the helper' coverage to the
loop integration tests, as you suggested — `LoopIntegrationSpec` (in the loop
PR #5700) drives a real loop plan end-to-end through `createRegionDAG`, while
the helper's two branches are unit-tested here.
_🤖 Addressed by [Claude Code](https://claude.com/claude-code)_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]