aglinxinyuan opened a new issue, #5448:
URL: https://github.com/apache/texera/issues/5448

   ## Background
   
   Two modules in `engine/architecture/controller/execution` currently lack a 
dedicated unit-spec, despite sitting on the controller's execution-tracking hot 
path:
   
   | Source class | Purpose |
   | --- | --- |
   | `OperatorExecution` | Tracks per-worker execution state for an operator; aggregates worker states + statistics into operator-level metrics |
   | `RegionExecution` | Tracks per-region execution: collection of `OperatorExecution`s + `LinkExecution`s; aggregates operator stats and computes the region-level state from port-completion |
   
   Sibling classes in the same package (`LinkExecution`, `WorkerPortExecution`, 
`WorkflowExecution`, `ExecutionUtils`) already have specs; these two are the 
gap.
   
   ## Behavior to pin
   
   ### `OperatorExecution`
   
   | Surface | Contract |
   | --- | --- |
   | `initWorkerExecution(workerId)` | creates a fresh `WorkerExecution`, registers it under `workerId`, returns it |
   | second `initWorkerExecution` for the same id | throws `AssertionError` |
   | `getWorkerExecution(workerId)` | returns the previously-initialized `WorkerExecution` |
   | `getWorkerIds` | returns the set of all initialized worker ids (empty on a fresh operator) |
   | `getState` (no workers) | aggregates over empty → `UNINITIALIZED` per `ExecutionUtils.aggregateStates` |
   | `getState` (all completed) | returns `COMPLETED` |
   | `getState` (any running) | returns `RUNNING` |
   | `getStats` | aggregates input/output port metrics across workers; per-port metrics sum count + size; aggregate `dataProcessingTime` / `controlProcessingTime` / `idleTime` are the per-worker sums |
   | `isInputPortCompleted` / `isOutputPortCompleted` | `true` only when every worker reports the requested port as completed |
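
   As a rough illustration of the registration and state rows above, here is a self-contained toy model. It is only a sketch: `ToyState`, `ToyWorkerExecution`, `ToyOperatorExecution`, the `String` worker ids, the `Int` port ids, and the `Unknown` fallback are all hypothetical stand-ins, not the real classes or the real `ExecutionUtils.aggregateStates` logic. It covers only the three `getState` rows listed in the table and assumes the duplicate-id check is a plain `Predef.assert`, which is what would produce an `AssertionError`.

```scala
import scala.collection.mutable

// Toy stand-ins for illustration only; none of these names are Texera's real API.
sealed trait ToyState
object ToyState {
  case object Uninitialized extends ToyState
  case object Running extends ToyState
  case object Completed extends ToyState
  case object Unknown extends ToyState // hypothetical fallback for mixes the table does not cover
}

final class ToyWorkerExecution {
  var state: ToyState = ToyState.Uninitialized
  var completedInputPorts: Set[Int] = Set.empty
}

final class ToyOperatorExecution {
  private val workers = mutable.LinkedHashMap.empty[String, ToyWorkerExecution]

  def initWorkerExecution(workerId: String): ToyWorkerExecution = {
    // Predef.assert throws java.lang.AssertionError when the condition is false.
    assert(!workers.contains(workerId), s"worker $workerId is already initialized")
    val worker = new ToyWorkerExecution
    workers(workerId) = worker
    worker
  }

  def getWorkerExecution(workerId: String): ToyWorkerExecution = workers(workerId)

  def getWorkerIds: Set[String] = workers.keySet.toSet

  def getState: ToyState = {
    val states = workers.values.map(_.state).toSeq
    // The emptiness check must come first: forall on an empty collection is true.
    if (states.isEmpty) ToyState.Uninitialized
    else if (states.forall(_ == ToyState.Completed)) ToyState.Completed
    else if (states.contains(ToyState.Running)) ToyState.Running
    else ToyState.Unknown
  }

  // True only when every registered worker reports the port as completed.
  def isInputPortCompleted(portId: Int): Boolean =
    workers.values.forall(_.completedInputPorts.contains(portId))
}

object ToyOperatorExecutionDemo {
  // True when a second init for the same id raises AssertionError.
  def duplicateInitThrows(): Boolean = {
    val op = new ToyOperatorExecution
    op.initWorkerExecution("w1")
    try { op.initWorkerExecution("w1"); false }
    catch { case _: AssertionError => true }
  }

  def stateOfFreshOperator(): ToyState = (new ToyOperatorExecution).getState

  def stateWithOneRunningWorker(): ToyState = {
    val op = new ToyOperatorExecution
    op.initWorkerExecution("w1").state = ToyState.Completed
    op.initWorkerExecution("w2").state = ToyState.Running
    op.getState
  }
}
```

   In an actual ScalaTest spec the duplicate-id row would more likely be pinned with `assertThrows[AssertionError] { ... }` or `intercept[AssertionError] { ... }` than with a hand-written `try`/`catch`. Note that ScalaTest's own `assert` fails with a `TestFailedException`, so the `AssertionError` in the contract has to come from the production code's assertion, not from the test.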
   
   ### `RegionExecution`
   
   | Surface | Contract |
   | --- | --- |
   | `initOperatorExecution(opId)` | creates and registers a fresh `OperatorExecution`, returns it |
   | `initOperatorExecution(opId, Some(inherited))` | deep-clones the inherited `OperatorExecution` via `com.rits.cloning.Cloner` |
   | second `initOperatorExecution` for the same id | throws `AssertionError` |
   | `getOperatorExecution(opId)` / `hasOperatorExecution(opId)` | retrieval semantics; `hasOperatorExecution` returns `false` for an unknown id |
   | `getAllOperatorExecutions` | returns every registered `(opId, OperatorExecution)` pair |
   | `initLinkExecution(link)` | creates a fresh `LinkExecution`, registers it; second call for the same link throws `AssertionError` |
   | `getAllLinkExecutions` | returns every registered `(PhysicalLink, LinkExecution)` pair |
   | `getStats` | returns one `OperatorMetrics` per registered `OperatorExecution` |
   | `getState` / `isCompleted` | for a region with no ports, `getState == COMPLETED` (vacuous `forall`) and `isCompleted == true` |
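
   The "vacuous `forall`" in the last row is standard Scala collection behavior: `forall` over an empty collection is `true`, while `exists` is `false`. The snippet below shows just that property; `portCompletionFlags` is a hypothetical stand-in for whatever collection of port-completion flags `RegionExecution` actually iterates over.

```scala
object VacuousForallDemo {
  // Hypothetical stand-in for "completion flags of every port in the region".
  val portCompletionFlags: Seq[Boolean] = Seq.empty

  // No element can violate the predicate, so forall holds on an empty collection.
  val allPortsCompleted: Boolean = portCompletionFlags.forall(identity)

  // No element can satisfy the predicate either, so exists does not hold.
  val anyPortCompleted: Boolean = portCompletionFlags.exists(identity)
}
```

   This is why a port-less region is expected to report `COMPLETED`, and why the spec should pin that case explicitly: the result follows from the collection being empty, not from any work having completed.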
   
   ## Scope
   
   - New spec files (one per source class per the spec-filename convention):
     - `OperatorExecutionSpec.scala`
     - `RegionExecutionSpec.scala`
   - No production-code changes.