aglinxinyuan opened a new pull request, #5449:
URL: https://github.com/apache/texera/pull/5449

   Closes #5448.
   
   ## Summary
   
   Pin behavior of two previously-uncovered modules in 
`engine/architecture/controller/execution`:
   
   | Spec | Source class | Tests |
   | --- | --- | --- |
   | `OperatorExecutionSpec` | `OperatorExecution` | 14 (1 pending) |
   | `RegionExecutionSpec` | `RegionExecution` | 14 |
   
   Both spec files follow the `<srcClassName>Spec.scala` one-to-one convention.
   
   ## Behavior pinned
   
   ### `OperatorExecution`
   
   | Surface | Contract |
   | --- | --- |
   | `initWorkerExecution(workerId)` | registers a fresh `WorkerExecution` 
under `workerId` and returns it |
   | `getWorkerExecution(workerId)` | returns the previously-initialized 
`WorkerExecution` |
   | `getWorkerIds` | empty for a fresh operator; otherwise the set of every 
initialized worker id |
   | `getState` (no workers) | `UNINITIALIZED` (empty-iterable fallthrough in 
`ExecutionUtils.aggregateStates`) |
   | `getState` (all COMPLETED) | `COMPLETED` |
   | `getState` (any RUNNING) | `RUNNING` |
   | `getState` (all UNINITIALIZED) | `UNINITIALIZED` |
   | `getStats` | per-port input / output metrics summed across workers (count 
+ size per `portId`); time fields are per-worker sums; distinct ports stay 
separate |
   | `getStats` (no workers) | empty input / output metrics; zero time totals |
   | `isInputPortCompleted` / `isOutputPortCompleted` | `true` only when every 
worker reports the requested port as completed; input and output ports with the 
same `portId` are tracked independently |
   
   ### `RegionExecution`
   
   | Surface | Contract |
   | --- | --- |
   | `initOperatorExecution(opId)` | registers a fresh `OperatorExecution` and 
returns it |
   | `initOperatorExecution(opId, Some(inherited))` | **deep-clones** the 
inherited `OperatorExecution`; mutations on the original do not leak into the 
clone |
   | second `initOperatorExecution(opId)` for the same id | throws 
`AssertionError` |
   | `getOperatorExecution` / `hasOperatorExecution` / 
`getAllOperatorExecutions` | retrieval semantics |
   | `initLinkExecution(link)` | registers a fresh `LinkExecution`; second call 
for the same link throws `AssertionError`; distinct links and their inner 
channel-executions stay independent |
   | `getStats` | one `OperatorMetrics` per registered `OperatorExecution`, 
keyed by `opId`; empty when no operator is registered |
   | `getState` / `isCompleted` | for a region with no ports, vacuously 
`COMPLETED` |
   
   ## Notes
   
   While writing `OperatorExecutionSpec` I discovered a 
**docstring-vs-implementation mismatch** in 
`OperatorExecution.initWorkerExecution`:
   
   ```scala
   def initWorkerExecution(workerId: ActorVirtualIdentity): WorkerExecution = {
     assert(
       !workerExecutions.contains(workerId),  // <-- checks VALUES, not KEYS
       s"WorkerExecution already exists for workerId: $workerId"
     )
     workerExecutions.put(workerId, WorkerExecution())
     ...
   }
   ```
   
   `workerExecutions` is a `java.util.concurrent.ConcurrentHashMap` and 
`.contains(Object)` is the legacy `Hashtable` method that checks **values**, 
not **keys**. So the assertion never fires and the second call silently 
overwrites the prior `WorkerExecution` — contradicting the docstring (`"throws 
AssertionError"`).
   
   This PR pins the **current** behavior with a characterization test and the 
**intended** behavior with `pendingUntilFixed`, so the day the implementation 
is corrected (`containsKey`) CI surfaces it via the 
`pendingUntilFixed`-pass-now-fails signal and the characterization test starts 
failing. The fix itself is out of scope for a test-coverage PR.
   
   ## Test plan
   
   - [x] `WorkflowExecutionService/testOnly 
org.apache.texera.amber.engine.architecture.controller.execution.OperatorExecutionSpec
 
org.apache.texera.amber.engine.architecture.controller.execution.RegionExecutionSpec`
 — 28 succeed, 1 pending (`pendingUntilFixed` for the duplicate-id assertion)
   - [x] `scalafmtCheckAll` — clean
   - [ ] CI to confirm


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to