aglinxinyuan opened a new issue, #5734:
URL: https://github.com/apache/texera/issues/5734

   ### Task Summary
   
   Add dedicated unit-specs for three `LogicalOp` descriptors in the SET 
operator family (`union`, `distinct`, `difference`). Pin the descriptor → 
`PhysicalOp` translation (operator class name, input/output port shape, 
partitioning requirements) so that a refactor that changes any one of them is 
caught immediately.
   
   ## Background
   
   Three concrete `LogicalOp` descriptors in 
`common/workflow-operator/operator/` currently lack a dedicated unit-spec. Each 
describes a set-style operator (union / distinct / set-difference) and wires its 
physical-op class name, port shape, and partition requirements through 
`getPhysicalOp`:
   
   | Source class | Package | What's wired |
   | --- | --- | --- |
   | `UnionOpDesc` | `operator.union` | 
`OpExecWithClassName("…operator.union.UnionOpExec")`; one input port, one 
output port; no partition requirement |
   | `DistinctOpDesc` | `operator.distinct` | 
`OpExecWithClassName("…operator.distinct.DistinctOpExec")`; `HashPartition` 
required on input and returned by `derivePartition`; blocking output |
   | `DifferenceOpDesc` | `operator.difference` | 
`OpExecWithClassName("…operator.difference.DifferenceOpExec")`; two input ports 
(`left`, `right`) with `HashPartition`; blocking output; schema propagation 
requires both inputs to share one schema |
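
The wiring in the table above could be pinned along the following lines. This is a minimal, self-contained sketch under stated assumptions, not Texera's real API: `PhysicalOpSketch`, `Partition`, and `matches` are hypothetical stand-ins, `<prefix>` stands for the package prefix that the table elides, the port name `in` is a placeholder, and a single output port for `Difference` is assumed.

```scala
// Hypothetical stand-ins for illustration only; these are NOT Texera's real types.
object SetOpWiringSketch {
  sealed trait Partition
  case object HashPartition extends Partition

  final case class PhysicalOpSketch(
      execClassName: String,
      inputPortNames: List[String],        // one entry per input port
      outputPortCount: Int,
      requiredPartition: Option[Partition] // None = no partition requirement
  )

  // "<prefix>" is a placeholder for the elided package prefix.
  val union: PhysicalOpSketch =
    PhysicalOpSketch("<prefix>.operator.union.UnionOpExec", List("in"), 1, None)

  // Assumption: Difference exposes exactly one output port.
  val difference: PhysicalOpSketch =
    PhysicalOpSketch(
      "<prefix>.operator.difference.DifferenceOpExec",
      List("left", "right"),
      1,
      Some(HashPartition)
    )

  // True only when every pinned property matches; any drift flips it to false.
  def matches(
      op: PhysicalOpSketch,
      classSuffix: String,
      inputs: List[String],
      outputs: Int,
      partition: Option[Partition]
  ): Boolean =
    op.execClassName.endsWith(classSuffix) &&
      op.inputPortNames == inputs &&
      op.outputPortCount == outputs &&
      op.requiredPartition == partition

  def main(args: Array[String]): Unit = {
    assert(matches(union, "operator.union.UnionOpExec", List("in"), 1, None))
    assert(
      matches(
        difference,
        "operator.difference.DifferenceOpExec",
        List("left", "right"),
        1,
        Some(HashPartition)
      )
    )
    // A refactor that drops the right-hand port is caught by the same check.
    val drifted = difference.copy(inputPortNames = List("left"))
    assert(
      !matches(
        drifted,
        "operator.difference.DifferenceOpExec",
        List("left", "right"),
        1,
        Some(HashPartition)
      )
    )
  }
}
```

Matching on the class-name suffix rather than the full string keeps the sketch independent of the elided prefix; a real spec would presumably compare the full executor class name.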
   
   ## Behavior to pin
   
   For each descriptor:
   
   | Surface | Contract |
   | --- | --- |
   | `getPhysicalOp(workflowId, executionId)` | constructs a `PhysicalOp` 
referencing the correct executor class name |
   | Input ports / output ports | counts and (for `Difference`) display names 
match `operatorInfo` |
   | `operatorInfo` | name, description, group constant |
   | Partition requirement (for `Distinct` / `Difference`) | `HashPartition` |
   | `derivePartition` (for `Distinct` / `Difference`) | returns 
`HashPartition` regardless of input |
   | `Difference` schema propagation | accepts a single shared input schema; 
throws `IllegalArgumentException` when input schemas diverge |
   | `OperatorGroupConstants.SET_GROUP` / 
`OperatorGroupConstants.CLEANING_GROUP` | match the production constants |
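
Two of the contracts above, `derivePartition` and `Difference` schema propagation, can be illustrated with a small self-contained sketch. Everything here is an assumption made for illustration: `Schema`, `OtherPartition`, `propagateSchema`, and `throwsIllegalArgument` are hypothetical names, and the real descriptors may express the same checks differently.

```scala
// Hypothetical model of the contracts; NOT the production descriptor code.
object DifferenceContractSketch {
  // Stand-in schema: ordered (attribute name, type name) pairs.
  type Schema = List[(String, String)]

  sealed trait Partition
  case object HashPartition extends Partition
  case object OtherPartition extends Partition // any non-hash input partitioning

  // Accepts a single shared input schema. `require` throws
  // IllegalArgumentException when the input schemas diverge.
  def propagateSchema(inputs: List[Schema]): Schema = {
    require(inputs.nonEmpty, "at least one input schema is expected")
    require(inputs.distinct.size == 1, "input schemas must be identical")
    inputs.head
  }

  // Returns HashPartition regardless of the input partitioning.
  def derivePartition(inputs: List[Partition]): Partition = HashPartition

  // Small helper so the negative case can be asserted without a test framework.
  def throwsIllegalArgument(body: => Any): Boolean =
    try { body; false }
    catch { case _: IllegalArgumentException => true }

  def main(args: Array[String]): Unit = {
    val shared: Schema = List("id" -> "int", "name" -> "string")
    val other: Schema = List("id" -> "int")

    assert(propagateSchema(List(shared, shared)) == shared)
    assert(throwsIllegalArgument(propagateSchema(List(shared, other))))
    assert(derivePartition(List(OtherPartition, HashPartition)) == HashPartition)
    assert(derivePartition(Nil) == HashPartition)
  }
}
```

In an actual spec the negative case would more likely use the test framework's exception assertion than a hand-rolled helper; the helper only keeps this sketch dependency-free.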
   
   ## Scope
   
   - New spec files (one per source class per the spec-filename convention):
     - `UnionOpDescSpec.scala`
     - `DistinctOpDescSpec.scala`
     - `DifferenceOpDescSpec.scala`
   - No production-code changes.
   
   ### Task Type
   
   - [ ] Refactor / Cleanup
   - [ ] DevOps / Deployment / CI
   - [x] Testing / QA
   - [ ] Documentation
   - [ ] Performance
   - [ ] Other


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
