The GitHub Actions job "Required Checks" on texera.git/gh-readonly-queue/main/pr-5738-8a90f1f667c44bc26c0faf9eee619392e3f57ddf has succeeded.
Run started by GitHub user aglinxinyuan (triggered by aglinxinyuan).

Head commit for run:
0efbc0f59cad0a660912fab63de04a4860d8b42c / Xinyuan Lin <[email protected]>
test(workflow-operator): add unit test coverage for SET-family LogicalOp descriptors (#5738)

### What changes were proposed in this PR?

Pin the behavior of three previously uncovered `LogicalOp` descriptors in
the SET / cleaning operator family. Each descriptor wires a physical-op
class name, a port shape, and (where applicable) partitioning and
schema-propagation contracts through `getPhysicalOp`. No production-code
changes.

| Spec | Source class | Tests |
| --- | --- | --- |
| `UnionOpDescSpec` | `UnionOpDesc` | 5 |
| `DistinctOpDescSpec` | `DistinctOpDesc` | 7 |
| `DifferenceOpDescSpec` | `DifferenceOpDesc` | 9 |

All three spec files follow the `<srcClassName>Spec.scala` one-to-one
convention. `IntersectOpDescSpec` already exists and gave us the
spec-shape template.

**Behavior pinned — `UnionOpDesc`**

| Surface | Contract |
| --- | --- |
| `operatorInfo` | name `"Union"`, group `SET_GROUP`, description mentions "Union" |
| Ports | one input, one non-blocking output |
| `getPhysicalOp` | wires `OpExecWithClassName("…operator.union.UnionOpExec")` |
| Partition requirement | empty (no hash-alignment forced; unlike Distinct / Difference / Intersect, Union preserves whatever the upstream produced) |
| Independent instances | no static state shared across `new UnionOpDesc` |

**Behavior pinned — `DistinctOpDesc`**

| Surface | Contract |
| --- | --- |
| `operatorInfo` | name `"Distinct"`, group `CLEANING_GROUP`, description mentions "duplicate" |
| Ports | one input, one **blocking** output |
| `getPhysicalOp` | wires `OpExecWithClassName("…operator.distinct.DistinctOpExec")`; `partitionRequirement` is `List(Option(HashPartition()))`; `derivePartition` always returns `HashPartition` regardless of input partition kind |
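
As a rough illustration of the partitioning contract in the table above, the sketch below models it with hypothetical stand-in types. `PartitionInfo`, `HashPartition`, `SinglePartition`, `UnknownPartition`, and `DistinctPartitionSketch` are defined here only for illustration and are assumptions, not Texera's real classes or the code under test.

```scala
// Hypothetical stand-ins for illustration; not Texera's real partition classes.
sealed trait PartitionInfo
final case class HashPartition(hashAttributeNames: List[String] = Nil) extends PartitionInfo
case object SinglePartition extends PartitionInfo
case object UnknownPartition extends PartitionInfo

object DistinctPartitionSketch {
  // One requirement per input port; the Distinct descriptor has a single input.
  val partitionRequirement: List[Option[PartitionInfo]] =
    List(Option(HashPartition()))

  // Ignores the upstream partitioning and always reports hash partitioning.
  def derivePartition(inputs: List[PartitionInfo]): PartitionInfo =
    HashPartition()

  def main(args: Array[String]): Unit = {
    val upstreamKinds =
      List(SinglePartition, UnknownPartition, HashPartition(List("id")))
    // Whatever the input partition kind, the derived partition is the same.
    upstreamKinds.foreach { kind =>
      assert(derivePartition(List(kind)) == HashPartition())
    }
    assert(partitionRequirement == List(Some(HashPartition())))
  }
}
```

A Union-style descriptor would differ only in leaving the requirement list empty, which is what lets it preserve the upstream partitioning.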

**Behavior pinned — `DifferenceOpDesc`**

| Surface | Contract |
| --- | --- |
| `operatorInfo` | name `"Difference"`, group `SET_GROUP`, description mentions "difference"; two input ports with `displayName` `"left"` (PortIdentity 0) and `"right"` (PortIdentity 1); one **blocking** output |
| `getPhysicalOp` | wires `OpExecWithClassName("…operator.difference.DifferenceOpExec")`; `partitionRequirement` is `List(Option(HashPartition()), Option(HashPartition()))` (both inputs); `derivePartition` always returns `HashPartition` |
| Schema propagation | accepts a single shared input schema and produces that schema on every output port; throws `IllegalArgumentException` when the two inputs do not share one schema |
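
The schema-propagation contract above can be sketched in isolation as follows. This is a minimal model under stated assumptions: `Schema`, `PortId`, and `DifferenceSchemaSketch.propagate` are hypothetical names introduced for illustration, and the use of `require` is one way to get an `IllegalArgumentException`, not necessarily how `DifferenceOpDesc` does it.

```scala
// Hypothetical stand-ins for illustration; not Texera's real Schema or PortIdentity.
final case class Schema(columns: List[(String, String)])
final case class PortId(id: Int)

object DifferenceSchemaSketch {
  // All inputs must share one schema; that schema is emitted on every output port.
  def propagate(
      inputs: Map[PortId, Schema],
      outputPorts: List[PortId]
  ): Map[PortId, Schema] = {
    val distinctSchemas = inputs.values.toSet
    // require throws IllegalArgumentException when the condition is false.
    require(distinctSchemas.size == 1, "inputs must share exactly one schema")
    val shared = distinctSchemas.head
    outputPorts.map(port => port -> shared).toMap
  }

  def main(args: Array[String]): Unit = {
    val full = Schema(List("id" -> "INTEGER", "name" -> "STRING"))
    val narrow = Schema(List("id" -> "INTEGER"))

    // Matching inputs: the shared schema appears on the output port.
    val ok = propagate(Map(PortId(0) -> full, PortId(1) -> full), List(PortId(0)))
    assert(ok == Map(PortId(0) -> full))

    // Mismatched inputs: the call fails with IllegalArgumentException.
    val bad = scala.util.Try(
      propagate(Map(PortId(0) -> full, PortId(1) -> narrow), List(PortId(0)))
    )
    assert(bad.failed.toOption.exists(_.isInstanceOf[IllegalArgumentException]))
  }
}
```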

### Any related issues, documentation, discussions?

Closes #5734.

### How was this PR tested?

Pure unit-test additions; verified locally with:

- `sbt "WorkflowOperator/testOnly
org.apache.texera.amber.operator.union.UnionOpDescSpec
org.apache.texera.amber.operator.distinct.DistinctOpDescSpec
org.apache.texera.amber.operator.difference.DifferenceOpDescSpec"` — 21
tests, all green
- `sbt scalafmtCheckAll` — clean
- CI to confirm

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7 [1M context])

Report URL: https://github.com/apache/texera/actions/runs/27722882811

With regards,
GitHub Actions via GitBox
