aglinxinyuan opened a new pull request, #5814:
URL: https://github.com/apache/texera/pull/5814

   ### What changes were proposed in this PR?
   
   Pin behavior of three previously-untested descriptors that shape/reduce 
output row counts in `common/workflow-operator/`. No production-code changes.
   
   | Spec | Source class | Tests |
   | --- | --- | --- |
   | `LimitOpDescSpec` | `LimitOpDesc` | 5 |
   | `RandomKSamplingOpDescSpec` | `RandomKSamplingOpDesc` | 3 |
   | `ReservoirSamplingOpDescSpec` | `ReservoirSamplingOpDesc` | 3 |
   
   All three spec files follow the `<srcClassName>Spec.scala` one-to-one 
convention.
   
   **Behavior pinned — `LimitOpDesc`**
   
   | Surface | Contract |
   | --- | --- |
   | `operatorInfo` | `Limit`, `CLEANING_GROUP`, 1-in/1-out, `supportReconfiguration == true` |
   | Polymorphic deserialize | `{"operatorType":"Limit","limit":N}` via `classOf[LogicalOp]` yields a `LimitOpDesc` with `limit == N` |
   | `getPhysicalOp` | non-parallelizable; wires `LimitOpExec`; ports carried forward |
   | `runtimeReconfiguration` | returns `Success` with a `StateTransferFunc`; the func copies the running `count` from the old `LimitOpExec` to the new one (exercised end-to-end with two real exec instances) |
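   
   The `runtimeReconfiguration` contract in the table above can be illustrated with a minimal, self-contained sketch. It does not use the real Texera classes: `SketchLimitExec`, the `StateTransfer` type alias, and the shape of `runtimeReconfiguration()` here are hypothetical stand-ins, assumed only for illustration. The one thing taken from the PR is the pinned behavior: the transfer function copies the running `count` from the old executor to the new one.
   
   ```scala
   import scala.util.{Success, Try}

   // Hypothetical stand-in for LimitOpExec: passes through at most `limit` rows.
   final class SketchLimitExec(val limit: Int) {
     var count: Int = 0 // rows emitted so far

     def process(row: String): Option[String] =
       if (count < limit) { count += 1; Some(row) } else None
   }

   object StateTransferSketch {
     // Assumed shape of a state-transfer function: (old executor, new executor) => Unit.
     type StateTransfer = (SketchLimitExec, SketchLimitExec) => Unit

     // Mirrors the pinned contract: succeed, and hand back a function that
     // copies the running count from the old executor to the new one.
     def runtimeReconfiguration(): Try[StateTransfer] = {
       val transfer: StateTransfer = (oldExec, newExec) => { newExec.count = oldExec.count }
       Success(transfer)
     }

     def main(args: Array[String]): Unit = {
       val oldExec = new SketchLimitExec(5)
       Seq("a", "b", "c").foreach(row => oldExec.process(row)) // old executor has emitted 3 rows

       val newExec = new SketchLimitExec(10)
       runtimeReconfiguration().get.apply(oldExec, newExec)

       println(newExec.count) // prints 3
     }
   }
   ```
   
   In this sketch, skipping the copy would leave the new executor at `count == 0`, so it could emit up to `limit` further rows regardless of what the old one had already produced. That is presumably the regression the end-to-end test guards against.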
   
   **Behavior pinned — `RandomKSamplingOpDesc`**
   
   | Surface | Contract |
   | --- | --- |
   | `operatorInfo` | `Random K Sampling`, `UTILITY_GROUP`, `supportReconfiguration == true` |
   | `percentage` round-trip | serializes under the spaced wire-key `random k sample percentage`; survives a polymorphic round-trip |
   | `getPhysicalOp` | wires `RandomKSamplingOpExec`; ports carried forward |
   
   **Behavior pinned — `ReservoirSamplingOpDesc`**
   
   | Surface | Contract |
   | --- | --- |
   | `operatorInfo` | `Reservoir Sampling`, `UTILITY_GROUP`, `supportReconfiguration == false` (the intentional difference vs `RandomKSamplingOpDesc`; pinned so a future "fix" that flips it is caught) |
   | `k` round-trip | serializes under the wire-key `number of item sampled in reservoir sampling` |
   | `getPhysicalOp` | wires `ReservoirSamplingOpExec`; ports carried forward |
   
   ### Any related issues, documentation, discussions?
   
   Closes #5807.
   
   ### How was this PR tested?
   
   Pure unit-test additions; verified locally with:
   
   - `sbt "WorkflowOperator/testOnly org.apache.texera.amber.operator.limit.LimitOpDescSpec org.apache.texera.amber.operator.randomksampling.RandomKSamplingOpDescSpec org.apache.texera.amber.operator.reservoirsampling.ReservoirSamplingOpDescSpec"`: 11 tests, all green
   - `sbt "WorkflowOperator/Test/scalafmtCheck"` and `sbt "WorkflowOperator/Test/scalafix --check"`: clean
   - CI to confirm
   
   ### Was this PR authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Code (Opus 4.8 [1M context])


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

