aglinxinyuan opened a new pull request, #5769:
URL: https://github.com/apache/texera/pull/5769

   ### What changes were proposed in this PR?
   
   Pins the behavior of two previously untested standalone operators, Split and URL Visualizer, each a descriptor + executor pair. No production-code changes.
   
   | Spec | Source class | Tests |
   | --- | --- | --- |
   | `SplitOpDescSpec` | `SplitOpDesc` | 8 |
   | `SplitOpExecSpec` | `SplitOpExec` | 7 |
   | `UrlVizOpDescSpec` | `UrlVizOpDesc` | 7 |
   | `UrlVizOpExecSpec` | `UrlVizOpExec` | 6 |
   
   All four spec files follow the `<srcClassName>Spec.scala` one-to-one 
convention.
   
   **Behavior pinned — `SplitOpDesc`**
   
   | Surface | Contract |
   | --- | --- |
   | `operatorInfo` | name `"Split"`, group `UTILITY_GROUP`, one input, two outputs (PortIdentity 0 = training, PortIdentity 1 = testing) |
   | Field defaults | `k = 80`, `random = true`, `seed = 1` |
   | `getPhysicalOp` | wires `OpExecWithClassName("…operator.split.SplitOpExec", <json>)`; non-parallelizable; payload includes the `k` / `random` / `seed` wire-keys |
   | Schema propagation | propagates the single input schema to every output port; throws `IllegalArgumentException` unless exactly one input is supplied |
   | Independent instances | `operatorIdentifier` (UUID-seeded) differs between separately constructed instances |
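
The schema-propagation contract in the table above can be illustrated with a minimal, self-contained sketch. `SchemaPropagationSketch`, `propagate`, and the plain `Int` port ids are assumptions made for illustration; they are not the Texera types, and the real descriptor may perform its check differently. Scala's `require` is used only because it throws `IllegalArgumentException`, which matches the pinned exception type.

```scala
object SchemaPropagationSketch {
  // Hypothetical stand-in: ports are plain Ints and a schema is any type S.
  def propagate[S](inputSchemas: Map[Int, S], outputPorts: Seq[Int]): Map[Int, S] = {
    // require throws IllegalArgumentException when the condition is false.
    require(
      inputSchemas.size == 1,
      s"expected exactly one input schema, got ${inputSchemas.size}"
    )
    val schema = inputSchemas.values.head
    // Every output port receives the same (single) input schema.
    outputPorts.map(port => port -> schema).toMap
  }
}
```

Calling `propagate(Map(0 -> schema), Seq(0, 1))` yields the same schema on both output ports, while an empty or two-entry input map is rejected.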
   
   **Behavior pinned — `SplitOpExec`**
   
   | Surface | Contract |
   | --- | --- |
   | `k = 100` | every tuple emitted on PortIdentity 0 (training) |
   | `k = 0` | every tuple emitted on PortIdentity 1 (testing) |
   | Deterministic seed | two fresh instances with the same `(k, seed)` produce identical port sequences over 200 tuples |
   | `k = 50` (deterministic seed) | ~50% training/testing split over 2000 tuples, within a ±150-tuple band that is comfortably wider than the binomial 3σ ≈ 67 |
   | `close()` | sets the `random` reference to `null` |
   | `processTuple` (single-port overload) | throws `NotImplementedError` |
   | Malformed descriptor JSON | construction throws `JsonProcessingException` |
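
The boundary, determinism, and tolerance-band rows above can be reproduced with a small sketch. The routing rule used here (draw an integer in `[0, 100)` and route to training when it is below `k`) is an assumption that is merely consistent with the pinned `k = 0` and `k = 100` behavior; `SplitSketch`, `portSequence`, and `binomialSigma` are hypothetical names, not part of `SplitOpExec`.

```scala
import scala.util.Random

object SplitSketch {
  val Training = 0 // stands in for PortIdentity 0
  val Testing = 1  // stands in for PortIdentity 1

  // Assumed routing rule: draw an integer in [0, 100) and send the tuple to
  // training when the draw is below k. With k = 100 every draw qualifies;
  // with k = 0 none does.
  def portSequence(k: Int, seed: Long, n: Int): Vector[Int] = {
    val random = new Random(seed)
    Vector.fill(n)(if (random.nextInt(100) < k) Training else Testing)
  }

  // Standard deviation of a Binomial(n, p) count.
  def binomialSigma(n: Int, p: Double): Double = math.sqrt(n * p * (1 - p))
}
```

For 2000 tuples at `k = 50`, `3 * SplitSketch.binomialSigma(2000, 0.5)` is about 67.1 tuples, so a ±150 band around 1000 leaves a wide margin against flaky failures.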
   
   **Behavior pinned — `UrlVizOpDesc`**
   
   | Surface | Contract |
   | --- | --- |
   | `operatorInfo` | name `"URL Visualizer"`, group `VISUALIZATION_MEDIA_GROUP` |
   | `getPhysicalOp` | wires `OpExecWithClassName("…operator.visualization.urlviz.UrlVizOpExec", <json>)` |
   | Output schema | propagation function ignores input and emits a single `html-content` STRING attribute |
   | `urlContentAttrName` annotations | `@JsonProperty(required = true)` + `@AutofillAttributeName` + `@NotNull` (verified via reflection) |
   | Class-level `@JsonSchemaInject` | restricts `urlContentAttrName` to STRING attributes |
   | Independent instances | `operatorIdentifier` (UUID-seeded) differs between separately constructed instances |
   
   **Behavior pinned — `UrlVizOpExec`**
   
   | Surface | Contract |
   | --- | --- |
   | `processTuple` | emits a single `TupleLike` whose only value contains the generated HTML |
   | Generated HTML | `<!DOCTYPE html>` preamble; `<iframe src="…">` interpolates the input URL; `frameborder="0"` and the `height:100vh; width:100%; border:none` sizing style |
   | Per-tuple cardinality | exactly one emission per `processTuple` call |
   | Distinct URLs | interpolated into distinct outputs |
   | Malformed descriptor JSON | construction throws `JsonProcessingException` |
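
To make the HTML contract above concrete, here is a hedged sketch of a template that would satisfy the pinned fragments. Only the `<!DOCTYPE html>` preamble, the `<iframe src="…">` interpolation, `frameborder="0"`, and the sizing style come from the table; the surrounding markup and the names `UrlVizHtmlSketch` and `render` are assumptions, and the actual `UrlVizOpExec` template may differ. The sketch performs no URL escaping.

```scala
object UrlVizHtmlSketch {
  // Hypothetical template: one iframe per input URL, sized to fill the viewport.
  def render(url: String): String =
    s"""<!DOCTYPE html>
       |<html>
       |  <body>
       |    <iframe src="$url" frameborder="0" style="height:100vh; width:100%; border:none"></iframe>
       |  </body>
       |</html>""".stripMargin
}
```

Because the URL is the only interpolated value, two distinct URLs necessarily produce two distinct documents, which is what the "Distinct URLs" row checks.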
   
   **Test-harness note**
   
   `UrlVizOpDesc` declares `urlContentAttrName` as a `val` that defaults to `""`; production code populates it through `objectMapper.readValue`, relying on the `jackson-module-no-ctor-deser` module to set immutable vals during deserialization. To test the executor without touching production code, `UrlVizOpExecSpec` builds the descriptor JSON with Jackson's tree API and injects both the `operatorType` discriminator (`"URLVisualizer"`, per `LogicalOp`'s `@JsonSubTypes` table) and the `urlContentAttrName` field.
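
The tree-API approach described above might look roughly like the following. This is a sketch that assumes only `jackson-databind` on the classpath; `DescriptorJsonSketch`, `build`, and the attribute name `"url"` in the usage note are hypothetical, and the real spec may assemble the payload differently.

```scala
import com.fasterxml.jackson.databind.ObjectMapper

object DescriptorJsonSketch {
  private val mapper = new ObjectMapper()

  // Builds the descriptor payload field by field instead of serializing a
  // UrlVizOpDesc instance, so the test never has to mutate the immutable val.
  def build(urlAttrName: String): String = {
    val node = mapper.createObjectNode()
    // Discriminator value taken from the PR description.
    node.put("operatorType", "URLVisualizer")
    node.put("urlContentAttrName", urlAttrName)
    mapper.writeValueAsString(node)
  }
}
```

`DescriptorJsonSketch.build("url")` returns a JSON string that could then be passed to the executor's constructor in place of a serialized descriptor.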
   
   ### Any related issues, documentation, discussions?
   
   Closes #5766.
   
   ### How was this PR tested?
   
   Pure unit-test additions; verified locally with:
   
   - `sbt "WorkflowOperator/testOnly org.apache.texera.amber.operator.split.SplitOpDescSpec org.apache.texera.amber.operator.split.SplitOpExecSpec org.apache.texera.amber.operator.visualization.urlviz.UrlVizOpDescSpec org.apache.texera.amber.operator.visualization.urlviz.UrlVizOpExecSpec"`: 30 tests, all green
   - `sbt scalafmtCheckAll`: clean
   - CI to confirm
   
   ### Was this PR authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Code (Opus 4.7 [1M context])


