aglinxinyuan opened a new pull request, #5813:
URL: https://github.com/apache/texera/pull/5813

   ### What changes were proposed in this PR?
   
   Pin the behavior of four previously untested text-search/match descriptors in `common/workflow-operator/`. They share the same shape (matching or filtering tuples by a string predicate on a column) and contribute operator metadata, physical-op wiring, and (for `DictionaryMatcher`) output-schema propagation. No production-code changes.
   
   | Spec | Source class | Tests |
   | --- | --- | --- |
   | `KeywordSearchOpDescSpec` | `KeywordSearchOpDesc` | 6 |
   | `SubstringSearchOpDescSpec` | `SubstringSearchOpDesc` | 4 |
   | `RegexOpDescSpec` | `RegexOpDesc` | 3 |
   | `DictionaryMatcherOpDescSpec` | `DictionaryMatcherOpDesc` | 5 |
   
   All four spec files follow the `<srcClassName>Spec.scala` one-to-one 
convention.
   
   **Behavior pinned**
   
   | Surface | Contract |
   | --- | --- |
   | `operatorInfo` | exact `userFriendlyName`; group `SEARCH_GROUP`; one input / one output port; `supportReconfiguration == true` |
   | Field defaults | `KeywordSearch`/`Substring` `isCaseSensitive == false` |
   | `getPhysicalOp` | `opExecInitInfo` pattern-matches `OpExecWithClassName(<FQCN>, descString)` with the exact executor class name and a non-empty payload; ports carried forward from `operatorInfo` |
   | Polymorphic JSON round-trip | serialize → deserialize via `classOf[LogicalOp]` → correct subtype with fields preserved (pins the `@JsonTypeInfo` discriminator + `@JsonProperty` wire-keys) |
   | `DictionaryMatcher` schema propagation | `getExternalOutputSchemas` appends a `BOOLEAN` column named by `resultAttribute` to the input schema |
   | `DictionaryMatcher` MatchingType | serializes via its `@JsonValue` name (`SCANBASED` → `"Scan"`) and round-trips |
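
The schema-propagation row can be illustrated with a minimal sketch. It assumes simplified stand-in types: `Attr`, `Schema`, `AttrType`, and `outputSchema` are hypothetical names invented for this example and are not Texera's real schema API. It only models the pinned contract, namely that the output schema equals the input schema plus one `BOOLEAN` column named by `resultAttribute`.

```scala
// Hypothetical stand-ins for illustration only; not Texera's real Schema/Attribute classes.
object SchemaPropagationSketch {
  sealed trait AttrType
  case object StringType extends AttrType
  case object BooleanType extends AttrType

  final case class Attr(name: String, tpe: AttrType)

  final case class Schema(attrs: List[Attr]) {
    // Returns a new schema with `attr` appended; the original is left unchanged.
    def add(attr: Attr): Schema = Schema(attrs :+ attr)
  }

  // Models the contract from the table: input columns are kept in order,
  // and a BOOLEAN column named by `resultAttribute` is appended at the end.
  def outputSchema(input: Schema, resultAttribute: String): Schema =
    input.add(Attr(resultAttribute, BooleanType))

  def main(args: Array[String]): Unit = {
    val input = Schema(List(Attr("text", StringType)))
    val output = outputSchema(input, "matched")
    assert(output.attrs.init == input.attrs)                  // input columns preserved
    assert(output.attrs.last == Attr("matched", BooleanType)) // appended result column
  }
}
```

In the real spec the same two checks would presumably be made against the schema returned by `getExternalOutputSchemas`; the exact call shape is not shown in this PR description.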
   
   Mirrors the established `SleepOpDescSpec` / `SortOpDescSpec` patterns 
(AnyFlatSpec + Matchers; `OpExecWithClassName` match instead of brittle 
`toString`; polymorphic deserialize via `classOf[LogicalOp]`).
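
As a rough illustration of why a structural match is preferred over comparing `toString` output, here is a self-contained sketch. The two-argument `OpExecWithClassName(className, descString)` shape follows the table above; the surrounding `OpExecInitInfo` trait as written here, the `OtherInitInfo` variant, the `matchesExecutor` helper, and the class name `org.example.KeywordSearchOpExec` are assumptions made up for the example.

```scala
// Simplified stand-ins; the real Texera types may carry more variants and fields.
object OpExecMatchSketch {
  sealed trait OpExecInitInfo
  final case class OpExecWithClassName(className: String, descString: String)
      extends OpExecInitInfo
  // Hypothetical second variant, present only so the fallback branch has something to reject.
  final case class OtherInitInfo(payload: String) extends OpExecInitInfo

  // True only for the class-name variant with the expected executor class
  // and a non-empty serialized descriptor payload.
  def matchesExecutor(info: OpExecInitInfo, expectedClassName: String): Boolean =
    info match {
      case OpExecWithClassName(className, descString) =>
        className == expectedClassName && descString.nonEmpty
      case _ => false
    }

  def main(args: Array[String]): Unit = {
    val fqcn = "org.example.KeywordSearchOpExec" // made-up class name
    val info: OpExecInitInfo = OpExecWithClassName(fqcn, """{"keyword":"texera"}""")
    assert(matchesExecutor(info, fqcn))
    assert(!matchesExecutor(OpExecWithClassName(fqcn, ""), fqcn)) // empty payload rejected
    assert(!matchesExecutor(OtherInitInfo("x"), fqcn))            // wrong variant rejected
  }
}
```

Matching on the case class checks the variant and each field separately, so a change to how the payload is rendered as a string does not break the assertion.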
   
   ### Any related issues, documentation, discussions?
   
   Closes #5806.
   
   ### How was this PR tested?
   
   Pure unit-test additions; verified locally with:
   
   - `sbt "WorkflowOperator/testOnly org.apache.texera.amber.operator.keywordSearch.KeywordSearchOpDescSpec org.apache.texera.amber.operator.substringSearch.SubstringSearchOpDescSpec org.apache.texera.amber.operator.regex.RegexOpDescSpec org.apache.texera.amber.operator.dictionary.DictionaryMatcherOpDescSpec"`: 18 tests, all green
   - `sbt "WorkflowOperator/Test/scalafmtCheck"` and `sbt "WorkflowOperator/Test/scalafix --check"`: clean
   - CI to confirm
   
   ### Was this PR authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Code (Opus 4.8 [1M context])

