eugenegujing opened a new pull request, #5722:
URL: https://github.com/apache/texera/pull/5722
### What changes were proposed in this PR?
This PR adds a dedicated `GlobalPortIdentitySerdeSpec` for
`GlobalPortIdentitySerde`, the helper that serializes/deserializes a
`GlobalPortIdentity` to the compact, underscore-free string used in VFS URIs
and file paths.
`GlobalPortIdentitySerde` was previously exercised inside
`PortIdentitySerdeSpec`, which mixed it with the unrelated `PortIdentity`
Jackson key (de)serializer tests. Rather than duplicate that coverage, this PR:
- **Moves** the `GlobalPortIdentitySerde` cases into their own file (the
name issue #5717 asks for), so the two serde concerns are tested independently.
- **Trims** `PortIdentitySerdeSpec` down to only the `PortIdentity` Jackson
key (de)serialization tests, dropping the now-unused imports.
- **Adds** seven edge cases that were previously unpinned (see below).
There are **no production-code changes** and **no duplication** between the
two specs.
New edge cases added beyond the migrated coverage:
1. **serialize/deserialize asymmetry** — the deserializer regex accepts
underscores even though the serializer rejects them; characterized so a future
tightening of the regex breaks the test on purpose.
2. **unescaped comma** — a `logicalOpId` containing a `,` fails to
round-trip, because the format does not escape its separator.
3. **portId boundary** — `Int.MaxValue` round-trips, and an out-of-Int-range
value (`9999999999`) is rejected with `NumberFormatException` on deserialize.
4. **mixed-case booleans** — `True` / `FALSE` are accepted on deserialize,
since Scala's `String.toBoolean` is case-insensitive.
5. **`=` in a value round-trips** — only `,` is a sensitive separator; an
embedded `=` is captured fine, the counterpart to the comma case.
6. **empty `logicalOpId`** — serializes into `(logicalOpId=,...)` but the
result can no longer be deserialized (`[^,]+` requires ≥1 char), the serialize
side of the empty-field asymmetry.
7. **require messages name the field** — the serialize-side guards identify
the offending field (`logicalOpId` / `layerName` / `portId`) in their message,
catching a regression that wires a check to the wrong field.
| File | Change |
| --- | --- |
| `common/workflow-core/.../util/serde/GlobalPortIdentitySerdeSpec.scala` |
**new** — 23 tests (`AnyFlatSpec with Matchers`) |
| `common/workflow-core/.../util/serde/PortIdentitySerdeSpec.scala` |
removed the migrated `GlobalPortIdentitySerde` block + now-unused imports (−189
lines); `PortIdentity` Jackson key (de)serialization tests kept intact |
### Any related issues, documentation, discussions?
Closes #5717
### How was this PR tested?
Test-only PR. All checks below were run locally from the repository root and
pass:
1. **Unit tests** — the new and modified specs:
```
sbt "WorkflowCore/testOnly
org.apache.texera.amber.util.serde.GlobalPortIdentitySerdeSpec
org.apache.texera.amber.util.serde.PortIdentitySerdeSpec"
```
Result: **31 tests, 0 failures** (1 `pendingUntilFixed`, pre-existing in
`PortIdentitySerdeSpec`).
2. **Lint (scalafix)** — `sbt "scalafixAll --check"` → success.
3. **Format (scalafmt)** — `sbt scalafmtCheckAll` → success.
### Was this PR authored or co-authored using generative AI tooling?
Co-authored by: Claude Opus 4.8
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]