aglinxinyuan opened a new issue, #5776:
URL: https://github.com/apache/texera/issues/5776
### Task Summary
Add dedicated unit-specs (`PortIdentityKeySerializerSpec.scala` and
`PortIdentityKeyDeserializerSpec.scala`) that pin the string-key format Texera
uses for `PortIdentity` JSON **map keys**, and the round-trip between the
serializer and the deserializer.
## Background
`PortIdentity` (ScalaPB-generated from `workflow.proto`: `id: Int`,
`internal: Boolean`) is used as a **map key** in JSON-serialized structures.
JSON object keys must be strings, so Jackson needs a custom key serializer and
key deserializer for this type. `common/workflow-core` provides the pair, which
flattens a `PortIdentity` to an `"id_internal"` string (for example `"3_false"`)
and parses it back. Neither class has a dedicated unit spec today.
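```scala
// PortIdentityKeySerializer.scala (abridged; the PortIdentity import is omitted)
import com.fasterxml.jackson.core.JsonGenerator
import com.fasterxml.jackson.databind.{JsonSerializer, SerializerProvider}

case object PortIdentityKeySerializer {
  // Flattens a PortIdentity to the "id_internal" key format.
  def portIdToString(portId: PortIdentity): String =
    s"${portId.id}_${portId.internal}"
}

class PortIdentityKeySerializer extends JsonSerializer[PortIdentity] {
  override def serialize(key: PortIdentity, gen: JsonGenerator, serializers: SerializerProvider): Unit =
    gen.writeFieldName(PortIdentityKeySerializer.portIdToString(key))
}

// PortIdentityKeyDeserializer.scala (abridged)
import com.fasterxml.jackson.databind.{DeserializationContext, KeyDeserializer}

class PortIdentityKeyDeserializer extends KeyDeserializer {
  // Expects "<int>_<boolean>"; the input is not validated.
  override def deserializeKey(key: String, ctxt: DeserializationContext): PortIdentity = {
    val parts = key.split("_")
    PortIdentity(parts(0).toInt, parts(1).toBoolean)
  }
}
```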
## Behavior to pin
| Surface | Contract |
| --- | --- |
| `PortIdentityKeySerializer.portIdToString` | `PortIdentity(3, internal = false)` → `"3_false"`; `PortIdentity(0, internal = true)` → `"0_true"` (format is exactly `"${id}_${internal}"`) |
| `PortIdentityKeyDeserializer.deserializeKey` | `"3_false"` → `PortIdentity(3, internal = false)`; `"0_true"` → `PortIdentity(0, internal = true)` |
| Round-trip | `deserializeKey(portIdToString(p)) == p` for representative `p`: `internal` both `true` and `false`; `id` zero, large, and negative |
| Full Jackson map round-trip (recommended) | a `Map[PortIdentity, V]` serialized with the project mapper (`org.apache.texera.amber.util.JSONUtils.objectMapper`, which registers these as the key (de)serializers) reads back with identical keys |
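
As a rough illustration of the first three rows, here is a dependency-free sketch. It assumes a local stand-in `PortIdentity` case class (not the ScalaPB-generated one) and copies the format and parse logic from the Background snippet; the object name `PortKeyRoundTripSketch` and the `samples` list are hypothetical. A real spec would call the actual classes from a ScalaTest suite instead.

```scala
object PortKeyRoundTripSketch {
  // Stand-in for the ScalaPB-generated PortIdentity (assumption: only these two fields matter here).
  final case class PortIdentity(id: Int, internal: Boolean)

  // Same format as portIdToString in the Background snippet.
  def portIdToString(portId: PortIdentity): String =
    s"${portId.id}_${portId.internal}"

  // Same parsing as deserializeKey in the Background snippet (Jackson context argument dropped).
  def deserializeKey(key: String): PortIdentity = {
    val parts = key.split("_")
    PortIdentity(parts(0).toInt, parts(1).toBoolean)
  }

  // Representative inputs: id zero / small / large / negative, each with both `internal` values.
  val samples: Seq[PortIdentity] = for {
    id <- Seq(0, 3, Int.MaxValue, -1, Int.MinValue)
    internal <- Seq(true, false)
  } yield PortIdentity(id, internal)

  def main(args: Array[String]): Unit = {
    // Exact string format.
    assert(portIdToString(PortIdentity(3, internal = false)) == "3_false")
    assert(portIdToString(PortIdentity(0, internal = true)) == "0_true")
    // Exact parse results.
    assert(deserializeKey("3_false") == PortIdentity(3, internal = false))
    assert(deserializeKey("0_true") == PortIdentity(0, internal = true))
    // Round-trip over all samples.
    samples.foreach(p => assert(deserializeKey(portIdToString(p)) == p))
  }
}
```

Negative ids round-trip because the minus sign is not the separator: `"-1_true"` still splits into exactly two parts.
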
> The deserializer currently assumes exactly two `_`-separated parts and a
numeric id — pin **today's** behavior; this task does not add new validation.
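
To make "today's behavior" concrete, the sketch below shows how the `split` / `toInt` / `toBoolean` logic from the Background snippet treats a few unusual keys. It again uses a local stand-in `PortIdentity` and a hypothetical object name, and it only exercises the parsing logic in isolation; whether a spec should pin these failure modes, and how Jackson surfaces them during a full map read, is left to the implementer.

```scala
import scala.util.Try

object PortKeyEdgeCaseSketch {
  // Stand-in for the ScalaPB-generated PortIdentity (assumption).
  final case class PortIdentity(id: Int, internal: Boolean)

  // Same parsing as deserializeKey in the Background snippet (Jackson context argument dropped).
  def deserializeKey(key: String): PortIdentity = {
    val parts = key.split("_")
    PortIdentity(parts(0).toInt, parts(1).toBoolean)
  }

  def main(args: Array[String]): Unit = {
    // Parts beyond the second are silently ignored.
    assert(deserializeKey("3_false_extra") == PortIdentity(3, internal = false))
    // String.toBoolean accepts any letter case.
    assert(deserializeKey("3_TRUE") == PortIdentity(3, internal = true))
    // A missing second part fails on the array access.
    assert(Try(deserializeKey("3")).failed.get.isInstanceOf[ArrayIndexOutOfBoundsException])
    // A non-numeric id fails in toInt.
    assert(Try(deserializeKey("x_true")).failed.get.isInstanceOf[NumberFormatException])
    // A non-boolean flag fails in toBoolean.
    assert(Try(deserializeKey("3_yes")).failed.get.isInstanceOf[IllegalArgumentException])
  }
}
```
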
## Scope
- New spec files under
`common/workflow-core/src/test/scala/org/apache/texera/amber/util/serde/` — one
per source class (`<srcClassName>Spec.scala` convention). Bundling both in a
single PR is fine.
- Exercise the serializer via `portIdToString` and/or a Jackson `Map`
round-trip; no engine/runtime needed.
- No production-code changes. Follow the module's existing spec style (e.g.
`AnyFunSuite`/`AnyFlatSpec`).
### Task Type
- [ ] Refactor / Cleanup
- [ ] DevOps / Deployment / CI
- [x] Testing / QA
- [ ] Documentation
- [ ] Performance
- [ ] Other