Yaron Neuman created BEAM-9502:
----------------------------------
Summary: SchemaCoder assigns random UUID, causes Dataflow's
compatibility check to fail
Key: BEAM-9502
URL: https://issues.apache.org/jira/browse/BEAM-9502
Project: Beam
Issue Type: Bug
Components: runner-dataflow, sdk-java-core
Reporter: Yaron Neuman
After fe4b7794, _Schema.equals_ compares only the UUIDs, for faster comparison.
After 0b3b18c6, _SchemaCoder_ forces a random UUID when schema.uuid is null.
As a result, when updating a job that defines row schemas in user code, the
second run (the update) fails the pipeline compatibility check because
SchemaCoder generates a different random UUID.
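To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use Beam at all: ToySchema and withRandomUuidIfMissing are hypothetical stand-ins that only imitate the two behaviours described above (equality by UUID alone, and a random UUID filled in when none is set).

```java
import java.util.Objects;
import java.util.UUID;

public class RandomUuidMismatch {
    // Hypothetical stand-in for a schema whose equality is decided by UUID alone.
    static final class ToySchema {
        final String fields;
        final UUID uuid;

        ToySchema(String fields, UUID uuid) {
            this.fields = fields;
            this.uuid = uuid;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ToySchema)) {
                return false;
            }
            // Only the UUID is compared; the field list is ignored.
            return Objects.equals(uuid, ((ToySchema) o).uuid);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(uuid);
        }
    }

    // Imitates a coder that assigns a random UUID when the schema has none.
    static ToySchema withRandomUuidIfMissing(String fields, UUID uuid) {
        return new ToySchema(fields, uuid != null ? uuid : UUID.randomUUID());
    }

    public static void main(String[] args) {
        // Same field list, built once per "run" of the pipeline.
        ToySchema firstRun = withRandomUuidIfMissing("id:INT64,name:STRING", null);
        ToySchema update = withRandomUuidIfMissing("id:INT64,name:STRING", null);

        // The two random UUIDs differ, so the schemas compare as unequal
        // even though their fields are identical.
        System.out.println("compatible: " + firstRun.equals(update));
    }
}
```

Two structurally identical schemas end up unequal simply because each construction draws a fresh random UUID, which is the situation an update run appears to hit.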
The user can set the UUID after creating the Schema, but not through
Schema.Builder, and I'm afraid most users, who are not aware of the internal
implementation, won't do that.
In my branch, I added _.withUUID_ and _.withRandomUUID_ to _Schema.Builder_.
But I think a better solution would be to calculate the UUID from the schema
itself.
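One possible shape for a schema-derived UUID, as a rough sketch only: the "name:type" string descriptors and the uuidFor helper are assumptions made for illustration, not Beam's API. The idea is to build a canonical byte representation of the fields and feed it to java.util.UUID.nameUUIDFromBytes, which returns a name-based (version 3) UUID.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

public class DeterministicSchemaUuid {
    // Hypothetical helper: derives a UUID from an ordered list of
    // "name:type" field descriptors instead of drawing a random one.
    static UUID uuidFor(List<String> fieldDescriptors) {
        StringBuilder canonical = new StringBuilder();
        for (String field : fieldDescriptors) {
            // Length-prefix each descriptor so that different field lists
            // cannot collapse into the same canonical string.
            canonical.append(field.length()).append(':').append(field).append(';');
        }
        byte[] bytes = canonical.toString().getBytes(StandardCharsets.UTF_8);
        // Name-based UUID: the same bytes always give the same UUID.
        return UUID.nameUUIDFromBytes(bytes);
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("id:INT64", "name:STRING");

        UUID firstRun = uuidFor(fields);
        UUID update = uuidFor(fields);

        // Identical field lists yield identical UUIDs across runs.
        System.out.println("same schema, same UUID: " + firstRun.equals(update));

        // Changing a field changes the UUID.
        UUID changed = uuidFor(Arrays.asList("id:INT64", "name:BYTES"));
        System.out.println("changed schema, same UUID: " + firstRun.equals(changed));
    }
}
```

A real implementation would also have to decide how nested rows, nullability, field order and field options enter the canonical form; this sketch ignores all of those.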
Any thoughts?
[~reuvenlax]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)