+1 to standardizing on a deterministic ordering for inference if none is imposed by the structure.
On Wed, Feb 5, 2020, 8:55 AM Gleb Kanterov <g...@spotify.com> wrote:

> There are Beam schema providers that use Java reflection to get fields for
> classes with fields and auto-value classes. It isn't relevant for POJOs
> with "creators", because function arguments are ordered. We cache instances
> of schema coders, but there is no guarantee that it's deterministic between
> JVMs. As a result, I've seen cases when the construction of pipeline graphs
> and output schema is non-deterministic. It's especially relevant when
> writing data to external storage, where row schema becomes a table schema.
> There is a workaround to apply a transform that would make schema
> deterministic, for instance, by ordering fields by name.
>
> I would see a benefit in making schemas deterministic by default or at
> least introducing a way to do so without writing custom code. What are your
> thoughts?
>
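
For illustration, the "order fields by name" idea from the quoted message can be sketched with nothing but the JDK. This is only a rough sketch, not how Beam's schema providers are implemented: `UserEvent` and `sortedFieldNames` are made-up names, and a real fix would have to live in the schema inference code. The relevant fact is that `Class.getDeclaredFields()` is documented as returning fields in no particular order, so any ordering has to be imposed explicitly:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class DeterministicFieldOrder {

  // Hypothetical POJO, used only for illustration.
  static class UserEvent {
    String userId;
    long timestamp;
    String country;
    static int instances; // static, so it should not appear in a schema
  }

  // Returns the instance field names of clazz sorted by name, so the result
  // does not depend on the order in which the JVM happens to report fields.
  static List<String> sortedFieldNames(Class<?> clazz) {
    return Arrays.stream(clazz.getDeclaredFields())
        .filter(f -> !Modifier.isStatic(f.getModifiers()))
        .filter(f -> !f.isSynthetic()) // skip compiler/tool-generated fields
        .map(Field::getName)
        .sorted()
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // prints [country, timestamp, userId]
    System.out.println(sortedFieldNames(UserEvent.class));
  }
}
```

Sorting by name is just one possible canonical order. The trade-off is that the schema no longer follows declaration order, which may matter when the row schema becomes a table schema in external storage.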