Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/20280
Let me restate what I think the intended behavior of Row is:
If a `Row` is made from kwargs, then the order of the fields cannot be
relied upon, and data must always be accessed by field name, like a dict.
Because of this, when applying a schema to the data, the schema
fields must also be fields in the `Row` objects. Field position can change as
long as the name matches.
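A minimal sketch of the by-name case, in plain Python rather than PySpark. The `SCHEMA` list, the dict rows, and `apply_schema_by_name` are hypothetical stand-ins for illustration only, not Spark's implementation:

```python
# Hypothetical stand-in for a schema: (field name, Python type) pairs.
SCHEMA = [("key", str), ("value", int)]


def apply_schema_by_name(row_fields, schema):
    """Build a tuple in schema order by looking each field up by name."""
    missing = [name for name, _ in schema if name not in row_fields]
    if missing:
        # Every schema field must also be a field of the row.
        raise KeyError(f"schema fields missing from row: {missing}")
    return tuple(field_type(row_fields[name]) for name, field_type in schema)


# The same fields in a different order give the same result,
# because position is never consulted.
row_a = {"key": "a", "value": 1}
row_b = {"value": 1, "key": "a"}
result_a = apply_schema_by_name(row_a, SCHEMA)
result_b = apply_schema_by_name(row_b, SCHEMA)
```

Both `result_a` and `result_b` come out in schema order, while a row lacking `key` or `value` is rejected.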
If a `Row` is made by generating a custom class, like `TestRow =
Row("key", "value")` then `row = TestRow('a', 1)`, then the schema will be
applied based on position and the elements in the `Row` objects are accessed by
index. The name of each field in the schema can differ as long as the element
at that index can be converted to the specified schema type.
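A minimal sketch of the by-position case, again in plain Python rather than PySpark. The tuple row, `renamed_schema`, and `apply_schema_by_position` are hypothetical stand-ins for illustration only, not Spark's implementation:

```python
def apply_schema_by_position(values, schema):
    """Pair each value with the schema field at the same index; names are ignored."""
    if len(values) != len(schema):
        raise ValueError(
            f"row has {len(values)} elements but schema has {len(schema)} fields"
        )
    return tuple(field_type(value) for value, (_, field_type) in zip(values, schema))


# Stand-in for TestRow('a', 1): only the positions of the values matter.
row = ("a", 1)

# Schema field names differ from "key"/"value"; this still works because
# each element can be converted to the type at the same index.
renamed_schema = [("k", str), ("v", float)]
result = apply_schema_by_position(row, renamed_schema)
```

Here the field names in `renamed_schema` play no role; only the element count and the convertibility of each element are checked.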