Github user BryanCutler commented on the issue:

    https://github.com/apache/spark/pull/20280
  
    After looking into this, it seems like the behavior of the `Row` class is 
as follows:
    
    If a `Row` is made from kwargs, then the order of the fields cannot be 
relied upon, and data must always be accessed by field name, like a dict.  
In this case, the order of the fields in the supplied schema doesn't matter, 
but the schema's field names must be a subset of those in each row.
    
    If a `Row` is made by first generating a custom class, e.g. `TestRow = 
Row("key", "value")` followed by `row = TestRow('a', 1)`, then the position of 
each element is what matters and data is accessed by position in the tuple.  
The supplied schema must match the types of the rows exactly in this case; 
however, field names are not important and can be changed.


---
