Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/20280
Also, this will cause a breaking change when `Row`s are defined with kwargs
and the schema uses different field names, like this:
```
data = [Row(key=i, value=str(i)) for i in range(100)]
rdd = self.sc.parallelize(data, 5)
df = rdd.toDF(" a: int, b: string ")
```
and this would still work but might be slower, depending on how complicated
the schema is, because the field names are now searched for instead of the
values just being taken by position:
```
df = rdd.toDF(" key: int, value: string ")
```
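To make the difference concrete, here is a minimal sketch in plain Python (no Spark required, using `namedtuple` as a stand-in for `pyspark.sql.Row`; the converter functions are hypothetical, not Spark's actual internals) of the two matching strategies being compared:

```
from collections import namedtuple

# Stand-in for pyspark.sql.Row: a Row built with kwargs remembers its
# field names, so a converter can take values by position or by name.
Row = namedtuple("Row", ["key", "value"])

row = Row(key=1, value="1")

def convert_by_position(row, schema_fields):
    # Positional matching: schema field names are ignored entirely,
    # so a schema like "a: int, b: string" still lines up.
    return tuple(row[i] for i in range(len(schema_fields)))

def convert_by_name(row, schema_fields):
    # Name-based matching: each schema field name is looked up in the
    # Row, which is what makes renamed fields a breaking change.
    return tuple(getattr(row, name) for name in schema_fields)

print(convert_by_position(row, ["a", "b"]))    # works: (1, '1')
print(convert_by_name(row, ["key", "value"]))  # works: (1, '1')
# convert_by_name(row, ["a", "b"]) would raise AttributeError
```

The name lookup is also the source of the extra cost: it is a per-field search rather than a straight index into the tuple.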
So if we go forward with this fix, I should probably add something to the
migration guide.