Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20503
I meant things like this:
```python
>>> from pyspark.sql import Row
>>> RowClass = Row(1)
>>> RowClass("a")
Row(1='a')
```
```python
>>> spark.createDataFrame([RowClass("a")])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../spark/python/pyspark/sql/session.py", line 686, in createDataFrame
    rdd, schema = self._createFromLocal(map(prepare, data), schema)
  File "/.../spark/python/pyspark/sql/session.py", line 410, in _createFromLocal
    struct = self._inferSchemaFromList(data, names=schema)
  File "/.../spark/python/pyspark/sql/session.py", line 342, in _inferSchemaFromList
    schema = reduce(_merge_type, (_infer_schema(row, names) for row in data))
  File "/.../spark/python/pyspark/sql/session.py", line 342, in <genexpr>
    schema = reduce(_merge_type, (_infer_schema(row, names) for row in data))
  File "/.../spark/python/pyspark/sql/types.py", line 1099, in _infer_schema
    fields = [StructField(k, _infer_type(v), True) for k, v in items]
  File "/.../spark/python/pyspark/sql/types.py", line 407, in __init__
    assert isinstance(name, basestring), "field name should be string"
AssertionError: field name should be string
```
The reason I didn't initially suggest using `str` is that, IIRC, it breaks on
`unicode` values in Python 2. For example:
```
>>> str(u"아")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\uc544' in position 0: ordinal not in range(128)
```
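One way to sidestep that, as a rough sketch only: leave anything that is already a string alone and call `str` just on non-string names. The helper `to_field_name` and the `string_types` tuple below are hypothetical names for illustration, not code from this PR or from PySpark.

```python
import sys

# Assumption: a small compatibility shim, not PySpark's own.
if sys.version_info[0] >= 3:
    string_types = (str,)
else:
    string_types = (basestring,)  # Python 2 only: covers both str and unicode


def to_field_name(name):
    """Hypothetical helper: coerce a field name to a string.

    Values that are already strings (including Python 2 unicode) are
    returned unchanged, so str() is never applied to non-ASCII text.
    """
    if isinstance(name, string_types):
        return name
    return str(name)


int_name = to_field_name(1)               # non-string name gets converted
text_name = to_field_name(u"\uc544")      # unicode name passes through untouched
```

With something like this, `Row(1)` style names could be converted before they reach `StructField`, while unicode names would not hit the ASCII encode error in Python 2.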