Kazuki Yokoishi created SPARK-15244:
---------------------------------------
Summary: Type of column name created with
sqlContext.createDataFrame() is not consistent.
Key: SPARK-15244
URL: https://issues.apache.org/jira/browse/SPARK-15244
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 2.0.0
Environment: CentOS 7, Spark 1.6.0
Reporter: Kazuki Yokoishi
Priority: Minor
StructField() converts the field name to str in __init__.
However, when a list of str/unicode names is passed to sqlContext.createDataFrame() as the
schema, StructField.name is not converted and keeps its original type.
To reproduce:
{noformat}
>>> schema = StructType([StructField(u"col", StringType())])
>>> df1 = sqlContext.createDataFrame([("a",)], schema)
>>> df1.columns # "col" is str
['col']
>>> df2 = sqlContext.createDataFrame([("a",)], [u"col"])
>>> df2.columns # "col" is unicode
[u'col']
{noformat}
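As a possible workaround until the conversion is consistent, the column names could be normalized before they are passed in as a schema. The following is only a minimal sketch: the helper names `to_native_str` and `normalize_column_names` are hypothetical and not part of PySpark, and the UTF-8 encoding step is an assumption intended to mirror the conversion that StructField() is described as doing in __init__.

```python
def to_native_str(name):
    """Return `name` as the interpreter's native `str` type.

    Hypothetical helper, not part of PySpark.
    """
    if isinstance(name, str):
        # Already a native str (always the case for text on Python 3).
        return name
    # On Python 2 a `unicode` value reaches this branch; encoding it as
    # UTF-8 yields a native (byte) str. The choice of UTF-8 is an assumption.
    return name.encode("utf-8")


def normalize_column_names(names):
    """Normalize every name in a list-style schema to native `str`."""
    return [to_native_str(n) for n in names]


# Intended use with the reproduction above (needs a live sqlContext,
# so it is left as a comment here):
#   df2 = sqlContext.createDataFrame([("a",)], normalize_column_names([u"col"]))
#   df2.columns
```

Normalizing on the caller side leaves createDataFrame() untouched, so it does not fix the inconsistency itself; it only avoids passing unicode names through the list-of-names path.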