[jira] [Created] (SPARK-15244) Type of column name created with sqlContext.createDataFrame() is not consistent.

Kazuki Yokoishi (JIRA) Mon, 09 May 2016 21:36:55 -0700

Kazuki Yokoishi created SPARK-15244:
---------------------------------------


             Summary: Type of column name created with sqlContext.createDataFrame() is not consistent.
                 Key: SPARK-15244
                 URL: https://issues.apache.org/jira/browse/SPARK-15244
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.0.0
         Environment: CentOS 7, Spark 1.6.0
            Reporter: Kazuki Yokoishi
            Priority: Minor


StructField() converts the field name to str in __init__.
However, when a list of str/unicode names is passed to sqlContext.createDataFrame()
as the schema, StructField.name is not converted and keeps its original type.

To reproduce:
{noformat}
>>> schema = StructType([StructField(u"col", StringType())])
>>> df1 = sqlContext.createDataFrame([("a",)], schema)
>>> df1.columns # "col" is str
['col']
>>> df2 = sqlContext.createDataFrame([("a",)], [u"col"])
>>> df2.columns # "col" is unicode
[u'col']
{noformat}
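
Until the conversion is made consistent, one possible caller-side workaround is to normalize the column names before passing them as a schema. The sketch below is only an illustration, not the proposed fix: the helper names to_native_str and normalize_column_names are hypothetical, and it assumes names are UTF-8 encodable/decodable. It mirrors the idea of converting each name to the interpreter's native str type and does not require Spark to run.

```python
import sys


def to_native_str(name):
    """Return `name` as the interpreter's native `str` type (hypothetical helper)."""
    if isinstance(name, str):
        return name
    if sys.version_info[0] < 3:
        # Python 2: `name` is unicode here, so encode it to a byte str.
        return name.encode("utf-8")
    # Python 3: assume `name` is bytes and decode it to a text str.
    return name.decode("utf-8")


def normalize_column_names(names):
    """Convert every column name in `names` to native `str` (hypothetical helper)."""
    return [to_native_str(n) for n in names]


names = normalize_column_names([u"col", "other"])
# Every name now has the same type regardless of how it was written.
assert all(type(n) is str for n in names)
```

With such a helper, the second call in the reproduction could be written as sqlContext.createDataFrame([("a",)], normalize_column_names([u"col"])), which should make df2.columns report the same type as df1.columns.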




