[jira] [Created] (SPARK-13802) Fields order in Row is not consistent with Schema.toInternal method

Szymon Matejczyk (JIRA) Thu, 10 Mar 2016 05:56:47 -0800

Szymon Matejczyk created SPARK-13802:
----------------------------------------


             Summary: Fields order in Row is not consistent with 
Schema.toInternal method
                 Key: SPARK-13802
                 URL: https://issues.apache.org/jira/browse/SPARK-13802
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.6.0
            Reporter: Szymon Matejczyk


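When a Row is constructed from keyword arguments, the fields in the underlying 
tuple are sorted by name. When a schema reads the row via toInternal, it does 
not use the fields in that order: the values are taken positionally, so they 
end up under the wrong schema fields.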

{code:python}
from pyspark.sql import Row
from pyspark.sql.types import *

schema = StructType([
    StructField("id", StringType()),
    StructField("first_name", StringType())])
row = Row(id="39", first_name="Szymon")
schema.toInternal(row)
# returns ('Szymon', '39'): values follow the sorted field names, not the schema order
{code}

{code:python}
df = sqlContext.createDataFrame([row], schema)
df.show(1)

+------+----------+
|    id|first_name|
+------+----------+
|Szymon|        39|
+------+----------+
{code}
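
The mismatch above can be modelled without Spark. The following is only a simplified sketch of the reported behaviour, not PySpark's actual code: `make_row`, `to_internal_positional` and `to_internal_by_name` are hypothetical helpers introduced for illustration. It assumes that the kwargs constructor sorts field names and that the schema reads values by position.

```python
# Simplified model of the behaviour reported above; this is NOT PySpark's code.
# All helper names here are hypothetical and exist only for illustration.

def make_row(**kwargs):
    """Mimic Row(**kwargs) as described in the report: names sorted alphabetically."""
    names = sorted(kwargs)
    values = tuple(kwargs[name] for name in names)
    return names, values


def to_internal_positional(schema_names, row):
    """Read the row purely by position, ignoring its field names (the reported bug)."""
    _, values = row
    return tuple(values)


def to_internal_by_name(schema_names, row):
    """Look each schema field up by name, so the row's own ordering does not matter."""
    names, values = row
    by_name = dict(zip(names, values))
    return tuple(by_name[name] for name in schema_names)


schema_names = ["id", "first_name"]
row = make_row(id="39", first_name="Szymon")

print(to_internal_positional(schema_names, row))  # ('Szymon', '39')
print(to_internal_by_name(schema_names, row))     # ('39', 'Szymon')
```

Until the ordering is made consistent, one possible workaround in PySpark itself is to build the row positionally in schema order, for example `Row("id", "first_name")("39", "Szymon")`, so that the result does not depend on how keyword arguments are sorted.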



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
