Takuya Ueshin created SPARK-42998:
-------------------------------------

             Summary: Fix DataFrame.collect with null struct.
                 Key: SPARK-42998
                 URL: https://issues.apache.org/jira/browse/SPARK-42998
             Project: Spark
          Issue Type: Sub-task
          Components: Connect
    Affects Versions: 3.4.0
            Reporter: Takuya Ueshin


In Spark Connect, DataFrame.collect returns an empty Row for a null struct value instead of None:

{code:python}
>>> df = spark.sql("values (1, struct('a' as x)), (null, null) as t(a, b)")
>>> df.show()
+----+----+
|   a|   b|
+----+----+
|   1| {a}|
|null|null|
+----+----+

>>> df.collect()
[Row(a=1, b=Row(x='a')), Row(a=None, b=<Row()>)]
{code}

whereas regular (non-Connect) PySpark returns None for the same DataFrame:

{code:python}
>>> df.collect()
[Row(a=1, b=Row(x='a')), Row(a=None, b=None)]
{code}
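
The expected behavior can be illustrated with a minimal, Spark-free sketch. This is not the actual Spark Connect conversion code: `struct_to_row` and `RowLike` are hypothetical names, and struct cells are assumed to arrive as plain dicts (or None for a null struct). The point is only that the null check has to happen before a row object is built.

```python
from collections import namedtuple
from typing import Optional, Sequence


def struct_to_row(value: Optional[dict], field_names: Sequence[str]) -> Optional[tuple]:
    """Hypothetical converter: turn one struct cell into a row-like tuple."""
    # A null struct must stay None; building a row from it anyway is what
    # would produce an empty row instead of None.
    if value is None:
        return None
    # Stand-in for pyspark.sql.Row, so the sketch runs without Spark.
    RowLike = namedtuple("RowLike", field_names)
    return RowLike(*(value[name] for name in field_names))


# Column b from the example above: one real struct and one null struct.
cells = [{"x": "a"}, None]
converted = [struct_to_row(cell, ["x"]) for cell in cells]
print(converted)  # [RowLike(x='a'), None]
```

With the null check in place, the second cell comes back as None, matching the PySpark output above.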




