[ https://issues.apache.org/jira/browse/SPARK-41824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sandeep Singh updated SPARK-41824:
----------------------------------
Description:
{code:java}
File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", line 1296, in pyspark.sql.connect.dataframe.DataFrame.explain
Failed example:
df.explain()
Expected:
== Physical Plan ==
*(1) Scan ExistingRDD[age...,name...]
Got:
== Physical Plan ==
LocalTableScan [age#1148L, name#1149]
<BLANKLINE>
<BLANKLINE>
**********************************************************************
File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", line 1314, in pyspark.sql.connect.dataframe.DataFrame.explain
Failed example:
df.explain(mode="formatted")
Expected:
== Physical Plan ==
* Scan ExistingRDD (...)
(1) Scan ExistingRDD [codegen id : ...]
Output [2]: [age..., name...]
...
Got:
== Physical Plan ==
LocalTableScan (1)
<BLANKLINE>
<BLANKLINE>
(1) LocalTableScan
Output [2]: [age#1170L, name#1171]
Arguments: [age#1170L, name#1171]
<BLANKLINE>
<BLANKLINE>{code}
was:
{code:java}
File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", line 254, in pyspark.sql.connect.dataframe.DataFrame.drop
Failed example:
df.join(df2, df.name == df2.name, 'inner').drop('name').show()
Exception raised:
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/doctest.py", line 1350, in __run
exec(compile(example.source, filename, "single",
File "<doctest pyspark.sql.connect.dataframe.DataFrame.drop[5]>", line 1, in <module>
df.join(df2, df.name == df2.name, 'inner').drop('name').show()
File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", line 534, in show
print(self._show_string(n, truncate, vertical))
File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", line 423, in _show_string
).toPandas()
File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", line 1031, in toPandas
return self._session.client.to_pandas(query)
File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/client.py", line 413, in to_pandas
return self._execute_and_fetch(req)
File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/client.py", line 573, in _execute_and_fetch
self._handle_error(rpc_error)
File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/client.py", line 619, in _handle_error
raise SparkConnectAnalysisException(
pyspark.sql.connect.client.SparkConnectAnalysisException: [AMBIGUOUS_REFERENCE] Reference `name` is ambiguous, could be: [`name`, `name`].
Plan: {code}
> Implement DataFrame.explain format to be similar to PySpark
> -----------------------------------------------------------
>
> Key: SPARK-41824
> URL: https://issues.apache.org/jira/browse/SPARK-41824
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Sandeep Singh
> Priority: Major
>
> {code:java}
> File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", line 1296, in pyspark.sql.connect.dataframe.DataFrame.explain
> Failed example:
> df.explain()
> Expected:
> == Physical Plan ==
> *(1) Scan ExistingRDD[age...,name...]
> Got:
> == Physical Plan ==
> LocalTableScan [age#1148L, name#1149]
> <BLANKLINE>
> <BLANKLINE>
> **********************************************************************
> File "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", line 1314, in pyspark.sql.connect.dataframe.DataFrame.explain
> Failed example:
> df.explain(mode="formatted")
> Expected:
> == Physical Plan ==
> * Scan ExistingRDD (...)
> (1) Scan ExistingRDD [codegen id : ...]
> Output [2]: [age..., name...]
> ...
> Got:
> == Physical Plan ==
> LocalTableScan (1)
> <BLANKLINE>
> <BLANKLINE>
> (1) LocalTableScan
> Output [2]: [age#1170L, name#1171]
> Arguments: [age#1170L, name#1171]
> <BLANKLINE>
> <BLANKLINE>{code}
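
The first failure above can be reproduced without Spark, because it comes down to how doctest compares the expected text with the actual text. The following is a minimal sketch using only the standard-library doctest module. It assumes the ELLIPSIS and NORMALIZE_WHITESPACE option flags are in effect (the report does not state which flags the test run used). WANT and GOT are copied from the df.explain() failure; CLASSIC_LIKE is a hypothetical string with made-up expression IDs, included only to show what the expected pattern would accept.

```python
import doctest

# Expected and actual output, copied from the failing df.explain() example.
WANT = "== Physical Plan ==\n*(1) Scan ExistingRDD[age...,name...]\n"
GOT = "== Physical Plan ==\nLocalTableScan [age#1148L, name#1149]\n\n\n"

# Hypothetical output (made-up expression IDs) that fits the expected pattern.
CLASSIC_LIKE = "== Physical Plan ==\n*(1) Scan ExistingRDD[age#0L,name#1]\n\n\n"


def matches(want: str, got: str) -> bool:
    """Compare output the way doctest does, under the assumed option flags."""
    flags = doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE
    return doctest.OutputChecker().check_output(want, got, flags)


if __name__ == "__main__":
    print("Connect output matches:", matches(WANT, GOT))
    print("Pattern-conforming output matches:", matches(WANT, CLASSIC_LIKE))
```

The "..." wildcards only absorb the expression IDs. The literal "*(1) Scan ExistingRDD" prefix still has to appear, so a plan rooted at LocalTableScan cannot match regardless of whitespace handling.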