beliefer opened a new pull request, #39436:
URL: https://github.com/apache/spark/pull/39436
### What changes were proposed in this pull request?
Currently, the output of the explain API differs between PySpark, Scala and
Connect.
Given a DataFrame created with
`df = spark.createDataFrame([(14, "Tom"), (23, "Alice"), (16, "Bob")],
["age", "name"])`
executing
`df.explain()`
in PySpark prints the following:
```
== Physical Plan ==
*(1) Scan ExistingRDD[age...,name...]
```
But the Scala and Connect APIs produce different output:
```
== Physical Plan ==
LocalTableScan [age#1148L, name#1149]
<BLANKLINE>
<BLANKLINE>
```
A similar issue occurs when executing `df.explain(mode="formatted")`.
The output is actually an implementation detail of PySpark, and it would be
difficult to make it match across APIs. So this PR ignores the two doctests.
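As a minimal sketch of the approach, a backend-dependent doctest can be excluded from execution with the standard `# doctest: +SKIP` directive from Python's `doctest` module (the docstring below is a hypothetical illustration, not the actual PySpark docstring touched by this PR):

```python
import doctest

def explain_demo():
    """Illustrative docstring with a skipped, backend-dependent example.

    >>> df.explain()  # doctest: +SKIP
    == Physical Plan ==
    *(1) Scan ExistingRDD[age..., name...]
    """

# The skipped example is never executed, so the undefined `df`
# raises no NameError and the doctest run reports zero failures.
results = doctest.testmod()
print(results.failed)
```

Skipping sidesteps the mismatch entirely, at the cost of no longer verifying the plan text in the documentation.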
### Why are the changes needed?
Currently, the output of the explain API differs between PySpark, Scala and
Connect.
This PR ignores the two doctests.
### Does this PR introduce _any_ user-facing change?
'No'. This is a test-only change.
### How was this patch tested?
Manual tests.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]