Khalid Mammadov created SPARK-45782:
---------------------------------------
Summary: Add Dataframe API df.explainString()
Key: SPARK-45782
URL: https://issues.apache.org/jira/browse/SPARK-45782
Project: Spark
Issue Type: Improvement
Components: Connect, PySpark, Spark Core
Affects Versions: 4.0.0
Reporter: Khalid Mammadov
This feature is frequently needed for performance optimization. Users
often want to inspect the explain output of a running system, and would like
to save or extract it for later analysis.
The current API is only available in Scala, i.e.
{{df.queryExecution.toString()}}
and it is not located in an intuitive place where the average Spark user
(i.e. not an expert or Scala developer) can find it immediately.
It would also spare users workarounds such as capturing the printed output with
{code:python}
with io.StringIO() as buf ...:
    df.explain(True)
{code}
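For reference, the workaround could be spelled out roughly as follows. This is only a sketch: {{explain_to_string}} is a hypothetical helper name, not an existing Spark API, and it assumes that {{df.explain()}} writes its output with Python's {{print()}}, so that redirecting stdout is enough to capture it.

```python
import contextlib
import io


def explain_to_string(df, extended=True):
    """Return the text that df.explain(extended) would print.

    Hypothetical helper illustrating the stdout-capture workaround;
    it is not part of the Spark API.
    """
    buf = io.StringIO()
    # Assumption: explain() emits the plan via Python's print(), so
    # temporarily redirecting sys.stdout captures everything it writes.
    with contextlib.redirect_stdout(buf):
        df.explain(extended)
    return buf.getvalue()
```

A caller would then write something like {{plan = explain_to_string(df)}} and log or store {{plan}}. It works, but it is indirect and hard to discover, which is the motivation for a dedicated method.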
So it would help users a lot to have this output available as:
df.explainString()
i.e. next to
df.explain()
so that users can easily locate and use it.