Khalid Mammadov created SPARK-45782:
---------------------------------------

             Summary: Add Dataframe API df.explainString()
                 Key: SPARK-45782
                 URL: https://issues.apache.org/jira/browse/SPARK-45782
             Project: Spark
          Issue Type: Improvement
          Components: Connect, PySpark, Spark Core
    Affects Versions: 4.0.0
            Reporter: Khalid Mammadov


This is a frequently needed feature for performance optimization. Users 
often want to inspect the explain output of running systems, and so would 
like to save or extract it for later analysis.

The current API is only available in Scala, i.e. 

{{df.queryExecution.toString()}}

and it is not located in an intuitive place where the average Spark user 
(i.e. someone who is not an expert or a Scala developer) can find it immediately.

It would also spare users from workarounds such as capturing the printed output with
{code:python}
with io.StringIO() as buf ...:
    df.explain(True)
{code}
 

So it would help users a lot to have this output available as:

df.explainString()

i.e. next to

df.explain()

so users can easily locate and use it.
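
For illustration, the stdout-capturing workaround mentioned above could look roughly like the sketch below. It assumes that explain() writes its output through Python's sys.stdout, as PySpark's DataFrame.explain() does when it prints the plan. FakeDataFrame and explain_to_string are hypothetical names used only to keep the example runnable without a Spark session; they are not part of any Spark API.

```python
import contextlib
import io


class FakeDataFrame:
    """Hypothetical stand-in for a PySpark DataFrame; it only mimics explain() printing."""

    def explain(self, extended=False):
        # Like DataFrame.explain(), this prints the plan and returns None.
        print("== Physical Plan ==")
        if extended:
            print("(extended details would appear here)")


def explain_to_string(df, extended=False):
    """Capture whatever df.explain() prints to stdout and return it as a str."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        df.explain(extended)
    return buf.getvalue()


# Usage: the captured text can now be logged or saved for later analysis.
plan_text = explain_to_string(FakeDataFrame(), True)
```

With the proposed API, a helper like this would presumably no longer be needed, since df.explainString() would return the text directly.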

 



