Ryan Blue created SPARK-24215:
---------------------------------

             Summary: Implement __repr__ and _repr_html_ for DataFrames in PySpark
                 Key: SPARK-24215
                 URL: https://issues.apache.org/jira/browse/SPARK-24215
             Project: Spark
          Issue Type: Improvement
          Components: PySpark, SQL
    Affects Versions: 2.3.0
            Reporter: Ryan Blue
             Fix For: 2.4.0


To help people who are new to Spark get feedback more easily, we should 
implement the repr methods used by Jupyter Python kernels. That way, when users 
run PySpark in a Jupyter console or notebook, they get useful feedback.

That output should be controlled by an option for eager evaluation, such as 
spark.jupyter.eager-eval. When it is set, the formatting methods would run the 
DataFrame and produce output like {{show}}. This strikes a balance between not 
hiding Spark's action behavior and getting feedback to users who don't know 
that they need to call an action.
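
As a rough illustration of the idea (not the proposed PySpark implementation), 
the sketch below uses a toy class in plain Python. The names LazyFrame, 
eager_eval, max_rows and actions_run are all hypothetical; eager_eval stands in 
for an option like spark.jupyter.eager-eval, and _collect only simulates 
running an action. It assumes that Jupyter falls back to the plain repr when 
_repr_html_ returns None.

```python
import html


class LazyFrame:
    """Toy stand-in for a DataFrame; this is NOT the PySpark class."""

    def __init__(self, columns, rows, eager_eval=False, max_rows=20):
        self.columns = list(columns)
        self._rows = [tuple(r) for r in rows]
        # Hypothetical flag standing in for an option like spark.jupyter.eager-eval.
        self.eager_eval = eager_eval
        self.max_rows = max_rows
        # Counts simulated actions so the cost of a repr is visible.
        self.actions_run = 0

    def _collect(self, n):
        # Simulates running an action that returns at most n rows.
        self.actions_run += 1
        return self._rows[:n]

    def __repr__(self):
        if not self.eager_eval:
            # Lazy mode: describe the frame without running anything.
            return "LazyFrame[%s]" % ", ".join(self.columns)
        # Eager mode: run the frame and render a show-like text table.
        rows = [[str(v) for v in r] for r in self._collect(self.max_rows)]
        widths = [
            max([len(c)] + [len(r[i]) for r in rows])
            for i, c in enumerate(self.columns)
        ]
        sep = "+" + "+".join("-" * w for w in widths) + "+"

        def fmt(cells):
            return "|" + "|".join(c.rjust(w) for c, w in zip(cells, widths)) + "|"

        return "\n".join([sep, fmt(self.columns), sep] + [fmt(r) for r in rows] + [sep])

    def _repr_html_(self):
        if not self.eager_eval:
            # Returning None signals that no HTML representation is available.
            return None
        rows = self._collect(self.max_rows)
        head = "".join("<th>%s</th>" % html.escape(c) for c in self.columns)
        body = "".join(
            "<tr>%s</tr>" % "".join("<td>%s</td>" % html.escape(str(v)) for v in r)
            for r in rows
        )
        return "<table><tr>%s</tr>%s</table>" % (head, body)
```

With eager_eval left off, displaying the object runs nothing, so the lazy 
behavior stays visible; turning it on makes every repr call trigger one 
simulated action, which is the trade-off described above.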

Here's the dev list thread for context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/eager-execution-and-debuggability-td23928.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
