[ 
https://issues.apache.org/jira/browse/SPARK-31903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127028#comment-17127028
 ] 

Apache Spark commented on SPARK-31903:
--------------------------------------

User 'ueshin' has created a pull request for this issue:
https://github.com/apache/spark/pull/28740

> toPandas with Arrow enabled doesn't show metrics in Query UI.
> -------------------------------------------------------------
>
>                 Key: SPARK-31903
>                 URL: https://issues.apache.org/jira/browse/SPARK-31903
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, R
>    Affects Versions: 2.4.5, 3.0.0
>            Reporter: Takuya Ueshin
>            Assignee: Takuya Ueshin
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: Screen Shot 2020-06-03 at 4.47.07 PM.png, Screen Shot 
> 2020-06-03 at 4.47.27 PM.png
>
>
> When calling {{toPandas}}, the Query UI usually shows each plan node's metrics 
> and the corresponding Stage ID and Task ID:
> {code:python}
> >>> df = spark.createDataFrame([(1, 10, 'abc'), (2, 20, 'def')], schema=['x', 'y', 'z'])
> >>> df.toPandas()
>    x   y    z
> 0  1  10  abc
> 1  2  20  def
> {code}
> !Screen Shot 2020-06-03 at 4.47.07 PM.png!
> However, if Arrow execution is enabled, the Query UI shows only the plan nodes, 
> without their metrics, and the reported duration is incorrect:
> {code:python}
> >>> spark.conf.set('spark.sql.execution.arrow.pyspark.enabled', True)
> >>> df.toPandas()
>    x   y    z
> 0  1  10  abc
> 1  2  20  def
> {code}
>  
> !Screen Shot 2020-06-03 at 4.47.27 PM.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
