[ 
https://issues.apache.org/jira/browse/SPARK-31903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takuya Ueshin updated SPARK-31903:
----------------------------------
    Description: 
When calling {{toPandas}}, the Query UI usually shows each plan node's metrics 
and the corresponding Stage ID and Task ID:
{code:java}
>>> df = spark.createDataFrame([(1, 10, 'abc'), (2, 20, 'def')], schema=['x', 'y', 'z'])
>>> df.toPandas()
   x   y    z
0  1  10  abc
1  2  20  def
{code}
!Screen Shot 2020-06-03 at 4.47.07 PM.png!

 

However, when Arrow execution is enabled, it shows only the plan nodes, without the metrics:
{code:java}
>>> spark.conf.set('spark.sql.execution.arrow.pyspark.enabled', True)
>>> df.toPandas()
   x   y    z
0  1  10  abc
1  2  20  def
{code}
!Screen Shot 2020-06-03 at 4.47.27 PM.png!


> toPandas with Arrow enabled doesn't show metrics in Query UI.
> -------------------------------------------------------------
>
>                 Key: SPARK-31903
>                 URL: https://issues.apache.org/jira/browse/SPARK-31903
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.4.5, 3.0.0
>            Reporter: Takuya Ueshin
>            Priority: Major
>         Attachments: Screen Shot 2020-06-03 at 4.47.07 PM.png, Screen Shot 
> 2020-06-03 at 4.47.27 PM.png


