[ https://issues.apache.org/jira/browse/SPARK-31903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-31903:
------------------------------------

    Assignee:     (was: Apache Spark)

> toPandas with Arrow enabled doesn't show metrics in Query UI.
> -------------------------------------------------------------
>
>                 Key: SPARK-31903
>                 URL: https://issues.apache.org/jira/browse/SPARK-31903
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, R
>    Affects Versions: 2.4.5, 3.0.0
>            Reporter: Takuya Ueshin
>            Priority: Major
>         Attachments: Screen Shot 2020-06-03 at 4.47.07 PM.png, Screen Shot 2020-06-03 at 4.47.27 PM.png
>
>
> When calling {{toPandas}}, usually Query UI shows each plan node's metric and
> corresponding Stage ID and Task ID:
> {code:java}
> >>> df = spark.createDataFrame([(1, 10, 'abc'), (2, 20, 'def')], schema=['x', 'y', 'z'])
> >>> df.toPandas()
>    x   y    z
> 0  1  10  abc
> 1  2  20  def
> {code}
> !Screen Shot 2020-06-03 at 4.47.07 PM.png!
> but if Arrow execution is enabled, it shows only plan nodes and the duration
> is not correct:
> {code:java}
> >>> spark.conf.set('spark.sql.execution.arrow.pyspark.enabled', True)
> >>> df.toPandas()
>    x   y    z
> 0  1  10  abc
> 1  2  20  def{code}
>
> !Screen Shot 2020-06-03 at 4.47.27 PM.png!