Ryan Blue created SPARK-20213:
---------------------------------
Summary: DataFrameWriter operations do not show up in SQL tab
Key: SPARK-20213
URL: https://issues.apache.org/jira/browse/SPARK-20213
Project: Spark
Issue Type: Bug
Components: SQL, Web UI
Affects Versions: 2.1.0, 2.0.2
Reporter: Ryan Blue
In 1.6.1, {{DataFrame}} writes started with {{DataFrameWriter}} actions like
{{insertInto}} showed up in the SQL tab. In 2.0.0 and later, they no longer
do. The problem is that 2.0.0 and later no longer wrap the execution in
{{SQLExecution.withNewExecutionId}}, which emits
{{SparkListenerSQLExecutionStart}}.
Here are the relevant parts of the stack traces:
{code:title=Spark 1.6.1}
org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
org.apache.spark.sql.execution.QueryExecution$$anonfun$toRdd$1.apply(QueryExecution.scala:56)
org.apache.spark.sql.execution.QueryExecution$$anonfun$toRdd$1.apply(QueryExecution.scala:56)
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:53)
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:56)
=> holding Monitor(org.apache.spark.sql.hive.HiveContext$QueryExecution@424773807})
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:196)
{code}
{code:title=Spark 2.0.0}
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
=> holding Monitor(org.apache.spark.sql.execution.QueryExecution@490977924})
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:301)
{code}
I think this was introduced by
[54d23599|https://github.com/apache/spark/commit/54d23599]. The fix should be
to add {{withNewExecutionId}} to
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala#L610
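To illustrate why the missing wrapper hides the write from the SQL tab, here is a minimal, self-contained toy model of the pattern. It does not use Spark: the event types, the {{events}} buffer and both {{insertInto*}} helpers are hypothetical stand-ins, and only the name {{withNewExecutionId}} is borrowed from the report. It is a sketch of the idea under those assumptions, not the actual patch.

```scala
import scala.collection.mutable.ArrayBuffer

// Toy model only: every type here is a hypothetical stand-in, not a Spark class.
object ExecutionIdSketch {
  sealed trait Event
  final case class ExecutionStart(id: Long, description: String) extends Event
  final case class ExecutionEnd(id: Long) extends Event

  // Stand-in for the listener bus that a UI tab would read from.
  val events: ArrayBuffer[Event] = ArrayBuffer.empty[Event]
  private var nextId = 0L

  // Posts a start event before running `body` and an end event afterwards,
  // even if `body` throws.
  def withNewExecutionId[T](description: String)(body: => T): T = {
    val id = nextId
    nextId += 1
    events += ExecutionStart(id, description)
    try body
    finally events += ExecutionEnd(id)
  }

  // Unwrapped write: does the work but posts nothing, so a UI driven by
  // `events` never learns that it happened.
  def insertIntoUnwrapped(table: String): String = s"wrote to $table"

  // Wrapped write: the same work, now bracketed by start and end events.
  def insertIntoWrapped(table: String): String =
    withNewExecutionId(s"insertInto $table") { insertIntoUnwrapped(table) }

  def main(args: Array[String]): Unit = {
    insertIntoUnwrapped("t")
    println(s"events after unwrapped write: ${events.size}")
    insertIntoWrapped("t")
    println(s"events after wrapped write: ${events.size}")
  }
}
```

In this model the unwrapped call leaves {{events}} untouched, while the wrapped call adds a start and an end event. That mirrors the difference between the two stack traces above: the 1.6.1 trace passes through {{withNewExecutionId}} and the 2.0.0 trace does not.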