Dong Wang created SPARK-30444:
---------------------------------

Summary: The same job will be computated for many times when using Dataset.show()
Key: SPARK-30444
URL: https://issues.apache.org/jira/browse/SPARK-30444
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 2.4.4, 2.4.3
Reporter: Dong Wang

When I run the example sql.SparkSQLExample, df.show() at line 60 triggers an action. I noticed that this single call creates 5 jobs, all of which have the same lineage graph, with the same RDDs and the same call stacks. That suggests Spark recomputes the same job 5 times. By contrast, sqlDF.show() at line 123 creates only 1 job. I don't know what happens inside show() at line 60 to cause the difference.
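
One way to reproduce the job counts described above is to register a SparkListener and count job-start events around a single show() call. The following is only a minimal sketch, not the code from the example itself: the object name ShowJobCount, the local[*] master, the one-second waits, and the input path (which assumes the program is run from the root of a Spark source checkout) are all assumptions made for illustration.

```scala
import java.util.concurrent.atomic.AtomicInteger

import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart}
import org.apache.spark.sql.SparkSession

// Hypothetical helper object, not part of the Spark examples.
object ShowJobCount {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ShowJobCount")
      .master("local[*]") // assumption: local mode is enough to observe job counts
      .getOrCreate()

    // Counts every job submitted to the scheduler while the listener is registered.
    val jobsStarted = new AtomicInteger(0)
    spark.sparkContext.addSparkListener(new SparkListener {
      override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
        jobsStarted.incrementAndGet()
      }
    })

    // Assumption: run from the root of a Spark checkout, where this sample file lives.
    val df = spark.read.json("examples/src/main/resources/people.json")

    // Reading JSON without an explicit schema may itself run a job (schema inference).
    // Listener events are delivered asynchronously, so wait briefly, then reset the
    // counter so that only jobs started by show() are counted.
    Thread.sleep(1000)
    jobsStarted.set(0)

    df.show()

    // Give the listener bus time to deliver the job-start events before reading.
    Thread.sleep(1000)
    println(s"Jobs started by df.show(): ${jobsStarted.get()}")

    spark.stop()
  }
}
```

Running the same measurement around sqlDF.show() would allow the two counts from the report to be compared directly. The fixed sleeps are a crude way to wait for asynchronous listener events; the job list in the Spark web UI shows the same information.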