cloud-fan commented on a change in pull request #31968:
URL: https://github.com/apache/spark/pull/31968#discussion_r603071304
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
##########
@@ -223,11 +224,18 @@ class Dataset[T] private[sql](
@transient private[sql] val logicalPlan: LogicalPlan = {
// For various commands (like DDL) and queries with side effects, we force query execution
// to happen right away to let these side effects take place eagerly.
+ def eagerRun(plan: LogicalPlan): LogicalPlan = {
+ val relation =
+ LocalRelation(plan.output, withAction("command", queryExecution)(_.executeCollect()))
Review comment:
The problem is that we submit two Spark jobs (one invoking the command and one
collecting the `LocalRelation`), and it's unclear which of them should be wrapped
in a SQL execution.
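
To make the ambiguity concrete, here is a toy sketch that uses no Spark APIs. `withTrackedExecution`, `submitJob`, and `JobRecord` are hypothetical stand-ins invented for illustration; they are not what `withAction` or Spark's SQL execution tracking actually do. It only shows how the choice of wrapping changes which job gets attributed to an execution.

```scala
import scala.collection.mutable.ArrayBuffer

// Toy model only: none of these names are Spark APIs.
object ExecutionTrackingSketch {
  final case class JobRecord(name: String, executionId: Option[Long])

  val submittedJobs = ArrayBuffer.empty[JobRecord]
  private var nextExecutionId = 0L
  private var currentExecutionId: Option[Long] = None

  // Hypothetical stand-in for wrapping a body in a tracked SQL execution.
  def withTrackedExecution[A](body: => A): A = {
    val previous = currentExecutionId
    currentExecutionId = Some(nextExecutionId)
    nextExecutionId += 1
    try body finally currentExecutionId = previous
  }

  // Hypothetical stand-in for submitting a job; it records the execution
  // (if any) that is active at submission time.
  def submitJob(name: String): Unit =
    submittedJobs += JobRecord(name, currentExecutionId)

  def main(args: Array[String]): Unit = {
    // Option A: only the command is wrapped, so the collect job is untracked.
    withTrackedExecution { submitJob("run command") }
    submitJob("collect LocalRelation")

    // Option B: a single execution wraps both jobs.
    withTrackedExecution {
      submitJob("run command")
      submitJob("collect LocalRelation")
    }

    submittedJobs.foreach(println)
  }
}
```

In option A the second job has no execution attached; in option B both jobs share one. Which of these (or a third arrangement) is right for `eagerRun` is exactly the open question.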