peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r321960752
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala
 ##########
 @@ -245,6 +253,141 @@ case class FilterExec(condition: Expression, child: SparkPlan)
   }
 }
 
+/**
+ * Physical plan node for a recursive relation that encapsulates the physical plan of the anchor
+ * term and the logical plan of the recursive term.
+ *
+ * Anchor is used to initialize the query in the first run.
+ * Recursive term is used to extend the result with new rows. They are logical plans and contain
+ * references to the result of the previous iteration or to the so far cumulated result. These
+ * references are updated with new statistics and data and then compiled to physical plan before
+ * execution.
+ *
+ * The execution terminates once the anchor term or the current iteration of the recursive term
+ * returns no rows.
+ *
+ * During the execution of a recursive query the previously computed results are reused multiple
+ * times. To avoid massive recomputation of these pieces they are cached.
+ *
+ * @param cteName the name of the recursive relation
+ * @param anchorTerm this child is used for initializing the query
+ * @param output the attributes of the recursive relation
+ */
+case class RecursiveRelationExec(
+    cteName: String,
+    anchorTerm: SparkPlan,
+    output: Seq[Attribute],
+    @transient queryExecution: QueryExecution) extends SparkPlan {
+  @transient
+  lazy val logicalRecursiveTerm = logicalLink.get.asInstanceOf[RecursiveRelation].recursiveTerm
+
+  override def children: Seq[SparkPlan] = anchorTerm :: Nil
+
+  override def innerChildren: Seq[QueryPlan[_]] = logicalRecursiveTerm +: 
super.innerChildren
+
+  override def stringArgs: Iterator[Any] = Iterator(cteName, output)
+
+  private val physicalRecursiveTerms = new ConcurrentLinkedQueue[SparkPlan]
 
 Review comment:
   You are right, it doesn't need to be concurrent.
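
   For illustration only, here is a minimal sketch of the iteration the Scaladoc above describes, written against plain Scala collections rather than Spark plans. It is not the PR's implementation: `RecursiveSketch`, `run`, `step` and `maxIterations` are hypothetical names, and the sketch assumes results are only ever appended from a single thread, which is why an `ArrayBuffer` stands in for the `ConcurrentLinkedQueue`.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for the operator described in the Scaladoc; not Spark code.
object RecursiveSketch {
  // Starts from the rows of `anchor`, then applies `step` to the rows of the
  // previous iteration until an iteration returns no rows. `maxIterations` is
  // an assumed safety limit, not something taken from the PR.
  def run[T](anchor: Seq[T], maxIterations: Int = 100)(step: Seq[T] => Seq[T]): Seq[T] = {
    // Only this loop appends to the buffer, so no concurrent collection is needed.
    val iterations = ArrayBuffer[Seq[T]]()
    var current = anchor
    var i = 0
    while (current.nonEmpty && i < maxIterations) {
      iterations += current
      current = step(current)
      i += 1
    }
    // The accumulated result is the union of all non-empty iterations.
    iterations.flatten.toSeq
  }
}
```

   For example, `RecursiveSketch.run(Seq(1))(_.map(_ + 1).filter(_ <= 5))` collects the rows 1 through 5 and stops when the step produces nothing new.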

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
