dtenedor commented on code in PR #42174:
URL: https://github.com/apache/spark/pull/42174#discussion_r1279997355


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/FunctionTableSubqueryArgumentExpression.scala:
##########
@@ -61,5 +75,36 @@ case class FunctionTableSubqueryArgumentExpression(
   final override def nodePatternsInternal: Seq[TreePattern] =
     Seq(FUNCTION_TABLE_RELATION_ARGUMENT_EXPRESSION)
 
-  lazy val evaluable: LogicalPlan = 
Project(Seq(Alias(CreateStruct(plan.output), "c")()), plan)
+  def hasRepartitioning: Boolean = withSinglePartition || 
partitionByExpressions.nonEmpty
+
+  lazy val evaluable: LogicalPlan = {
+    val subquery = if (hasRepartitioning) {
+      // If the TABLE argument includes the WITH SINGLE PARTITION or PARTITION 
BY or ORDER BY
+      // clause(s), add a corresponding logical operator to represent the 
repartitioning operation
+      // in the query plan.
+      RepartitionForTableFunctionCall(

Review Comment:
   Good idea, I updated the code to use this, now we don't need to add a new 
operator!



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/FunctionTableSubqueryArgumentExpression.scala:
##########
@@ -32,11 +32,22 @@ import org.apache.spark.sql.types.DataType
  * table in the catalog. In the latter case, the relation argument comprises
  * a table subquery that may itself refer to one or more tables in its own
  * FROM clause.
+ *
+ * Each TABLE argument may also optionally include a PARTITION BY clause. If 
present, these indicate
+ * how to logically split up the input relation such that the table-valued 
function evaluates
+ * exactly once for each partition, and returns the union of all results. If 
no partitioning list is
+ * present, this splitting of the input relation is undefined. Furthermore, if 
the PARTITION BY
+ * clause includes a following ORDER BY clause, Catalyst will sort the rows in 
each partition such
+ * that the table-valued function receives them one-by-one in the requested 
order. Otherwise, if no
+ * such ordering is specified, the ordering of rows within each partition is 
undefined.

Review Comment:
   Sounds good, done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to