[GitHub] [spark] EnricoMi commented on a diff in pull request #37407: [SPARK-39876][SQL] Add UNPIVOT to SQL syntax

GitBox Tue, 27 Sep 2022 03:52:35 -0700


EnricoMi commented on code in PR #37407:
URL: https://github.com/apache/spark/pull/37407#discussion_r981083737



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##########
@@ -1098,6 +1106,87 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
     }
   }
 
+  /**
+   * Add an [[Unpivot]] to a logical plan.
+   */
+  private def withUnpivot(
+      ctx: UnpivotClauseContext,
+      query: LogicalPlan): LogicalPlan = withOrigin(ctx) {
+    // this is needed to create unpivot and to filter unpivot for nulls further down
+    val valueColumnNames =
+      Option(ctx.unpivotOperator().unpivotSingleValueColumnClause())
+        .map(_.unpivotValueColumn().identifier().getText)
+        .map(Seq(_))
+      .getOrElse(
+        Option(ctx.unpivotOperator().unpivotMultiValueColumnClause())
+          .map(_.unpivotValueColumns.asScala.map(_.identifier().getText).toSeq)
+          .get
+      )
+
+    val unpivot = if (ctx.unpivotOperator().unpivotSingleValueColumnClause() != null) {
+      val unpivotClause = ctx.unpivotOperator().unpivotSingleValueColumnClause()
+      val variableColumnName = unpivotClause.unpivotNameColumn().identifier().getText
+      val unpivotColumns = unpivotClause.unpivotColumns.asScala.map(visitUnpivotColumn).toSeq
+
+      Unpivot(
+        None,
+        Some(unpivotColumns.map(Seq(_))),
+        None,
+        variableColumnName,
+        valueColumnNames,
+        query
+      )
+    } else {
+      val unpivotClause = ctx.unpivotOperator().unpivotMultiValueColumnClause()
+      val variableColumnName = unpivotClause.unpivotNameColumn().identifier().getText
+      val (unpivotColumns, unpivotAliases) =
+        unpivotClause.unpivotColumnSets.asScala.map(visitUnpivotColumnSet).toSeq.unzip
+
+      Unpivot(
+        None,
+        Some(unpivotColumns),
+        Some(unpivotAliases),
+        variableColumnName,
+        valueColumnNames,
+        query
+      )
+    }
+
+    // exclude null values
+    val filtered = if (ctx.nullOperator != null && ctx.nullOperator.EXCLUDE() != null) {
+      Filter(IsNotNull(Coalesce(valueColumnNames.map(UnresolvedAttribute(_)))), unpivot)

Review Comment:
   [Oracle](https://www.oracletutorial.com/oracle-basics/oracle-unpivot/) is not specific about this, nor does it provide an example that shows NULL values with multiple unpivot columns:
   
       The EXCLUDE NULLS clause, on the other hand, eliminates null-valued rows from the returned result set.
   
   
   [BigQuery](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#unpivot_operator):
   
       EXCLUDE NULLS: Do not add rows with NULL values to the result.
   
   Judging from [this Stack Overflow question](https://stackoverflow.com/questions/10747355/oracle-11g-unpivot-multiple-columns-and-include-column-name), Oracle appears to behave the same way.
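
   For illustration, here is a minimal plain-Scala sketch (no Spark involved; `UnpivotNullsSketch`, `UnpivotedRow` and `excludeNulls` are hypothetical names that are not part of this PR) of what the `IsNotNull(Coalesce(...))` filter in the diff does when there are multiple value columns. `coalesce` returns its first non-null argument, so a row is dropped only when every value column is NULL. The sketch models only that filter; it makes no claim about how Oracle or BigQuery implement EXCLUDE NULLS.

```scala
object UnpivotNullsSketch {
  // Hypothetical model of one unpivoted row: the name column plus one
  // Option per value column (None stands in for SQL NULL).
  final case class UnpivotedRow(name: String, values: Seq[Option[Int]])

  // Mirrors IsNotNull(Coalesce(valueColumns)): coalesce is non-null as soon
  // as one argument is non-null, so only all-NULL rows are removed.
  def excludeNulls(rows: Seq[UnpivotedRow]): Seq[UnpivotedRow] =
    rows.filter(_.values.exists(_.isDefined))

  val rows: Seq[UnpivotedRow] = Seq(
    UnpivotedRow("q1", Seq(Some(10), Some(20))), // no NULLs: kept
    UnpivotedRow("q2", Seq(None, Some(5))),      // partially NULL: kept
    UnpivotedRow("q3", Seq(None, None))          // all NULL: dropped
  )

  // Names of the rows that survive the filter.
  val kept: Seq[String] = excludeNulls(rows).map(_.name)
}
```

   Under this model a partially NULL row such as `q2` survives, which is the behaviour in question for the multi-column case.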



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

