Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24105 )

Change subject: IMPALA-2083: Add PIVOT clause
......................................................................


Patch Set 9:

(9 comments)

I still need some more time to look into the details. Left some comments for 
discussion first.

http://gerrit.cloudera.org:8080/#/c/24105/3/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/24105/3/fe/src/main/cup/sql-parser.cup@3525
PS3, Line 3525: ident_or_unreserved
> Thanks! I think nested fields can be extracted in a subquery.
If we don't support this, let's mention it in the commit message.


http://gerrit.cloudera.org:8080/#/c/24105/9/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

http://gerrit.cloudera.org:8080/#/c/24105/9/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@625
PS9, Line 625:         if (stmt_ instanceof SelectStmt) {
It'd be more tidy to use a method like requiresSubqueryRewrite(). Then we can 
also remove the changed_ flag used here.


http://gerrit.cloudera.org:8080/#/c/24105/3/fe/src/main/java/org/apache/impala/analysis/PivotTableRef.java
File fe/src/main/java/org/apache/impala/analysis/PivotTableRef.java:

http://gerrit.cloudera.org:8080/#/c/24105/3/fe/src/main/java/org/apache/impala/analysis/PivotTableRef.java@96
PS3, Line 96:   // Creates an unanalyzed clone.
> Thanks! Changed to cloning the whole list.
Can we add a test for this?


http://gerrit.cloudera.org:8080/#/c/24105/3/fe/src/main/java/org/apache/impala/analysis/PivotTableRef.java@147
PS3, Line 147:           throw new AnalysisException("Duplicate column name: " 
+ fieldName);
> Thanks! I think they are supported in the new Patch Set.
The new patch set still cast the Expr to LiteralExpr which doesn't work for 
FunctionalCallExpr.

It'd be nice to support statements like

 SELECT v1, v2 FROM functional.alltypes PIVOT ( count(id) for month in 
(month(now()) AS m1, month(now())+1 AS m2) ) AS t;

month(now()) is evaluated to the current month.

This needs a better error message instead of hitting ClassCastException:

 SELECT v1, v2 FROM functional.alltypes PIVOT ( count(id) for month in (int_col 
AS v1, 2 AS v2) ) AS t;


http://gerrit.cloudera.org:8080/#/c/24105/9/fe/src/main/java/org/apache/impala/analysis/PivotTableRef.java
File fe/src/main/java/org/apache/impala/analysis/PivotTableRef.java:

http://gerrit.cloudera.org:8080/#/c/24105/9/fe/src/main/java/org/apache/impala/analysis/PivotTableRef.java@21
PS9, Line 21: import java.util.HashMap;
nit: unused import


http://gerrit.cloudera.org:8080/#/c/24105/9/fe/src/main/java/org/apache/impala/analysis/PivotTableRef.java@73
PS9, Line 73:   private final Map<String, FunctionCallExpr> analyzedAggList_ = 
new LinkedHashMap<>();
            :   private final Map<String, LiteralExpr> headerValueMap_ = new 
LinkedHashMap<>();
            :   private final Map<String, SlotDescriptor> groupingSlotMap_ = 
new LinkedHashMap<>();
nit: Let's add a comment mentioning what are the keys of these maps, e.g. 
analyzedAggList_ maps alias to the analyzed agg exprs.


http://gerrit.cloudera.org:8080/#/c/24105/9/fe/src/main/java/org/apache/impala/analysis/PivotTableRef.java@277
PS9, Line 277: uniqueAggAlias
nit: at first glance, it seems all items use the same alias. But later I 
realized we require writing alias for multiple aggs on L184. Maybe it's more 
clear by adding a comment above 'uniqueAggAlias' mentioning it's only used for 
single agg scenario.


http://gerrit.cloudera.org:8080/#/c/24105/9/fe/src/main/java/org/apache/impala/analysis/PivotTableRef.java@291
PS9, Line 291:     Expr headerSlotRef = 
transposeGroupingExprs.get(headerSlotRefIndex);
nit: not used


http://gerrit.cloudera.org:8080/#/c/24105/9/fe/src/main/java/org/apache/impala/analysis/PivotTableRef.java@298
PS9, Line 298:     for (SelectListItem aggItem : aggList_.getItems()) {
             :       String aggAlias = aggItem.getAlias() == null ? 
uniqueAggAlias : aggItem.getAlias();
             :       for (Map.Entry<String, LiteralExpr> entry : 
headerValueMap_.entrySet()) {
Oracle iterates the header values before the agg alias: 
https://docs.oracle.com/cd/E11882_01/server.112/e25554/analysis.htm#sthref1224
Same for SparkSQL: 
https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-pivot.html

It'd be better to be consistent with them.

SELECT * FROM person
    PIVOT (
        SUM(age) AS a, AVG(class) AS c
        FOR name IN ('John' AS john, 'Mike' AS mike)
    );
+------+-----------+---------+---------+---------+---------+
|  id  |  address  | john_a  | john_c  | mike_a  | mike_c  |
+------+-----------+---------+---------+---------+---------+
| 200  | Street 2  | NULL    | NULL    | NULL    | NULL    |
| 100  | Street 1  | 30      | 1.0     | NULL    | NULL    |
| 300  | Street 3  | NULL    | NULL    | 80      | 3.0     |
| 400  | Street 4  | NULL    | NULL    | NULL    | NULL    |
+------+-----------+---------+---------+---------+---------+



--
To view, visit http://gerrit.cloudera.org:8080/24105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib799b772be18d2a3db8983b24e1068e7dfdf1ca9
Gerrit-Change-Number: 24105
Gerrit-PatchSet: 9
Gerrit-Owner: Xuebin Su <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Xuebin Su <[email protected]>
Gerrit-Comment-Date: Wed, 13 May 2026 09:29:20 +0000
Gerrit-HasComments: Yes

Reply via email to