Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21142 )

Change subject: IMPALA-12737: List columns in profile and query history
......................................................................


Patch Set 26:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21142/26//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21142/26//COMMIT_MSG@9
PS26, Line 9: "Select Columns", "Where Columns", "Join Columns", "Aggregate
            : Columns", and "OrderBy Columns"
> Should this add "Having Columns" as well?
I thought it was only used with "GROUP BY", but looks like it can be used 
separately. Currently Aggregate serves as an umbrella for both. I'll consider 
it.


http://gerrit.cloudera.org:8080/#/c/21142/26/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/21142/26/common/thrift/Frontend.thrift@705
PS26, Line 705:   // Columns referenced in a select list.
              :   21: optional list<string> select_columns
              :
              :   // Columns referenced in a where clause.
              :   22: optional list<string> where_columns
              :
              :   // Columns referenced in a join clause.
              :   23: optional list<string> join_columns
              :
              :   // Columns referenced in an aggregation.
              :   24: optional list<string> aggregate_columns
              :
              :   // Columns referenced in an order by clause.
              :   25: optional list<string> orderby_columns
> I think there should be limit on maximum column names to log, or total leng
Perhaps we don't want that in the profile. I think in the query log table the 
intention is to be exhaustive.


http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@4488
PS26, Line 4488: Stream<Path>
> Is there benefit of passing Stream<Path> argument instead of Set<String>?
Mostly avoiding allocating another Set. But I do agree it's an unusual pattern.


http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@383
PS26, Line 383: Stream<SelectListItem> nonStarItems =
              :           selectList_.getItems().stream().filter(elem -> 
!elem.isStar());
              :       nonStarItems.forEach(item -> 
item.getExpr().collect(SlotRef.class, slotRefs));
> Can be combined into one?
Will do.


http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@387
PS26, Line 387: .filter(path -> path != null)
> Why is the filtering applied here and not in the concatenated Stream?
Probably not needed for concatenated stream, but wouldn't hurt. There are a few 
test failures I need to look into that could be related.



--
To view, visit http://gerrit.cloudera.org:8080/21142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7
Gerrit-Change-Number: 21142
Gerrit-PatchSet: 26
Gerrit-Owner: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Jason Fehr <jf...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Comment-Date: Mon, 15 Apr 2024 23:05:47 +0000
Gerrit-HasComments: Yes

Reply via email to