Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 )
Change subject: IMPALA-12737: List columns in profile and query history ...................................................................... Patch Set 26: (5 comments) http://gerrit.cloudera.org:8080/#/c/21142/26//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21142/26//COMMIT_MSG@9 PS26, Line 9: "Select Columns", "Where Columns", "Join Columns", "Aggregate : Columns", and "OrderBy Columns" > Should this add "Having Columns" as well? I thought it was only used with "GROUP BY", but looks like it can be used separately. Currently Aggregate serves as an umbrella for both. I'll consider it. http://gerrit.cloudera.org:8080/#/c/21142/26/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/21142/26/common/thrift/Frontend.thrift@705 PS26, Line 705: // Columns referenced in a select list. : 21: optional list<string> select_columns : : // Columns referenced in a where clause. : 22: optional list<string> where_columns : : // Columns referenced in a join clause. : 23: optional list<string> join_columns : : // Columns referenced in an aggregation. : 24: optional list<string> aggregate_columns : : // Columns referenced in an order by clause. : 25: optional list<string> orderby_columns > I think there should be limit on maximum column names to log, or total leng Perhaps we don't want that in the profile. I think in the query log table the intention is to be exhaustive. http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@4488 PS26, Line 4488: Stream<Path> > Is there benefit of passing Stream<Path> argument instead of Set<String>? Mostly avoiding allocating another Set. But I do agree it's an unusual pattern. http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@383 PS26, Line 383: Stream<SelectListItem> nonStarItems = : selectList_.getItems().stream().filter(elem -> !elem.isStar()); : nonStarItems.forEach(item -> item.getExpr().collect(SlotRef.class, slotRefs)); > Can be combined into one? Will do. http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@387 PS26, Line 387: .filter(path -> path != null) > Why is the filtering applied here and not in the concatenated Stream? Probably not needed for concatenated stream, but wouldn't hurt. There are a few test failures I need to look into that could be related. -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 26 Gerrit-Owner: Michael Smith <michael.sm...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Jason Fehr <jf...@cloudera.com> Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Comment-Date: Mon, 15 Apr 2024 23:05:47 +0000 Gerrit-HasComments: Yes