dilipbiswal commented on issue #24759: [SPARK-27395][SQL][WIP] Improve EXPLAIN command
URL: https://github.com/apache/spark/pull/24759#issuecomment-521896970

@cloud-fan

> I'm OK to stay with the TreeNode string methods for now, but I do think we should refactor it to a visitor pattern in the future. It's more flexible and maintainable. We can put the related code in one place instead of spreading it to many places.

OK.

> BTW I have a concern about subqueries. How do we handle nested subqueries? It looks like the current format will flatten the nested subqueries, and put them all together in the subqueries section.

Wenchen, when we print the subqueries, we also print the hosting operator id and the hosting expression for each one, so the nesting can still be traced from the flattened list. Below is an example of nested subquery output that I sent to Maryann while she was helping me review the code. In fact, the idea of removing the subquery plans from the main query plan was hers. In this output, Subquery#1 is hosted by operator 2 of the main plan, and Subquery#2 is hosted by operator 5, which belongs to Subquery#1.

```
Input Query :
sql(
  """
    |EXPLAIN SELECT * from l
    |where
    |a = (select max(c) from r where d = 1
    | and c = (select max(c) from
    | x where d = 1))
  """.stripMargin).show(false)

|== Physical Plan ==
Project (3)
+- Filter (2)
   +- LocalTableScan (1)

(1) LocalTableScan
Output: [_1#220, _2#221]

(2) Filter [codegen id : 1]
Input : [_1#220, _2#221]
Condition : (isnotnull(_1#220) AND (_1#220 = Subquery subquery269))

(3) Project [codegen id : 1]
Input: [_1#220, _2#221]

===== Subqueries ==

Subquery#1 Hosting operator id = 2 Expression = Subquery subquery269
HashAggregate (9)
+- Exchange (8)
   +- HashAggregate (7)
      +- Project (6)
         +- Filter (5)
            +- LocalTableScan (4)

(4) LocalTableScan
Output: [_1#231, _2#232]

(5) Filter [codegen id : 1]
Input : [_1#231, _2#232]
Condition : (((isnotnull(_2#232) AND isnotnull(_1#231)) AND (_2#232 = 1.0)) AND (_1#231 = Subquery subquery268))

(6) Project [codegen id : 1]
Input: [_1#231, _2#232]

(7) HashAggregate [codegen id : 1]
Input: [c#236]

(8) Exchange
Input: [max#277]

(9) HashAggregate [codegen id : 2]
Input: [max#277]

Subquery#2 Hosting operator id = 5 Expression = Subquery subquery268
HashAggregate (15)
+- Exchange (14)
   +- HashAggregate (13)
      +- Project (12)
         +- Filter (11)
            +- LocalTableScan (10)

(10) LocalTableScan
Output: [_1#242, _2#243]

(11) Filter [codegen id : 1]
Input : [_1#242, _2#243]
Condition : (isnotnull(_2#243) AND (_2#243 = 1.0))

(12) Project [codegen id : 1]
Input: [_1#242, _2#243]

(13) HashAggregate [codegen id : 1]
Input: [c#247]

(14) Exchange
Input: [max#279]

(15) HashAggregate [codegen id : 2]
Input: [max#279]
```
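
To make the flattening concrete, here is a minimal sketch of how nested subqueries could be listed with their hosting operator ids. It is an illustration only and not the code in this PR: `Op`, `SubqueryEntry`, and `ExplainSketch` are hypothetical names rather than Spark classes, and the post-order numbering (children before parents, continuing across subqueries) is an assumption inferred from the example output.

```scala
// Hypothetical minimal plan model for illustration; these are NOT Spark's classes.
final case class Op(name: String, children: List[Op] = Nil, subqueries: List[Op] = Nil)

// One line of the flattened "Subqueries" section.
final case class SubqueryEntry(index: Int, hostingOperatorId: Int, plan: Op)

object ExplainSketch {
  // Numbers the operators of one plan in post-order, starting at `nextId`.
  // Returns the next free id and the subqueries found, each paired with
  // the id of the operator that hosts it.
  private def number(op: Op, nextId: Int): (Int, List[(Int, Op)]) = {
    val (afterChildren, fromChildren) =
      op.children.foldLeft((nextId, List.empty[(Int, Op)])) {
        case ((id, acc), child) =>
          val (next, subs) = number(child, id)
          (next, acc ++ subs)
      }
    val myId = afterChildren
    (myId + 1, fromChildren ++ op.subqueries.map(s => (myId, s)))
  }

  // Flattens all subqueries, nested ones included, into a single list.
  // Nested subqueries are appended to the work queue, so each one is
  // reported once together with the id of its hosting operator.
  def flatten(root: Op): List[SubqueryEntry] = {
    @scala.annotation.tailrec
    def loop(pending: List[(Int, Op)], nextId: Int, acc: List[SubqueryEntry]): List[SubqueryEntry] =
      pending match {
        case Nil => acc.reverse
        case (hostId, plan) :: rest =>
          val (next, nested) = number(plan, nextId)
          loop(rest ++ nested, next, SubqueryEntry(acc.size + 1, hostId, plan) :: acc)
      }
    val (next, topLevel) = number(root, 1)
    loop(topLevel, next, Nil)
  }
}
```

With a plan shaped like the example query (a `Filter` hosting one subquery whose own `Filter` hosts another), `ExplainSketch.flatten` yields two entries with hosting operator ids 2 and 5, which is the information a reader needs to reconstruct the nesting from the flat list.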