dilipbiswal commented on issue #24759: [SPARK-27395][SQL][WIP] Improve EXPLAIN command
URL: https://github.com/apache/spark/pull/24759#issuecomment-521896970
 
 
   @cloud-fan 
   > I'm OK to stay with the TreeNode string methods for now, but I do think we should refactor it to a visitor pattern in the future. It's more flexible and maintainable. We can put the related code in one place instead of spreading it to many places.
   
   Ok.
   
   > BTW I have a concern about subqueries. How do we handle nested subqueries? It looks like the current format will flatten the nested subqueries, and put them all together in the subqueries section.
   
   So Wenchen, when we print the subqueries, we also print the hosting operator 
id and the hosting expression id, so each entry in the flat subqueries section 
can be traced back to the operator it is nested under. This is an example of 
nested subquery output that I had sent to Maryann when she was helping me 
review the code. In fact, the idea to remove the subquery plan from the main 
query was hers.
   
   ```
   Input Query :
   sql(
     """
       |EXPLAIN SELECT * from l
       |where
       |a = (select max(c) from r where d = 1
       |     and c = (select max(c) from
       |              x where d = 1))
     """.stripMargin).show(false)
   |== Physical Plan ==
   Project (3)
   +- Filter (2)
      +- LocalTableScan (1)
   
   (1) LocalTableScan 
   Output: [_1#220, _2#221]
        
   (2) Filter [codegen id : 1]
   Input     : [_1#220, _2#221]
   Condition : (isnotnull(_1#220) AND (_1#220 = Subquery subquery269))
        
   (3) Project [codegen id : 1]
   Input: [_1#220, _2#221]
        
   ===== Subqueries ==
   Subquery#1 Hosting operator id = 2 Expression = Subquery subquery269
   HashAggregate (9)
   +- Exchange (8)
      +- HashAggregate (7)
         +- Project (6)
            +- Filter (5)
               +- LocalTableScan (4)
   
   (4) LocalTableScan 
   Output: [_1#231, _2#232]
        
   (5) Filter [codegen id : 1]
   Input     : [_1#231, _2#232]
   Condition : (((isnotnull(_2#232) AND isnotnull(_1#231)) AND (_2#232 = 1.0)) AND (_1#231 = Subquery subquery268))
        
   (6) Project [codegen id : 1]
   Input: [_1#231, _2#232]
        
   (7) HashAggregate [codegen id : 1]
   Input: [c#236]
        
   (8) Exchange 
   Input: [max#277]
        
   (9) HashAggregate [codegen id : 2]
   Input: [max#277]
        
   Subquery#2 Hosting operator id = 5 Expression = Subquery subquery268
   HashAggregate (15)
   +- Exchange (14)
      +- HashAggregate (13)
         +- Project (12)
            +- Filter (11)
               +- LocalTableScan (10)
   
   (10) LocalTableScan 
   Output: [_1#242, _2#243]
        
   (11) Filter [codegen id : 1]
   Input     : [_1#242, _2#243]
   Condition : (isnotnull(_2#243) AND (_2#243 = 1.0))
        
   (12) Project [codegen id : 1]
   Input: [_1#242, _2#243]
        
   (13) HashAggregate [codegen id : 1]
   Input: [c#247]
        
   (14) Exchange 
   Input: [max#279]
        
   (15) HashAggregate [codegen id : 2]
   Input: [max#279]
   ```
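
To make the numbering in the output above easier to follow, here is a rough sketch of how a flat subqueries section can still keep the nesting information. It is a toy model only and not the code in this PR: `Op`, `SubqueryExpr`, `Hosted`, `aggOver` and `explain` are hypothetical names, and the assumption is that operators are numbered children-first and that subqueries are printed in the order they are discovered.

```scala
import scala.collection.mutable

object ExplainSketch {
  // Toy stand-ins for physical operators; these are NOT Spark classes.
  final case class SubqueryExpr(name: String, plan: Op)
  final case class Op(
      name: String,
      children: List[Op] = Nil,
      subqueries: List[SubqueryExpr] = Nil)

  // One pending entry for the flattened "Subqueries" section.
  final case class Hosted(hostId: Int, expr: String, plan: Op)

  // Helper that builds the aggregate-over-filter shape seen in the output above.
  def aggOver(filter: Op): Op =
    Op("HashAggregate", List(
      Op("Exchange", List(
        Op("HashAggregate", List(
          Op("Project", List(filter))))))))

  def explain(root: Op): List[String] = {
    var nextId = 0
    val out = mutable.ListBuffer.empty[String]
    val pending = mutable.Queue.empty[Hosted]

    // Children are numbered before their parent, so the leaf gets the lowest id.
    def number(op: Op): Unit = {
      op.children.foreach(number)
      nextId += 1
      val id = nextId
      out += s"($id) ${op.name}"
      // Remember which operator hosts each subquery expression.
      op.subqueries.foreach(sq => pending.enqueue(Hosted(id, sq.name, sq.plan)))
    }

    number(root)
    var n = 0
    // Nested subqueries found while numbering a subquery plan are appended
    // to the same queue, which is what flattens them into one section.
    while (pending.nonEmpty) {
      val h = pending.dequeue()
      n += 1
      out += s"Subquery#$n Hosting operator id = ${h.hostId} Expression = Subquery ${h.expr}"
      number(h.plan)
    }
    out.toList
  }
}
```

With a plan shaped like the query above, the inner subquery ends up hosted by operator 5, which belongs to the first subquery rather than to the main plan. That is how a reader can reconstruct the nesting from the flat list.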
   

