scarlin-cloudera commented on code in PR #4442:
URL: https://github.com/apache/hive/pull/4442#discussion_r1320426662


##########
ql/src/test/results/clientpositive/llap/tablevalues.q.out:
##########
@@ -175,14 +175,27 @@ STAGE PLANS:
         TableScan
           alias: mytbl_n1
           Select Operator
-            expressions: 
array(struct(key,value,'A',10,key),struct(key,value,'B',20,key)) (type: 
array<struct<col1:string,col2:string,col3:string,col4:int,col5:string>>)
+            expressions: key (type: string)
             outputColumnNames: _col0
-            UDTF Operator
-              function name: inline
+            Lateral View Forward
               Select Operator
-                expressions: col3 (type: string), col4 (type: int), col5 
(type: string)
-                outputColumnNames: _col0, _col1, _col2
-                ListSink
+                Lateral View Join Operator
+                  outputColumnNames: _col2, _col3, _col4
+                  Select Operator
+                    expressions: _col2 (type: string), _col3 (type: int), 
_col4 (type: string)
+                    outputColumnNames: _col0, _col1, _col2
+                    ListSink
+              Select Operator
+                expressions: array(struct('A',10,_col0),struct('B',20,_col0)) 
(type: array<struct<col1:string,col2:int,col3:string>>)
+                outputColumnNames: _col0
+                UDTF Operator
+                  function name: inline
+                  Lateral View Join Operator
+                    outputColumnNames: _col2, _col3, _col4
+                    Select Operator
+                      expressions: _col2 (type: string), _col3 (type: int), 
_col4 (type: string)
+                      outputColumnNames: _col0, _col1, _col2
+                      ListSink

Review Comment:
   Ok, so this one is interesting and leaves us with a few choices. 
   
   This regression is indeed because of the introduction of lateral in 
HIveTableFunctionScan. The previous iteration was doing an extra performance 
enhancement that handled this case as well as others.  The addition of the 
"lateral" function allowed the ASTConverter to convert the RelNodes back to its 
original AST with the lateral keyword.
   
   The regression here is that the original code had one special case that went 
through CBO.  When the lateral view contained an "Inline(array(...))", it went 
through the CBO path. When it got converted back to an AST, there was no code 
to handle the lateral syntax, so it did the optimization to remove the Lateral 
View Operator.
   
   I have thought of 3 options on how to handle this.
   1) Leave the regression in and handle this in a separate Jira.  The separate 
Jira can handle all the other places where the Lateral View Operator can be 
removed and it can be done in a generic way.  It would be too much to combine 
with this review, so it should be done separately.  We can probably get away 
with the regression because the "inline(array(..))" entails that the rows are 
specified within the "array" function, which would limit the number of rows to 
be processed.  It would probably never be used with more than tens of rows, so 
the performance regression would never be felt by an end user.
   2) Roll back one previous version in this pull request which does not have 
the regression.  I don't think I like this option because your previous 
suggestions did make the code better
   3) Put in some special case logic to ensure the "lateral" function does not 
get added in this one special cases.  I included this in the most recent pull 
request.  I'm not sure I like it because it does add special case code here as 
well as a new rule that was created.
   
   I'll let you call which is best.  If we go with option 3, you can review the 
most recent code in this pull request.  If you go with option 1), I'll roll 
back that one change and remove the rule that handles the special case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For additional commands, e-mail: gitbox-h...@hive.apache.org

Reply via email to