[ 
https://issues.apache.org/jira/browse/HIVE-24945?focusedWorklogId=613981&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613981
 ]

ASF GitHub Bot logged work on HIVE-24945:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/Jun/21 11:42
            Start Date: 23/Jun/21 11:42
    Worklog Time Spent: 10m 
      Work Description: abstractdog commented on a change in pull request #2278:
URL: https://github.com/apache/hive/pull/2278#discussion_r657012890



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ptf/VectorPTFGroupBatches.java
##########
@@ -858,6 +858,15 @@ private void runEvaluatorForRow(int evaluatorIndex, VectorPTFEvaluatorBase evalu
 
     Object result = null;
     if (evaluator.canRunOptimizedCalculation(rowNum, range)) {
+      /*
+       * A classic evaluator (which doesn't take advantage of optimized calculation) usually
+       * evaluates its input expression in evaluateGroupBatch. The optimized calculation doesn't
+       * necessarily work on batches, but input expressions still have to be evaluated, so we take
+       * care of them here.
+       */
+      RowPositionInBatch rp = getPosition(rowNum);
+      evaluator.evaluateInputExpr(bufferedBatches.get(rp.batchIndex));

Review comment:
       I need to rethink this. I can see two problems:
   
   1. sum/avg run the optimized calculation but still end up calling 
evaluateGroupBatch, which already evaluates the input expression, so this call 
is not needed for them (I would say it is needed for lead/lag only). Maybe I 
have to add another method to the evaluator that reflects this property.
   2. This code evaluates the input expression only for the batch that 
contains the current row. That can still be a problem if a lead function 
points to a later record that lives in the next batch, for instance.
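
To make point 2 concrete, here is a minimal standalone sketch, not Hive code. The class `LeadBatchSketch`, the `Position` type (a stand-in for `RowPositionInBatch`), and the helper `targetBatchIndex` are all hypothetical, and it assumes every buffered batch except possibly the last holds exactly `batchSize` rows. It also models lag as a negative offset purely for brevity.

```java
public class LeadBatchSketch {

  /** Hypothetical stand-in for RowPositionInBatch; only the fields used here. */
  public static final class Position {
    public final int batchIndex;
    public final int indexInBatch;

    Position(int batchIndex, int indexInBatch) {
      this.batchIndex = batchIndex;
      this.indexInBatch = indexInBatch;
    }
  }

  /** Assumption: every batch except possibly the last holds exactly batchSize rows. */
  public static Position getPosition(int rowNum, int batchSize) {
    return new Position(rowNum / batchSize, rowNum % batchSize);
  }

  /**
   * Index of the batch holding the row that lead (offset > 0) or lag (offset < 0)
   * reads, or -1 if that row falls outside the partition.
   */
  public static int targetBatchIndex(int rowNum, int offset, int batchSize, int totalRows) {
    int target = rowNum + offset;
    if (target < 0 || target >= totalRows) {
      return -1;
    }
    return getPosition(target, batchSize).batchIndex;
  }

  public static void main(String[] args) {
    int batchSize = 1024;
    int totalRows = 2000;
    int rowNum = 1023; // last row of batch 0

    int current = getPosition(rowNum, batchSize).batchIndex;
    int target = targetBatchIndex(rowNum, 1, batchSize, totalRows);

    // The two indexes differ, so evaluating the input expression only on the
    // current row's batch would leave the lead target's batch unevaluated.
    System.out.println("current batch = " + current + ", lead(1) target batch = " + target);
  }
}
```

For a row at the end of a batch the two indexes differ, which is the case where evaluating only `bufferedBatches.get(rp.batchIndex)` would not be enough.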
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 613981)
    Time Spent: 1h 20m  (was: 1h 10m)

> PTF: Support vectorization for lead/lag functions
> -------------------------------------------------
>
>                 Key: HIVE-24945
>                 URL: https://issues.apache.org/jira/browse/HIVE-24945
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
