[ https://issues.apache.org/jira/browse/HIVE-24945?focusedWorklogId=613981&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613981 ]
ASF GitHub Bot logged work on HIVE-24945:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/Jun/21 11:42
            Start Date: 23/Jun/21 11:42
    Worklog Time Spent: 10m
      Work Description: abstractdog commented on a change in pull request #2278:
URL: https://github.com/apache/hive/pull/2278#discussion_r657012890

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ptf/VectorPTFGroupBatches.java
##########
@@ -858,6 +858,15 @@ private void runEvaluatorForRow(int evaluatorIndex, VectorPTFEvaluatorBase evalu
       Object result = null;
       if (evaluator.canRunOptimizedCalculation(rowNum, range)) {
+        /*
+         * A classic evaluator (which doesn't take advantage of optimized calculation) usually
+         * evaluates its input expression in evaluateGroupBatch. The optimized calculation doesn't
+         * necessarily work on batches, but input expressions still have to be evaluated, so we take
+         * care of them here.
+         */
+        RowPositionInBatch rp = getPosition(rowNum);
+        evaluator.evaluateInputExpr(bufferedBatches.get(rp.batchIndex));

Review comment:
I need to rethink this...I can see two problems:
1. sum/avg runs optimized calculation but still ends up calling evaluateGroupBatch which takes care of this part, so not needed (so I would say this is for lead/lag only), so maybe I have to create another method reflecting this property of an evaluator
2. this code evaluates input expression for the same batch which contains the current row, but only for that one, which can be still problematic if a lead function points to a later record, which is present in the next batch for instance...

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 613981)
    Time Spent: 1h 20m  (was: 1h 10m)

> PTF: Support vectorization for lead/lag functions
> -------------------------------------------------
>
>                 Key: HIVE-24945
>                 URL: https://issues.apache.org/jira/browse/HIVE-24945
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
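
The second problem raised in the review comment (a lead function pointing to a record in a later batch) can be illustrated with a small standalone sketch. This is not Hive code: `LeadBatchSketch`, `RowPosition`, and `batchesToEvaluate` are hypothetical names, and the sketch assumes every buffered batch holds exactly `batchSize` rows, which the real `getPosition` in VectorPTFGroupBatches may not assume.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class LeadBatchSketch {

    // Hypothetical stand-in for the RowPositionInBatch type seen in the diff.
    static final class RowPosition {
        final int batchIndex;
        final int indexInBatch;

        RowPosition(int batchIndex, int indexInBatch) {
            this.batchIndex = batchIndex;
            this.indexInBatch = indexInBatch;
        }
    }

    // Assumption: all batches are full and have the same size.
    static RowPosition getPosition(int rowNum, int batchSize) {
        return new RowPosition(rowNum / batchSize, rowNum % batchSize);
    }

    // Returns the indexes of the batches whose input expression would have to be
    // evaluated for a lead (positive offset) or lag (negative offset) at rowNum:
    // the batch of the current row plus the batch of the target row, if it exists.
    static Set<Integer> batchesToEvaluate(int rowNum, int offset, int batchSize, int totalRows) {
        Set<Integer> batches = new LinkedHashSet<>();
        batches.add(getPosition(rowNum, batchSize).batchIndex);
        int target = rowNum + offset;
        if (target >= 0 && target < totalRows) {
            batches.add(getPosition(target, batchSize).batchIndex);
        }
        return batches;
    }

    public static void main(String[] args) {
        // Row 1023 is the last row of batch 0; lead(1) points to row 1024 in batch 1,
        // so evaluating only the current row's batch would miss the target row.
        System.out.println(batchesToEvaluate(1023, 1, 1024, 2048)); // prints [0, 1]
    }
}
```

Under these assumptions, any row within `offset` rows of a batch boundary needs a second batch evaluated, which is the case the patch as quoted does not cover.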