abstractdog commented on a change in pull request #2278:
URL: https://github.com/apache/hive/pull/2278#discussion_r657012890
##########
File path:
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ptf/VectorPTFGroupBatches.java
##########
@@ -858,6 +858,15 @@ private void runEvaluatorForRow(int evaluatorIndex,
VectorPTFEvaluatorBase evalu
Object result = null;
if (evaluator.canRunOptimizedCalculation(rowNum, range)) {
+ /*
+ * A classic evaluator (which doesn't take advantage of optimized calculation) usually
+ * evaluates its input expression in evaluateGroupBatch. The optimized calculation doesn't
+ * necessarily work on batches, but input expressions still have to be evaluated, so we take
+ * care of them here.
+ */
+ RowPositionInBatch rp = getPosition(rowNum);
+ evaluator.evaluateInputExpr(bufferedBatches.get(rp.batchIndex));
Review comment:
I need to rethink this... I can see two problems:
1. sum/avg run the optimized calculation but still end up calling
evaluateGroupBatch, which already evaluates the input expression, so this call
is not needed for them (I would say it is needed for lead/lag only). Maybe I
have to add another method that reflects this property of an evaluator.
2. This code evaluates the input expression only for the batch that contains
the current row. That can still be a problem if a lead function points to a
later record that lives in the next batch, for instance.
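
To make the two points concrete, here is a minimal standalone sketch, not the actual Hive code. Everything in it is hypothetical: the `Evaluator` interface, the `needsInputExprInOptimizedPath` flag (the "another method" from point 1), `rowOffset`, and `batchToEvaluate`. It also assumes that every buffered batch holds the same number of rows, which the real buffering may not guarantee. The idea is to evaluate the input expression for the batch that contains the row the evaluator actually reads, not the batch of the current row.

```java
final class LeadInputExprSketch {

  /** Hypothetical stand-in for an evaluator; not Hive's VectorPTFEvaluatorBase. */
  public interface Evaluator {
    /** Hypothetical flag: true if the optimized path skips evaluateGroupBatch (e.g. lead/lag). */
    boolean needsInputExprInOptimizedPath();

    /** Offset of the row being read, relative to the current row (+1 for lead(x, 1)). */
    int rowOffset();
  }

  /** Small factory so the example stays short. */
  public static Evaluator evaluator(final boolean needs, final int offset) {
    return new Evaluator() {
      @Override
      public boolean needsInputExprInOptimizedPath() {
        return needs;
      }

      @Override
      public int rowOffset() {
        return offset;
      }
    };
  }

  /**
   * Returns the index of the batch whose input expression has to be evaluated
   * before the optimized calculation runs for rowNum, or -1 if there is none.
   * Assumption: every buffered batch holds exactly batchSize rows.
   */
  public static int batchToEvaluate(Evaluator e, int rowNum, int batchSize, int rowCount) {
    if (!e.needsInputExprInOptimizedPath()) {
      // Point 1: the sum/avg style evaluator already went through evaluateGroupBatch.
      return -1;
    }
    int target = rowNum + e.rowOffset();
    if (target < 0 || target >= rowCount) {
      // The lead/lag target is outside the partition, so no batch needs evaluating.
      return -1;
    }
    // Point 2: use the batch of the target row, which may differ from the current row's batch.
    return target / batchSize;
  }

  public static void main(String[] args) {
    Evaluator lead1 = evaluator(true, 1);
    Evaluator sum = evaluator(false, 0);
    // Row 1023 is the last row of batch 0; lead(x, 1) reads row 1024, which is in batch 1.
    System.out.println(batchToEvaluate(lead1, 1023, 1024, 2048));
    System.out.println(batchToEvaluate(sum, 1023, 1024, 2048));
  }
}
```

In the patch itself, the equivalent would presumably be to call getPosition on the target row instead of rowNum and to guard the evaluateInputExpr call with the new evaluator property, but that depends on details not visible in this hunk.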
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.