[
https://issues.apache.org/jira/browse/HIVE-24761?focusedWorklogId=585725&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-585725
]
ASF GitHub Bot logged work on HIVE-24761:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 20/Apr/21 11:35
Start Date: 20/Apr/21 11:35
Worklog Time Spent: 10m
Work Description: abstractdog commented on a change in pull request #2099:
URL: https://github.com/apache/hive/pull/2099#discussion_r616599645
##########
File path:
ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticColumn.txt
##########
@@ -34,20 +34,17 @@ public class <ClassName> extends VectorExpression {
private static final long serialVersionUID = 1L;
- private final int colNum1;
private final int colNum2;
Review comment:
I agree that the current solution is not really clean, since only the
first column is put into VectorExpression.
A couple of notes here, which need to be discussed before proceeding with
this huge refactor (which I'm happy to do once we're 100% certain about the
"perfect" solution):
1. Unary and binary are not enough: unfortunately, we have expressions
involving even more columns. This is not a problem, since we have the
language support for that :) ternary, quaternary...
2. What's confusing is how to show, with simple class names, that
unary/binary/... refers only to the input columns. An expression can
have constants too, e.g. in IfExprScalarScalar.txt:
```
this.arg1Column = arg1Column;
this.arg2Scalar = arg2Scalar;
this.arg3Scalar = arg3Scalar;
```
In our terminology here, this is a unary expression because it takes
arg1Column plus scalars, but in reality it's obviously not a unary function...
3. With subclasses, we'll have to implement a general
VectorExpression.setInputColumnNum(int i, int j, int k, ...vararg); otherwise,
we won't be able to change the input column numbers (which is important, as
this was the intention of this huge vector expression refactor). I think this
will work by simply overriding the vararg method in subclasses.
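A minimal sketch of the idea in point 3, offered only as an illustration. All
class names here (SketchVectorExpression and its subclasses) are hypothetical
and do not mirror Hive's actual VectorExpression API; the sketch assumes the
base class holds every input column number and that each subclass overrides
the vararg setter to enforce its own column arity.

```java
// Hypothetical sketch: these classes are NOT Hive's real VectorExpression API.
abstract class SketchVectorExpression {
    // Assumption: the base class stores all input column numbers, not only the first.
    protected int[] inputColumnNum = new int[0];

    // General vararg setter; subclasses override it to check their own arity.
    void setInputColumnNum(int... cols) {
        this.inputColumnNum = cols.clone();
    }

    int[] getInputColumnNum() {
        return inputColumnNum.clone();
    }
}

// Two input columns, in the spirit of a column-arithmetic-column expression.
class SketchBinaryColumnExpression extends SketchVectorExpression {
    @Override
    void setInputColumnNum(int... cols) {
        if (cols.length != 2) {
            throw new IllegalArgumentException(
                "expected 2 input columns, got " + cols.length);
        }
        super.setInputColumnNum(cols);
    }
}

// One input column plus two scalars, like the IfExprScalarScalar case in point 2:
// "unary" with respect to input columns, although the function takes three arguments.
class SketchColumnScalarScalarExpression extends SketchVectorExpression {
    final long arg2Scalar;
    final long arg3Scalar;

    SketchColumnScalarScalarExpression(int arg1Column, long arg2Scalar, long arg3Scalar) {
        this.arg2Scalar = arg2Scalar;
        this.arg3Scalar = arg3Scalar;
        setInputColumnNum(arg1Column);
    }

    @Override
    void setInputColumnNum(int... cols) {
        if (cols.length != 1) {
            throw new IllegalArgumentException(
                "expected 1 input column, got " + cols.length);
        }
        super.setInputColumnNum(cols);
    }
}
```

With this shape, a caller that only knows the base type can still remap the
input columns, e.g. expr.setInputColumnNum(3, 4), while a call with the wrong
number of columns fails early instead of silently dropping a column.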
Issue Time Tracking
-------------------
Worklog Id: (was: 585725)
Time Spent: 1.5h (was: 1h 20m)
> Vectorization: Support PTF - bounded start windows
> --------------------------------------------------
>
> Key: HIVE-24761
> URL: https://issues.apache.org/jira/browse/HIVE-24761
> Project: Hive
> Issue Type: Sub-task
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> {code}
> notVectorizedReason: PTF operator: *** only UNBOUNDED start frame is supported
> {code}
> Currently, bounded windows are not supported in VectorPTFOperator. If we
> simply remove the compile-time check:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java#L2911
> {code}
> if (!windowFrameDef.isStartUnbounded()) {
>   setOperatorIssue(functionName + " only UNBOUNDED start frame is supported");
>   return false;
> }
> {code}
> we get incorrect results, because the vectorized codepath completely
> ignores boundaries and simply iterates through all the input batches in
> [VectorPTFGroupBatches|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ptf/VectorPTFGroupBatches.java#L172]:
> {code}
> for (VectorPTFEvaluatorBase evaluator : evaluators) {
>   evaluator.evaluateGroupBatch(batch);
>   if (isLastGroupBatch) {
>     evaluator.doLastBatchWork();
>   }
> }
> {code}
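To illustrate why ignoring the start boundary gives wrong answers, here is a
small standalone sketch. It is not Hive code: the class WindowFrameSketch and
its methods are hypothetical, and it assumes a simple sum over a frame that
ends at the current row. The first method accumulates over every earlier row
(an UNBOUNDED PRECEDING start), while the second honors a bounded start of
the form ROWS BETWEEN n PRECEDING AND CURRENT ROW.

```java
// Hypothetical illustration only; none of this is taken from Hive.
class WindowFrameSketch {
    // Running sum with an UNBOUNDED PRECEDING start: every earlier row contributes.
    static long[] sumUnboundedStart(long[] values) {
        long[] out = new long[values.length];
        long running = 0;
        for (int i = 0; i < values.length; i++) {
            running += values[i];
            out[i] = running;
        }
        return out;
    }

    // Sum over ROWS BETWEEN `preceding` PRECEDING AND CURRENT ROW:
    // rows that fall out of the frame are subtracted again.
    static long[] sumBoundedStart(long[] values, int preceding) {
        long[] out = new long[values.length];
        long running = 0;
        for (int i = 0; i < values.length; i++) {
            running += values[i];
            int expired = i - preceding - 1; // index of the row that just left the frame
            if (expired >= 0) {
                running -= values[expired];
            }
            out[i] = running;
        }
        return out;
    }
}
```

For the input {1, 2, 3, 4} the unbounded variant yields {1, 3, 6, 10}, whereas
a frame of 1 PRECEDING yields {1, 3, 5, 7}. An evaluator that never drops rows
from the frame would return the first result for a query that asked for the
second.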