[
https://issues.apache.org/jira/browse/HIVE-24761?focusedWorklogId=597561&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-597561
]
ASF GitHub Bot logged work on HIVE-24761:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 17/May/21 11:34
Start Date: 17/May/21 11:34
Worklog Time Spent: 10m
Work Description: abstractdog commented on a change in pull request #2099:
URL: https://github.com/apache/hive/pull/2099#discussion_r633453402
##########
File path:
ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticColumn.txt
##########
@@ -34,20 +34,17 @@ public class <ClassName> extends VectorExpression {
private static final long serialVersionUID = 1L;
- private final int colNum1;
private final int colNum2;
Review comment:
what do you think about this @ramesh0201?
I think a general input col array would be nice (option b) )
however, there are some rare cases where it's not obvious which position should
be used, but it's up to agreement e.g.:
IfExprScalarColumn.txt
```
protected final int arg1Column;
protected final <OperandType2> arg2Scalar;
protected final int arg3Column;
```
this is tricky because there is a scalar interleaved between the columns; the
input col array might look like:
1. new int[] { arg1Column, -1, arg3Column};
to emphasize that the second argument is a scalar, so we'll refactor as:
```
arg3Column => inputColumnNums[2]
```
2. new int[] { arg1Column, arg3Column, -1};
to ignore the fact that there is an interleaved scalar input, so we'll
refactor as:
```
arg3Column => inputColumnNums[1]
```
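A minimal sketch of the two layouts above, assuming -1 is the sentinel for a non-column argument as in both examples. The class and method names (`InputColumnLayouts`, `positional`, `packed`) are hypothetical and not part of Hive; they only show where arg3Column ends up in each case.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Hive: illustrates the two candidate layouts
// for an input column array when a scalar sits between two column arguments.
public class InputColumnLayouts {
  // Assumed sentinel meaning "this slot does not hold a column number".
  static final int NOT_A_COLUMN = -1;

  // Option 1: one slot per argument position; the scalar's slot holds the sentinel.
  static int[] positional(int arg1Column, int arg3Column) {
    return new int[] { arg1Column, NOT_A_COLUMN, arg3Column };
  }

  // Option 2: column numbers packed to the front; the trailing slot holds the sentinel.
  static int[] packed(int arg1Column, int arg3Column) {
    return new int[] { arg1Column, arg3Column, NOT_A_COLUMN };
  }

  public static void main(String[] args) {
    // Example column numbers 4 and 7 are arbitrary.
    System.out.println(Arrays.toString(positional(4, 7)));
    System.out.println(Arrays.toString(packed(4, 7)));
  }
}
```

With option 1 the array index matches the argument position, so a reader can tell that argument 2 is a scalar; with option 2 the index only counts column inputs, so the mapping back to argument positions has to be agreed on separately.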
Issue Time Tracking
-------------------
Worklog Id: (was: 597561)
Time Spent: 2h 40m (was: 2.5h)
> Vectorization: Support PTF - bounded start windows
> --------------------------------------------------
>
> Key: HIVE-24761
> URL: https://issues.apache.org/jira/browse/HIVE-24761
> Project: Hive
> Issue Type: Sub-task
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
> {code}
> notVectorizedReason: PTF operator: *** only UNBOUNDED start frame is supported
> {code}
> Currently, bounded windows are not supported in VectorPTFOperator. If we
> simply remove the compile-time check:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java#L2911
> {code}
> if (!windowFrameDef.isStartUnbounded()) {
>   setOperatorIssue(functionName + " only UNBOUNDED start frame is supported");
>   return false;
> }
> {code}
> we get incorrect results, because the vectorized codepath completely
> ignores boundaries and simply iterates through all the input batches in
> [VectorPTFGroupBatches|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ptf/VectorPTFGroupBatches.java#L172]:
> {code}
> for (VectorPTFEvaluatorBase evaluator : evaluators) {
>   evaluator.evaluateGroupBatch(batch);
>   if (isLastGroupBatch) {
>     evaluator.doLastBatchWork();
>   }
> }
> {code}
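To illustrate why ignoring the frame start gives wrong answers, here is a minimal sketch that compares a sum over an unbounded start frame with a sum over a frame bounded to a fixed number of preceding rows. `FrameSketch` and its methods are hypothetical and are not Hive code; the sketch works on a plain array within one partition rather than on vectorized batches.

```java
// Hypothetical illustration, not Hive code: the same input produces different
// sums depending on whether the frame start is unbounded or bounded.
public class FrameSketch {
  // SUM over ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (a running total).
  static long[] unboundedStartSum(long[] values) {
    long[] out = new long[values.length];
    long running = 0;
    for (int i = 0; i < values.length; i++) {
      running += values[i];
      out[i] = running;
    }
    return out;
  }

  // SUM over ROWS BETWEEN <preceding> PRECEDING AND CURRENT ROW.
  static long[] boundedStartSum(long[] values, int preceding) {
    long[] out = new long[values.length];
    for (int i = 0; i < values.length; i++) {
      // The frame start is clamped to the first row of the partition.
      int start = Math.max(0, i - preceding);
      long sum = 0;
      for (int j = start; j <= i; j++) {
        sum += values[j];
      }
      out[i] = sum;
    }
    return out;
  }

  public static void main(String[] args) {
    long[] values = { 1, 2, 3, 4 };
    System.out.println(java.util.Arrays.toString(unboundedStartSum(values)));
    System.out.println(java.util.Arrays.toString(boundedStartSum(values, 1)));
  }
}
```

For the input 1, 2, 3, 4 the unbounded frame yields 1, 3, 6, 10 while a frame of one preceding row yields 1, 3, 5, 7, so an evaluator that always accumulates over everything seen so far returns the first series even when the query asked for the second.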