[
https://issues.apache.org/jira/browse/HIVE-18524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16376866#comment-16376866
]
Matt McCline commented on HIVE-18524:
-------------------------------------
-1. I'm at a total loss to understand why it is necessary to inject this IF
vector expression inside the VectorUDFAdaptor class.
I'm going to come up with an alternative design for IF statements that have
expressions in the THEN or ELSE that have a vector expression that doesn't use
the VectorUDFAdaptor.
> Vectorization: Execution failure related to non-standard embedding of
> IfExprConditionalFilter inside VectorUDFAdaptor (Revert HIVE-17139)
> -----------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-18524
> URL: https://issues.apache.org/jira/browse/HIVE-18524
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 3.0.0
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18524.01.patch, HIVE-18524.02.patch
>
>
> {noformat}
> insert overwrite table insert_10_1
> select cast(gpa as float),
> age,
> IF(age>40,cast('2011-01-01 01:01:01' as timestamp),NULL),
> IF(LENGTH(name)>10,cast(name as binary),NULL)
> from studentnull10k
> vectorizationSchemaColumns: [0:name:string, 1:age:int, 2:gpa:double]
> ExprNodeDescs:
> UDFToFloat(gpa) (type: float),
> age (type: int),
> if((age > 40), 2011-01-01 01:01:01.0, null) (type: timestamp),
> if((length(name) > 10), CAST( name AS BINARY), null) (type: binary)
> selectExpressions:
> VectorUDFAdaptor(if((age > 40), 2011-01-01 01:01:01.0, null))
> (children: LongColGreaterLongScalar(col 1:int, val 40) -> 4:boolean)
> -> 5:timestamp,
> VectorUDFAdaptor(if((length(name) > 10), CAST( name AS BINARY), null))
> (children: LongColGreaterLongScalar(col 4:int, val 10)(children:
> StringLength(col 0:string) -> 4:int) -> 6:boolean,
> VectorUDFAdaptor(CAST( name AS BINARY)) -> 7:binary) -> 8:binary
> {noformat}
> *// Notice there is no vector expression shown for the last IF stmt.* It has
> been magically embedded inside the VectorUDFAdaptor object...
> Execution results in this call stack.
> {nocode}
> Caused by: java.lang.NullPointerException
> at java.util.Arrays.copyOfRange(Arrays.java:3521)
> at
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$9.writeValue(VectorExpressionWriterFactory.java:1101)
> at
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterBytes.writeValue(VectorExpressionWriterFactory.java:343)
> at
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123)
> at
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:211)
> at
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:177)
> at
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:145)
> ... 22 more
> {nocode}
> Change is due to:
> HIVE-17139: Conditional expressions optimization: skip the expression
> evaluation if the condition is not satisfied for vectorization engine. (Jia
> Ke, reviewed by Ferdinand Xu)
> Embedding a raw vector expression outside of VectorizationContext is quite
> non-standard and evidently buggy.
> [~Ferd] [~Ke Jia] I am inclined to revert this change. Comments? CC:
> [~ashutoshc] [~hagleitn]
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)