[jira] [Commented] (HIVE-20563) Exception in vectorization execution of CASE statement

Jesus Camacho Rodriguez (JIRA) Fri, 14 Sep 2018 13:50:26 -0700


    [ 
https://issues.apache.org/jira/browse/HIVE-20563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16615371#comment-16615371
 ]


Jesus Camacho Rodriguez commented on HIVE-20563:
------------------------------------------------

This is the explain vectorization plan:

{code}
PLAN VECTORIZATION:
  enabled: true
  enabledConditionsMet: [hive.vectorized.execution.enabled IS true]

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
            TableScan Vectorization:
                native: true
              Filter Vectorization:
                  className: VectorFilterOperator
                  native: true
                  predicateExpression: SelectColumnIsNull(col 5:double)
                Select Vectorization:
                    className: VectorSelectOperator
                    native: true
                    projectedOutputColumnNums: [6, 2, 4, 1, 23]
                    selectExpressions: IfExprColumnCondExpr(col 13:boolean, col 6:stringcol 22:string)(children: IsNotNull(col 6:string) -> 13:boolean, col 6:string, VectorUDFAdaptor(CASE WHEN (cint is not null) THEN (cint) WHEN (cfloat is not null) THEN (cfloat) WHEN (csmallint is not null) THEN (csmallint) ELSE (null) END)(children: IsNotNull(col 2:int) -> 18:boolean, IsNotNull(col 4:float) -> 19:boolean, IsNotNull(col 1:smallint) -> 21:boolean) -> 22:string) -> 23:string
                  Reduce Sink Vectorization:
                      className: VectorReduceSinkOperator
                      native: false
                      nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
                      nativeConditionsNotMet: hive.execution.engine mr IN [tez, spark] IS false
      Execution mode: vectorized
      Map Vectorization:
          enabled: true
          enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
          inputFormatFeatureSupport: [DECIMAL_64]
          featureSupportInUse: [DECIMAL_64]
          inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
          allNative: false
          usesVectorUDFAdaptor: true
          vectorized: true
      Reduce Vectorization:
          enabled: false
          enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true
          enableConditionsNotMet: hive.execution.engine mr IN [tez, spark] IS false
      Reduce Operator Tree:

  Stage: Stage-0
    Fetch Operator
{code}
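
One reading of the plan above: the inner CASE is evaluated through VectorUDFAdaptor with its output column typed as string (-> 22:string), while its THEN branches (cint, cfloat, csmallint) appear without any cast to string. That would be consistent with the ClassCastException in the stack trace (FloatWritable cannot be cast to Text, raised from VectorUDFAdaptor.setResult). The query below is an untested sketch of a possible workaround, not a fix and not verified against this build: it assumes that casting every branch to string explicitly keeps the adaptor from returning a non-Text value. The casts are the only change relative to the repro query.

{code:sql}
-- Hypothetical workaround sketch (assumption: explicit casts avoid the
-- FloatWritable -> Text mismatch); same query as the repro otherwise.
SELECT cdouble, cstring1, cint, cfloat, csmallint,
  case
    when (cdouble is not null) then cast(cdouble as string)
    when (cstring1 is not null) then cstring1
    when (cint is not null) then cast(cint as string)
    when (cfloat is not null) then cast(cfloat as string)
    when (csmallint is not null) then cast(csmallint as string)
    else null
    end as c
FROM alltypesorc
WHERE (cdouble IS NULL)
ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c
LIMIT 10;
{code}

Running it with EXPLAIN VECTORIZATION ONLY EXPRESSION, as in the repro, would show whether the casts appear as children of the CASE expression.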

> Exception in vectorization execution of CASE statement
> ------------------------------------------------------
>
>                 Key: HIVE-20563
>                 URL: https://issues.apache.org/jira/browse/HIVE-20563
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Matt McCline
>            Priority: Major
>
> With the following stacktrace:
> {code}
> java.lang.Exception: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>         at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492) 
> ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
>         at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552) 
> [hadoop-mapreduce-client-common-3.1.0.jar:?]
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>         at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
>         at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
>         at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>  ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_181]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[?:1.8.0_181]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_181]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_181]
>         at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
>         at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:973)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
>         at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
>         at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>  ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_181]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[?:1.8.0_181]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_181]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_181]
>         at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
> cstring1
>         at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:149)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:136)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:812)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:845)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
>         at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
>         at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>  ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_181]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[?:1.8.0_181]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_181]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_181]
>         at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.FloatWritable 
> cannot be cast to org.apache.hadoop.io.Text
>         at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:471)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:146)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.IfExprCondExprBase.conditionalEvaluate(IfExprCondExprBase.java:68)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.IfExprColumnCondExpr.evaluate(IfExprColumnCondExpr.java:113)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:136)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:812)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:845)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
>         at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
>         at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>  ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_181]
> ...
> {code}
> To repro:
> {code:sql}
> --! qt:dataset:alltypesorc
> set hive.stats.fetch.column.stats=true;
> set hive.explain.user=false;
> SET hive.vectorized.execution.enabled=true;
> set hive.fetch.task.conversion=none;
> -- SORT_QUERY_RESULTS
> EXPLAIN VECTORIZATION ONLY EXPRESSION SELECT cdouble, cstring1, cint, cfloat, 
> csmallint,
>   case
>     when (cdouble is not null) then cdouble
>     when (cstring1 is not null) then cstring1
>     when (cint is not null) then cint
>     when (cfloat is not null) then cfloat
>     when (csmallint is not null) then csmallint
>     else null
>     end as c
> FROM alltypesorc
> WHERE (cdouble IS NULL)
> ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c
> LIMIT 10;
> SELECT cdouble, cstring1, cint, cfloat, csmallint,
>   case
>     when (cdouble is not null) then cdouble
>     when (cstring1 is not null) then cstring1
>     when (cint is not null) then cint
>     when (cfloat is not null) then cfloat
>     when (csmallint is not null) then csmallint
>     else null
>     end as c
> FROM alltypesorc
> WHERE (cdouble IS NULL)
> ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c
> LIMIT 10;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
