[ 
https://issues.apache.org/jira/browse/HIVE-20808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793489#comment-16793489
 ] 

Laszlo Bodor commented on HIVE-20808:
-------------------------------------

[~barrm]: as I understand it, the VectorUDFAdaptor frames in the trace suggest
that the child expression of VectorUDFMapIndexBaseScalar is not actually
vectorized natively and is instead evaluated row by row through the adaptor.
That could be one of the reasons why the query is slower than expected,
especially if the expression is used heavily.
Could you please provide a reproduction scenario (table schema, query)?
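
To illustrate why an adaptor-style fallback can cost more than a natively vectorized expression, here is a minimal sketch. It is not Hive code: AdaptorSketch, RowUdf, addVectorized and addViaAdaptor are hypothetical names made up for this example, and the addition stands in for whatever the real child expression computes. It only models the general pattern of boxing each row's inputs, calling a row-mode function, and writing the result back into a column.

```java
import java.util.Arrays;

final class AdaptorSketch {
    // Hypothetical stand-in for a row-mode UDF: boxed arguments in, boxed result out.
    interface RowUdf {
        Object evaluate(Object[] args);
    }

    // Column-at-a-time evaluation: one tight loop over primitive arrays,
    // with no per-row allocation or boxing.
    static void addVectorized(long[] a, long[] b, long[] out, int n) {
        for (int i = 0; i < n; i++) {
            out[i] = a[i] + b[i];
        }
    }

    // Adaptor-style evaluation: for every row, box the inputs, call the
    // row-mode function, then unbox the result into the output column.
    static void addViaAdaptor(RowUdf udf, long[] a, long[] b, long[] out, int n) {
        for (int i = 0; i < n; i++) {
            Object[] args = { a[i], b[i] };      // autoboxing allocates per row
            out[i] = (Long) udf.evaluate(args);  // cast and unbox the result
        }
    }

    public static void main(String[] argv) {
        int n = 1024; // illustrative batch size chosen for this sketch
        long[] a = new long[n];
        long[] b = new long[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
            b[i] = 2L * i;
        }

        long[] vectorized = new long[n];
        long[] adapted = new long[n];
        addVectorized(a, b, vectorized, n);
        addViaAdaptor(x -> Long.valueOf(((Long) x[0]) + ((Long) x[1])), a, b, adapted, n);

        // Both paths compute the same values; only the per-row overhead differs.
        System.out.println("results match: " + Arrays.equals(vectorized, adapted));
    }
}
```

Both paths produce the same column, so a fallback like this would show up as extra time per row rather than as wrong results, which would be consistent with the adaptor frames in the jstack below.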

> Queries with map() constructor are slow with vectorization
> ----------------------------------------------------------
>
>                 Key: HIVE-20808
>                 URL: https://issues.apache.org/jira/browse/HIVE-20808
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 3.0.0
>            Reporter: Matthew Barr
>            Priority: Major
>
> Queries involving map operator with vectorization enabled appear to be 
> slowing down due to vector UDF adaptor.
> Corresponding jstack for slow task:
> {code:java}
> "TezChild" #23 daemon prio=5 os_prio=0 tid=0x00007f1e44f1b080 nid=0x9419 
> runnable [0x00007f1e28137000] 
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.hadoop.hive.ql.exec.vector.ColumnVector.ensureSize(ColumnVector.java:232)
>     at org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector.ensureSize(DecimalColumnVector.java:208)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:587)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
>     at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
>     at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:146)
>     at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>     at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFMapIndexBaseScalar.evaluate(VectorUDFMapIndexBaseScalar.java:57)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>     at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:136)
>     at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>     at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:812)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:845)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>     at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>     at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>     at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>     at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
>     at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
>     at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)