[ 
https://issues.apache.org/jira/browse/HIVE-18422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328367#comment-16328367
 ] 

Vihang Karajgaonkar commented on HIVE-18422:
--------------------------------------------

[~mmccline] Did you get a chance to take a look at this patch? Thanks!

> Vectorized input format should not be used when input format is excluded and 
> row.serde is enabled
> -------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-18422
>                 URL: https://issues.apache.org/jira/browse/HIVE-18422
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>            Priority: Minor
>         Attachments: HIVE-18422.01.patch, HIVE-18422.02.patch
>
>
> HIVE-17534 introduced a config which gives a capability to exclude certain 
> inputformat from vectorized execution without affecting other input formats. 
> If an input format is excluded and row.serde is enabled at the same time, 
> vectorizer still sets the {{useVectorizedInputFormat}} to true which causes 
> Vectorized readers to be used in row.serde mode.
> In order to reproduce:
> {noformat}
> set hive.fetch.task.conversion=none;
> set hive.vectorized.use.row.serde.deserialize=true;
> set hive.vectorized.use.vector.serde.deserialize=true;
> set hive.vectorized.execution.enabled=true;
> set hive.vectorized.execution.reduce.enabled=true;
> set hive.vectorized.row.serde.inputformat.excludes=;
> -- SORT_QUERY_RESULTS
> -- exclude MapredParquetInputFormat from vectorization, this should cause 
> mapwork vectorization to be disabled
> set 
> hive.vectorized.input.format.excludes=org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat,org.apache.hadoop.hive.ql.io.orc.OrcInputFormat;
> set hive.vectorized.use.vectorized.input.format=true;
> create table orcTbl (t1 tinyint, t2 tinyint)
> stored as orc;
> insert into orcTbl values (54, 9), (-104, 25), (-112, 24);
> explain vectorization select t1, t2, (t1+t2) from orcTbl where (t1+t2) > 10;
> select t1, t2, (t1+t2) from orcTbl where (t1+t2) > 10;
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to