Vihang Karajgaonkar created HIVE-18422:
------------------------------------------

             Summary: Vectorized input format should not be used when input format is excluded and row.serde is enabled
                 Key: HIVE-18422
                 URL: https://issues.apache.org/jira/browse/HIVE-18422
             Project: Hive
          Issue Type: Bug
          Components: Vectorization
    Affects Versions: 3.0.0, 2.4.0
            Reporter: Vihang Karajgaonkar
            Assignee: Vihang Karajgaonkar
            Priority: Minor


HIVE-17534 introduced a config that excludes specific input formats from 
vectorized execution without affecting other input formats. If an input format 
is excluded and row.serde is enabled at the same time, the vectorizer still 
sets {{useVectorizedInputFormat}} to true, which causes vectorized readers to 
be used in row.serde mode.
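
The intended decision can be sketched roughly as follows. This is only an illustration of the expected behaviour, not the actual Vectorizer code: the names {{ReadPlan}} and {{plan_read}} are hypothetical, and the sketch assumes that row.serde is meant to act as the fallback once the vectorized input format is ruled out. It ignores hive.vectorized.row.serde.inputformat.excludes and vector.serde.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPlan:
    """Hypothetical summary of how a map-side input would be read."""
    use_vectorized_input_format: bool
    use_row_serde_deserialize: bool


def plan_read(input_format, vectorized_input_format_enabled,
              input_format_excludes, row_serde_enabled):
    """Sketch of the expected outcome; not Hive's real implementation.

    An excluded input format must never end up with the vectorized
    input format flag set, even when row.serde is enabled.
    """
    excluded = input_format in input_format_excludes
    # The vectorized reader is only allowed when enabled and not excluded.
    use_vif = vectorized_input_format_enabled and not excluded
    # Assumption: row.serde is the fallback when the vectorized reader is off.
    use_row_serde = row_serde_enabled and not use_vif
    return ReadPlan(use_vif, use_row_serde)


ORC = "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat"

# ORC is excluded and row.serde is enabled, as in the scenario described above.
plan = plan_read(ORC, True, {ORC}, True)
print(plan)
```

The bug described above corresponds to the first field coming out true for an excluded input format.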

In order to reproduce:
{noformat}
set hive.fetch.task.conversion=none;
set hive.vectorized.use.row.serde.deserialize=true;
set hive.vectorized.use.vector.serde.deserialize=true;
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=true;
set hive.vectorized.row.serde.inputformat.excludes=;

-- SORT_QUERY_RESULTS

-- exclude MapredParquetInputFormat from vectorization, this should cause
-- mapwork vectorization to be disabled
set hive.vectorized.input.format.excludes=org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat,org.apache.hadoop.hive.ql.io.orc.OrcInputFormat;
set hive.vectorized.use.vectorized.input.format=true;


create table orcTbl (t1 tinyint, t2 tinyint)
stored as orc;

insert into orcTbl values (54, 9), (-104, 25), (-112, 24);
explain vectorization select t1, t2, (t1+t2) from orcTbl where (t1+t2) > 10;
select t1, t2, (t1+t2) from orcTbl where (t1+t2) > 10;
{noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
