Vihang Karajgaonkar created HIVE-18422:
------------------------------------------

             Summary: Vectorized input format should not be used when input format is excluded and row.serde is enabled
                 Key: HIVE-18422
                 URL: https://issues.apache.org/jira/browse/HIVE-18422
             Project: Hive
          Issue Type: Bug
          Components: Vectorization
    Affects Versions: 3.0.0, 2.4.0
            Reporter: Vihang Karajgaonkar
            Assignee: Vihang Karajgaonkar
            Priority: Minor


HIVE-17534 introduced a config that makes it possible to exclude specific input formats from vectorized execution without affecting other input formats. If an input format is excluded and row.serde is enabled at the same time, the vectorizer still sets {{useVectorizedInputFormat}} to true, which causes vectorized readers to be used in row.serde mode.

To reproduce:
{noformat}
set hive.fetch.task.conversion=none;
set hive.vectorized.use.row.serde.deserialize=true;
set hive.vectorized.use.vector.serde.deserialize=true;
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=true;
set hive.vectorized.row.serde.inputformat.excludes=;

-- SORT_QUERY_RESULTS

-- exclude MapredParquetInputFormat from vectorization, this should cause mapwork vectorization to be disabled
set hive.vectorized.input.format.excludes=org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat,org.apache.hadoop.hive.ql.io.orc.OrcInputFormat;
set hive.vectorized.use.vectorized.input.format=true;

create table orcTbl (t1 tinyint, t2 tinyint) stored as orc;

insert into orcTbl values (54, 9), (-104, 25), (-112, 24);

explain vectorization select t1, t2, (t1+t2) from orcTbl where (t1+t2) > 10;

select t1, t2, (t1+t2) from orcTbl where (t1+t2) > 10;
{noformat}
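
For illustration only, the expected interaction between the exclude list and row.serde can be sketched as a small decision function. This is a simplified model in Python, not Hive's Vectorizer code: the helper names {{parse_excludes}} and {{choose_read_mode}} and the three mode strings are hypothetical, and vector.serde deserialization is left out. It also assumes that an excluded input format should fall back to row.serde deserialization when that setting is enabled.

```python
def parse_excludes(value):
    """Split a comma-separated exclude setting into a set of class names."""
    return {name.strip() for name in value.split(",") if name.strip()}


def choose_read_mode(input_format, use_vectorized_input_format,
                     use_row_serde_deserialize,
                     input_format_excludes, row_serde_excludes):
    """Return the read mode this sketch expects for one input format.

    Hypothetical helper: it models the expected outcome only and is not
    Hive's actual Vectorizer code.
    """
    # An excluded input format must never get a vectorized reader,
    # whatever the other settings say.
    if use_vectorized_input_format and input_format not in input_format_excludes:
        return "vectorized_input_format"
    # Row-serde deserialization keeps the regular row-mode reader.
    if use_row_serde_deserialize and input_format not in row_serde_excludes:
        return "row_serde_deserialize"
    return "not_vectorized"


ORC = "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat"
PARQUET = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"

# Mirror the settings from the reproduction script.
excludes = parse_excludes(PARQUET + "," + ORC)
print(choose_read_mode(ORC, True, True, excludes, parse_excludes("")))
```

With the settings from the reproduction script, the sketch returns {{row_serde_deserialize}} for the ORC table. The reported bug corresponds to the first branch being taken even though the input format is in the exclude list.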