[ https://issues.apache.org/jira/browse/HIVE-17876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214500#comment-16214500 ]
Vihang Karajgaonkar commented on HIVE-17876:
--------------------------------------------

CC: [~mmccline]

> row.serde.deserialize broken for non-vectorized file inputformats
> -----------------------------------------------------------------
>
>                 Key: HIVE-17876
>                 URL: https://issues.apache.org/jira/browse/HIVE-17876
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Vihang Karajgaonkar
>
> Vectorization using {{hive.vectorized.use.row.serde.deserialize}} errors out for both Orc and Parquet input format.
> Steps to reproduce:
> {noformat}
> set hive.fetch.task.conversion=none;
> set hive.vectorized.use.row.serde.deserialize=true;
> set hive.vectorized.input.format.excludes=org.apache.hadoop.hive.ql.io.orc.OrcInputFormat;
> set hive.vectorized.execution.enabled=true;
> explain vectorization select * from alltypesorc where cint = 528534767 limit 10;
> +----------------------------------------------------+
> | Explain |
> +----------------------------------------------------+
> | PLAN VECTORIZATION: |
> |   enabled: true |
> |   enabledConditionsMet: [hive.vectorized.execution.enabled IS true] |
> | |
> | STAGE DEPENDENCIES: |
> |   Stage-1 is a root stage |
> |   Stage-0 depends on stages: Stage-1 |
> | |
> | STAGE PLANS: |
> |   Stage: Stage-1 |
> |     Map Reduce |
> |       Map Operator Tree: |
> |           TableScan |
> |             alias: alltypesorc |
> |             Statistics: Num rows: 12288 Data size: 2641964 Basic stats: COMPLETE Column stats: NONE |
> |             Filter Operator |
> |               predicate: (cint = 528534767) (type: boolean) |
> |               Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE Column stats: NONE |
> |               Select Operator |
> |                 expressions: ctinyint (type: tinyint), csmallint (type: smallint), 528534767 (type: int), cbigint (type: bigint), cfloat (type: float), cdouble (type: double), cstring1 (type: string), cstring2 (type: string), ctimestamp1 (type: timestamp), ctimestamp2 (type: timestamp), cboolean1 (type: boolean), cboolean2 (type: boolean) |
> |                 outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11 |
> |                 Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE Column stats: NONE |
> |                 Limit |
> |                   Number of rows: 10 |
> |                   Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE Column stats: NONE |
> |                   File Output Operator |
> |                     compressed: false |
> |                     Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE Column stats: NONE |
> |                     table: |
> |                         input format: org.apache.hadoop.mapred.SequenceFileInputFormat |
> |                         output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat |
> |                         serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe |
> |       Execution mode: vectorized |
> |       Map Vectorization: |
> |           enabled: true |
> |           enabledConditionsMet: hive.vectorized.use.row.serde.deserialize IS true |
> |           groupByVectorOutput: true |
> |           inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat |
> |           allNative: false |
> |           usesVectorUDFAdaptor: false |
> |           vectorized: true |
> | |
> |   Stage: Stage-0 |
> |     Fetch Operator |
> |       limit: 10 |
> |       Processor Tree: |
> |         ListSink |
> | |
> +----------------------------------------------------+
> 48 rows selected (0.742 seconds)
> 0: jdbc:hive2://localhost:10000/default>
> 0: jdbc:hive2://localhost:10000/default> select * from alltypesorc where cint = 528534767 limit 10;
> Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)