Mithun Radhakrishnan created HIVE-10598:
-------------------------------------------

             Summary: Vectorization borks when column is added to table.
                 Key: HIVE-10598
                 URL: https://issues.apache.org/jira/browse/HIVE-10598
             Project: Hive
          Issue Type: Bug
          Components: Vectorization
            Reporter: Mithun Radhakrishnan
            Assignee: Mithun Radhakrishnan


Consider the following table definition:

{code:sql}
create table foobar ( foo string, bar string ) partitioned by (dt string) stored as orc;
alter table foobar add partition( dt='20150101' ) ;
{code}

Say the partition has the following data:

{noformat}
1	one	20150101
2	two	20150101
3	three	20150101
{noformat}

If a new column is added to the table schema (while the partition keeps the old schema), vectorized reads from the old partition fail as follows:

{code:sql}
alter table foobar add columns( goo string );
select count(1) from foobar;
{code}

{code:title=stacktrace}
java.lang.Exception: java.lang.RuntimeException: Error creating a batch
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: Error creating a batch
	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.createValue(VectorizedOrcInputFormat.java:114)
	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.createValue(VectorizedOrcInputFormat.java:52)
	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.createValue(CombineHiveRecordReader.java:84)
	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.createValue(CombineHiveRecordReader.java:42)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.createValue(HadoopShimsSecure.java:156)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.createValue(MapTask.java:180)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: No type entry found for column 3 in map {4=Long}
	at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.addScratchColumnsToBatch(VectorizedRowBatchCtx.java:632)
	at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.createVectorizedRowBatch(VectorizedRowBatchCtx.java:343)
	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.createValue(VectorizedOrcInputFormat.java:112)
	... 14 more
{code}
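
One possible way to check whether the failure is confined to the vectorized read path is to switch vectorization off for the session and re-run the same query. This is only a sketch: it assumes the standard {{hive.vectorized.execution.enabled}} session property, and the outcome was not verified as part of this report.

{code:sql}
-- Assumption: disable vectorized execution for this session only.
set hive.vectorized.execution.enabled=false;

-- Same query as in the repro above, now on the non-vectorized path.
select count(1) from foobar;
{code}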