[ https://issues.apache.org/jira/browse/HIVE-14004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Qiuzhuang Lian reassigned HIVE-14004:
-------------------------------------
Assignee: Qiuzhuang Lian (was: Owen O'Malley)
> Minor compaction produces ArrayIndexOutOfBoundsException: 7 in SchemaEvolution.getFileType
> -------------------------------------------------------------------------------------------
>
> Key: HIVE-14004
> URL: https://issues.apache.org/jira/browse/HIVE-14004
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 2.2.0
> Reporter: Eugene Koifman
> Assignee: Qiuzhuang Lian
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14004.01.patch, HIVE-14004.02.patch,
> HIVE-14004.03.patch, HIVE-14004.patch
>
>
> The easiest way to repro is to add the following test to TestTxnCommands2 and run it:
> {noformat}
> @Test
> public void testCompactWithDelete() throws Exception {
>   // seed the table with two rows
>   int[][] tableData = {{1,2},{3,4}};
>   runStatementOnDriver("insert into " + Table.ACIDTBL + "(a,b) " + makeValuesClause(tableData));
>   // run a MAJOR compaction once via the compactor Worker
>   runStatementOnDriver("alter table " + Table.ACIDTBL + " compact 'MAJOR'");
>   Worker t = new Worker();
>   t.setThreadId((int) t.getId());
>   t.setHiveConf(hiveConf);
>   AtomicBoolean stop = new AtomicBoolean();
>   AtomicBoolean looped = new AtomicBoolean();
>   stop.set(true);
>   t.init(stop, looped);
>   t.run();
>   // produce a delete delta and an update delta on top of the compacted base
>   runStatementOnDriver("delete from " + Table.ACIDTBL + " where b = 4");
>   runStatementOnDriver("update " + Table.ACIDTBL + " set b = -2 where b = 2");
>   // MINOR compaction of these deltas hits the ArrayIndexOutOfBoundsException below
>   runStatementOnDriver("alter table " + Table.ACIDTBL + " compact 'MINOR'");
>   t.run();
> }
> {noformat}
> The test won't fail, but target/tmp/log/hive.log will contain the following exception (from the minor compaction):
> {noformat}
> 2016-06-09T18:36:39,071 WARN [Thread-190[]]: mapred.LocalJobRunner (LocalJobRunner.java:run(560)) - job_local1233973168_0005
> java.lang.Exception: java.lang.ArrayIndexOutOfBoundsException: 7
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) ~[hadoop-mapreduce-client-common-2.6.1.jar:?]
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) [hadoop-mapreduce-client-common-2.6.1.jar:?]
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 7
>     at org.apache.orc.impl.SchemaEvolution.getFileType(SchemaEvolution.java:67) ~[hive-orc-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.orc.impl.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2031) ~[hive-orc-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.<init>(TreeReaderFactory.java:1716) ~[hive-orc-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.orc.impl.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2077) ~[hive-orc-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.<init>(TreeReaderFactory.java:1716) ~[hive-orc-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.orc.impl.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2077) ~[hive-orc-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:208) ~[hive-orc-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:63) ~[classes/:?]
>     at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:365) ~[classes/:?]
>     at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.<init>(OrcRawRecordMerger.java:207) ~[classes/:?]
>     at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:508) ~[classes/:?]
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRawReader(OrcInputFormat.java:1977) ~[classes/:?]
>     at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:630) ~[classes/:?]
>     at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:609) ~[classes/:?]
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) ~[hadoop-mapreduce-client-core-2.6.1.jar:?]
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) ~[hadoop-mapreduce-client-core-2.6.1.jar:?]
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) ~[hadoop-mapreduce-client-core-2.6.1.jar:?]
>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) ~[hadoop-mapreduce-client-common-2.6.1.jar:?]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[?:1.7.0_71]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[?:1.7.0_71]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[?:1.7.0_71]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[?:1.7.0_71]
>     at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_71]
> {noformat}
> I observed the same on a real cluster.
> Based on my observations, running a major compaction instead of a minor one works fine.
> Replacing the DELETE operation with an UPDATE makes both major and minor compactions run fine.
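> For illustration, a minimal sketch of the two variants these observations imply, using the same test body as above; the changed statements (and the replacement value 5) are hypothetical stand-ins, not part of the original test:
> {noformat}
> // Variant 1: request a MAJOR compaction for the second pass instead of MINOR;
> // per the observation above, the compactor then completes without the exception.
> runStatementOnDriver("alter table " + Table.ACIDTBL + " compact 'MAJOR'");
> t.run();
>
> // Variant 2: rewrite the row instead of deleting it (the value 5 is arbitrary),
> // so no delete delta is produced; both MAJOR and MINOR then run fine.
> runStatementOnDriver("update " + Table.ACIDTBL + " set b = 5 where b = 4");
> {noformat}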
> The issue itself should be addressed by HIVE-13974, but we need to make sure to add the test.
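> A rough sketch of the verification such a test could append after the final t.run(), assuming the usual TestTxnCommands2 helpers (runStatementOnDriver returning result rows as tab-separated strings) and JUnit's Assert; the expected row follows from the data above ({3,4} deleted, {1,2} updated to {1,-2}):
> {noformat}
> // After the MINOR compaction the table should hold exactly one row: 1, -2.
> List<String> rs = runStatementOnDriver("select a,b from " + Table.ACIDTBL + " order by a,b");
> Assert.assertEquals(Collections.singletonList("1\t-2"), rs);
> {noformat}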
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)