This is just a suggestion, but I recently ran into an issue with vectorized query execution and a MAP column type, specifically when inserting into an HBase table with a map-to-column-family setup. Try "set hive.vectorized.execution.enabled=false;".
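If disabling vectorization across the board is too heavy a hammer, it may also be worth first trying to turn it off only for the reduce side, since that's where the failing batch is built in your trace. A quick sketch of both session-level options (untested on my side; which one is sufficient will depend on your setup):

    -- reduce side only
    set hive.vectorized.execution.reduce.enabled=false;
    -- or disable vectorized execution entirely
    set hive.vectorized.execution.enabled=false;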
Thanks,
Aaron

From: Bernard Quizon <bernard.qui...@cheetahdigital.com>
Sent: Tuesday, July 14, 2020 9:57 AM
To: user@hive.apache.org
Subject: Re: Intermittent ArrayIndexOutOfBoundsException on Hive Merge

Hi,

I see that this piece of code is the source of the error:

    final int maxSize = (vectorizedTestingReducerBatchSize > 0
        ? Math.min(vectorizedTestingReducerBatchSize, batch.getMaxSize())
        : batch.getMaxSize());
    Preconditions.checkState(maxSize > 0);
    int rowIdx = 0;
    int batchBytes = keyBytes.length;
    try {
      for (Object value : values) {
        if (rowIdx >= maxSize || (rowIdx > 0 && batchBytes >= BATCH_BYTES)) {
          // Batch is full AND we have at least 1 more row...
          batch.size = rowIdx;
          if (handleGroupKey) {
            reducer.setNextVectorBatchGroupStatus(/* isLastGroupBatch */ false);
          }
          reducer.process(batch, tag);

          // Reset just the value columns and value buffer.
          for (int i = firstValueColumnOffset; i < batch.numCols; i++) {
            // Note that reset also resets the data buffer for bytes column vectors.
            batch.cols[i].reset();
          }
          rowIdx = 0;
          batchBytes = keyBytes.length;
        }
        if (valueLazyBinaryDeserializeToRow != null) {
          // Deserialize value into vector row columns.
          BytesWritable valueWritable = (BytesWritable) value;
          byte[] valueBytes = valueWritable.getBytes();
          int valueLength = valueWritable.getLength();
          batchBytes += valueLength;
          valueLazyBinaryDeserializeToRow.setBytes(valueBytes, 0, valueLength);
          valueLazyBinaryDeserializeToRow.deserialize(batch, rowIdx);
        }
        rowIdx++;
      }

`valueLazyBinaryDeserializeToRow.deserialize(batch, rowIdx)` throws the exception because `rowIdx` has a value of 1024, when it should be 1023 at most. But it seems to me that `maxSize` can never exceed 1024 (the default `batch.getMaxSize()`), and the check at the top of the loop flushes the batch and resets `rowIdx` to 0 before it reaches `maxSize`. So why would `rowIdx` ever be >= 1024 at the `deserialize(batch, rowIdx)` call? Am I missing something here? (To double-check my reading, I've appended a minimal standalone sketch of the same loop below the quoted mail.)

Thanks,
Bernard

On Tue, Jul 14, 2020 at 5:44 PM Bernard Quizon <bernard.qui...@cheetahdigital.com> wrote:

Hi,

I'm using Hive 3.1.0 (Tez execution engine) and I'm running into intermittent errors when doing a Hive Merge. Just to clarify: the Hive Merge query succeeds only about 60% of the time, using the same source and destination tables each run. Both the source and destination tables have columns with complex data types such as ARRAY<STRING> and MAP<STRING, STRING>.
Here's the error:

TaskAttempt 0 failed, info=[Error: Error while running task (failure) : attempt_1594345704665_28139_1_06_000007_0:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) (vectorizedVertexNum 4)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) (vectorizedVertexNum 4)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:396)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:249)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:318)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
    ... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) (vectorizedVertexNum 4)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:493)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:387)
    ... 19 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
    at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setVal(BytesColumnVector.java:187)
    at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storePrimitiveRowColumn(VectorDeserializeRow.java:588)
    at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeComplexFieldRowColumn(VectorDeserializeRow.java:778)
    at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeMapRowColumn(VectorDeserializeRow.java:855)
    at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeRowColumn(VectorDeserializeRow.java:941)
    at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(VectorDeserializeRow.java:1360)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:470)
    ... 20 more

Would someone know a workaround for this?

Thanks,
Bernard
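(Sketch referenced above.) This is a minimal, standalone Java rendering of the flush-then-deserialize pattern in the quoted `processVectorGroup` loop, with hypothetical stand-ins (`MAX_SIZE`, the byte threshold, and simulated value lengths) instead of the real Hive objects. It's only meant to illustrate that, by the flush check, the index handed to `deserialize` stays in the range [0, maxSize - 1]:

    // Standalone sketch of the quoted flush logic; not the real Hive code.
    public class FlushLogicSketch {
        // Stand-in for batch.getMaxSize(); VectorizedRowBatch.DEFAULT_SIZE is 1024.
        static final int MAX_SIZE = 1024;
        // Hypothetical stand-in for the BATCH_BYTES flush threshold in ReduceRecordSource.
        static final int BATCH_BYTES = 32 * 1024;

        public static void main(String[] args) {
            int rowIdx = 0;
            int batchBytes = 0;
            for (int value = 0; value < 1_000_000; value++) {
                if (rowIdx >= MAX_SIZE || (rowIdx > 0 && batchBytes >= BATCH_BYTES)) {
                    // Stand-in for reducer.process(batch, tag) and the column resets.
                    rowIdx = 0;
                    batchBytes = 0;
                }
                // Stand-in for deserialize(batch, rowIdx): the check above should
                // guarantee rowIdx < MAX_SIZE at this point.
                if (rowIdx >= MAX_SIZE) {
                    throw new IllegalStateException("rowIdx reached " + rowIdx);
                }
                batchBytes += 100; // simulated value byte length
                rowIdx++;
            }
            System.out.println("deserialize index never reached " + MAX_SIZE);
        }
    }

If that reading is right, the 1024 in the trace can't be coming from `rowIdx` at that call site, which is exactly what confuses me.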