[ https://issues.apache.org/jira/browse/HIVE-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Nastetsky updated HIVE-7186:
---------------------------------

    Environment: Hortonworks Data Platform 2.0.6.0  (was: Hortonworks Data 
Platform 2.0)

> Unable to perform join on table
> -------------------------------
>
>                 Key: HIVE-7186
>                 URL: https://issues.apache.org/jira/browse/HIVE-7186
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>         Environment: Hortonworks Data Platform 2.0.6.0
>            Reporter: Alex Nastetsky
>
> Occasionally, a table will start exhibiting behavior that will prevent it 
> from being used in a JOIN. 
> When doing a map join, it will just stall at "Starting to launch local task 
> to process map join; ".
> When doing a regular join, it will make progress but then error out with an 
> IndexOutOfBoundsException:
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IndexOutOfBoundsException
>         at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:365)
>         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
>         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
>         ... 9 more
> Caused by: java.lang.IndexOutOfBoundsException
>         at java.nio.Buffer.checkIndex(Buffer.java:532)
>         at java.nio.ByteBufferAsIntBufferL.put(ByteBufferAsIntBufferL.java:131)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1153)
>         at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:586)
>         at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.collect(ReduceSinkOperator.java:372)
>         at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:334)
>         ... 15 more
>         
> Simple selects against this table work fine and do not show any apparent 
> problems with the data.
> Assume that the table in question is called tableA and was created by queryA.
> Doing either of the following has helped resolve the issue in the past.
> 1) create table tableB as select * from tableA;
>   Then just use tableB instead in the JOIN.
> 2) regenerate tableA using queryA
>   Then use tableA in the JOIN again. It usually works the second time.
>   
> When doing a "describe formatted" on the tables, the totalSize will be 
> different between the original tableA and tableB, and sometimes (but not 
> always) between the original tableA and the regenerated tableA. The numRows 
> will be the same across all versions of the tables.
> We cannot reproduce this problem on demand, but once a table is affected, 
> the issue happens every time we try to use that table in a JOIN.
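>
> For reference, a minimal sketch of workaround 1 and the "describe formatted" 
> comparison in HiveQL. tableC and the join column id are hypothetical 
> placeholders used only for illustration; they are not names from the 
> affected environment.
>
> ```sql
> -- Workaround 1: copy the affected table into a fresh table.
> CREATE TABLE tableB AS SELECT * FROM tableA;
>
> -- Use the copy in the JOIN instead of tableA.
> -- tableC and id are hypothetical placeholders.
> SELECT a.*
> FROM tableB a
> JOIN tableC c ON (a.id = c.id);
>
> -- Compare the table statistics of the original and the copy.
> -- totalSize and numRows are listed under "Table Parameters".
> DESCRIBE FORMATTED tableA;
> DESCRIBE FORMATTED tableB;
> ```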



--
This message was sent by Atlassian JIRA
(v6.2#6252)
