[ 
https://issues.apache.org/jira/browse/HIVE-11016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611100#comment-14611100
 ] 

Sergey Shelukhin edited comment on HIVE-11016 at 7/1/15 10:28 PM:
------------------------------------------------------------------

[~vikram.dixit] can you take a look at the Java change?
The issue appears to be that the first fetch in process() sets 
firstFetchHappened = true before doing the fetch, whereas 
closeOp/joinFinalLeftData calls fetchNextGroup first when there is more than 
one small table (because firstFetchHappened is set only in the last iteration 
of the loop).
That causes the first fetch to be done again when the operator calls itself 
through ReduceRecordSource and observes that firstFetchHappened is false; so 
it calls fetch on all small tables, including the one for which it was called 
by joinFinalLeftData on the same call stack. That sets fetchDone to true for 
this tag because it is exhausted, and then, when the recursion unwinds back to 
the call made from joinFinalLeftData, fetchDone is reset back to false.
I changed the logic to be the same in both places for the first fetch and 
added a warning log for the blind reset (such recursion shouldn't happen 
though, so I added the warning and didn't remove the reset).
Also, after removing the first-fetch logic, the loop in joinFinalLeftData 
became really strange... I replaced it with an if and left a TODO.
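
To make the sequence easier to follow, here is a rough, self-contained sketch 
of the re-entrancy problem. It is not the Hive code: Source, Op, fetchAll and 
failsOnClose are hypothetical names made up for the illustration, and only 
firstFetchHappened, fetchDone, fetchOneRow, pushRecord and process echo names 
from the comment. It assumes two small tables with one row each.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class FirstFetchSketch {
    /** Hypothetical reader: like the iterator in the trace, it must not be advanced after reporting "no more". */
    static final class Source {
        private final Deque<String> rows = new ArrayDeque<>();
        private boolean exhausted = false;

        Source(String... initial) {
            for (String r : initial) rows.add(r);
        }

        /** Pushes one row into the operator; returns false once the source is drained. */
        boolean pushRecord(Op op) {
            if (exhausted) {
                throw new IllegalStateException("next() invoked after it returned false");
            }
            if (rows.isEmpty()) {
                exhausted = true;
                return false;
            }
            rows.poll();
            op.process(); // the operator is re-entered from inside the fetch
            return true;
        }
    }

    /** Hypothetical, heavily simplified stand-in for the merge join operator. */
    static final class Op {
        final Source[] sources;
        final boolean[] fetchDone;
        boolean firstFetchHappened = false;

        Op(Source... sources) {
            this.sources = sources;
            this.fetchDone = new boolean[sources.length];
        }

        void process() {
            if (!firstFetchHappened) {
                firstFetchHappened = true; // set BEFORE fetching, so nested calls skip this block
                fetchAll();
            }
        }

        void fetchAll() {
            for (int tag = 0; tag < sources.length; tag++) {
                if (!fetchDone[tag]) fetchOneRow(tag);
            }
        }

        void fetchOneRow(int tag) {
            boolean hasMore = sources[tag].pushRecord(this);
            fetchDone[tag] = !hasMore; // blind reset: overwrites whatever a nested call stored
        }

        /** setFlagFirst = false models the old close path; true models the unified logic. */
        void close(boolean setFlagFirst) {
            if (setFlagFirst) firstFetchHappened = true;
            fetchAll(); // first fetch on the close path
            firstFetchHappened = true;
            fetchAll(); // later fetch: only tags still marked "not done"
        }
    }

    /** Returns true if closing the operator hits the "next() after false" error. */
    static boolean failsOnClose(boolean setFlagFirst) {
        Op op = new Op(new Source("a"), new Source("b"));
        try {
            op.close(setFlagFirst);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("flag set after fetch:  fails = " + failsOnClose(false));
        System.out.println("flag set before fetch: fails = " + failsOnClose(true));
    }
}
```

In this toy model, setting the flag only after the first fetch lets the 
nested process() call drain tag 0 and mark it done, after which the outer 
fetchOneRow overwrites fetchDone with false and a later fetch hits the 
exhausted source. Setting the flag before fetching avoids the nested first 
fetch entirely.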


was (Author: sershe):
[~vikram.dixit] can you take a look at the Java change?
The issue appears to be that the first fetch in process() sets 
firstFetchHappened = true before doing the fetch, whereas 
closeOp/joinFinalLeftData calls fetchNextGroup first when there is more than 
one small table (because firstFetchHappened is set only in the last iteration 
of the loop).
That would cause the first fetch to be done again when the operator calls 
itself through ReduceRecordSource; fetchDone would be set to true for this 
source in that recursive call because it was exhausted, and then reset to 
false again in the top-level fetchOneRow called (via some methods) from the 
first-fetch logic in joinFinalLeftData.
I changed the logic to be the same in both places for the first fetch and 
added a warning log for the blind reset (such recursion shouldn't happen 
though, so I added the warning and didn't remove the reset).
Also, after removing the first-fetch logic, the loop in joinFinalLeftData 
became really strange... I replaced it with an if and left a TODO.

> LLAP: MiniTez mergejoin test fails with Tez input error
> -------------------------------------------------------
>
>                 Key: HIVE-11016
>                 URL: https://issues.apache.org/jira/browse/HIVE-11016
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-11016.patch
>
>
> I didn't spend a lot of time investigating, but from the code it looks like 
> we shouldn't be calling next() after it returned false, at least on this 
> path (after false from next, pushRecord returns false, which causes 
> fetchDone to be set for the tag, and fetchOneRow is not called if that is 
> set; this should be OK unless the tags are messed up?)
> {noformat}
> 2015-06-15 17:28:17,272 ERROR [main]: SessionState (SessionState.java:printError(984)) - Vertex failed, vertexName=Reducer 2, vertexId=vertex_1434414363282_0002_17_03, diagnostics=[Task failed, taskId=task_1434414363282_0002_17_03_000002, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task: attempt_1434414363282_0002_17_03_000002_0:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false.
>       at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:181)
>       at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:146)
>       at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349)
>       at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
>       at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>       at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60)
>       at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
>       at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false.
>       at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:338)
>       at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172)
>       ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false.
>       at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:412)
>       at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:380)
>       at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:449)
>       at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:389)
>       at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:651)
>       at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:314)
>       ... 15 more
> Caused by: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false.
>       at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302)
>       at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:404)
>       ... 20 more
> Caused by: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false.
>       at org.apache.tez.runtime.library.common.ValuesIterator.hasCompletedProcessing(ValuesIterator.java:223)
>       at org.apache.tez.runtime.library.common.ValuesIterator.moveToNext(ValuesIterator.java:105)
>       at org.apache.tez.runtime.library.input.OrderedGroupedKVInput$OrderedGroupedKeyValuesReader.next(OrderedGroupedKVInput.java:308)
>       at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:260)
>       ... 21 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
