[
https://issues.apache.org/jira/browse/HIVE-26179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532146#comment-17532146
]
zhengchenyu edited comment on HIVE-26179 at 5/5/22 8:44 AM:
------------------------------------------------------------
I tested on hive-3.1.2 with my dataset, and the NPE does not occur. While
debugging, I found that the NPE was fixed in HIVE-13809 by [~wzheng] (note:
that change removed loadCalled, which fixed the NPE). However, during the same
debugging I found that Operator::completeInitialization can still produce
inconsistent results in newer Hive versions, and this bug can cause
unpredictable behavior.
[~zabetak]
> In tez reuse container mode, asyncInitOperations are not clear.
> ---------------------------------------------------------------
>
> Key: HIVE-26179
> URL: https://issues.apache.org/jira/browse/HIVE-26179
> Project: Hive
> Issue Type: Bug
> Components: Hive, Tez
> Affects Versions: 1.2.1
> Environment: engine: Tez (Note: tez.am.container.reuse.enabled is
> true)
>
> Reporter: zhengchenyu
> Assignee: zhengchenyu
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> In our cluster, we found an error like this:
> {code:java}
> Vertex failed, vertexName=Map 1, vertexId=vertex_1650608671415_321290_1_11,
> diagnostics=[Task failed, taskId=task_1650608671415_321290_1_11_000422,
> diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task (
> failure ) :
> attempt_1650608671415_321290_1_11_000422_0:java.lang.RuntimeException:
> java.lang.RuntimeException: Hive Runtime Error while closing operators
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:135)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
> at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
> at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
> at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:349)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:161)
> ... 16 more
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:488)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:684)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:698)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:338)
> ... 17 more
> {code}
> When Tez container reuse is enabled and a MapJoinOperator is used, an NPE is
> thrown if different task attempts of the same task execute in the same
> container.
> While debugging, I found that the second task attempt uses the first task
> attempt's asyncInitOperations. asyncInitOperations is not cleared when the
> operator is closed, so the second task attempt may use the first task
> attempt's mapJoinTables, whose HybridHashTableContainer.HashPartition is
> already closed, which causes the NPE.
> We must clear asyncInitOperations when the operator is closed.
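
The failure mode described above can be reproduced in a small toy model. This is only a sketch and not Hive's code: `SketchOperator`, `Partition`, and the `clearOnClose` flag are hypothetical names made up for illustration, and the real `Operator` / `MapJoinOperator` logic is more involved (in the report the NPE surfaces in `closeOp`, while here it surfaces on the first use of the stale table). The assumption being modelled is that one operator instance is reused by two task attempts in the same container, and that initialization consumes the oldest pending async result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class AsyncInitReuseSketch {

    /** Hypothetical stand-in for a hash table partition that can be closed. */
    static class Partition {
        private int[] rows = {1, 2, 3};

        void close() { rows = null; }

        // Throws NullPointerException once the partition has been closed.
        int size() { return rows.length; }
    }

    /** Hypothetical stand-in for an operator instance reused across task attempts. */
    static class SketchOperator {
        private final List<CompletableFuture<Partition>> asyncInitOperations = new ArrayList<>();
        private final boolean clearOnClose;
        private Partition table;

        SketchOperator(boolean clearOnClose) { this.clearOnClose = clearOnClose; }

        // Each task attempt registers its own freshly loaded partition.
        void initialize() {
            asyncInitOperations.add(CompletableFuture.completedFuture(new Partition()));
        }

        // Consumes the oldest pending result; with stale entries this is the wrong one.
        void completeInitialization() {
            table = asyncInitOperations.get(0).join();
        }

        int process() { return table.size(); }

        void close() {
            table.close();
            if (clearOnClose) {
                asyncInitOperations.clear(); // the proposed fix: drop state on close
            }
        }
    }

    /** Runs two task attempts on one operator; returns true if the second hits an NPE. */
    static boolean secondAttemptFails(boolean clearOnClose) {
        SketchOperator op = new SketchOperator(clearOnClose);

        // First task attempt: runs normally and closes its partition.
        op.initialize();
        op.completeInitialization();
        op.process();
        op.close();

        // Second task attempt in the same (reused) container.
        op.initialize();
        op.completeInitialization();
        try {
            op.process();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("without clear, second attempt NPE: " + secondAttemptFails(false));
        System.out.println("with clear, second attempt NPE: " + secondAttemptFails(true));
    }
}
```

In this model the second attempt fails only when the list is left uncleared, because it picks up the first attempt's already-closed partition. Clearing the list in `close()` makes each attempt see only its own results.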