[ https://issues.apache.org/jira/browse/HIVE-12947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vaibhav Gumashta updated HIVE-12947:
------------------------------------
Target Version/s: 2.1.0, 2.0.0, 1.3.0 (was: 2.0.0, 2.1.0)
> SMB join in tez has ClassCastException when container reuse is on
> -----------------------------------------------------------------
>
> Key: HIVE-12947
> URL: https://issues.apache.org/jira/browse/HIVE-12947
> Project: Hive
> Issue Type: Bug
> Components: Tez
> Affects Versions: 2.0.0
> Reporter: Vikram Dixit K
> Assignee: Vikram Dixit K
> Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HIVE-12947.1.patch, HIVE-12947.2.patch,
> HIVE-12947.3.patch, HIVE-12947.4.patch
>
>
> SMB join in Tez has multiple work items that are connected based on input tag,
> followed by input initialization and so on. With container reuse on, we end up
> trying to reconnect work items that are already in the cache, and the
> reconnect fails. Even if we worked around that by detecting that the cache was
> in play, we would run into other initialization issues with the record
> readers. So the plan is to disable caching of the SMB work items by clearing
> them out of the cache during the close phase.
> {code}
> java.lang.RuntimeException: Map operator initialization failed
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to org.apache.hadoop.hive.ql.exec.DummyStoreOperator
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:189)
> ... 15 more
> {code}
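
For illustration only, the plan described in the issue (evict the SMB work items when the processor closes, so a reused container rebuilds them) might look roughly like the sketch below. This is not the code from the attached patches. The class SmbCacheSketch and its methods retrieve, retrieveSmbWork, close and isCached are hypothetical names invented for this example; they only stand in for whatever per-container object cache Hive on Tez actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/**
 * Hypothetical stand-in for a cache that outlives a single task when the
 * container is reused. Not Hive's actual API.
 */
class SmbCacheSketch {
    // Entries that survive across tasks running in the same container.
    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    // Keys of SMB work items handed out since the last close().
    private final List<String> smbKeys = new ArrayList<>();

    /** Return the cached value for key, building it with loader on first use. */
    @SuppressWarnings("unchecked")
    <T> T retrieve(String key, Supplier<T> loader) {
        return (T) cache.computeIfAbsent(key, k -> loader.get());
    }

    /** Like retrieve, but remembers the key so close() can evict the entry. */
    <T> T retrieveSmbWork(String key, Supplier<T> loader) {
        smbKeys.add(key);
        return retrieve(key, loader);
    }

    /** Close phase: drop only the SMB work items; other entries stay cached. */
    void close() {
        for (String key : smbKeys) {
            cache.remove(key);
        }
        smbKeys.clear();
    }

    /** True if an entry for key is currently cached. */
    boolean isCached(String key) {
        return cache.containsKey(key);
    }
}
```

Under these assumptions, the next task that lands in the same container gets a freshly built SMB work item and connects it from scratch, while ordinary cached entries are still reused.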