[
https://issues.apache.org/jira/browse/HIVE-24322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marta Kuczora reassigned HIVE-24322:
------------------------------------
> In case of direct insert, the attempt ID has to be checked when reading the
> manifest files
> ------------------------------------------------------------------------------------------
>
> Key: HIVE-24322
> URL: https://issues.apache.org/jira/browse/HIVE-24322
> Project: Hive
> Issue Type: Bug
> Affects Versions: 4.0.0
> Reporter: Marta Kuczora
> Assignee: Marta Kuczora
> Priority: Major
> Fix For: 4.0.0
>
>
> In [IMPALA-10247|https://issues.apache.org/jira/browse/IMPALA-10247] there
> was an exception from Hive when trying to load the data:
> {noformat}
> 2020-10-13T16:50:53,424 ERROR [HiveServer2-Background-Pool: Thread-23832] exec.Task: Job Commit failed with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(java.io.EOFException)'
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1468)
> at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:798)
> at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803)
> at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803)
> at org.apache.hadoop.hive.ql.exec.tez.TezTask.close(TezTask.java:627)
> at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:342)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357)
> at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
> at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
> at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
> at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
> at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
> at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:340)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at org.apache.hadoop.hive.ql.exec.Utilities.handleDirectInsertTableFinalPath(Utilities.java:4587)
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1462)
> ... 29 more
> {noformat}
> The reason for the exception was that Hive was trying to read an empty
> manifest file. Manifest files are used in case of direct insert to determine
> which files need to be kept and which ones need to be cleaned up. They are
> created by the tasks and use the task attempt ID as a suffix. In this
> particular test, one of the containers ran out of memory, so Tez decided to
> kill it right after the manifest file got created but before the paths got
> written into it. This was the manifest file for task attempt 0. Tez then
> assigned a new container to the task, so a new attempt was made with
> attemptId=1. This attempt was successful and wrote its manifest file
> correctly. But Hive didn't know about this, since the out-of-memory issue was
> handled by Tez under the hood, so there was no exception in Hive and
> therefore no clean-up in the manifest folder. And when Hive reads the
> manifest files, it simply reads every file from the defined folder, so it
> tried to read the manifest files for both attempt 0 and attempt 1.
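>
> This failure mode is easy to reproduce in isolation: {{DataInputStream.readInt()}}
> throws {{EOFException}} as soon as it hits the end of an empty stream, which is
> exactly what the attempt-0 manifest looked like after the container was killed.
> A minimal, self-contained sketch (the empty byte array stands in for the
> truncated manifest file):
> {code:java}
> import java.io.ByteArrayInputStream;
> import java.io.DataInputStream;
> import java.io.EOFException;
>
> public class EmptyManifestRepro {
>   public static void main(String[] args) throws Exception {
>     // An empty "manifest": the task attempt was killed before any path was written.
>     try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(new byte[0]))) {
>       in.readInt(); // throws java.io.EOFException, like the read in the stack trace above
>     } catch (EOFException e) {
>       System.out.println("EOFException on empty manifest: " + e);
>     }
>   }
> }
> {code}
>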
> If there are multiple manifest files with the same name but different
> attempt IDs, Hive should only read the one with the highest attempt ID.
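>
> A possible shape for that selection logic, as a sketch only (the manifest
> naming convention and helper names below are assumptions for illustration,
> not the actual Hive code): group the manifest files by base name, parse the
> numeric attempt-ID suffix, and keep the file with the highest attempt ID per
> group.
> {code:java}
> import java.util.Arrays;
> import java.util.HashMap;
> import java.util.List;
> import java.util.Map;
>
> public class ManifestSelector {
>   // Assumption for this sketch: manifest names end with "_<attemptId>",
>   // e.g. "000000_0.manifest_0" and "000000_0.manifest_1" for attempts 0 and 1.
>   static int attemptId(String name) {
>     return Integer.parseInt(name.substring(name.lastIndexOf('_') + 1));
>   }
>
>   static String baseName(String name) {
>     return name.substring(0, name.lastIndexOf('_'));
>   }
>
>   /** For each manifest base name, keep only the file written by the highest attempt. */
>   static Map<String, String> latestAttemptOnly(List<String> manifestFiles) {
>     Map<String, String> latest = new HashMap<>();
>     for (String f : manifestFiles) {
>       latest.merge(baseName(f), f,
>           (oldF, newF) -> attemptId(newF) > attemptId(oldF) ? newF : oldF);
>     }
>     return latest;
>   }
>
>   public static void main(String[] args) {
>     List<String> files = Arrays.asList("000000_0.manifest_0", "000000_0.manifest_1");
>     // Only the attempt-1 manifest survives: [000000_0.manifest_1]
>     System.out.println(latestAttemptOnly(files).values());
>   }
> }
> {code}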
--
This message was sent by Atlassian Jira
(v8.3.4#803005)