[jira] [Commented] (HIVE-29297) The directory of the direct insert manifest files should be hidden from read queries

Denys Kuzmenko (Jira) Fri, 31 Oct 2025 13:58:07 -0700


    [ https://issues.apache.org/jira/browse/HIVE-29297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18034494#comment-18034494 ]


Denys Kuzmenko commented on HIVE-29297:
---------------------------------------

Merged to master

Thanks for the fix, [~kuczoram]!

> The directory of the direct insert manifest files should be hidden from read 
> queries
> ------------------------------------------------------------------------------------
>
>                 Key: HIVE-29297
>                 URL: https://issues.apache.org/jira/browse/HIVE-29297
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>            Reporter: Marta Kuczora
>            Assignee: Marta Kuczora
>            Priority: Major
>              Labels: hive-4.2.0-must, pull-request-available, regresion
>
> In HIVE-27536 the value of the "tmpPrefix" variable in the Utilities class
> was changed from "_tmp." to "-tmp.". This prefix was used to create the
> temporary directory for the manifest files. As a side effect of this change,
> if an insert and a read query run at the same time on the same table, the
> read query reads the manifest files. This leads to an exception like this:
> {code:java}
> ERROR : Failed with exception java.io.IOException:java.lang.RuntimeException: ORC split generation failed with exception: org.apache.orc.FileFormatException: Malformed ORC file hdfs://ccycloud-1.kuczoram731.root.comops.site:8020/warehouse/tablespace/managed/hive/acid_test/-tmp.delta_0000002_0000002_0000/000000_0.manifest. Invalid postscript.
> java.io.IOException: java.lang.RuntimeException: ORC split generation failed with exception: org.apache.orc.FileFormatException: Malformed ORC file hdfs://ccycloud-1.kuczoram731.root.comops.site:8020/warehouse/tablespace/managed/hive/acid_test/-tmp.delta_0000002_0000002_0000/000000_0.manifest. Invalid postscript.
>       at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:646)
>       at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:553)
>       at org.apache.hadoop.hive.ql.exec.FetchTask.executeInner(FetchTask.java:217)
>       at org.apache.hadoop.hive.ql.exec.FetchTask.execute(FetchTask.java:114)
>       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:819)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:547)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:541)
>       at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:190)
>       at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:236)
>       at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:92)
>       at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:341)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1910)
>       at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:361)
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.RuntimeException: ORC split generation failed with exception: org.apache.orc.FileFormatException: Malformed ORC file hdfs://ccycloud-1.kuczoram731.root.comops.site:8020/warehouse/tablespace/managed/hive/acid_test/-tmp.delta_0000002_0000002_0000/000000_0.manifest. Invalid postscript.
>       at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1891)
>       at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1980)
>       at org.apache.hadoop.hive.ql.exec.FetchOperator.generateWrappedSplits(FetchOperator.java:457)
>       at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextSplits(FetchOperator.java:424)
>       at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:328)
>       at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:584)
>       ... 21 more
> Caused by: java.util.concurrent.ExecutionException: org.apache.orc.FileFormatException: Malformed ORC file hdfs://ccycloud-1.kuczoram731.root.comops.site:8020/warehouse/tablespace/managed/hive/acid_test/-tmp.delta_0000002_0000002_0000/000000_0.manifest. Invalid postscript.
>       at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>       at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>       at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1885)
>       ... 26 more
> Caused by: org.apache.orc.FileFormatException: Malformed ORC file hdfs://ccycloud-1.kuczoram731.root.comops.site:8020/warehouse/tablespace/managed/hive/acid_test/-tmp.delta_0000002_0000002_0000/000000_0.manifest. Invalid postscript.
>       at org.apache.orc.impl.ReaderImpl.ensureOrcFooter(ReaderImpl.java:464)
>       at org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:812)
>       at org.apache.orc.impl.ReaderImpl.<init>(ReaderImpl.java:567)
>       at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:61)
>       at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:112)
>       at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:1686)
>       at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.callInternal(OrcInputFormat.java:1574)
>       at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.access$2900(OrcInputFormat.java:1357)
>       at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:1546)
>       at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:1543)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1910)
>       at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:1543)
>       at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:1357)
>       ... 4 more
> {code}
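
The effect of the prefix change described above can be illustrated with a minimal sketch. It assumes that read queries skip directory entries by the common Hadoop naming convention, where names beginning with "_" or "." count as hidden; that rule is an assumption here and is not taken from the issue text. The class name HiddenPrefixSketch, both method names, and the directory name delta_0000001_0000001_0000 are hypothetical. This is not the code of the actual fix.

```java
import java.util.List;
import java.util.stream.Collectors;

public class HiddenPrefixSketch {

    // Assumed convention (not taken from the issue): a directory entry is
    // hidden from readers when its name starts with "_" or ".".
    static boolean isHidden(String name) {
        return name.startsWith("_") || name.startsWith(".");
    }

    // Returns the entries that a reader applying the rule above would scan.
    static List<String> visibleEntries(List<String> names) {
        return names.stream()
                .filter(name -> !isHidden(name))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> listing = List.of(
                "delta_0000001_0000001_0000",        // hypothetical committed delta
                "_tmp.delta_0000002_0000002_0000",   // manifest directory, old prefix
                "-tmp.delta_0000002_0000002_0000");  // manifest directory, new prefix

        // Prints: [delta_0000001_0000001_0000, -tmp.delta_0000002_0000002_0000]
        System.out.println(visibleEntries(listing));
    }
}
```

Under this assumed rule the "_tmp." directory is skipped while the "-tmp." directory stays visible, which is consistent with the ORC reader trying to open 000000_0.manifest in the trace above.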



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
