[ https://issues.apache.org/jira/browse/HIVE-21436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16944357#comment-16944357 ]
Piotr Findeisen edited comment on HIVE-21436 at 10/4/19 9:42 AM:
-----------------------------------------------------------------
The first SELECT works:
{code:java}
jdbc:hive2://localhost:10000/default> SELECT * FROM t;
...
+------+
| 42 |
+------+ {code}
But all subsequent ones fail:
{code:java}
jdbc:hive2://localhost:10000/default> SELECT * FROM t;
going to print operations logs
printed operations logs
Getting log thread is interrupted, since query is done!
INFO : Compiling command(queryId=hive_20191004151730_e7c48562-51c8-4d39-9622-62231a499768): SELECT * FROM t
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:t.a, type:bigint, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20191004151730_e7c48562-51c8-4d39-9622-62231a499768); Time taken: 0.24 seconds
INFO : Executing command(queryId=hive_20191004151730_e7c48562-51c8-4d39-9622-62231a499768): SELECT * FROM t
INFO : Completed executing command(queryId=hive_20191004151730_e7c48562-51c8-4d39-9622-62231a499768); Time taken: 0.0 seconds
INFO : OK
Error: java.io.IOException: java.lang.RuntimeException: ORC split generation failed with exception: Malformed ORC file. Invalid postscript length 17 (state=,code=0)
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.RuntimeException: ORC split generation failed with exception: Malformed ORC file. Invalid postscript length 17
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:300)
at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:286)
at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:379)
at org.apache.hive.beeline.BufferedRows.<init>(BufferedRows.java:56)
at org.apache.hive.beeline.IncrementalRowsWithNormalization.<init>(IncrementalRowsWithNormalization.java:50)
at org.apache.hive.beeline.BeeLine.print(BeeLine.java:2305)
at org.apache.hive.beeline.Commands.executeInternal(Commands.java:1026)
at org.apache.hive.beeline.Commands.execute(Commands.java:1201)
at org.apache.hive.beeline.Commands.sql(Commands.java:1130)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1480)
at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:1342)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1126)
at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:546)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:528)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
Caused by: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.RuntimeException: ORC split generation failed with exception: Malformed ORC file. Invalid postscript length 17
at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:478)
at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328)
at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:952)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy50.fetchResults(Unknown Source)
at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:564)
at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:792)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.RuntimeException: ORC split generation failed with exception: Malformed ORC file. Invalid postscript length 17
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:602)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:509)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2738)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473)
... 25 more
Caused by: java.lang.RuntimeException: ORC split generation failed with exception: Malformed ORC file. Invalid postscript length 17
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1851)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1939)
at org.apache.hadoop.hive.ql.exec.FetchOperator.generateWrappedSplits(FetchOperator.java:425)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextSplits(FetchOperator.java:395)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:314)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:540)
... 30 more
Caused by: org.apache.orc.FileFormatException: Malformed ORC file. Invalid postscript length 17
at org.apache.orc.impl.ReaderImpl.ensureOrcFooter(ReaderImpl.java:297)
at org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:463)
at org.apache.hadoop.hive.ql.io.orc.LocalCache.getAndValidate(LocalCache.java:107)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$ETLSplitStrategy.getSplits(OrcInputFormat.java:881)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$ETLSplitStrategy.runGetSplitsSync(OrcInputFormat.java:995)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$ETLSplitStrategy.generateSplitWork(OrcInputFormat.java:968)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.scheduleSplits(OrcInputFormat.java:1879)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1823)
... 35 more {code}
For the above, I am attaching the base64-encoded contents of
{{/user/hive/warehouse/t/20191004_092959_00001_5g5sx_d609b289-f8b0-4b29-abe7-69cfebe70fcb}}
(the only file in the {{t}} table), followed by its SHA-1 checksum:
{code:java}
$ base64 < 20191004_092959_00001_5g5sx_d609b289-f8b0-4b29-abe7-69cfebe70fcb
T1JDJwAAChEKAwAAABIKCAESBghUEFQYVAcAAE4AVFYAAONi42ATYJQQ42LjYATSbEIsHAwCDECS
SYBBis+xODNR3zuxJCM3MS+lFAAlAAAKEAoCCAEKCggBEgYIVBBUGFQiAQCT4uJgFhCTYFPQ02BU
4uDgEWJklGJMVGLiYNHS5BItKEotLsnXKy/KLEkt0itLLSrOzM8TEjA2tNQ1NLLUTTc1tjQzMU7R
0uPihyiNLyxNLaqMz0wRkjYyMLQ0NDAwiTewNLI0tYw3AALDeNN00+IKLSUuPqh6nGYaMFoxcTBa
cXEwCrFxhAiESIQ4TPDzYAIACJQBEAEYgIAQIgIADCgVMAYR {code}
{code:java}
$ sha1sum 20191004_092959_00001_5g5sx_d609b289-f8b0-4b29-abe7-69cfebe70fcb
9f35b293ceb4e4a71ee92d4a897a66fee3e79a13  20191004_092959_00001_5g5sx_d609b289-f8b0-4b29-abe7-69cfebe70fcb {code}
(JIRA won't let me add this as a normal attachment.)
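To relate the dump to the error message, here is a minimal Python sketch that reads the postscript length straight from the file tail. It relies on the ORC layout rule that the final byte of a file holds the postscript length. The helper name {{split_orc_tail}} is hypothetical, and the sketch assumes the base64 dump above is wrapped at 76 characters (the {{base64}} default), so that its last line decodes on its own to the final 36 bytes of the file.

```python
import base64
from typing import Tuple

# Last line of the base64 dump above. It is 48 characters long and starts on a
# 4-character boundary (assumption: the earlier lines are 76 characters each),
# so it decodes independently to the final 36 bytes of the file.
TAIL_B64 = "cXEwCrFxhAiESIQ4TPDzYAIACJQBEAEYgIAQIgIADCgVMAYR"


def split_orc_tail(tail: bytes) -> Tuple[int, bytes]:
    """Return (postscript_length, postscript_bytes) from the end of an ORC file.

    Hypothetical helper for illustration; not part of Hive or ORC.
    """
    if not tail:
        raise ValueError("empty input")
    ps_len = tail[-1]  # the final byte of an ORC file is the postscript length
    if ps_len + 1 > len(tail):
        raise ValueError("tail is shorter than the declared postscript")
    # The postscript is the ps_len bytes immediately before that final byte.
    return ps_len, tail[-1 - ps_len:-1]


tail = base64.b64decode(TAIL_B64)
ps_len, postscript = split_orc_tail(tail)
print(ps_len, postscript.hex())
```

For this file the last byte decodes to 17, the same value the exception reports, so the reader appears to be seeing the real postscript length rather than garbage. This only inspects the raw bytes; it does not explain why the second read rejects that length.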
> "Malformed ORC file. Invalid postscript length 17" when only one data-file in external table directory
> ------------------------------------------------------------------------------------------------------
>
> Key: HIVE-21436
> URL: https://issues.apache.org/jira/browse/HIVE-21436
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: archon gum
> Priority: Blocker
> Attachments: 1.jpg, 2.jpg, hive-insert-into.orc,
> org-apache-orc-java-code.orc, presto-insert-into.orc
>
>
> h1. env
> * Presto 305
> * Hive 3.1.0
>
> h1. step
>
> {code:java}
> -- create external table using hiveserver2
> CREATE EXTERNAL TABLE `dw.dim_date2`(
> `d` date
> )
> STORED AS ORC
> LOCATION
> 'hdfs://datacenter1:8020/user/hive/warehouse/dw.db/dim_date2'
> ;
> -- upload the 'presto-insert-into.orc' file from attachments
> -- OR
> -- insert one row using presto
> insert into dim_date2 values (current_date);
> {code}
>
>
> When querying through `hiveserver2`, only the first query works; every
> subsequent query fails with the error:
> !1.jpg!
>
> If I insert another row, queries work again:
> {code:java}
> -- upload the 'presto-insert-into.orc' file from attachments
> -- OR
> -- insert one row using presto
> insert into dim_date2 values (current_date);
> {code}
> !2.jpg!