[ 
https://issues.apache.org/jira/browse/IMPALA-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167918#comment-17167918
 ] 

Peter Vary commented on IMPALA-9923:
------------------------------------

[~laszlog]: On the Hive side, we wait until the DFSClient reports that the 
file has been created. Only then do we continue processing, and at that point 
we expect to be able to read the files.

Your suggestion sounds reasonable: with multiple cores we can reach the point 
where we try to read the file sooner, so the failure shows up more often.

On the other hand, based on the logs above, I think this should be handled on 
the HDFS side. If HDFS started to behave like S3, with inconsistent file 
creation, that would be a serious problem :)
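
To make the expectation above concrete, here is a minimal toy model in Python. 
It is only a sketch: ToyFileStore and write_then_checksum are hypothetical 
names, nothing here is the HDFS or Hive API, and the only thing taken from the 
issue is the wording of the error message. It shows the contract the comment 
relies on: once the writer has closed the file, a checksum request should 
succeed, and it should fail only while the file is still under construction.

```python
import zlib


class ToyFileStore:
    """In-memory stand-in for a file system that tracks unclosed files.

    Hypothetical model for illustration only; not the HDFS API.
    """

    def __init__(self):
        # path -> [contents as bytearray, under_construction flag]
        self._files = {}

    def create(self, path):
        # A newly created file is under construction until it is closed.
        self._files[path] = [bytearray(), True]

    def write(self, path, data):
        self._files[path][0].extend(data)

    def close(self, path):
        # Closing the file marks it as complete and readable.
        self._files[path][1] = False

    def checksum(self, path):
        contents, under_construction = self._files[path]
        if under_construction:
            # Same wording as the error in the issue log.
            raise IOError(
                "Fail to get checksum, since file %s is under construction." % path
            )
        return zlib.crc32(bytes(contents))


def write_then_checksum(store, path, data, close_before_read=True):
    """Write a file and then ask for its checksum.

    With close_before_read=False the checksum is requested while the file
    is still open, which is the situation the error message describes.
    """
    store.create(path)
    store.write(path, data)
    if close_before_read:
        store.close(path)
    return store.checksum(path)
```

In this model, calling write_then_checksum with the default arguments returns 
a checksum, while passing close_before_read=False raises the IOError. If the 
real system raises the error even though the writer was told the file is 
complete, the inconsistency lies in the store rather than in the caller, 
which is the point made in the comment.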

> Data loading of TPC-DS ORC fails with "Fail to get checksum"
> ------------------------------------------------------------
>
>                 Key: IMPALA-9923
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9923
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>            Reporter: Tim Armstrong
>            Assignee: Zoltán Borók-Nagy
>            Priority: Critical
>              Labels: broken-build, flaky
>         Attachments: load-tpcds-core-hive-generated-orc-def-block.sql, 
> load-tpcds-core-hive-generated-orc-def-block.sql.log
>
>
> {noformat}
> INFO  : Loading data to table tpcds_orc_def.store_sales partition 
> (ss_sold_date_sk=null) from 
> hdfs://localhost:20500/test-warehouse/managed/tpcds.store_sales_orc_def
> INFO  : 
> ERROR : FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask. java.io.IOException: Fail to get 
> checksum, since file 
> /test-warehouse/managed/tpcds.store_sales_orc_def/ss_sold_date_sk=2451646/base_0000003/_orc_acid_version
>  is under construction.
> INFO  : Completed executing 
> command(queryId=ubuntu_20200707055650_a1958916-1e85-4db5-b1bc-cc63d80b3537); 
> Time taken: 14.512 seconds
> INFO  : OK
> Error: Error while compiling statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.MoveTask. java.io.IOException: Fail to 
> get checksum, since file 
> /test-warehouse/managed/tpcds.store_sales_orc_def/ss_sold_date_sk=2451646/base_0000003/_orc_acid_version
>  is under construction. (state=08S01,code=1)
> java.sql.SQLException: Error while compiling statement: FAILED: Execution 
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. 
> java.io.IOException: Fail to get checksum, since file 
> /test-warehouse/managed/tpcds.store_sales_orc_def/ss_sold_date_sk=2451646/base_0000003/_orc_acid_version
>  is under construction.
>       at 
> org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:401)
>       at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:266)
>       at org.apache.hive.beeline.Commands.executeInternal(Commands.java:1007)
>       at org.apache.hive.beeline.Commands.execute(Commands.java:1217)
>       at org.apache.hive.beeline.Commands.sql(Commands.java:1146)
>       at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1497)
>       at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:1355)
>       at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:1329)
>       at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1127)
>       at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1082)
>       at 
> org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:546)
>       at org.apache.hive.beeline.BeeLine.main(BeeLine.java:528)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
>       at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
> Closing: 0: jdbc:hive2://localhost:11050/default;auth=none
> {noformat}
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11223/


