[jira] [Resolved] (IMPALA-5765) Flaky tpc-ds data loading

2019-06-27 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-5765.
---
Resolution: Won't Fix

This hasn't happened for a long time, and I don't think we would realistically
put in the effort to work around the Hive issue at its current frequency.

> Flaky tpc-ds data loading
> -
>
> Key: IMPALA-5765
> URL: https://issues.apache.org/jira/browse/IMPALA-5765
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 2.10.0
> Reporter: Matthew Jacobs
> Assignee: Philip Zeyliger
> Priority: Major
> Labels: flaky
>
> Saw this on a number of gerrit-verify-dryrun jobs:
> {code}
> 23:49:37 Loading TPC-DS data (logging to 
> /home/ubuntu/Impala/logs/data_loading/load-tpcds.log)... 
> 23:55:39 FAILED (Took: 6 min 2 sec)
> 23:55:39 'load-data tpcds core' failed. Tail of log:
> 23:55:39 ss_net_profit,
> 23:55:39 ss_sold_date_sk
> 23:55:39 from store_sales_unpartitioned
> 23:55:39 WHERE ss_sold_date_sk < 2451272
> 23:55:39 distribute by ss_sold_date_sk
> 23:55:39 INFO  : Query ID = 
> ubuntu_2017073123_26963c6a-a58b-4cad-b0c7-c3790f9b22dc
> 23:55:39 INFO  : Total jobs = 1
> 23:55:39 INFO  : Launching Job 1 out of 1
> 23:55:39 INFO  : Starting task [Stage-1:MAPRED] in serial mode
> 23:55:39 INFO  : Number of reduce tasks not specified. Estimated from input 
> data size: 2
> 23:55:39 INFO  : In order to change the average load for a reducer (in bytes):
> 23:55:39 INFO  :   set hive.exec.reducers.bytes.per.reducer=
> 23:55:39 INFO  : In order to limit the maximum number of reducers:
> 23:55:39 INFO  :   set hive.exec.reducers.max=
> 23:55:39 INFO  : In order to set a constant number of reducers:
> 23:55:39 INFO  :   set mapreduce.job.reduces=
> 23:55:39 INFO  : number of splits:2
> 23:55:39 INFO  : Submitting tokens for job: job_local1252085428_0826
> 23:55:39 INFO  : The url to track the job: http://localhost:8080/
> 23:55:39 INFO  : Job running in-process (local Hadoop)
> 23:55:39 INFO  : 2017-07-31 23:55:06,606 Stage-1 map = 0%,  reduce = 0%
> 23:55:39 INFO  : 2017-07-31 23:55:13,609 Stage-1 map = 100%,  reduce = 0%
> 23:55:39 INFO  : 2017-07-31 23:55:28,621 Stage-1 map = 100%,  reduce = 33%
> 23:55:39 ERROR : Ended Job = job_local1252085428_0826 with errors
> 23:55:39 ERROR : FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> 23:55:39 INFO  : MapReduce Jobs Launched: 
> 23:55:39 INFO  : Stage-Stage-1:  HDFS Read: 26483258512 HDFS Write: 
> 19378762131 FAIL
> 23:55:39 INFO  : Total MapReduce CPU Time Spent: 0 msec
> 23:55:39 INFO  : Completed executing 
> command(queryId=ubuntu_2017073123_26963c6a-a58b-4cad-b0c7-c3790f9b22dc); 
> Time taken: 33.276 seconds
> 23:55:39 Error: Error while processing statement: FAILED: Execution Error, 
> return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask 
> (state=08S01,code=2)
> 23:55:39 java.sql.SQLException: Error while processing statement: FAILED: 
> Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> 23:55:39  at 
> org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:292)
> 23:55:39  at 
> org.apache.hive.beeline.Commands.executeInternal(Commands.java:989)
> 23:55:39  at org.apache.hive.beeline.Commands.execute(Commands.java:1203)
> 23:55:39  at org.apache.hive.beeline.Commands.sql(Commands.java:1117)
> 23:55:39  at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1176)
> 23:55:39  at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:1010)
> 23:55:39  at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:987)
> 23:55:39  at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:914)
> 23:55:39  at 
> org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:518)
> 23:55:39  at org.apache.hive.beeline.BeeLine.main(BeeLine.java:501)
> 23:55:39  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 23:55:39  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 23:55:39  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 23:55:39  at java.lang.reflect.Method.invoke(Method.java:606)
> 23:55:39  at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> 23:55:39  at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> 23:55:39 
> 23:55:39 Closing: 0: jdbc:hive2://localhost:11050/default;auth=none
> 23:55:39 Error executing file from Hive: load-tpcds-core-hive-generated.sql
> 23:55:39 Error in /home/ubuntu/Impala/testdata/bin/create-load-data.sh at 
> line 48: LOAD_DATA_ARGS=""
> {code}
> https://jenkins.impala.io/job/ubuntu-14.04-from-scratch/1827/
> It's been reported a few times in the last week. Here's another 
