Re: Re: Re: Failed to load test data about TPC-H
OK. Thanks!

Quanlong

At 2017-06-02 00:21:40, "Tim Armstrong" wrote:
> We don't test with mixed versions like that unfortunately.
Re: Re: Failed to load test data about TPC-H
We don't test with mixed versions like that unfortunately.

On Thu, Jun 1, 2017 at 8:02 AM, 黄权隆 wrote:
> Hi Tim,
>
> Thanks for your reply! I'll try these scripts later. One more question.
> Is the latest Impala compatible with components in CDH-5.7.3?
> For example, Hadoop-2.6.0 and Hive-1.1.0?
>
> We use the old version cdh-5.7.3-release just due to the concern
> of incompatibility.
>
> Thanks
>
> Quanlong
Re: Failed to load test data about TPC-H
Hi Quanlong,

It looks like you're missing the TPC-H data. In older versions of Impala
you had to generate the data manually and put it in that directory. We've
automated that in more recent versions (I think probably since a year ago).
If you can switch to a newer version, then this will just work. Data
loading is a lot more reliable now.

Otherwise this is the script that generates the data. You can probably copy
this script to your repository and run it by hand:

https://github.com/apache/incubator-impala/blob/master/testdata/datasets/tpch/preload

You will also need to do the same for TPC-DS:
https://github.com/apache/incubator-impala/blob/master/testdata/datasets/tpcds/preload

Cheers,
Tim

On Thu, Jun 1, 2017 at 12:54 AM, 黄权隆 wrote:
> Hi friends,
>
> I'm trying to run the Impala tests, following the wiki page 'How to
> load and run Impala tests'.
> Although I just want to run some end-to-end tests, I know I should load
> the test data first, so I ran:
>
>     ./buildall.sh -noclean -testdata
>
> It succeeded in loading the functional test data, but failed to load the
> TPC-H data set. Here are the related logs:
>
> /home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/target
> SUCCESS, data generated into /home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/target
> Loading Hive Builtins (logging to load-hive-builtins.log)... OK
> Generating HBase data (logging to create-hbase.log)... OK
> Creating /test-warehouse HDFS directory (logging to create-test-warehouse-dir.log)... OK
> Starting Impala cluster (logging to start-impala-cluster.log)... OK
> Setting up HDFS environment (logging to setup-hdfs-env.log)... OK
> Loading custom schemas (logging to load-custom-schemas.log)... OK
> Loading functional-query data (logging to load-functional-query.log)... OK
> Loading TPC-H data (logging to load-tpch.log)... FAILED
> 'load-data tpch core' failed. Tail of log:
> Log for command 'load-data tpch core'
> Loading workload 'tpch' Using exploration strategy 'core'.
> Logging to /home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/cluster_logs/data_loading/data-load-tpch-core.log
> Error loading data. The end of the log file is:
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:23 Invalid path ''/home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/impala-data/tpch/lineitem'': No files matching path file:/home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/impala-data/tpch/lineitem
>     at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.applyConstraints(LoadSemanticAnalyzer.java:139)
>     at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:230)
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:445)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:311)
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1189)
>     at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1176)
>     at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:134)
>     ... 26 more
>
> Closing: 0: jdbc:hive2://localhost:11050/default;auth=none
> Error executing file from Hive: load-tpch-core-hive-generated.sql
> Error in /home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/bin/create-load-data.sh at line 41: while [ -n "$*" ]
> Error in ./buildall.sh at line 368: ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS} <<< Y
>
> I'm using version cdh5.7.3-release. The directory
> ${IMPALA_HOME}/testdata/impala-data does not exist.
>
> Could you tell me how to generate this data set? Or where can I download
> the snapshot file of test-warehouse so I can skip this step?
>
> Thanks
>
> Quanlong
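
For anyone hitting the same "No files matching path" error on an older branch, a quick pre-flight check before running the data load might look like the sketch below. It only tests whether the raw data directories exist and are non-empty under testdata/impala-data. The function name missing_dataset_dirs is made up for illustration, and the "tpcds" directory name is an assumption; only the tpch path appears in the log above.

```python
import os


def missing_dataset_dirs(impala_home, datasets=("tpch", "tpcds")):
    """Return dataset directories under testdata/impala-data that are absent or empty.

    The "tpch" location comes from the error in the log above; "tpcds" is an
    assumed name for the TPC-DS counterpart and may differ in your checkout.
    """
    data_root = os.path.join(impala_home, "testdata", "impala-data")
    missing = []
    for name in datasets:
        path = os.path.join(data_root, name)
        # Hive's LOAD DATA fails when the source path has no files, so treat
        # an empty directory the same way as a missing one.
        if not os.path.isdir(path) or not os.listdir(path):
            missing.append(path)
    return missing
```

Calling missing_dataset_dirs(os.environ["IMPALA_HOME"]) before ./buildall.sh -noclean -testdata would list the directories that still need to be generated by hand with the preload scripts linked above.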