Re:Re: Re: Failed to load test data about TPC-H

2017-06-01 Thread 黄权隆
OK. Thanks!




Quanlong

At 2017-06-02 00:21:40, "Tim Armstrong"  wrote:

We don't test with mixed versions like that unfortunately.



On Thu, Jun 1, 2017 at 8:02 AM, 黄权隆  wrote:

Hi Tim,


Thanks for your reply! I'll try these scripts later. One more question:
is the latest Impala compatible with the components in CDH-5.7.3,
for example Hadoop-2.6.0 and Hive-1.1.0?


We are using the old cdh-5.7.3-release version only because of
compatibility concerns.


Thanks


Quanlong




Re: Re: Failed to load test data about TPC-H

2017-06-01 Thread Tim Armstrong
We don't test with mixed versions like that unfortunately.

On Thu, Jun 1, 2017 at 8:02 AM, 黄权隆  wrote:

> Hi Tim,
>
> Thanks for your reply! I'll try these scripts later. One more question:
> is the latest Impala compatible with the components in CDH-5.7.3,
> for example Hadoop-2.6.0 and Hive-1.1.0?
>
> We are using the old cdh-5.7.3-release version only because of
> compatibility concerns.
>
> Thanks
> 
> Quanlong

Re: Failed to load test data about TPC-H

2017-06-01 Thread Tim Armstrong
Hi Quanlong,
  It looks like you're missing the TPC-H data. In older versions of Impala
you had to generate the data manually and put it in that directory. We've
automated that in more recent versions (I think probably since a year ago).
If you can switch to a newer version, then this will just work. Data
loading is a lot more reliable now.

Otherwise this is the script that generates the data. You can probably copy
this script to your repository and run it by hand:

https://github.com/apache/incubator-impala/blob/master/testdata/datasets/tpch/preload

You will also need to do the same for TPC-DS:
https://github.com/apache/incubator-impala/blob/master/testdata/datasets/tpcds/preload
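
Before rerunning the load, a small check along these lines could confirm whether the local data directories are actually in place. This is only a sketch: the function name `missing_data_dirs` is made up for illustration, the table list is the standard TPC-H set, and the one-directory-per-table layout under testdata/impala-data is an assumption extrapolated from the single `lineitem` path in the reported error.

```python
import os

# The eight standard TPC-H table names. Only "lineitem" appears in the
# reported error; one directory per table is an assumed layout.
TPCH_TABLES = ["customer", "lineitem", "nation", "orders",
               "part", "partsupp", "region", "supplier"]


def missing_data_dirs(impala_home, dataset="tpch", tables=TPCH_TABLES):
    """Return expected data directories that are absent or contain no files."""
    base = os.path.join(impala_home, "testdata", "impala-data", dataset)
    missing = []
    for table in tables:
        path = os.path.join(base, table)
        # os.listdir is only evaluated when the directory exists.
        if not os.path.isdir(path) or not os.listdir(path):
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Falls back to the current directory if IMPALA_HOME is not set.
    home = os.environ.get("IMPALA_HOME", ".")
    for path in missing_data_dirs(home):
        print("missing or empty:", path)
```

If this prints nothing, the directories exist and hold at least one file each; it does not verify that the files contain valid data.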


Cheers,
Tim

On Thu, Jun 1, 2017 at 12:54 AM, 黄权隆  wrote:

> Hi friends,
>
>
> I'm trying to run the Impala tests, following the wiki page 'How to
> load and run Impala tests'.
> Although I only want to run some end-to-end tests, I understand that I
> need to load the test data first, so I ran:
>
> ./buildall.sh -noclean -testdata
>
> The functional test data loaded successfully, but loading the TPC-H
> data set failed. Here are the relevant logs:
>
>
> /home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/target
> SUCCESS, data generated into /home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/target
> Loading Hive Builtins (logging to load-hive-builtins.log)... OK
> Generating HBase data (logging to create-hbase.log)... OK
> Creating /test-warehouse HDFS directory (logging to create-test-warehouse-dir.log)... OK
> Starting Impala cluster (logging to start-impala-cluster.log)... OK
> Setting up HDFS environment (logging to setup-hdfs-env.log)... OK
> Loading custom schemas (logging to load-custom-schemas.log)... OK
> Loading functional-query data (logging to load-functional-query.log)... OK
> Loading TPC-H data (logging to load-tpch.log)... FAILED
> 'load-data tpch core' failed. Tail of log:
> Log for command 'load-data tpch core'
> Loading workload 'tpch' Using exploration strategy 'core'. Logging to /home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/cluster_logs/data_loading/data-load-tpch-core.log
> Error loading data. The end of the log file is:
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:23 Invalid path ''/home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/impala-data/tpch/lineitem'': No files matching path file:/home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/impala-data/tpch/lineitem
> at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.applyConstraints(LoadSemanticAnalyzer.java:139)
> at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:230)
> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:445)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:311)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1189)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1176)
> at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:134)
> ... 26 more
>
>
> Closing: 0: jdbc:hive2://localhost:11050/default;auth=none
> Error executing file from Hive: load-tpch-core-hive-generated.sql
> Error in /home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/bin/create-load-data.sh at line 41: while [ -n "$*" ]
> Error in ./buildall.sh at line 368: ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS} <<< Y
>
>
> I'm using version cdh5.7.3-release. The directory
> ${IMPALA_HOME}/testdata/impala-data does not exist.
>
>
> Could you tell me how to generate this data set? Or where can I download
> the snapshot file of test-warehouse so I can skip this step?
>
>
> Thanks
> 
> Quanlong
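
In a long data-load log, the useful part is usually the last "Caused by:" entry rather than the Thrift and thread-pool frames around it. The sketch below shows one way to pull that entry out of a log file's text. The helper name `last_cause` is hypothetical, and it assumes each "Caused by:" entry sits on a single unwrapped line, as it would in the log file itself rather than in a line-wrapped email.

```python
import re

# Matches e.g. "Caused by: some.pkg.SomeException: message text"
CAUSED_BY = re.compile(r"^\s*Caused by: ([\w.$]+): (.*)$")


def last_cause(log_text):
    """Return (exception_class, message) for the last 'Caused by:' line, or None."""
    cause = None
    for line in log_text.splitlines():
        match = CAUSED_BY.match(line)
        if match:
            # Later matches overwrite earlier ones, so the innermost cause wins.
            cause = (match.group(1), match.group(2).strip())
    return cause
```

Applied to a log like the one in this message, the returned message would be the "Invalid path ... No files matching path ..." text, which points directly at the missing local directory.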