Hi Amos!

load-data.py assumes that you have a running cluster. You need to first get these working:

  testdata/bin/run-all.sh
  bin/start-impala-cluster.py
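In case it's useful, the end-to-end sequence would look roughly like this (a sketch, assuming you run it from the root of your Impala checkout with the dev environment already set up, e.g. after sourcing bin/impala-config.sh):

```shell
# From the root of the Impala checkout, with the dev environment sourced:

testdata/bin/run-all.sh          # start HDFS, YARN, Hive Metastore, HiveServer2, ...
bin/start-impala-cluster.py      # start the 3-node Impala mini-cluster

# Once both succeed, re-run the data load:
bin/load-data.py -w all
```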
The first command starts all dependent services like HDFS, YARN, Hive Metastore, Hive HS2, etc. The second command starts an Impala mini-cluster with 3 nodes. This command assumes all dependent services are already running.

Hope it helps!

Alex

On Mon, Aug 15, 2016 at 5:20 AM, Amos Bird <[email protected]> wrote:
>
> I was trying to build a new test warehouse. After successfully running
> 'bin/create_testdata.sh', I did 'bin/load_data.py -w all'. Unfortunately it
> ended up with this:
>
> ERROR : Job Submission failed with exception 'java.io.IOException(java.util.concurrent.ExecutionException: java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!)'
> java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!
>     at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:544)
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:332)
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:324)
>     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
>     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
>     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
>     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:578)
>     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:573)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:573)
>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:564)
>     at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:430)
>     at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1782)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1539)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1318)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1127)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
>     at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:191)
>     at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:79)
>     at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:245)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>     at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:258)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>     at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:532)
>     ... 37 more
> Caused by: java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputFormatFromCache(HiveInputFormat.java:211)
>     at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CheckNonCombinablePathCallable.call(CombineHiveInputFormat.java:111)
>     at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CheckNonCombinablePathCallable.call(CombineHiveInputFormat.java:88)
>     ... 4 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputFormatFromCache(HiveInputFormat.java:203)
>     ... 6 more
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
>     ... 9 more
> Caused by: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
>     at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:135)
>     at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:175)
>     at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
>     ... 14 more
> Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2105)
>     at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:128)
>     ... 16 more
>
> ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> INFO : Completed executing command(queryId=amos_20160815034646_1d786772-c41e-4804-9d3c-dc768656ca3a); Time taken: 0.475 seconds
> Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=1)
> java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>     at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:279)
>     at org.apache.hive.beeline.Commands.executeInternal(Commands.java:893)
>     at org.apache.hive.beeline.Commands.execute(Commands.java:1079)
>     at org.apache.hive.beeline.Commands.sql(Commands.java:976)
>     at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1089)
>     at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:921)
>     at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:899)
>     at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:841)
>     at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:482)
>     at org.apache.hive.beeline.BeeLine.main(BeeLine.java:465)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>
> It seems like an MR framework needs to be running, but
> 'testdata/bin/run-all.sh' doesn't start it.
>
> Any help is much appreciated.
>
> regards,
> Amos.
