Looks like the root cause is that Hadoop's LZO codec could not be found. You probably need to fetch and build this: https://github.com/twitter/hadoop-lzo

bin/impala-config.sh assumes that you've checked out that repo into a specific directory, $IMPALA_HOME/../hadoop-lzo.
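Roughly, the build looks like this (a sketch, untested; check the hadoop-lzo README for the exact prerequisites, such as Maven and the native LZO headers, e.g. liblzo2-dev on Debian/Ubuntu):

  cd "$IMPALA_HOME/.."                              # impala-config.sh expects the repo here
  git clone https://github.com/twitter/hadoop-lzo.git
  cd hadoop-lzo
  mvn clean package -DskipTests                     # needs the native LZO dev headers on the build host
  cd "$IMPALA_HOME" && source bin/impala-config.sh  # re-source so the environment picks up the new jar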
Alternatively, you could also try to load data only for the non-LZO formats.

Alex

On Mon, Aug 15, 2016 at 7:05 PM, Amos Bird <[email protected]> wrote:

> Hi Alex,
> Thanks for the reply! I have indeed got them all up and running. Here is
> the jps list:
>
> 13184
> 12833 LlamaAMMain
> 14081 HQuorumPeer
> 40870 Jps
> 11302 DataNode
> 15081 RunJar
> 15722 RunJar
> 13259
> 11980 NodeManager
> 11244 DataNode
> 14700 HRegionServer
> 11278 NameNode
> 14353 HRegionServer
> 11218 DataNode
> 11923 NodeManager
> 11955 ResourceManager
> 12982 Bootstrap
> 14166 HMaster
> 11896 NodeManager
> 16089 RunJar
> 14527 HRegionServer
>
> I also played around with a few SQL statements inside Impala; they work
> fine. In fact, I was able to create the test data before. I have no clue
> what happened in the newest Impala build.
>
>
> Alex Behm writes:
>
> > Hi Amos!
> >
> > load-data.py assumes that you have a running cluster. You need to first
> > get these working:
> > testdata/bin/run-all.sh
> > bin/start-impala-cluster.py
> >
> > The first command starts all dependent services like HDFS, YARN, Hive
> > Metastore, Hive HS2, etc.
> > The second command starts an Impala mini-cluster with 3 nodes. This
> > command assumes all dependent services are already running.
> >
> > Hope it helps!
> >
> > Alex
> >
> > On Mon, Aug 15, 2016 at 5:20 AM, Amos Bird <[email protected]> wrote:
> >
> >> I was trying to build a new test warehouse. After successfully running
> >> 'bin/create_testdata.sh', I did 'bin/load_data.py -w all'. Unfortunately
> >> it ended up with this:
> >>
> >> ERROR : Job Submission failed with exception 'java.io.IOException(java.util.concurrent.ExecutionException: java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!)'
> >> java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!
> >>   at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:544)
> >>   at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:332)
> >>   at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:324)
> >>   at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
> >>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
> >>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
> >>   at java.security.AccessController.doPrivileged(Native Method)
> >>   at javax.security.auth.Subject.doAs(Subject.java:422)
> >>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
> >>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
> >>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:578)
> >>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:573)
> >>   at java.security.AccessController.doPrivileged(Native Method)
> >>   at javax.security.auth.Subject.doAs(Subject.java:422)
> >>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
> >>   at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:573)
> >>   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:564)
> >>   at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:430)
> >>   at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
> >>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> >>   at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> >>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1782)
> >>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1539)
> >>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1318)
> >>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1127)
> >>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
> >>   at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:191)
> >>   at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:79)
> >>   at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:245)
> >>   at java.security.AccessController.doPrivileged(Native Method)
> >>   at javax.security.auth.Subject.doAs(Subject.java:422)
> >>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
> >>   at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:258)
> >>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >>   at java.lang.Thread.run(Thread.java:745)
> >> Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!
> >>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> >>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> >>   at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:532)
> >>   ... 37 more
> >> Caused by: java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!
> >>   at org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputFormatFromCache(HiveInputFormat.java:211)
> >>   at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CheckNonCombinablePathCallable.call(CombineHiveInputFormat.java:111)
> >>   at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CheckNonCombinablePathCallable.call(CombineHiveInputFormat.java:88)
> >>   ... 4 more
> >> Caused by: java.lang.RuntimeException: Error in configuring object
> >>   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
> >>   at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
> >>   at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> >>   at org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputFormatFromCache(HiveInputFormat.java:203)
> >>   ... 6 more
> >> Caused by: java.lang.reflect.InvocationTargetException
> >>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>   at java.lang.reflect.Method.invoke(Method.java:498)
> >>   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
> >>   ... 9 more
> >> Caused by: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
> >>   at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:135)
> >>   at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:175)
> >>   at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
> >>   ... 14 more
> >> Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
> >>   at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2105)
> >>   at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:128)
> >>   ... 16 more
> >>
> >> ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> >> INFO  : Completed executing command(queryId=amos_20160815034646_1d786772-c41e-4804-9d3c-dc768656ca3a); Time taken: 0.475 seconds
> >> Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=1)
> >> java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> >>   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:279)
> >>   at org.apache.hive.beeline.Commands.executeInternal(Commands.java:893)
> >>   at org.apache.hive.beeline.Commands.execute(Commands.java:1079)
> >>   at org.apache.hive.beeline.Commands.sql(Commands.java:976)
> >>   at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1089)
> >>   at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:921)
> >>   at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:899)
> >>   at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:841)
> >>   at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:482)
> >>   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:465)
> >>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>   at java.lang.reflect.Method.invoke(Method.java:498)
> >>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> >>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> >>
> >> It seems like an MR framework needs to be running, but
> >> 'testdata/bin/run-all.sh' doesn't start it.
> >>
> >> Any help is much appreciated.
> >>
> >> regards,
> >> Amos.
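
One more check, in case the build alone doesn't fix it: make sure the codec is actually visible to Hadoop. A rough sketch (untested; the paths are assumptions, and bin/impala-config.sh defines the authoritative locations):

  grep -rn "com.hadoop.compression.lzo" "$HADOOP_CONF_DIR"  # is LzoCodec registered in io.compression.codecs?
  ls "$IMPALA_HOME"/../hadoop-lzo/target/hadoop-lzo-*.jar   # did the build produce a jar where expected?
  hadoop classpath | tr ':' '\n' | grep -i lzo              # is the jar on the Hadoop classpath?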
