Re: 'hive.merge.mapfiles' is broken in trunk

2010-08-30 Thread Ning Zhang
I think it is because CDH does not support CombineFileInputFormat (or its 
version is incompatible with Hadoop 0.20.2). If you want to merge files, you 
can set hive.mergejob.maponly=false; the merge job will then not use 
CombineFileInputFormat. 
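
For reference, a minimal sketch of the two workarounds mentioned in this 
thread, assuming they are issued in the Hive CLI session before the 
INSERT ... SELECT runs. Both property names come from the messages here; 
whether either is sufficient on a given CDH build is untested.

```sql
-- Option 1 (suggested above): keep merging small files, but do not use a
-- map-only merge job, so CombineFileInputFormat is not used for the merge.
SET hive.mergejob.maponly=false;

-- Option 2 (from the original report below): turn off merging of map-only
-- job output entirely.
SET hive.merge.mapfiles=false;
```

Only one of the two should be needed; option 1 keeps the merge step, option 2 
skips it.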



On Aug 30, 2010, at 7:13 PM, 김영우 wrote:

 Hi folks,
 
 'hive.merge.mapfiles=true' is the default in trunk, but I get an error like 
 the one below:
 
 Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.lib.CombineFileInputFormat.createPool(Lorg/apache/hadoop/mapred/JobConf;[Lorg/apache/hadoop/fs/PathFilter;)V
     at org.apache.hadoop.hive.shims.Hadoop20Shims$CombineFileInputFormatShim.createPool(Hadoop20Shims.java:322)
     at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:303)
     at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:851)
     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:822)
     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:771)
     at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:610)
     at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:120)
     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:108)
     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55)
     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:895)
     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:764)
     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:640)
     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:140)
     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:199)
     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:353)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 
 However, after 'SET hive.merge.mapfiles=false', my query works fine. It is a 
 simple INSERT ... SELECT ... query.
 I'm wondering whether anyone has experienced this before. 
 
 I'm using CDH3 and Hive 0.7 (trunk).
 
 Thanks,
 
 Youngwoo
 



Re: 'hive.merge.mapfiles' is broken in trunk

2010-08-30 Thread 김영우
Ning,

I've just found a similar issue: http://bit.ly/d5zc8G
CDH3's CombineFileInputFormat is incompatible with Hadoop 0.20.2 :-(

Thanks for your quick reply.

Youngwoo

2010/8/31 Ning Zhang nzh...@facebook.com

 I think it is because CDH does not support CombineFileInputFormat (or its
 version is incompatible with Hadoop 0.20.2). If you want to merge files, you
 can set hive.mergejob.maponly=false; the merge job will then not use
 CombineFileInputFormat.


