Hi,
Thanks for your answers.
- crunch tmp dir permissions are fine; crunch creates a new folder inside it every time I launch the batch
- the crunch example jar (wordcount) works
- the hbase connection is OK (I can scan the table)
- I updated to crunch 0.6.0, logs are enabled, and I now have more information about the error.

It looks like it can't find some HBase dependency jars to ship them to the cluster. I think all the necessary dependencies are packaged inside my jar. All Hadoop dependencies come from the Cloudera repository, so I don't think it's a version issue.
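For what it's worth, the innermost "Caused by" in the trace below is File.createTempFile throwing when the directory it is asked to write into does not exist. A minimal standalone sketch of that failure mode (the path is made up purely for illustration):

```java
import java.io.File;
import java.io.IOException;

public class TempDirProbe {
    public static void main(String[] args) {
        // File.createTempFile fails with "No such file or directory" when the
        // target directory does not exist; this is the same bottom frame as in
        // the stack trace below. The path here is illustrative only.
        File missing = new File("/tmp/does-not-exist-" + System.nanoTime());
        try {
            File.createTempFile("probe-", ".jar", missing);
            System.out.println("temp file created (directory existed)");
        } catch (IOException e) {
            System.out.println("IOException: " + e.getMessage());
        }
    }
}
```

So it may be worth checking that whatever temp directory JarFinder resolves on the machine launching the job actually exists and is writable.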
Any ideas?

Exception in thread "main" org.apache.crunch.CrunchRuntimeException: java.io.IOException: java.lang.RuntimeException: java.io.IOException: No such file or directory
	at org.apache.crunch.impl.mr.MRPipeline.plan(MRPipeline.java:153)
	at org.apache.crunch.impl.mr.MRPipeline.runAsync(MRPipeline.java:172)
	at org.apache.crunch.impl.mr.MRPipeline.run(MRPipeline.java:160)
	at org.apache.crunch.impl.mr.MRPipeline.done(MRPipeline.java:181)
	at com.myprocurement.crunch.job.extractor.ExtractAndConcatJob.run(ExtractAndConcatJob.java:102)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at com.myprocurement.crunch.job.fullpage.CrunchLauncher.launch(CrunchLauncher.java:40)
	at com.myprocurement.crunch.BatchMain.main(BatchMain.java:31)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.io.IOException: java.lang.RuntimeException: java.io.IOException: No such file or directory
	at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:521)
	at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:472)
	at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:438)
	at org.apache.crunch.io.hbase.HBaseSourceTarget.configureSource(HBaseSourceTarget.java:100)
	at org.apache.crunch.impl.mr.plan.JobPrototype.build(JobPrototype.java:192)
	at org.apache.crunch.impl.mr.plan.JobPrototype.getCrunchJob(JobPrototype.java:123)
	at org.apache.crunch.impl.mr.plan.MSCRPlanner.plan(MSCRPlanner.java:159)
	at org.apache.crunch.impl.mr.MRPipeline.plan(MRPipeline.java:151)
	... 12 more
Caused by: java.lang.RuntimeException: java.io.IOException: No such file or directory
	at org.apache.hadoop.util.JarFinder.getJar(JarFinder.java:164)
	at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:518)
	... 19 more
Caused by: java.io.IOException: No such file or directory
	at java.io.UnixFileSystem.createFileExclusively(Native Method)
	at java.io.File.checkAndCreate(File.java:1705)
	at java.io.File.createTempFile0(File.java:1726)
	at java.io.File.createTempFile(File.java:1803)
	at org.apache.hadoop.util.JarFinder.getJar(JarFinder.java:156)
	... 23 more

2013/5/13 Josh Wills <[email protected]>

> It does sound like a permission issue -- you can set the crunch.tmp.dir
> property on the command line (assuming you're implementing the Tool
> interface) by setting -Dcrunch.tmp.dir=... to see if that helps.
>
> On Mon, May 13, 2013 at 5:15 AM, Christian Tzolov <[email protected]> wrote:
>
>> You can try MRPipeline.enableDebug() to lower the log level.
>>
>> On Mon, May 13, 2013 at 12:06 PM, Quentin Ambard <[email protected]> wrote:
>>
>>> The problem is that I don't see my job on the JobTracker page. It's like
>>> the job doesn't even start!
>>> Is there a way to raise the log level to get more information on the
>>> error?
>>>
>>> 2013/5/12 Josh Wills <[email protected]>
>>>
>>>> Something probably failed in the MapReduce job itself, which meant that
>>>> there weren't any outputs for Crunch to move around. What do the error
>>>> logs for the individual tasks look like on the JobTracker status page(s)?
>>>>
>>>> On Sat, May 11, 2013 at 5:02 PM, Quentin Ambard <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>> I'm running a simple job on hadoop cdh 4.1.2 based on crunch.
>>>>> The job is quite simple: it scans an hbase table, extracts some data
>>>>> from each hbase entry, groups the results by key, combines them using
>>>>> an aggregator, then writes the result back to another hbase table.
>>>>> It works fine on my computer, but when I try to launch it on my
>>>>> hadoop cluster I get the following:
>>>>>
>>>>> >>hadoop jar uber-crunch-1.0-SNAPSHOT.jar description /home/quentin/default.properties
>>>>> 13/05/12 01:57:50 INFO support.ClassPathXmlApplicationContext: Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@1f4384c2: startup date [Sun May 12 01:57:50 CEST 2013]; root of context hierarchy
>>>>> 13/05/12 01:57:50 INFO xml.XmlBeanDefinitionReader: Loading XML bean definitions from class path resource [context/job-description-context.xml]
>>>>> 13/05/12 01:57:50 INFO xml.XmlBeanDefinitionReader: Loading XML bean definitions from class path resource [context/default-batch-context.xml]
>>>>> 13/05/12 01:57:51 INFO annotation.ClassPathBeanDefinitionScanner: JSR-330 'javax.inject.Named' annotation found and supported for component scanning
>>>>> 13/05/12 01:57:51 INFO config.PropertyPlaceholderConfigurer: Loading properties file from URL [file:/tmp/hadoop-hdfs/hadoop-unjar7637839123250781784/default.properties]
>>>>> 13/05/12 01:57:51 INFO config.PropertyPlaceholderConfigurer: Loading properties file from URL [jar:file:/home/quentin/uber-crunch-1.0-SNAPSHOT.jar!/default.properties]
>>>>> 13/05/12 01:57:51 INFO config.PropertyPlaceholderConfigurer: Loading properties file from URL [file:/home/quentin/default.properties]
>>>>> 13/05/12 01:57:51 INFO annotation.AutowiredAnnotationBeanPostProcessor: JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
>>>>> 13/05/12 01:57:51 INFO support.DefaultListableBeanFactory: Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@5b7b0998: defining beans [org.springframework.beans.factory.config.PropertyPlaceholderConfigurer#0,applicationContextHolder,descriptionLauncher,descriptionExtractor,emailExtractor,rawTextExtractor,keywordsExtractor,org.springframework.context.annotation.internalConfigurationAnnotationProcessor,org.springframework.context.annotation.internalAutowiredAnnotationProcessor,org.springframework.context.annotation.internalRequiredAnnotationProcessor,org.springframework.context.annotation.internalCommonAnnotationProcessor,org.springframework.context.annotation.ConfigurationClassPostProcessor$ImportAwareBeanPostProcessor#0]; root of factory hierarchy
>>>>> 13/05/12 01:57:52 INFO hbase.HBaseTarget: HBaseTarget ignores checks for existing outputs...
>>>>> 13/05/12 01:57:53 INFO collect.PGroupedTableImpl: Setting num reduce tasks to 2
>>>>> 13/05/12 01:57:53 ERROR mr.MRPipeline: org.apache.crunch.CrunchRuntimeException: java.io.IOException: java.lang.RuntimeException: java.io.IOException: No such file or directory
>>>>> 13/05/12 01:57:53 WARN mr.MRPipeline: Not running cleanup while output targets remain
>>>>>
>>>>> Any idea of the origin of the problem? Maybe it's something with
>>>>> permissions or a crunch tmp file, but I can't figure out where it
>>>>> comes from.
>>>>>
>>>>> Thanks for your help
>>>>>
>>>>> Quentin
>>>>
>>>> --
>>>> Director of Data Science
>>>> Cloudera <http://www.cloudera.com>
>>>> Twitter: @josh_wills <http://twitter.com/josh_wills>
>>>
>>> --
>>> Quentin Ambard
>>
>
> --
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>

--
Quentin Ambard
