Well, I finally found out: my jar was owned by a different unix user than the one launching the job. That's why I had the error!
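[Editor's note: for anyone hitting the same symptom, the mismatch can be spotted before launching by comparing the jar file's owner with the user running `hadoop jar`. A minimal stdlib-only sketch; the jar path is just the one from the log below and should be substituted:]

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class JarOwnerCheck {
    public static void main(String[] args) throws IOException {
        // Path to the job jar; this default is only the jar name from the
        // thread below -- pass your own path as the first argument.
        Path jar = Paths.get(args.length > 0 ? args[0]
                                             : "uber-crunch-1.0-SNAPSHOT.jar");

        String owner  = Files.getOwner(jar).getName();     // unix user owning the file
        String runner = System.getProperty("user.name");   // user launching the job

        if (!owner.equals(runner)) {
            System.err.printf("jar owned by '%s' but job launched as '%s'%n",
                              owner, runner);
        } else {
            System.out.println("jar owner matches the launching user");
        }
    }
}
```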
2013/5/15 Quentin Ambard <[email protected]>

> Hi,
> Thanks for your answers.
> - crunch tmp dir permissions are fine; crunch creates a new folder inside it every time I launch the batch
> - crunch example jar (wordcount)
> - hbase connection is OK (I can scan the table)
> - I updated to crunch 0.6.0; logs are enabled and I now have more information about the error.
> It looks like it can't find some HBase dependency jars to ship them to the cluster. I think all necessary dependencies are packaged inside my jar. All hadoop dependencies are from the Cloudera repository (so I don't think it's a version issue).
>
> Any ideas?
>
> Exception in thread "main" org.apache.crunch.CrunchRuntimeException: java.io.IOException: java.lang.RuntimeException: java.io.IOException: No such file or directory
>         at org.apache.crunch.impl.mr.MRPipeline.plan(MRPipeline.java:153)
>         at org.apache.crunch.impl.mr.MRPipeline.runAsync(MRPipeline.java:172)
>         at org.apache.crunch.impl.mr.MRPipeline.run(MRPipeline.java:160)
>         at org.apache.crunch.impl.mr.MRPipeline.done(MRPipeline.java:181)
>         at com.myprocurement.crunch.job.extractor.ExtractAndConcatJob.run(ExtractAndConcatJob.java:102)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at com.myprocurement.crunch.job.fullpage.CrunchLauncher.launch(CrunchLauncher.java:40)
>         at com.myprocurement.crunch.BatchMain.main(BatchMain.java:31)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
> Caused by: java.io.IOException: java.lang.RuntimeException: java.io.IOException: No such file or directory
>         at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:521)
>         at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:472)
>         at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:438)
>         at org.apache.crunch.io.hbase.HBaseSourceTarget.configureSource(HBaseSourceTarget.java:100)
>         at org.apache.crunch.impl.mr.plan.JobPrototype.build(JobPrototype.java:192)
>         at org.apache.crunch.impl.mr.plan.JobPrototype.getCrunchJob(JobPrototype.java:123)
>         at org.apache.crunch.impl.mr.plan.MSCRPlanner.plan(MSCRPlanner.java:159)
>         at org.apache.crunch.impl.mr.MRPipeline.plan(MRPipeline.java:151)
>         ... 12 more
> Caused by: java.lang.RuntimeException: java.io.IOException: No such file or directory
>         at org.apache.hadoop.util.JarFinder.getJar(JarFinder.java:164)
>         at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:518)
>         ... 19 more
> Caused by: java.io.IOException: No such file or directory
>         at java.io.UnixFileSystem.createFileExclusively(Native Method)
>         at java.io.File.checkAndCreate(File.java:1705)
>         at java.io.File.createTempFile0(File.java:1726)
>         at java.io.File.createTempFile(File.java:1803)
>         at org.apache.hadoop.util.JarFinder.getJar(JarFinder.java:156)
>         ... 23 more
>
> 2013/5/13 Josh Wills <[email protected]>
>
>> It does sound like a permission issue -- you can set the crunch.tmp.dir property on the command line (assuming you're implementing the Tool interface) by passing -Dcrunch.tmp.dir=... to see if that helps.
>>
>> On Mon, May 13, 2013 at 5:15 AM, Christian Tzolov <[email protected]> wrote:
>>
>>> You can try MRPipeline.enableDebug() to lower the log level.
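[Editor's note: the deepest `Caused by` in the trace above is `File.createTempFile` failing inside `JarFinder.getJar`, which HBase's `TableMapReduceUtil.addDependencyJars` uses to package dependencies it cannot find as jars on disk. `createTempFile` throws exactly this `IOException` when the temp directory it is given is missing or unreadable to the launching user. A stdlib-only sketch of that failure mode; the directory name is made up for illustration:]

```java
import java.io.File;
import java.io.IOException;

public class TempFileFailureDemo {
    public static void main(String[] args) {
        // A directory that does not exist, standing in for a bad or
        // inaccessible tmp dir (the name here is purely hypothetical).
        File missingDir = new File("/nonexistent-tmp-dir-for-demo");

        try {
            // The same kind of call JarFinder.getJar makes when it has to
            // build a jar on the fly for a dependency class.
            File.createTempFile("hadoop-", ".jar", missingDir);
            System.out.println("unexpected: temp file was created");
        } catch (IOException e) {
            // On Linux this surfaces as "No such file or directory",
            // matching the bottom of the stack trace above.
            System.out.println("IOException: " + e.getMessage());
        }
    }
}
```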
>>>
>>> On Mon, May 13, 2013 at 12:06 PM, Quentin Ambard <[email protected]> wrote:
>>>
>>>> The problem is that I don't see my job on the JobTracker page. It's as if the job doesn't even start! Is there a way to raise the log level to get more information on the error?
>>>>
>>>> 2013/5/12 Josh Wills <[email protected]>
>>>>
>>>>> Something probably failed in the MapReduce job itself, which meant that there weren't any outputs for Crunch to move around. What do the error logs for the individual tasks look like on the JobTracker status page(s)?
>>>>>
>>>>> On Sat, May 11, 2013 at 5:02 PM, Quentin Ambard <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I'm running a simple job on hadoop cdh 4.1.2 based on crunch.
>>>>>> The job is quite simple: it scans an HBase table, extracts some data from each entry, groups the results by key and combines them using an aggregator, then writes them back to another HBase table.
>>>>>> It works fine on my computer; however, when I try to launch it on my hadoop cluster I get the following:
>>>>>>
>>>>>> >> hadoop jar uber-crunch-1.0-SNAPSHOT.jar description /home/quentin/default.properties
>>>>>> 13/05/12 01:57:50 INFO support.ClassPathXmlApplicationContext: Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@1f4384c2: startup date [Sun May 12 01:57:50 CEST 2013]; root of context hierarchy
>>>>>> 13/05/12 01:57:50 INFO xml.XmlBeanDefinitionReader: Loading XML bean definitions from class path resource [context/job-description-context.xml]
>>>>>> 13/05/12 01:57:50 INFO xml.XmlBeanDefinitionReader: Loading XML bean definitions from class path resource [context/default-batch-context.xml]
>>>>>> 13/05/12 01:57:51 INFO annotation.ClassPathBeanDefinitionScanner: JSR-330 'javax.inject.Named' annotation found and supported for component scanning
>>>>>> 13/05/12 01:57:51 INFO config.PropertyPlaceholderConfigurer: Loading properties file from URL [file:/tmp/hadoop-hdfs/hadoop-unjar7637839123250781784/default.properties]
>>>>>> 13/05/12 01:57:51 INFO config.PropertyPlaceholderConfigurer: Loading properties file from URL [jar:file:/home/quentin/uber-crunch-1.0-SNAPSHOT.jar!/default.properties]
>>>>>> 13/05/12 01:57:51 INFO config.PropertyPlaceholderConfigurer: Loading properties file from URL [file:/home/quentin/default.properties]
>>>>>> 13/05/12 01:57:51 INFO annotation.AutowiredAnnotationBeanPostProcessor: JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
>>>>>> 13/05/12 01:57:51 INFO support.DefaultListableBeanFactory: Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@5b7b0998: defining beans [org.springframework.beans.factory.config.PropertyPlaceholderConfigurer#0,applicationContextHolder,descriptionLauncher,descriptionExtractor,emailExtractor,rawTextExtractor,keywordsExtractor,org.springframework.context.annotation.internalConfigurationAnnotationProcessor,org.springframework.context.annotation.internalAutowiredAnnotationProcessor,org.springframework.context.annotation.internalRequiredAnnotationProcessor,org.springframework.context.annotation.internalCommonAnnotationProcessor,org.springframework.context.annotation.ConfigurationClassPostProcessor$ImportAwareBeanPostProcessor#0]; root of factory hierarchy
>>>>>> 13/05/12 01:57:52 INFO hbase.HBaseTarget: HBaseTarget ignores checks for existing outputs...
>>>>>> 13/05/12 01:57:53 INFO collect.PGroupedTableImpl: Setting num reduce tasks to 2
>>>>>> 13/05/12 01:57:53 ERROR mr.MRPipeline: org.apache.crunch.CrunchRuntimeException: java.io.IOException: java.lang.RuntimeException: java.io.IOException: No such file or directory
>>>>>> 13/05/12 01:57:53 WARN mr.MRPipeline: Not running cleanup while output targets remain
>>>>>>
>>>>>> Any idea of the origin of the problem? Maybe it's something with permissions or a crunch tmp file, but I can't find out where it comes from.
>>>>>>
>>>>>> Thanks for your help
>>>>>>
>>>>>> Quentin
>>>>>
>>>>> --
>>>>> Director of Data Science
>>>>> Cloudera <http://www.cloudera.com>
>>>>> Twitter: @josh_wills <http://twitter.com/josh_wills>
>>>>
>>>> --
>>>> Quentin Ambard
>>
>> --
>> Director of Data Science
>> Cloudera <http://www.cloudera.com>
>> Twitter: @josh_wills <http://twitter.com/josh_wills>
>
> --
> Quentin Ambard

--
Quentin Ambard
