[ https://issues.apache.org/jira/browse/CRUNCH-586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Wills reassigned CRUNCH-586:
---------------------------------

    Assignee: Josh Wills

> SparkPipeline does not work with HBaseSourceTarget
> --------------------------------------------------
>
>                 Key: CRUNCH-586
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-586
>             Project: Crunch
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: 0.13.0
>            Reporter: Stefan De Smit
>            Assignee: Josh Wills
>
> final Pipeline pipeline = new SparkPipeline("local", "crunchhbase", HBaseInputSource.class, conf);
> final PTable<ImmutableBytesWritable, Result> read = pipeline.read(new HBaseSourceTarget("t1", new Scan()));
> returns an empty table, while it works with MRPipeline.
> The root cause is the combination of Spark's getJavaRDDLike method:
>       source.configureSource(job, -1);
>       Converter converter = source.getConverter();
>       JavaPairRDD<?, ?> input = runtime.getSparkContext().newAPIHadoopRDD(
>           job.getConfiguration(),
>           CrunchInputFormat.class,
>           converter.getKeyClass(),
>           converter.getValueClass());
> which assumes CrunchInputFormat.class (and always uses -1),
> and HBase's configureSource method:
>     if (inputId == -1) {
>       job.setMapperClass(CrunchMapper.class);
>       job.setInputFormatClass(inputBundle.getFormatClass());
>       inputBundle.configure(conf);
>     } else {
>       Path dummy = new Path("/hbase/" + table);
>       CrunchInputs.addInputPath(job, dummy, inputBundle, inputId);
>     }
> The easiest solution I see is to always call CrunchInputs.addInputPath in every source.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)