What's the minimum version of HBase that Crunch will support? We have exactly the same need, but because the fix for HBASE-3996 requires region server changes, it wasn't easy to backport to 0.92 or 0.94.2 (CDH 4.2).

On Mon, Apr 8, 2013 at 3:47 PM, Josh Wills <[email protected]> wrote:
> Maybe we need something based on this?
>
> https://issues.apache.org/jira/browse/HBASE-3996
>
>
> On Mon, Apr 8, 2013 at 1:41 PM, Chad Urso McDaniel <[email protected]> wrote:
>
>> This may be a core Hadoop question.
>>
>> We are using Crunch with HBase.
>> We typically set up the input PTable like so:
>> ---
>> Scan scan = ...
>> HBaseSourceTarget source = new HBaseSourceTarget(tableName, scan);
>> PTable<ImmutableBytesWritable, Result> data = pipeline.read(source);
>> ---
>>
>> A use case that we want to use in order to speed up the processing with
>> Crunch is using multiple Scans into one PTable.
>>
>> We know which sections of the HBase table we want and they are not
>> contiguous.
>>
>> We have tried unioning the PTables but that turns out to be incredibly
>> slow.
>> Currently we are using a filter that results in many unnecessary reads.
>>
>> How do others solve this?
>>
>> I'm tempted to write a TableSource that can do this.
>>
>> thanks
>>
>
>
>
> --
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>
>
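
To make the trade-off in the quoted question concrete (one scan plus a filter versus several bounded scans over non-contiguous key ranges), here is a rough toy sketch. It does not use the Crunch or HBase APIs at all: a sorted in-memory TreeMap stands in for the table, and every name in it (MultiRangeScanSketch, KeyRange, sampleTable, the row keys) is made up for illustration. It only counts how many rows each approach has to touch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model only: a sorted in-memory map stands in for an HBase table.
// None of these names come from Crunch or HBase.
class MultiRangeScanSketch {

    // A half-open row-key range [start, stop), similar in spirit to a scan's start/stop rows.
    static final class KeyRange {
        final String start;
        final String stop;

        KeyRange(String start, String stop) {
            this.start = start;
            this.stop = stop;
        }

        boolean contains(String key) {
            return key.compareTo(start) >= 0 && key.compareTo(stop) < 0;
        }
    }

    // Approach 1: a single full scan with a filter. Every row is read; most are discarded.
    static int rowsReadWithFilter(NavigableMap<String, String> table,
                                  List<KeyRange> ranges, List<String> out) {
        int read = 0;
        for (String key : table.keySet()) {
            read++;
            for (KeyRange r : ranges) {
                if (r.contains(key)) {
                    out.add(key);
                    break;
                }
            }
        }
        return read;
    }

    // Approach 2: one bounded scan per range. Only rows inside a range are read.
    static int rowsReadWithRangeScans(NavigableMap<String, String> table,
                                      List<KeyRange> ranges, List<String> out) {
        int read = 0;
        for (KeyRange r : ranges) {
            for (String key : table.subMap(r.start, true, r.stop, false).keySet()) {
                read++;
                out.add(key);
            }
        }
        return read;
    }

    // Builds 100 rows with keys row000 .. row099.
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < 100; i++) {
            table.put(String.format("row%03d", i), "value" + i);
        }
        return table;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = sampleTable();
        // Two non-contiguous sections of the table.
        List<KeyRange> ranges = Arrays.asList(
                new KeyRange("row010", "row015"),
                new KeyRange("row080", "row083"));

        List<String> viaFilter = new ArrayList<>();
        List<String> viaRanges = new ArrayList<>();
        int filterReads = rowsReadWithFilter(table, ranges, viaFilter);
        int rangeReads = rowsReadWithRangeScans(table, ranges, viaRanges);

        System.out.println("filter: read " + filterReads + ", kept " + viaFilter.size());
        System.out.println("ranges: read " + rangeReads + ", kept " + viaRanges.size());
    }
}
```

Both approaches keep the same rows here, but the filter version reads the whole table while the per-range version reads only the rows it keeps. Whether a real multi-scan source behaves this cleanly depends on how the input format splits the work, which this sketch does not model.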
