What's the minimum version of HBase that Crunch will support?  We have
the exact same need, but because the fix for HBASE-3996 requires region
server changes, it wasn't as easy to patch back to 0.92 or 0.94.2
(CDH 4.2).



On Mon, Apr 8, 2013 at 3:47 PM, Josh Wills <[email protected]> wrote:

> Maybe we need something based on this?
>
> https://issues.apache.org/jira/browse/HBASE-3996
>
>
> On Mon, Apr 8, 2013 at 1:41 PM, Chad Urso McDaniel <[email protected]> wrote:
>
>> This may be a core hadoop question.
>>
>> We are using Crunch with HBase.
>> We typically set up the input PTable like so:
>> ---
>>       Scan scan = ...
>>       HBaseSourceTarget source = new HBaseSourceTarget(tableName, scan);
>>       PTable<ImmutableBytesWritable, Result> data = pipeline.read(source);
>> ---
>>
>> To speed up processing with Crunch, we would like to read multiple
>> Scans into one PTable.
>>
>> We know which sections of the HBase table we want and they are not
>> contiguous.
>>
>> We have tried unioning the PTables but that turns out to be incredibly
>> slow.
>> Currently we are using a filter that results in many unnecessary reads.
>>
>> How do others solve this?
>>
>> I'm tempted to write a TableSource that can do this.
>>
>> thanks
>>
>
>
>
> --
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>
>
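
For the non-contiguous sections described in the quoted message, one
step that does not depend on the HBase version is to sort the wanted row
ranges and merge any that overlap or touch, so that as few scans as
possible are needed. The sketch below is only an illustration under
stated assumptions: RowRanges, Range and merge are hypothetical names
(not part of Crunch or HBase), every range has a non-empty start and
stop key, and ranges are half-open [start, stop) and ordered by unsigned
byte comparison, as HBase row keys are.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Crunch or HBase.
class RowRanges {

    /** Half-open row-key range [start, stop); both bounds assumed non-empty. */
    static final class Range {
        final byte[] start;
        final byte[] stop;

        Range(byte[] start, byte[] stop) {
            this.start = start;
            this.stop = stop;
        }

        /** Convenience factory for readable examples (UTF-8 string keys). */
        static Range of(String start, String stop) {
            return new Range(start.getBytes(StandardCharsets.UTF_8),
                             stop.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public String toString() {
            return "[" + new String(start, StandardCharsets.UTF_8) + ", "
                       + new String(stop, StandardCharsets.UTF_8) + ")";
        }
    }

    /** Unsigned lexicographic comparison of two row keys. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /** Sorts ranges by start key and merges those that overlap or touch. */
    static List<Range> merge(List<Range> ranges) {
        List<Range> sorted = new ArrayList<>(ranges);
        sorted.sort((x, y) -> compare(x.start, y.start));

        List<Range> merged = new ArrayList<>();
        for (Range r : sorted) {
            if (!merged.isEmpty()) {
                Range last = merged.get(merged.size() - 1);
                if (compare(r.start, last.stop) <= 0) {
                    // r begins inside (or exactly at the end of) the previous
                    // range: extend the previous range if r reaches further.
                    if (compare(r.stop, last.stop) > 0) {
                        merged.set(merged.size() - 1, new Range(last.start, r.stop));
                    }
                    continue;
                }
            }
            merged.add(r);
        }
        return merged;
    }
}
```

Each merged range could then back one Scan (start row inclusive, stop
row exclusive), however those scans are eventually fed to the job. It
does not by itself address the slow union or the region server changes
discussed above.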
