(fwiw, HBaseStorage works fine for me when I use it to pull whole protocol
buffer messages down as byte arrays)

On Tue, Sep 6, 2011 at 10:10 AM, Dmitriy Ryaboy <[email protected]> wrote:

> That's interesting... we should be able to return a byte array properly
> (though this is a bit risky for people who try to later turn this bytearray
> into a long using Pig, since the conversion from bytes to longs in Pig is
> different than in HBase).
>
> Could you guys open a jira, preferably with an easy way to reproduce the
> error?
>
> D
>
>
> On Tue, Sep 6, 2011 at 10:03 AM, Bryce Poole <[email protected]> wrote:
>
>> My load looks like this
>>
>> .... AS (key:chararray, value:long);
>>
>> and I'm able to return data.
>>
>> I changed the load to
>>
>> .... AS (key:chararray, value:bytearray);
>>
>> and had results that match yours.
>>
>> Try changing the value to long or int type and see if that helps.
>>
>> -bp
>>
>>
>> On Tue, Sep 6, 2011 at 9:00 AM, shazz Ng <[email protected]> wrote:
>>
>> > the 'funny' thing is that if I look at the other CF, name (the reverse
>> > mapping, from a byte id to the name):
>> >
>> > grunt> tsd_metrics2     = LOAD 'hbase://tsdb-uid' using
>> > org.apache.pig.backend.hadoop.hbase.HBaseStorage('name:metrics',
>> > '-caster=HBaseBinaryConverter -loadKey=true') AS (key:bytearray,
>> > metrics:bytearray);
>> >
>> > I've got the same issue:
>> > (,proc.loadavg.1m)
>> > (,proc.loadavg.5m)
>> > (,Measurement_1)
>> > (,Measurement_2)
>> > (,Measurement_3)
>> >
>> > So there is a real issue with byte array....
>> >
>> > On Tue, Sep 6, 2011 at 4:30 PM, shazz Ng <[email protected]> wrote:
>> >
>> > > Hello Bryce,
>> > >
>> > > not better... :-(
>> > >
>> > > grunt> tsd_metrics2     = LOAD 'hbase://tsdb-uid' using
>> > > org.apache.pig.backend.hadoop.hbase.HBaseStorage('id:metrics',
>> > > '-caster=HBaseBinaryConverter -loadKey=true') AS (key:bytearray,
>> > > metrics:bytearray);
>> > > grunt> dump tsd_metrics2;
>> > >
>> > > [...]
>> > >
>> > > (Measurement_1,)
>> > > (Measurement_2,)
>> > > (Measurement_3,)
>> > > (proc.loadavg.1m,)
>> > > (proc.loadavg.5m,)
>> > >
>> > >
>> > > On Tue, Sep 6, 2011 at 4:18 PM, Bryce Poole <[email protected]> wrote:
>> > >
>> > >> Try adding -caster=HBaseBinaryConverter along with loadKey
>> > >>
>> > >> '-caster=HBaseBinaryConverter -loadKey=true'
>> > >>
>> > >> -bp
>> > >>
>> > >> On Tue, Sep 6, 2011 at 7:59 AM, shazz Ng <[email protected]> wrote:
>> > >>
>> > >> > Hello Norbert,
>> > >> >
>> > >> > Unfortunately, same result :
>> > >> > (Measurement_1,)
>> > >> > (Measurement_2,)
>> > >> > (Measurement_3,)
>> > >> > (proc.loadavg.1m,)
>> > >> > (proc.loadavg.5m,)
>> > >> >
>> > >> > the row key is well extracted (Measurement_1 for example) but the
>> > >> > value, the id I need for timestamp data querying, the bytearray,
>> > >> > is not :(
>> > >> >
>> > >> > shazz
>> > >> >
>> > >> > On Tue, Sep 6, 2011 at 3:37 PM, Norbert Burger <[email protected]>
>> > >> > wrote:
>> > >> >
>> > >> > > On Tue, Sep 6, 2011 at 7:58 AM, shazz Ng <[email protected]> wrote:
>> > >> > > > So from Pig when I want to retrieve only the metrics and their
>> > >> > > > value (= id for the data table) I do :
>> > >> > > > tsd_metrics     = LOAD 'hbase://tsdb-uid' using
>> > >> > > > org.apache.pig.backend.hadoop.hbase.HBaseStorage('id:metrics',
>> > >> > > > '-loadKey true') AS (metrics:bytearray);
>> > >> > > > dump tsd_metrics;
>> > >> > >
>> > >> > > Shazz -- if you use the "-loadKey" option to HBaseStorage, then
>> > >> > > your LOAD schema includes an extra column containing the row key,
>> > >> > > and you should add an equivalent column to your schema mapping
>> > >> > > (the AS clause).
>> > >> > > Try the following:
>> > >> > >
>> > >> > > tsd_metrics = LOAD 'hbase://tsdb-uid' using
>> > >> > > org.apache.pig.backend.hadoop.hbase.HBaseStorage('id:metrics',
>> > >> > > '-loadKey true') AS (key:bytearray, metrics:bytearray);
>> > >> > >
>> > >> > > Norbert
>> > >> > >
>> > >> >
>> > >>
>> > >
>> > >
>> >
>>
>
>
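
Dmitriy's caveat about byte-to-long conversion can be illustrated outside of Pig. The sketch below is an assumption-laden model, not code from Pig or HBase: it uses only the Java standard library, and the class and method names (LongEncodingDemo, binaryEncode, textEncode, and so on) are hypothetical. It assumes the value was written as an 8-byte big-endian long, which is the layout HBase's Bytes.toBytes(long) produces, and that a text-oriented caster would instead expect the decimal digits as UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class LongEncodingDemo {

    // Binary form: 8 bytes, big-endian (ByteBuffer's default byte order).
    static byte[] binaryEncode(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    static long binaryDecode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    // Text form: the decimal digits of the number, stored as UTF-8.
    static byte[] textEncode(long value) {
        return Long.toString(value).getBytes(StandardCharsets.UTF_8);
    }

    static long textDecode(byte[] bytes) {
        return Long.parseLong(new String(bytes, StandardCharsets.UTF_8));
    }

    // True when bytes written in binary form cannot be read back as text.
    static boolean textDecodeFailsOnBinary(long value) {
        try {
            textDecode(binaryEncode(value));
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long id = 42L;
        System.out.println("binary length: " + binaryEncode(id).length);
        System.out.println("text length:   " + textEncode(id).length);
        System.out.println("binary round trip: " + binaryDecode(binaryEncode(id)));
        System.out.println("text decode of binary bytes fails: "
                + textDecodeFailsOnBinary(id));
    }
}
```

The same stored bytes give a usable number only when the reader assumes the encoding the writer used, which is why the choice of caster matters when a bytearray column is later cast to long.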
