That's interesting... we should be able to return a byte array properly (though this is a bit risky for people who try to later turn this bytearray into a long using Pig, since the conversion from bytes to longs in Pig is different than in HBase).
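To make the risk concrete, here is a minimal sketch of the two conversions. It assumes that the HBase side stores a long as 8 big-endian bytes (as `Bytes.toBytes(long)` does) and that the Pig side parses the bytes as the UTF-8 text of a number (as its default caster does). The function names are hypothetical and exist only for this illustration.

```python
import struct


def hbase_style_long_to_bytes(value: int) -> bytes:
    """Encode a long as 8 big-endian bytes (assumed HBase-style layout)."""
    return struct.pack(">q", value)


def hbase_style_bytes_to_long(raw: bytes) -> int:
    """Decode 8 big-endian bytes back into a long."""
    return struct.unpack(">q", raw)[0]


def text_style_bytes_to_long(raw: bytes):
    """Parse bytes as the UTF-8 text of a number (assumed Pig-style default).

    Returns None when the bytes are not a textual number.
    """
    try:
        return int(raw.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        return None


binary_value = hbase_style_long_to_bytes(42)  # 8 bytes, mostly zero bytes
text_value = b"42"                            # 2 bytes, the characters '4' and '2'

print(hbase_style_bytes_to_long(binary_value))  # binary decoder on binary data
print(text_style_bytes_to_long(text_value))     # text decoder on text data
print(text_style_bytes_to_long(binary_value))   # text decoder on binary data
```

Under these assumptions, a value written in the binary layout cannot be read back by a text-based parse, which is why a bytearray column later cast to long may come back empty unless the matching caster is used.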
Could you guys open a jira, preferably with an easy way to reproduce the error?

D

On Tue, Sep 6, 2011 at 10:03 AM, Bryce Poole <[email protected]> wrote:

> My load looks like this
>
> .... AS (key:chararray, value:long);
>
> and I'm able to return data.
>
> I changed the load to
>
> .... AS (key:chararray, value:bytearray);
>
> and had results that match yours.
>
> Try changing the value to long or int type and see if that helps.
>
> -bp
>
>
> On Tue, Sep 6, 2011 at 9:00 AM, shazz Ng <[email protected]> wrote:
>
> > the 'funny' thing is that if I look at the other CF name (from an byte id
> > gives the name, reverse way) :
> >
> > grunt> tsd_metrics2 = LOAD 'hbase://tsdb-uid' using
> > org.apache.pig.backend.hadoop.hbase.HBaseStorage('name:metrics',
> > '-caster=HBaseBinaryConverter -loadKey=true') AS (key:bytearray,
> > metrics:bytearray);
> >
> > I've got the same issue:
> > (,proc.loadavg.1m)
> > (,proc.loadavg.5m)
> > (,Measurement_1)
> > (,Measurement_2)
> > (,Measurement_3)
> >
> > So there is a real issue with byte array....
> >
> > On Tue, Sep 6, 2011 at 4:30 PM, shazz Ng <[email protected]> wrote:
> >
> > > Hello Bryce,
> > >
> > > not better... :-(
> > >
> > > grunt> tsd_metrics2 = LOAD 'hbase://tsdb-uid' using
> > > org.apache.pig.backend.hadoop.hbase.HBaseStorage('id:metrics',
> > > '-caster=HBaseBinaryConverter -loadKey=true') AS (key:bytearray,
> > > metrics:bytearray);
> > > grunt> dump tsd_metrics2;
> > >
> > > [...]
> > >
> > > (Measurement_1,)
> > > (Measurement_2,)
> > > (Measurement_3,)
> > > (proc.loadavg.1m,)
> > > (proc.loadavg.5m,)
> > >
> > >
> > > On Tue, Sep 6, 2011 at 4:18 PM, Bryce Poole <[email protected]> wrote:
> > >
> > >> Try adding -caster=HBaseBinaryConverter along with loadKey
> > >>
> > >> '-caster=HBaseBinaryConverter -loadKey=true'
> > >>
> > >> -bp
> > >>
> > >> On Tue, Sep 6, 2011 at 7:59 AM, shazz Ng <[email protected]> wrote:
> > >>
> > >> > Hello Norbert,
> > >> >
> > >> > Unfortunately, same result :
> > >> > (Measurement_1,)
> > >> > (Measurement_2,)
> > >> > (Measurement_3,)
> > >> > (proc.loadavg.1m,)
> > >> > (proc.loadavg.5m,)
> > >> >
> > >> > the row key is well extracted (Measurement_1 for example) but the value,
> > >> > the id I need for timestamp data querying, the bytearray, is not :(
> > >> >
> > >> > shazz
> > >> >
> > >> > On Tue, Sep 6, 2011 at 3:37 PM, Norbert Burger <[email protected]> wrote:
> > >> >
> > >> > > On Tue, Sep 6, 2011 at 7:58 AM, shazz Ng <[email protected]> wrote:
> > >> > > > So from Pig when I want to retrieve only the metrics and their value (= id
> > >> > > > for the data table) I do :
> > >> > > > tsd_metrics = LOAD 'hbase://tsdb-uid' using
> > >> > > > org.apache.pig.backend.hadoop.hbase.HBaseStorage('id:metrics',
> > >> > > > '-loadKey true') AS (metrics:bytearray);
> > >> > > > dump tsd_metrics;
> > >> > >
> > >> > > Shazz -- if you use the "-loadKey" option to HbaseStorage, then your
> > >> > > LOAD schema includes an extra column containing the row key, and you
> > >> > > should add equivalent to your schema column mapping (the AS clause).
> > >> > > Try the following:
> > >> > >
> > >> > > tsd_metrics = LOAD 'hbase://tsdb-uid' using
> > >> > > org.apache.pig.backend.hadoop.hbase.HBaseStorage('id:metrics',
> > >> > > '-loadKey true') AS (key:bytearray, metrics:bytearray);
> > >> > >
> > >> > > Norbert
