Great! :) Thanks for helping me out.
All the best,

Anze

On Tuesday 26 October 2010, Dmitriy Ryaboy wrote:
> I think that you might be able to get away with 20.2 if you don't use
> the filtering options.
>
> On Mon, Oct 25, 2010 at 3:39 PM, Anze <[email protected]> wrote:
> > Dmitriy, thanks for the answer!
> >
> > The problem with upgrading to HBase 0.20.6 is that cloudera doesn't ship
> > it yet and we would like to keep our install at "official" versions,
> > even if beta. Of course, since this is a development / testing cluster,
> > we could bend the rules if really necessary...
> >
> > I have written a small MR job (actually, just "M" job :) that exports the
> > tables to files (allowing me to use Pig 0.7), but that is a bit
> > cumbersome and slow.
> >
> > If I install the latest Pig (0.8), will it work at all with HBase 0.20.2?
> > In other words, are scan filters (which were fixed in 0.20.6) needed as
> > part of user-defined parameters or as part of Pig optimizations in
> > reading from HBase? Hope my question makes sense... :)
> >
> > Thanks again,
> >
> > Anze
> >
> > On Tuesday 26 October 2010, Dmitriy Ryaboy wrote:
> >> Anze, the reason we bumped up to 20.6 in the ticket was because HBase
> >> 20.2 had a bug in it. Ask the HBase folks, but I'd say you should
> >> upgrade.
> >> FWIW we upgraded to 20.6 from 20.2 a few months back and it's been
> >> working smoothly.
> >>
> >> The Elephant-Bird hbase loader for pig 0.6 does add row keys and most
> >> of the other features we added to the built-in loader for pig 0.8
> >> (notably, it does not do storage). But I don't recommend downgrading
> >> to pig 0.6, as 7 and especially 8 are great improvements to the
> >> software.
> >>
> >> -D
> >>
> >> On Mon, Oct 25, 2010 at 7:01 AM, Anze <[email protected]> wrote:
> >> > Hi all!
> >> >
> >> > I am struggling to find a working solution to load data from HBase
> >> > directly. I am using Cloudera CDH3b3 which comes with Pig 0.7. What
> >> > would be the easiest way to load data from HBase?
> >> > If it matters: we need the rows to be included, too.
> >> >
> >> > I have checked ElephantBird, but it seems to require Pig 0.6. I could
> >> > downgrade, but it seems... well... :)
> >> >
> >> > On the other hand, loading from HBase with rows is only added in Pig
> >> > 0.8: https://issues.apache.org/jira/browse/PIG-915
> >> > https://issues.apache.org/jira/browse/PIG-1205
> >> > But judging from the last issue Pig 0.8 requires HBase 0.20.6?
> >> >
> >> > I can install latest Pig from source if needed, but I'd rather leave
> >> > Hadoop and HBase at their versions (0.20.2 and 0.89.20100924
> >> > respectively).
> >> >
> >> > Should I write my own UDF? I'd appreciate some pointers.
> >> >
> >> > Thanks,
> >> >
> >> > Anze
