Hi all! I am struggling to find a working solution to load data from HBase directly. I am using Cloudera CDH3b3, which comes with Pig 0.7. What would be the easiest way to load data from HBase? If it matters: we need the row keys to be included, too.
I have checked ElephantBird, but it seems to require Pig 0.6. I could downgrade, but it seems... well... :)

On the other hand, loading from HBase with row keys was only added in Pig 0.8:
https://issues.apache.org/jira/browse/PIG-915
https://issues.apache.org/jira/browse/PIG-1205

But judging from the last issue, Pig 0.8 requires HBase 0.20.6? I can install the latest Pig from source if needed, but I'd rather leave Hadoop and HBase at their current versions (0.20.2 and 0.89.20100924 respectively).

Should I write my own UDF? I'd appreciate some pointers.

Thanks,
Anze
