Hi all!

I am struggling to find a working way to load data directly from HBase. I 
am using Cloudera CDH3b3, which comes with Pig 0.7. What would be the easiest 
way to load data from HBase? 
If it matters: we need the row keys to be included, too.

I have checked ElephantBird, but it seems to require Pig 0.6. I could 
downgrade, but it seems... well... :)

On the other hand, loading row keys from HBase was only added in Pig 0.8:
https://issues.apache.org/jira/browse/PIG-915
https://issues.apache.org/jira/browse/PIG-1205
But judging from the second issue, does Pig 0.8 require HBase 0.20.6? 

I can install the latest Pig from source if needed, but I'd rather leave Hadoop 
and HBase at their current versions (0.20.2 and 0.89.20100924 respectively).

Should I write my own UDF? I'd appreciate some pointers.

Thanks,

Anze
