This question might be better diagnosed as an HBase issue, but since it's
ultimately a Pig script I want to run, I figure someone on this list could
help me out. I tried asking on the IRC channel, but I think it was in a lull.
My scenario: I want to use Pig to call an HBase store.
My installs: Apache Pig version 0.8.0-CDH3B4; HBase version
hbase-0.90.1-CDH3B4.
My sample script:
-----------
A = load 'passwd' using PigStorage(':');
rawDocs = LOAD 'hbase://daniel_product'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('base:testCol1');
vals = foreach rawDocs generate $0 as val;
dump vals;
store vals into 'daniel.out';
-----------
I consistently get the following failure:
Failed Jobs:
JobId Alias Feature Message Outputs
N/A rawDocs,vals MAP_ONLY Message:
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Timed out trying to locate root region
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:280)
Googling shows me similar issues:
http://search-hadoop.com/m/RPLkD1bmY4l&subj=Re+Cannot+connect+HBase+to+Pig
My current understanding is that somewhere in the interaction between Pig,
Hadoop, HBase, and ZooKeeper, there is a configuration file that needs to be
included in a classpath or a configuration directory somewhere. I have
tried various combinations of making Hadoop aware of HBase and vice versa.
I have tried ZooKeeper running on its own, and also managed by HBase.
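To make the "configuration file on a classpath" idea concrete, here is a minimal Python sketch of one way to check it. It rests on assumptions that are not confirmed anywhere above: that the relevant file is hbase-site.xml, that Pig picks up extra classpath entries from the PIG_CLASSPATH environment variable, and that the ZooKeeper quorum is set through the hbase.zookeeper.quorum property. The helper names are made up for illustration. It only inspects plain directory entries on the classpath (not jars or wildcards), so treat it as a rough diagnostic rather than a definitive test.

```python
import os
import xml.etree.ElementTree as ET


def read_hadoop_style_config(path):
    """Parse a Hadoop-style XML config file into a dict of name -> value."""
    root = ET.parse(path).getroot()
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def find_config_on_classpath(classpath, filename="hbase-site.xml"):
    """Return the path of filename in the first classpath directory that
    contains it, or None. Only directory entries are checked."""
    for entry in classpath.split(os.pathsep):
        candidate = os.path.join(entry, filename)
        if entry and os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Assumption: Pig reads additional classpath entries from PIG_CLASSPATH.
    classpath = os.environ.get("PIG_CLASSPATH", "")
    found = find_config_on_classpath(classpath)
    if found is None:
        print("hbase-site.xml not found in any PIG_CLASSPATH directory")
    else:
        props = read_hadoop_style_config(found)
        # Assumption: an unset quorum means the client falls back to localhost.
        quorum = props.get("hbase.zookeeper.quorum", "(not set)")
        print(found, "-> hbase.zookeeper.quorum =", quorum)
```

If the script reports that the file is missing, or that the quorum is not set, the client side of the job may be looking for ZooKeeper on localhost, which would be consistent with a timeout while locating the root region.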
Can someone explain the dependencies here? Any insight as to what I am
missing? What would your diagnosis of the above message be?
thanks,
daniel