I'm by no means an expert, but I think it's the latter. My rudimentary understanding is that Pig
uses HBaseStorage to load the data from HBase and passes the input splits along to Hadoop/MR. Feel
free to correct me if I'm wrong.
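If that's right, the root-region lookup happens on the machine where you launch Pig (your trace
fails in PigInputFormat.getSplits, which runs at job submission, before any map task starts), so
the HBase client on that machine has to be able to find ZooKeeper. As far as I know, a client
without an hbase-site.xml on its classpath falls back to a quorum of localhost on port 2181.
Below is a rough Python sketch, not anything that ships with Pig or HBase, for checking which
quorum a given hbase-site.xml points at and whether those hosts are reachable from the client
box. The function names are made up for illustration; the two property names are the standard
HBase ones, and the defaults are my assumption.

```python
import socket
import xml.etree.ElementTree as ET

# Assumed client-side defaults when hbase-site.xml does not set these properties.
DEFAULT_QUORUM = "localhost"
DEFAULT_CLIENT_PORT = 2181


def read_zk_settings(hbase_site_xml):
    """Parse the text of an hbase-site.xml and return (quorum_hosts, client_port)."""
    root = ET.fromstring(hbase_site_xml)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    quorum = props.get("hbase.zookeeper.quorum", DEFAULT_QUORUM)
    port = int(props.get("hbase.zookeeper.property.clientPort", DEFAULT_CLIENT_PORT))
    # The quorum is a comma-separated list of host names.
    hosts = [h.strip() for h in quorum.split(",") if h.strip()]
    return hosts, port


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_quorum(hbase_site_xml, timeout=3.0):
    """Map each configured ZooKeeper host to whether it is reachable from here."""
    hosts, port = read_zk_settings(hbase_site_xml)
    return {host: can_connect(host, port, timeout) for host in hosts}
```

Calling check_quorum(open('/path/to/hbase-site.xml').read()) (the path is hypothetical) on the
box where you run pig gives a dict of host to True/False. If every host is reachable and Pig
still times out, my guess would be that the HBase conf directory isn't on Pig's classpath, but
that is only a guess.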
--
Jameson Lopp
Software Engineer
Bronto Software, Inc.
On 04/12/2011 10:50 AM, Daniel Eklund wrote:
As a follow-up to my own question: which of the following accurately
describes the component call stack of the Pig script I included in my post?
pig -> mapreduce/hadoop -> HBase
pig -> HBase -> mapreduce/hadoop
On Tue, Apr 12, 2011 at 9:53 AM, Daniel Eklund<[email protected]> wrote:
This question might be better diagnosed as an HBase issue, but since it's
ultimately a Pig script I want to use, I figure someone on this group could
help me out. I tried asking on the IRC channel, but I think it was in a lull.
My scenario: I want to use Pig to read from an HBase table.
My installs: Apache Pig version 0.8.0-CDH3B4; HBase version
hbase-0.90.1-CDH3B4.
My sample script:
-----------
A = load 'passwd' using PigStorage(':');
rawDocs = LOAD 'hbase://daniel_product'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('base:testCol1');
vals = foreach rawDocs generate $0 as val;
dump vals;
store vals into 'daniel.out';
-----------
I am consistently getting the following failure:
Failed Jobs:
JobId Alias Feature Message Outputs
N/A rawDocs,vals MAP_ONLY Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Timed out trying to locate root region
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:280)
Googling shows me similar issues:
http://search-hadoop.com/m/RPLkD1bmY4l&subj=Re+Cannot+connect+HBase+to+Pig
My current understanding is that somewhere in the interaction between Pig,
Hadoop, HBase, and ZooKeeper, there is a configuration file that needs to be
included in a classpath or a configuration directory somewhere. I have
tried various combinations of making Hadoop aware of HBase and vice versa.
I have tried ZK running on its own, and also managed by HBase.
Can someone explain the dependencies here? Any insight as to what I am
missing? What would your diagnosis of the above message be?
thanks,
daniel