Interesting. My exact stack trace is:

org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Timed out trying to locate root region
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:280)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:944)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:961)
    at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:983)
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:625)
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:601)
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:670)
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:630)
So, I go to
https://repository.cloudera.com/content/repositories/releases/org/apache/hbase/hbase/0.90.1-CDH3B4/hbase-0.90.1-CDH3B4-sources.jar
to look at HConnectionManager and see that there's no locateRootRegion()
method there.
So it looks like, while I am running HBase 0.90, the Pig libs in
/usr/lib/pig/lib show:
hbase-0.20.6.jar  zookeeper-hbase-1329.jar
I am not quite sure about the Cloudera versus Apache versioning schemes
going on here.
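For what it's worth, here is the mismatch written out as a quick shell sketch (the jar names are the ones from my machines; the paths in the comments are assumptions, not guaranteed locations):

```shell
# Compare the HBase version Pig bundles against what the cluster runs.
pig_hbase_jar="hbase-0.20.6.jar"            # seen in /usr/lib/pig/lib
cluster_hbase_jar="hbase-0.90.1-CDH3B4.jar" # assumed name under /usr/lib/hbase
# Strip the "hbase-" prefix and ".jar" suffix to expose the versions.
pig_ver="${pig_hbase_jar#hbase-}";         pig_ver="${pig_ver%.jar}"
cluster_ver="${cluster_hbase_jar#hbase-}"; cluster_ver="${cluster_ver%.jar}"
echo "Pig bundles HBase ${pig_ver}; cluster runs HBase ${cluster_ver}"
```

If those disagree, the 0.20-era client in Pig's lib dir would explain the missing locateRootRegion() method; getting the cluster's 0.90 jar onto Pig's classpath instead seems like the thing to try.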
On Tue, Apr 12, 2011 at 6:35 PM, Bill Graham <[email protected]> wrote:
> Can you include more of your stack trace? I'm not sure of the
> specifics of what is stored where in ZK, but it seems you're timing
> out just trying to connect to ZK. Are you seeing any exceptions on the
> TT nodes, or just on the client?
>
>
> On Tue, Apr 12, 2011 at 3:24 PM, Daniel Eklund <[email protected]> wrote:
> > Bill, I have done all that both you and Jameson have suggested and still
> > get the same error.
> >
> > I can telnet into the zookeeper. I have also used zkCli.sh and can
> > look at /hbase/rs to see the regionservers.
> > Should I be able to see anything at /hbase/root-region-server?
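On a 0.90 cluster it should hold the ROOT region server's address. A sketch of the check follows; the znode names are HBase 0.90 defaults and localhost:2181 is an assumed quorum address, so with no live quorum here the zkCli.sh commands are only printed, not run:

```shell
# Print the ZooKeeper CLI commands one would run against a live quorum.
# Znode names are HBase 0.90 defaults; host/port is an assumption.
zk="localhost:2181"
for znode in /hbase/rs /hbase/root-region-server /hbase/master; do
  echo "zkCli.sh -server ${zk} get ${znode}"
done
```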
> >
> > thanks,
> > daniel
> >
> >
> > On Tue, Apr 12, 2011 at 11:58 AM, Bill Graham <[email protected]> wrote:
> >
> >> Yes, Pig's HBaseStorage uses the HBase client to read/write directly
> >> to HBase from within an MR job, but chains to other Pig-generated MR
> >> jobs as needed to transform.
> >>
> >> Daniel, check that you have defined HBASE_CONF_DIR properly, or that
> >> you have hbase-site.xml in your classpath. Then try to telnet to the
> >> defined zookeeper host from the machine where the exception is being
> >> generated. FYI, there is some communication from Pig to HBase/ZK from
> >> the node that the client runs on before the MR jobs start on the
> >> cluster.
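Concretely, something along these lines before launching pig (the /etc/hbase/conf path is an assumption; point it at wherever hbase-site.xml actually lives):

```shell
# Export the HBase client config location so Pig picks up hbase-site.xml.
# /etc/hbase/conf is an assumed path; adjust to the actual install.
export HBASE_CONF_DIR="/etc/hbase/conf"
# Prepend the conf dir to Pig's classpath so the client sees the ZK quorum.
export PIG_CLASSPATH="${HBASE_CONF_DIR}:${PIG_CLASSPATH:-}"
echo "PIG_CLASSPATH=${PIG_CLASSPATH}"
```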
> >>
> >>
> >> On Tue, Apr 12, 2011 at 8:40 AM, Jameson Lopp <[email protected]> wrote:
> >> > I'm by no means an expert, but I think it's the latter. My rudimentary
> >> > understanding is that pig uses HBaseStorage to load the data from hbase
> >> > and passes the input splits along to hadoop/MR. Feel free to correct me
> >> > if I'm wrong.
> >> > --
> >> > Jameson Lopp
> >> > Software Engineer
> >> > Bronto Software, Inc.
> >> >
> >> > On 04/12/2011 10:50 AM, Daniel Eklund wrote:
> >> >>
> >> >> As a follow-up to my own question: which of the following accurately
> >> >> describes the component call-stack of the pig script I included in my
> >> >> post?
> >> >>
> >> >> pig -> mapreduce/hadoop -> Hbase
> >> >> pig -> Hbase -> mapreduce/hadoop
> >> >>
> >> >>
> >> >>
> >> >> On Tue, Apr 12, 2011 at 9:53 AM, Daniel Eklund <[email protected]> wrote:
> >> >>
> >> >>> This question might be better diagnosed as an HBase issue, but since
> >> >>> it's ultimately a Pig script I want to use, I figure someone on this
> >> >>> group could help me out. I tried asking the IRC channel, but I think
> >> >>> it was in a lull.
> >> >>>
> >> >>> My scenario: I want to use Pig to call an HBase store.
> >> >>> My installs: Apache Pig version 0.8.0-CDH3B4; HBase version
> >> >>> hbase-0.90.1-CDH3B4.
> >> >>> My sample script:
> >> >>>
> >> >>> -----------
> >> >>> A = load 'passwd' using PigStorage(':');
> >> >>> rawDocs = LOAD 'hbase://daniel_product'
> >> >>>     USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('base:testCol1');
> >> >>> vals = foreach rawDocs generate $0 as val;
> >> >>> dump vals;
> >> >>> store vals into 'daniel.out';
> >> >>> -----------
> >> >>>
> >> >>> I am consistently getting:
> >> >>> Failed Jobs:
> >> >>> JobId  Alias         Feature   Message  Outputs
> >> >>> N/A    rawDocs,vals  MAP_ONLY  Message:
> >> >>> org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Timed out trying to locate root region
> >> >>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:280)
> >> >>>
> >> >>>
> >> >>> Googling shows me similar issues:
> >> >>> http://search-hadoop.com/m/RPLkD1bmY4l&subj=Re+Cannot+connect+HBase+to+Pig
> >> >>>
> >> >>> My current understanding is that somewhere in the interaction between
> >> >>> Pig, Hadoop, HBase, and ZooKeeper, there is a configuration file that
> >> >>> needs to be included in a classpath or a configuration directory
> >> >>> somewhere. I have tried various combinations of making Hadoop aware
> >> >>> of HBase and vice-versa. I have tried ZK running on its own, and also
> >> >>> managed by HBase.
> >> >>>
> >> >>> Can someone explain the dependencies here? Any insight as to what I
> >> >>> am missing? What would your diagnosis of the above message be?
> >> >>>
> >> >>> thanks,
> >> >>> daniel