OK, so we have solved the problem, and yes, it is a config/classpath problem.

Our solution is to put a symlink to zoo.cfg into the HADOOP_INSTALL/conf
directory. Maybe this will help someone else in future...
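
For reference, the fix is a one-liner along these lines (the zoo.cfg source
path is an assumption based on our /opt/zookeeper layout; adjust for yours):

    # Link the live zoo.cfg into the Hadoop conf directory so the quorum
    # settings travel with the Hadoop config.
    ln -s /opt/zookeeper/zookeeper-3.4.3/conf/zoo.cfg $HADOOP_INSTALL/conf/zoo.cfg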

On our installation (Hadoop 1.0.3-SNAPSHOT / HBase 0.95-SNAPSHOT / Pig
0.11-SNAPSHOT), our code using PigServer does not work with zoo.cfg just
being on the Pig, HBase, and Hadoop CLASSPATHs. The tasktrackers do not get
the right IP address for ZooKeeper and hang with connection refused errors.
The symlink fixes it.

However, code that does not use PigServer but goes directly to the HBase
client DOES work WITHOUT the symlink.

Our understanding of the Pig/HBase/Hadoop stack's CLASSPATH/config universe
is not perfect, but it seems that the PigServer map-reduce launcher does not
pass through the path to zoo.cfg?
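
(For anyone who wants to dig: Hadoop's classpath subcommand prints what the
launcher actually builds, so something like the line below shows whether any
conf directory that could hold zoo.cfg is on it. The grep filters are just
our suggestion:

    # Print the classpath one entry per line; look for conf dirs or
    # zookeeper entries that might carry zoo.cfg.
    hadoop classpath | tr ':' '\n' | grep -i -e conf -e zookeeper

)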

Royston


-----Original Message-----
From: Norbert Burger [mailto:[email protected]] 
Sent: 02 May 2012 12:54
To: [email protected]
Cc: [email protected]
Subject: Re: HBaseStorage not working

This is a config/classpath issue, no?  At the lowest level, Hadoop MR tasks
don't pick up settings from the HBase conf directory unless they're
explicitly added to the classpath, usually via hadoop/conf/hadoop-env.sh:

http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#classpath
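
For example, something along these lines in hadoop-env.sh (HBASE_HOME and
the jar name here are guesses based on the register statements in your
script below; adjust for your layout):

    # Sketch: put the HBase conf dir (hbase-site.xml) and client jar on the
    # classpath that Hadoop hands to its MR tasks.
    export HBASE_HOME=/opt/hbase/hbase-trunk
    export HADOOP_CLASSPATH=$HBASE_HOME/conf:$HBASE_HOME/hbase-0.95-SNAPSHOT.jar:$HADOOP_CLASSPATH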

Perhaps the classpath that's being added to your Java jobs is slightly
different?

Norbert

On Wed, May 2, 2012 at 6:48 AM, Royston Sellman <
[email protected]> wrote:

> Hi,
>
> We are still experiencing 40-60 minutes of task failure before our
> HBaseStorage jobs run, but we think we've narrowed the problem down to
> a specific ZooKeeper issue.
>
> The HBaseStorage map task only works when it lands on a machine that
> is actually running a ZooKeeper server as part of the quorum. It
> typically attempts from several different nodes in the cluster,
> failing repeatedly before it hits on a ZooKeeper node.
>
> Logs show the failing task attempts are trying to connect to the 
> localhost machine on port 2181 to make a ZooKeeper connection (as part 
> of the Load/HBaseStorage map task):
>
> ...
> > 2012-04-24 11:57:27,441 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server /127.0.0.1:2181
> ...
> > java.net.ConnectException: Connection refused
> ...
>
> This explains why the job succeeds eventually: we have a ZooKeeper
> quorum server running on one of our worker nodes, but not on the other
> three. The job therefore fails repeatedly until it is redistributed onto
> the node with the ZK server, at which point it succeeds immediately.
>
> We therefore suspect the issue is in our ZK configuration. Our
> hbase-site.xml defines the ZooKeeper quorum as follows:
>
>    <property>
>      <name>hbase.zookeeper.quorum</name>
>      <value>namenode,jobtracker,slave0</value>
>    </property>
>
> Therefore, we would expect the tasks to connect to one of those hosts
> when attempting a ZooKeeper connection; however, they appear to be
> attempting to connect to "localhost" (which is the default). It is as
> if the HBase configuration settings here are not used.
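>
> (For completeness, a quick way to confirm the quorum hosts are answering
> on 2181 is ZooKeeper's four-letter 'ruok' command, assuming nc is
> installed; a live server replies 'imok':
>
>    # assumes nc is available; each live ZK server should answer 'imok'
>    for h in namenode jobtracker slave0; do echo ruok | nc $h 2181; echo; done
>
> In our case the failing tasks never even try those hosts; they go
> straight to localhost.)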
>
> Does anyone have any suggestions as to what might be the cause of this 
> behaviour?
>
> Sending this to both lists although it is only Pig HBaseStorage jobs
> that suffer this problem on our cluster. HBase Java client jobs work
> normally.
>
> Thanks,
> Royston
>
> -----Original Message-----
> From: Subir S [mailto:[email protected]]
> Sent: 24 April 2012 13:29
> To: [email protected]; [email protected]
> Subject: Re: HBaseStorage not working
>
> Looping HBase group.
>
> On Tue, Apr 24, 2012 at 5:18 PM, Royston Sellman < 
> [email protected]> wrote:
>
> > We still haven't cracked this, but a bit more info (HBase 0.95; Pig 0.11):
> >
> > The script below runs fine in a few seconds using Pig in local mode 
> > but with Pig in MR mode it sometimes works rapidly but usually takes
> > 40 minutes to an hour.
> >
> > --hbaseuploadtest.pig
> > register /opt/hbase/hbase-trunk/lib/protobuf-java-2.4.0a.jar
> > register /opt/hbase/hbase-trunk/lib/guava-r09.jar
> > register /opt/hbase/hbase-trunk/hbase-0.95-SNAPSHOT.jar
> > register /opt/zookeeper/zookeeper-3.4.3/zookeeper-3.4.3.jar
> > raw_data = LOAD '/data/sse.tbl1.HEADERLESS.csv' USING PigStorage( ',' )
> >     AS (mid : chararray, hid : chararray, mf : chararray, mt : chararray,
> >         mind : chararray, mimd : chararray, mst : chararray );
> > dump raw_data;
> > STORE raw_data INTO 'hbase://hbaseuploadtest' USING
> >     org.apache.pig.backend.hadoop.hbase.HBaseStorage(
> >         'info:hid info:mf info:mt info:mind info:mimd info:mst');
> >
> > i.e.
> > [hadoop1@namenode hadoop-1.0.2]$ pig -x local ../pig-scripts/hbaseuploadtest.pig
> > WORKS EVERY TIME!!
> > But
> > [hadoop1@namenode hadoop-1.0.2]$ pig -x mapreduce ../pig-scripts/hbaseuploadtest.pig
> > sometimes (but rarely) runs in under a minute, and often takes more than
> > 40 minutes to get to 50% but then completes to 100% in seconds. The
> > dataset is very small.
> >
> > Note that the dump of raw_data works in both cases. However, the
> > STORE command causes the MR job to stall, and the job setup task
> > shows the following errors:
> > Task attempt_201204240854_0006_m_000002_0 failed to report status for 602 seconds. Killing!
> > Task attempt_201204240854_0006_m_000002_1 failed to report status for 601 seconds. Killing!
> >
> > And the task log shows the following stream of errors:
> >
> > 2012-04-24 11:57:27,427 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection 0x5567d7fb
> > 2012-04-24 11:57:27,441 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server /127.0.0.1:2181
> > 2012-04-24 11:57:27,443 WARN org.apache.zookeeper.client.ZooKeeperSaslClient: SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
> > 2012-04-24 11:57:27,443 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
> > 2012-04-24 11:57:27,444 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
> > java.net.ConnectException: Connection refused
> >        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> >        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> >        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
> >        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035)
> > 2012-04-24 11:57:27,445 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is 6846@slave2
> > 2012-04-24 11:57:27,551 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server /127.0.0.1:2181
> > 2012-04-24 11:57:27,552 WARN org.apache.zookeeper.client.ZooKeeperSaslClient: SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
> > 2012-04-24 11:57:27,552 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
> > 2012-04-24 11:57:27,552 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
> > java.net.ConnectException: Connection refused
> >        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> >        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> >        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
> >        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035)
> > 2012-04-24 11:57:27,553 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
> > 2012-04-24 11:57:27,553 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 2000ms before retry #1...
> > 2012-04-24 11:57:28,652 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181
> > 2012-04-24 11:57:28,653 WARN org.apache.zookeeper.client.ZooKeeperSaslClient: SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
> > 2012-04-24 11:57:28,653 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
> > 2012-04-24 11:57:28,653 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
> > java.net.ConnectException: Connection refused etc etc
> >
> > Any ideas? Anyone else out there successfully running Pig 0.11
> > HBaseStorage() against HBase 0.95?
> >
> > Thanks,
> > Royston
> >
> >
> >
> > -----Original Message-----
> > From: Dmitriy Ryaboy [mailto:[email protected]]
> > Sent: 20 April 2012 00:03
> > To: [email protected]
> > Subject: Re: HBaseStorage not working
> >
> > Nothing significant changed in Pig trunk, so I am guessing HBase 
> > changed something; you are more likely to get help from them (they 
> > should at least be able to point at APIs that changed and are likely 
> > to cause this sort of thing).
> >
> > You might also want to check if any of the started MR jobs have 
> > anything interesting in their task logs.
> >
> > D
> >
> > On Thu, Apr 19, 2012 at 1:41 PM, Royston Sellman 
> > <[email protected]> wrote:
> > > Does HBaseStorage work with HBase 0.95?
> > >
> > > This code was working with HBase 0.92 and Pig 0.9 but fails on HBase 0.95
> > > and Pig 0.11 (built from source):
> > >
> > > register /opt/hbase/hbase-trunk/hbase-0.95-SNAPSHOT.jar
> > > register /opt/zookeeper/zookeeper-3.4.3/zookeeper-3.4.3.jar
> > >
> > > tbl1 = LOAD 'input/sse.tbl1.HEADERLESS.csv' USING PigStorage( ',' ) AS (
> > >      ID:chararray,
> > >      hp:chararray,
> > >      pf:chararray,
> > >      gz:chararray,
> > >      hid:chararray,
> > >      hst:chararray,
> > >      mgz:chararray,
> > >      gg:chararray,
> > >      epc:chararray );
> > >
> > > STORE tbl1 INTO 'hbase://sse.tbl1'
> > > USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('edrp:hp edrp:pf edrp:gz edrp:hid edrp:hst edrp:mgz edrp:gg edrp:epc');
> > >
> > >
> > >
> > > The job output (using either Grunt or PigServer makes no difference)
> > > shows the family:descriptor filters being added by HBaseStorage, then
> > > the MR job starts up and (after a long pause) reports:
> > >
> > > ------------
> > > Input(s):
> > > Failed to read data from "hdfs://namenode:8020/user/hadoop1/input/sse.tbl1.HEADERLESS.csv"
> > >
> > > Output(s):
> > > Failed to produce result in "hbase://sse.tbl1"
> > >
> > > INFO mapReduceLayer.MapReduceLauncher: Failed!
> > > INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:hp
> > > INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:pf
> > > INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:gz
> > > INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:hid
> > > INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:hst
> > > INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:mgz
> > > INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:gg
> > > INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:epc
> > > ------------
> > >
> > >
> > >
> > > The "Failed to read" is misleading I think because dump tbl1; in 
> > > place of the store works fine.
> > >
> > >
> > >
> > > I get nothing in the HBase logs and nothing in the Pig log.
> > >
> > >
> > >
> > > HBase works fine from the shell and can read and write to the table.
> > > Pig works fine in and out of HDFS on CSVs.
> > >
> > >
> > >
> > > Any ideas?
> > >
> > >
> > >
> > > Royston
> > >
> > >
> > >
> >
> >
>
>
