Hi -

I've been having problems running Pig (0.9.2) scripts with
HBase (0.92.1) as both source and target.

I'm running a 3-node cluster. Node 1 has the NameNode, JobTracker,
ZooKeeper server and HBase Master; nodes 2 & 3 have the DataNodes,
TaskTrackers and HBase RegionServers.

Pig Script:

A = LOAD 'hbase://tb1' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf1:ssId_20120730 cf1:arId_20120730', '-limit 10') AS (ss_id:int, ar_id:int);
B = GROUP A BY ss_id;
C = FOREACH B GENERATE group, COUNT(A) AS hits;
STORE C INTO 'hbase://tbl2' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf1:20120730');

Env variables:

HADOOP_CLASSPATH=.:/usr/lib/hbase/hbase.jar:/etc/hbase/conf
PIG_CLASSPATH=/usr/lib/hbase/hbase.jar:/usr/lib/hbase
HBASE_HOME=/usr/lib/hbase
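In case it matters: my understanding (which may well be the source of my problem, so please correct me) is that the HBase conf directory also needs to be on PIG_CLASSPATH so that hbase-site.xml, and with it hbase.zookeeper.quorum, is visible to Pig. In my setup above, /etc/hbase/conf is on HADOOP_CLASSPATH but not on PIG_CLASSPATH. Something like:

```shell
# My guess at what the setup should look like if Pig is meant to pick up
# hbase-site.xml (and thus hbase.zookeeper.quorum) from /etc/hbase/conf.
# The only change from my current setup is appending /etc/hbase/conf
# to PIG_CLASSPATH.
export HADOOP_CLASSPATH=.:/usr/lib/hbase/hbase.jar:/etc/hbase/conf
export PIG_CLASSPATH=/usr/lib/hbase/hbase.jar:/usr/lib/hbase:/etc/hbase/conf
export HBASE_HOME=/usr/lib/hbase
```
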


The Pig script loads data from HBase without any problems ('dump A'
works fine). However, when storing into HBase, the map tasks look for
a local ZooKeeper instance on port 2181; since no ZooKeeper runs on
those nodes, the connection fails and the map tasks are killed. I
assumed this is because Pig does not pick up the
hbase.zookeeper.quorum setting from the HBase config, so I tried
setting the value manually in the Grunt shell (set
hbase.zookeeper.quorum 'node1'). But the script still looks for a
local ZooKeeper instance.
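For reference, this is exactly what I ran in the Grunt shell before re-running the STORE (the second set line is just a guess on my part, in case the client port also needs to be explicit; it made no difference):

```pig
grunt> set hbase.zookeeper.quorum 'node1'
-- guess: also pin the client port explicitly
grunt> set hbase.zookeeper.property.clientPort '2181'
```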


2012-08-06 01:44:02,367 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server localhost.localdomain/127.0.0.1:2181
2012-08-06 01:44:02,367 WARN
org.apache.zookeeper.client.ZooKeeperSaslClient: SecurityException:
java.lang.SecurityException: Unable to locate a login configuration
occurred when trying to find JAAS configuration.
2012-08-06 01:44:02,367 INFO
org.apache.zookeeper.client.ZooKeeperSaslClient: Client will not
SASL-authenticate because the default JAAS configuration section
'Client' could not be found. If you are not using SASL, you may ignore
this. On the other hand, if you expected SASL to work, please fix your
JAAS configuration.
2012-08-06 01:44:02,368 WARN org.apache.zookeeper.ClientCnxn: Session
0x0 for server null, unexpected error, closing socket connection and
attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1039)
2012-08-06 01:44:02,470 WARN
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly
transient ZooKeeper exception:
org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /hbase/master
2012-08-06 01:44:02,470 ERROR
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper
exists failed after 3 retries
2012-08-06 01:44:02,470 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil:
hconnection Unable to set watcher on znode /hbase/master
org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /hbase/master
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
        at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154)
        at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:226)
        at 
org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:82)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:580)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:569)
        at 
org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:186)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:194)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:171)
        at 
org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:195)
        at 
org.apache.pig.backend.hadoop.hbase.HBaseStorage.getOutputFormat(HBaseStorage.java:521)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:91)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.<init>(PigOutputCommitter.java:70)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:279)
        at org.apache.hadoop.mapred.Task.initialize(Task.java:514)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:308)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)

I also tried the suggestion in this thread -
http://mail-archives.apache.org/mod_mbox/pig-user/201205.mbox/%[email protected]%3E
- but it doesn't work either. In that thread the LOAD operation was
failing; in my case LOAD works fine (when I dump it in the Grunt
shell) but STORE doesn't.


Thanks,
Hari Prasanna
