Hi Hari,

This is a known issue with Pig 0.9.2. Here is how you can fix it:

1) Rebuild your pig-withouthadoop.jar with the patch for PIG-2115.
2) Set "hbase.zookeeper.quorum" to "node1:2181" in /usr/lib/pig/conf/pig.properties.
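For reference, the entry in /usr/lib/pig/conf/pig.properties would look something like the following (plain Java properties syntax; this assumes node1 is resolvable from the task-tracker nodes):

    # ZooKeeper quorum used by HBaseStorage (the ZooKeeper server running on node1)
    hbase.zookeeper.quorum=node1:2181

With the patched jar, this setting should reach the backend map tasks, so they connect to node1 instead of falling back to localhost:2181.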
Please let me know if you're running CDH4. I can send you the
pig-withouthadoop.jar that includes PIG-2115.

Thanks!
Cheolsoo

On Mon, Aug 6, 2012 at 12:20 AM, Hari Prasanna <[email protected]> wrote:
> Hi -
>
> I've been having problems running Pig (0.9.2) scripts with
> HBase (0.92.1) as source and target.
>
> I'm running a 3-node cluster. Node 1 has the NameNode, JobTracker,
> ZooKeeper server, and HBase Master. Nodes 2 and 3 have the DataNode,
> TaskTracker, and HBase region servers.
>
> Pig script:
>
> A = LOAD 'hbase://tb1' USING
>     org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf1:ssId_20120730 cf1:arId_20120730', '-limit 10')
>     AS (ss_id:int, ar_id:int);
> B = GROUP A BY ss_id;
> C = FOREACH B GENERATE group, COUNT(A) as hits;
> STORE C INTO 'hbase://tbl2' USING
>     org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf1:20120730');
>
> Environment variables:
>
> HADOOP_CLASSPATH=.:/usr/lib/hbase/hbase.jar:/etc/hbase/conf
> PIG_CLASSPATH=/usr/lib/hbase/hbase.jar:/usr/lib/hbase
> HBASE_HOME=/usr/lib/hbase
>
> The Pig script is able to load the data from HBase (I tried 'dump A',
> which worked fine). However, when trying to store into HBase, it looks
> for a local ZooKeeper instance at port 2181, and since there are no
> ZooKeeper instances on the corresponding nodes, the map tasks are being
> killed. I assumed this is because Pig does not pick up the
> hbase.zookeeper.quorum setting from the HBase config, so I tried setting
> the value manually in the grunt shell (set hbase.zookeeper.quorum 'node1').
> But the script still looks for a local ZooKeeper instance.
>
> 2012-08-06 01:44:02,367 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server localhost.localdomain/127.0.0.1:2181
> 2012-08-06 01:44:02,367 WARN org.apache.zookeeper.client.ZooKeeperSaslClient:
> SecurityException: java.lang.SecurityException: Unable to locate a login
> configuration occurred when trying to find JAAS configuration.
> 2012-08-06 01:44:02,367 INFO org.apache.zookeeper.client.ZooKeeperSaslClient:
> Client will not SASL-authenticate because the default JAAS configuration
> section 'Client' could not be found. If you are not using SASL, you may
> ignore this. On the other hand, if you expected SASL to work, please fix
> your JAAS configuration.
> 2012-08-06 01:44:02,368 WARN org.apache.zookeeper.ClientCnxn: Session 0x0
> for server null, unexpected error, closing socket connection and
> attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>         at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1039)
> 2012-08-06 01:44:02,470 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper:
> Possibly transient ZooKeeper exception:
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /hbase/master
> 2012-08-06 01:44:02,470 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper:
> ZooKeeper exists failed after 3 retries
> 2012-08-06 01:44:02,470 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil:
> hconnection Unable to set watcher on znode /hbase/master
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /hbase/master
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
>         at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:226)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:82)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:580)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:569)
>         at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:186)
>         at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:194)
>         at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:171)
>         at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:195)
>         at org.apache.pig.backend.hadoop.hbase.HBaseStorage.getOutputFormat(HBaseStorage.java:521)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:91)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.<init>(PigOutputCommitter.java:70)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:279)
>         at org.apache.hadoop.mapred.Task.initialize(Task.java:514)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:308)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>         at org.apache.hadoop.mapred.Child.main(Child.java:264)
>
> I also tried this:
> http://mail-archives.apache.org/mod_mbox/pig-user/201205.mbox/%[email protected]%3E
> That doesn't work either. In the thread above, the LOAD operation did not
> work. In my case, the LOAD operation works fine (when I dump it in the
> grunt shell), but STORE doesn't.
>
> Thanks,
> Hari Prasanna
>
