Thanks Subir, wasn't sure if I should cross-post.

Royston
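P.S. One detail in the task log below jumps out: the ZooKeeper client is created with connectString=localhost:2181, i.e. the built-in default, so I suspect the map tasks never see our hbase-site.xml and each one tries to reach ZooKeeper on its own node. As a sketch of a workaround I intend to try (zkhost is a placeholder for the real quorum host), the quorum can be forced into the job configuration from the Pig script itself:

    -- sketch: push the real ZooKeeper quorum into the job conf so map
    -- tasks do not fall back to localhost:2181 (zkhost is a placeholder)
    set hbase.zookeeper.quorum 'zkhost';
    set hbase.zookeeper.property.clientPort '2181';

Pig's set statement writes these properties straight into the job configuration, so they should reach the tasks even when hbase-site.xml does not.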
On 24 Apr 2012, at 13:29, Subir S wrote:

> Looping HBase group.
>
> On Tue, Apr 24, 2012 at 5:18 PM, Royston Sellman <[email protected]> wrote:
>
>> We still haven't cracked this, but here is a bit more info (HBase 0.95; Pig 0.11):
>>
>> The script below runs in a few seconds using Pig in local mode, but in MR mode it
>> occasionally finishes quickly and usually takes 40 minutes to an hour.
>>
>> --hbaseuploadtest.pig
>> register /opt/hbase/hbase-trunk/lib/protobuf-java-2.4.0a.jar
>> register /opt/hbase/hbase-trunk/lib/guava-r09.jar
>> register /opt/hbase/hbase-trunk/hbase-0.95-SNAPSHOT.jar
>> register /opt/zookeeper/zookeeper-3.4.3/zookeeper-3.4.3.jar
>> raw_data = LOAD '/data/sse.tbl1.HEADERLESS.csv' USING PigStorage(',') AS
>>     (mid:chararray, hid:chararray, mf:chararray, mt:chararray,
>>      mind:chararray, mimd:chararray, mst:chararray);
>> dump raw_data;
>> STORE raw_data INTO 'hbase://hbaseuploadtest' USING
>>     org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>>         'info:hid info:mf info:mt info:mind info:mimd info:mst');
>>
>> i.e.
>>
>> [hadoop1@namenode hadoop-1.0.2]$ pig -x local ../pig-scripts/hbaseuploadtest.pig
>>
>> WORKS EVERY TIME!!
>>
>> But:
>>
>> [hadoop1@namenode hadoop-1.0.2]$ pig -x mapreduce ../pig-scripts/hbaseuploadtest.pig
>>
>> sometimes (but rarely) runs in under a minute, and often takes more than 40 minutes
>> to get to 50% but then completes to 100% in seconds. The dataset is very small.
>>
>> Note that the dump of raw_data works in both cases. However, the STORE command
>> causes the MR job to stall, and the job setup task shows the following errors:
>>
>> Task attempt_201204240854_0006_m_000002_0 failed to report status for 602 seconds. Killing!
>> Task attempt_201204240854_0006_m_000002_1 failed to report status for 601 seconds. Killing!
>>
>> And the task log shows the following stream of errors:
>>
>> 2012-04-24 11:57:27,427 INFO org.apache.zookeeper.ZooKeeper: Initiating client
>> connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection 0x5567d7fb
>> 2012-04-24 11:57:27,441 INFO org.apache.zookeeper.ClientCnxn: Opening socket
>> connection to server /127.0.0.1:2181
>> 2012-04-24 11:57:27,443 WARN org.apache.zookeeper.client.ZooKeeperSaslClient:
>> SecurityException: java.lang.SecurityException: Unable to locate a login
>> configuration occurred when trying to find JAAS configuration.
>> 2012-04-24 11:57:27,443 INFO org.apache.zookeeper.client.ZooKeeperSaslClient:
>> Client will not SASL-authenticate because the default JAAS configuration section
>> 'Client' could not be found. If you are not using SASL, you may ignore this. On
>> the other hand, if you expected SASL to work, please fix your JAAS configuration.
>> 2012-04-24 11:57:27,444 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for
>> server null, unexpected error, closing socket connection and attempting reconnect
>> java.net.ConnectException: Connection refused
>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>>     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035)
>> 2012-04-24 11:57:27,445 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper:
>> The identifier of this process is 6846@slave2
>> 2012-04-24 11:57:27,551 INFO org.apache.zookeeper.ClientCnxn: Opening socket
>> connection to server /127.0.0.1:2181
>> 2012-04-24 11:57:27,552 WARN org.apache.zookeeper.client.ZooKeeperSaslClient:
>> SecurityException: java.lang.SecurityException: Unable to locate a login
>> configuration occurred when trying to find JAAS configuration.
>> 2012-04-24 11:57:27,552 INFO org.apache.zookeeper.client.ZooKeeperSaslClient:
>> Client will not SASL-authenticate because the default JAAS configuration section
>> 'Client' could not be found. If you are not using SASL, you may ignore this. On
>> the other hand, if you expected SASL to work, please fix your JAAS configuration.
>> 2012-04-24 11:57:27,552 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for
>> server null, unexpected error, closing socket connection and attempting reconnect
>> java.net.ConnectException: Connection refused
>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>>     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035)
>> 2012-04-24 11:57:27,553 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper:
>> Possibly transient ZooKeeper exception:
>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>> KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
>> 2012-04-24 11:57:27,553 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping
>> 2000ms before retry #1...
>> 2012-04-24 11:57:28,652 INFO org.apache.zookeeper.ClientCnxn: Opening socket
>> connection to server localhost/127.0.0.1:2181
>> 2012-04-24 11:57:28,653 WARN org.apache.zookeeper.client.ZooKeeperSaslClient:
>> SecurityException: java.lang.SecurityException: Unable to locate a login
>> configuration occurred when trying to find JAAS configuration.
>> 2012-04-24 11:57:28,653 INFO org.apache.zookeeper.client.ZooKeeperSaslClient:
>> Client will not SASL-authenticate because the default JAAS configuration section
>> 'Client' could not be found. If you are not using SASL, you may ignore this. On
>> the other hand, if you expected SASL to work, please fix your JAAS configuration.
>> 2012-04-24 11:57:28,653 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for
>> server null, unexpected error, closing socket connection and attempting reconnect
>> java.net.ConnectException: Connection refused
>> ... etc. etc.
>>
>> Any ideas? Is anyone else out there successfully running Pig 0.11 HBaseStorage()
>> against HBase 0.95?
>>
>> Thanks,
>> Royston
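Following up on my P.S. above: since every connection attempt in that log goes to localhost:2181, I suspect the job never picks up our hbase-site.xml at all. My understanding (I may be wrong) is that HBaseStorage merges the HBase configuration it finds on the launching classpath into the job, so putting the conf directory on Pig's classpath before submitting might be enough. A sketch, assuming the conf lives under /opt/hbase/hbase-trunk/conf; adjust to your layout:

    # sketch: make hbase-site.xml visible to the Pig launcher so its
    # settings are serialized into the MR job configuration
    export HBASE_CONF_DIR=/opt/hbase/hbase-trunk/conf
    export PIG_CLASSPATH="$HBASE_CONF_DIR:$PIG_CLASSPATH"
    pig -x mapreduce ../pig-scripts/hbaseuploadtest.pig

That theory would also square with local mode always working if the ZooKeeper quorum happens to run on the machine we launch Pig from: there localhost:2181 resolves to a real quorum member, while on the slaves (note the 6846@slave2 line) it does not.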
>>
>> -----Original Message-----
>> From: Dmitriy Ryaboy [mailto:[email protected]]
>> Sent: 20 April 2012 00:03
>> To: [email protected]
>> Subject: Re: HBaseStorage not working
>>
>> Nothing significant changed in Pig trunk, so I am guessing HBase changed
>> something; you are more likely to get help from them (they should at least be
>> able to point at APIs that changed and are likely to cause this sort of thing).
>>
>> You might also want to check whether any of the started MR jobs have anything
>> interesting in their task logs.
>>
>> D
>>
>> On Thu, Apr 19, 2012 at 1:41 PM, Royston Sellman
>> <[email protected]> wrote:
>>> Does HBaseStorage work with HBase 0.95?
>>>
>>> This code was working with HBase 0.92 and Pig 0.9 but fails on HBase 0.95
>>> and Pig 0.11 (built from source):
>>>
>>> register /opt/hbase/hbase-trunk/hbase-0.95-SNAPSHOT.jar
>>> register /opt/zookeeper/zookeeper-3.4.3/zookeeper-3.4.3.jar
>>>
>>> tbl1 = LOAD 'input/sse.tbl1.HEADERLESS.csv' USING PigStorage(',') AS (
>>>     ID:chararray, hp:chararray, pf:chararray, gz:chararray, hid:chararray,
>>>     hst:chararray, mgz:chararray, gg:chararray, epc:chararray );
>>>
>>> STORE tbl1 INTO 'hbase://sse.tbl1'
>>>     USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('edrp:hp edrp:pf
>>>     edrp:gz edrp:hid edrp:hst edrp:mgz edrp:gg edrp:epc');
>>>
>>> The job output (using either Grunt or PigServer makes no difference) shows
>>> the family:descriptors being added by HBaseStorage, then starts the MR job,
>>> which (after a long pause) reports:
>>>
>>> ------------
>>> Input(s):
>>> Failed to read data from
>>> "hdfs://namenode:8020/user/hadoop1/input/sse.tbl1.HEADERLESS.csv"
>>>
>>> Output(s):
>>> Failed to produce result in "hbase://sse.tbl1"
>>>
>>> INFO mapReduceLayer.MapReduceLauncher: Failed!
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:hp
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:pf
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:gz
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:hid
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:hst
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:mgz
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:gg
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:epc
>>> ------------
>>>
>>> The "Failed to read" message is misleading, I think, because replacing the
>>> STORE with "dump tbl1;" works fine.
>>>
>>> I get nothing in the HBase logs and nothing in the Pig log.
>>>
>>> HBase works fine from the shell, and I can read and write the table. Pig
>>> works fine in and out of HDFS on CSVs.
>>>
>>> Any ideas?
>>>
>>> Royston
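P.P.S. For anyone who finds this thread later: as far as I understand HBaseStorage, the first field of the relation becomes the HBase row key, which is why the scripts above list one column mapping fewer than the relation has fields (6 mappings for the 7-field raw_data, 8 for the 9-field tbl1). A minimal sketch of the round trip, reusing tbl1 and the edrp family from the script above:

    -- the first field (ID) becomes the row key; hp is stored as edrp:hp
    pair = FOREACH tbl1 GENERATE ID, hp;
    STORE pair INTO 'hbase://sse.tbl1'
        USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('edrp:hp');

    -- '-loadKey true' returns the row key as the first field when reading back
    check = LOAD 'hbase://sse.tbl1'
        USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('edrp:hp', '-loadKey true')
        AS (ID:chararray, hp:chararray);
    dump check;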
