Thanks Subir, wasn't sure if I should cross-post.

Royston


On 24 Apr 2012, at 13:29, Subir S wrote:

> Looping HBase group.
> 
> On Tue, Apr 24, 2012 at 5:18 PM, Royston Sellman <
> [email protected]> wrote:
> 
>> We still haven't cracked this, but a bit more info (HBase 0.95; Pig 0.11):
>> 
>> The script below runs fine in a few seconds using Pig in local mode, but
>> with Pig in MR mode it sometimes works rapidly and usually takes 40
>> minutes to an hour.
>> 
>> --hbaseuploadtest.pig
>> register /opt/hbase/hbase-trunk/lib/protobuf-java-2.4.0a.jar
>> register /opt/hbase/hbase-trunk/lib/guava-r09.jar
>> register /opt/hbase/hbase-trunk/hbase-0.95-SNAPSHOT.jar
>> register /opt/zookeeper/zookeeper-3.4.3/zookeeper-3.4.3.jar
>> raw_data = LOAD '/data/sse.tbl1.HEADERLESS.csv' USING PigStorage(',') AS
>>     (mid:chararray, hid:chararray, mf:chararray, mt:chararray,
>>      mind:chararray, mimd:chararray, mst:chararray);
>> dump raw_data;
>> STORE raw_data INTO 'hbase://hbaseuploadtest' USING
>>     org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>>         'info:hid info:mf info:mt info:mind info:mimd info:mst');
>> 
>> i.e.
>> [hadoop1@namenode hadoop-1.0.2]$ pig -x local ../pig-scripts/hbaseuploadtest.pig
>> WORKS EVERY TIME!!
>> But
>> [hadoop1@namenode hadoop-1.0.2]$ pig -x mapreduce ../pig-scripts/hbaseuploadtest.pig
>> sometimes (but rarely) runs in under a minute; usually it takes more than 40
>> minutes to get to 50%, then completes to 100% in seconds. The dataset is
>> very small.
>> 
>> Note that the dump of raw_data works in both cases. However, the STORE
>> command causes the MR job to stall, and the job setup task shows the
>> following errors:
>> Task attempt_201204240854_0006_m_000002_0 failed to report status for 602
>> seconds. Killing!
>> Task attempt_201204240854_0006_m_000002_1 failed to report status for 601
>> seconds. Killing!
>> 
>> And the task log shows the following stream of errors:
>> 
>> 2012-04-24 11:57:27,427 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection 0x5567d7fb
>> 2012-04-24 11:57:27,441 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server /127.0.0.1:2181
>> 2012-04-24 11:57:27,443 WARN org.apache.zookeeper.client.ZooKeeperSaslClient: SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
>> 2012-04-24 11:57:27,443 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
>> 2012-04-24 11:57:27,444 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>> java.net.ConnectException: Connection refused
>>       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>       at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>>       at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
>>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035)
>> 2012-04-24 11:57:27,445 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is 6846@slave2
>> 2012-04-24 11:57:27,551 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server /127.0.0.1:2181
>> 2012-04-24 11:57:27,552 WARN org.apache.zookeeper.client.ZooKeeperSaslClient: SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
>> 2012-04-24 11:57:27,552 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
>> 2012-04-24 11:57:27,552 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>> java.net.ConnectException: Connection refused
>>       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>       at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>>       at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
>>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035)
>> 2012-04-24 11:57:27,553 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
>> 2012-04-24 11:57:27,553 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 2000ms before retry #1...
>> 2012-04-24 11:57:28,652 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181
>> 2012-04-24 11:57:28,653 WARN org.apache.zookeeper.client.ZooKeeperSaslClient: SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
>> 2012-04-24 11:57:28,653 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
>> 2012-04-24 11:57:28,653 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>> java.net.ConnectException: Connection refused etc etc
>> 
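>> Looking at that log, the task JVMs appear to be falling back to a default
>> quorum of localhost:2181 instead of our actual ZooKeeper node, which makes
>> me suspect hbase-site.xml isn't reaching the task classpath. One thing we
>> intend to try (untested; 'namenode' below is just a stand-in for whichever
>> host actually runs ZooKeeper) is forcing the quorum from the script itself:
>>
>> -- untested workaround sketch: push the ZK quorum into the job conf so the
>> -- MR tasks don't fall back to localhost:2181
>> set hbase.zookeeper.quorum 'namenode';
>> set hbase.zookeeper.property.clientPort '2181';
>>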
>> Any ideas? Anyone else out there successfully running Pig 0.11
>> HBaseStorage() against HBase 0.95?
>> 
>> Thanks,
>> Royston
>> 
>> 
>> 
>> -----Original Message-----
>> From: Dmitriy Ryaboy [mailto:[email protected]]
>> Sent: 20 April 2012 00:03
>> To: [email protected]
>> Subject: Re: HBaseStorage not working
>> 
>> Nothing significant changed in Pig trunk, so I am guessing HBase changed
>> something; you are more likely to get help from them (they should at least
>> be able to point at APIs that changed and are likely to cause this sort of
>> thing).
>> 
>> You might also want to check if any of the started MR jobs have anything
>> interesting in their task logs.
>> 
>> D
>> 
>> On Thu, Apr 19, 2012 at 1:41 PM, Royston Sellman
>> <[email protected]> wrote:
>>> Does HBaseStorage work with HBase 0.95?
>>> 
>>> This code was working with HBase 0.92 and Pig 0.9 but fails on HBase
>>> 0.95 and Pig 0.11 (built from source):
>>> 
>>> register /opt/hbase/hbase-trunk/hbase-0.95-SNAPSHOT.jar
>>> register /opt/zookeeper/zookeeper-3.4.3/zookeeper-3.4.3.jar
>>> 
>>> tbl1 = LOAD 'input/sse.tbl1.HEADERLESS.csv' USING PigStorage(',') AS (
>>>     ID:chararray,
>>>     hp:chararray,
>>>     pf:chararray,
>>>     gz:chararray,
>>>     hid:chararray,
>>>     hst:chararray,
>>>     mgz:chararray,
>>>     gg:chararray,
>>>     epc:chararray );
>>> 
>>> STORE tbl1 INTO 'hbase://sse.tbl1' USING
>>>     org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>>>         'edrp:hp edrp:pf edrp:gz edrp:hid edrp:hst edrp:mgz edrp:gg edrp:epc');
>>> 
>>> The job output (using either Grunt or PigServer makes no difference)
>>> shows the family:descriptor filters being added by HBaseStorage, then
>>> the MR job starts and (after a long pause) reports:
>>> 
>>> ------------
>>> Input(s):
>>> Failed to read data from
>>> "hdfs://namenode:8020/user/hadoop1/input/sse.tbl1.HEADERLESS.csv"
>>> 
>>> Output(s):
>>> Failed to produce result in "hbase://sse.tbl1"
>>> 
>>> INFO mapReduceLayer.MapReduceLauncher: Failed!
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:hp
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:pf
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:gz
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:hid
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:hst
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:mgz
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:gg
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:epc
>>> ------------
>>> 
>>> The "Failed to read" is misleading, I think, because dump tbl1; in place
>>> of the STORE works fine.
>>> 
>>> I get nothing in the HBase logs and nothing in the Pig log.
>>> 
>>> HBase works fine from the shell and can read and write to the table.
>>> Pig works fine in and out of HDFS on CSVs.
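>>> 
>>> One further check I plan to try (untested as I write this; it assumes
>>> the table and the 'edrp' family above already exist): load the table
>>> back through HBaseStorage. If the read path works from Pig while the
>>> STORE fails, that would confine the problem to the write path.
>>> 
>>> -- read-path check; with -loadKey the row key arrives as the first field
>>> hb = LOAD 'hbase://sse.tbl1'
>>>      USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>>>          'edrp:hp edrp:pf', '-loadKey true')
>>>      AS (id:chararray, hp:chararray, pf:chararray);
>>> dump hb;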
>>> 
>>> Any ideas?
>>> 
>>> Royston
