Thanks Rajgopal. Should I create a Jira? (I've never done that before.) Do you know if anybody is successfully running Pig 0.11 on HBase 0.95 and Hadoop 1.0.3?
Regards,
Royston

On 20 Apr 2012, at 14:42, Rajgopal Vaithiyanathan wrote:

> That is a mistake. It should be corrected!
>
> The way you are using it is right.
> The ID (first column) will be the HBase row key, and the others will go
> into the columns you mention in the argument of HBaseStorage.
>
>
> On Fri, Apr 20, 2012 at 6:23 PM, Royston Sellman <
> [email protected]> wrote:
>
>> OK, I'll ping the HBase folks...
>>
>> Meanwhile, are the HBaseStorage docs correct? The example shows the STORE
>> command having 'USING' and 'AS' clauses, but 'AS' gives a parse error. 'AS'
>> is valid in LOADs though.
>>
>> Cheers,
>> Royston
>>
>>
>> -----Original Message-----
>> From: Dmitriy Ryaboy [mailto:[email protected]]
>> Sent: 20 April 2012 00:03
>> To: [email protected]
>> Subject: Re: HBaseStorage not working
>>
>> Nothing significant changed in Pig trunk, so I am guessing HBase changed
>> something; you are more likely to get help from them (they should at least
>> be able to point at APIs that changed and are likely to cause this sort of
>> thing).
>>
>> You might also want to check if any of the started MR jobs have anything
>> interesting in their task logs.
>>
>> D
>>
>> On Thu, Apr 19, 2012 at 1:41 PM, Royston Sellman
>> <[email protected]> wrote:
>>> Does HBaseStorage work with HBase 0.95?
>>>
>>> This code was working with HBase 0.92 and Pig 0.9 but fails on HBase
>>> 0.95 and Pig 0.11 (built from source):
>>>
>>> register /opt/hbase/hbase-trunk/hbase-0.95-SNAPSHOT.jar
>>> register /opt/zookeeper/zookeeper-3.4.3/zookeeper-3.4.3.jar
>>>
>>> tbl1 = LOAD 'input/sse.tbl1.HEADERLESS.csv' USING PigStorage( ',' ) AS (
>>>     ID:chararray,
>>>     hp:chararray,
>>>     pf:chararray,
>>>     gz:chararray,
>>>     hid:chararray,
>>>     hst:chararray,
>>>     mgz:chararray,
>>>     gg:chararray,
>>>     epc:chararray );
>>>
>>> STORE tbl1 INTO 'hbase://sse.tbl1'
>>>     USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('edrp:hp edrp:pf edrp:gz edrp:hid edrp:hst edrp:mgz edrp:gg edrp:epc');
>>>
>>> The job output (using either Grunt or PigServer makes no difference)
>>> shows the family:descriptors being added by HBaseStorage then starts
>>> up the MR job which (after a long pause) reports:
>>>
>>> ------------
>>> Input(s):
>>> Failed to read data from
>>> "hdfs://namenode:8020/user/hadoop1/input/sse.tbl1.HEADERLESS.csv"
>>>
>>> Output(s):
>>> Failed to produce result in "hbase://sse.tbl1"
>>>
>>> INFO mapReduceLayer.MapReduceLauncher: Failed!
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:hp
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:pf
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:gz
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:hid
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:hst
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:mgz
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:gg
>>> INFO hbase.HBaseStorage: Adding family:descriptor filters with values edrp:epc
>>> ------------
>>>
>>> The "Failed to read" is misleading I think because dump tbl1; in place
>>> of the store works fine.
>>>
>>> I get nothing in the HBase logs and nothing in the Pig log.
>>>
>>> HBase works fine from the shell and can read and write to the table.
>>> Pig works fine in and out of HDFS on CSVs.
>>>
>>> Any ideas?
>>>
>>> Royston
>
> --
> Thanks and Regards,
> Rajgopal Vaithiyanathan.
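
For anyone reading the thread later: the mapping Rajgopal describes (first field of each tuple becomes the row key, the remaining fields go, in order, into the columns listed in the HBaseStorage argument) can be sketched roughly as below. This is plain Python for illustration only, not HBaseStorage's actual code. The function name `map_tuple_to_cells`, the sample values, and the length check are assumptions of this sketch; only the column names come from the script in the thread.

```python
from typing import Dict, Sequence, Tuple


def map_tuple_to_cells(record: Sequence[str], column_spec: str) -> Tuple[str, Dict[str, str]]:
    """Model how a Pig tuple lines up with an HBaseStorage column list.

    Hypothetical helper: it talks to neither Pig nor HBase. It only shows
    that field 0 is the row key and fields 1..n pair up with the
    space-separated 'family:qualifier' names.
    """
    if not record:
        raise ValueError("record must contain at least the row key")

    columns = column_spec.split()  # 'edrp:hp edrp:pf' -> ['edrp:hp', 'edrp:pf']
    rowkey, values = record[0], list(record[1:])

    # Strictness is a choice made for this sketch, not documented behaviour.
    if len(values) != len(columns):
        raise ValueError(
            f"expected {len(columns)} values after the row key, got {len(values)}"
        )
    return rowkey, dict(zip(columns, values))


if __name__ == "__main__":
    # Made-up sample row; the column names are the first three from the thread.
    rowkey, cells = map_tuple_to_cells(
        ["row-001", "a", "b", "c"], "edrp:hp edrp:pf edrp:gz"
    )
    print(rowkey)
    print(cells)
```

This is why the STORE statement needs no AS clause: the relation's schema already fixes the field order, and the column list supplies the destination for every field after the first.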
