Wow, adding the jars did the trick. Thank you very much.

On Fri, Jul 15, 2011 at 3:58 PM, Bill Graham <[email protected]> wrote:
> That's because your pig script probably doesn't register the guava jar. Be
> sure to register the guava, hbase and zookeeper jars in your script.
>
> On Fri, Jul 15, 2011 at 3:52 PM, sulabh choudhury <[email protected]> wrote:
>
>> Yes, I see a few errors in the JT logs:
>> java.lang.NoClassDefFoundError: com/google/common/collect/Lists
>> ClassNotFoundException:
>> org.apache.hadoop.hbase.filter.WritableByteArrayComparable
>>
>> I think it cannot find some dependent jars. How or where do I add these
>> jars so that Pig can see them?
>>
>> On Fri, Jul 15, 2011 at 3:27 PM, Bill Graham <[email protected]> wrote:
>>
>>> What do you see in the map and reduce task logs in the JT UI for that
>>> job?
>>>
>>> This job is failing for some reason, so there should be some hint in the
>>> task logs.
>>>
>>> On Fri, Jul 15, 2011 at 2:31 PM, sulabh choudhury <[email protected]> wrote:
>>>
>>>> Bill,
>>>>
>>>> There is no useful message in the logs (pasted below).
>>>> I tried SET pig.usenewlogicalplan 'false', which did not help.
>>>> I am using pig-0.8.0-cdh3u0. I have tried both with and without the
>>>> 'hbase://' prefix.
>>>>
>>>> 2011-07-15 14:19:58,700 [main] INFO
>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>>>> - 100% complete
>>>> 2011-07-15 14:19:58,702 [main] ERROR
>>>> org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
>>>> 2011-07-15 14:19:58,703 [main] INFO
>>>> org.apache.pig.tools.pigstats.PigStats - Script Statistics:
>>>>
>>>> HadoopVersion  PigVersion    UserId  StartedAt            FinishedAt           Features
>>>> 0.20.2-cdh3u0  0.8.0-cdh3u0  cxt     2011-07-15 14:18:11  2011-07-15 14:19:58  GROUP_BY,ORDER_BY
>>>>
>>>> Some jobs have failed! Stop running all dependent jobs
>>>>
>>>> Job Stats (time in seconds):
>>>> JobId  Maps  Reduces  MaxMapTime  MinMapTime  AvgMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  Alias  Feature  Outputs
>>>> job_201106212025_0139  1  1  8  8  8  12  12  12  A,ct,grp  GROUP_BY,COMBINER
>>>> job_201106212025_0140  1  1  3  3  3  12  12  12  sorted  SAMPLER
>>>>
>>>> Failed Jobs:
>>>> JobId  Alias  Feature  Message  Outputs
>>>> job_201106212025_0141  result,sorted  ORDER_BY  Message: Job failed! Error - NA  pig_test,
>>>>
>>>> Input(s):
>>>> Successfully read 2583 records (330 bytes) from: "hbase://transaction"
>>>>
>>>> Output(s):
>>>> Failed to produce result in "pig_test"
>>>>
>>>> On Fri, Jul 15, 2011 at 1:16 PM, Bill Graham <[email protected]> wrote:
>>>>
>>>>> What version of Pig are you using and what errors are you seeing?
>>>>>
>>>>> There was PIG-1870, related to projections, that might apply, but I
>>>>> can't say so for sure. If that's the case it should work if you disable
>>>>> the new logical plan with -Dpig.usenewlogicalplan=false.
>>>>>
>>>>> Also, you might try specifying pig_test as 'hbase://pig_test'. I recall
>>>>> another JIRA about that as well.
>>>>>
>>>>> On Fri, Jul 15, 2011 at 12:40 PM, sulabh choudhury <[email protected]> wrote:
>>>>>
>>>>> > I have been trying to store data in HBase using the HBaseStorage
>>>>> > class. While I can store the original read data, it fails when I try
>>>>> > to store the processed data.
>>>>> > Which means I might be messing up the datatypes somewhere.
>>>>> >
>>>>> > My script is below:
>>>>> >
>>>>> > REGISTER myudfs.jar
>>>>> > A = load 'hbase://transaction' using
>>>>> >     org.apache.pig.backend.hadoop.hbase.HBaseStorage('log:ref2', '-loadKey')
>>>>> >     AS (row:chararray, code:chararray);
>>>>> > grp = group A by myudfs.Parser(code);
>>>>> > ct = foreach grp generate group, COUNT(A.code) as count;
>>>>> > sorted = order ct by count desc;
>>>>> > result = foreach sorted generate $0 as row, (chararray)$1;
>>>>> > store result into 'pig_test' USING
>>>>> >     org.apache.pig.backend.hadoop.hbase.HBaseStorage('log:count');
>>>>> >
>>>>> > The dump of "result" works but the store to HBase fails.
>>>>> > When I try to store A it works fine.
>>>>> >
>>>>> > The datatypes of A and result are:
>>>>> > A: {row: chararray, code: chararray}
>>>>> > result: {row: chararray, count: chararray}
>>>>>
>>
>> --
>> Thanks and Regards,
>> Sulabh Choudhury

--
Thanks and Regards,
Sulabh Choudhury
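[For readers landing here with the same NoClassDefFoundError / ClassNotFoundException: the fix that resolved this thread was registering the HBase, ZooKeeper, and Guava jars in the Pig script, so they ship to the backend map/reduce tasks (which is why the errors only showed up in the JT task logs). A minimal sketch; the jar paths and version numbers below are assumptions matching a typical cdh3u0 install, so adjust them to your cluster:]

```pig
-- Register HBaseStorage's runtime dependencies so Pig ships them
-- with the MapReduce job. Paths/versions are illustrative only.
REGISTER /usr/lib/hbase/hbase-0.90.1-cdh3u0.jar;
REGISTER /usr/lib/zookeeper/zookeeper-3.3.3-cdh3u0.jar;
REGISTER /usr/lib/hbase/lib/guava-r06.jar;

-- UDF jar from the script in this thread.
REGISTER myudfs.jar;
```

Note that putting these jars on PIG_CLASSPATH alone only helps the client side; REGISTER is what makes them available to the tasks that actually run HBaseStorage.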
