(Apologies for resending but corrected script below)
This is the error I got: ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast. Updated code: a = LOAD 'Report' AS ( dt:chararray, Server:chararray, Type:chararray, Ops:chararray, UserID:chararray, U_cnt:int, U_tm:int, U_min_tm:int, U_max_tm:int, U_avg_tm:float ); --Remove Test Servers remtest = filter a by not Server matches 'Test%'; -- Filter to required columns reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; --Groupby G2 = group reqd by (Server,Type,Ops); --Sum the User Counts and Times G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; store G3 into 'Servertest'; ----- Original Message ----- From: Prashant Kommireddi <[email protected]> To: "[email protected]" <[email protected]> Cc: Sent: Tuesday, November 13, 2012 11:59 AM Subject: Re: Help with Script Hi, Can you paste the error message here? Sent from my iPhone On Nov 13, 2012, at 8:34 AM, "[email protected]" <[email protected]> wrote: > hey all, > > Very new Pig user here. I think I'm trying to get something very simple done > but getting a few errors. See me script below.Any guidance will be > appreciated.Thanks. > > I get errors such as Error during parsing. Invalid alias: serverin {time: > double,count: double} > I am basically trying to duplicate the following SQL query: > > select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) > from TableA > group by 1, 2, 3; > > > > My script is as follows: > > a = LOAD 'Report' AS ( > dt:chararray, > Server:chararray, > Type:chararray, > Ops:chararray, > UserID:chararray, > U_cnt:int, > U_tm:int, > U_min_tm:int, > U_max_tm:int, > U_avg_tm:float, > ); > > > --Remove Test Servers > remtest = filter a by not Server matches 'Test%'; > -- Filter to required columns > reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; > --Groupby > G2 = group reqd by Server,Type,Ops; > --Sum the User Counts and Times > G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; > --byServeroperation = order G3 by Server; > store G3 into 'Servertest'; > > ingvay7
