Hey all,
I'm a very new Pig user. I think I'm trying to get something very simple done,
but I'm getting a few errors. See my script below. Any guidance will be
appreciated. Thanks.
I get errors such as:
Error during parsing. Invalid alias: serverin {time: double,count: double}
I am basically trying to duplicate the following SQL query:
select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt)
from TableA
group by 1, 2, 3;
My script is as follows:
a = LOAD 'Report' AS (
dt:chararray,
Server:chararray,
Type:chararray,
Ops:chararray,
UserID:chararray,
U_cnt:int,
U_tm:int,
U_min_tm:int,
U_max_tm:int,
U_avg_tm:float,
);
--Remove Test Servers
remtest = filter a by not Server matches 'Test%';
-- Filter to required columns
reqd = foreach remtest generate $1,$2,$3,$4,$5,$6;
--Groupby
G2 = group reqd by Server,Type,Ops;
--Sum the User Counts and Times
G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count;
--byServeroperation = order G3 by Server;
store G3 into 'Servertest';
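For comparison, here is a minimal sketch of how the SQL query above is commonly written in Pig Latin. It has not been run against this data. It assumes tab-delimited input (the PigStorage default), and the relation names (raw, prod, grouped, summary) and output column names (users, total_tm, total_cnt) are made up for illustration:

```pig
-- Sketch only: relation and output column names are hypothetical.
raw = LOAD 'Report' AS (
    dt:chararray,
    Server:chararray,
    Type:chararray,
    Ops:chararray,
    UserID:chararray,
    U_cnt:int,
    U_tm:int,
    U_min_tm:int,
    U_max_tm:int,
    U_avg_tm:float      -- no comma after the last field
);

-- matches takes a Java regular expression, so 'Test.*' instead of SQL-style 'Test%'
prod = FILTER raw BY NOT (Server matches 'Test.*');

-- Multiple grouping keys are written as a parenthesized tuple
grouped = GROUP prod BY (Server, Type, Ops);

-- After GROUP, each row holds the key tuple "group" and a bag named after the
-- grouped relation (prod), so aggregates reference prod.<field>.
-- FLATTEN(group) turns the key tuple back into three plain columns.
summary = FOREACH grouped GENERATE
    FLATTEN(group)   AS (Server, Type, Ops),
    COUNT_STAR(prod) AS users,        -- counterpart of count(*)
    SUM(prod.U_tm)   AS total_tm,
    SUM(prod.U_cnt)  AS total_cnt;

STORE summary INTO 'Servertest';
```

The main differences from the script above are the regular expression in the filter, the parentheses around the group keys, and the prod. prefix on the fields passed to SUM. Ordering the result by Server should also work after FLATTEN, since Server is then an ordinary column of summary.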
ingvay7