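such as the following:

movie = LOAD '$input' USING PigStorage(',') AS (user_id:int, movie_id:chararray, timestamp:int);
-- group on both fields so each (user_id, movie_id) pair gets its own count
movie_group = GROUP movie BY (user_id, movie_id);
movie_count = FOREACH movie_group GENERATE FLATTEN(group) AS (user_id, movie_id), COUNT(movie) AS MovieCount;

Grouping by user_id alone would leave movie_id out of reach in the FOREACH, so the group key has to include both fields. PigStorage(',') is there because your sample data is comma-separated and the default delimiter is a tab.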

On Thu, May 15, 2014 at 4:25 AM, Chengi Liu <[email protected]> wrote:

> Hi,
>
> My data is in format:
>
> user_id,movie_id,timestamp
> 123, abc,unix_timestamp
> 123, def, ...
> 123, abc, ...
> 234, sda, ...
>
>
> Now, I want to compute the number of times each movie is played in pig..
> So the output I am expecting is:
>
> 123,abc,2
> 123,def,1
> 234,sda,1
>
> and so on..
> how do i do this in pig
>

--
Regards
Shengjun
