Hi, I'm writing a script to perform some analytics on a set of events occurring in a set of apps. I'm using Pig 0.11 and Hadoop 1.3.
Every event contains:
- d: date of the event
- aid: app id
- uid: user id

The aim of my script is to calculate, for each application and for each day in my log, the number of unique users during the previous x days (2 in the example code). After trying various approaches without success, my current script looks like this:

________________________________________________________________
/**
 * describe events output:
 *
 * events: {d: chararray,aid: chararray,uid: chararray}
 */
eventDates = FOREACH events GENERATE d as targetDate;
dates = DISTINCT eventDates;
crossed = CROSS (GROUP events BY (aid)), dates;

/**
 * describe crossed output:
 *
 * crossed: {1-7::group: chararray,1-7::events: {(d: chararray,aid: chararray,uid: chararray)},dates::targetDate: chararray}
 */
result = FOREACH crossed {
    date = ToDate(targetDate, 'yyyy-MM-dd');
    filtered = FILTER events BY DaysBetween(ToDate(d, 'yyyy-MM-dd'), date) < 2 AND SecondsBetween(ToDate(d, 'yyyy-MM-dd'), date) > 0;
    uniqueUsers = DISTINCT filtered.uid;
    GENERATE group as aid, targetDate as date, COUNT(uniqueUsers) as result;
}

describe result;
dump result;
________________________________________________________________

At this point I get the following error:

2013-12-19 05:20:17,283 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1025: <file script.pig, line 46, column 25> Invalid field projection. Projected field [targetDate] does not exist in schema: d:bytearray,aid:chararray,uid:chararray.

Line 46 is equivalent to:

date = ToDate(targetDate, 'yyyy-MM-dd');

But if I hardcode the date instead of reading it from the "crossed" bag:

date = ToDate('2013-12-01', 'yyyy-MM-dd');

it actually works. It looks as if, when I nest a FOREACH inside another FOREACH, I can no longer project the first-level fields. Any idea why this happens? Or is there perhaps a better way to achieve the same result? Forgive any stupidity I may have written, this is my first attempt at Pig scripting! Any suggestion is highly appreciated.
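In case it helps to pin down the result I am after, here is a minimal Python sketch of the calculation, independent of Pig. The sample events and the function name unique_users_per_day are made up for illustration, and it assumes that "previous x days" means the x days strictly before the target day (the target day itself is excluded); that reading is an assumption and may need adjusting.

```python
from datetime import datetime, timedelta


def unique_users_per_day(events, x=2):
    """Count unique users per (aid, target day) over the x days before that day.

    events: iterable of (d, aid, uid) tuples, with d formatted as 'yyyy-MM-dd'.
    Assumption: the window is [target - x days, target), target day excluded.
    """
    parsed = [(datetime.strptime(d, "%Y-%m-%d").date(), aid, uid)
              for d, aid, uid in events]
    # Every distinct event date is a target date, as in the DISTINCT step.
    target_dates = sorted({d for d, _, _ in parsed})
    apps = sorted({aid for _, aid, _ in parsed})

    result = {}
    for aid in apps:
        for target in target_dates:
            start = target - timedelta(days=x)
            users = {uid for d, a, uid in parsed
                     if a == aid and start <= d < target}
            result[(aid, target.isoformat())] = len(users)
    return result


# Hypothetical sample data, not taken from my real log.
sample = [
    ("2013-12-01", "app1", "u1"),
    ("2013-12-01", "app1", "u2"),
    ("2013-12-02", "app1", "u1"),
    ("2013-12-03", "app1", "u3"),
    ("2013-12-02", "app2", "u1"),
]

for key, count in sorted(unique_users_per_day(sample, x=2).items()):
    print(key, count)
```

Every (app, day) pair gets a row, including pairs with a count of zero, which is what the CROSS with the distinct dates is meant to produce.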
Thanks and Regards,
Carlo

--
Carlo Di Fulco