You can loop from Python. I've never tried it, but there is a pretty good explanation here: http://ofps.oreilly.com/titles/9781449302641/embedding.html

Recently I had to analyze some log files and needed to loop (to calculate some stats), and I used a UDF.
In your case, I would go with Bill's proposal.

On Thu, Oct 27, 2011 at 5:56 AM, Marco Cadetg <[email protected]> wrote:
> I have a problem where I don't know how or if pig is even suitable to solve
> it.
>
> I have a schema like this:
>
> student-id,student-name,start-time,duration,course
> 1,marco,1319708213,500,math
> 2,ralf,1319708111,112,english
> 3,greg,1319708321,333,french
> 4,diva,1319708444,80,english
> 5,susanne,1319708123,2000,math
> 1,marco,1319708564,500,french
> 2,ralf,1319708789,123,french
> 7,fred,1319708213,5675,french
> 8,laura,1319708233,123,math
> 10,sab,1319708999,777,math
> 11,fibo,1319708789,565,math
> 6,dan,1319708456,50,english
> 9,marco,1319708123,60,english
> 12,bo,1319708456,345,math
> 1,marco,1319708789,673,math
> ...
> ...
>
> I would like to retrieve a graph (interpolation) over time grouped by
> course. Meaning how many students are learning for a course based on a 30
> sec interval.
> The grouping by course is easy but from there I've no clue how I would
> achieve the rest. I guess the rest needs to be achieved via some UDF
> or is there any way how to this in pig? I often think that I need a "for
> loop" or something similar in pig.
>
> Thanks for your help!
> -Marco
>
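
For what it's worth, the looping part of the 30-second interval count asked about in the quoted message could look roughly like the sketch below. This is plain Python only, not a tested Pig UDF: the function names are made up for illustration, and it assumes a student counts as "learning" in every 30-second bucket that the span from start-time to start-time + duration overlaps, with start-time in epoch seconds. How this gets wired into a Pig script is left out.

```python
from collections import defaultdict

INTERVAL = 30  # bucket width in seconds, taken from the question


def buckets_for_session(start_time, duration, interval=INTERVAL):
    """Return the start of every interval that the session overlaps.

    Assumption: the session covers [start_time, start_time + duration).
    """
    if duration <= 0:
        return []
    first = (start_time // interval) * interval
    last = ((start_time + duration - 1) // interval) * interval
    return list(range(first, last + interval, interval))


def active_students_per_interval(records, interval=INTERVAL):
    """Count distinct students per (course, interval start).

    Each record is a tuple in the order of the schema from the question:
    (student_id, student_name, start_time, duration, course).
    """
    seen = defaultdict(set)
    for student_id, _name, start_time, duration, course in records:
        for bucket in buckets_for_session(start_time, duration, interval):
            # A set keeps a student from being counted twice in one bucket.
            seen[(course, bucket)].add(student_id)
    return {key: len(ids) for key, ids in seen.items()}
```

In a Pig job, something like buckets_for_session would be the piece to run per row (emitting one bucket per overlapped interval), after which the count per course and bucket is an ordinary GROUP BY.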
