I have a problem that I don't know how to solve with Pig, and I'm not sure
whether Pig is even suitable for it.

I have a schema like this:

student-id,student-name,start-time,duration,course
1,marco,1319708213,500,math
2,ralf,1319708111,112,english
3,greg,1319708321,333,french
4,diva,1319708444,80,english
5,susanne,1319708123,2000,math
1,marco,1319708564,500,french
2,ralf,1319708789,123,french
7,fred,1319708213,5675,french
8,laura,1319708233,123,math
10,sab,1319708999,777,math
11,fibo,1319708789,565,math
6,dan,1319708456,50,english
9,marco,1319708123,60,english
12,bo,1319708456,345,math
1,marco,1319708789,673,math
...
...

I would like to produce a graph (interpolation) over time, grouped by
course, showing how many students are studying each course in each
30-second interval.
Grouping by course is easy, but from there I have no clue how to achieve
the rest. I guess the rest needs some UDF, or is there a way to do this
in plain Pig? I often think that I need a "for loop" or something similar
in Pig.
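
To make the goal concrete, here is a minimal Python sketch (not Pig) of the
computation described above. It rests on assumptions the data does not
confirm: start-time is Unix epoch seconds, duration is in seconds, intervals
are aligned to multiples of 30, and a session's end second is exclusive. The
function name students_per_interval is made up for illustration.

```python
from collections import defaultdict

INTERVAL = 30  # seconds per bucket (assumed alignment: multiples of 30)


def students_per_interval(rows, interval=INTERVAL):
    """Map (course, bucket_start) -> number of distinct students active then."""
    active = defaultdict(set)
    for student_id, _name, start, duration, course in rows:
        first = start // interval
        # Assumption: the session covers [start, start + duration), end exclusive.
        last = (start + duration - 1) // interval
        # This is the "for loop": one entry per 30-second bucket the session touches.
        for bucket in range(first, last + 1):
            active[(course, bucket * interval)].add(student_id)
    return {key: len(ids) for key, ids in active.items()}


# Three rows taken from the sample data above.
rows = [
    (1, "marco", 1319708213, 500, "math"),
    (8, "laura", 1319708233, 123, "math"),
    (4, "diva", 1319708444, 80, "english"),
]
counts = students_per_interval(rows)
```

In Pig, the inner loop would presumably correspond to a UDF that returns a
bag of bucket timestamps per record, followed by FLATTEN and a GROUP on
(course, bucket), but that mapping is only a guess.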

Thanks for your help!
-Marco
