Hi Jonathan,

A join seems like the obvious thing to do. I'm guessing you're asking because you don't like it for some reason? What's the concern, speed?
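
A rough sketch of the join approach, for reference. The file names ('totals.txt', 'passed.txt'), the relation and field names, and the comma-delimited layout without parentheses are all assumptions for illustration, not details from the original question. A left outer join is used so that tests with no entry in Group 2 (C and D in the sample) still show up, with 0 passed.

```pig
-- Hypothetical input files: one "name,count" pair per line, e.g. "A,4".
totals       = LOAD 'totals.txt' USING PigStorage(',') AS (test:chararray, total:long);
passed_tests = LOAD 'passed.txt' USING PigStorage(',') AS (test:chararray, n_passed:long);

-- Keep every test from Group 1, even if it has no row in Group 2.
joined = JOIN totals BY test LEFT OUTER, passed_tests BY test;

-- Treat a missing passed count as 0 and cast to double to avoid integer division.
pct = FOREACH joined GENERATE
          totals::test AS test,
          ((passed_tests::n_passed IS NULL ? 0.0 : (double) passed_tests::n_passed)
              * 100.0 / (double) totals::total) AS pct_passed;

DUMP pct;
```

With the sample data from the question, this should work out to 25% for A (1 of 4) and 100% for B (30 of 30), with C and D at 0%.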
D

On Wed, Mar 16, 2011 at 3:50 PM, Jonathan Holloway <[email protected]> wrote:
> Hi,
>
> Given the following:
>
> Group 1 - Tests Totals:
> (A, 4)
> (B, 30)
> (C, 40)
> (D, 30)
>
> Group 2 - Tests Passed:
> (A,1)
> (B,30)
>
> How would I calculate the percentage of Group 2 / Group 1 using Pig? I'm
> assuming one way is to join on the two datasets and calculate the
> percentage that way. Another way is to use a custom UDF. Is there a
> better way than either of these?
>
> Many thanks,
> Jon.
