Hi Matthew,

Try this:

letters = load '$input_path' as (letter:chararray, ascii, value:int);
letter_group = group letters by letter;
letter_with_max = foreach letter_group generate group as letter, MAX(letters.ascii) as max;
ascii_with_value = foreach letters generate ascii, value;
joined = join ascii_with_value by ascii, letter_with_max by max using 'replicated';
results = foreach joined generate letter, max, value;
dump results;

Note that I am using a replicated join on the assumption that letter_with_max (one row per letter) is small enough to fit in memory. If that's not true, please remove the using 'replicated' clause.

The result looks like:

(a,97.0,10)
(b,98.0,20)
(c,99.0,30)
(d,100.0,40)
(e,101.0,50)
(f,102.0,60)
(g,103.0,70)
(h,104.0,80)
(i,105.0,90)
(j,106.0,100)
(k,107.0,110)
(l,108.0,120)
(m,109.0,130)
(n,110.0,140)
(o,111.0,150)
(p,112.0,160)
(q,113.0,170)
(r,114.0,180)
(s,115.0,190)
(t,116.0,200)
(u,117.0,210)
(v,118.0,220)
(w,119.0,230)
(x,120.0,240)
(y,121.0,250)
(z,122.0,260)

Thanks,
Cheolsoo

On Wed, Jan 30, 2013 at 12:14 PM, Matthew Purdy <[email protected]> wrote:

> i am trying to use a MAX function of fieldA of a group and return another
> fieldB associated with the record that the function returned; however, from
> what i have done so far, i get the MAX fieldA value along with a list of all
> values of the associated fieldB that are in the group.
>
> to express my problem, here is a trivial example. i have created three files
> (test.pig, test.txt, and test.out), which are the pig script, the input data,
> and the output results. i have also attached these files for convenience.
>
> it seems logical getting these results back; however, i don't know how to
> have pig give me what i want.
>
> given the following input file (nothing important, just an example):
> (fields are letter, ascii value (first upper then lower), a value)
> a 65 1
> b 66 2
> c 67 3
> ...
> a 97 10
> b 98 20
> c 99 30
>
> i would like to return the following
> (given the max of the second field (ascii value of lower case), give the
> value)
> (a,97,10)
> (b,98,20)
> (c,99,30)
> ...
>
> however, i get the following output
> (a,97.0,{(1),(10)})
> (b,98.0,{(2),(20)})
> (c,99.0,{(3),(30)})
>
> my pig script is the following:
>
> letters = load '$input_path' as (letter:chararray,
> ascii:chararray, value:int);
> letter_group = group letters by letter;
> letter_with_max = foreach letter_group generate group, MAX(letters.ascii),
> letters.value;
> dump letter_with_max;
>
> --
> Thank You,
> Matthew Purdy
>
> ------------------------------------------------------------------------------------------------------------------
> Matthew Purdy
> [email protected]
> 443.848.1595
> --------------------------------------
> "Lead, follow, or get out of the way." -- Thomas Paine
> "Make everything as simple as possible, but not simpler." -- Albert
> Einstein
> "The definition of insanity is doing the same thing over and over and
> expecting a different result." -- Benjamin Franklin
> "We can't solve problems by using the same kind of thinking we used when
> we created them." -- Albert Einstein
> ------------------------------------------------------------------------------------------------------------------
