Hi Johnny Zhang:
What I am looking for is overall rank and rank within each group. Sorry if I was not clear. What I am looking to get is something like this.

(1, 1, John, Banking, 20000)
(5, 2, Jane, Banking, 35000)
(3, 1, Asha, Technology, 26000)
(2, 1, Hari, Real Estate, 22000)
(4, 2, Chen, Real Estate, 30000)

Thanks,
Mythili

On Mon, Apr 15, 2013 at 1:58 PM, Johnny Zhang <[email protected]> wrote:

> Hi, M G:
> for input data
> John Banking 20000
> Jane Banking 35000
> Chen Real Estate 30000
> Hari Real Estate 22000
> Asha Technology 26000
>
> a = load '/var/lib/jenkins/income' as (name:chararray, industry:chararray, income:int);
> b = rank a by income;
> c = group b by industry;
> d = foreach c generate flatten(b);
> dump d;
>
> output is:
> (1,John,Banking,20000)
> (5,Jane,Banking,35000)
> (3,Asha,Technology,26000)
> (2,Hari,Real Estate,22000)
> (4,Chen,Real Estate,30000)
>
> Johnny
>
> On Mon, Apr 15, 2013 at 1:25 PM, M G <[email protected]> wrote:
>
> > Is there a way to do RANK within a group in PIG 0.11.1?
> >
> > In the following sample dataset, I would like to Rank DESC by Income, and
> > further RANK by Income for each Industry.
> >
> > Name Industry Income
> >
> > John,Banking, 20,000
> > Jane, Banking, 35,000
> > Chen,Real Estate, 30,000
> > Hari, Real Estate, 22,000
> > Asha, Technology, 26,000
> >
> > I tried something like this, but I get syntax error.
> >
> > names_by_ind = group names by industry;
> >
> > rank_by_ind = foreach names_by_ind {
> >     results = RANK names BY income DESC;
> >     GENERATE flatten(results);
> > }
> >
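One possible way to get both ranks, offered only as an untested sketch: RANK is not accepted inside a nested FOREACH, which is presumably why the attempt quoted above fails with a syntax error. ORDER is allowed there, though, so the within-group rank can be produced by sorting each group's bag and numbering its tuples. The sketch below assumes the DataFu library is available and uses its Enumerate UDF for the numbering; the jar path is hypothetical, and the load path and schema are taken from Johnny's script. Ties within a group receive distinct numbers under this approach, which differs from how RANK handles ties.

```pig
-- Assumption: DataFu is installed; this jar path is a placeholder.
REGISTER /path/to/datafu.jar;
-- Enumerate appends an index to every tuple in a bag, starting here at 1.
DEFINE Enumerate datafu.pig.bags.Enumerate('1');

a = LOAD '/var/lib/jenkins/income' AS (name:chararray, industry:chararray, income:int);

-- Overall rank, ascending by income (John at 20000 gets rank 1).
b = RANK a BY income;

-- Rank within each industry: sort each group's bag, then number the tuples.
c = GROUP b BY industry;
d = FOREACH c {
    sorted = ORDER b BY income;
    GENERATE FLATTEN(Enumerate(sorted));
};

-- After the flatten, the fields are: overall rank, name, industry, income,
-- within-group index. Reorder by position so both ranks come first.
e = FOREACH d GENERATE $0, $4, $1, $2, $3;
DUMP e;
```

Under these assumptions each output tuple should have the form (overall rank, rank within industry, name, industry, income). For descending ranks, as in the original question, DESC would be added to both the RANK and the nested ORDER.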
