Hi, M G:

for input data

John    Banking       20000
Jane    Banking       35000
Chen    Real Estate   30000
Hari    Real Estate   22000
Asha    Technology    26000

a = load '/var/lib/jenkins/income' as (name:chararray, industry:chararray, income:int);
b = rank a by income;
c = group b by industry;
d = foreach c generate flatten(b);
dump d;

output is:

(1,John,Banking,20000)
(5,Jane,Banking,35000)
(3,Asha,Technology,26000)
(2,Hari,Real Estate,22000)
(4,Chen,Real Estate,30000)

Johnny

On Mon, Apr 15, 2013 at 1:25 PM, M G wrote:
> Is there a way to do RANK within a group in PIG 0.11.1?
>
> In the following sample dataset, I would like to Rank DESC by Income, and
> further RANK by Income for each Industry.
>
> Name Industry Income
>
> John,Banking, 20,000
> Jane, Banking, 35,000
> Chen,Real Estate, 30,000
> Hari, Real Estate, 22,000
> Asha, Technology, 26,000
>
> I tried something like this, but I get syntax error.
>
> names_by_ind = group names by industry;
>
> rank_by_ind = foreach names_by_ind {
>     results = RANK names BY income DESC;
>     GENERATE flatten(results);
> }
>
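
A note on the "for each Industry" part of the quoted question: the rank column in the output above is the overall rank, computed before the grouping, so it does not restart within each industry. RANK is also not one of the operators Pig accepts inside a nested foreach block, which is the likely cause of the syntax error. One possible workaround, sketched here under the assumption that the DataFu UDF library is available (the jar path is a placeholder, and the alias names are borrowed from the quoted attempt), is to ORDER inside the nested block and number the sorted tuples with DataFu's Enumerate:

```pig
-- Assumption: DataFu is installed; this jar path is a placeholder, not a real location.
register '/path/to/datafu.jar';

-- Enumerate appends a position to every tuple in a bag; '1' makes numbering start at 1.
define Enumerate datafu.pig.bags.Enumerate('1');

names = load '/var/lib/jenkins/income' as (name:chararray, industry:chararray, income:int);
names_by_ind = group names by industry;

rank_by_ind = foreach names_by_ind {
    -- ORDER is allowed inside a nested foreach, unlike RANK.
    sorted = order names by income desc;
    generate flatten(Enumerate(sorted));
};

dump rank_by_ind;
```

Each output tuple should then carry name, industry, income and a position that restarts at 1 for every industry. This is a row number rather than a true rank: two people with the same income in one industry would get different positions, so ties would need separate handling.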
