Thanks a lot for your response. Much appreciated.

Mythili

On Tue, Apr 16, 2013 at 12:00 PM, Gianmarco De Francisci Morales <[email protected]> wrote:

> Hi,
>
> nested RANK is not supported yet, however it is easy to implement as a UDF.
> Just sort the records and assign an increasing counter with the UDF.
> We will probably add support for nested RANK in the next release.
>
> Cheers,
>
> --
> Gianmarco
>
>
> On Mon, Apr 15, 2013 at 11:10 PM, M G <[email protected]> wrote:
>
> > Hi Johnny Zhang:
> >
> > What I am looking for is overall rank and rank within each group. Sorry if
> > I was not clear.
> >
> > What I am looking to get is something like this.
> >
> > (1, 1, John, Banking, 20000)
> > (5, 2, Jane, Banking, 35000)
> > (3, 1, Asha, Technology, 26000)
> > (2, 1, Hari, Real Estate, 22000)
> > (4, 2, Chen, Real Estate, 30000)
> >
> > Thanks,
> > Mythili
> >
> >
> > On Mon, Apr 15, 2013 at 1:58 PM, Johnny Zhang <[email protected]> wrote:
> >
> > > Hi, M G:
> > > for input data
> > > John Banking 20000
> > > Jane Banking 35000
> > > Chen Real Estate 30000
> > > Hari Real Estate 22000
> > > Asha Technology 26000
> > >
> > > a = load '/var/lib/jenkins/income' as (name:chararray, industry:chararray,
> > >     income:int);
> > > b = rank a by income;
> > > c = group b by industry;
> > > d = foreach c generate flatten(b);
> > > dump d;
> > >
> > > output is:
> > > (1,John,Banking,20000)
> > > (5,Jane,Banking,35000)
> > > (3,Asha,Technology,26000)
> > > (2,Hari,Real Estate,22000)
> > > (4,Chen,Real Estate,30000)
> > >
> > > Johnny
> > >
> > >
> > > On Mon, Apr 15, 2013 at 1:25 PM, M G <[email protected]> wrote:
> > >
> > > > Is there a way to do RANK within a group in PIG 0.11.1?
> > > >
> > > > In the following sample dataset, I would like to Rank DESC by Income, and
> > > > further RANK by Income for each Industry.
> > > >
> > > > Name Industry Income
> > > >
> > > > John,Banking, 20,000
> > > > Jane, Banking, 35,000
> > > > Chen,Real Estate, 30,000
> > > > Hari, Real Estate, 22,000
> > > > Asha, Technology, 26,000
> > > >
> > > > I tried something like this, but I get syntax error.
> > > >
> > > > names_by_ind = group names by industry;
> > > >
> > > > rank_by_ind = foreach names_by_ind {
> > > >     results = RANK names BY income DESC;
> > > >     GENERATE flatten(results);
> > > > }
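
For reference, the "sort the records and assign an increasing counter" idea from Gianmarco's reply can be sketched in plain Python. This is not a Pig UDF and not Pig's own implementation; the function name rank_overall_and_within_group is made up for illustration, and it assumes ascending rank by income (as in the desired output above) with tied incomes simply getting consecutive numbers.

```python
from itertools import groupby
from operator import itemgetter


def rank_overall_and_within_group(rows):
    """Return (overall_rank, group_rank, name, industry, income) tuples.

    rows is a list of (name, industry, income) tuples. Both ranks are
    plain increasing counters over records sorted by income ascending.
    """
    # Overall rank: sort by income, then number the records from 1.
    by_income = sorted(rows, key=itemgetter(2))
    with_overall = [(i,) + row for i, row in enumerate(by_income, start=1)]

    # Rank within group: sort by (industry, income) so each industry is
    # contiguous, then restart the counter at 1 for every industry.
    by_group = sorted(with_overall, key=itemgetter(2, 3))
    ranked = []
    for _, members in groupby(by_group, key=itemgetter(2)):
        for group_rank, (overall, name, industry, income) in enumerate(members, start=1):
            ranked.append((overall, group_rank, name, industry, income))
    return ranked


# Sample data from the thread.
rows = [
    ("John", "Banking", 20000),
    ("Jane", "Banking", 35000),
    ("Chen", "Real Estate", 30000),
    ("Hari", "Real Estate", 22000),
    ("Asha", "Technology", 26000),
]

for record in rank_overall_and_within_group(rows):
    print(record)
```

On the sample data this yields the same five tuples Mythili asked for, grouped by industry in alphabetical order. A real Pig UDF would apply the same counter to a bag that was already sorted inside the nested foreach.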
