Thanks Matthew, I had data.table installed but totally forgot about it. I've only used it once or twice and, IIRC, that was last year. I remember thinking at the time that it was a very handy package, but I so rarely need this sort of thing that I let myself forget it.
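
For anyone reading along without data.table, here is a minimal base R sketch of what I take to be the equivalent of the DT[order(-a,b)] example quoted below (a descending, then b ascending). The data frame df and its columns a, b and z are made-up toy data, not anything from this thread, and the unary minus trick assumes a is numeric, as Matthew notes.

```r
# Hypothetical toy data, invented for illustration only
df <- data.frame(a = c(2, 1, 2, 1),
                 b = c(4, 3, 1, 2),
                 z = c("w", "x", "y", "z"))

# order() returns row indices; negating a numeric key reverses its direction,
# so this sorts by a descending and breaks ties by b ascending
idx <- order(-df$a, df$b)
idx
# [1] 3 1 4 2

# Use the indices in the row position of the brackets to reorder the rows
sorted <- df[idx, ]
print(sorted)
```

The minus sign only works on numeric columns. For other column types, order(..., decreasing = TRUE) reverses every key at once, so a mixed-direction sort needs a different approach.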

--- On Thu, 5/12/11, Matthew Dowle <[email protected]> wrote:

> From: Matthew Dowle <[email protected]>
> Subject: Re: [R] Simple order() data frame question.
> To: [email protected]
> Received: Thursday, May 12, 2011, 11:23 AM
>
> With data.table, the following is routine:
>
> DT[order(a)]    # ascending
> DT[order(-a)]   # descending, if a is numeric
> DT[a>5,sum(z),by=c][order(-V1)]   # sum of z group by c, just where a>5,
>                                   # then show me the largest first
> DT[order(-a,b)]   # order by a descending then by b ascending, if a and b
>                   # are both numeric
>
> It avoids peppering your code with $, and becomes quite natural after a
> short while; especially compound queries such as the 3rd example.
>
> Matthew
>
> http://datatable.r-forge.r-project.org/
>
>
> "Ivan Calandra" <[email protected]> wrote in message
> news:[email protected]...
> I was wondering whether it would be possible to make a method for
> data.frame with sort().
> I think it would be more intuitive than using the complex construction
> of df[order(df$a),]
> Is there any reason not to make it?
>
> Ivan
>
> On 5/12/2011 15:40, Marc Schwartz wrote:
> > On May 12, 2011, at 8:09 AM, John Kane wrote:
> >
> >> Argh.  I knew it was at least partly obvious.  I never have been able to
> >> read the order() help page and understand what it is saying.
> >>
> >> Thanks very much.
> >>
> >> By the way, to me it is counter-intuitive that the command is
> >>
> >>> df1[order(df1[,2],decreasing=TRUE),]
> >> For some reason I keep expecting it to be
> >> order( , df1[,2],decreasing=TRUE)
> >>
> >> So clearly I don't understand what is going on but at least I am a lot
> >> better off.  I may be able to get this graph to work.
> >
> > John,
> >
> > Perhaps it may be helpful to understand that order() does not actually
> > sort() the data.
> >
> > It returns a vector of indices into the data, where those indices are the
> > sorted ordering of the elements in the vector, or in this case, the
> > column.
> >
> > So you want the output of order() to be used within the brackets for the
> > row *indices*, to reflect the ordering of the column (or columns in the
> > case of a multi-level sort) that you wish to use to sort the data frame
> > rows.
> >
> > set.seed(1)
> > x<- sample(10)
> >
> >> x
> >   [1]  3  4  5  7  2  8  9  6 10  1
> >
> >
> > # sort() actually returns the sorted data
> >> sort(x)
> >   [1]  1  2  3  4  5  6  7  8  9 10
> >
> >
> > # order() returns the indices of 'x' in sorted order
> >> order(x)
> >   [1] 10  5  1  2  3  8  4  6  7  9
> >
> >
> > # This does the same thing as sort()
> >> x[order(x)]
> >   [1]  1  2  3  4  5  6  7  8  9 10
> >
> >
> > set.seed(1)
> > df1<- data.frame(aa = letters[1:10], bb = rnorm(10))
> >
> >> df1
> >     aa         bb
> > 1   a -0.6264538
> > 2   b  0.1836433
> > 3   c -0.8356286
> > 4   d  1.5952808
> > 5   e  0.3295078
> > 6   f -0.8204684
> > 7   g  0.4874291
> > 8   h  0.7383247
> > 9   i  0.5757814
> > 10  j -0.3053884
> >
> >
> > # These are the indices of df1$bb in sorted order
> >> order(df1$bb)
> >   [1]  3  6  1 10  2  5  7  9  8  4
> >
> >
> > # Get df1$bb in increasing order
> >> df1$bb[order(df1$bb)]
> >   [1] -0.8356286 -0.8204684 -0.6264538 -0.3053884  0.1836433  0.3295078
> >   [7]  0.4874291  0.5757814  0.7383247  1.5952808
> >
> >
> > # Same thing as above
> >> sort(df1$bb)
> >   [1] -0.8356286 -0.8204684 -0.6264538 -0.3053884  0.1836433  0.3295078
> >   [7]  0.4874291  0.5757814  0.7383247  1.5952808
> >
> >
> > You can't use the output of sort() to sort the data frame rows, so you
> > need to use order() to get the ordered indices and then use that to
> > extract the data frame rows in the sort order that you desire:
> >
> >> df1[order(df1$bb), ]
> >     aa         bb
> > 3   c -0.8356286
> > 6   f -0.8204684
> > 1   a -0.6264538
> > 10  j -0.3053884
> > 2   b  0.1836433
> > 5   e  0.3295078
> > 7   g  0.4874291
> > 9   i  0.5757814
> > 8   h  0.7383247
> > 4   d  1.5952808
> >
> >
> >> df1[order(df1$bb, decreasing = TRUE), ]
> >     aa         bb
> > 4   d  1.5952808
> > 8   h  0.7383247
> > 9   i  0.5757814
> > 7   g  0.4874291
> > 5   e  0.3295078
> > 2   b  0.1836433
> > 10  j -0.3053884
> > 1   a -0.6264538
> > 6   f -0.8204684
> > 3   c -0.8356286
> >
> >
> > Does that help?
> >
> > Regards,
> >
> > Marc Schwartz
> >
> > ______________________________________________
> > [email protected] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> --
> Ivan CALANDRA
> PhD Student
> University of Hamburg
> Biozentrum Grindel und Zoologisches Museum
> Abt. Säugetiere
> Martin-Luther-King-Platz 3
> D-20146 Hamburg, GERMANY
> +49(0)40 42838 6231
> [email protected]
>
> **********
> http://www.for771.uni-bonn.de
> http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

