[R] configure ddply() to avoid reordering of '.variables'
Hello,

I'm using ddply() in plyr and I notice that it has the habit of re-ordering the levels of the '.variables' by which the splitting is done. I'm concerned about correctly retrieving the original ordering. Consider:

require(plyr)
x <- iris[ order(iris$Species, decreasing=T), ]
head(x)
#    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#101          6.3         3.3          6.0         2.5 virginica
#102          5.8         2.7          5.1         1.9 virginica
#103          7.1         3.0          5.9         2.1 virginica
#104          6.3         2.9          5.6         1.8 virginica
#105          6.5         3.0          5.8         2.2 virginica
#106          7.6         3.0          6.6         2.1 virginica
xa <- ddply(x, .(Species), function(x) {data.frame(Sepal.Length=x$Sepal.Length, mean.adj=(x$Sepal.Length - mean(x$Sepal.Length)))})
# |==| 100%
##notice how the ordering of Species is different
##from that in the input data frame
head(xa)
#  Species Sepal.Length mean.adj
#1  setosa          5.1    0.094
#2  setosa          4.9   -0.106
#3  setosa          4.7   -0.306
#4  setosa          4.6   -0.406
#5  setosa          5.0   -0.006
#6  setosa          5.4    0.394
all.equal(xa$Species, x$Species)
#[1] 100 string mismatches
all.equal(xa[ order(xa$Species, decreasing=T), ]$Species, x$Species)
#[1] TRUE
all.equal(xa$Sepal.Length, x$Sepal.Length)
#[1] Mean relative difference: 0.2785
all.equal(xa[ order(xa$Species, decreasing=T), ]$Sepal.Length, x$Sepal.Length)
#[1] TRUE

In my real data, should I be concerned that simply reordering by the '.variables' variable wouldn't necessarily restore the original ordering as in the input data frame? Is it possible to instruct ddply() to avoid re-ordering the supplied '.variables' variable?

Regards,
Liviu

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] configure ddply() to avoid reordering of '.variables'
Maybe this helps:

levels(x$Species)
#[1] "setosa"     "versicolor" "virginica"
x$Species <- factor(x$Species, levels=unique(x$Species))
xa <- ddply(x, .(Species), function(x) {data.frame(Sepal.Length=x$Sepal.Length, mean.adj=(x$Sepal.Length - mean(x$Sepal.Length)))})
head(xa)
#    Species Sepal.Length mean.adj
#1 virginica          6.3   -0.288
#2 virginica          5.8   -0.788
#3 virginica          7.1    0.512
#4 virginica          6.3   -0.288
#5 virginica          6.5   -0.088
#6 virginica          7.6    1.012

A.K.

----- Original Message -----
From: Liviu Andronic landronim...@gmail.com
To: r-help@r-project.org Help r-help@r-project.org
Sent: Monday, May 27, 2013 4:47 AM
Subject: [R] configure ddply() to avoid reordering of '.variables'
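A note on the factor-level approach in the reply above: as the example there suggests, ddply() appears to return the groups in the order of the factor levels, so setting the levels to the order of first appearance brings the groups back in input order. That restores the order of the groups, not necessarily the order of the rows. The sketch below uses a small made-up data frame (the names d, g, v and adj are hypothetical, not from the thread) in which the groups are interleaved, to show the difference:

```r
library(plyr)

# Hypothetical toy data: the two groups are interleaved,
# unlike the iris example, where each species forms one block.
d <- data.frame(g = c("b", "a", "b", "a"),
                v = c(1, 2, 3, 4),
                stringsAsFactors = FALSE)

# Make the factor levels follow the order of first appearance ("b", then "a").
d$g <- factor(d$g, levels = unique(d$g))

# Group-wise centering, as in the thread.
res <- ddply(d, .(g), transform, adj = v - mean(v))

# The groups come back in level order (b first), but the rows of each
# group are gathered together, so the input row order is not reproduced.
as.character(res$g)
res$v
```

So for data where each group is already one contiguous block (as in the sorted iris example) this should be enough; for interleaved groups, a row index carried through the call is the safer choice.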
Re: [R] configure ddply() to avoid reordering of '.variables'
Also, you can check:
http://stackoverflow.com/questions/7235421/how-to-ddply-without-sorting

keeping.order <- function(data, fn, ...) {
  # temporary column that records the original row positions
  col <- ".sortColumn"
  data[,col] <- 1:nrow(data)
  out <- fn(data, ...)
  if (!col %in% colnames(out)) stop("Ordering column not preserved by function")
  # restore the original row order, then drop the temporary column
  out <- out[order(out[,col]),]
  out[,col] <- NULL
  out
}

x <- iris[ order(iris$Species, decreasing=T), ]
xa <- ddply(x,.(Species),mutate,mean.adj=Sepal.Length-mean(Sepal.Length))[-c(2:4)]
xa1 <- keeping.order(x,ddply,.(Species),mutate,mean.adj=Sepal.Length-mean(Sepal.Length))[-c(2:4)]
head(xa1)
#    Sepal.Length   Species mean.adj
#101          6.3 virginica   -0.288
#102          5.8 virginica   -0.788
#103          7.1 virginica    0.512
#104          6.3 virginica   -0.288
#105          6.5 virginica   -0.088
#106          7.6 virginica    1.012

A.K.

----- Original Message -----
From: arun smartpink...@yahoo.com
To: Liviu Andronic landronim...@gmail.com
Cc: R help r-help@r-project.org
Sent: Monday, May 27, 2013 10:06 AM
Subject: Re: [R] configure ddply() to avoid reordering of '.variables'
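For this particular task (subtracting the group mean from each value), it may be possible to avoid the reordering question altogether. The sketch below is not something proposed in the thread: it swaps ddply() for base R's ave(), which returns a vector aligned with its input, so the rows never move. The column name mean.adj follows the original post; everything else is an illustration on the same iris data.

```r
# Same starting point as in the thread: iris sorted with virginica first.
x <- iris[order(iris$Species, decreasing = TRUE), ]

# ave() applies FUN within each Species group and puts every result
# back at the position of the row it came from.
x$mean.adj <- ave(x$Sepal.Length, x$Species,
                  FUN = function(v) v - mean(v))

# The rows are still in the input order, with the adjustment alongside.
head(x[, c("Species", "Sepal.Length", "mean.adj")])
```

This only covers the case where the function returns one value per input row; for results that change the number of rows per group, a row-index wrapper such as keeping.order() above is still the relevant tool.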