[R] ggplot2: mapping categorical variable to color aesthetic with faceting
Hello Again...

I'm making a faceted plot of a response on two categorical variables using ggplot2 and having trouble with the coloring. Here is a sample that produces the desired plot:

    compareCats <- function(data, res, fac1, fac2, colors) {
      require(ggplot2)
      p <- ggplot(data, aes(fac1, res)) + facet_grid(. ~ fac2)
      jit <- position_jitter(width = 0.1)
      p <- p + layer(geom = "jitter", position = jit, color = colors)
      print(p)
    }

    test <- data.frame(res = rnorm(100),
                       fac1 = as.factor(rep(c("A", "B"), 50)),
                       fac2 = as.factor(rep(c("lrg", "lrg", "sm", "sm"), 25)))
    compareCats(data = test, res = res, fac1 = fac1, fac2 = fac2, colors = c("red", "blue"))

Now, if I get away from idealized data where there are the same number of data points per group (25 in this case), I run into problems. So, if you do:

    rem <- runif(5, 1, 100) # randomly remove a few points here and there
    test <- test[-rem,]
    compareCats(data = test, res = res, fac1 = fac1, fac2 = fac2, colors = c("red", "blue"))

R throws an error due to a mismatch between the recycling of colors and the actual number of data points:

    Error in `[<-.data.frame`(`*tmp*`, gp, value = list(colour = c("red",  :
      replacement element 1 has 2 rows, need 47

I'm new to ggplot2, but have been through the book and the web site enough to know that my problem is mapping the variable to the aesthetic; I also know I can either map or set the colors.

The question, finally: is there a simple/elegant way to map a list of two colors corresponding to A and B onto any random sample size of A and B with faceting?

If not, and I must set the colors: do I compute the length of all possible combos of A, B with lrg, sm, and then create one long vector of colors for the entire plot? I tried something like this and was not successful, but perhaps could be with more work.
All advice appreciated, Bryan

(session info below)

Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA

sessionInfo()
R version 2.9.2 (2009-08-24)
i386-apple-darwin8.11.1

locale:
en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] grid      datasets  tools     utils     stats     graphics  grDevices methods
[9] base

other attached packages:
 [1] ggplot2_0.8.3      reshape_0.8.3      proto_0.3-8        mvbutils_2.2.0
 [5] ChemoSpec_1.1      lattice_0.17-25    mvoutlier_1.4      plyr_0.1.8
 [9] RColorBrewer_1.0-2 chemometrics_0.4   som_0.3-4          robustbase_0.4-5
[13] rpart_3.1-45       pls_2.1-0          pcaPP_1.7          mvtnorm_0.9-7
[17] nnet_7.2-48        mclust_3.2         MASS_7.2-48        lars_0.9-7
[21] e1071_1.5-19       class_7.2-48

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
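On the "one long vector of colors" question, here is a minimal base R sketch. It assumes a lookup vector called `level_cols`, a name introduced here for illustration and not taken from the post. Indexing a named vector by the factor's labels gives one color per row whatever the group sizes, which is the length a set (rather than mapped) color would need. Whether a particular ggplot2 version accepts such a vector as a set aesthetic is not checked here.

```r
set.seed(1)  # reproducible example data

test <- data.frame(
  res  = rnorm(100),
  fac1 = factor(rep(c("A", "B"), 50)),
  fac2 = factor(rep(c("lrg", "lrg", "sm", "sm"), 25))
)

# Drop five distinct rows so the groups become unbalanced.
rem  <- sample(nrow(test), 5)
test <- test[-rem, ]

# Hypothetical lookup table: one color per level of fac1.
level_cols <- c(A = "red", B = "blue")

# Index by label, not by position: the result has one entry per row.
row_cols <- unname(level_cols[as.character(test$fac1)])

length(row_cols) == nrow(test)
table(test$fac1, row_cols)
```

Because the lookup is by label, no counting of A/B by lrg/sm combinations is needed; the vector simply follows the rows of the data frame.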
Re: [R] ggplot2: mapping categorical variable to color aesthetic with faceting
Hi,

I may be missing an important design decision, but could you not have only a single data.frame as an argument of your function? From your example, it seems that the colour can be mapped to the fac1 variable of data,

    compareCats <- function(data) {
      require(ggplot2)
      p <- ggplot(data, aes(fac1, res, color = fac1)) + facet_grid(. ~ fac2)
      jit <- position_jitter(width = 0.1)
      p <- p + layer(geom = "jitter", position = jit) +
        scale_colour_manual(values = c("red", "blue"))
      print(p)
    }

    test <- data.frame(res = rnorm(100),
                       fac1 = as.factor(rep(c("A", "B"), 50)),
                       fac2 = as.factor(rep(c("lrg", "lrg", "sm", "sm"), 25)))
    compareCats(data = test)

    rem <- runif(5, 1, 100) # randomly remove a few points here and there
    last_plot() %+% test[-rem,] # replot with new dataset

HTH,

baptiste
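For readers on a current ggplot2 release, the same map-then-scale idea is usually written with `geom_jitter()` rather than `layer()`. The following is a minimal sketch, assuming a recent ggplot2; `plot_cats` is a name introduced here, and naming the entries of `values` (an addition to the reply above) ties each color to a level explicitly.

```r
library(ggplot2)

set.seed(1)
test <- data.frame(
  res  = rnorm(100),
  fac1 = factor(rep(c("A", "B"), 50)),
  fac2 = factor(rep(c("lrg", "lrg", "sm", "sm"), 25))
)

# Hypothetical helper: map colour to fac1, then pick the colours in the scale.
plot_cats <- function(data, colors = c(A = "red", B = "blue")) {
  ggplot(data, aes(fac1, res, colour = fac1)) +
    geom_jitter(width = 0.1) +
    facet_grid(. ~ fac2) +
    scale_colour_manual(values = colors)
}

# Unbalanced data: remove five distinct rows before plotting.
rem <- sample(nrow(test), 5)
unbalanced <- test[-rem, ]
p <- plot_cats(unbalanced)

# Inspect the colour the scale assigned to each point; print(p) draws the plot.
point_cols <- layer_data(p, 1)$colour
table(point_cols)
```

The scale does the per-point lookup, so the color vector stays at one entry per level no matter how many rows each facet holds.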
Re: [R] ggplot2: mapping categorical variable to color aesthetic with faceting
Further to my previous reply, it occurred to me that ggplot2 would only ever use data and colors in your calls to compareCats(): res = res, fac1 = fac1, fac2 = fac2 have no effect whatsoever, because aes(fac1, res) always looks up the columns literally named fac1 and res in data.

If you want the user to be able to specify the variables used in the ggplot2 call, you probably want to look at ?aes_string, as shown below,

    compareCats <- function(data, fac1 = "fac1", fac2 = "fac2", res = "res",
                            colors = c("red", "blue")) {
      require(ggplot2)
      p <- ggplot(data, aes_string(x = fac1, y = res, color = fac1)) +
        facet_grid(paste(". ~ ", fac2))
      jit <- position_jitter(width = 0.1)
      p <- p + layer(geom = "jitter", position = jit) +
        scale_colour_manual(values = colors)
      print(p)
    }

    test <- data.frame(res = rnorm(100),
                       fac1 = as.factor(rep(c("A", "B"), 50)),
                       fac2 = as.factor(rep(c("lrg", "lrg", "sm", "sm"), 25)))
    compareCats(data = test)

    rem <- sample(nrow(test), 10) # randomly remove a few points here and there
    last_plot() %+% test[-rem, ] # replot with new dataset

HTH,

baptiste
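In recent ggplot2 releases `aes_string()` is deprecated, and the documented replacement for string column names is the `.data` pronoun. The sketch below assumes a recent ggplot2; `plot_cats_by_name` and the column names `response`, `group` and `size` are made up for illustration. The facet formula is still assembled with `paste()`, as in the reply above, and then converted with `as.formula()`.

```r
library(ggplot2)

# Hypothetical helper: column names arrive as strings and are looked up
# with the .data pronoun instead of aes_string().
plot_cats_by_name <- function(data, res = "res", fac1 = "fac1", fac2 = "fac2",
                              colors = c("red", "blue")) {
  ggplot(data, aes(.data[[fac1]], .data[[res]], colour = .data[[fac1]])) +
    geom_jitter(width = 0.1) +
    facet_grid(as.formula(paste(". ~", fac2))) +
    scale_colour_manual(values = colors) +
    labs(x = fac1, y = res, colour = fac1)
}

set.seed(1)
dat <- data.frame(
  response = rnorm(100),
  group    = factor(rep(c("A", "B"), 50)),
  size     = factor(rep(c("lrg", "lrg", "sm", "sm"), 25))
)

rem <- sample(nrow(dat), 10)  # ten distinct row indices to drop
p <- plot_cats_by_name(dat[-rem, ], res = "response", fac1 = "group",
                       fac2 = "size")

# One row per plotted point, with the panel and colour each one received.
pts <- layer_data(p, 1)
table(pts$PANEL, pts$colour)
```

The `labs()` call is there only because axis and legend titles would otherwise show the `.data[[...]]` expressions.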
Re: [R] ggplot2: mapping categorical variable to color aesthetic with faceting
Hi Baptiste:

Thanks for the suggestion. It will work perfectly. I would have never considered assigning a color to a variable that contained no colors at all! I guess this is part of the aesthetic concept, which I haven't had time to reflect on much. Then later, you specify a manual color scale, which maps back onto the aesthetic. Clever.

As I stated, I'm just learning ggplot2, and I'm finding the language and concepts a bit different (I'm not familiar with the grammar of graphics, nor am I a computer scientist). But I have to say the code I am working up replaces much, much longer code in base graphics, so I am really liking the thought put into ggplot2 and the leanness of it - Thanks Hadley!

Thanks again, Bryan
Re: [R] ggplot2: mapping categorical variable to color aesthetic with faceting
A few days ago on the list I had wrestled with the aes() vs aes_string() issue, along with the same issue with faceting. The way I ended up handling the point you bring up, Baptiste, is perhaps rather inefficient, but my data sets are not large. I allow the user to pass variable names, then I use that info to construct extra data frame columns, which are suitable for use by ggplot2 since they have known names in the data frame. Here's what the first part of the actual function looks like; you can see how I avoided aes_string and the related problems with facet:

    compareCats <- function(data = NULL, res = NULL, fac1 = NULL, fac2 = NULL,
                            fac1order = NULL, fac2order = NULL, fac1cols = NULL,
                            method = c("sem", "iqr", "mad", "box", "points"),
                            title = "Comparison of Categories",
                            y.lab = "your text here",
                            subtitle = "optional explanatory caption") {
      require(ggplot2)

      # restructure data so names will match, re-ordering too
      data$res <- data[, res]
      a <- match(fac1, names(data))
      b <- match(fac2, names(data))
      data$fac1 <- factor(data[[a]], levels = fac1order)
      data$fac2 <- factor(data[[b]], levels = fac2order)

      # now the plot
      p <- ggplot(data, aes(fac1, res, color = fac1)) + facet_grid(. ~ fac2) +
        xlab(NULL) +
        opts(title = title, axis.text.x = theme_text(colour = "black"),
             axis.ticks = theme_blank())

And then, depending upon the method specified by the user, additional geoms are added and the plot is created. This gets the job done, but if there are further suggestions, I'd love to learn other solutions.

Bryan

On 10/6/09 1:08 PM, baptiste auguie baptiste.aug...@googlemail.com wrote:

Further to my previous reply, it occurred to me that ggplot2 would only ever use data and colors in your calls to compareCats(): res = res, fac1 = fac1, fac2 = fac2 have no effect whatsoever.
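The column-copying approach Bryan describes can be isolated into a small base R helper. This is only a sketch under stated assumptions: `standardise_cols` is a hypothetical name, and unlike the function in the message it returns a new three-column data frame rather than adding columns to `data`, and it falls back to the existing level order when no ordering is given.

```r
# Hypothetical helper illustrating the column-copying approach: the caller
# names the columns as strings, and the function stores copies under the
# fixed names res, fac1 and fac2 that a later aes(fac1, res) can rely on.
standardise_cols <- function(data, res, fac1, fac2,
                             fac1order = NULL, fac2order = NULL) {
  stopifnot(all(c(res, fac1, fac2) %in% names(data)))

  # Default to the existing level order when no ordering is supplied.
  if (is.null(fac1order)) fac1order <- levels(factor(data[[fac1]]))
  if (is.null(fac2order)) fac2order <- levels(factor(data[[fac2]]))

  data.frame(
    res  = data[[res]],
    fac1 = factor(data[[fac1]], levels = fac1order),
    fac2 = factor(data[[fac2]], levels = fac2order)
  )
}

# Made-up example data with arbitrary column names.
raw <- data.frame(
  yield     = c(1.2, 3.4, 2.2, 0.7),
  treatment = c("B", "A", "B", "A"),
  size      = c("sm", "lrg", "lrg", "sm")
)

std <- standardise_cols(raw, res = "yield", fac1 = "treatment", fac2 = "size",
                        fac1order = c("B", "A"), fac2order = c("sm", "lrg"))
str(std)
```

Setting the level order at this stage also fixes the order of the x axis, the facets and the legend in any plot built from the result.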