Re: [R] Problems with boxplot in ggplot2:qplot
Thanks a lot, Brian! -- View this message in context: http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1558810.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems with boxplot in ggplot2:qplot
On 2/15/2010 2:41 PM, Dimitri Shvorob wrote:
> library(sqldf)
> library(ggplot2)
>
> t = data.frame(t = seq.Date(as.Date("2009-01-01"), to = as.Date("2009-12-01"), by = "month"))
> x = data.frame(x = rnorm(5))
> df = sqldf("select * from t, x")

A simpler way to get random data that doesn't involve the sqldf package, and gets different x values for each date:

df <- data.frame(t = seq.Date(as.Date("2009-01-01"), to = as.Date("2009-12-01"), by = "month"), x = rnorm(60))

> qplot(factor(df$t), df$x, geom = "boxplot") + theme_bw()

You are converting your dates to a factor, so they are no longer dates. I'm guessing you did this to get a separate boxplot for each date, but that is not the right way to do that. Use the group aesthetic to make different groups.

qplot(df$t, df$x, geom = "boxplot", group = df$t) + theme_bw()

> qplot(factor(df$t), df$x, geom = "boxplot") + theme_bw() + scale_x_date(major = "months", minor = "weeks", format = "%b")

qplot(df$t, df$x, geom = "boxplot", group = df$t) + theme_bw() + scale_x_date(major = "months", minor = "weeks", format = "%b")

> qplot(factor(df$t), df$x, geom = "boxplot") + theme_bw() + scale_x_date(format = "%b")

qplot(df$t, df$x, geom = "boxplot", group = df$t) + theme_bw() + scale_x_date(format = "%b")

--
Brian Diggs, Ph.D.
Senior Research Associate, Department of Surgery, Oregon Health & Science University
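For readers on a current ggplot2 release, the same idea (keep the column as a Date and use the group aesthetic) might look like the sketch below. It is not code from the thread. It assumes ggplot2 2.0 or later, where scale_x_date() takes date_breaks and date_labels rather than the major/minor/format arguments used above, and the data are simulated.

```r
library(ggplot2)

set.seed(1)  # simulated data, only for illustration
months <- seq(as.Date("2009-01-01"), as.Date("2009-12-01"), by = "month")
df <- data.frame(t = rep(months, each = 5), x = rnorm(60))

# Keep t as a Date and group by it, so each month gets its own box
# while the x axis remains a continuous date scale.
p <- ggplot(df, aes(x = t, y = x, group = t)) +
  geom_boxplot() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  theme_bw()

print(p)
```

Because the x axis is still a date scale, the "%b" labels are placed by date rather than by factor level, which is what the factor() version could not do.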
Re: [R] Problems with boxplot in ggplot2:qplot
Now that we have a reproducible example... ;)
Re: [R] Problems with boxplot in ggplot2:qplot
Hi Dimitri,

Have you looked at the examples for scale_x_date (http://had.co.nz/ggplot2/scale_date.html)? They show you how to both set the limits and control the labels.

Hadley

On Sun, Feb 14, 2010 at 1:34 PM, Dimitri Shvorob dimitri.shvo...@gmail.com wrote:
> ... Unfortunately, a problem remains: I cannot label x ticks a la 'names.arg = '. month has values like '2009-01-01', '2009-02-01', etc., while I would prefer 'Jan', 'Feb'. Using
>
> closed$month = format(closed$month, "%b")
>
> disrupts the order of plot's panels, which now follows the alphabetic order of month names.

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
Re: [R] Problems with boxplot in ggplot2:qplot
Thank you, Hadley. I try

jpeg(file, width = 800, height = 600, quality = 100)
qplot(factor(closed$close.month), closed$closing.balance, geom = "boxplot", main = "Monthly distributions of closing balances", xlab = "Month", ylab = "Balance, USD") + theme_bw() + scale_x_date(major = "months", minor = "weeks", format = "%b")
dev.off()

('minor = ' can be skipped with no consequences, apparently). Labels disappear altogether.
Re: [R] Problems with boxplot in ggplot2:qplot
Trying + scale_x_date(format = "%b") produces a peculiar result: Apr and Dec facets are labeled Jan, remaining labels are blank.
Re: [R] Problems with boxplot in ggplot2:qplot
Without a reproducible example, it's impossible to give you any more suggestions.

Hadley

On Mon, Feb 15, 2010 at 2:16 PM, Dimitri Shvorob dimitri.shvo...@gmail.com wrote:
> Trying + scale_x_date(format = "%b") produces a peculiar result: Apr and Dec facets are labeled Jan, remaining labels are blank.

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
Re: [R] Problems with boxplot in ggplot2:qplot
library(sqldf)
library(ggplot2)

t = data.frame(t = seq.Date(as.Date("2009-01-01"), to = as.Date("2009-12-01"), by = "month"))
x = data.frame(x = rnorm(5))
df = sqldf("select * from t, x")

qplot(factor(df$t), df$x, geom = "boxplot") + theme_bw()
qplot(factor(df$t), df$x, geom = "boxplot") + theme_bw() + scale_x_date(major = "months", minor = "weeks", format = "%b")
qplot(factor(df$t), df$x, geom = "boxplot") + theme_bw() + scale_x_date(format = "%b")
[R] Problems with boxplot in ggplot2:qplot
Dataframe closed contains balances of closed accounts: each row has month of closure (Date-type column month) and latest balance. I would like to plot by-month distributions of balances. A qplot call below produces several warnings and no output. Can anyone help?

Thank you.

PS. A really basic task, very similar to the examples on p. 71 of the ggplot2 book, apart from a Date grouping column; I am quite surprised to have problems with it. lattice package to the rescue?

qplot(factor(month), balance, data = closed, geom = "boxplot", xlim = range(closed$month))
There were 13 warnings (use warnings() to see them)

warnings()
Warning messages:
1: Removed 1 rows containing missing values (stat_boxplot).
2: Removed 7 rows containing missing values (geom_point).
3: Removed 5 rows containing missing values (geom_point).
4: Removed 8 rows containing missing values (geom_point).
5: Removed 3 rows containing missing values (geom_point).
6: Removed 5 rows containing missing values (geom_point).
7: Removed 2 rows containing missing values (geom_point).
8: Removed 12 rows containing missing values (geom_point).
9: Removed 2 rows containing missing values (geom_point).
10: Removed 1 rows containing missing values (geom_point).
11: Removed 2 rows containing missing values (geom_point).
12: Removed 3 rows containing missing values (geom_point).
13: Removed 4 rows containing missing values (geom_point).

p = qplot(factor(month), balance, data = closed, geom = "boxplot", xlim = range(closed$month))
plot(p)
Error in plot.window(...) : need finite 'xlim' values
Re: [R] Problems with boxplot in ggplot2:qplot
Hi,

it's hard to tell what's wrong without a reproducible example, but I noted two things:

- AFAIK there is no plot method for ggplot2. You probably meant print(p) instead
- if you map x to factor(month), I think it will be incompatible with your xlim values range(month).

HTH,

baptiste

On 14 February 2010 19:55, Dimitri Shvorob dimitri.shvo...@gmail.com wrote:
> Dataframe closed contains balances of closed accounts: each row has month of closure (Date-type column month) and latest balance. I would like to plot by-month distributions of balances. A qplot call below produces several warnings and no output. Can anyone help?
>
> Thank you.
>
> PS. A really basic task, very similar to the examples on p. 71 of the ggplot2 book, apart from a Date grouping column; I am quite surprised to have problems with it. lattice package to the rescue?
>
> qplot(factor(month), balance, data = closed, geom = "boxplot", xlim = range(closed$month))
> There were 13 warnings (use warnings() to see them)
>
> warnings()
> Warning messages:
> 1: Removed 1 rows containing missing values (stat_boxplot).
> 2: Removed 7 rows containing missing values (geom_point).
> 3: Removed 5 rows containing missing values (geom_point).
> 4: Removed 8 rows containing missing values (geom_point).
> 5: Removed 3 rows containing missing values (geom_point).
> 6: Removed 5 rows containing missing values (geom_point).
> 7: Removed 2 rows containing missing values (geom_point).
> 8: Removed 12 rows containing missing values (geom_point).
> 9: Removed 2 rows containing missing values (geom_point).
> 10: Removed 1 rows containing missing values (geom_point).
> 11: Removed 2 rows containing missing values (geom_point).
> 12: Removed 3 rows containing missing values (geom_point).
> 13: Removed 4 rows containing missing values (geom_point).
>
> p = qplot(factor(month), balance, data = closed, geom = "boxplot", xlim = range(closed$month))
> plot(p)
> Error in plot.window(...) : need finite 'xlim' values
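The print(p) point matters most in scripts and when writing to a file device, because a ggplot object is drawn only when it is printed. The sketch below assumes a recent ggplot2 and uses pdf() with a temporary file in place of a jpeg() device. The data frame is simulated, and its columns are hypothetical stand-ins for closed$month and closed$balance.

```r
library(ggplot2)

set.seed(2)
closed <- data.frame(
  month   = rep(seq(as.Date("2009-01-01"), by = "month", length.out = 3), each = 10),
  balance = rlnorm(30, meanlog = 7)
)

# No xlim here: x is mapped to a factor, so Date limits would not match it.
p <- ggplot(closed, aes(x = factor(month), y = balance)) + geom_boxplot()

out <- tempfile(fileext = ".pdf")
pdf(out, width = 8, height = 6)
print(p)   # explicit print() draws the plot on the open device
dev.off()

file.exists(out)
```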
Re: [R] Problems with boxplot in ggplot2:qplot
... Unfortunately, a problem remains: I cannot label x ticks a la 'names.arg = '. month has values like '2009-01-01', '2009-02-01', etc., while I would prefer 'Jan', 'Feb'. Using

closed$month = format(closed$month, "%b")

disrupts the order of plot's panels, which now follows the alphabetic order of month names.
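One way around the alphabetical ordering is to turn the month names into a factor whose levels are in calendar order. This is a minimal base R sketch with made-up dates. It uses the built-in month.abb constant (English abbreviations) instead of format(x, "%b"), whose output depends on the locale.

```r
month <- as.Date(c("2009-04-01", "2009-01-01", "2009-12-01", "2009-02-01"))

# Month number (1-12) extracted from the date, then mapped to an
# abbreviation; the levels fix the calendar order explicitly.
month_num   <- as.integer(format(month, "%m"))
month_label <- factor(month.abb[month_num], levels = month.abb)

levels(droplevels(month_label))
sort(month_label)
```

Note that this collapses the year, so it only suits data covering a single calendar year, as in the example dates here.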
Re: [R] Problems with boxplot in ggplot2:qplot
My bad: once I ran dev.off(), I did get a plot, albeit a blank one. Then I removed xlim - which I put in after qplot's complaint about xlim - and voila! Thanks a lot.
Re: [R] Problems with Boxplot
Hi

r-help-boun...@r-project.org wrote on 05.09.2009 04:59:41:
> Hi Petr,
>
> Thanks for these comments. I'm sorry that my post was not clear. I was referring to the questions in my original post/code/file uploads, but I had forgotten to include an updated file (now attached http://www.nabble.com/file/p25304663/Post%2Btrial%2Bdata.csv Post+trial+data.csv ) to work with the new code:
>
> testdata <- c("C:\\Files\\R\\Sample R code\\Post trial data.csv")
> new_data <- read.table(testdata, skip = 0, sep = ",", na.strings = "na", header = TRUE)
> x11(width=16, height=7, pointsize=14)
> boxplot(new_data, outline = FALSE, col = c("lightblue", "salmon"), las = 1, boxwex = 0.5)
> legend("top", c("Label for blue boxes", "Label for red boxes"), cex=1.5, lty=1:2, fill=c("lightblue", "salmon"), bty="n"); title(main="Chart title text", cex.main = 1.8)
> grid()
>
> I'm still not clear how I can get the number format showing #,###. E.g. with this code and attached file, the scale shows as 2000, 10000 etc. I don't know how to show 2,000, 10,000 etc. I have looked through sprintf (thanks for suggesting that - I'd spent hours looking without finding it) and it seems incredibly flexible, but the formats shown are more scientific in focus. I still haven't been able to find a way of getting a comma style.

AFAIK you cannot format these in boxplot directly. You need to plot without the y axis, and in axis you can use formatting with prettyNum. I found it quite easily from the sprintf and formatC help pages (I had not done it before, so I learned it now :-)

x <- rnorm(100)+1
bbb <- boxplot(x, axes=F)
axis(2, at = pretty(x), labels = prettyNum(pretty(x), big.mark=","))

Regards
Petr

> Thanks again
> Guy
>
> Petr Pikal wrote:
>> Hi
>>
>> it is rather difficult to understand what you mean by your questions/answers without real reproducible code.
>>
>> r-help-boun...@r-project.org wrote on 03.09.2009 13:41:11:
>>> I'd be interested if anyone has a quick way to get percentages and additionally, how do I get numbers in the 0,000 format along the x or y-axis? In the meantime, I can live with this.
>>
>> plot(1:10, 1:10, axes=F)
>> axis(2, at=c(2,3,7,9), labels=c(1.2, 2.38, 13.54, 16.8))
>>
>> the same applies with boxplot. By bbb <- boxplot() you obtain an object which is used by bxp. See help page for boxplot, section See also ...
>>
>> See also par for graphic options, format and/or sprintf for formatting numbers
>>
>> Regards
>> Petr
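Here is the approach above (suppress the default axis, then draw one with formatted labels) as a self-contained base R sketch. The data are simulated, and comma_labels is a hypothetical helper name. It uses formatC(format = "d", big.mark = ",") rather than prettyNum, a swapped-in choice that keeps whole-number tick values such as 100000 out of scientific notation.

```r
set.seed(3)
x <- rnorm(100, mean = 10000, sd = 3000)  # simulated values in the thousands

# Format whole numbers with a thousands separator, e.g. 10000 -> "10,000".
comma_labels <- function(v) formatC(v, format = "d", big.mark = ",")

ticks <- pretty(x)
boxplot(x, yaxt = "n")   # suppress the default y axis
axis(2, at = ticks, labels = comma_labels(ticks), las = 1)
```

format = "d" rounds to whole numbers, so this suits integer-valued tick positions. Fractional ticks would need a different format.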
Re: [R] Problems with Boxplot
Hi Petr,

Thanks for these comments. I'm sorry that my post was not clear. I was referring to the questions in my original post/code/file uploads, but I had forgotten to include an updated file (now attached http://www.nabble.com/file/p25304663/Post%2Btrial%2Bdata.csv Post+trial+data.csv ) to work with the new code:

testdata <- c("C:\\Files\\R\\Sample R code\\Post trial data.csv")
new_data <- read.table(testdata, skip = 0, sep = ",", na.strings = "na", header = TRUE)
x11(width=16, height=7, pointsize=14)
boxplot(new_data, outline = FALSE, col = c("lightblue", "salmon"), las = 1, boxwex = 0.5)
legend("top", c("Label for blue boxes", "Label for red boxes"), cex=1.5, lty=1:2, fill=c("lightblue", "salmon"), bty="n"); title(main="Chart title text", cex.main = 1.8)
grid()

I'm still not clear how I can get the number format showing #,###. E.g. with this code and attached file, the scale shows as 2000, 10000 etc. I don't know how to show 2,000, 10,000 etc. I have looked through sprintf (thanks for suggesting that - I'd spent hours looking without finding it) and it seems incredibly flexible, but the formats shown are more scientific in focus. I still haven't been able to find a way of getting a comma style.

Thanks again
Guy

Petr Pikal wrote:
> Hi
>
> it is rather difficult to understand what you mean by your questions/answers without real reproducible code.
>
> r-help-boun...@r-project.org wrote on 03.09.2009 13:41:11:
>> I'd be interested if anyone has a quick way to get percentages and additionally, how do I get numbers in the 0,000 format along the x or y-axis? In the meantime, I can live with this.
>
> plot(1:10, 1:10, axes=F)
> axis(2, at=c(2,3,7,9), labels=c(1.2, 2.38, 13.54, 16.8))
>
> the same applies with boxplot. By bbb <- boxplot() you obtain an object which is used by bxp. See help page for boxplot, section See also ...
>
> See also par for graphic options, format and/or sprintf for formatting numbers
>
> Regards
> Petr
Re: [R] Problems with Boxplot
Hi

it is rather difficult to understand what you mean by your questions/answers without real reproducible code.

r-help-boun...@r-project.org wrote on 03.09.2009 13:41:11:
> I'm posting answers to my own Q's here - as far as I have answers - first so that people don't spend time on them, and second in case the solutions are helpful to anyone else in future.
>
> 1) My first question is: is there a simple way of getting both dates along the x-axis and the *100 calculation (or percentages)?
>
> I still don't know how to change the format of the y-axis tick labels. I'd be interested if anyone has a quick way to get percentages and additionally, how do I get numbers in the 0,000 format along the x or y-axis? In the meantime, I can live with this.

plot(1:10, 1:10, axes=F)
axis(2, at=c(2,3,7,9), labels=c(1.2, 2.38, 13.54, 16.8))

the same applies with boxplot. By bbb <- boxplot() you obtain an object which is used by bxp. See help page for boxplot, section See also

> 2) Next is how can I put a legend somewhere to show that red is data set 1 and blue is data set 2.
>
> I did this with the following text:
> legend("top", c("Top", "Bottom"), cex=1.5, lty=1:2, fill=c("lightblue", "salmon"), bty="n")

You can go through the structure of the object produced by boxplot and you will see that boxes are located on the x axis from 1 to the number of boxes, and on the y axis according to the scale of the y axis

boxplot(rnorm(20), axes=F)
legend(.5, 0, legend=letters[1:3], col=1:3, pch=1)
legend(1, 0, legend=letters[1:3], col=1:3, pch=19, pt.cex=3)

> 3) Is it possible to get the date to straddle across each of the two dates it covers: as it is, one tick has the date, the other does not.
>
> I didn't manage to do this, but as there were over 20 dates in the final data (i.e. 40 plots), by changing the width of the chart window, not every plot was labeled anyway and it was clear enough.

??

> 4) Is it possible to show both the median and the mean with boxplot?
>
> I gave up on this, but I think the data looks OK in the end with just the boxplot defaults.

Again, the object produced by bbb is your clue

x <- rlnorm(200)
bbb <- boxplot(x)
points(1, mean(x), cex=3, col=2, pch=19)

You can add anything, not just the mean, but remember that when you see a boxplot, you expect to have the median shown, not the mean.

See also par for graphic options, format and/or sprintf for formatting numbers

Regards
Petr

> 5) Finally, the code works as described above (i.e. up to a point) with the Post trial data.csv file I have posted. However when I try with a larger file (Larger trial.csv, also posted), I get the message:
>
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 145 did not have 50 elements
>
> when I get to the data_headings line. I have no idea why R is seeing a difference between these two files.
>
> I ended up finding that even for specific small files, I got this error message, which prevented me from processing the data and so was fatal to the code. I narrowed it down to a small file, and then looked at the csv file in notepad. The bottom of the file (which was just 2 columns of data, of different column lengths), was along these lines:
>
> -0.48013245,0.095652174
> -0.039344262,-0.067142857
> 0.018022077,-0.079295154
> -0.078534031,
> 0.010054845,
> 0.096153846,
> 0.177568018
> 0.013818182
> 0.002402883
>
> It seemed that R could cope with empty columns - as long as there was a "," to indicate that there was indeed a column, but it could NOT cope with a column that didn't exist (because there was no ","). The problem was that Excel, which was generating the CSV file, wasn't putting "," to indicate empty columns in certain circumstances. The solution was to fill the empty cells in Excel with "na" before saving as CSV. Excel then saves it correctly, and R deals with it correctly.
>
> The final code (though without the y-axis formatting being fixed) is:
>
> testdata <- c("C:\\Files\\R\\Sample R code\\Post trial data.csv")
> new_data <- read.table(testdata, skip = 0, sep = ",", na.strings = "na", header = TRUE)
> x11(width=16, height=7, pointsize=14)
> boxplot(new_data, outline = FALSE, col = c("lightblue", "salmon"), las = 1, boxwex = 0.5)
> legend("top", c("Label for blue boxes", "Label for red boxes"), cex=1.5, lty=1:2, fill=c("lightblue", "salmon"), bty="n"); title(main="Chart title text", cex.main = 1.8)
> grid()
>
> Guy
>
> gug wrote:
>> Hello,
>>
>> I have been having difficulty getting boxplot to give the output I want - probably a result of the way I have been handling the data. The data is arranged in columns: each date has two sets of data. The number of data points varies with the date, so each column is of different length. I want to get a series of boxplots with the date along the x-axis, with alternating colors, so that it is easy to see the difference between the results within each date, as well as across dates.
>>
>> testdata <- c("C:\\Files\\R\\Sample R code\\Post trial data.csv")
>> data_headings <-
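The points(1, mean(x), ...) idea from the reply above extends to several boxes, since they sit at x = 1, 2, ... in column order. This is a minimal base R sketch with simulated data, and the column names a and b are hypothetical.

```r
set.seed(4)
dat <- data.frame(a = rlnorm(50), b = rlnorm(50, meanlog = 0.5))

bp <- boxplot(dat, col = c("lightblue", "salmon"))

# Boxes are drawn at x = 1, 2, ..., so the column means can be
# overlaid at those same positions.
means <- colMeans(dat, na.rm = TRUE)
points(seq_along(means), means, pch = 19, col = "red")
```

The na.rm = TRUE matters for data like the thread's, where columns of unequal length are padded with missing values.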
Re: [R] Problems with Boxplot
I'm posting answers to my own Q's here - as far as I have answers - first so that people don't spend time on them, and second in case the solutions are helpful to anyone else in future.

1) My first question is: is there a simple way of getting both dates along the x-axis and the *100 calculation (or percentages)?

I still don't know how to change the format of the y-axis tick labels. I'd be interested if anyone has a quick way to get percentages and additionally, how do I get numbers in the 0,000 format along the x or y-axis? In the meantime, I can live with this.

2) Next is how can I put a legend somewhere to show that red is data set 1 and blue is data set 2.

I did this with the following text:
legend("top", c("Top", "Bottom"), cex=1.5, lty=1:2, fill=c("lightblue", "salmon"), bty="n")

3) Is it possible to get the date to straddle across each of the two dates it covers: as it is, one tick has the date, the other does not.

I didn't manage to do this, but as there were over 20 dates in the final data (i.e. 40 plots), by changing the width of the chart window, not every plot was labeled anyway and it was clear enough.

4) Is it possible to show both the median and the mean with boxplot?

I gave up on this, but I think the data looks OK in the end with just the boxplot defaults.

5) Finally, the code works as described above (i.e. up to a point) with the Post trial data.csv file I have posted. However when I try with a larger file (Larger trial.csv, also posted), I get the message:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 145 did not have 50 elements

when I get to the data_headings line. I have no idea why R is seeing a difference between these two files.

I ended up finding that even for specific small files, I got this error message, which prevented me from processing the data and so was fatal to the code. I narrowed it down to a small file, and then looked at the csv file in notepad. The bottom of the file (which was just 2 columns of data, of different column lengths), was along these lines:

-0.48013245,0.095652174
-0.039344262,-0.067142857
0.018022077,-0.079295154
-0.078534031,
0.010054845,
0.096153846,
0.177568018
0.013818182
0.002402883

It seemed that R could cope with empty columns - as long as there was a "," to indicate that there was indeed a column, but it could NOT cope with a column that didn't exist (because there was no ","). The problem was that Excel, which was generating the CSV file, wasn't putting "," to indicate empty columns in certain circumstances. The solution was to fill the empty cells in Excel with "na" before saving as CSV. Excel then saves it correctly, and R deals with it correctly.

The final code (though without the y-axis formatting being fixed) is:

testdata <- c("C:\\Files\\R\\Sample R code\\Post trial data.csv")
new_data <- read.table(testdata, skip = 0, sep = ",", na.strings = "na", header = TRUE)
x11(width=16, height=7, pointsize=14)
boxplot(new_data, outline = FALSE, col = c("lightblue", "salmon"), las = 1, boxwex = 0.5)
legend("top", c("Label for blue boxes", "Label for red boxes"), cex=1.5, lty=1:2, fill=c("lightblue", "salmon"), bty="n"); title(main="Chart title text", cex.main = 1.8)
grid()

Guy

gug wrote:
> Hello,
>
> I have been having difficulty getting boxplot to give the output I want - probably a result of the way I have been handling the data. The data is arranged in columns: each date has two sets of data. The number of data points varies with the date, so each column is of different length. I want to get a series of boxplots with the date along the x-axis, with alternating colors, so that it is easy to see the difference between the results within each date, as well as across dates.
>
> testdata <- c("C:\\Files\\R\\Sample R code\\Post trial data.csv")
> data_headings <- read.table(testdata, skip = 0, sep = ",", header = FALSE)[1,]
> my_data <- read.table(testdata, skip = 1, sep = ",", na.strings = "na", header = FALSE)
> boxplot(my_data*100, names = data_headings, outline = FALSE, range = 0.3, border = c(2,4))
>
> The result is a boxplot, but it does not show the date along the bottom (the "names = data_headings" bit achieves nothing). I can alternatively try this:
>
> new_data <- read.table(testdata, skip = 0, sep = ",", na.strings = "na", header = TRUE)
> boxplot(new_data, outline = FALSE, range = 0.3, border = c(2,4))
>
> This takes all the data and plots it, but I then lose the ability to multiply by 100 (I'm trying to show percentages: e.g. 10% as 10, rather than as 0.1).
>
> 1) My first question is: is there a simple way of getting both dates along the x-axis and the *100 calculation (or percentages)?
>
> 2) Next is how can I put a legend somewhere to show that red is data set 1 and blue is data set 2.
>
> 3) Is it possible to get the date to straddle across each of the two dates it covers: as it is, one tick has the date, the other does not.
>
> 4) Is it possible to show both the median and the mean with boxplot?
>
> 5) Finally, the code works as
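An alternative to editing the spreadsheet, offered only as a sketch, is to let R diagnose and pad the short rows itself. count.fields() and the fill argument of read.table() are standard base R. The small file written below is a made-up stand-in for the CSV described above.

```r
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b",
             "-0.48,0.09",
             "-0.07,",      # trailing comma: second field present but empty
             "0.01"),       # no comma at all: second field missing
           tmp)

# count.fields() reports the number of fields on each line, which
# shows where the file is ragged.
count.fields(tmp, sep = ",")

# With fill = TRUE, short rows are padded with NA on the right
# instead of stopping with "line N did not have M elements".
dat <- read.table(tmp, sep = ",", header = TRUE, fill = TRUE)
dat
```

Padding happens only on the right, so this helps when the trailing columns are the short ones. A gap in an earlier column still needs its comma, or the "na" filler described above.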
[R] Problems with Boxplot
Hello,

I have been having difficulty getting boxplot to give the output I want - probably a result of the way I have been handling the data. The data is arranged in columns: each date has two sets of data. The number of data points varies with the date, so each column is of different length. I want to get a series of boxplots with the date along the x-axis, with alternating colors, so that it is easy to see the difference between the results within each date, as well as across dates.

testdata <- c("C:\\Files\\R\\Sample R code\\Post trial data.csv")
data_headings <- read.table(testdata, skip = 0, sep = ",", header = FALSE)[1,]
my_data <- read.table(testdata, skip = 1, sep = ",", na.strings = "na", header = FALSE)
boxplot(my_data*100, names = data_headings, outline = FALSE, range = 0.3, border = c(2,4))

The result is a boxplot, but it does not show the date along the bottom (the "names = data_headings" bit achieves nothing). I can alternatively try this:

new_data <- read.table(testdata, skip = 0, sep = ",", na.strings = "na", header = TRUE)
boxplot(new_data, outline = FALSE, range = 0.3, border = c(2,4))

This takes all the data and plots it, but I then lose the ability to multiply by 100 (I'm trying to show percentages: e.g. 10% as 10, rather than as 0.1).

1) My first question is: is there a simple way of getting both dates along the x-axis and the *100 calculation (or percentages)?

2) Next is how can I put a legend somewhere to show that red is data set 1 and blue is data set 2.

3) Is it possible to get the date to straddle across each of the two dates it covers: as it is, one tick has the date, the other does not.

4) Is it possible to show both the median and the mean with boxplot?

5) Finally, the code works as described above (i.e. up to a point) with the Post trial data.csv file I have posted. However when I try with a larger file (Larger trial.csv, also posted), I get the message:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 145 did not have 50 elements

when I get to the data_headings line. I have no idea why R is seeing a difference between these two files.

http://www.nabble.com/file/p25256461/Post%2Btrial%2Bdata.csv Post+trial+data.csv
http://www.nabble.com/file/p25256461/Larger%2Btrial.csv Larger+trial.csv

Thanks for any suggestions,

Guy Green
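On questions 1 and 2, a data frame read with header = TRUE can be multiplied by 100 directly if all its columns are numeric, and the column names carry through to the boxplot. This is a base R sketch with simulated data. The column names are hypothetical stand-ins for the date headings in the CSV, and check.names = FALSE (also an argument of read.table) keeps such headings verbatim.

```r
set.seed(5)
# Hypothetical stand-in for the CSV: two series per date, columns of
# unequal length padded with NA.
new_data <- data.frame(
  "2009-01 A" = c(rnorm(8, 0.02, 0.05), NA, NA),
  "2009-01 B" = rnorm(10, 0.01, 0.05),
  "2009-02 A" = c(rnorm(9, 0.03, 0.05), NA),
  "2009-02 B" = rnorm(10, 0.00, 0.05),
  check.names = FALSE
)

cols <- c("lightblue", "salmon")

# Arithmetic on an all-numeric data frame is element-wise and keeps
# the column names, so the headings survive the multiplication.
bp <- boxplot(new_data * 100, outline = FALSE, col = cols, las = 2,
              ylab = "Percent")
legend("top", legend = c("Data set 1", "Data set 2"), fill = cols, bty = "n")
```

The two colors are recycled across the boxes, which gives the alternating pattern asked for.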