Re: [R] Problems with boxplot in ggplot2:qplot

2010-02-17 Thread Dimitri Shvorob

Thanks a lot, Brian!
-- 
View this message in context: 
http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1558810.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problems with boxplot in ggplot2:qplot

2010-02-16 Thread Brian Diggs
On 2/15/2010 2:41 PM, Dimitri Shvorob wrote:
> library(sqldf)
> library(ggplot2)
>
> t = data.frame(t = seq.Date(as.Date("2009-01-01"), to =
> as.Date("2009-12-01"), by = "month"))
> x = data.frame(x = rnorm(5))
> df = sqldf("select * from t, x")

A simpler way to get random data that doesn't involve the sqldf package, and 
gets different x values for each date:

df <- data.frame(t = seq.Date(as.Date("2009-01-01"), to=as.Date("2009-12-01"), 
by="month"), x=rnorm(60))

> qplot(factor(df$t), df$x, geom = "boxplot") + theme_bw()

You are converting your dates to a factor, so they are no longer dates.  I'm 
guessing you did this to get a separate boxplot for each date, but that is not 
the right way to do that.  Use the group aesthetic to make different groups.

qplot(df$t, df$x, geom = "boxplot", group=df$t) + theme_bw()

> qplot(factor(df$t), df$x, geom = "boxplot") + theme_bw() +
> scale_x_date(major = "months",  minor = "weeks", format = "%b") 

qplot(df$t, df$x, geom = "boxplot", group=df$t) + theme_bw() +
scale_x_date(major = "months",  minor = "weeks", format = "%b")

> qplot(factor(df$t), df$x, geom = "boxplot") + theme_bw() +
> scale_x_date(format = "%b") 

qplot(df$t, df$x, geom = "boxplot", group=df$t) + theme_bw() +
scale_x_date(format = "%b")
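
For readers on a current ggplot2 release, where scale_x_date no longer
accepts the major/minor/format arguments used in this thread, a minimal
sketch of the same idea follows. It assumes a recent ggplot2 with the
date_breaks and date_labels arguments; the data are random and the seed is
only there to make the illustration repeatable. It is not code from the
thread.

```r
library(ggplot2)

set.seed(1)  # illustration data only, not from the thread
df <- data.frame(
  t = rep(seq(as.Date("2009-01-01"), as.Date("2009-12-01"), by = "month"),
          each = 5),
  x = rnorm(60)
)

# Keep t as a Date on the x axis; group = t gives one box per month.
p <- ggplot(df, aes(x = t, y = x, group = t)) +
  geom_boxplot() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  theme_bw()

print(p)
```

Because t stays a Date, the date scale can format the tick labels, which is
not possible once the column has been turned into a factor.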

--
Brian Diggs, Ph.D.
Senior Research Associate, Department of Surgery, Oregon Health & Science 
University



Re: [R] Problems with boxplot in ggplot2:qplot

2010-02-16 Thread Dimitri Shvorob

Now that we have a reproducible example... ;)
-- 
View this message in context: 
http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1557994.html


Re: [R] Problems with boxplot in ggplot2:qplot

2010-02-15 Thread hadley wickham
Hi Dimitri,

Have you looked at the examples for scale_x_date -
http://had.co.nz/ggplot2/scale_date.html?  They show you how to both
set the limits and control the labels.

Hadley

On Sun, Feb 14, 2010 at 1:34 PM, Dimitri Shvorob
<dimitri.shvo...@gmail.com> wrote:
>
> ... Unfortunately, a problem remains: I cannot label x ticks a la 'names.arg
> =  '.
>
> month has values like '2009-01-01', '2009-02-01', etc., while I would prefer
> 'Jan', 'Feb'. Using
>
> closed$month = format(closed$month, "%b")
>
> disrupts the order of plot's panels, which now follows the alphabetic order
> of month names.




-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



Re: [R] Problems with boxplot in ggplot2:qplot

2010-02-15 Thread Dimitri Shvorob

Thank you, Hadley. I try

jpeg(file, width = 800, height = 600, quality = 100)
qplot(factor(closed$close.month), closed$closing.balance, geom = "boxplot",
  main = "Monthly distributions of closing balances", xlab = "Month",
ylab = "Balance, USD") + theme_bw() + scale_x_date(major = "months",  minor
= "weeks", format = "%b")
dev.off()

('minor = ' can be skipped with no consequences, apparently). Labels
disappear altogether.


-- 
View this message in context: 
http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1556571.html


Re: [R] Problems with boxplot in ggplot2:qplot

2010-02-15 Thread Dimitri Shvorob

Trying 

+ scale_x_date(format = "%b")

produces a peculiar result: Apr and Dec facets are labeled Jan, remaining
labels are blank.
-- 
View this message in context: 
http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1556573.html


Re: [R] Problems with boxplot in ggplot2:qplot

2010-02-15 Thread hadley wickham
Without a reproducible example, it's impossible to give you any more
suggestions.

Hadley

On Mon, Feb 15, 2010 at 2:16 PM, Dimitri Shvorob
<dimitri.shvo...@gmail.com> wrote:
>
> Trying
>
> + scale_x_date(format = "%b")
>
> produces a peculiar result: Apr and Dec facets are labeled Jan, remaining
> labels are blank.




-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



Re: [R] Problems with boxplot in ggplot2:qplot

2010-02-15 Thread Dimitri Shvorob

library(sqldf)
library(ggplot2)

t = data.frame(t = seq.Date(as.Date("2009-01-01"), to =
as.Date("2009-12-01"), by = "month"))
x = data.frame(x = rnorm(5))
df = sqldf("select * from t, x")

qplot(factor(df$t), df$x, geom = "boxplot") + theme_bw()


qplot(factor(df$t), df$x, geom = "boxplot") + theme_bw() +
scale_x_date(major = "months",  minor = "weeks", format = "%b") 


qplot(factor(df$t), df$x, geom = "boxplot") + theme_bw() +
scale_x_date(format = "%b") 
-- 
View this message in context: 
http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1556745.html


[R] Problems with boxplot in ggplot2:qplot

2010-02-14 Thread Dimitri Shvorob

Dataframe closed contains balances of closed accounts: each row has month of
closure (Date-type column month) and latest balance. I would like to plot
by-month distributions of balances. A qplot call below produces several
warnings and no output. 

Can anyone help?

Thank you.

PS. A really basic task, very similar to the examples on p. 71 of the
ggplot2 book, apart from a Date grouping column; I am quite surprised to
have problems with it. lattice package to the rescue?


> qplot(factor(month), balance, data = closed, geom = "boxplot", xlim =
  range(closed$month))
There were 13 warnings (use warnings() to see them)

> warnings()
Warning messages:
1: Removed 1 rows containing missing values (stat_boxplot).
2: Removed 7 rows containing missing values (geom_point).
3: Removed 5 rows containing missing values (geom_point).
4: Removed 8 rows containing missing values (geom_point).
5: Removed 3 rows containing missing values (geom_point).
6: Removed 5 rows containing missing values (geom_point).
7: Removed 2 rows containing missing values (geom_point).
8: Removed 12 rows containing missing values (geom_point).
9: Removed 2 rows containing missing values (geom_point).
10: Removed 1 rows containing missing values (geom_point).
11: Removed 2 rows containing missing values (geom_point).
12: Removed 3 rows containing missing values (geom_point).
13: Removed 4 rows containing missing values (geom_point).

> p = qplot(factor(month), balance, data = closed, geom = "boxplot", xlim =
  range(closed$month))
> plot(p)
Error in plot.window(...) : need finite 'xlim' values
-- 
View this message in context: 
http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1555338.html


Re: [R] Problems with boxplot in ggplot2:qplot

2010-02-14 Thread baptiste auguie
Hi,

it's hard to tell what's wrong without a reproducible example, but I
noted two things:

- AFAIK there is no plot method for ggplot2. You probably meant print(p) instead

- if you map x to factor(month), I think it will be incompatible with
your xlim values range(month).
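
One way to see the mismatch described in the second point, as a base R
sketch. The vector month below is a hypothetical stand-in for closed$month,
not the poster's data. A factor is drawn at integer positions, while a Date
is stored as a day count, so limits taken from the dates lie far outside the
positions of the boxes. That would be consistent with the "Removed ... rows"
warnings in the original post, although the thread does not confirm it.

```r
# Hypothetical stand-in for the 'month' column described in the thread.
month <- seq(as.Date("2009-01-01"), as.Date("2009-12-01"), by = "month")

# A factor is plotted at integer positions 1, 2, ..., one per level.
factor_positions <- seq_along(levels(factor(month)))

# A Date is stored as the number of days since 1970-01-01.
date_limits <- as.numeric(range(month))

print(range(factor_positions))
print(date_limits)
```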

HTH,

baptiste



Re: [R] Problems with boxplot in ggplot2:qplot

2010-02-14 Thread Dimitri Shvorob

... Unfortunately, a problem remains: I cannot label x ticks a la 'names.arg
=  '. 

month has values like '2009-01-01', '2009-02-01', etc., while I would prefer
'Jan', 'Feb'. Using

closed$month = format(closed$month, "%b") 

disrupts the order of plot's panels, which now follows the alphabetic order
of month names.
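
The alphabetical ordering comes from format() returning plain character
strings. A minimal base R sketch of one way around it is to build a factor
whose levels are already in calendar order. The dates below are made up, and
month_label is a hypothetical name; month.abb is R's built-in vector of
English month abbreviations.

```r
# Hypothetical dates standing in for closed$month; deliberately out of order.
month <- as.Date(c("2009-04-01", "2009-01-01", "2009-12-01", "2009-02-01"))

# The month number is locale-independent; use it to index month.abb,
# then fix the level order to the calendar order.
month_label <- factor(month.abb[as.integer(format(month, "%m"))],
                      levels = month.abb)

print(sort(month_label))
```

Plotting functions that respect factor levels should then lay the groups out
from Jan to Dec rather than alphabetically.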

-- 
View this message in context: 
http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1555358.html


Re: [R] Problems with boxplot in ggplot2:qplot

2010-02-14 Thread Dimitri Shvorob

My bad: once I ran dev.off(), I did get a plot, albeit a blank one. Then I
removed xlim - which I put in after qplot's complaint about xlim - and voila!

Thanks a lot.
-- 
View this message in context: 
http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1555352.html


Re: [R] Problems with Boxplot

2009-09-07 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org wrote on 05.09.2009 04:59:41:

> 
> Hi Petr,
> 
> Thanks for these comments.
> 
> I'm sorry that my post was not clear.  I was referring to the questions in
> my original post/code/file uploads, but I had forgotten to include an
> updated file (now attached 
> http://www.nabble.com/file/p25304663/Post%2Btrial%2Bdata.csv
> Post+trial+data.csv ) to work with the new code:
> 
> testdata <- c("C:\\Files\\R\\Sample R code\\Post trial data.csv")
> new_data <- read.table(testdata, skip = 0, sep = ",", na.strings =
> "na", header = TRUE)
> x11(width=16, height=7, pointsize=14)
> boxplot(new_data, outline = FALSE, col = c("lightblue", "salmon"), las = 1,
> boxwex = 0.5) 
> legend("top", c("Label for blue boxes","Label for red boxes"), cex=1.5,
> lty=1:2, fill=c("lightblue", "salmon"), bty="n");
> title(main="Chart title text", cex.main = 1.8)
> grid() 
> 
> I'm still not clear how I can get the number format showing #,###.  E.g.
> with this code and attached file, the scale shows as 2000, 10000 etc.  I
> don't know how to show 2,000, 10,000 etc.  I have looked through sprintf
> (thanks for suggesting that - I'd spent hours looking without finding it)
> and it seems incredibly flexible, but the formats shown are more scientific
> in focus.  I still haven't been able to find a way of getting a comma
> style.

AFAIK you cannot format these in boxplot directly. You need to plot 
without the y axis and then, in axis(), format the labels with prettyNum. I 
found this quite easily from the sprintf and formatC help pages (I had not 
done it before, so I learned it now :-)

x <- rnorm(100)+1
bbb <- boxplot(x, axes=FALSE)
axis(2, at = pretty(x), labels = prettyNum(pretty(x), big.mark=","))
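
The comma style only becomes visible once the values reach the thousands. A
small sketch of the same approach with made-up data in that range follows.
The names balances, ticks and tick_labels are hypothetical, and format() with
scientific = FALSE is used here in place of prettyNum so that large tick
values are not switched to exponent notation.

```r
set.seed(42)
# Illustrative values in the thousands (made up, not the poster's data).
balances <- rnorm(100, mean = 10000, sd = 3000)

ticks <- pretty(balances)
tick_labels <- format(ticks, big.mark = ",", scientific = FALSE, trim = TRUE)

boxplot(balances, axes = FALSE)   # draw the box without any axes
axis(2, at = ticks, labels = tick_labels, las = 1)
box()                             # restore the frame dropped by axes = FALSE
```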

Regards
Petr

 


Re: [R] Problems with Boxplot

2009-09-05 Thread gug

Hi Petr,

Thanks for these comments.

I'm sorry that my post was not clear.  I was referring to the questions in
my original post/code/file uploads, but I had forgotten to include an
updated file (now attached 
http://www.nabble.com/file/p25304663/Post%2Btrial%2Bdata.csv
Post+trial+data.csv ) to work with the new code:

testdata <- c("C:\\Files\\R\\Sample R code\\Post trial data.csv")
new_data <- read.table(testdata, skip = 0, sep = ",", na.strings =
"na", header = TRUE)
x11(width=16, height=7, pointsize=14)
boxplot(new_data, outline = FALSE, col = c("lightblue", "salmon"), las = 1,
boxwex = 0.5) 
legend("top", c("Label for blue boxes","Label for red boxes"), cex=1.5,
lty=1:2, fill=c("lightblue", "salmon"), bty="n");
title(main="Chart title text", cex.main = 1.8)
grid() 

I'm still not clear how I can get the number format showing #,###.  E.g.
with this code and attached file, the scale shows as 2000, 10000 etc.  I
don't know how to show 2,000, 10,000 etc.  I have looked through sprintf
(thanks for suggesting that - I'd spent hours looking without finding it)
and it seems incredibly flexible, but the formats shown are more scientific
in focus.  I still haven't been able to find a way of getting a comma
style.

Thanks again

Guy


-- 
View this message in context: 
http://www.nabble.com/Problems-with-Boxplot-tp25256461p25304663.html


Re: [R] Problems with Boxplot

2009-09-04 Thread Petr PIKAL
Hi

it is rather difficult to understand what you mean by your 
questions/answers without real reproducible code.

r-help-boun...@r-project.org wrote on 03.09.2009 13:41:11:

> 
> I'm posting answers to my own Q's here - as far as I have answers - first so
> that people don't spend time on them, and second in case the solutions are
> helpful to anyone else in future.
> 
> 1) My first question is: is there a simple way of getting both dates along
> the x-axis and the *100 calculation (or percentages)?
> I still don't know how to change the format of the y-axis tick labels.  I'd
> be interested if anyone has a quick way to get percentages and additionally,
> how do I get numbers in the 0,000 format along the x or y-axis?  In the
> meantime, I can live with this.

plot(1:10,1:10, axes=FALSE)
axis(2, at=c(2,3,7,9), labels=c(1.2, 2.38, 13.54, 16.8))

The same applies to boxplot.

With

bbb <- boxplot()

you obtain an object which is used by bxp. See the help page for boxplot, 
section "See also".
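
Applied to the percentage question, the same trick might look like the sketch
below. The data frame returns and its column names are made up for
illustration; the values are stored as fractions and only the tick labels are
rescaled, so the data themselves need not be multiplied by 100.

```r
set.seed(7)
# Made-up returns stored as fractions (0.10 means 10%).
returns <- data.frame(set1 = rnorm(30, 0.02, 0.05),
                      set2 = rnorm(30, 0.01, 0.08))

# boxplot() draws the plot and invisibly returns the box statistics.
bbb <- boxplot(returns, axes = FALSE, col = c("lightblue", "salmon"))

ticks <- pretty(unlist(returns))
pct_labels <- paste0(format(ticks * 100, trim = TRUE), "%")

axis(1, at = seq_along(returns), labels = names(returns))  # boxes sit at 1, 2, ...
axis(2, at = ticks, labels = pct_labels, las = 1)
box()
```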


 
> 2) Next is how can I put a legend somewhere to show that red is data set 1
> and blue is data set 2.
> I did this with the following text:
> legend("top", c("Top","Bottom"), cex=1.5, lty=1:2, fill=c("lightblue",
> "salmon"), bty="n")

If you go through the structure of the object produced by boxplot, you will 
see that the boxes are located on the x axis from 1 to the number of boxes, 
and on the y axis according to the scale of the y axis.

boxplot(rnorm(20), axes=FALSE)
legend(.5,0, legend=letters[1:3], col=1:3, pch=1)
legend(1,0, legend=letters[1:3], col=1:3, pch=19, pt.cex=3)



 
> 3) Is it possible to get the date to straddle across each of the two dates
> it covers: as it is, one tick has the date, the other does not.
> I didn't manage to do this, but as there were over 20 dates in the final
> data (i.e. 40 plots), by changing the width of the chart window, not every
> plot was labeled anyway and it was clear enough.

??
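
If question 3 means one date label centred under each pair of boxes, a base R
sketch could look like the following. This is a guess at the intent, and the
data, the dates vector and the column layout (set 1, set 2, set 1, set 2, ...)
are all assumptions made for illustration.

```r
set.seed(3)
dates <- c("2009-01", "2009-02", "2009-03")   # hypothetical date labels
# Two columns per date, in the order set 1, set 2, set 1, set 2, ...
dat <- as.data.frame(matrix(rnorm(6 * 25), ncol = 6))

boxplot(dat, axes = FALSE, col = c("lightblue", "salmon"))

# Boxes sit at x = 1, 2, ..., so each pair is centred at 1.5, 3.5, 5.5, ...
centres <- seq(1.5, by = 2, length.out = length(dates))
axis(1, at = centres, labels = dates)
axis(2, las = 1)
box()
```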

 
> 4) Is it possible to show both the median and the mean with boxplot?
> I gave up on this, but I think the data looks OK in the end with just the
> boxplot defaults.

Again, the object produced by boxplot (bbb) is your clue.

x <- rlnorm(200)
bbb <- boxplot(x)
points(1, mean(x), cex=3, col=2, pch=19)

You can add anything, not just the mean, but remember that when people see a 
boxplot, they expect the median to be marked, not the mean.
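
With several boxes the same idea extends column by column, since box i is
centred at x = i. A minimal sketch with made-up data follows; dat and means
are hypothetical names, and the NA padding mimics columns of different length.

```r
set.seed(11)
# Made-up skewed data; column c is shorter and padded with NA.
dat <- data.frame(a = rlnorm(40),
                  b = rlnorm(40, 0.3),
                  c = c(rlnorm(25), rep(NA, 15)))

bbb <- boxplot(dat)                     # medians are drawn as usual
means <- colMeans(dat, na.rm = TRUE)    # one mean per column, ignoring NAs

# Overlay the means as filled red points at x = 1, 2, 3.
points(seq_along(means), means, pch = 19, col = 2)
```

The third row of bbb$stats holds the medians that the plot already shows, so
the points add only the means.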

See also par for graphics options, and format and/or sprintf for formatting 
numbers.

Regards
Petr


 

Re: [R] Problems with Boxplot

2009-09-03 Thread gug

I'm posting answers to my own Q's here - as far as I have answers - first so
that people don't spend time on them, and second in case the solutions are
helpful to anyone else in future.

1) My first question is: is there a simple way of getting both dates along
the x-axis and the *100 calculation (or percentages)?
I still don't know how to change the format of the y-axis tick labels.  I'd
be interested if anyone has a quick way to get percentages and additionally,
how do I get numbers in the 0,000 format along the x or y-axis?  In the
meantime, I can live with this.

2) Next is how can I put a legend somewhere to show that red is data set 1
and blue is data set 2.
I did this with the following text:
legend("top", c("Top","Bottom"), cex=1.5, lty=1:2, fill=c("lightblue",
"salmon"), bty="n")

3) Is it possible to get the date to straddle across each of the two dates
it covers: as it is, one tick has the date, the other does not.
I didn't manage to do this, but as there were over 20 dates in the final
data (i.e. 40 plots), by changing the width of the chart window, not every
plot was labeled anyway and it was clear enough.

4) Is it possible to show both the median and the mean with boxplot?
I gave up on this, but I think the data looks OK in the end with just the
boxplot defaults.

5) Finally, the code works as described above (i.e. up to a point) with the
"Post trial data.csv" file I have posted.  However when I try with a larger
file ("Larger trial.csv", also posted), I get the message: "Error in
scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :  line
145 did not have 50 elements" when I get to the "data_headings" line.  I
have no idea why R is seeing a difference between these two files.
I ended up finding that even for specific small files, I got this error
message, which prevented me from processing the data and so was fatal to the
code.  I narrowed it down to a small file, and then looked at the csv file
in notepad.  The bottom of the file (which was just 2 columns of data, of
different column lengths), was along these lines:

-0.48013245,0.095652174
-0.039344262,-0.067142857
0.018022077,-0.079295154
-0.078534031,
0.010054845,
0.096153846,
0.177568018
0.013818182
0.002402883
 
It seemed that R could cope with empty columns - as long as there was a ","
to indicate that there was indeed a column, but it could NOT cope with a
column that didn't exist (because there was no ",").  The problem was that
Excel, which was generating the CSV file, wasn't putting "," to indicate
empty columns in certain circumstances.  The solution was to fill the empty
cells in Excel with "na" before saving as CSV.  Excel then saves it
correctly, and R deals with it correctly.
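
An alternative to editing the spreadsheet is sketched below, under the
assumption that the only problem is the missing trailing commas. The text
reuses the rows quoted above; csv_text and ragged are hypothetical names.
read.table has a fill argument that pads short rows with NA, and count.fields
reports the number of fields per line, which can help locate the line named
in the error message.

```r
# The tail of the file as quoted above: the last rows lack the trailing comma.
csv_text <- "-0.48013245,0.095652174
-0.039344262,-0.067142857
0.018022077,-0.079295154
-0.078534031,
0.010054845,
0.096153846,
0.177568018
0.013818182
0.002402883"

# Number of fields on each line, to spot where the row length changes.
con <- textConnection(csv_text)
print(count.fields(con, sep = ","))
close(con)

# fill = TRUE pads short rows with NA instead of stopping with an error.
ragged <- read.table(text = csv_text, sep = ",", header = FALSE, fill = TRUE)
print(ragged)
```

With a real file, the text argument would be replaced by the file path; note
that read.table decides the number of columns from the first few lines, so
this relies on the early rows being complete.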

The final code (though without the y-axis formatting being fixed) is:

testdata <- c("C:\\Files\\R\\Sample R code\\Post trial data.csv")
new_data <- read.table(testdata, skip = 0, sep = ",", na.strings =
"na", header = TRUE)
x11(width=16, height=7, pointsize=14)
boxplot(new_data, outline = FALSE, col = c("lightblue", "salmon"), las = 1,
boxwex = 0.5) 
legend("top", c("Label for blue boxes","Label for red boxes"), cex=1.5,
lty=1:2, fill=c("lightblue", "salmon"), bty="n");
title(main="Chart title text", cex.main = 1.8)
grid()  

Guy



[R] Problems with Boxplot

2009-09-02 Thread gug

Hello,

I have been having difficulty getting boxplot to give the output I want -
probably a result of the way I have been handling the data.

The data is arranged in columns: each date has two sets of data.  The number
of data points varies with the date, so each column is of different length. 
I want to get a series of boxplots with the date along the x-axis, with
alternating colors, so that it is easy to see the difference between the
results within each date, as well as across dates.

testdata <- c("C:\\Files\\R\\Sample R code\\Post trial data.csv")
data_headings <- read.table(testdata, skip = 0, sep = ",", header =
FALSE)[1,]
my_data <- read.table(testdata, skip = 1, sep = ",", na.strings =
"na", header = FALSE)
boxplot(my_data*100, names = data_headings, outline = FALSE, range = 0.3,
border = c(2,4))

The result is a boxplot, but it does not show the date along the bottom (the
names = data_headings bit achieves nothing).  I can alternatively try
this:

new_data <- read.table(testdata, skip = 0, sep = ",", na.strings =
"na", header = TRUE)
boxplot(new_data, outline = FALSE, range = 0.3, border = c(2,4))

This takes all the data and plots it, but I then lose the ability to
multiply by 100 (I'm trying to show percentages: e.g. 10% as 10, rather
than as 0.1).
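
A minimal sketch of how the header-based read and the *100 step might be
combined is given below. The inline text is a made-up stand-in for the CSV,
and the column headers are hypothetical. It relies on two things: arithmetic
on an all-numeric data frame keeps the column names, and check.names = FALSE
stops read.table from rewriting headers that are not valid R names.

```r
# Made-up stand-in for the CSV: a header row of dates, values as fractions.
csv_text <- "2009-01 s1,2009-01 s2,2009-02 s1,2009-02 s2
0.10,0.02,-0.05,0.07
0.03,-0.01,0.12,na
-0.04,0.06,na,na"

# check.names = FALSE keeps the headers exactly as written in the file.
new_data <- read.table(text = csv_text, sep = ",", header = TRUE,
                       na.strings = "na", check.names = FALSE)

pct <- new_data * 100   # column names survive the multiplication

boxplot(pct, outline = FALSE, border = c(2, 4), las = 2)
```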

1) My first question is: is there a simple way of getting both dates along
the x-axis and the *100 calculation (or percentages)?

2) Next is how can I put a legend somewhere to show that red is data set 1
and blue is data set 2.

3) Is it possible to get the date to straddle across each of the two dates
it covers: as it is, one tick has the date, the other does not.

4) Is it possible to show both the median and the mean with boxplot?

5) Finally, the code works as described above (i.e. up to a point) with the
"Post trial data.csv" file I have posted.  However when I try with a larger
file ("Larger trial.csv", also posted), I get the message: "Error in
scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :  line
145 did not have 50 elements" when I get to the "data_headings" line.  I
have no idea why R is seeing a difference between these two files.

http://www.nabble.com/file/p25256461/Post%2Btrial%2Bdata.csv Post+trial+data.csv
http://www.nabble.com/file/p25256461/Larger%2Btrial.csv Larger+trial.csv
Thanks for any suggestions,

Guy Green
 

-- 
View this message in context: 
http://www.nabble.com/Problems-with-Boxplot-tp25256461p25256461.html