Re: [R] ggplot legend consolidation

2007-09-10 Thread hadley wickham
Sorry, I should have mentioned that get_legends won't work on the plots
that you are actually plotting - you have turned their legends off!
You'll need a plot which isn't plotted, but is used to produce the
legends.

Hadley

On 9/10/07, Te, Kaom <[EMAIL PROTECTED]> wrote:
> Hi Hadley,
>
> I just tried out your suggestion, but it does not look like the
> get_legends function is working correctly. Instead of returning a grob
> back to me it returns NULL.
>
> Here is my modified code and the results of running it.
>
> Any help would be appreciated. I believe that once I can get the legend
> in grob form then I can figure out how to deconstruct it myself.
>
> Thanks,
> Kaom
>
> > p.legend <- get_legends(p)
> > grid.draw(p.legend)
> Error in grid.draw(p.legend) : no applicable method for "grid.draw"
> > p.legend
> NULL
> >
>
>  BEGIN CODE
> ## Obtained from http://pastie.textmate.org/95755
> get_legends <- function(plot) {
>   if (length(plot$layers) == 0) stop("No layers to plot", call.=FALSE)
>
>   # Apply function to layer and matching data
>   dlapply <- function(f) mapply(f, data, layers, SIMPLIFY=FALSE)
>
>   plot <- plot_clone(plot)
>   layers <- plot$layers
>   scales <- plot$scales
>   facet <- plot$facet
>
>   cs <- plot$coordinates
>
>   # Evaluate aesthetics
>   data <- lapply(layers, function(x) x$make_aesthetics(plot))
>
>   # Facet
>   data <- mapply(function(d, p) facet$stamp_data(d), data, layers,
> SIMPLIFY=FALSE)
>   # Transform scales where possible.  Also need to train so statistics
>   # (e.g. stat_smooth) have access to info
>   data <- dlapply(function(d, p) p$scales_transform(d, scales))
>   dlapply(function(d, p) p$scales_train(d, scales))
>
>   # Apply statistics
>   data <- dlapply(function(d, p) p$calc_statistics(d, scales))
>   data <- dlapply(function(d, p) p$map_statistics(d, plot))
>
>   # Adjust position before scaling
>   data <- dlapply(function(d, p) p$adjust_position(d, scales, "before"))
>   # Transform, train and map scales
>   # data <- dlapply(function(d, p) p$scales_transform(d, scales))
>   dlapply(function(d, p) p$scales_train(d, scales, adjust=TRUE))
>   data <- dlapply(function(d, p) p$scales_map(d, scales))
>
>   # Adjust position after scaling
>   data <- dlapply(function(d, p) p$adjust_position(d, scales, "after"))
>   scales <- scales$minus(plot$scales$get_scales(c("x", "y", "z")))
>
>   legends(scales, FALSE)
>
> }
>
>
> library(ggplot2)
> data(mtcars)
>
> grid.newpage()
>
> hide_colour <- scale_colour_continuous()
> hide_colour$legend <- FALSE
>
> pushViewport(viewport(layout = grid.layout(2, 2)))
>
> p <- ggplot(data = mtcars) +
>   geom_point(mapping = aes(x = hp, y = mpg, colour = cyl)) +
>   hide_colour
>
> pushViewport(viewport(layout.pos.col = 1,
>   layout.pos.row = 1))
>
> print(p, vp = current.viewport())
> upViewport()
>
> p <- ggplot(data = mtcars) +
>   geom_point(mapping = aes(x = drat, y = disp, colour = cyl)) +
>   hide_colour
>
>
> pushViewport(viewport(layout.pos.col = 2,
>   layout.pos.row = 1))
>
> print(p, vp = current.viewport())
> upViewport()
>
> p <- ggplot(data = mtcars) +
>   geom_point(mapping = aes(x = qsec, y = mpg, colour = cyl)) +
>   hide_colour
>
> pushViewport(viewport(layout.pos.col = 1,
>   layout.pos.row = 2))
>
> print(p, vp = current.viewport())
> upViewport()
>
> pushViewport(viewport(layout.pos.col = 2,
>   layout.pos.row = 2))
> grid.rect()
>
> p.legend <- get_legends(p)
> grid.draw(p.legend)
> --END CODE
>
>
>
>
> -Original Message-
> From: hadley wickham [mailto:[EMAIL PROTECTED]
> Sent: Monday, September 10, 2007 7:58 AM
> To: Te, Kaom
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] ggplot legend consolidation
>
> > I have recently been introduced to the ggplot package by Hadley
> > Wickham and must say I am quite impressed so far at how easy it is to
> > make attractive plots, but one thing I am struggling over is how to
> > consolidate legends.
>
> It's not currently possible to consolidate them (although in the distant
> future that would be something nice to have), but you can turn them off:
>
> hide_colour <- scale_colour_continuous()
> hide_colour$legend <- FALSE
>
> p <- ggplot(data = mtcars) +
>   geom_point(mapping = aes(x = h

Re: [R] ggplot legend consolidation

2007-09-10 Thread hadley wickham
> I have recently been introduced to the ggplot package by Hadley Wickham
> and must say I am quite impressed so far at how easy it is to make
> attractive plots, but one thing I am struggling over is how to
> consolidate legends.

It's not currently possible to consolidate them (although in the
distant future that would be something nice to have), but you can turn
them off:

hide_colour <- scale_colour_continuous()
hide_colour$legend <- FALSE

p <- ggplot(data = mtcars) +
  geom_point(mapping = aes(x = hp, y = mpg, colour = cyl)) +
  hide_colour

You'll also need to twiddle your viewports a little so that you still
have space for the legend, since space will not be allocated
automatically anymore.
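
As a rough sketch of that viewport adjustment, using only grid: the
layout below reserves a narrow third column for a legend next to a
2 x 2 block of plots. The relative width of 0.4 and the placeholder
rectangles and labels are assumptions for illustration; the real plots
and the legend grob would be drawn in their place.

```r
library(grid)

grid.newpage()

# Two equal plot columns plus a narrower column reserved for the legend.
lay <- grid.layout(nrow = 2, ncol = 3,
                   widths = unit(c(1, 1, 0.4), "null"))
pushViewport(viewport(layout = lay))

# Placeholder for each of the four plot cells.
for (i in 1:2) {
  for (j in 1:2) {
    pushViewport(viewport(layout.pos.row = i, layout.pos.col = j))
    grid.rect(gp = gpar(col = "grey"))
    grid.text(sprintf("plot %d,%d", i, j))
    upViewport()
  }
}

# The legend column spans both rows because no row is given.
pushViewport(viewport(layout.pos.col = 3))
grid.rect(gp = gpar(col = "grey", lty = "dashed"))
grid.text("legend", rot = 90)
upViewport()

upViewport()  # back out of the layout viewport
```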

The next thing is to extract the grob for the legend itself - this is
a little trickier, because there's currently no way to get at the
scales after they have been "trained" with the
data.  Load get_legends from http://pastie.textmate.org/95755, and
then you can do:

grid.newpage(); grid.draw(get_legends(p))

If you're not familiar enough with grid to stitch all of these pieces
together, please let me know, but this should be enough to get you
started.

Hadley

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] stacking data frames with different variables

2007-09-09 Thread hadley wickham
Have a look at rbind.fill in the reshape package.
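
To show the idea on the example from the question, here is a minimal
base R sketch of the same behaviour: pad each data frame with NA
columns for the variables it lacks, then stack. The helper name
stack_fill is hypothetical and this is not how rbind.fill itself is
implemented.

```r
# Hypothetical helper: stack data frames, filling missing columns with NA.
stack_fill <- function(...) {
  dfs <- list(...)
  # Union of all column names, in order of first appearance.
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(d) {
    for (col in setdiff(all_cols, names(d))) {
      d[[col]] <- rep(NA, nrow(d))
    }
    # Put the columns in a common order before stacking.
    d[, all_cols, drop = FALSE]
  })
  do.call(rbind, padded)
}

top    <- data.frame(x = c(1, 2), y = c(1, 2))
bottom <- data.frame(x = c(3, 4))

both <- stack_fill(top, bottom)
both
```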

Hadley

On 9/9/07, Muenchen, Robert A (Bob) <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> If I need to stack two data frames, I can use rbind, but it requires
> that all variables exist in both sets. I can make that happen, but other
> stat packages would figure out where the differences were, add the
> missing variables to each, set their values to missing and stack them.
> Is there a more automatic way to do that in R?
>
> Below is an example program.
>
> Thanks,
> Bob
>
> # Top data frame has two variables.
> x <- c(1,2)
> y <- c(1,2)
>
> top <- data.frame(x,y)
> top
>
> # Bottom data frame has only one of them.
> x <- c(3,4)
> bottom <- data.frame(x)
> bottom
>
> # So rbind won't work.
> rbind(top, bottom)
>
> # After figuring out where the mismatches are I can
> # make the two DFs the same manually.
> bottom <- data.frame( bottom, y=NA)
> bottom
>
> # Now I get the desired result.
> both <- rbind(top,bottom)
> both
>
> =
> Bob Muenchen (pronounced Min'-chen), Manager
> Statistical Consulting Center
> U of TN Office of Information Technology
> 200 Stokely Management Center, Knoxville, TN 37996-0520
> Voice: (865) 974-5230
> FAX: (865) 974-4810
> Email: [EMAIL PROTECTED]
> Web: http://oit.utk.edu/scc,
> News: http://listserv.utk.edu/archives/statnews.html
>
>


-- 
http://had.co.nz/



Re: [R] computing distance in miles or km between 2 street addre

2007-09-06 Thread hadley wickham
On 9/6/07, Ted Harding <[EMAIL PROTECTED]> wrote:
> On 06-Sep-07 18:42:32, Philip James Smith wrote:
> > Hi R-ers:
> >
> > I need to compute the distance between 2 street addresses in
> > either km or miles. I do not care if the distance is a "shortest
> > driving route" or if it is "as the crow flies."
> >
> > Does anybody know how to do this? Can it be done in R? I have
> > thousands of addresses, so I think that Mapquest is out of the
> > question!
> >
> > Please reply to: [EMAIL PROTECTED]
> >
> > Thank you!
> > Phil Smith
>
> That's a somewhat ill-posed question! You will for a start
> need a database of some kind, either of geographical locations
> (coordinates) of street addresses, or of the metric of the
> road network with capability to identify the street addresses
> in the database.

I think it's perfectly well-posed, but you'll need to use some
resources outside of R.

Converting street addresses to lat/long values is called
geocoding, and Wikipedia has a good introduction:
http://en.wikipedia.org/wiki/Geocoding.  There are a couple of free
geocoding web services for US addresses: e.g.
http://developer.yahoo.com/maps/rest/V1/geocode.html or
http://geocoder.us/.  For non-US addresses you'll have to google
around (e.g. international geocoding), and depending on the country
and level of detail you require, you may have to pay.

Once you've got lats and longs, computing great circle distance is
easy.  Computing driving distances is harder, but may be possible with
(e.g.) the google maps API -
http://www.google.com/apis/maps/documentation/index.html#Driving_Directions.
Googling for "driving directions api" suggests some other
possibilities.
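
For the "as the crow flies" case, a minimal sketch of the great circle
calculation using the haversine formula is below. The function name
haversine_km and the mean Earth radius of 6371 km are assumptions for
illustration; inputs are in decimal degrees, and the function is
vectorised so it can be applied to thousands of coordinate pairs at
once.

```r
# Great circle distance in km between points given in decimal degrees.
# Assumes a spherical Earth with mean radius 6371 km.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1.
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# A quarter of the way around the equator.
quarter <- haversine_km(0, 0, 0, 90)
quarter
```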

Hadley



Re: [R] Lattice: panel superpose with groups

2007-09-04 Thread hadley wickham
Hi Michael,

It's not lattice, but you can easily do this with ggplot2:

install.packages("ggplot2")
library(ggplot2)
qplot(year, yvar, data=df, facets = . ~ week, colour=factor(temp),
geom="line") +
stat_summary(aes(group=1), geom="line", fun="mean", size=2)

You don't (currently) get the nice tabular layout of the panels that
lattice gives you, though.  You can find out more about ggplot2 at
http://had.co.nz/ggplot2
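
If what is wanted is the mean for each temp group within each week
panel, one possible approach is to precompute it with base R's ave()
and draw that column as an extra line per group. This is only a sketch
on a small made-up data frame with the same column names; the column
name grp_mean is an assumption.

```r
# Small made-up data set with the same column names as the example.
df <- expand.grid(year = 2000:2002, week = 1:2, temp = c(5, 7, 9))
df$yvar <- seq_len(nrow(df))

# For every row, ave() returns the mean of yvar within its
# week x temp cell, so each group in each panel gets one flat value.
df$grp_mean <- ave(df$yvar, df$week, df$temp)

head(df)
```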

Hadley

On 9/4/07, Folkes, Michael <[EMAIL PROTECTED]> wrote:
> The example code below allows the plotting of three different groups per 
> panel.  I can't fathom how to write the panel function to add an additional 
> line for each group, which in this case is just the mean Y value for each 
> group within  each panel.  (i.e. there'd be six lines per panel.)
> Spent all day working on it and searching the archives to no avail!  Yikes.
> Any help would be greatly appreciated!
> Michael Folkes
>
> #
> #This builds fake dataset
>
> years<-2000:2006
> weeks<-1:20
> yr<-rep(years,rep(length(weeks)*6,length(years)))
> wk<-rep(weeks,rep(6,length(weeks)))
> temp<-rep(4:9,length(years)*length(weeks))
> yvar<-round(rnorm(length(years)*length(weeks)*6,mean=30,sd=4),0)
> xvar<-(rnorm(length(years)*length(weeks)*6)+5)/10
>
> df<-data.frame(year=yr,week=wk,temp=temp,   yvar=yvar,  xvar=xvar)
> #
>
> library(lattice)
> df$year2<-as.factor(df$year)
> df$week2<-as.factor(df$week)
> df<-df[df$temp %in% c(5,7,9),]
> xyplot(yvar~year|week2,data=df,layout = c(4, 5), as.table=TRUE,
> type='l',
> groups=temp ,
>   panel = function(x, y,groups, ...) {
> panel.superpose(x,y,groups,...)
> panel.xyplot(x,rep(mean(y),length(x)),type='l',lty=3) #<- only generates the panel mean
>   }
> )
>
> ___
> Michael Folkes
> Salmon Stock Assessment
> Canadian Dept. of Fisheries & Oceans
> Pacific Biological Station
> 3190 Hammond Bay Rd.
> Nanaimo, B.C., Canada
> V9T-6N7
> Ph (250) 756-7264 Fax (250) 756-7053  [EMAIL PROTECTED]
>
>
>


-- 
http://had.co.nz/



[R] UseR! 2007 presentations and posters - now available

2007-09-04 Thread hadley wickham
Hi everyone,

Many of the presentations and posters from UseR! 2007 are now available online:
http://user2007.org/program/

If you presented and your slides or poster isn't up yet, please email
a pdf version to me, [EMAIL PROTECTED], and I'll put it up.

Regards,

Hadley

(And check out http://user2007.org/ for some photos of the event and the R cake)


-- 
http://had.co.nz/



Re: [R] Legend issue with ggplot2

2007-09-03 Thread hadley wickham
Yes - all this stuff is currently rather undocumented.  Hopefully that
will change in the near future!

Hadley

On 9/3/07, ONKELINX, Thierry <[EMAIL PROTECTED]> wrote:
> Thanks Hadley,
>
> I've been struggling with this all afternoon. But now it's working
> again. Since I'm using it in a script, the few extra lines don't bother
> me that much.
>
> Thierry
>
> 
> 
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature
> and Forest
> Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
> methodology and quality assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
> tel. + 32 54/436 185
> [EMAIL PROTECTED]
> www.inbo.be
>
> Do not put your faith in what statistics say until you have carefully
> considered what they do not say.  ~William W. Watt
> A statistical analysis, properly conducted, is a delicate dissection of
> uncertainties, a surgery of suppositions. ~M.J.Moroney
>
>
>
> > -Oorspronkelijk bericht-
> > Van: hadley wickham [mailto:[EMAIL PROTECTED]
> > Verzonden: maandag 3 september 2007 15:15
> > Aan: ONKELINX, Thierry
> > CC: r-help@stat.math.ethz.ch
> > Onderwerp: Re: [R] Legend issue with ggplot2
> >
> > On 9/3/07, ONKELINX, Thierry <[EMAIL PROTECTED]> wrote:
> > > Dear useRs,
> > >
> > > I'm struggling with the new version of ggplot2. In the previous
> > > version I did something like this. But now this yields an error
> > > (object "fill" not found).
> > >
> > > library(ggplot2)
> > > dummy <- data.frame(x = rep(1:10, 4), group = gl(4, 10))
> > > dummy$y <- dummy$x * rnorm(4)[dummy$group] + 5 * rnorm(4)[dummy$group]
> > > dummy$min <- dummy$y - 5
> > > dummy$max <- dummy$y + 5
> > > ggplot(data = dummy, aes(x = x, max = max, min = min, fill = group)) +
> > > geom_ribbon() + geom_line(aes(y = max, colour = fill)) +
> > > geom_line(aes(y = min, colour = fill))
> >
> > Strange - I'm not sure why that ever worked.
> >
> > > When I adjust the code to the line below, it works again. But this
> > > time with two legend keys for "group". Any idea how to display only
> > > one legend key for group? The ggplot code above yielded only one
> > > legend key.
> > >
> > > ggplot(data = dummy, aes(x = x, max = max, min = min, colour = group,
> > > fill = group)) + geom_ribbon() + geom_line(aes(y = max)) +
> > > geom_line(aes(y = min))
> >
> > You can manually turn off one of the legends:
> >
> > sc <- scale_colour_discrete()
> > sc$legend <- FALSE
> > .last_plot + sc
> >
> > It's not very convenient though, so I'll think about how to
> > do this automatically.  The legends need to be more
> > intelligent about only displaying the minimum necessary.
> >
> > Hadley
> >
>


-- 
http://had.co.nz/



Re: [R] plotting predicted curves with log scale in lattice

2007-09-03 Thread hadley wickham
Hi Ken,

Alternatively, you could use ggplot2:

install.packages("ggplot2")
library(ggplot2)

qplot(LL, RR, data=ds1, facets = . ~ FF) + geom_line(data=ds2) + scale_x_log10()

It is very hard to get transformed scales working correctly, and it's
something I had to spend a lot of time on in between ggplot 1 and 2.

Hadley


On 9/3/07, Ken Knoblauch <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I was taken off guard by the following behavior in a lattice plot.
> I frequently want to add a predicted curve defined at more
> points than in the formula expression of xyplot.  There have
> been numerous examples of how to do this on r-help, but I
> still often struggle to make this work.  I just realized that
> specifying one of the axes on a log scale does not guarantee
> that the added data for a curve will automatically take that
> into account.  I don't know if this should be called a bug,
> I haven't picked up an indication that would lead me to
> expect this in the documentation.  I admit that if I had a
> deeper understanding of lattice and/or grid, it might be
> clearer why...  Here is a toy example illustrating the behavior
> (there may be a more efficient way to do this),
>
> ds1 <- data.frame( RR = rep(seq(0, 1, len = 5)^2, 2) +
> rnorm(10, sd = 0.1),
>LL = rep(10^seq(1, 5), 2),
>FF = factor(rep(letters[1:2], each = 5))
>)
> ds2 <- data.frame(RR = rep(seq(0, 1, len = 20)^2, 2),
>   LL = rep(10^seq(1, 5, len = 20), 2),
>FF = factor(rep(letters[1:2], each = 20))
>)
> library(lattice)
> xyplot(RR ~ LL | FF, ds1,
> scales = list(x = list(log = TRUE)),
> aspect = "xy",
> subscripts = TRUE,
> ID = ds2$FF,
> panel = function(x, y, subscripts, ID, ...) {
> w <- unique(ds1$FF[subscripts])
> llines(log10(ds2$LL[ID == w]), ds2$RR[ID == w], ...)
> panel.xyplot(x, y, ...)
> }
> )
>
> Note that the x-variable of llines must be logged to plot the correct values
> and so the scales argument seems to apply only to the x, y arguments
> passed to the panel function.
>
> Thank you.
>
> best,
>
> Ken
>
>
> --
> Ken Knoblauch
> Inserm U846
> Institut Cellule Souche et Cerveau
> Département Neurosciences Intégratives
> 18 avenue du Doyen Lépine
> 69500 Bron
> France
> tel: +33 (0)4 72 91 34 77
> fax: +33 (0)4 72 91 34 61
> portable: +33 (0)6 84 10 64 10
> http://www.lyon.inserm.fr/846/english.html
>
>


-- 
http://had.co.nz/



Re: [R] Legend issue with ggplot2

2007-09-03 Thread hadley wickham
On 9/3/07, ONKELINX, Thierry <[EMAIL PROTECTED]> wrote:
> Dear useRs,
>
> I'm struggling with the new version of ggplot2. In the previous version
> I did something like this. But now this yields an error (object "fill"
> not found).
>
> library(ggplot2)
> dummy <- data.frame(x = rep(1:10, 4), group = gl(4, 10))
> dummy$y <- dummy$x * rnorm(4)[dummy$group] + 5 * rnorm(4)[dummy$group]
> dummy$min <- dummy$y - 5
> dummy$max <- dummy$y + 5
> ggplot(data = dummy, aes(x = x, max = max, min = min, fill = group)) +
> geom_ribbon() + geom_line(aes(y = max, colour = fill)) +
> geom_line(aes(y = min, colour = fill))

Strange - I'm not sure why that ever worked.

> When I adjust the code to the line below, it works again. But this time
> with two legend keys for "group". Any idea how to display only one
> legend key for group? The ggplot code above yielded only one legend key.
>
> ggplot(data = dummy, aes(x = x, max = max, min = min, colour = group,
> fill = group)) + geom_ribbon() + geom_line(aes(y = max)) +
> geom_line(aes(y = min))

You can manually turn off one of the legends:

sc <- scale_colour_discrete()
sc$legend <- FALSE
.last_plot + sc

It's not very convenient though, so I'll think about how to do this
automatically.  The legends need to be more intelligent about only
displaying the minimum necessary.
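
For readers on a recent ggplot2 release, a sketch of the same idea is
below. It assumes ggplot2 3.3.4 or later, where guides() accepts
"none"; in that API geom_ribbon takes ymin/ymax rather than min/max.
The seed and the alpha value are arbitrary choices for illustration,
and none of this applies to the 0.5.x version discussed in this thread.

```r
library(ggplot2)

set.seed(1)
dummy <- data.frame(x = rep(1:10, 4), group = gl(4, 10))
dummy$y <- dummy$x * rnorm(4)[dummy$group] + 5 * rnorm(4)[dummy$group]

p <- ggplot(dummy, aes(x = x, fill = group, colour = group)) +
  geom_ribbon(aes(ymin = y - 5, ymax = y + 5), alpha = 0.3) +
  geom_line(aes(y = y)) +
  # Suppress the colour legend so only the fill legend is shown.
  guides(colour = "none")

print(p)
```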

Hadley



[R] [R-pkgs] ggplot2 - version 0.5.5

2007-09-02 Thread hadley wickham
ggplot2
===

ggplot2 is a plotting system for R, based on the grammar of graphics,
which tries to take the good parts of base and lattice graphics and
avoid the bad parts. It takes care of many of the fiddly details
that make plotting a hassle (like drawing legends) as well as
providing a powerful model of graphics that makes it easy to produce
complex multi-layered graphics.

Find out more at http://had.co.nz/ggplot2, and check out the over 500
examples of ggplot in use.


Changes in version 0.5.5


Improvements:
* ggplot now gives rather more helpful errors if you have
misspecified a variable name in the aesthetic mapping
* changed default hline and vline intercepts to 0
* added "count" output variable from stat_density for creating
stacked/conditional density plots
* added parameters to geom_boxplot to control appearance of outlying 
points
* overriding aesthetics with fixed values that have already been set
with aesthetics now actually works
* slightly better names for xaxis and yaxis grobs
* added aes_string function to make it easier to construct
aesthetic mapping specifications in functions
* continuous scales now have labels argument so that you can manually
specify labels if desired
* stat_density now calculates densities on a common grid across
groups.  This means that position_fill and position_stack now work
properly
* if numeric, legend labels right aligned
* polar coordinates much improved, and with better examples

Documentation:
* fixed argument documentation for qplot
* added (very) rudimentary documentation about what functions return
* documentation now lists extra variables created by statistics

Bug fixes:
* coord_flip now works with segment and all interval geoms
* geom_errorbar now works in all coordinate systems
* derived y axes (eg. on histogram) are now labelled correctly
* fixed bug in stat_quantile caused by new output format from predict.rq
* fixed bug if x or y are constant
* fixed bug in histogram where sometimes lowest bar was omitted
* fixed bug in stat_qq which prevented setting aesthetics
* fixed bug in qplot(..., geom="density", position="identity")
* fixed stat_qq so that unnecessary arguments are no longer passed to
the distribution function

Subtractions:
* removed grid argument from ggsave, replaced by ggtheme(theme_bw)
* removed add argument from qplot


Regards,

Hadley

-- 
http://had.co.nz/

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Assigning line colors in xyplot

2007-08-31 Thread hadley wickham
On 8/31/07, Christof Bigler <[EMAIL PROTECTED]> wrote:
> The suggestions by Deepayan Sarkar and Hadley Wickham work for that
> case, but I run into trouble when I try to draw e.g. a panel for "A"
> and "B":
>
> xyplot(y ~ x | f , groups=g, data=tmp,type="l",
>   par.settings=list(superpose.line=list(col=c("red","blue"))),
>   auto.key=list(space="top",
> text=levels(tmp$f),points=FALSE,lines=TRUE))

In ggplot, you would need to specify the grouping variable as well as
the colour variable:

qplot(x, y, data=tmp, geom="line", colour=f, group=interaction(f,g))

which should work for any facetting.
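
To see why both are needed: colour only distinguishes "A" from "B",
while the group has to identify each individual line. A small base R
sketch of what interaction() produces, on a made-up data frame with the
same column names (line_id and line_col are names assumed for
illustration):

```r
# Two individuals in group "A" and two in group "B", two rows each.
tmp <- data.frame(
  f = rep(c("A", "B"), each = 4),
  g = rep(c("1", "2", "3", "4"), each = 2)
)

# One level per f/g combination that actually occurs, i.e. one per line.
line_id <- interaction(tmp$f, tmp$g, drop = TRUE)
levels(line_id)

# Colour depends on f only, so all "A" lines share one colour.
line_col <- c(A = "red", B = "blue")[as.character(tmp$f)]
```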

Hadley



Re: [R] Assigning line colors in xyplot

2007-08-30 Thread hadley wickham
Hi Christof,

You can do this in ggplot, with one exception:

install.packages("ggplot2")
library(ggplot2)
qplot(x, y, data=tmp, facets = . ~ g, geom="line", colour=f)

Unfortunately I don't yet have an implementation of facetting that
works like lattice, wrapping the line of plots into two dimensions.

You can find out more about ggplot2 at http://had.co.nz/ggplot2

Hadley

On 8/30/07, Christof Bigler <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have a dataframe containing data from individuals 1, ..., 12 (grouping
> variable "g" in the data frame below), which belong either to "A" or "B"
> (grouping variable "f"):
>
> set.seed(1)
>
> tmp <- data.frame(
>
> y=c(rnorm(10,0,1),rnorm(10,4,2),rnorm(10,0,1),rnorm(10,4,2),rnorm(10,0,1),rnorm(10,4,2),rnorm(10,0,1),rnorm(10,4,2),rnorm(10,0,1),rnorm(10,4,2),rnorm(10,0,1),rnorm(10,4,2)),
>   x=1:10,
>
> f=c(rep("A",10),rep("B",10),rep("A",10),rep("B",10),rep("A",10),rep("B",10),rep("A",10),rep("B",10),rep("A",10),rep("B",10),rep("A",10),rep("B",10)),
>
> g=c(rep("3",10),rep("2",10),rep("1",10),rep("4",10),rep("5",10),rep("6",10),rep("8",10),rep("7",10),rep("9",10),rep("11",10),rep("12",10),rep("10",10)))
>
>
> I would like to draw line plots using the function xyplot:
>
> library(lattice)
>
> xyplot(y ~ x | g , groups=g, data=tmp,type="l",
>   par.settings=list(superpose.line=list(col=c("red","blue"))),
>   auto.key=list(space="top",
> text=levels(tmp$f),points=FALSE,lines=TRUE))
>
> As it is, the colors are recycled alternately in the order the
> individuals appear in the plot (1, 10, 11, 12, 2, ..., 9).
>
> How can I assign the red color to all individuals of group "A" and the
> blue color to all individuals of group "B"?
>
> Thanks in advance!
>
> Christof
>
>


-- 
http://had.co.nz/



Re: [R] fill circles

2007-08-26 Thread hadley wickham
Hi Christian,

You could use the ggplot2 package (http://had.co.nz/ggplot2) which
takes care of many of these details for you:

qplot(Height, Volume, data=trees, size = Girth)
qplot(Height, Volume, data=trees, size = Girth, colour=Height)

You can find more details on the website about how to customise the
scales for your needs, but the defaults should be pretty good.
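
If you would rather stay with base graphics, the manual colour
assignment can be avoided by binning Height and indexing into a
palette. This is only a sketch: the ten bins and the green/yellow/red
ramp are assumptions taken from the description in the question, and
it does not draw a legend.

```r
# Ten colours running from green through yellow to red.
pal <- colorRampPalette(c("green", "yellow", "red"))(10)

# Integer bin (1..10) for each tree, based on equal-width Height bins.
bins <- cut(trees$Height, breaks = 10, labels = FALSE)
cols <- pal[bins]

# Same circles as in the question, now filled according to Height.
symbols(trees$Height, trees$Volume, circles = trees$Girth / 12,
        bg = cols, fg = "grey", inches = FALSE,
        xlab = "Height", ylab = "Volume")
```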

Hadley

On 8/25/07, Cristian cristian <[EMAIL PROTECTED]> wrote:
> Hi all,
> I'm an R newbie,
> I did this script to create a scatterplot using the "tree" matrix from
> "datasets" package:
>
> library('datasets')
> with(trees,
> {
> plot(Height, Volume, pch=3, xlab="Height", ylab="Volume")
> symbols(Height, Volume, circles=Girth/12, fg="grey", inches=FALSE,
> add=FALSE)
> }
> )
>
> I'd like to use the column Named "Height" to fill the circles with colors
> (ex.: the small numbers in green then yellow and the high numbers in red).
> I'd like to have a legend for the size and the colors too.
> I did it manually using a script like that:
> color[(x>=0.001)&(x<0.002)]<-"#41FF41"
> color[(x>=0.002)&(x<0.003)]<-"#2BFF2B"
> color[(x>=0.003)&(x<0.004)]<-"#09FF09"
> color[(x>=0.004)&(x<0.005)]<-"#00FE00"
> color[(x>=0.005)&(x<0.006)]<-"#00F700"
> color[(x>=0.006)&(x<0.007)]<-"#00E400"
> color[(x>=0.007)&(x<0.008)]<-"#00D600"
> color[(x>=0.008)&(x<0.009)]<-"#00C300"
> and so on, but I don't like doing it manually... do you know a solution?
> Thank you very much
> chris
>
>


-- 
http://had.co.nz/



Re: [R] perception of graphical data

2007-08-24 Thread hadley wickham
Hi Richard,

> I apologize that this is off-topic.  I am seeking information on
> perception of graphical data, in an effort to improve the plots I
> produce.  Would anyone point me to literature reviews in this area?  (Or
> keywords to try on google?)  Is this located somewhere near cognitive
> science, psychology, human factors research?

Probably the best place to start on these general issues is a couple
of papers by Cleveland:

@article{cleveland:1987,
Author = {Cleveland, William and McGill, Robert},
Journal = {Journal of the Royal Statistical Society. Series A
(General)},
Number = {3},
Pages = {192-229},
Title = {Graphical Perception: The Visual Decoding of Quantitative
Information on Graphical Displays of Data},
Volume = {150},
Year = {1987}}

@article{cleveland:1984,
Author = {Cleveland, William S. and McGill, Robert},
Journal = {Journal of the American Statistical Association},
Number = 387,
Pages = {531-554},
Title = {Graphical Perception: Theory, Experimentation and
Application to the Development of Graphical Methods.},
Volume = 79,
Year = 1984}

For colour in particular, I like Ross Ihaka's introduction to the subject:


@inproceedings{ihaka:2003,
Author = {Ihaka, Ross},
Booktitle = {Proceedings of the 3rd International Workshop on
Distributed Statistical Computing (DSC 2003)},
Title = {Colour for Presentation Graphics},
Year = {2003}}


and also see colorbrewer.org

> Scatter plots of microarray data often attempt to represent thousands or
> tens of thousands of points, but all I read from them are density and
> distribution --- the gene names cannot be shown.  At what point, would a
> sunflowerplot-like display or a smooth gradient be better?  When two
> data points drawn as 50% gray disks are small and tangent, are they
> perceptually equivalent to a single, 100% black disk?  Or a 50% gray
> disk with twice the area?  What problems are known about plotting with
> disks --- do viewers use the area or the diameter (or neither) to gauge
> weight?

I think many of these are still research topics.  Two (of many) places
to start are:


@article{huang:1997,
Author = {Huang, Chisheng and McDonald, John Alan and Stuetzle, Werner},
Journal = {Journal of Computational and Graphical Statistics},
Pages = {383--396},
Title = {Variable resolution bivariate plots},
Volume = {6},
Year = {1997}}

@article{carr:1987,
Author = {Carr, D. B. and Littlefield, R. J. and Nicholson, W. L. and
Littlefield, J. S.},
Journal = {Journal of the American Statistical Association},
Number = {398},
Pages = {424-436},
Title = {Scatterplot Matrix Techniques for Large N},
Volume = {82},
Year = {1987}}



> As you can tell, I'm a non-expert, mixing issues of data interpretation,
> visual perception, graphic representation.  Previously, I didn't have
> the flexibility of R's graphics, so I didn't need to think so much.
> I've read some of Edward S. Tufte's books, but found them more
> qualitative than quantitative.

More quantitative approaches are Cleveland's, Bertin's and Wilkinson's:


@book{cleveland:1993,
Author = {Cleveland, William},
Publisher = {Hobart Press},
Title = {Visualizing data},
Year = {1993}}

@book{cleveland:1994,
Author = {Cleveland, William},
Publisher = {Hobart Press},
Title = {The Elements of Graphing Data},
Year = {1994}}

@book{chambers:1983,
Author = {Chambers, John and Cleveland, William and Kleiner, Beat and
Tukey, Paul},
Publisher = {Wadsworth},
Title = {Graphical methods for data analysis},
Year = {1983}}


@book{bertin:1983,
Address = {Madison, WI},
Author = {Bertin, Jacques},
Publisher = {University of Wisconsin Press},
Title = {Semiology of Graphics},
Year = {1983}}


@book{wilkinson:2006,
Author = {Wilkinson, Leland},
Publisher = {Springer},
Series = {Statistics and Computing},
Title = {The Grammar of Graphics},
Year = {2005}}

Hope this gets you started!

Hadley


-- 
http://had.co.nz/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] uneven list to matrix

2007-08-24 Thread hadley wickham
On 8/23/07, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> Here are two solutions.  The first repeatedly uses merge and the
> second creates a zoo object from each alph component whose time
> index consists of the row labels and uses zoo's multiway merge to
> merge them.
>
> # test data
> m <- matrix(1:5, 5, dimnames = list(LETTERS[1:5], NULL))
> alph <- list(m[1:4,,drop=F], m[c(1,3,4),,drop=F], m[c(1,4,5),,drop=F])
> alph

Or using reshape:

cast(melt(alph), X1 ~ L1)
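
For comparison, here is a rough base R sketch of the same idea, using the
test data from the quoted message.  It is not how melt/cast work internally;
the names ids and wide are made up for illustration.

```r
# Test data from the quoted message
m <- matrix(1:5, 5, dimnames = list(LETTERS[1:5], NULL))
alph <- list(m[1:4, , drop = FALSE],
             m[c(1, 3, 4), , drop = FALSE],
             m[c(1, 4, 5), , drop = FALSE])

# Union of all row labels across the list components (illustrative name)
ids <- sort(unique(unlist(lapply(alph, rownames))))

# Look up each label in every component; labels a component lacks become NA
wide <- sapply(alph, function(a) a[match(ids, rownames(a)), 1])
rownames(wide) <- ids
wide
```

Each list component becomes one column, with NA wherever a row label is
missing from that component.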

Hadley



Re: [R] Need a variant of rbind for datasets with different numbers of columns

2007-08-23 Thread hadley wickham
You might try rbind.fill in the reshape package.
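
If you'd rather not depend on a package, the idea can be sketched in base R
roughly as below.  This is only an illustration of the technique, not
reshape's implementation: rbind_fill is a hypothetical helper and the codes
are made-up values.

```r
# Hypothetical helper: row-bind data frames whose columns differ,
# filling columns a data frame lacks with NA
rbind_fill <- function(...) {
  dfs <- list(...)
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(d) {
    for (nm in setdiff(all_cols, names(d))) d[[nm]] <- NA
    d[all_cols]                      # same column order everywhere
  })
  do.call(rbind, padded)
}

# Made-up example rows with differing numbers of code columns
a  <- data.frame(patient = "A", ICD91 = 250, ICD92 = 401, ICD93 = 486)
b  <- data.frame(patient = "B", ICD91 = 401, ICD92 = 486)
c3 <- data.frame(patient = "C", ICD91 = 250)

out <- rbind_fill(a, b, c3)
out
```

With 200,000+ people it will probably be much faster to collect the
one-row data frames in a list inside the loop and bind them once at the
end, rather than growing the result row by row.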

Hadley

On 8/22/07, Kirsten Beyer <[EMAIL PROTECTED]> wrote:
> Hello.  I am looking for a function that will allow me to paste rows
> together without regard for the numbers of columns in the datasets to
> be joined.  The only columns where it matters if they are aligned
> correctly are at the beginning - the rest of the columns represent
> differing numbers of ICD9 (disease) codes reported by each
> person(record) at a health visit.  They are in no particular order.
>
> For example, a result would look like this:
>
> patient     ICD91  ICD92  ICD93
> patient A   12345  6789   1543
> patient B   3469   9090
> patient C   1234
>
> I am trying to accomplish this inside a loop which first identifies
> the codes associated with the person and then joins them to the
> person.  I have the code working so that it can create a row for each
> person, but I can't figure out how to join these rows together!  FYI,
> my dataset has 200,000+ people.
>
> Thanks


-- 
http://had.co.nz/



Re: [R] recommended combo of apps for new user?

2007-08-19 Thread hadley wickham
On 8/18/07, John Kane <[EMAIL PROTECTED]> wrote:
> I'm just starting to get a grasp on how R works so
> don't take my words too seriously but have a look at
> http://addictedtor.free.fr/graphiques/ for some idea
> of what R can do for publication quality graphics.  It
> is always possible that you might need another
> graphics package as well but I think it unlikely.

I may be in the minority, but I really don't like the R graph gallery.
 To my eye it largely provides examples of what you _shouldn't_ do
with graphics (and also seems rather unloved at the moment, given the
large number of spam keywords).  It fails to provide examples of using
graphics to gain insight into your data and mainly focuses on drawing
pretty (ugly) pictures.

Unfortunately there aren't many better resources at the moment.
Deepayan Sarkar is working on a lattice book, and hopefully he will
make the plots available on his website as well.  I'm also working on
a book for my ggplot2 package (http://had.co.nz/ggplot2) but that
won't be finished until next year.  For interactive graphics, the
GGobi book (http://www.ggobi.org/book/) is very close to being
published, and provides details about the R-GGobi link as well as many
techniques for gaining insight into your data interactively.  Another
option is the Graphics of Large Datasets book (http://rosuda.org/gold/)
which provides a wider survey of the state of the art in interactive
graphics for large datasets.

Hadley



[R] [R-pkgs] 'fda' 1.2.2 is now available on CRAN.

2007-08-14 Thread hadley wickham
The fda package supports "Functional Data Analysis" and "Applied
Functional Data Analysis" by Bernard Silverman and James Ramsay.
Functional data analysis, which lots of us like to call "FDA", is
about the analysis of information on curves or functions. FDA is a
collection of statistical techniques for answering questions like, "What
are the main ways in which the curves vary from one to another?" In
fact, most of the questions and problems associated with multivariate
data (PCA, LDA, clustering, ...) have functional counterparts. More
information about FDA can be found at
http://www.psych.mcgill.ca/misc/fda/.

This version (and the previous 1.2.1) includes bug fixes plus a
"scripts" subdirectory with R code to reproduce some of the analyses
in the two functional data analysis books by Ramsay and Silverman and
a "Continuously Stirred Tank Reactor (CSTR)" simulation discussed in a
Ramsay, et al., discussion paper to appear soon in the Journal of the
Royal Statistical Society-series B.

It also includes the draft of a presentation on "fda in Matlab & R"
(in PowerPoint and Adobe Acrobat PDF formats) for the UseR! 2007
conference this Friday, Aug. 10, 1:55 - 2:20 PM in Ames, IA.

Regards

Hadley Wickham
James Ramsay
Spencer Graves

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Help using gPath

2007-08-12 Thread hadley wickham
> Here's a partial extract from a sample session after running your code
> (NOTE this is using the development version of R;  grid.ls() does not
> exist in R 2.5.1 or earlier):
>
> Inspect the grob tree with grid.ls() (similar to Hadley's
> current.grobTree(), but with different formatting) ...

(I'll probably remove current.grobTree as soon as grid.ls makes it to
a released version of R)

>
>  > grid.ls()
> plot-surrounds
>GRID.cellGrob.118
>  background
>GRID.cellGrob.119
>  plot.gTree.113
>background
>guide.gTree.90
>  background.rect.80
>  minor-horizontal.segments.82
>  minor-vertical.segments.84
> # OUTPUT TRUNCATED

The format is much nicer than mine!

> ... It is not necessarily obvious which grob is which,
> but a little trial and error (e.g., grid.edit() to change
> the colour of a grob) shows that the border on the first
> panel is 'guide.rect.92', which is a child of 'plot.gTree.113'
> (NOTE the numbers come from a fresh R session).

I will try and rename these grobs so that they are more easily
accessible (and reproducible across multiple calls).  That should make
things easier in the future.

> Use grid.get() to grab that gTree and inspect that
> further using grid.ls(), this time also showing the
> viewports involved ...

What do all the upViewports represent?  Could the downViewports be
incorporated into the same place as the original definition?

> (The remaining code should work for you in your version of R;  it
> is just grid.ls() that is new.)
>
> Remove the original border rect, ...
>
>  > grid.remove("guide.rect.92", global=TRUE)
>
> ... (need global=TRUE because the border appears twice as a child
> of 'plot.gTree.113' [not sure why that is]) then add some lines that
> only draw the top, right, and bottom borders ...
>
>  > grid.add("plot.gTree.113",
> linesGrob(c(0, 1, 1, 0), c(1, 1, 0, 0),
>   gp=gpar(col="green"),
>   vp=vpPath("layout", "panel_1_1")))
>
> ... (I drew the new lines green so that they are easy to see).
> NOTE that in order to put the new lines in the same "place" as
> the original border, the new lines are added as children of the
> gTree 'plot.gTree.113' and they have a vpPath to make sure
> they get drawn in the right viewport within that gTree.

Do you think it would be worth drawing all these rectangles as lines
to make them easier to edit?

> What would probably be ideal would be a graphical interface to the
> grid.ls()-type information (something like an object explorer) that
> would make it easier to see which object is which and also make it
> easier to add and remove objects.  A nice student project perhaps :)

That would be great!

Hadley
-- 
http://had.co.nz/



Re: [R] Classifly problems

2007-08-07 Thread hadley wickham
Have you tried using classifly directly?

library(classifly)
classifly(Tumor ~ ., my.data.set, lda)

generate_classification_data is an internal function, and you are
passing it the wrong arguments.

Hadley

On 8/7/07, Dani Valverde <[EMAIL PROTECTED]> wrote:
> Hello,
> I am trying to explore a classification with GGobi. I am trying to
> generate additional data according to the model so I can draw the
> decision boundaries on the GGobi plot. The problem is that I always get
> the same error: Error in predict.lda(model,data): wrong number of
> variables, even if I know that I used the same number of variables for
> the model generation (6) and for the additional data generation (6
> also). I paste the code I am using:
>
> library(MASS)
> Tumor <- c(rep("MM",20),rep("GBM",18),rep("LGG",17))
> data.lda <- lda(data,Tumor)
> data.ld <- predict(data.lda)
> data.ldd <- data.frame(data.ld$x,data.ld$class)
>
> library(rggobi)
> data.g <- ggobi(data.ldd)
>
> library(classifly)
> generate_classification_data(data.lda,data,method="grid",n=10)
>
> Could you help me?
> Best regards,
>
> Dani



Re: [R] ggplot2 qplot and add

2007-08-04 Thread hadley wickham
On 8/4/07, Emilio Gagliardi <[EMAIL PROTECTED]> wrote:
> Dear Hadley,
> You are a machine! Its way past late you should be relaxing man ;)

Thanks :)

>
> Here is what I hacked together. My question was how to pass in the colour
> used to fill the dot/legend and the label for the legend entries.
>  test <- qplot(x,y, data=some.data, size=chop(variance, n=4),
> colour="cornflowerblue", xlim=c(-20,20), main="title", xlab="label",
> ylab="label") + geom_jitter(aes(colour="gray15", x=col1,
> xjitter=1, yjitter= 0.01)) +
> geom_jitter(aes(colour="gray20", x=col2, xjitter=1,
> yjitter=0.01)) + geom_jitter(aes(colour="gray25", x=col3,
> xjitter=1, yjitter=0.01)) +
> geom_jitter(aes(colour="gray30", x=col4, xjitter=1,
> yjitter= 0.01)) +
> scale_colour_identity(labels=c("number1","number2","number3","number4","number5"),
> grob="tile", name="Legend title")
>
> Because I want to be able to set the colors based on properties of my data,
> while still meaningfully name the legend entry.
>
> I did find another example, but I can't seem to find it atm, but it did
> assign labels from the underlying dataframe...which is better than hard
> coding them, but at this point its good enuf!

Yes, those are the two approaches I'd take - either build up piece by
piece and use scale_identity, or reshape the data frame so you can do
it in one call.

Hadley



Re: [R] ggplot2 qplot and add

2007-08-03 Thread hadley wickham
On 8/2/07, Emilio Gagliardi <[EMAIL PROTECTED]> wrote:
> Hi Thierry and Hadley,
>
> Thanks for your help! I got it working almost 100% to what I want. My last
> questions are simple I think ;)
> 1) How can I pass a label to colour and assign the color at the same time?
> The auto-selection of colors is awesome for learning, but makes it harder to
> see some patterns.

I'm not sure what you mean?  Could you provide an example?

> 2) in regards to 1), where can I find a list of the possible values that can
> be passed to colour=""?  I've seen in some examples "gray50" but when I try
> it, it doesn't come out gray50, it just prints that as the label.  For
> example, in my case, I could have 4 colors with 4 different shades of each.
> Or maybe I can use word length and gray scale, and make longer words darker,
> etc...

I will try and explain this more in the book - but there's a
difference between mapping and setting.  In qplot values are always
mapped (this is usually what you want - you want to specify the raw
values and have them automatically mapped to sensible visual
properties).  If you manually add on geoms you can map or set:

 * mapping:  + geom_point(aes(colour = "Treatment A"))
 * setting: + geom_point(colour = "green")

Note in the second case, the colour must be a valid R colour.
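
As for where to find the possible values: a small base R sketch (nothing
ggplot2-specific is assumed here) for listing and checking colour names.
is_valid_colour is a hypothetical helper written for this example.

```r
# colours() returns every colour name that R's graphics devices know
all_names <- colours()
head(all_names)
"gray50" %in% all_names

# col2rgb() converts a colour name (or a hex string such as "#7F7F7F")
# to red/green/blue values, and fails on anything that is not a colour
col2rgb("gray50")

# Hypothetical helper: TRUE if x is a single valid R colour
is_valid_colour <- function(x) {
  tryCatch({
    col2rgb(x)
    TRUE
  }, error = function(e) FALSE)
}

is_valid_colour("gray50")        # a real colour name
is_valid_colour("Treatment A")   # just a label, not a colour
```

A string like "Treatment A" is fine when mapping, because it is only a
label; when setting, it would fail the check above.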

Hadley



Re: [R] ggplot2 qplot and add

2007-08-02 Thread hadley wickham
On 8/2/07, ONKELINX, Thierry <[EMAIL PROTECTED]> wrote:
> I think you need something like.
>
> qplot(appetitive.stimulus, graphLabels, data=related.differences,
> size=variance, colour="Appetitive Stimulus", xlim=c(-20,20), main="Title
> here", xlab="Differences", ylab="Header Concepts") +
> geom_point(aes(colour = "Aversive Stimulus"))

You'll probably want aversive.stimulus in there as well:

+ geom_point(aes(colour = "Aversive Stimulus", x=aversive.stimulus))

The reason why Emilio's first attempt didn't work is that I have
removed the add argument from qplot because it is no longer necessary
- I might not have removed it from the documentation yet though,
sorry!

Hadley



Re: [R] Ggplot2 equivalent of axis and problem with log scale

2007-07-25 Thread hadley wickham
On 7/25/07, ONKELINX, Thierry <[EMAIL PROTECTED]> wrote:
> Dear useRs,
>
> Recently I've discovered ggplot2 and I must say that I really like it,
> although the documentation is still a work in progress.
>
> My first question: How can I change the position of the labels and the
> text of the labels? With a basic plot I would use axis(2, at =
> position.of.the.ticks, labels = text.at.the.ticks). Could someone
> provide me with an example of how to do this with ggplot2?

Have a look at scale_continuous - in particular the breaks and labels
arguments (although I haven't tested them much yet).  You need +
scale_x_continuous() and + scale_y_continuous() as appropriate.

> The second question is probably a little bug. If I plot the y-axis in
> log10 scale then geom_errorbar still plot the values in the original
> scale. See the example below. The second plot is what I would suspect
> when plotting the first graph.

Yes, that's a bug - I'll try and get it fixed in the next version.

Thanks,

Hadley



Re: [R] ggplot2 axis color

2007-07-24 Thread hadley wickham
Hi Felipe,

Looks like a bug!  I'll try and get it fixed for the next version.  In
the meantime, you can read the last chapter of the ggplot book to see
how to fix it with grid.

Hadley

On 7/24/07, Felipe Carrillo <[EMAIL PROTECTED]> wrote:
> Hi:
> Does anyone have an idea on how to color the axis and
> labels using ggplot2? This is what I got:
>
> library(ggplot2)
>  p <- qplot(total_bill, tip, data = tips)
>  NewPlot<-  p + geom_abline(slope=c(0.1,0.15,0.2),
> colour=c("red","blue","yellow"),size=c(2,5,2))
> NewPlot + geom_smooth(colour="green",
> size=3,linetype=3)
> NewPlot$background.fill<-"cornsilk"
> NewPlot$background.colour <- "blue"
> NewPlot$axis.colour<-"red"  ? it doesn't do it
> Thanks
>
>  Felipe D. Carrillo
>   Fishery Biologist
>   US Fish & Wildlife Service
>   Red Bluff, California 96080



Re: [R] Alternative to xyplot()?

2007-07-17 Thread hadley wickham
On 7/18/07, Stephen Tucker <[EMAIL PROTECTED]> wrote:
> What's wrong with lattice? Here's an alternative:
>
> library(ggplot2)
> ggplot(data=data.frame(x,y,grps=factor(grps)),
>mapping=aes(x=x,y=y,colour=grps)) + # define data
>   geom_identity() +# points
>   geom_smooth(method="lm") # regression line

I think you mean geom_point() not geom_identity()!

Also, if you just want groups, and not colours you can use the group aesthetic.

library(ggplot2)
qplot(x, y, group=grps) + geom_smooth(method=lm)

# You can have different grouping in different layers
qplot(x, y, colour=factor(grps)) + geom_smooth(method=lm)
qplot(x, y, colour=factor(grps)) + geom_smooth(aes(group=1), method=lm)

You can see more examples of ggplot2 in use at http://had.co.nz/ggplot2

Hadley



Re: [R] xyplot for longitudinal data

2007-07-17 Thread hadley wickham
On 7/18/07, Osman Al-Radi <[EMAIL PROTECTED]> wrote:
> Dear R-help subscribers,
>
> I use xyplot to plot longitudinal data as follows:
>
> score<-runif(100,-4,5)
> group<-sample(1:4,100,rep=T)
> subject<-rep(1:25,4)
> age<-rep(runif(4,1,40),25)
> df<-data.frame(score,group,age,subject)
>
> xyplot(score~age|group, group=subject,
> panel=function(...){
> panel.loess(...,lwd=4)
> panel.superpose(...)}
> ,data=df)
>
> this produced a plot with four panels one for each group, with unique
> plotting parameters for each subject.
>
> How can I create a plot with a single panel where all four groups
> are superimposed using different line colors and symbols for each group, but
> preserving the longitudinal nature of the data (i.e. one line per subject).
>

Another approach would be to use the ggplot2 package (http://had.co.nz/ggplot2):

library(ggplot2)
qplot(age, score, data=df, group = interaction(subject, group),
geom="line", colour=factor(group)) + geom_smooth(aes(group=group),
enp.target=2, size=4)

# This gives a smooth per group; if you want one overall smooth,
# use the following instead
+ geom_smooth(aes(group=1), enp.target=2, size=4)

# You can have both by adding both geom_smooths on

Hadley



Re: [R] scaling of different data sets in ggplot

2007-07-16 Thread hadley wickham
Hi Stephen,

You can't do that in ggplot (have two different scales) because I
think it's generally a really bad idea.  The whole point of plotting
the data is so that you can use your visual abilities to gain insight
into the data.  When you have two different scales the positions of
the two groups are essentially arbitrary - the data only have x values
in common, not y values.  You essentially have two almost unrelated
graphs plotted on top of each other.

On the other hand, for this data, I think it would be reasonable to
plot log(y) and z on the same scale - the data are transformed, not the
scales.
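
A minimal base R sketch of that, using the data from the quoted message
(y = exp(x^2) and z = x^2 plus noise, so log(y) and z both sit near x^2).
The seed and the name log_y are arbitrary choices for this example.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
x <- runif(100)
y <- exp(x^2)
z <- x^2 + rnorm(100, 0, 0.02)

# Transform the data, not the scale
log_y <- log(y)

# The two variables now cover nearly the same range
range(log_y)
range(z)

# One shared y axis is enough (drawn only in an interactive session)
if (interactive()) {
  plot(x, log_y, ylim = range(log_y, z), ylab = "log(y) and z")
  points(x, z, col = 2, pch = 3)
}
```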

Hadley

On 7/14/07, Stephen Tucker <[EMAIL PROTECTED]> wrote:
> Dear list (but probably mostly Hadley):
>
> In ggplot, operations to modify 'guides' are accessed through grid
> objects, but I did not find mention of creating new guides or possibly
> removing them altogether using ggplot functions. I wonder if this is
> something I need to learn grid to learn more about (which I hope to do
> eventually).
>
> Also, ggplot()+geom_object() [where 'object' can be point, line, etc.]
> or layer() contains specification for the data, mappings and
> geoms/stats - but the geoms/stats can be scale-dependent [for
> instance, log]. so I wonder how different scalings can be applied to
> different data sets.
>
> Below is an example that requires both:
>
> x <- runif(100)
> y <- exp(x^2)
> z <- x^2+rnorm(100,0,0.02)
>
> par(mar=c(5,4,2,4)+0.1)
> plot(x,y,log="y")
> lines(lowess(x,y,f=1/3))
> par(new=TRUE)
> plot(x,z,col=2,pch=3,yaxt="n",ylab="")
> lines(lowess(x,z,f=1/3),col=2)
> axis(4,col=2,col.axis=2)
> mtext("z",4,line=3,col=2)
>
> In ggplot:
>
> ## data specification
> ggplot(data=data.frame(x,y,z)) +
>
>   ## first set of points
>   geom_point(mapping=aes(x=x,y=y)) +
>   scale_y_log() +
>
>   ## second set of points
>   geom_point(mapping=aes(x=x,y=z),pch=3) +
>   layer(mapping=aes(x=x,y=z),stat="smooth",method="loess") +
>   scale_y_continuous()
>
> scale_y_log() and scale_y_continuous() appear to apply to both mappings at
> once, and I can't figure out how to associate them with the intended ones (I
> expect this will be a desire for size and color scales as well).
>
> Of course, I can always try to fool the system by (1) applying the scaling a
> priori to create a new variable, (2) plotting points from the new variable,
> and (3) creating a new axis with custom labels. Which then brings me back to
> ...how to add new guides? :)
>
> Thanks,
>
> Stephen



Re: [R] Problem to sort factor

2007-07-16 Thread hadley wickham
On 7/16/07, Arne Brutschy <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I'm having a problem renaming and sorting the underlying factor of a
> ggplot2 based plot. Here's my code:
>
> ---8<--
> > delta <- ggplot(subset(data, Model==c("dyn", "dl4", "dl3")), 
> > aes(x=Problemsize, y=Fitness)) +
>   geom_smooth(size=1, color="black", fill=alpha("blue", 0.2))+
>   geom_point(size=0.5, aes(colour=DeltaConfig))+
>   scale_colour_gradient2(expression(bold(paste(Delta,"Config"))), 
> limits=c(0,10), midpoint=5,
>  low="green", mid="yellow", high="red")
> > delta <- delta + facet_grid(Model ~ . , margins = TRUE)
> > delta
> ---8<--
>
> and the data
> ---8<--
> > data
>   Model Problemsize Fitness DeltaConfig
> 1   dl1           3 81.5271      2.4495
> 2   dl1           3 83.1999      2.4495
>  ...
> ---8<--
>
> I want to select a subset of the possible models and display the
> resulting three plots in a column. This works fine, but: the displayed
> names and order is wrong. Instead of the models "dl3","dl4","dyn"
> I want to change dl* to rm* and the order to "dyn","rm3","rm4".
>
> I hope it's understandable what I want. I tried to use factor()
> function in a thousand combinations, but I don't seem to get it. Can
> someone help me?

Does this get you started?

x <- factor(c("a", "b", "c"))
factor(x, levels=c("c","b","a"), labels=c("cc","bb", "aa"))
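
Applied to the model names in your question, a sketch might look like this
(the model vector below is made up; in your case it would be data$Model
after subsetting):

```r
# Made-up stand-in for the Model column after subsetting
model <- c("dl3", "dyn", "dl4", "dl3", "dyn")

# levels = the existing values in the order you want them displayed,
# labels = the new names, matched to levels by position
model_f <- factor(model,
                  levels = c("dyn", "dl3", "dl4"),
                  labels = c("dyn", "rm3", "rm4"))

levels(model_f)
as.character(model_f)
```

One other thing to check: Model == c("dyn", "dl4", "dl3") recycles the
right-hand side along the column, so Model %in% c("dyn", "dl4", "dl3") is
usually what is wanted when selecting several levels.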

Hadley



Re: [R] how do I draw such a barplot?

2007-07-16 Thread hadley wickham
On 7/16/07, Donatas G. <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I cannot figure out how to draw a certain plot: could someone help me out?
>
> I have this data.frame from a survey
> my.data
>
> that looks like something like this:
>
>   col1  col2  col3  col4
> 1    5     5     4     5
> 2    3     5     3     1
> 3    2     3     4     5
> 4    3     1     1     2
> 5    5     5     4     5
> 6    4     2     5     5
> 
>
>
> Each row represents a single questionnaire with someone giving his
> agreement/disagreement with a statement (each column is a statement) that is
> coded from 1 to 5.
>
> I need to draw a barplot giving a visual representation showing differences
> between the five columns: Each bar should represent a single column, and
> should be divided into 5 sections, the thickness of each depending on the
> number of respondents who choose that particular answer.
>
> How do I do that? All I have managed to do so far is to produce a barplot of a
> single column, and that - only with bars side by side...

One way would be to use the ggplot2 and reshape packages:

library(ggplot2)
df <- as.data.frame(matrix(sample(1:5, 100, rep=T), ncol=5))

dfm <- melt(df, m=1:5)
qplot(variable, data=dfm, geom="bar", fill=factor(value))
qplot(variable, data=dfm, geom="bar", fill=factor(value), position="dodge")
qplot(variable, data=dfm, geom="bar", fill=factor(value), facets = . ~ value)
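
If it helps to see the numbers behind the stacked bars, here is a base R
sketch that tabulates the answers per column.  The seed and the name counts
are arbitrary; barplot() is just one way to draw the result.

```r
set.seed(42)                      # arbitrary seed, for reproducibility only
df <- as.data.frame(matrix(sample(1:5, 100, replace = TRUE), ncol = 5))

# Count how often each answer (1 to 5) occurs in each column.
# factor(..., levels = 1:5) keeps answers nobody chose as zero counts.
counts <- sapply(df, function(col) table(factor(col, levels = 1:5)))
counts

# One stacked bar per column, one section per answer
# (drawn only in an interactive session)
if (interactive()) {
  barplot(counts, legend.text = rownames(counts))
}
```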

Hadley



Re: [R] Different axis limits for each facet in ggplot2

2007-07-16 Thread hadley wickham
On 7/16/07, Karl Ove Hufthammer <[EMAIL PROTECTED]> wrote:
> Hi!
>
> Is it possible to have different axis limit for each facet in a ggplot2
> plot? Here is an example:

Not yet, although it is on the to do list.

> --
> library(ggplot2)
>
> x=seq(-10,10,.1)
> y=cos(x)
> z=sin(x)*10

One crude way to get around it is:

df <- data.frame(x,y,z)
df <- rescaler(df)

 - i.e. scale all variables to common scales
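
A similar effect can be sketched with base R's scale(), used here as a
stand-in for rescaler (I'm assuming centring to mean 0 and scaling to
standard deviation 1 is the kind of rescaling wanted):

```r
x <- seq(-10, 10, 0.1)
y <- cos(x)
z <- sin(x) * 10
df <- data.frame(x, y, z)

# Centre each column to mean 0 and scale it to standard deviation 1
scaled <- as.data.frame(scale(df))

round(colMeans(scaled), 10)
apply(scaled, 2, sd)
```

Note that this rescales x as well; pass only the y and z columns to
scale() if x should keep its original units.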

Hadley



Re: [R] Restructuring data

2007-07-15 Thread hadley wickham
On 7/16/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> On 7/15/07, Daniel Malter <[EMAIL PROTECTED]> wrote:
> > Hi folks,
> >
> > I am new to the list and relatively new to R. I am trying to unstack data
> > "arraywise" and could not find a convenient solution yet. I tried to find a
> > solution for the problem on help archives. I also tried to use the reshape
> > command in R and the reshape package but could not get result. I will
> > illustrate the case below, but the real dataset is quite large so that I
> > would appreciate an easy solution if there is any.
> >
> > The current data structure (variable names):
> >
> > ID, TIME, BUY-A, BUY-B, SELL-A, SELL-B
> >
> > Achieved structure (with the reshape command or the reshape package)
> >
> > ID, TIME, BUY-A
> > ID, TIME, BUY-B
> > ID, TIME, SELL-A
> > ID, TIME, SELL-B
> >
> > This is regular unstacking with two identifier variables. Nothing special
> > though. What I am looking for and did not manage is the following structure:
> >
> > ID, TIME, BUY-A, SELL-A
> > ID, TIME, BUY-B, SELL-B
> >
> > I am quite sure it's pretty easy, but I could not find how to do this.
>
> This seems to work:
>
> > foo <- data.frame(ID = 1:4, TIME=1:4,
> +   "BUY-A" = rnorm(4),
> +   "BUY-B" = rnorm(4),
> +   "SELL-A" = rnorm(4),
> +   "SELL-B" = rnorm(4), check.names = FALSE)
> >
> >
> > foo
>   ID TIME       BUY-A      BUY-B     SELL-A      SELL-B
> 1  1    1  0.47022807 1.09573107  0.1977035 -0.08333043
> 2  2    2 -0.20672870 0.07397772  1.4959044 -0.98555020
> 3  3    3  0.05533779 0.25821758  1.3531913  0.16808307
> 4  4    4 -0.11471772 1.27798740 -0.1101390 -0.36937994
> >
> > reshape(foo, direction="long",
> + varying = list(c("BUY-A", "BUY-B"), c("SELL-A", "SELL-B")),
> + v.names=c("BUY", "SELL"), idvar="ID",
> + times = c("A", "B"), timevar="which")
>     ID TIME which         BUY        SELL
> 1.A  1    1     A  0.47022807  0.19770349
> 2.A  2    2     A -0.20672870  1.49590443
> 3.A  3    3     A  0.05533779  1.35319133
> 4.A  4    4     A -0.11471772 -0.11013896
> 1.B  1    1     B  1.09573107 -0.08333043
> 2.B  2    2     B  0.07397772 -0.98555020
> 3.B  3    3     B  0.25821758  0.16808307
> 4.B  4    4     B  1.27798740 -0.36937994

It's a little more verbose with the reshape package, but I find it
easier to understand what's going on.

fm <- melt(foo, id=c("ID","TIME"))
fm <- cbind(fm, colsplit(fm$variable, "-", c("direction","type")))
fm$variable <- NULL

cast(fm, ... ~ direction)

There's an example like this in the introduction to reshape manual.

Hadley



Re: [R] Break during the recursion?

2007-07-15 Thread hadley wickham
On 7/15/07, Atte Tenkanen <[EMAIL PROTECTED]> wrote:
> Here is now more elegant function for inorder tree walk, but I still can't 
> save the indexes!? This version now prints them ok, but if I use return, I 
> get only the first v[i].
>
> leftchild<-function(i){return(2*i)}
>
> rightchild<-function(i){return(2*i+1)}
>
> iotw<-function(v,i)
>
> {
> if (is.na(v[i])==FALSE & is.null(unlist(v[i]))==FALSE)
> {
> iotw(v,leftchild(i))
> print(v[i]) # return doesn't work here
> iotw(v,rightchild(i))
> }
> }

Shouldn't you return:

c(iotw(v, leftchild(i)), v[i], iotw(v, rightchild(i)))

(and rewrite the condition to return NULL if the node doesn't exist;
I think it reads more clearly that way)
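
Put together, a sketch could look like this.  I'm assuming v is a plain
vector in which missing nodes are NA; the example vector is made up so that
the stored values are simply the node positions.

```r
leftchild  <- function(i) 2 * i
rightchild <- function(i) 2 * i + 1

# Inorder tree walk that returns the visited values instead of printing them
iotw <- function(v, i = 1) {
  # A node that is past the end of v, or NA, does not exist
  if (i > length(v) || is.na(v[i])) return(NULL)
  c(iotw(v, leftchild(i)), v[i], iotw(v, rightchild(i)))
}

# Made-up tree: node 3 has no children (positions 6 and 7 are NA)
v <- c(1, 2, 3, 4, 5, NA, NA, 8, 9, 10, 11)
iotw(v)
```

Because c() drops NULL, the missing nodes simply vanish from the result,
and no early exit from the recursion is needed.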

Hadley



Re: [R] Break during the recursion?

2007-07-15 Thread hadley wickham
On 7/15/07, Duncan Murdoch <[EMAIL PROTECTED]> wrote:
> On 15/07/2007 11:36 AM, Atte Tenkanen wrote:
> >> On 15/07/2007 10:33 AM, Atte Tenkanen wrote:
>  On 15/07/2007 10:06 AM, Atte Tenkanen wrote:
> > Hi,
> >
> > Is it possible to break using if-condition during the recursive
>  function?
>  You can do
> 
>   if (condition) return(value)
> 
> > Here is a function which almost works. It is for inorder-tree-
>  walk.
> > iotw<-function(v,i,Stack,Indexes) # input: a vector and the
> >> first
>  index (1), Stack=c(), Indexes=c().
> > {
> >   print(Indexes)
> >   # if (sum(i)==0) break # Doesn't work...
> if (sum(i)==0) return(NULL)
> 
>  should work.
> 
>  Duncan Murdoch
> >>> Hmm - - - I'd like to save the Indexes-vector (in the example
> >> c(8,4,9,2,10,5,11,1,3)) and stop, when it is ready.
> >>
> >> This seems more like a problem with the design of your function
> >> than a
> >> question about R.  I can't really help you with that, because your
> >> description of the problem doesn't make sense to me.  What does it
> >> mean
> >> to do an inorder tree walk on something that isn't a tree?
> >>
> >> Duncan Murdoch
> >
> > The symbols in vector v have been originally derived from "tree". See
> >
> > http://users.utu.fi/attenka/Tree.jpg
> >
> > But perhaps there's another way to do this, for instance by using loops and 
> > if-conditions?
>
> Or perhaps by doing the tree walk on the tree, before you collapse it
> into a vector.

If it's a binary tree with n levels, I think you should be able to
generate the positions more directly, depending on how the tree has
been flattened.  Binary heaps work this way, so that might be a good
place to start.  See http://en.wikipedia.org/wiki/Binary_heap,
particularly heap implementation.
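
A sketch of what generating the positions directly might look like,
assuming the tree is complete and flattened heap-style (root at 1,
children of i at 2*i and 2*i + 1).  inorder_positions is a made-up
helper name, not something from an existing package:

```r
# Inorder positions for a complete binary tree with n_levels levels,
# built level by level without recursion.
inorder_positions <- function(n_levels) {
  pos <- 1  # inorder sequence of a one-level tree (just the root)
  for (k in seq_len(n_levels - 1)) {
    # Heap indexes of the nodes on level k + 1, left to right
    leaves <- 2^k + seq_len(2^k) - 1
    out <- numeric(2^(k + 1) - 1)
    # In a complete tree the new leaves and the old nodes alternate
    out[seq(1, length(out), by = 2)] <- leaves
    out[seq(2, length(out), by = 2)] <- pos
    pos <- out
  }
  pos
}

pos3 <- inorder_positions(3)
print(pos3)
```

The values in inorder are then just v[inorder_positions(n)].  If the
tree is not complete, or was flattened some other way, the index
arithmetic would need to change.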

Hadley



Re: [R] Send SMS out of R?

2007-07-15 Thread hadley wickham
On 7/14/07, Thomas Schwander <[EMAIL PROTECTED]> wrote:
> Hi everyone,
>
>
>
> Now I read the posting guidelines again; COMPLETELY! ;-)
>
>
>
> I use Windows XP Professional, R 2.5.1 and I have Blat to send eMails out of
> R. Works perfect! Thank you for your help!
>
>
>
> Now I want to send an SMS out of R! Any idea how it could work? Could I send
> an eMail to a mobile phone number?

This might be a good place to start:
http://en.wikipedia.org/wiki/SMS_gateways

Hadley



Re: [R] ggplot usage question

2007-07-14 Thread hadley wickham
On 7/14/07, Pete Kazmier <[EMAIL PROTECTED]> wrote:
> Could someone show me how to get a blue line in this plot?
>
> > ggplot(movies, aes(x=rating)) + stat_qq(geom="line",
> quantiles=seq(0,1,0.005), distribution=qunif)

It's a bug in ggplot, sorry.  It will be fixed in the next version.

Hadley



Re: [R] help with handling replicates before reshaping data

2007-07-13 Thread hadley wickham
Hi Tom,

>   I have a dataset consists of duplicated sequences within day for each 
> patient (see below data) and I want to reshape the data with patient as time 
> variable. However the reshape function only takes the first sequence of the 
> replicates and ignores the second. How can I 1) average the duplicates and 2) 
> give the duplicated sequences unique names before reshaping the data ?
>
>   > data
>     patient day  seq           y
>   1      10   1 acdf -0.52416066
>   2      10   1 cdsv  0.62551539
>   3      10   1 dlfg -1.54668047
>   4      10   1 acdf  0.82404978
>   5      10   1 cdsv -1.17459914
>   6      10   2 acdf  0.47238216
You might find that the functions in the reshape package give you a bit
more flexibility.

# The reshape package expects the value variable
# to be named "value"
d2 <- rename(data, c("y" = "value"))

# I think this is the format you want, which will average over the reps
cast(d2, day + seq ~ patient, mean)
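
For comparison, a rough base-R sketch of the same two steps (average the
replicates, then spread patients across columns).  The data frame below
is invented toy data with the same column names as in the question, not
the poster's real data:

```r
# Toy data: patient 10 has a duplicated (day 1, acdf) measurement
d <- data.frame(
  patient = c(10, 10, 10, 10, 11, 11),
  day     = c(1, 1, 1, 2, 1, 1),
  seq     = c("acdf", "acdf", "cdsv", "acdf", "acdf", "cdsv"),
  y       = c(1, 3, 5, 7, 2, 4)
)

# Step 1: average over the replicates within patient/day/seq
avg <- aggregate(y ~ patient + day + seq, data = d, FUN = mean)

# Step 2: one row per day/seq, one column per patient
wide <- reshape(avg, idvar = c("day", "seq"), timevar = "patient",
                direction = "wide")
print(wide)
```

Combinations that a patient never had (here day 2 for patient 11) come
out as NA.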


Hadley



Re: [R] Choosing the number of colour breaks in ggplot2

2007-07-13 Thread hadley wickham
Hi Karl,

There's no official way to do it, but you can "hack" the colour
gradient scale to do what you want:

x=-10:10
y=-10:10
dat=expand.grid(x=x,y=y)
dat$z=dat$x^2+dat$y^2-100

# Create a modified scale
gr <- scale_fill_gradient2()$clone()
gr$breaks <- function(.) seq(-100, 100, by=10)

ggplot(dat, mapping=aes(x=x, y=y, fill=z)) +
 geom_tile() + gr

# Or to use the range of the scale if you don't want to set it by hand
gr$breaks <- function(.) seq(.$frange()[1], .$frange()[2], length=10)

This works because ggplot2 is built on top of the proto library and
has mutable objects.  Most of the time you don't notice this because
the default functions operate with R's copy-on-modify semantics.
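
To illustrate the difference in plain R (no ggplot2 needed): proto
objects are built on environments, so the sketch below uses a bare
environment as a stand-in for a scale object.  The field name breaks is
only for illustration:

```r
# Copy-on-modify: changing the copy leaves the original alone
a <- list(breaks = 5)
b <- a
b$breaks <- 10
a$breaks   # still 5

# Reference semantics: both names point at the same environment
e1 <- new.env()
e1$breaks <- 5
e2 <- e1
e2$breaks <- 10
e1$breaks  # now 10 as well
```

That is why the example above calls $clone() first: without it, the
modification would presumably leak into every other use of the same
scale object.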

Hadley

On 7/13/07, Karl Ove Hufthammer <[EMAIL PROTECTED]> wrote:
> A seemingly simple problem has me stumped. Is it possible to choose the
> number of colour breaks for a gradient scale in the current version of
> ggplot2?
>
> Here is a simple example:
>
> -
> x=-10:10
> y=-10:10
> dat=expand.grid(x=x,y=y)
> dat$z=dat$x^2+dat$y^2-100
>
> ggplot(dat, mapping=aes(x=x, y=y, fill=z)) +
>   geom_tile() + scale_fill_gradient2()
> -
>
> The image shows many (61) colours, but only 5 of them are shown in the
> legend. How do I change the legend to show, say, 10 colours?
>
>
> --
> Karl Ove Hufthammer
>


Re: [R] ggplot2 / histogram / y-axis

2007-07-12 Thread hadley wickham
On 7/12/07, Pete Kazmier <[EMAIL PROTECTED]> wrote:
> "hadley wickham" <[EMAIL PROTECTED]> writes:
>
> > On 7/12/07, Pete Kazmier <[EMAIL PROTECTED]> wrote:
> >> Is there a way in ggplot to make a histogram with the left-hand y-axis
> >> label as frequency, and a right-hand y-axis label as percentage?
> >
> > Not currently.  I did a quick exploration to see if it was feasible to
> > draw another axis on with grid, but it doesn't look like it's
> > possible:
>
> Thank you for trying.
>
> > Also how were you expecting the axes/gridlines to line up?  Would both
> > axes be labelled "nicely" (with whole numbers) and the secondary axis
> > wouldn't have gridlines; or would the second axis match the lines of
> > the primary, even though the number wouldn't be so attractive?
>
> I hadn't thought that far ahead.  Depending on the audience, I render
> histograms differently, and was curious if I could just put both on a
> single graph.  However, you bring up some interesting questions in
> terms of the presentation.
>
> On another note, and feel free to defer me to the documentation which
> I'm still in the process of reading, but will I be able to take
> advantage of some of Tufte's recommendations in terms of the typical
> histogram and/or scatterplots (pp126-134 in Visual Display of
> Quantitative Information)?
>
> For example, with histograms, he would eliminates the use of
> coordinate lines in favor of using a white grid to improve the
> data/ink ratio.  Likewise in scatterplots, he uses range-frames and
> dot-dash-plots.  Will I be able to use ggplot for these types of
> enhancements?

I am familiar with Tufte's suggestions, and while they do increase the
data-ink ratio, I'm not confident they actually make the plot any
better perceptually.  Displaying grid lines on _top_ of data seems
like a bad idea, and throwing away the plot frame is a bad idea
because you lose important visual reference points.  Range frames
also fail to scale to facetted plots.

If you're not already familiar with them, I strongly recommend the
following two papers, which tackle similar ideas to Tufte but in a
rigorous scientific framework:

@article{cleveland:1987,
Author = {Cleveland, William and McGill, Robert},
Journal = {Journal of the Royal Statistical Society. Series A 
(General)},
Number = {3},
Pages = {192-229},
Title = {Graphical Perception: The Visual Decoding of Quantitative
Information on Graphical Displays of Data},
Volume = {150},
Year = {1987}}

@article{cleveland:1993a,
Author = {Cleveland, William},
Journal = {Journal of Computational and Graphical Statistics},
Pages = {323-364},
Title = {A model for studying display methods of statistical graphics},
Volume = {2},
Year = {1993}}

Hadley



Re: [R] ggplot2 / histogram / y-axis

2007-07-12 Thread hadley wickham
On 7/12/07, Pete Kazmier <[EMAIL PROTECTED]> wrote:
> Is there a way in ggplot to make a histogram with the left-hand y-axis
> label as frequency, and a right-hand y-axis label as percentage?

Not currently.  I did a quick exploration to see if it was feasible to
draw another axis on with grid, but it doesn't look like it's
possible:

p <- qplot(rating, data=movies, geom="histogram")

# Map aesthetics to data
data <- p$layers[[1]]$make_aesthetics(p)
# Calculate statistic "by hand" (we'll need this to get the scales right)
binned <- StatBin$calculate(data=data, p$scales)

n <- nrow(movies)

# Manually recreate the y scale
sp <- scale_y_continuous()
sp$train(binned$count)

# rescale the labels
labels <- formatC(sp$breaks() / n, digits=2)

# Have to do without labels because of bug in grid
print(p, pretty=FALSE)
downViewport("panel_1_1")
grid.draw(ggaxis(sp$breaks(), as.list(labels), "right", sp$frange()))

# Why don't the labels line up? - I'm not sure
# How could you make space for the extra axis? - Not sure either
# How would this work for a facetted graphic? - not well


Also how were you expecting the axes/gridlines to line up?  Would both
axes be labelled "nicely" (with whole numbers) and the secondary axis
wouldn't have gridlines; or would the second axis match the lines of
the primary, even though the number wouldn't be so attractive?
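
The two choices can be made concrete with a little arithmetic.  This is
only a sketch in base R (hist() with plot = FALSE, on the built-in
mtcars data as a stand-in for movies), not ggplot code:

```r
x <- mtcars$mpg
n <- length(x)
h <- hist(x, plot = FALSE)   # bin counts only, nothing is drawn

# Choice A: reuse the tick positions of the count axis;
# the percentage labels are whatever they turn out to be
at_counts     <- pretty(c(0, max(h$counts)))
pct_at_counts <- 100 * at_counts / n

# Choice B: pick "nice" percentages and convert them back to
# count positions; these will not line up with the count ticks
nice_pct        <- pretty(c(0, 100 * max(h$counts) / n))
at_for_nice_pct <- nice_pct * n / 100

data.frame(at_counts, pct_at_counts)
data.frame(at_for_nice_pct, nice_pct)
```

Either way the second axis is just a relabelling of the first, since
percentage = 100 * count / n.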

Hadley



Re: [R] ggplot2 / reshape / Question on manipulating data

2007-07-12 Thread hadley wickham
On 7/12/07, Pete Kazmier <[EMAIL PROTECTED]> wrote:
> "hadley wickham" <[EMAIL PROTECTED]> writes:
>
> > On 7/12/07, Pete Kazmier <[EMAIL PROTECTED]> wrote:
> >> I'm an R newbie but recently discovered the ggplot2 and reshape
> >> packages which seem incredibly useful and much easier to use for a
> >> beginner.  Using the data from the IMDB, I'm trying to see how the
> >> average movie rating varies by year.  Here is what my data looks like:
> >>
> >> > ratings <- read.delim("groomed.list", header = TRUE, sep = "|", 
> >> > comment.char = "")
> >> > ratings <- subset(ratings, VoteCount > 100)
> >> > head(ratings)
> >>  Title  Histogram VoteCount VoteMean Year
> >> 1!Huff (2004) (TV) 16   299  8.4 2004
> >> 8  'Allo 'Allo! (1982) 000125   829  8.6 1982
> >> 50  .hack//SIGN (2002) 001113   150  7.0 2002
> >> 561-800-Missing (2003) 000103   118  5.4 2003
> >> 66  Greatest Artists (2000) (mini) 00..16   110  7.8 2000
> >> 77 00 Scariest Movie (2004) (mini) 00..000115   256  8.6 2004
> >
> > Have you tried using the movies dataset included in ggplot?  Or is
> > there some data that you want that is not in that dataset.
>
> It's funny that you mention this because I had intended to write this
> email about a month ago but was delayed due to other reasons.  In any
> case, when I was typing this up last night, I wanted to recreate my
> steps but I could not find the IMDB movie data I had used originally.
> I searched everywhere to no avail so I downloaded the data myself and
> groomed it.  Only now do I remember that I had used the movies dataset
> included in ggplot.
>
> >> How do 'byYear' and 'byYear2' differ?  I am trying to use 'typeof' but
> >> both seem to be lists.  However, they are clearly different in some
> >> way because 'qplot' graphs them differently.
> >
> > Try using str - it's much more helpful, and you should see the
> > difference quickly.
>
> Thanks!  This is the function I've been looking for in my quest to
> learn about internal data types of R.  Too bad it has such a terrible
> name!
>
> > Using the built in movies data:
> >
> > mm <- melt(movies, id=1:2, m=c("rating", "votes"))
> > msum <- cast(mm, year ~ variable, c(mean, sum))
> >
> > qplot(year, rating_mean, data=msum, colour=votes_sum)
> > qplot(year, rating_mean, data=msum, colour=votes_sum, geom="line")
>
> Great!  This is exactly what I was looking to do.  By the way, does
> any of your documentation use the movie dataset as an example?  I'm
> curious what else I can do with the dataset.  For example, how can I
> use ggplot's facets to see the same information by type of movie?  I'm
> unsure of how to manipulate the binary variables into a single
> variable so that it can be treated as levels.

A lot of the examples do use the movies data, but I don't think any of
them are particularly revealing.  You might want to look at the results
for the 2007 infovis visualisation challenge
(http://www.apl.jhu.edu/Misc/Visualization/), which uses similar data.
Submission isn't complete yet, but you can see my team's entry at
http://had.co.nz/infovis-2007/.  There are lots of interesting stories
to pursue.

I think I will update the movies data to include the first genre as
another column.  That will make it easier to facet by genre.
Hadley



Re: [R] ggplot doesnt work in loops?

2007-07-12 Thread hadley wickham
On 7/12/07, hadley wickham <[EMAIL PROTECTED]> wrote:
> Hi Steve,
>
> You need to explicitly print the ggplot object:
> print(ggplot(mydata, aes(x=mydata$varc)) + geom_bar())
>
> (this is an R FAQ for lattice plots, and ggplot works the same way)
>
> In the latest version of ggplot (0.5.4) you can construct the plot
> beforehand and modify the aesthetics in each iteration of the loop:
>
> p <- ggplot(mydata) + geom_bar()
> mydata$varc = c(1,2,3)
> for (i in 1:1){
> jpeg("test3.jpg")
> print(p + aes(x = mydata$varc))
> dev.off()
> }
>
> (not that this will actually work because you're not using i inside
> your loop anywhere)
>
> (and to be strictly correct you should probably use list(x =
> as.name(names(mydata)[i]))  instead of the aes call - but I haven't
> written any documentation for this yet)

Actually a better solution (will be included in the next version of ggplot) is:

aes_string <- function(...) structure(lapply(list(...), as.name),
class="uneval")

p + aes_string(x = names(mydata)[i])

It converts aes(x = "x", y="y") to aes(x=x, y=y).  The first is easy
to generate programmatically, the second is less to type.
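
A small check of what that conversion does, using only base R and the
built-in mtcars data (the definition is repeated so the snippet runs on
its own; no ggplot is loaded here):

```r
aes_string <- function(...) structure(lapply(list(...), as.name),
                                      class = "uneval")

# Strings in, unevaluated symbols out
m <- unclass(aes_string(x = "mpg", y = "wt"))
is.name(m$x)                 # TRUE: a symbol, not a character string

# A symbol can later be evaluated inside any data frame
head(eval(m$x, mtcars))

# Which makes it easy to loop over column names
for (nm in names(mtcars)[1:3]) {
  col <- eval(unclass(aes_string(x = nm))$x, mtcars)
  cat(nm, ": mean =", mean(col), "\n")
}
```

Note that as.name only handles plain column names; a string such as
"log(mpg)" would become one oddly named symbol rather than a call.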

Hadley



Re: [R] p-value from survreg

2007-07-12 Thread hadley wickham
On 7/12/07, Terry Therneau <[EMAIL PROTECTED]> wrote:
> The question was how to get the p-value from the fit below, as an S object
>
> sr<-survreg(s~groups, dist="gaussian")
> Coefficients:
> (Intercept)  groups
> -0.02138485  0.03868351
>
> Scale= 0.01789372
>
> Loglik(model)= 31.1   Loglik(intercept only)= 25.4
> Chisq= 11.39 on 1 degrees of freedom, p= 0.00074
> n= 16
>
>
> 
>   In general, good places to start are
> > names(sr)
> > help(survreg.object)
> > ssr <- summary(sr)
> > names(ssr)
> As someone else pointed out, it's also easy to look at the print.survreg
> function and see how the value was created -- one of the things I love
> about S.
>
> Unfortunately, doing the above myself showed that I have let the documentation
> page for survreg.object get seriously out of date -- quite embarassing as
> that is logically the first place to start.
>
> As to the print function creating things "on the fly": there is an area where
> there is no good answer.  Does one make the return object from a fit such
> that it contains only minimal data, or add in all of the other computations
> that can be derived from these?  The Chambers and Hastie book "Statistical
> Models in S", which was the starting point for model objects, leaned towards
> the former, and this still influences many functions.  Often the summary
> function will "fill in" these derived values, the std and t-tests for
> the individual coefficients for instance.

I think this is where it's nice to have a separate function that does
the filling in - then you can have the best of both worlds.  That's
the role that summary often plays.

Hadley



Re: [R] how to get the p-values from an lm function ?

2007-07-12 Thread hadley wickham
On 7/12/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> On Thu, 12 Jul 2007, hadley wickham wrote:
>
> > On 7/12/07, Benoit Chemineau <[EMAIL PROTECTED]> wrote:
> >> Hi, dear R-users,
> >>
> >> I am computing a liner regression by rating category using the 'by' 
> >> function
> >> as stated below:
> >>
> >> tmp <- by(projet, rating, function(x) lm(defaults ~ CGDP+CSAVE+SP500, data 
> >> =
> >> x))
> >>
> >> I would like to get not only the coefficients but also their p-values. I
> >> can't find the command in the help pages to get them.
> >>
> >> Does anyone have a suggestion ?
> >
> > Hi Benoit,
> >
> > A general approach to find p-values:
> >
> > m <- lm(wt ~ mpg, data=mtcars)
> >
> > # First figure out how to display them on screen:
> > m # nope
> > coef(m) # nope
> > summary(m) # got it
> >
> > # Then use str to look at the components
> > str(summary(m))
> >
> > # And pick out the one you want
> > summary(m)$coef
> > coef(summary(m)) # slightly better style, but won't work in general
>
> If x$coef works, coef(x) will almost certainly work at least as well.
> But note that in most cases it is x$coefficients and so x$coef is liable
> to partially match erroneously.

I meant in general that x$y, does not correspond to y(x) - I realised
after I wrote it that I was unclear.

> > # In general, you may also need to try
> > str(print(summary(m)))
> > # as sometimes the print method calculates the data you're looking for
>
> But a print method should always return its input, so
>
> str(summary(m))
> str(print(summary(m)))

Oh yes, I was getting confused with print functions which compute
values and print them but do not return them.

And that comment has made me realise many of my print methods don't
return x - something to fix.

Hadley



Re: [R] how to get the p-values from an lm function ?

2007-07-12 Thread hadley wickham
On 7/12/07, Benoit Chemineau <[EMAIL PROTECTED]> wrote:
> Hi, dear R-users,
>
> I am computing a liner regression by rating category using the 'by' function
> as stated below:
>
> tmp <- by(projet, rating, function(x) lm(defaults ~ CGDP+CSAVE+SP500, data =
> x))
>
> I would like to get not only the coefficients but also their p-values. I
> can't find the command in the help pages to get them.
>
> Does anyone have a suggestion ?

Hi Benoit,

A general approach to find p-values:

m <- lm(wt ~ mpg, data=mtcars)

# First figure out how to display them on screen:
m # nope
coef(m) # nope
summary(m) # got it

# Then use str to look at the components
str(summary(m))

# And pick out the one you want
summary(m)$coef
coef(summary(m)) # slightly better style, but won't work in general

# In general, you may also need to try
str(print(summary(m)))
# as sometimes the print method calculates the data you're looking for
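
Putting that together for the original question, a sketch using mtcars
(with cyl standing in for the rating categories, since the poster's
data isn't available):

```r
m <- lm(wt ~ mpg, data = mtcars)

# The coefficient table from summary() is a matrix; the p-values
# are in the column named "Pr(>|t|)"
p_single <- coef(summary(m))["mpg", "Pr(>|t|)"]

# One model per group, as in the by() call from the question
fits <- by(mtcars, mtcars$cyl, function(d) lm(wt ~ mpg, data = d))

# p-values for every coefficient in every group:
# one column per group, one row per coefficient
pvals <- sapply(fits, function(fit) coef(summary(fit))[, "Pr(>|t|)"])
print(pvals)
```

Using coef() on the summary (or spelling out $coefficients in full)
avoids relying on partial matching of $coef.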

Hadley



Re: [R] ggplot doesnt work in loops?

2007-07-12 Thread hadley wickham
Hi Steve,

You need to explicitly print the ggplot object:
print(ggplot(mydata, aes(x=mydata$varc)) + geom_bar())

(this is an R FAQ for lattice plots, and ggplot works the same way)

In the latest version of ggplot (0.5.4) you can construct the plot
beforehand and modify the aesthetics in each iteration of the loop:

p <- ggplot(mydata) + geom_bar()
mydata$varc = c(1,2,3)
for (i in 1:1){
jpeg("test3.jpg")
print(p + aes(x = mydata$varc))
dev.off()
}

(not that this will actually work because you're not using i inside
your loop anywhere)

(and to be strictly correct you should probably use list(x =
as.name(names(mydata)[i]))  instead of the aes call - but I haven't
written any documentation for this yet)
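
The underlying behaviour can be seen without any plotting package.  In
this sketch demo_plot is a made-up S3 class whose print method just
counts how often it is called, standing in for the method that actually
draws a plot:

```r
printed <- 0
print.demo_plot <- function(x, ...) {
  printed <<- printed + 1   # stands in for "draw the plot"
  invisible(x)
}
p <- structure(list(), class = "demo_plot")

# Inside a loop the value of an expression is not auto-printed ...
for (i in 1:3) p
printed_implicit <- printed

# ... so the print method only runs when called explicitly
for (i in 1:3) print(p)
printed_explicit <- printed

c(implicit = printed_implicit, explicit = printed_explicit)
```

Typing p at the prompt works only because the top level prints the
result for you; loops and functions do not, which is why the jpeg came
out empty.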

Hadley

On 7/12/07, Steve Powell <[EMAIL PROTECTED]> wrote:
> Dear list members
> I am still a newbie so might be asking a stupid question, but I can't get
> ggplot to work in a loop (or a "while" statement for that matter).
>
> # to take a minimal example -
> mydata$varc = c(1,2,3)
> for (i in 1:1){
> jpeg("test3.jpg")
> plot(mydata$varc)
> #ggplot(mydata, aes(x=mydata$varc)) + geom_bar()
> dev.off()
> }
>
> this produces an empty jpeg, whereas the content of the loop produces the
> jpeg correctly.
> a standard plot() does work inside the loop.
> Any ideas? This is with R 2.4.0 and ggplot2
> thanks in advance
>
> Steve Powell
>
> proMENTE social research
>


Re: [R] ggplot2 / reshape / Question on manipulating data

2007-07-12 Thread hadley wickham
On 7/12/07, Pete Kazmier <[EMAIL PROTECTED]> wrote:
> I'm an R newbie but recently discovered the ggplot2 and reshape
> packages which seem incredibly useful and much easier to use for a
> beginner.  Using the data from the IMDB, I'm trying to see how the
> average movie rating varies by year.  Here is what my data looks like:
>
> > ratings <- read.delim("groomed.list", header = TRUE, sep = "|", 
> > comment.char = "")
> > ratings <- subset(ratings, VoteCount > 100)
> > head(ratings)
>  Title  Histogram VoteCount VoteMean Year
> 1!Huff (2004) (TV) 16   299  8.4 2004
> 8  'Allo 'Allo! (1982) 000125   829  8.6 1982
> 50  .hack//SIGN (2002) 001113   150  7.0 2002
> 561-800-Missing (2003) 000103   118  5.4 2003
> 66  Greatest Artists (2000) (mini) 00..16   110  7.8 2000
> 77 00 Scariest Movie (2004) (mini) 00..000115   256  8.6 2004

Have you tried using the movies dataset included in ggplot?  Or is
there some data that you want that is not in that dataset.

> The above data is not aggregated.  So after playing around with basic
> R functionality, I stumbled across the 'aggregate' function and was
> able to see the information in the manner I desired (average movie
> rating by year).
>
> > byYear <- aggregate(ratings$VoteMean, list(Year = ratings$Year), mean)
> > plot(byYear)
>
> Having just discovered gglot2, I wanted to create the same graph but
> augment it with a color attribute based on the total number of votes
> in a year.  So first I tried to see if I could reproduce the above:
>
> > library(ggplot2)
> > qplot(Year, x, byYear)
>
> This did not work as expected because the x-axis contained labels for
> each and every year making it impossible to read whereas the plot
> created with basic R had nice x-axis labels.  How do I get 'qplot' to
> treat the x-axis in a similar manner to 'plot'?

The problem is probably that Year is a factor - and factors are
labelled on every level (even if they overlap - which is a bug).
There's no terribly easy way to fix this, but the following will work:

qplot(as.numeric(as.character(Year)), x, data=byYear)
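
The as.character step matters.  A quick illustration with made-up
years:

```r
year <- factor(c("1982", "2004", "2002"))

# as.numeric on a factor returns the internal level codes
codes <- as.numeric(year)

# Going through character first recovers the actual years
years <- as.numeric(as.character(year))

rbind(codes, years)
```

The levels are sorted as "1982", "2002", "2004", so the codes come out
as 1, 3, 2 rather than the years themselves.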

> After playing around further, I was able to get 'qplot' to work in a
> manner similar to 'plot' with regards to the x-axis labels by using
> 'melt' and 'cast'.  The 'qplot' now behaves correctly:
>
> > mratings <- melt(ratings, id = c("Title", "Year"), measure = c("VoteCount", 
> > "VoteMean"))
> > byYear2 <- cast(mratings, Year ~ variable, mean, subset = variable == 
> > "VoteMean")
> > qplot(Year, VoteMean, data = byYear2)
>
> How do 'byYear' and 'byYear2' differ?  I am trying to use 'typeof' but
> both seem to be lists.  However, they are clearly different in some
> way because 'qplot' graphs them differently.

Try using str - it's much more helpful, and you should see the
difference quickly.

> Finally, I'd like to use a color attribute to 'qplot' to augment each
> point with a color based on the total number of votes for the year.
> Using attributes with 'qplot' seems simple, but I'm having a hard time
> grooming my data appropriately.  I believe this requires aggregation
> by summing the VoteCount column.  Is there a way to cast the data
> using different aggregation functions for various columns?  In my

Not easily, unfortunately.  However, you could do:

cast(mratings, Year ~ variable, c(mean, sum), subset = variable %in%
c("VoteMean", "VoteCount"))

which will give you a mean and sum for both.

> case, I want the mean of the VoteMean column, and the sum of the
> VoteCount column.  Then I want to produce a graph showing the average
> movie rating per year but with each point colored to reflect the total
> number of votes for that year.  Any pointers?

Using the built in movies data:

mm <- melt(movies, id=1:2, m=c("rating", "votes"))
msum <- cast(mm, year ~ variable, c(mean, sum))

qplot(year, rating_mean, data=msum, colour=votes_sum)
qplot(year, rating_mean, data=msum, colour=votes_sum, geom="line")
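
If you really do want a different aggregation function for each column,
one base-R way (a sketch on invented numbers, using the column names
from the question) is to aggregate each column separately and merge the
results:

```r
# Toy stand-in for the ratings data
ratings <- data.frame(
  Year      = c(2000, 2000, 2001, 2001, 2001),
  VoteMean  = c(6, 8, 4, 5, 9),
  VoteCount = c(100, 300, 50, 150, 250)
)

# Mean of VoteMean and sum of VoteCount, joined on Year
by_year <- merge(
  aggregate(VoteMean ~ Year, data = ratings, FUN = mean),
  aggregate(VoteCount ~ Year, data = ratings, FUN = sum)
)
print(by_year)
```

Here Year stays numeric, so the result can be passed to qplot without
the factor conversion mentioned earlier.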

Hadley



Re: [R] Drawing rectangles in multiple panels

2007-07-11 Thread hadley wickham
On 7/12/07, Stephen Tucker <[EMAIL PROTECTED]> wrote:
> In the Trellis approach, another way (I like) to deal with multiple pieces of
> external data sources is to 'attach' them to panel functions through lexical
> closures. For instance...
>
> rectInfo <-
> list(matrix(runif(4), 2, 2),
>  matrix(runif(4), 2, 2),
>  matrix(runif(4), 2, 2))
>
> panel.qrect <- function(rect.info) {
>   function(x, y, ...) {
> ri <- rect.info[[packet.number()]]
> panel.rect(ri[1, 1], ri[1, 2], ri[2, 1], ri[2, 2],
>col = "grey86", border = NA)
> panel.xyplot(x, y, ...)
>   }
> }
>
> xyplot(runif(30) ~ runif(30) | gl(3, 10),
>panel = panel.qrect(rectInfo))
>
> ...which may or may not be more convenient than passing rectInfo (and perhaps
> other objects if desired) explicitly as an argument to xyplot().

This is an interesting approach.  The one problem I see with it is
that if you change the trellising specification, you have to change
your rectInfo data structure.  I guess we're missing the code that
actually generates rectInfo in the first place, so maybe in practice
it's not such a big problem.
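
A stripped-down sketch of the concern, with lattice taken out of the
picture.  make_lookup and the panel numbers are hypothetical stand-ins
for panel.qrect and packet.number():

```r
# The closure remembers rect_info and looks up one entry per panel
make_lookup <- function(rect_info) {
  function(panel) rect_info[[panel]]
}

rect_info <- list(c(0, 1), c(2, 3), c(4, 5))   # one entry per panel
get_rect  <- make_lookup(rect_info)

second <- get_rect(2)

# If the conditioning changes so that there are four panels,
# the list no longer matches and the lookup fails
fourth <- tryCatch(get_rect(4),
                   error = function(e) "no rectangle for panel 4")
```

So the list has to be regenerated whenever the number or order of
panels changes.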

Hadley



Re: [R] Drawing rectangles in multiple panels

2007-07-11 Thread hadley wickham
On 7/12/07, Stephen Tucker <[EMAIL PROTECTED]> wrote:
> Not that Trellis/lattice was entirely easy to learn at first. :)
>
> I've been playing around with ggplot2 and there is a plot()-like wrapper for
> building a quick plot [incidentally, called qplot()], but otherwise it's my
> understanding that you superpose elements (incrementally) to build up to the
> graph you want. Here is the same plot in ggplot2:
>
> rectInfo <-
> list(matrix(runif(4), 2, 2),
>  matrix(runif(4), 2, 2),
>  matrix(runif(4), 2, 2))
>
> library(ggplot2)
> ggopt(grid.fill = "white") # just my preference
> ## original plot of points
> p <-
> qplot(x,y,data=data.frame(x=runif(30),y=runif(30),f=gl(3,30)),facets=f~.)
> # print(p)
>
> ## external data (rectangles) -> in coordinates for geom_polygon
> x <- do.call(rbind,
>  mapply(function(.r,.f)
> data.frame(x=.r[c(1,1,2,2),1],y=.r[c(1,2,2,1),2],f=.f),
> .r=rectInfo,.f=seq(along=rectInfo),SIMPLIFY=FALSE))
> ## add rectangle to original plot of points
> p+layer(geom="polygon",data=x,mapping=aes(x=x,y=y),facets=f~.)
> # will print the graphics on my windows() device

You should be able to simplify this line to:
p+geom_polygon(data=x)
because all the other information is already contained in the plot object.

> Though lattice does seem to emphasize the 'chart type' approach to graphing,
> in a way I see that it provides a similar flexibility - just that the
> specifications for each element are contained in functions and objects that
> are ultimately invoked by a high-level/higher-order function, instead of
> being combined in the linear fashion of ggplot2.

I tend to take a very data-centric approach, where you first
generate the data (in a data frame) and then you plot it.  There is
very little data creation/modification during the plotting itself - I
think this is different to lattice, where you often do more data
manipulation in the panel function itself.  I don't think one is
better or worse, just different.
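
For the rectangles, "generate the data first" might look like this
sketch.  It uses the same corner ordering as the quoted code, but the
rectangle values are fixed, invented numbers rather than runif()
output:

```r
# Each matrix: column 1 holds the two x limits, column 2 the two y limits
rect_info <- list(
  matrix(c(0.1, 0.6, 0.2, 0.9), 2, 2),
  matrix(c(0.3, 0.8, 0.1, 0.5), 2, 2)
)

# One data frame with four corner rows per rectangle,
# tagged with the facet (f) the rectangle belongs to
rect_df <- do.call(rbind, Map(function(r, f) {
  data.frame(x = r[c(1, 1, 2, 2), 1],
             y = r[c(1, 2, 2, 1), 2],
             f = f)
}, rect_info, seq_along(rect_info)))

print(rect_df)
```

Once everything is in one data frame with the facetting variable as a
column, it can be handed to a layer as its data.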

Hadley



Re: [R] Drawing rectangles in multiple panels

2007-07-11 Thread hadley wickham
On 7/12/07, Deepayan Sarkar <[EMAIL PROTECTED]> wrote:
> On 7/11/07, hadley wickham <[EMAIL PROTECTED]> wrote:
> > > A question/comment: I have usually found that the subscripts argument is
> > > what I need when passing *external* information into the panel function, 
> > > for
> > > example, when I wish to add results from a fit done external to the 
> > > trellis
> > > call. Fits[subscripts] gives me the fits (or whatever) I want to plot for
> > > each panel. It is not clear to me how the panel layout information from
> > > panel.number(), etc. would be helpful here instead. Am I correct? -- or is
> > > there a smarter way to do this that I've missed?
> >
> > This is one of the things that I think ggplot does better - it's much
> > easier to plot multiple data sources.  I don't have many examples of
> > this yet, but the final example on
> > http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.
>
> That's probably true. The Trellis approach is to define a plot by
> "data source" + "type of plot", whereas the ggplot approach (if I
> understand correctly) is to create a specification for the display
> (incrementally?) and then render it. Since the specification can be
> very general, the approach is very flexible. The downside is that you
> need to learn the language.

Yes, that's right.  ggplot basically decomposes "type of plot" into
statistical transformation (stat) + geometric object and allows you to
control each component separately.  ggplot also explicitly includes
the idea of layers (i.e. one layer is a scatterplot and another layer
is a loess smooth) and allows you to supply different datasets to
different layers.
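
A minimal sketch of that decomposition, written against the current ggplot2 API rather than the 2007 release. The mtcars data and the `heavy` subset are illustrative assumptions, not something from this thread:

```r
library(ggplot2)

# Hypothetical second data source: a subset of cars to highlight.
heavy <- subset(mtcars, wt > 5)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # layer 1: scatterplot (identity stat + point geom)
  geom_point() +
  # layer 2: loess smooth (smooth stat + line geom)
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) +
  # layer 3: a point layer that is given its own dataset
  geom_point(data = heavy, colour = "red", size = 3)

# print(p) renders the plot
```

Each `geom_*()` call adds one layer, and the `data` argument of a layer overrides the default dataset for that layer only.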

> On a philosophical note, I think the apparent limitations of Trellis
> in some (not all) cases is just due to the artificial importance given
> to data frames as the one true container for data. Now that we have
> proper multiple dispatch in S4, we can write methods that behave like
> traditional Trellis calls but work with more complex data structures.
> We have tried this in one bioconductor package (flowViz) with
> encouraging results.

That's one area which I haven't thought much about.  ggplot is very
data.frame centric and it's not yet clear to me how plotting a linear
model (say) would fit into the grammar.

Hadley



Re: [R] Drawing rectangles in multiple panels

2007-07-11 Thread hadley wickham
> A question/comment: I have usually found that the subscripts argument is
> what I need when passing *external* information into the panel function, for
> example, when I wish to add results from a fit done external to the trellis
> call. Fits[subscripts] gives me the fits (or whatever) I want to plot for
> each panel. It is not clear to me how the panel layout information from
> panel.number(), etc. would be helpful here instead. Am I correct? -- or is
> there a smarter way to do this that I've missed?

This is one of the things that I think ggplot does better - it's much
easier to plot multiple data sources.  I don't have many examples of
this yet, but the final example on
http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.

For the original poster, ggplot2 isn't that much more convenient,
because there isn't a built-in rectangle geom (although it would be
trivial to add one).  You could use the more general polygon geom,
http://had.co.nz/ggplot2/geom_polygon.html, although it currently
doesn't have much documentation.
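
As a rough sketch of the polygon route: each rectangle has to be expanded into its four corner points before a polygon geom can draw it. `rect_to_polygon` and the example coordinates below are hypothetical, made up for illustration:

```r
# Expand one row per rectangle (xmin, xmax, ymin, ymax) into the four
# corner points that a polygon geom expects. `rect_to_polygon` is a
# hypothetical helper, not part of ggplot2.
rect_to_polygon <- function(rects) {
  corners <- lapply(seq_len(nrow(rects)), function(i) {
    r <- rects[i, ]
    data.frame(
      id = i,
      x  = c(r$xmin, r$xmax, r$xmax, r$xmin),
      y  = c(r$ymin, r$ymin, r$ymax, r$ymax)
    )
  })
  do.call(rbind, corners)
}

rects <- data.frame(xmin = c(1, 4), xmax = c(3, 6),
                    ymin = c(0, 1), ymax = c(2, 5))
poly <- rect_to_polygon(rects)

# With ggplot2 installed, each rectangle is drawn as its own polygon:
# library(ggplot2)
# ggplot(poly, aes(x, y, group = id)) + geom_polygon(fill = "grey70")
```

Later ggplot2 releases do provide a `geom_rect()`, so the expansion step is only needed where that geom is unavailable.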

Hadley



Re: [R] p-value from survreg(), library(survival)

2007-07-11 Thread hadley wickham
On 7/11/07, Marc Schwartz <[EMAIL PROTECTED]> wrote:
> Actually, in this case, looking at the code for:
>
>   survival:::print.survreg
>
> would be better, as the p value is calculated there, rather than being
> part of the survreg object. As with many R functions, the p value is
> calculated in the print method for the object.

I wish print methods wouldn't do that. Printing is supposed to be
about displaying existing values, not creating new ones.
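
For reference, the printed p value can be recomputed by hand from the figures in the survreg() printout quoted in this thread. This is only a sketch using those printed numbers; the comment about `sr$loglik` is an assumption to verify with str():

```r
# Figures copied from the survreg() printout quoted in this thread.
loglik_intercept <- 25.4
loglik_model     <- 31.1
chisq_printed    <- 11.39  # 2 * (31.1 - 25.4) gives 11.4 from the rounded logliks
df               <- 1

# Upper-tail chi-square probability for the likelihood-ratio statistic.
p_value <- pchisq(chisq_printed, df = df, lower.tail = FALSE)
signif(p_value, 2)

# On a fitted object `sr` the statistic would presumably come from
# 2 * diff(sr$loglik); check str(sr) before relying on that.
```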

Hadley



Re: [R] p-value from survreg(), library(survival)

2007-07-11 Thread hadley wickham
str(survreg(s~groups, dist="gaussian"))

is probably a good place to start.

Hadley

On 7/11/07, Vlado Sremac <[EMAIL PROTECTED]> wrote:
> dear r experts:
> It seems my message got spam filtered, another try:
> i would appreciate advice on how to get the p-value from the object 'sr'
> created  with the function survreg() as given below.
> vlad
>
> sr<-survreg(s~groups, dist="gaussian")
> Coefficients:
> (Intercept)  groups
> -0.02138485  0.03868351
>
> Scale= 0.01789372
>
> Loglik(model)= 31.1   Loglik(intercept only)= 25.4
> Chisq= 11.39 on 1 degrees of freedom, p= 0.00074
> n= 16


Re: [R] How to plot two variables using a secondary Y axis

2007-07-10 Thread hadley wickham
On 7/10/07, Felipe Carrillo <[EMAIL PROTECTED]> wrote:
>   Date        Fo    Co
>   6/27/2007   57.1  13.9
>   6/28/2007   57.7  14.3
>   6/29/2007   57.8  14.3
>   6/30/2007   57    13.9
>   7/1/2007    57.1  13.9
>   7/2/2007    57.2  14.0
>   7/3/2007    57.3  14.1
>   7/4/2007    57.6  14.2
>   7/5/2007    58    14.4
>   7/6/2007    58.1  14.5
>   7/7/2007    58.2  14.6
>   7/8/2007    58.4  14.7
>   7/9/2007    58.7  14.8
>
>   Hello all:
>   I am a newbie to R, and I was wondering how can I plot the Temperature
> values above using Lattice or ggplot2 code. I want Date(X axis), Degrees F(Y
> axis) and Degrees C( on a secondary Y axis).

Hi Felipe,

It's not currently possible with ggplot2, but it is something on my to do list.
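
In the meantime, base graphics can do this. Because Celsius is just a linear rescaling of Fahrenheit, one approach is to draw a single series and label a right-hand axis in the other unit. A sketch, using the first week of readings from the post and a hypothetical helper `f_to_c`:

```r
# First week of readings from the original post (degrees Fahrenheit).
temps <- data.frame(
  date = as.Date(c("2007-06-27", "2007-06-28", "2007-06-29", "2007-06-30",
                   "2007-07-01", "2007-07-02", "2007-07-03")),
  Fo   = c(57.1, 57.7, 57.8, 57.0, 57.1, 57.2, 57.3)
)

# Hypothetical helper: Fahrenheit to Celsius.
f_to_c <- function(f) (f - 32) * 5 / 9

pdf(NULL)                                  # null device, so no display is needed
old_par <- par(mar = c(5, 4, 2, 4) + 0.1)  # leave room for the right-hand axis
plot(temps$date, temps$Fo, type = "b", xlab = "Date", ylab = "Degrees F")

# Put Celsius labels at the same positions as the Fahrenheit ticks.
ticks_f <- axTicks(2)
axis(4, at = ticks_f, labels = round(f_to_c(ticks_f), 1))
mtext("Degrees C", side = 4, line = 3)
par(old_par)
dev.off()
```

Replace `pdf(NULL)` with a real device (or drop it in an interactive session) to see the plot.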

Hadley



Re: [R] overlay boxplot

2007-07-10 Thread hadley wickham
You will get more useful answers if you specify exactly how you want
to overlay the boxplots (overlay them on what?).  You can certainly do
this with the ggplot2 package, or lattice or base graphics.
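
If "overlay" means drawing a second set of boxes on top of an existing boxplot, a base graphics sketch might look like the following. The data are simulated and the colours are arbitrary; this is one guess at what is wanted, not the only reading:

```r
set.seed(1)
x <- rnorm(50, mean = 10)
y <- rnorm(50, mean = 11)

pdf(NULL)   # null device, so no display is needed
# First box; ylim covers both samples so nothing is clipped.
b1 <- boxplot(x, border = "blue", ylim = range(x, y))
# add = TRUE draws onto the existing plot instead of starting a new one;
# a narrower box (boxwex) keeps both visible.
b2 <- boxplot(y, border = "red", boxwex = 0.4, add = TRUE)
dev.off()
```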

Hadley

On 7/10/07, Hao Liu <[EMAIL PROTECTED]> wrote:
> hi, All:
>
> I need to overlay two boxplots. I played around with points() but found
> it does not seem to work with boxplot, although it works fine with others.
> Is there a way to overlay two boxplots (using different colors) in R?
>
> There was a thread talking about using the ggplot package; however, I don't
> think there is a final solution... the answer given does not give an overlay
> but a new plot.
>
> Thanks
> Hao


[R] [R-pkgs] Reshape version 0.8

2007-07-09 Thread hadley wickham
Reshape version 0.8
http://had.co.nz/reshape

Reshape is an R package for flexibly restructuring and aggregating
data.  It's inspired by Excel's pivot tables, and it (hopefully) makes
it very easy to get your data into the shape that you want.  You can find out
more at http://had.co.nz/reshape

This version brings a few minor changes to make the output more
attractive and less surprising.  If you have any code that relies on
the exact output structure you might need to tweak it a little.

* preserve.na renamed to na.rm to be consistent with other R functions

* Column names are no longer automatically converted to valid R names.
You may need to use `` (those are backticks) to access these names.

* Margins now displayed with (all) instead of NA

* melt.array can now deal with cases where there are partial dimnames
- Thanks to Roberto Ugoccioni

* Added the Smiths dataset to the package

* Fixed a bug when displaying margins with multiple result variables
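
To illustrate the backtick point with plain base R (the data frame and the column name `total sales` are made up for the example):

```r
# check.names = FALSE keeps the column name exactly as given, much as the
# new reshape output no longer converts names.
df <- data.frame(`total sales` = c(10, 20, 30), check.names = FALSE)

names(df)            # the name contains a space, so it is not a valid R name
df$`total sales`     # backticks are needed with $
df[["total sales"]]  # or use a quoted string with [[ ]]
```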

Regards,

Hadley

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] possible bug in ggplot2 v0.5.2???

2007-07-09 Thread hadley wickham
On 7/6/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> On Wed, 4 Jul 2007, hadley wickham wrote:
>
> > On 7/4/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> >> On Tue, 3 Jul 2007, hadley wickham wrote:
> >>
> >> > Hi Stephane,
> >> >
> >> > The problem is that the windows graphics device doesn't support
> >> > transparent colours.  You can get around this in two ways:
> >>
> >> It certainly does!  Try col="transparent" (and perhaps consult your
> >> dictionary).  It was news to me that the windows() graphics device worked
> >> on Linux i586.
> >
> > Well my dictionary defines transparent as "allowing light to pass
> > through so that objects behind can be distinctly seen" which I believe
> > applies here (ie. stained glass windows and blue points with alpha 0.5
> > are both transparent).  What does your dictionary say?
>
> Not quite the same, but even by your definition col="transparent" is
> transparent.  In this context
>
> http://en.wikipedia.org/wiki/Transparency_%28graphic%29
>
> seems more pertinent.

col="transparent" is transparent by any reasonable definition, but
does it make sense to claim that a graphics device "supports"
transparency?  How can you tell the difference between a transparent
object and nothing?

> >> What it does not support as yet is translucent colours, and that is a
> >> restriction imposed by Windows (translucency support was introduced for
> >> Windows XP, and we still try to support older versions of Windows, unlike
> >> the MacOS people).  I have been working on a workaround, so translucency
> >> support is likely to be implemented in R 2.6.0 for users of XP or later.
> >
> > I am confused by your implication that windows (prior to XP) does not
> > support translucency.  Perhaps it is not supported at the operating
> > system level, but it has certainly been available at the application
> > level for a very long time.
>
> Really? It's hard to reply to unspecific assertions.  But remember XP has
> been out since 2001, almost as long as PDF has supported translucency.

Yes, I agree, and thank you for providing some support for your
statements.  Java has supported transparency since version 1.2 (with
the Graphics2D class), which was released on Dec 4, 1998, so certainly
some applications were drawing transparent graphics on windows.

> >> Given that neither of the two main screen devices and neither of the
> >> standard print devices support translucency, the subject line looks
> >> correct to me: the problem surely lies in the assumptions made in ggplot2.
> >
> > The features of the windows and X11 devices clearly lag behind the
> > quartz and pdf devices.  I can program for the lowest common
> > denominator or I can use modern features that support the tasks I am
> > working on.  I choose the latter, and it is certainly your prerogative
> > to declare that a bug in me.
>
> I think to make undocumented assumptions about the environment is unkind
> to your would-be users.  Ideally the graphics devices would detect and

I have tried to point that out in the documentation in most places
where I used alpha-blending, but I did miss a few.  I think part of my
job is to educate users about what is possible with R, even though it
might not be currently available in their default setup.

> report that, but that is not how support for semi-transparency was added.
> As a by-product of adding limited translucency support on the windows()
> family of devices, they do now warn.

That's great news.

> You also need to check that the extra features work correctly.  I found
> some problems with all the devices I tried that support translucency (or
> at least with device+viewer combinations for pdf and svg).  Issues include
> whether translucent fills are rendered at all, blending translucent
> colours with transparent backgrounds, and the model used (is it the light
> intensity or the perceptual colours that are being blended?).

Could you provide more details about these bugs so that I can look
into the implications for my code?  I haven't seen any problems with
preview or acrobat on the mac.

Regards,

Hadley



Re: [R] using the function unique(), but asking it to ignore a column of a data.frame

2007-07-09 Thread hadley wickham
On 7/9/07, Peter Dalgaard <[EMAIL PROTECTED]> wrote:
> Andrew Yee wrote:
> > Thanks.  But in this specific case, I would like the output to include
> > all three columns, including the "ignored" column (in this case, I'd
> > like it to ignore column a).
> >
> df[!duplicated(df[,c("a","c")]),]
>
> or perhaps
>
> df[!duplicated(df[-2]),]

Yes - of course.  I was momentarily confused about unique vs. duplicated.  Oops!
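
Note that the expressions quoted above pass columns a and c to duplicated(). To ignore column a, as originally requested, the same idea would presumably be applied to b and c instead. A sketch with the sample data from the original post:

```r
df <- data.frame(a = c(1, 1, 5), b = c(3, 2, 3), c = c(5, 1, 5))

# Look for duplicates in columns b and c only, then use the result to
# pick rows from the full data frame, so column a is still returned.
result <- df[!duplicated(df[, c("b", "c")]), ]
result
```

Rows 1 and 3 agree on b and c, so only the first of them is kept, along with row 2.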

Hadley



Re: [R] about scagnostics

2007-07-09 Thread hadley wickham
Hi Olivier,

You can call scagnostics either with two vectors, or a data.frame (in
which case it computes all pairwise scagnostics).

I just double checked to make sure I didn't accidentally misname the
vector of scagnostics in R, and it doesn't look like I did, so could
you please send me a reproducible example so I can look into it more
closely.

Thanks,

Hadley

On 7/9/07, Olivier ETERRADOSSI <[EMAIL PROTECTED]> wrote:
> Hi Hadley,
> thank you for providing this "scagnostics" primer
> I was trying to do some basic testing, and I see that I probably missed
> some points :
> first it's not clear for me if the argument of "scagnostics" should be
> raw data or "processed" data (results of calling "splom" or whatever...).
> If the first, I thought (from Wilkinson & al.) that if taking as an
> example variables x and y being the coordinates of a circle, I should
> find in scagnostics(x,y)$s :
> Skinny = 0 and Convex =1.
> I get Skinny = 1 and Convex =0 What am I missing ?
> (My God, I'm feeling myself going to be "Ripleyed" !.)
> Regards, Olivier
>
> --
> Olivier ETERRADOSSI
> Maître-Assistant
> CMGD / Equipe "Propriétés Psycho-Sensorielles des Matériaux"
> Ecole des Mines d'Alès
> Hélioparc, 2 av. P. Angot, F-64053 PAU CEDEX 9
> tel std: +33 (0)5.59.30.54.25
> tel direct: +33 (0)5.59.30.90.35
> fax: +33 (0)5.59.30.63.68
> http://www.ema.fr
>
>



Re: [R] using the function unique(), but asking it to ignore a column of a data.frame

2007-07-09 Thread hadley wickham
On 7/9/07, Andrew Yee <[EMAIL PROTECTED]> wrote:
> Take for example the following data.frame:
>
> a<-c(1,1,5)
> b<-c(3,2,3)
> c<-c(5,1,5)
> sample.data.frame<-data.frame(a=a,b=b,c=c)
>
> I'd like to be able to use unique(sample.data.frame), but have
> unique() ignore column a when determining the unique elements.
>
> However, I figured that this would be a setting for incomparables=, but
> it appears that this functionality hasn't been incorporated.  Is
> there a workaround for this, i.e. a way to get unique to only
> look at selected columns of a data frame?

unique(df[,c("a","c")]) ?

Hadley



[R] [R-pkgs] ggplot 0.5.4

2007-07-09 Thread hadley wickham
ggplot2
===

ggplot2 is a plotting system for R, based on the grammar of graphics,
which tries to take the good parts of base and lattice graphics and
avoid the bad parts. It takes care of many of the fiddly details
that make plotting a hassle (like drawing legends) as well as
providing a powerful model of graphics that makes it easy to produce
complex multi-layered graphics.

Find out more at http://had.co.nz/ggplot2, and check out the over 500
examples of ggplot in use.

Changes in version 0.5.4 --

* border now drawn on top of geoms, instead of below - this results in
better appearance when adjusting scale limits
* ggplot() + aes() now modifies existing default aesthetic mapping,
rather than overwriting
* polished examples in facet_grid

Changes in version 0.5.3 --

* added experimental scatterplot matrix, see ?plotmatrix
* added new border.colour and grid.minor.colour options for better
control over plot appearance
* updated theme_bw to do better when drawing a plot with white background
* better default colour choices for gradients (and more discussion in examples)
* fixed bug in ScaleGradient2 where scales with different positive and
negative ranges were not scaled correctly
* allow expressions as result from strip.text
* fixed rare bug in geom_vline and geom_hline
* fixed example in geom_abline
* tweaked display of multiline axis labels

Regards,

Hadley



[R] [R-pkgs] Scagnostics - scatterplot diagnostics

2007-07-08 Thread hadley wickham
The scagnostics package implements the graph theoretic scagnostics
described by Leland Wilkinson, Anushka Anand and Robert Grossman
(http://www.ncdm.uic.edu/publications/files/proc-094.pdf), building on
an old idea of Tukey's to define indices of "interestingness" to help
guide the search for interesting features in the pair-wise
scatterplots of a highly multivariate dataset.

The scagnostics package currently only supports two methods, one which
computes the scagnostics for a pair of variables, and the other for
all pairs of variables in a data.frame.

If you are attending the JSM, there is a session on scagnostics.
Details are available at http://tinyurl.com/324yb5

(The package has just been added to CRAN, it may be a couple of days
before it is available on your local mirror)

Regards,

Hadley



Re: [R] ggplot2 customizing

2007-07-05 Thread hadley wickham
Hi Ido,

On 7/5/07, Ido M. Tamir <[EMAIL PROTECTED]> wrote:
> Dear all,
>
> I know that ggplot2 documentation is coming along,
> but at the moment I can't find how to do the following:
> a) change the title of the legend

There are lots of examples in the documentation - and you seem to have
figured out how to change the axis labels - so you should find it pretty
easy!

 + scale_colour("new legend name")

> b) get rid of the closing line at the bottom of the
> density line.

Try:

 + stat_density(..., geom="path")
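
The two snippets above use the 0.5.x interface. In current ggplot2 releases a roughly equivalent plot could be written as below; this is a sketch with simulated data, and the variable names are assumptions:

```r
library(ggplot2)

set.seed(1)
# Hypothetical data: two groups of values.
d <- data.frame(value = c(rnorm(100), rnorm(100, mean = 2)),
                group = rep(c("a", "b"), each = 100))

p <- ggplot(d, aes(x = value, colour = group)) +
  geom_line(stat = "density") +      # density drawn as an open line, no baseline
  labs(colour = "new legend name")   # sets the legend title

# print(p) renders the plot
```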

> I also observed that the density lines (after limiting the
> x-scale) extend a little bit beyond the plotting panel,
> which shows up very clearly when plotted as pdf.
> They extend into the white space between the ticks and the
> plotting panel.

Yes, this is a bug - I'll try and get it fixed in the next version.

Thanks,

Hadley



Re: [R] possible bug in ggplot2 v0.5.2???

2007-07-04 Thread hadley wickham
On 7/4/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> On Tue, 3 Jul 2007, hadley wickham wrote:
>
> > Hi Stephane,
> >
> > The problem is that the windows graphics device doesn't support
> > transparent colours.  You can get around this in two ways:
>
> It certainly does!  Try col="transparent" (and perhaps consult your
> dictionary).  It was news to me that the windows() graphics device worked
> on
> Linux i586.

Well my dictionary defines transparent as "allowing light to pass
through so that objects behind can be distinctly seen" which I believe
applies here (ie. stained glass windows and blue points with alpha 0.5
are both transparent).  What does your dictionary say?

> What it does not support as yet is translucent colours, and that is a
> restriction imposed by Windows (translucency support was introduced for
> Windows XP, and we still try to support older versions of Windows, unlike
> the MacOS people).  I have been working on a workaround, so translucency
> support is likely to be implemented in R 2.6.0 for users of XP or later.

I am confused by your implication that windows (prior to XP) does not
support translucency.  Perhaps it is not supported at the operating
system level, but it has certainly been available at the application
level for a very long time.

> Given that neither of the two main screen devices and neither of the
> standard print devices support translucency, the subject line looks
> correct to me: the problem surely lies in the assumptions made in ggplot2.

The features of the windows and X11 devices clearly lag behind the
quartz and pdf devices.  I can program for the lowest common
denominator or I can use modern features that support the tasks I am
working on.  I choose the latter, and it is certainly your prerogative
to declare that a bug in me.

Hadley



Re: [R] possible bug in ggplot2 v0.5.2???

2007-07-03 Thread hadley wickham
Hi Stephane,

The problem is that the windows graphics device doesn't support
transparent colours.  You can get around this in two ways:

 * export to a device that does support transparency (eg. pdf)
 * use a solid fill colour : + stat_smooth(method="lm", fill="grey50")
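
A small base graphics sketch of the first option: build a half-transparent colour and draw with it on the pdf device. The shapes and the temporary file are arbitrary choices for illustration:

```r
# A half-transparent blue: alpha = 0.5 is encoded as the trailing "80".
blue_half <- rgb(0, 0, 1, alpha = 0.5)

out <- tempfile(fileext = ".pdf")
pdf(out)                       # the pdf device supports alpha blending
plot(1:10, type = "n")
lines(1:10, 1:10)              # drawn first, so it sits under the fill
polygon(c(2, 8, 8, 2), c(2, 2, 8, 8), col = blue_half, border = NA)
dev.off()
```

The line stays visible through the filled square in the resulting file, which is the effect the confidence band relies on.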

Hadley

On 7/3/07, Stephane Cruveiller <[EMAIL PROTECTED]> wrote:
> Dear R-Users,
>
> I recently gave the nice package ggplot2 a try. Everything went
> well until I tried to add a smoother (using the lm method, for instance).
> On the graphics device the regression line is displayed, but not the
> confidence intervals as it should be (at least according to the ggplot
> website). I tried to do the job on both MS WinXP and Linux i586: same
> result. Has anyone encountered this problem? Did I miss something?
>
>
> My R version is 2.4.1.
>
>
>
> Thanks,
>
> Stéphane.
>
>
> --
> ==
> Stephane CRUVEILLER Ph. D.
> Genoscope - Centre National de Sequencage
> Atelier de Genomique Comparative
> 2, Rue Gaston Cremieux   CP 5706
> 91057 Evry Cedex - France
> Phone: +33 (0)1 60 87 84 58
> Fax: +33 (0)1 60 87 25 14
> EMails: [EMAIL PROTECTED] ,[EMAIL PROTECTED]
> ===


[R] [R-pkgs] Clusterfly

2007-07-01 Thread hadley wickham
clusterfly
http://had.co.nz/clusterfly/

Typically, there is somewhat of a divide between statistics and
visualisation software. Statistics software, particularly R, provides
implementation of cutting edge research methods, but limited graphics.
Visualisation software will provide sophisticated visual interfaces,
but few statistical algorithms. The clusterfly package presents some
early experimentation aimed at overcoming this deficiency by linking R
and GGobi. Cluster analysis was chosen as it is an exploratory method
that needs sophisticated visualisation and statistical algorithms.

Clusterfly provides some tools that work with all clustering
algorithms, and some that are tailored for particular ones.  Generic
tools allow you to animate between clusterings (see ?cfly_animate) and
produce common static graphics (?cfly_dist, ?cfly_pcp).  Specific
algorithms are available for:

* Self organising maps (aka Kohonen neural networks), ?ggobi.som.
Displays the self organising map/net in the original space of the
data.

* Hierarchical clustering, ?hierfly. Connects data points with lines
like a dendrogram, but in the high-dimensional space of the original
data

* Model based clustering, ?mefly. Adds ellipsoids from the
multivariate normal distributions the clusters are based on

You will need GGobi (http://www.ggobi.org) and rggobi
(http://www.ggobi.org/rggobi) installed to be able to use clusterfly.

Regards,

Hadley



Re: [R] Plots from categorial data

2007-07-01 Thread hadley wickham
On 7/1/07, Christoph Krammer <[EMAIL PROTECTED]> wrote:
> Hello Hadley,
>
> Thanks a lot for your help. I got the plot I want out of this module with a
> slightly more complicated command.
>
> But now, I have an additional problem:
>
> In the given case, the "filtersetting" column contains letters, so R takes
> the values as categories. But I have other filters, which only have numeric
> categories like "0.125", "0.25", "1", and so on. But there is no real
> "distance" between these values, so the data is still categorial. But if I
> draw a plot from this data, the result is a plot with axis labels like 0.2,
> 0.4, 0.6, ...
>
> How do I tell R to treat the numbers in the filtersetting column as
> categories?

Just make it a factor:
qplot(factor(filter_setting), avg.hit., data=data, colour=ocrtool, geom="line")
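
What factor() does to numeric settings can be checked in plain R. The values below are made up to mirror the ones mentioned in the question:

```r
filter_setting <- c(1, 0.125, 0.25, 1, 0.125, 0.25)

f <- factor(filter_setting)
levels(f)       # one level per distinct value, sorted numerically
as.integer(f)   # level positions 1, 2, 3: evenly spaced on a discrete axis
```

With a discrete x axis, current ggplot2 releases generally also need an explicit grouping (for example group=ocrtool) before lines are connected across categories.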

Hadley



Re: [R] Plots from categorial data

2007-07-01 Thread hadley wickham
Perhaps this will do what you want:

library(ggplot2)
qplot(filter_setting, avg.hit., data=data, colour=ocrtool, geom="line")

find out more about ggplot2 at http://had.co.nz/ggplot2

Hadley

On 7/1/07, Christoph Krammer <[EMAIL PROTECTED]> wrote:
> Hello everybody,
>
> Since my first message was caught by the spam filter, I just try to do it
> again:
>
> I want to use R to generate plots from categorial data. The data contains
> results from OCR scans of images which are preprocessed by different image
> filtering techniques. A small sample data set looks as follows:
>
> > data <- read.csv("d:/tmp_da/sql_data/filter_d_tool.csv", header=T)
> > data
>       ocrtool filter_setting avg.hit.
> 1  FineReader            2x1    0.383
> 2  FineReader            2x2    0.488
> 3  FineReader            3x2    0.268
> 4  FineReader            3x3    0.198
> 5  FineReader            4x3    0.081
> 6  FineReader            4x4    0.056
> 7        gocr            2x1    0.153
> 8        gocr            2x2    0.102
> 9        gocr            3x2    0.047
> 10       gocr            3x3    0.052
> 11       gocr            4x3    0.014
> 12       gocr            4x4    0.002
> 13      ocrad            2x1    0.085
> 14      ocrad            2x2    0.094
> 15      ocrad            3x2    0.045
> 16      ocrad            3x3    0.050
> 17      ocrad            4x3    0.025
> 18      ocrad            4x4    0.009
>
>
> I now want to draw a plot with the categories (filter_setting) as X axis,
> and the avg_hit as Y axis. There should be lines for each ocrtool.
>
> But when I draw a plot, the resulting plot always contains bars, even if I
> specify type="n".
> > plot(data$filter_setting, data$avg.hit., type="n")
>
> When I only plot the categories, without data, there appear strange grey
> (but empty) boxes.
> > plot(data$filter_setting, type="n")
>
> How do I get a clean white box to draw the different lines in?
>
> Thanks and regards,
>  Christoph
>
> ---
> Christoph Krammer
> Student
>
> University of Mannheim
> Laboratory for Dependable Distributed Systems A5, 6
> 68131 Mannheim
> Germany


Re: [R] Error en assign(mname, def, where)

2007-06-28 Thread hadley wickham
Hi Martin,

Could you please provide a minimal replicable example so that we can
investigate further.

Thanks,

Hadley

On 6/28/07, Martín Gastón <[EMAIL PROTECTED]> wrote:
>
> Hi R users,
> I am working with the fda package, but when I call the function pca.fd I
> get an error message which I can't identify. The error says:
>  Error in assign(mname, def, where): cannot add bindings to a
> locked environment.
> The command that I'm writing is:
>
> > cp1 <- pca.fd(ind.fd1,nharm=3)
>
> and before it I am able to plot the functional data object ind.fd1.
> Has anybody seen this error or any similar message?
> Any ideas?
>
> Thanks for your help


Re: [R] Adding different output to different lattice panels

2007-06-28 Thread hadley wickham
On 6/28/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> I would like to add a reference line to lattice graphs, with the reference
> line being different according to the factor level.
>
> Example : Draw 3 dotplots for "a","b" and "c" factors, and then add a
> horizontal line at y=10 for panel "a", y=8 for panel "b" and y=6 for panel "c"

It's quite easy to do this with ggplot2 (http://had.co.nz/ggplot2) -
see http://had.co.nz/ggplot2/geom_vline.html for examples of both
common and specific reference lines.
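
A sketch of panel-specific reference lines using the current ggplot2 API (not the 2007 release). The data are simulated; the reference values 10, 8 and 6 come from the question:

```r
library(ggplot2)

set.seed(1)
d <- data.frame(f = rep(c("a", "b", "c"), each = 5),
                x = rep(1:5, times = 3),
                y = runif(15, min = 4, max = 12))

# One row per panel: the facetting variable plus that panel's reference value.
refs <- data.frame(f = c("a", "b", "c"), ref = c(10, 8, 6))

p <- ggplot(d, aes(x, y)) +
  geom_point() +
  geom_hline(aes(yintercept = ref), data = refs, linetype = "dashed") +
  facet_wrap(~ f)

# print(p) renders the plot
```

Because `refs` carries the facetting variable `f`, each line is drawn only in its matching panel.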

Hadley



Re: [R] how to plot this?

2007-06-25 Thread hadley wickham
On 6/25/07, jim holtman <[EMAIL PROTECTED]> wrote:
> You might want to check out this link to the type of graphs that R can
> produce and find one you like; the code will be with it.
>
> http://addictedtor.free.fr/graphiques/allgraph.php

Or for examples using the ggplot2 package:

http://had.co.nz/ggplot2

Hadley



Re: [R] Overlaying lattice graphs (continued)

2007-06-22 Thread hadley wickham
Yes - you'll need ggplot2.

Hadley

On 6/22/07, Sébastien <[EMAIL PROTECTED]> wrote:
> Hadley,
>
> I have some troubles to run your code with ggplot version 0.4.1. Is the
> package ggplot2 mandatory ?
>
> Sebastien
>
> hadley wickham a écrit :
> > Hi Sebastian,
> >
> > I think the following does what you want:
> >
> > library(ggplot2)
> > names(mydata) <- tolower(names(mydata))
> >
> > obs <- rename(subset(mydata, model=="A", -predicted), c("observed" =
> > "value"))
> > obs$model <- factor("observed")
> > pred <- rename(mydata[, -5], c("predicted" = "value"))
> > all <- rbind(obs, pred)
> >
> > ggplot(all, aes(x = time, y = value, colour=model)) +
> > geom_point(data = subset(all, model == "observed")) +
> > geom_line(data = subset(all, model != "observed")) +
> > facet_grid(. ~ individuals)
> >
> > Hadley
> >
> > On 6/22/07, Sébastien <[EMAIL PROTECTED]> wrote:
> >> Hi Deepayan,
> >>
> >> The following code creates a dummy dataset which has the same similar as
> >> my usual datasets. I did not try to implement the changes proposed by
> >> Hadley, hoping that a solution can be found using the original dataset.
> >>
> >> # My code
> >>
> >> # Creating dataset
> >>
> >> nPts<-10# number of time points
> >> nInd<-6  # number of individuals
> >> nModel<-3 # number of models
> >>
> >> TimePts<-rep(1:nPts,nInd*nModel)#
> >> creates the "Time" column
> >> Coef<-rep(rnorm(6,0.1,0.01),each=nPts,nModel) # Creates a
> >> vector of coefficients for generating the observations
> >> Obs<-10*exp(-Coef*TimePts) #
> >> creates the observations
> >>
> >> for (i in 1:60){
> >> Pred[i]<-jitter(10*exp(-Coef[i]*TimePts[i]))
> >> Pred[i+60]<-jitter(5)
> >> Pred[i+120]<-jitter(10-Coef[i+120]*TimePts[i])
> >> }
> >>   # creates the predicted values
> >>
> >> colPlot<-rep(1,nPts*nInd*nModel)
> >> # creates the "Plot" column
> >> colModel<-gl(nModel,nPts*nInd,labels=c("A","B","C")) #
> >> creates the "Model" column
> >> colID<-gl(nInd,nPts,nPts*nInd*nModel)
> >>   # creates the "ID" column
> >>
> >> mydata<-data.frame(colPlot,colModel,colID,TimePts,Obs,Pred)
> >>   # creates the dataset
> >> names(mydata)<-c("Plot","Model","Individuals","Time","Observed","Predicted")
> >>
> >>
> >> # Plotting as indicated by Deepayan
> >>
> >>
> >> xyplot(Observed + Predicted ~ Time | Individuals + Model,
> >>   data = mydata,
> >>   panel = panel.superpose.2, type = c("p", "l"),
> >>   layout = c(0, nlevels(mydata$Individuals))) #,
> >>   #<...>)
> >>
> >> ### End of code
> >>
> >> This codes is not exactly what I am looking for, although it is pretty
> >> close. In the present case, I would like to have a Trellis plot with 6
> >> panels (one for each individual), where the Observations and the
> >> Predicted are plotted as symbols and lines, respectively. All three
> >> models should be plotted on the same panel. Unfortunately, it looks to
> >> me as 3 successives xyplots are created by the code above but only the
> >> last one remains displayed. I tried to play with
> >> panel.superpose,panel.superpose.2 and type, without much success.
> >>
> >> I also tried the following code that creates 18 panels and distinguish
> >> all (Individuals,Model) couples... so, not what I want.
> >>
> >> xyplot(Observed + Predicted ~ Time | Individuals+Model, data = mydata,
> >>  type = c("p", "l"), distribute.type = TRUE)
> >>
> >> Sebastien
> >>
> >>
> >> Deepayan Sarkar a écrit :
> >> > On 6/21/07, Sébastien <[EMAIL PROTECTED]> wrote:
> >> >> Hi Hadley,
> >> >>
> >> >> Hopefully, my dataset won't be too hard to changed. Can I modify the
> >> >> aspect of each group using your code

Re: [R] Visualize quartiles of plot line

2007-06-22 Thread hadley wickham
On 6/17/07, Arne Brutschy <[EMAIL PROTECTED]> wrote:
> Hi,
>
> thanks for your tips - all of them worked. After a bit of fiddling, I
> managed to get what I wanted.

Glad to hear it.

> hadley wickham wrote:
> h> You might want to read the introductory chapters in the ggplot book,
> h> available from http://had.co.nz/ggplot2, which will give you more of a
> h> background.  Please let me know places where you think the
> h> documentation is inconsistent so I can try and make them better.
> I already did. :) A general problem: the examples are nice and easy to
> get, but it's often hard to apply them to my own specific problem.
> It's more a problem of the general structure: what has to go where.
> Most of the methods are using qplot, but what do I have to do if I'm
> trying create a more complex plot. Hmm, it's hard to describe.
>
> Example: I know how to set the title when using qplot
> (qplot(main="asdf")). Where do I have to put it when I'm using ggplot?
> Stuff like this is unclear...

p <- ggplot(...) + ...
p$title <- "Title goes here"
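
For readers on later ggplot2 releases, a hedged sketch of the same thing:
there the title is normally added with labs() (or ggtitle()) rather than
assigned to p$title. The data frame `d` is made up for illustration.

```r
library(ggplot2)

# Made-up example data.
d <- data.frame(x = 1:5, y = c(2, 4, 3, 5, 6))

# Build the plot first, then add the title as another component.
p <- ggplot(d, aes(x = x, y = y)) + geom_point()
p <- p + labs(title = "Title goes here")
p
```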

It is hard to figure this out from the current documentation, though.

> A more general problem is, that the manual pages are very, eh,
> minimalistic documented. The overall reference page is good and nicely
> structured. But the big idea is sort of missing. All components are
> linked, but the basics like layout, ggplot, aes etc are harder to find
> - and their help pages are the shortest. Especially the small details
> are hard to figure out. Lists of attributes etc..

Yes, that's definitely something I'm working on for the book.
Unfortunately, I don't have that much time and it is a lot of work.
Every comment helps though.

> Hmm, I know this is not really helpful. I can't describe my problems
> properly, I guess. Perhaps the documentation simply has to improve
> based on users questions. :\
>
> How old is this package? I think it's really, really great, but are
> there many users? Is there an additional mailinglist or forum where I
> can get more information?

It's pretty young still, although the precursor ggplot package has
been around for about a year.  I really have no idea how many users
there are.  For questions, either email me or R-help.

> Some more questions:
>
> Why doesn't ggplot2 work with layout()? I'm using viewport now, which
> works fine for me, but there should be a note in the docs perhaps.

Because it works with the grid drawing package - see the last chapter
in the ggplot book for some details on how to use the grid
equivalents.

> How do I change the legend. The auto-creation of it might be nice,
> but I want a) to add a title b) change the order to ascending and c)
> add a short description like:
>
>   DeltaConfig
>   [ ] 0 best
>   [ ]
>   [ ] 5
>   [ ]
>   [ ]10 worst
>
> I don't know if this is possible, but it would be nice to explain what
> the colors/values might mean if it's not clear from the beginning
> (like diamonds.size). The only thing I found was the attribute
> legend.justification in ggopt, which isn't fully documented.

The legends aren't very customisable at the moment - look at the
examples for the scale functions to see what you can do.  You can set
the title easily (it is the name of the scale), and you can change the
labels by changing the levels of the factor, or by setting the breaks
argument.  I agree there could be more options.  If you could provide me
with a picture of what you want, I'll add it to my to do list to think about.

> Additionally, how can I change the order of the facets? I currently
> have a plot with a smoother for each model (all in the same plot),
> which sorts the models like this: dyn,dl4,dl3 Below that, I have a
> facet with point-plots for each model which sorts them the other way
> round, which is a bit confusing.

Again, change the order of the underlying factor.

> BTW, what's the "strip" and the associated attributes?

The strip is the label associated with the facet.

> Again, I think this package is great - nice work! All the above isn't
> meant as general criticism, but is being said in order to improve the
> documentation.

I do appreciate your comments and they definitely help me to make a
better product.

Thanks,

Hadley


Re: [R] Stacked barchart color

2007-06-22 Thread hadley wickham
Hi Owen,

The bars should be stacked in the order specified by the factor.  Try
using factor(..., levels=...) to explicitly order them the way you
want.  If that doesn't work, please provide a small replicable example
and I'll look into it.
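
A small base-R sketch of what explicit levels look like, using a few rows
rebuilt from the table in the question below. The particular level order
chosen here is only an example; the point is that the order comes from
levels=, not from the alphabetical order of the colour names.

```r
# A few rows rebuilt from the table in the question.
y <- data.frame(
  Fnd   = c("signeg", "neg", "pos", "sigpos"),
  locus = "DPB1",
  Freq  = c(0.013071895, 0.581699346, 0.379084967, 0.026143791),
  color = c("gray1", "gray35", "gray45", "gray65"),
  stringsAsFactors = FALSE
)

# Without explicit levels, factor() sorts the colour names itself.
levels(factor(y$color))

# With explicit levels, the order is whatever you specify.
y$color <- factor(y$color, levels = c("gray65", "gray45", "gray35", "gray1"))
levels(y$color)
```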

Hadley

On 6/18/07, owenman <[EMAIL PROTECTED]> wrote:
>
> Hi Hadley,
> Great, I am starting to get it.  It's working for me, but there is one more
> thing I am having trouble with.  The ordering of the stacked bars seems to
> be dictated by the name of the color, I guess because of the fill=color
> argument in aes().  In other words, if I set up my colors like this:
> y$color = c("gray1","gray35","gray45","gray65")  the bars get stacked in the
> opposite order than if I set up the colors like this:  y$color =
> c("gray65","gray45","gray35","gray1").  How can I control the order of the
> bars independent of the name of the colors?   Thanks so much in advance!
> Really neat package you've made.
>
> FYI, my plot command now looks like this:
>
> p = ggplot(y, aes(x=locus, y=Freq, fill=color))
> p = p + geom_bar(position="fill")
> p = p + scale_fill_identity(labels=levels(y$Fnd), grob="tile", name="Fnd
> Results")
> p = p + coord_flip()
>
> And the data table is similar as before:
>
> > y
>   Fnd locusFreq  color
> 1  signeg  DPB1 0.013071895  gray1
> 2 neg  DPB1 0.581699346 gray35
> 3 pos  DPB1 0.379084967 gray45
> 4  sigpos  DPB1 0.026143791 gray65
> 5  signeg  DPA1 0.068181818  gray1
> 6 neg  DPA1 0.659090909 gray35
> 7 pos  DPA1 0.25000 gray45
> 8  sigpos  DPA1 0.022727273 gray65
>
>
>
> hadley wrote:
> >
> > Hi Owen,
> >
> > The identity scale won't create a legend, unless you tell it what
> > labels it should use - there's an example at
> > http://had.co.nz/ggplot2/scale_identity.html.  Otherwise, if you have
> > a continuous scale and you want something that works in black and
> > white, p + scale_fill_gradient(low="white", high="black") might be
> > easier.
> >
> > Hadley
> >
> >
> >>
> >> > y$color = factor(y$Fnd)
> >> > y$color = c("black","darkgray","lightgray","white")
> >> > y
> >>   Fnd locusFreq color
> >> 1  signeg A 0.087248322 black
> >> 2 neg A 0.711409396  darkgray
> >> 3 pos A 0.201342282 lightgray
> >> 4  sigpos A 0.0 white
> >> 5  signeg C 0.320754717 black
> >> 6 neg C 0.603773585  darkgray
> >> 7 pos C 0.075471698 lightgray
> >> 8  sigpos C 0.0 white
> >> 9  signeg B 0.157534247 black
> >> 10neg B 0.732876712  darkgray
> >> 11pos B 0.109589041 lightgray
> >> 12 sigpos B 0.0 white
> >>
> >> > p = ggplot(y, aes(x=locus, y=Freq, fill=color)) +
> >> > geom_bar(position="fill") + scale_fill_identity()
> >> > p
> >>
> >>
> >>
> >>
> >> hadley wrote:
> >> >
> >> >
> >> > Hi Dieter,
> >> >
> >> > You can do this with ggplot2 (http://had.co.nz/ggplot2) as follows:
> >> >
> >> > library(ggplot2)
> >> >
> >> > barley1 <- subset(barley, site=="Grand Rapids" & variety %in%
> >> > c("Velvet","Peatland"))
> >> > barley1[] <- lapply(barley1, "[", drop=TRUE)
> >> >
> >> > qplot(variety, yield, data=barley1, geom="bar", stat="identity",
> >> > fill=factor(year))
> >> >
> >> > barley1$fill <- c("red","green","blue","gray")
> >> > qplot(variety, yield, data=barley1, geom="bar", stat="identity",
> >> > fill=fill) + scale_fill_identity()
> >> >
> >> > See http://had.co.nz/ggplot2/scale_identity.html and
> >> > http://had.co.nz/ggplot2/position_stack.html for more details.
> >> >
> >> > Hadley
> >> >
> >> >
> >>
> >>
> >
>
>
>


Re: [R] Switching X-axis and Y-axis for histogram

2007-06-22 Thread hadley wickham
It's trivial to do this with ggplot2 (http://had.co.nz):

qplot(rating, data=movies, geom="histogram") + coord_flip()
qplot(rating, data=movies, geom="histogram", binwidth=0.1) + coord_flip()
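
For anyone staying with base graphics, a rough equivalent can be sketched
with hist() and barplot(). The vector `x` is made-up example data; the
breaks argument sets the bin step, and horiz = TRUE turns the bars on
their side.

```r
# Made-up example data.
x <- c(2.1, 3.4, 3.8, 4.0, 4.2, 4.9, 5.5, 5.6, 6.3, 7.7)

# Compute the bins without drawing; `breaks` sets the step of the bins.
h <- hist(x, breaks = seq(2, 8, by = 1), plot = FALSE)

# Draw the counts as horizontal bars, labelled by bin midpoint.
barplot(h$counts, horiz = TRUE, names.arg = h$mids,
        xlab = "Frequency", ylab = "x", las = 1)
```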

Hadley

On 6/22/07, Donghui Feng <[EMAIL PROTECTED]> wrote:
> Dear all,
>
> I'm creating a histogram with the function hist(). But
> right now what I get is column representation (as normal).
> I'm wondering if I could switch X-axis and Y-axis and
> get row-representation of frequencies?
>
> One more question, can I define the step of each axises
> for the histogram?
>
> Thanks so much!
>
> Donghui
>
>


Re: [R] Overlaying lattice graphs (continued)

2007-06-22 Thread hadley wickham
Hi Sebastian,

I think the following does what you want:

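library(ggplot2)
names(mydata) <- tolower(names(mydata))

# Observations are repeated for each model, so keep one copy and relabel it.
# rename() comes from the reshape package.
obs <- rename(subset(mydata, model=="A", -predicted), c("observed" = "value"))
obs$model <- factor("observed")
# Column 5 is "observed"; drop it so that only the predictions remain.
pred <- rename(mydata[, -5], c("predicted" = "value"))
all <- rbind(obs, pred)

# Observations as points, predictions as lines, one panel per individual.
ggplot(all, aes(x = time, y = value, colour=model)) +
geom_point(data = subset(all, model == "observed")) +
geom_line(data = subset(all, model != "observed")) +
facet_grid(. ~ individuals)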

Hadley

On 6/22/07, Sébastien <[EMAIL PROTECTED]> wrote:
> Hi Deepayan,
>
> The following code creates a dummy dataset which has the same structure as
> my usual datasets. I did not try to implement the changes proposed by
> Hadley, hoping that a solution can be found using the original dataset.
>
> # My code
>
> # Creating dataset
>
> nPts<-10    # number of time points
> nInd<-6     # number of individuals
> nModel<-3   # number of models
>
> # creates the "Time" column
> TimePts<-rep(1:nPts,nInd*nModel)
> # creates a vector of coefficients for generating the observations
> Coef<-rep(rnorm(6,0.1,0.01),each=nPts,nModel)
> # creates the observations
> Obs<-10*exp(-Coef*TimePts)
>
> # creates the predicted values
> Pred<-numeric(nPts*nInd*nModel)   # must exist before it is filled in
> for (i in 1:60){
> Pred[i]<-jitter(10*exp(-Coef[i]*TimePts[i]))
> Pred[i+60]<-jitter(5)
> Pred[i+120]<-jitter(10-Coef[i+120]*TimePts[i])
> }
>
> # creates the "Plot" column
> colPlot<-rep(1,nPts*nInd*nModel)
> # creates the "Model" column
> colModel<-gl(nModel,nPts*nInd,labels=c("A","B","C"))
> # creates the "ID" column
> colID<-gl(nInd,nPts,nPts*nInd*nModel)
>
> # creates the dataset
> mydata<-data.frame(colPlot,colModel,colID,TimePts,Obs,Pred)
> names(mydata)<-c("Plot","Model","Individuals","Time","Observed","Predicted")
>
> # Plotting as indicated by Deepayan
>
>
> xyplot(Observed + Predicted ~ Time | Individuals + Model,
>   data = mydata,
>   panel = panel.superpose.2, type = c("p", "l"),
>   layout = c(0, nlevels(mydata$Individuals))) #,
>   #<...>)
>
> ### End of code
>
> This code is not exactly what I am looking for, although it is pretty
> close. In the present case, I would like to have a Trellis plot with 6
> panels (one for each individual), where the Observations and the
> Predicted are plotted as symbols and lines, respectively. All three
> models should be plotted on the same panel. Unfortunately, it looks to
> me as if 3 successive xyplots are created by the code above but only the
> last one remains displayed. I tried to play with
> panel.superpose, panel.superpose.2 and type, without much success.
>
> I also tried the following code, which creates 18 panels and distinguishes
> all (Individuals, Model) pairs... so, not what I want.
>
> xyplot(Observed + Predicted ~ Time | Individuals+Model, data = mydata,
>  type = c("p", "l"), distribute.type = TRUE)
>
> Sebastien
>
>
> Deepayan Sarkar a écrit :
> > On 6/21/07, Sébastien <[EMAIL PROTECTED]> wrote:
> >> Hi Hadley,
> >>
> >> Hopefully, my dataset won't be too hard to change. Can I modify the
> >> aspect of each group using your code (symbols for observed and lines for
> >> predicted)?
> >>
> >> Sebastien
> >>
> >> hadley wickham a écrit :
> >> > Hi Sebastian,
> >> >
> >> > I think you need to rearrange your data a bit.  Firstly, you need to
> >> > put observed on the same footing as the different models, so you would
> >> > have a new column in your data called value (previously observed and
> >> > predicted) and a new model type ("observed").  Then you could do:
> >
> > Yes, and ?make.groups (and reshape of course) could help with that.
> > This might not be strictly necessary though.
> >
> > However, I'm finding your pseudo-code confusing. Could you create a
> > small example data set that can be used to try out some real code?
> > Just from your description, I would have suggested something like
> >
> > xyplot(Observed + Predicted ~ Time | Individuals + Model,
> >   data = mydata,
> >   panel = panel.superpose.2, type = c("p", "l"),
> >   layout =

Re: [R] Overlaying lattice graphs (continued)

2007-06-21 Thread hadley wickham
Sebastian,

You should be able to, but I don't know how to do it with lattice.  In
ggplot (http://had.co.nz/ggplot2) you would do it as follows:

# Observations as points, predictions as lines.
ggplot(mydata, aes(x = time, y = value, colour=model)) +
geom_point(data = subset(mydata, model == "observed")) +
geom_line(data = subset(mydata, model != "observed")) +
facet_grid(. ~ individuals)

or if you only wanted the models coloured:

ggplot(mydata, aes(x = time, y = value)) +
geom_point(data = subset(mydata, model == "observed")) +
geom_line(data = subset(mydata, model != "observed"), aes(colour=model)) +
facet_grid(. ~ individuals)

Although the way the panels are arranged is probably suboptimal if you
have many individuals.  It's something I plan to fix in the future, so
that  + facet_wrap(individuals) would give you a display like lattice
does.

Hadley


On 6/21/07, Sébastien <[EMAIL PROTECTED]> wrote:
> Hi Hadley,
>
> Hopefully, my dataset won't be too hard to change. Can I modify the
> aspect of each group using your code (symbols for observed and lines for
> predicted)?
>
> Sebastien
>
> hadley wickham a écrit :
> > Hi Sebastian,
> >
> > I think you need to rearrange your data a bit.  Firstly, you need to
> > put observed on the same footing as the different models, so you would
> > have a new column in your data called value (previously observed and
> > predicted) and a new model type ("observed").  Then you could do:
> >
> > xyplot(value ~ time | individuals, data=mydata, groups=model)
> >
> > Hadley
> >
> >
> > On 6/21/07, Sébastien <[EMAIL PROTECTED]> wrote:
> >> Dear R Users,
> >>
> >> I recently posted an email on this list  about the use of data.frame and
> >> overlaying multiple plots. Deepayan kindly indicated to me the
> >> panel.superposition command which worked perfectly in the context of the
> >> example I gave.
> >> I'd like to go a little bit further on this topic using a more complex
> >> dataset structure (actually the one I want to work on).
> >>
> >>  >mydata
> >>     Plot Model Individuals Time Observed Predicted
> >> 1      1     A           1 0.05       10      10.2
> >> 2      1     A           1 0.10       20      19.5
> >> etc...
> >> 10     1     B           1 0.05       10       9.8
> >> 11     1     B           1 0.10       20      20.2
> >> etc...
> >>
> >> There are p "levels" in mydata$Plot, m in mydata$Model, n in
> >> mydata$Individuals and t in mydata$Time (Note that I probably use the
> >> word levels improperly as all columns are not factors). Basically, this
> >> dataset summarizes the t measurements obtained in n individuals as well
> >> as the predicted values from m different modeling approaches (applied to
> >> all individuals). Therefore, the observations are repeated m times in
> >> the Observed columns, while the predictions appears only once for a
> >> given model an a given individual.
> >>
> >> What I want to write is a R batch file creating a Trellis graph, where
> >> each panel corresponds to one individual and contains the observations
> >> (as scatterplot) plus the predicted values for all models (as lines of
> >> different colors)... $Plot is just a token: it might be used to not
> >> overload graphs in case there are too many tested models. The fun part
> >> is that the values of p, m, n and t might vary from one dataset to the
> >> other, so everything has to be coded dynamically.
> >>
> >> For the plotting part I was thinking about having a loop in my code
> >> containing something like that:
> >>
> >> for (i in 1:nlevels(mydata$Model)) {
> >>
> >> subdata<-subset(mydata,mydata$Model=level(mydata$Model)[i])
> >> xyplot(subset(Observed + Predicted ~ Time | Individuals, data =
> >> subdata)   #plus additionnal formatting code
> >>
> >> }
> >>
> >> Unfortunately, this code simply creates a new Trellis plot instead of
> >> adding the model one by one on the panels. Any idea or link to a useful
> >> command will wellcome.
> >>
> >> Sebastien
> >>
> >>
> >
> >
>


Re: [R] Overlaying lattice graphs (continued)

2007-06-21 Thread hadley wickham
Hi Sebastian,

I think you need to rearrange your data a bit.  Firstly, you need to
put observed on the same footing as the different models, so you would
have a new column in your data called value (previously observed and
predicted) and a new model type ("observed").  Then you could do:

xyplot(value ~ time | individuals, data=mydata, groups=model)
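
A minimal base-R sketch of that rearrangement, under stated assumptions:
the small `mydata` here is made up (lower-case column names, two models,
two individuals), and `obs`, `pred` and `long` are names chosen for
illustration. The observations are kept once and relabelled as their own
"model", then stacked on top of the predictions.

```r
# Made-up data in the layout described in the question.
mydata <- data.frame(
  model       = rep(c("A", "B"), each = 4),
  individuals = rep(rep(1:2, each = 2), times = 2),
  time        = rep(c(0.05, 0.10), times = 4),
  observed    = rep(c(10, 20, 11, 21), times = 2),
  predicted   = c(10.2, 19.5, 11.1, 20.7, 9.8, 20.2, 10.6, 21.4)
)

# Observations are repeated for every model, so keep one copy (model "A")
# and relabel it as its own "model".
obs <- subset(mydata, model == "A", select = c(individuals, time, observed))
names(obs)[names(obs) == "observed"] <- "value"
obs$model <- "observed"

# Predictions: one row per model, individual and time point.
pred <- mydata[, c("individuals", "time", "predicted", "model")]
names(pred)[names(pred) == "predicted"] <- "value"

# Stack them; rbind() matches data frame columns by name.
long <- rbind(obs, pred)
long$model <- factor(long$model, levels = c("observed", "A", "B"))
```

With the data in this shape, `long` can be passed as the data argument
of the xyplot call above, with the grouping on model.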

Hadley


On 6/21/07, Sébastien <[EMAIL PROTECTED]> wrote:
> Dear R Users,
>
> I recently posted an email on this list  about the use of data.frame and
> overlaying multiple plots. Deepayan kindly indicated to me the
> panel.superposition command which worked perfectly in the context of the
> example I gave.
> I'd like to go a little bit further on this topic using a more complex
> dataset structure (actually the one I want to work on).
>
>  >mydata
>     Plot Model Individuals Time Observed Predicted
> 1      1     A           1 0.05       10      10.2
> 2      1     A           1 0.10       20      19.5
> etc...
> 10     1     B           1 0.05       10       9.8
> 11     1     B           1 0.10       20      20.2
> etc...
>
> There are p "levels" in mydata$Plot, m in mydata$Model, n in
> mydata$Individuals and t in mydata$Time (Note that I probably use the
> word levels improperly as all columns are not factors). Basically, this
> dataset summarizes the t measurements obtained in n individuals as well
> as the predicted values from m different modeling approaches (applied to
> all individuals). Therefore, the observations are repeated m times in
> the Observed column, while the predictions appear only once for a
> given model and a given individual.
>
> What I want to write is a R batch file creating a Trellis graph, where
> each panel corresponds to one individual and contains the observations
> (as scatterplot) plus the predicted values for all models (as lines of
> different colors)... $Plot is just a token: it might be used to not
> overload graphs in case there are too many tested models. The fun part
> is that the values of p, m, n and t might vary from one dataset to the
> other, so everything has to be coded dynamically.
>
> For the plotting part I was thinking about having a loop in my code
> containing something like that:
>
> for (i in 1:nlevels(mydata$Model)) {
>
> subdata<-subset(mydata,Model==levels(mydata$Model)[i])
> print(xyplot(Observed + Predicted ~ Time | Individuals, data =
> subdata))   #plus additional formatting code
>
> }
>
> Unfortunately, this code simply creates a new Trellis plot instead of
> adding the models one by one to the panels. Any idea or link to a useful
> command would be welcome.
>
> Sebastien
>
>


Re: [R] Controlling text and strip arrangement in xyplot

2007-06-19 Thread hadley wickham
On 6/19/07, Juan Pablo Lewinger <[EMAIL PROTECTED]> wrote:
> I've searched the archives and read the xyplot help but can't figure
> out the 2 lattice questions below?
>
> Consider:
>
> library(lattice)
> DF <- data.frame(x=rnorm(20), y=rnorm(20), g1=rep(letters[1:2], 10),
>   g2=rep(LETTERS[1:2], each=10),
> g3=rep(rep(letters[3:4],each=5),2))
>
> xyplot(y ~ x | g1 + g2, groups=g3, data=DF)
>
> 1) Is there a way to get one strip per row and column of panels as
> below instead of the default?
>
>
> _|__a__|__b__|
>  |
>B
>  |
> --
>  |
>A
>  |

Instead of using lattice, you could use ggplot2
(http://had.co.nz/ggplot2), where this is the default:

(p <- qplot(x, y, data=DF, facets = g1 ~ g2))

> 2) How do I control the text of the strips so that for instance
> instead of "a" and "b" it reads"g1=alpha", "g1=beta" where "alpha"
> and "beta" stand for the corresponding greek symbols? (my difficulty
> here is not with the plotmath symbols but with controlling the text
> of the strips directly from the call to xyplot and not by renaming
> the levels of g1)

It's also possible to do this in ggplot, but some bugs currently stop
it from working. It will work in the next version to be released next
week:

p$strip.text <- function(variable, value) {
greek <- c("A" = "alpha", "B" = "beta")[value]
makelabel <- function(g) substitute(variable == greek,
list(variable=as.name(variable), greek=as.name(g)))

lapply(greek, makelabel)
}
p

Hadley


[R] [R-pkgs] ggplot2 0.5.2

2007-06-19 Thread hadley wickham
ggplot2
===

ggplot2 is a plotting system for R, based on the grammar of graphics,
which tries to take the good parts of base and lattice graphics and
none of the bad parts. It takes care of many of the fiddly details
that make plotting a hassle (like drawing legends) as well as
providing a powerful model of graphics that makes it easy to produce
complex multi-layered graphics.

Find out more at http://had.co.nz/ggplot2

Changes in version 0.5.2 --

* add argument to position dodge so it's now possible to accurately
dodge things with different widths to their physical widths
* added median summary
* new examples:
* logistic regression example in stat_smooth
* bugs fixed:
* evaluation of arguments to layer is no longer delayed
* can use categorical xseq with stat_smooth
* x and y axes named incorrectly (thanks to Dieter Menne for spotting this)
* can now pass position objects to qplot
* y jitter calculated correctly, and jittered data rescales axis now
* removed silly legend from quantile plot
* extra arguments not being passed on to geoms/stats
* fixed bug in stat_summary when summarising a factor
* fixed bugs in stat_summary, geom_ribbon, and coord_trans examples

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages


Re: [R] [ggplot2] Change color of grid lines

2007-06-17 Thread hadley wickham
On 6/17/07, Bernd Weiss <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am making myself familiar with ggplot2 (I really like the examples
> at ).
>
> One thing that really annoys me is the default use of white grid
> lines and a gray background [1, 2]. I simply would like to have black
> grid lines and a white background. No problem, I thought, "This is R.
> There is no if. Only how." (fortune("Simon Blomberg")).
>
> I carfully checked the ggplot2 homepage 
> and the ggplot2 book .
>
> It seemed that the use of ggopt would be a good idea, in particular
> grid.colour.
>
> library(ggplot2)
> x <- rnorm(100)
> y <- rnorm(100)
> ## the default behaviour
> (a <- qplot(x,y))
> ## my attempt to change the default behaviour
> ggopt(grid.colour = "black", grid.fill = "white", background.colour =
> "black")
> (b <- qplot(x,y))
>
> (Of course, I also gave ggtheme a try but without success.)
>
> Unfortunately, I didn't find any solution to my problem, which I
> could hardly believe. I strongly suspect that it's my fault but would
> appreciate any hint like RTFM on page XXX or so.

While the structure of ggplot plots is largely complete, I'm still
working on the appearance.  I know a lot of people prefer a white
background with black gridlines (and many journals require it) but it
hasn't been a priority.  It is on my todo list, and hopefully it will
make it in the next release of ggplot (probably 7-10 days from now)

Hadley


Re: [R] Visualize quartiles of plot line

2007-06-17 Thread hadley wickham
On 6/17/07, Arne Brutschy <[EMAIL PROTECTED]> wrote:
> Hi,
>
> h> How about quantile regression? Have a look at
> h> http://had.co.nz/ggplot2/stat_quantile.html for some examples of
> h> what that might look like.
> I tried the ggplot2 package, it seems to be quite powerful. But
> documentation is only partially available, so I'm having some problems
> creating the graphs at all.
>
> First of all, where can I find the diamonds and dsmall data? I cannot
> recreate the samples given in the documentation.

The diamonds dataset is available from the ggplot2 package, and the
dsmall dataset is usually created as needed:

dsmall <- diamonds[sample(1:nrow(diamonds), 1000), ]

> I'm currently using a simple smoother to display the tendency of the
> data and it's stderr. For some reason, it works only for simple
> colors:
>
> p <- ggplot(data, aes(x=Problemsize, y=Fitness)) +
>   geom_smooth(fill=alpha("blue", 0.2), colour="darkblue", size=2)
>
> This only displays a line, not the surrounding stderr. When I
> change the fill attribute to "blue" or "grey80" without the alpha, the
> stderr gets displayed.

As I said in the other email, this is a known restriction of the
windows graphics device.

> Additionally, I want to display three different models by this, each
> with a differen curve/stderr fill color. How do I do that? I tried so
> set color=Model, which yields only a single line.

It's hard to know without knowing more about the structure of your
dataset.  Including colour=factor(Model) in the aes statement may do
what you need.

> On another plot, I want to use a single model to be displayed with
> points colored by a gradient depending on a third property:
>
> p <- ggplot(data, aes(x=Problemsize, y=Fitness), color=DeltaConfig) +
>   geom_smooth(size=1, color="black", fill="grey80")+
>   geom_point(size=0.5)+
>   scale_colour_gradient(limits=c(0,10), low="red", high="white")
>
> This does not work, I think the connection between goem_point and
> DeltaConfig is not there. But when I try to set
>
>   geom_point(size=0.5, color=DeltaConfig)+

Colour needs to be inside the aes function - you are mapping colour to
the DeltaConfig variable, not setting colour to a fixed value.
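
To make the mapping/setting distinction concrete, here is a minimal sketch
written against current ggplot2 syntax (an assumption; the release discussed
in this thread may differ in details). The data frame d is made up, with
column names borrowed from the question.

```r
library(ggplot2)

# Hypothetical data standing in for the poster's data set
d <- data.frame(
  Problemsize = 1:20,
  Fitness     = sqrt(1:20),
  DeltaConfig = seq(0, 10, length.out = 20)
)

# Mapping: colour varies with a variable, so it goes inside aes()
p_mapped <- ggplot(d, aes(x = Problemsize, y = Fitness)) +
  geom_point(aes(colour = DeltaConfig)) +
  scale_colour_gradient(limits = c(0, 10), low = "red", high = "white")

# Setting: colour is one fixed value, so it goes outside aes()
p_fixed <- ggplot(d, aes(x = Problemsize, y = Fitness)) +
  geom_point(colour = "black")
```

Writing colour=DeltaConfig outside aes() asks R to evaluate DeltaConfig as an
ordinary object, which is why the "unknown object" complaint appears.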

> it complains about an unknown DeltaConfig object.
>
> Hmm, I guess I don't fully understand this 'grammar of graphics'
> thing. But documentation is quite inconsistent. :( And, the coloring
> thing seems to be a bug. BTW, I'm using R 2.5.0 on windows.

You might want to read the introductory chapters in the ggplot book,
available from http://had.co.nz/ggplot2, which will give you more of a
background.  Please let me know places where you think the
documentation is inconsistent so I can try and make them better.

Hadley



Re: [R] Visualize quartiles of plot line

2007-06-17 Thread hadley wickham
On 6/17/07, Arne Brutschy <[EMAIL PROTECTED]> wrote:
> Hi,
>
> concerning the missing se coloring: I followed the examples on
> http://had.co.nz/ggplot2/stat_smooth.html
>
>  c <- ggplot(mtcars, aes(y=wt, x=qsec))
>
>  c + stat_smooth()
>  c + stat_smooth() + geom_point()
>  c + stat_smooth(se = TRUE) + geom_point()
>  c + stat_smooth(fill=alpha("blue", 0.2), colour="darkblue", size=2)
>  c + geom_point() + stat_smooth(fill=alpha("blue", 0.2), colour="darkblue", 
> size=2)
> Does not work, se is missing.

That's not a ggplot bug - it's a limitation of the graphics device you
are using (windows I guess), which does not support transparent
colours.

>  c + stat_smooth(fill="blue", colour="darkblue", size=2)
> Does work.
>
>  c + stat_smooth(method = lm, formula= y ~ ns(x,3)) + geom_point()
>  c + stat_smooth(method = rlm, formula= y ~ ns(x,3)) + geom_point()
> Does not work:
> "Error in model.frame(formula = formula, data = data, weights = weight,  :
>  ..2 used in a wrong context, no ... to read"

Oops, sorry, yes, that's a bug in the current version.  I'll be
releasing a new version that fixes that bug very soon (i.e. today or
tomorrow).

Hadley



Re: [R] Visualize quartiles of plot line

2007-06-16 Thread hadley wickham
How about quantile regression?  Have a look at
http://had.co.nz/ggplot2/stat_quantile.html for some examples of what
that might look like.

Hadley

On 6/16/07, Arne Brutschy <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I'm currently using a simple plot to visualize some mean values. I'm
> having ~200 datapoints on the x-axis, each has 10 records. I'm
> currently plotting only the mean value of each of the datapoints.
>
> What I need is a way to visualize the quartiles/error/whatever of
> these points. I thought about boxplots, but I have to many points on
> the xaxis - it would be impossible to see anything. I though that it
> would be nice to have a "hull" around each line, indicate the width of
> the quartiles, visualized by a different background. It's like a very
> wide boxplot with a changing mean value...
>
> Is this possible with r? Does anyone know what I mean and/or has done
> this before?
>
> Thanks
> Arne
>



Re: [R] Stacked barchart color

2007-06-15 Thread hadley wickham
On 6/16/07, owenman <[EMAIL PROTECTED]> wrote:
>
> Hi Hadley,
> I tried your suggestion, using ggplot2, but I am still having a problem. The
> final plot lacks the figure legend -- which it had before I added the
> scale_fill_identity()  bit.  Can  you see what I am doing wrong?
> (By the way, all I am trying to do is make the figure monochrome friendly.
> Is there an easy way to prepare ggplot graphics for a monochrom device?)
> Thanks,Owen

Hi Owen,

The identity scale won't create a legend, unless you tell it what
labels it should use - there's an example at
http://had.co.nz/ggplot2/scale_identity.html.  Otherwise, if you have
a continuous scale and you want something that works in black and
white, p + scale_fill_gradient(low="white", high="black") might be
easier.
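
As a rough illustration of both options, here is a sketch using current
ggplot2 syntax (an assumption; argument names in the version discussed here
may differ). The data frame is a made-up, cut-down version of the one in the
question.

```r
library(ggplot2)

# Hypothetical data: colours are stored in the data itself
y <- data.frame(
  locus = rep(c("A", "B"), each = 2),
  Fnd   = rep(c("neg", "pos"), times = 2),
  Freq  = c(0.7, 0.3, 0.6, 0.4),
  color = rep(c("black", "lightgray"), times = 2)
)

# Option 1: identity scale, with the legend requested explicitly
# and the labels supplied by hand
p_identity <- ggplot(y, aes(x = locus, y = Freq, fill = color)) +
  geom_col(position = "fill") +
  scale_fill_identity(
    guide  = "legend",
    breaks = c("black", "lightgray"),
    labels = c("neg", "pos"),
    name   = "Fnd"
  )

# Option 2: map the factor and let a grey palette choose the shades,
# which gives a legend without any extra work
p_grey <- ggplot(y, aes(x = locus, y = Freq, fill = Fnd)) +
  geom_col(position = "fill") +
  scale_fill_grey(start = 0, end = 0.9)
```

The second form is usually less work for a monochrome device, since the
legend labels come straight from the factor levels.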

Hadley


>
> > y$color = factor(y$Fnd)
> > y$color = c("black","darkgray","lightgray","white")
> > y
>       Fnd locus        Freq     color
> 1  signeg     A 0.087248322     black
> 2     neg     A 0.711409396  darkgray
> 3     pos     A 0.201342282 lightgray
> 4  sigpos     A         0.0     white
> 5  signeg     C 0.320754717     black
> 6     neg     C 0.603773585  darkgray
> 7     pos     C 0.075471698 lightgray
> 8  sigpos     C         0.0     white
> 9  signeg     B 0.157534247     black
> 10    neg     B 0.732876712  darkgray
> 11    pos     B 0.109589041 lightgray
> 12 sigpos     B         0.0     white
>
> > p = ggplot(y, aes(x=locus, y=Freq, fill=color)) +
> > geom_bar(position="fill") + scale_fill_identity()
> > p
>
>
>
>
> hadley wrote:
> >
> >
> > Hi Dieter,
> >
> > You can do this with ggplot2 (http://had.co.nz/ggplot2) as follows:
> >
> > library(ggplot2)
> >
> > barley1 <- subset(barley, site=="Grand Rapids" & variety %in%
> > c("Velvet","Peatland"))
> > barley1[] <- lapply(barley1, "[", drop=TRUE)
> >
> > qplot(variety, yield, data=barley1, geom="bar", stat="identity",
> > fill=factor(year))
> >
> > barley1$fill <- c("red","green","blue","gray")
> > qplot(variety, yield, data=barley1, geom="bar", stat="identity",
> > fill=fill) + scale_fill_identity()
> >
> > See http://had.co.nz/ggplot2/scale_identity.html and
> > http://had.co.nz/ggplot2/position_stack.html for more details.
> >
> > Hadley
> >
> >
>
>



Re: [R] model.frame: how does one use it?

2007-06-15 Thread hadley wickham
On 6/15/07, Deepayan Sarkar <[EMAIL PROTECTED]> wrote:
> On 6/15/07, Dirk Eddelbuettel <[EMAIL PROTECTED]> wrote:
> >
> > Philipp Benner reported a Debian bug report against r-cran-rpart aka rpart.
> > In short, the issue has to do with how rpart evaluates a formula and
> > supporting arguments, in particular 'weights'.
> >
> > A simple contrived example is
> >
> > -
> > library(rpart)
> >
> > ## using data from help(rpart), set up simple example
> > myformula <- formula(Kyphosis ~ Age + Number + Start)
> > mydata <- kyphosis
> > myweight <- abs(rnorm(nrow(mydata)))
> >
> > goodFunction <- function(mydata, myformula, myweight) {
> >   hyp <- rpart(myformula, data=mydata, weights=myweight, method="class")
> >   prev <- hyp
> > }
> > goodFunction(mydata, myformula, myweight)
> > cat("Ok\n")
> >
> > ## now remove myweight and try to compute it inside a function
> > rm(myweight)
> >
> > badFunction <- function(mydata, myformula) {
> >   myweight <- abs(rnorm(nrow(mydata)))
> >   mf <- model.frame(myformula, mydata, myweight)
> >   print(head(df))
> >   hyp <- rpart(myformula,
> >data=mf,
> >weights=myweight,
> >method="class")
> >   prev <- hyp
> > }
> > badFunction(mydata, myformula)
> > cat("Done\n")
> > -
> >
> > Here goodFunction works, but only because myweight (with useless random
> > weights, but that is not the point here) is found from the calling
> > environment.
> >
> > badFunction fails after we remove myweight from there:
> >
> > :~> cat /tmp/philipp.R | R --slave
> > Ok
> > Error in eval(expr, envir, enclos) : object "myweight" not found
> > Execution halted
> > :~>
> >
> > As I was able to replicate it, I reported this to the package maintainer.
> > It turns out that seemingly all is well as this is supposed to work this
> > way, and I got a friendly pointer to study model.frame and its help page.
> >
> > Now I am stuck as I can't make sense of model.frame -- see badFunction
> > above. I would greatly appreciate any help in making rpart work with a local
> > argument weights so that I can tell Philipp that there is no bug.  :)
>
> I don't know if ?model.frame is the best place to look. There's a
> more detailed description at
>
> http://developer.r-project.org/nonstandard-eval.pdf
>
> but here are the non-standard evaluation rules as I understand them:
> given a name in either (1) the formula or (2) ``special'' arguments like
> 'weights' in this case, or 'subset', try to find the name
>
> 1. in 'data'
> 2. failing that, in environment(formula)
> 3. failing that, in the enclosing environment, and so on.
>
> By 'name', I mean a symbol, such as 'Age' or 'myweight'.  So
> basically, everything is as you would expect if the name is visible in
> data, but if not, the search starts in the environment of the formula,
> not the environment where the function call is being made (which is
> the standard evaluation behaviour).  This is a feature, not a bug
> (things would be a lot more confusing if it were the other way round).

Could you give an example?  It's always seemed confusing to me and I
don't see why looking in the environment of the formula helps.
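
For reference, the lookup order quoted above can be sketched with base R
alone. This is only an illustration under assumptions: lm() stands in for
rpart() (both build their model frame through model.frame), mtcars stands in
for the kyphosis data, and all function and variable names are made up.

```r
# Formula created outside the functions below: its environment is
# wherever this line is run, not the frame of any of those functions
top_formula <- mpg ~ wt

# Fails: 'case_w' is local to the function, but the lookup goes
# data -> environment(top_formula) -> enclosing environments,
# and never visits this function's frame.
local_weights <- function(data) {
  case_w <- rep(2, nrow(data))
  lm(top_formula, data = data, weights = case_w)
}
res_bad <- tryCatch(local_weights(mtcars),
                    error = function(e) conditionMessage(e))

# Works: the weights are a column of 'data', so step 1 finds them
weights_in_data <- function(data) {
  data$case_w <- rep(2, nrow(data))
  lm(top_formula, data = data, weights = case_w)
}
fit_data <- weights_in_data(mtcars)

# Also works: the formula is created inside the function, so
# environment(formula) is the frame that holds 'case_w' (step 2)
formula_inside <- function(data) {
  case_w <- rep(2, nrow(data))
  lm(mpg ~ wt, data = data, weights = case_w)
}
fit_env <- formula_inside(mtcars)
```

Here res_bad holds the "object not found" error message, while the other two
calls return the same fitted model.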

Hadley



Re: [R] [OT] 'gv' and fractional points

2007-06-15 Thread hadley wickham
This doesn't answer your original question, and isn't much help unless
you're on a mac, but there's a nice looking program that makes this
kind of graph scraping really easy:
http://www.arizona-software.ch/applications/graphclick/en/

Hadley

On 6/15/07, Ted Harding <[EMAIL PROTECTED]> wrote:
> Hi Folks,
>
> This is off-topic R-wise, but it may be close to
> the heart of many R-users, so I think it may be
> the best place to ask!
>
> Users of 'gv' (the "front end" to ghostscript) will
> be aware of the little window which gives you the
> x-y coordinates (in points = 1/72 inch) of the position
> of the "cross-hair" mouse cursor. These coordinates
> are those of the corresponding position on the printed
> page, relative to some origin.
>
> I have often used this to extract numerical values
> for data from graphs in Postscript files (also PDF
> files, after you have converted them to PS). Then
> (veering back on topic ... ) you can submit the
> numerical data to R and try your own analyses on
> these data, and compare with what the article does.
>
> However, this little window only gives the numbers
> in whole points. Say a smallish graphic may print
> out 3 inches wide or high. Then you get precision
> of 1/216 per 3 inches or 0.4% of full scale. This
> can be adequate on many occasions, but can be on
> the coarse side on other occasions.
>
> Even for a 6-inch-wide/high graph, you only get down
> to 0.2% of full scale.
>
> If it were possible to induce 'gv' to display these
> coordinates in tenths of a point, then much greater
> precision (as adequate as one can expect to hope for
> when, in effect, "measuring off the graph") could be
> obtained.
>
> Does anyone know:
> a) Whether it is possible to persuade 'gv' to give
>this display in fractional points (my own search
>of the documentation has not revealed anything);
> b) Of any alternative to 'gv' as PS viewer which would
>provide this capability?
>
> With thanks, and best wishes to all,
> Ted.
>
> 
> E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
> Fax-to-email: +44 (0)870 094 0861
> Date: 15-Jun-07   Time: 16:13:21
> -- XFMail --
>



Re: [R] back-transform predictors for x-axis in plot -- mgcv package

2007-06-15 Thread hadley wickham
Hi Suzan,

You can do a sort of back-transformation inside ggplot2
(http://had.co.nz/ggplot2).

library(ggplot2)

# Create the base scatterplot with y and x axes transformed by logging,
# and then back transformed by exponentiating
(base <- qplot(carat, price, data=diamonds) + scale_x_log10() +
scale_y_log10() + coord_trans(y="pow10", x="pow10"))

base + geom_smooth(method="lm")

library(mgcv)
base + geom_smooth(method="gam", formula = y ~ s(x, bs="cr"))
base + geom_smooth(method="gam", formula = y ~ s(x, bs="cr"), fill="grey50")

# cf.

qplot(carat, price, data=diamonds) + geom_smooth(method="lm")
qplot(carat, price, data=diamonds) + geom_smooth(method="gam", formula
= y ~ s(x, bs="cr"), fill="grey50")


Regards,

Hadley

On 6/14/07, Suzan Pool <[EMAIL PROTECTED]> wrote:
> My question is related to plot( ) in the mgcv package.  Before modelling
> the data, a few predictors were transformed to normalize them.
> Therefore, the x-axes in the plots show transformed predictor values.
> How do I back-transform the predictors so that the plots are easier to
> interpret?
>
> Thanks in advance,
> Suzan
>
> --
> Suzan Pool
> Oregon State University
> Cooperative Institute for Marine Resources Studies
> c/o NOAA Fisheries
> 520 Heceta Place
> P.O. Box 155
> Hammond, OR  97121
>
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> Phone:  503-861-1818 x36 TTY
> Voice to TTY:  711
> Fax:  503-861-2589
>



Re: [R] problem with hist()

2007-06-15 Thread hadley wickham
On 6/15/07, Mario Dejung <[EMAIL PROTECTED]> wrote:
> > On 6/14/07, Mario Dejung <[EMAIL PROTECTED]> wrote:
> >> Hey everybody,
> >> I try to make a graph with two different plots.
> >>
> >>
> >> First I make a boxplot of my data. It is a collection off correlation
> >> values of different pictures. For example:
> >>
> >> 0.23445 pica
> >> 0.34456 pica
> >> 0.45663 pica
> >> 0.98822 picb
> >> 0.12223 picc
> >> 0.34443 picc
> >> etc.
> >>
> >> Ok, I make this boxplot and I get for every picture the boxes. After
> >> this
> >> I want to know, how many correlations per picture exist.
> >> So I make a new vector y <- as.numeric(data$picture)
> >>
> >> So I get for my example something like this:
> >>
> >> y
> >> [1] 1 1 1 1 1 1 1 1 1 1
> >> [11] 1 1 1 1 1 1 1 1 2 2
> >> ...
> >> [16881] 59 59 59 60 60 60 60 60 60 60
> >>
> >> After this I make something like this
> >>
> >> boxplot(cor ~ pic)
> >> par(new = TRUE)
> >> hist(y, nclass = 60)
> >>
> >> But there is my problem. I have 60 pictures, so I get 60 different
> >> boxplots, and I want the hist behind the boxes. But it makes only 59
> >> histbars.
> >>
> >> What can I do? I tried also
> >> hist(y, 1:60) # same effect
> >> and
> >> hist(y, 1:61)
> >> this give me 60 places, but only 59 bars. the last bar is 0.
> >>
> >> I hope anyone can help me.
> >
> > What does the y axis represent?  It will be counts for the histogram,
> > and correlations for the boxplots.  These aren't comparable, so you're
> > probably better off making two separate graphics.
> >
> > Hadley
> >
> The boxplots show only the median, min, max, etc of the different
> pictures, but I want to know, how many entry's are in this plot. Now I
> have done this by the hist function, and when I use different colors, you
> can see, for the first picture there are about 130 entry, but for the 8th
> picture, there are only 40 entry's...
> Doesn't make this sense?

I think your plot would be more clear if you used two graphics - one
showing the spread, and one showing the number of points (you might
also want to look at notched boxplots).  In the graphic you attached
the bars of the barchart (not histogram! - that's for continuous data)
distract the eye from the boxplots.  You might also want to try
ordering the x axis by mean or number of observations as this will
make it easier to see trends in the data.

The confusion with the barchart arises because there are really two
quite different types of barcharts.  One type is basically the same as
a dotchart, but you draw bars instead of dots - this is the default in
R.  The other type is the categorical analog of the histogram, and
this is the default in ggplot2
(http://had.co.nz/ggplot2/geom_bar.html), although the next version will
automatically work out which version you want.
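
A small base R sketch of the difference, using made-up data shaped like the
example in the question (three pictures with 3, 1 and 2 correlations). It
also shows why hist() with integer breaks comes up one bar short.

```r
# Hypothetical picture labels, one per correlation value
pic <- factor(rep(c("pica", "picb", "picc"), times = c(3, 1, 2)))

# Categorical analog of the histogram: one count per picture
counts <- table(pic)
# barplot(counts)   # draws one bar per picture

# Treating the codes as continuous data
y <- as.numeric(pic)

# Breaks 1:3 define only two cells, [1,2] and (2,3],
# so pictures 1 and 2 end up in the same bar
h_short <- hist(y, breaks = 1:3, plot = FALSE)

# Breaks centred on the codes give one cell per picture
h_full <- hist(y, breaks = seq(0.5, 3.5, by = 1), plot = FALSE)
```

With n pictures, breaks = 1:n always gives n - 1 cells, which matches the 59
bars reported for 60 pictures.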

Hadley



Re: [R] problem with hist()

2007-06-14 Thread hadley wickham
On 6/14/07, Mario Dejung <[EMAIL PROTECTED]> wrote:
> Hey everybody,
> I try to make a graph with two different plots.
>
>
> First I make a boxplot of my data. It is a collection off correlation
> values of different pictures. For example:
>
> 0.23445 pica
> 0.34456 pica
> 0.45663 pica
> 0.98822 picb
> 0.12223 picc
> 0.34443 picc
> etc.
>
> Ok, I make this boxplot and I get for every picture the boxes. After this
> I want to know, how many correlations per picture exist.
> So I make a new vector y <- as.numeric(data$picture)
>
> So I get for my example something like this:
>
> y
> [1] 1 1 1 1 1 1 1 1 1 1
> [11] 1 1 1 1 1 1 1 1 2 2
> ...
> [16881] 59 59 59 60 60 60 60 60 60 60
>
> After this I make something like this
>
> boxplot(cor ~ pic)
> par(new = TRUE)
> hist(y, nclass = 60)
>
> But there is my problem. I have 60 pictures, so I get 60 different
> boxplots, and I want the hist behind the boxes. But it makes only 59
> histbars.
>
> What can I do? I tried also
> hist(y, 1:60) # same effect
> and
> hist(y, 1:61)
> this give me 60 places, but only 59 bars. the last bar is 0.
>
> I hope anyone can help me.

What does the y axis represent?  It will be counts for the histogram,
and correlations for the boxplots.  These aren't comparable, so you're
probably better off making two separate graphics.

Hadley



Re: [R] Confusion with sapply

2007-06-13 Thread hadley wickham
On 6/13/07, Patnaik, Tirthankar <[EMAIL PROTECTED]> wrote:
> Hi,
>  I have some confusion in applying a function over a column.
>
> Here's my function. I just need to shift non-March month-ends to March
> month-ends. Initially I tried seq.dates, but one cannot give a negative
> increment (decrement) here.
>
> return(as.Date(seq.dates(format(xdate,"%m/%d/%Y"),by="months",len=4)[4])
> )
>
> Hence this simple function:
>
> > mydate <- as.Date("2006-01-01")
> >
> > # Function to shift non-March company-reporting dates to March.
> > Set2March <- function(xdate){
> + # Combines non-March months into March months:
> + # Dec2006 -> Mar2007
> + # Mar2006 -> Mar2006
> + # Jun2006 -> Mar2006
> + # Sep2006 -> Mar2006
> + # VERY Specific code.
> + Month <- format(xdate,"%m")
> + wDate <- month.day.year(julian(xdate))
> + if (Month=="12"){
> + wDate$year <- wDate$year + 1
> + wDate$month <- 3
> + }else
> + if (Month=="06"){
> + wDate$month <- 3
> + }else
> + if (Month=="09"){
> + wDate$month <- 3
> + wDate$day <- wDate$day + 1
> + }else warning ("No Changes made to the month, since month is not
> one of (6,9,12)")
> + cDate <- chron(paste(wDate$month,wDate$day,wDate$year,sep="/"))
> + return(as.Date(as.yearmon(as.Date(cDate,"%m/%d/%y")),frac=1))
> + }
> > Set2March(as.Date("2006-06-30"))
> [1] "2006-03-31"
> > Set2March(mydate)
> [1] "2006-01-31"
> Warning message:
> No Changes made to the month, since month is not one of (6,9,12) in:
> Set2March(mydate)
> >
>
> Works well when I use it on a single date. Then I try it on a vector:
>
>
> > dc <- seq(as.Date("2006-01-01"),len=10, by="month")
> > dc
>  [1] "2006-01-01" "2006-02-01" "2006-03-01" "2006-04-01" "2006-05-01"
> "2006-06-01" "2006-07-01" "2006-08-01"
>  [9] "2006-09-01" "2006-10-01"
>
>
> > sapply(as.vector(dc),Set2March)
> Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3,
> :
> unimplemented type 'character' in 'asLogical'
> >
>
> What am I missing here? Shouldn't the function work with the sapply
> working on each entry?

You can considerably simplify your code with some helper functions:

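# Small accessors built on POSIXlt (the setters below rely on them)
mday <- function(x) as.POSIXlt(x)$mday
hour <- function(x) as.POSIXlt(x)$hour
minute <- function(x) as.POSIXlt(x)$min
second <- function(x) as.POSIXlt(x)$sec
tz <- function(x) {
  z <- attr(as.POSIXlt(x), "tzone")[1]
  if (is.null(z)) "" else z
}

month <- function(x) as.POSIXlt(x)$mon + 1
# Setting a month outside 1-12 rolls over into the following years
"month<-" <- function(x, value) {
  ISOdatetime(year(x) + (value - 1) %/% 12, (value - 1) %% 12 + 1,
    mday(x), hour(x), minute(x), second(x), tz(x))
}
year <- function(x) as.POSIXlt(x)$year + 1900
"year<-" <- function(x, value) {
  ISOdatetime(value, month(x), mday(x), hour(x), minute(x), second(x),
    tz(x))
}

# Dec moves to March of the next year; Jun and Sep move back to March.
# Changed dates come back as POSIXct, because ISOdatetime returns POSIXct.
marchise <- function(x) {
  m <- month(x)
  if (m == 12) year(x) <- year(x) + 1
  if (m %in% c(6, 9, 12)) month(x) <- 3
  x
}

dc <- seq(as.Date("2006-01-01"), len=10, by="month")
marchise(dc[[1]])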


However, that doesn't work with sapply because the date class seems to
get stripped off - I'm not completely sure why, but perhaps because the
date class is a property of the entire vector, not the individual
values:

sapply(dc, marchise)
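
One way around the stripped class, sketched with base R only. The function
to_march is a hypothetical stand-in for the helper above, and the approach
(collect the results in a list, then combine them with c()) is one option
among several, not the only fix.

```r
# Hypothetical stand-in: Jun/Sep -> March of the same year,
# Dec -> March of the next year, everything else unchanged
to_march <- function(d) {
  lt <- as.POSIXlt(d)
  m <- lt$mon + 1                       # POSIXlt months are 0-based
  if (m == 12) lt$year <- lt$year + 1
  if (m %in% c(6, 9, 12)) lt$mon <- 2   # 2 means March
  as.Date(lt)
}

dc <- seq(as.Date("2006-01-01"), length.out = 12, by = "month")

# sapply() simplifies the results with unlist(), which drops the Date class
raw <- sapply(dc, to_march)
class(raw)

# Keeping the results in a list and combining with c() preserves the class
shifted <- do.call(c, lapply(seq_along(dc), function(i) to_march(dc[i])))
class(shifted)
```

The values in raw are still correct day counts, so as.Date(raw, origin =
"1970-01-01") would recover the dates as well.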

Hadley



Re: [R] Overlaying lattice graphs

2007-06-12 Thread hadley wickham
On 6/12/07, Seb <[EMAIL PROTECTED]> wrote:
> Hello
>
> I apologize in advance if this question has already be posted on the
> list, although I could not find a relevant thread in the archives.
>
> I would like to overlay xyplots using different datasets for each plot.
> I typically work on the following data.frame (mydata) structure
>
> >mydata
>     Drug Time Observed Predicted
> 1   A    0.05 10       10.2
> 2   A    0.10 20       19.5
> etc...
> 100 B    0.05 11       12.7
> 101 B    0.10 35       36
> etc...
>
> I want to plot the observed data as points and the predicted values as
> lines. If I use the following commands, I don't have the possibility to
> switch the "y" values from Observed for the scatterplot to Predicted for
> the line.
>
> xyplot(Observed ~ Time | Drug, data = mydata, panel  =  function(x,y, ...){
> +panel.xyplot(x,y,...)
> +panel.xyplot(x,y,type="l",...)})
>
> I wonder if this problem can be solved using the trellis.focus "family"
> commands but I have a hard time to understand how they work.

Another approach would be to use ggplot, http://had.co.nz/ggplot2.
Then your code might look something like:

ggplot(mydata, aes(x=Time)) +
geom_point(aes(y=Observed)) +
geom_line(aes(y = Predicted)) +
facet_grid(. ~ Drug)

Hadley



Re: [R] Stacked barchart color

2007-06-12 Thread hadley wickham
On 6/12/07, Dieter Menne <[EMAIL PROTECTED]> wrote:
> Dear Latticer,
>
> I want to give individual colors to all elements in a simple stacked
> barchart. I know why the example below does not work (and it is a excellent
> default), but is there any workaround for this?
>
> Dieter
>
>
> # This only colors red and green, but I want blue and gray for Peatland.
>
> barchart(yield ~ variety , groups=year, data = barley,  stack = TRUE,
>   subset=site=="Grand Rapids" & variety %in% c("Velvet","Peatland"),
> col=c("red","green","blue","gray"))

Hi Dieter,

You can do this with ggplot2 (http://had.co.nz/ggplot2) as follows:

library(ggplot2)

barley1 <- subset(barley, site=="Grand Rapids" & variety %in%
c("Velvet","Peatland"))
barley1[] <- lapply(barley1, "[", drop=TRUE)

qplot(variety, yield, data=barley1, geom="bar", stat="identity",
fill=factor(year))

barley1$fill <- c("red","green","blue","gray")
qplot(variety, yield, data=barley1, geom="bar", stat="identity",
fill=fill) + scale_fill_identity()

See http://had.co.nz/ggplot2/scale_identity.html and
http://had.co.nz/ggplot2/position_stack.html for more details.

Hadley



Re: [R] [OT]Web-Based Data Brushing

2007-06-12 Thread hadley wickham
On 6/12/07, Roy Mendelssohn <[EMAIL PROTECTED]> wrote:
> I apologize for the off-topic post, but my Google search did not turn
> up much and I thought people on this list my have knowledge of this.
> I am looking for examples of  data brushing  (i.e. dynmaic linked
> plots) either on a web site, or in a web-based application, such as
> an AJAX app.  Even better if there is a way to do this in R.
>
> Thanks for any help.

It's not completely in R, but rggobi (http://www.ggobi.org/rggobi)
offers a tight link to ggobi (http://www.ggobi.org) which offers a
wide range of interactive and dynamic graphics, including linked
brushing.

Hadley



[R] [R-pkgs] Updated ggplot2 package (beta version)

2007-06-10 Thread hadley wickham
ggplot2
===

ggplot2 is a plotting system for R, based on the grammar of graphics,
which tries to take the good parts of base and lattice graphics and
none of the bad parts. It takes care of many of the fiddly details
that make plotting a hassle (like drawing legends) as well as
providing a powerful model of graphics that makes it easy to produce
complex multi-layered graphics.

Find out more at http://had.co.nz/ggplot2

Changes in version 0.5.1 --

 * new chapter in book and changes to package to make it possible to
customise every aspect of ggplot display using grid

 * a new economic data set to help demonstrate line, path and area plots

 * many bug fixes reported by beta testers

Hadley

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] comparing two vectors

2007-06-10 Thread hadley wickham
On 6/10/07, gallon li <[EMAIL PROTECTED]> wrote:
> Suppose I have a vector A=c(1,2,3)
>
> now I want to compare each element of A to another vector L=c(0.5, 1.2)
>
> and then recode values for sum(A>0.5) and sum(A>1.2)
>
> to get a result of (3,2)
>
> how can I get this without writing a loop of sums?

How about colSums(outer(A, L, ">"))
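
Spelled out step by step with the vectors from the question (base R only):

```r
A <- c(1, 2, 3)
L <- c(0.5, 1.2)

# outer() builds a length(A) x length(L) logical matrix:
# cell [i, j] is A[i] > L[j]
cmp <- outer(A, L, ">")

# Summing each column counts how many elements of A exceed each threshold
counts <- colSums(cmp)
counts
# [1] 3 2

# Equivalent, one threshold at a time
counts2 <- sapply(L, function(l) sum(A > l))
```

The outer() version does all comparisons in one vectorised call, at the cost
of building the full comparison matrix.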

Hadley



Re: [R] wrapping lattice xyplot

2007-06-08 Thread hadley wickham
On 6/8/07, Zack Weinberg <[EMAIL PROTECTED]> wrote:
> This is an expanded version of the question I tried to ask last night
> - I thought I had it this morning, but it's still not working and I
> just do not understand what is going wrong.
>
> What I am trying to do is write a wrapper for lattice xyplot() that
> passes a whole bunch of its secondary arguments, so that I can produce
> similarly formatted graphs for several different data sets.  This is
> what I've got:
>
> graph <- function (x, data, groups, xlab) {
>   g <- eval(substitute(groups), data, parent.frame())
>
>   pg <- function(x, y, group.number, ...) {
> panel.xyplot(x, y, ..., group.number=group.number)
> panel.text(2, unique(y[x==2]),
>levels(g)[group.number],
>pos=4, cex=0.5)
>   }
>
>   xyplot(x, data=data, groups=substitute(g),
>   type='l',
>   ylab=list(cex=1.1, label='Mean RT (ms)'),
>   xlab=list(cex=1.1, label=xlab),
>   scales=list(
> x=list(alternating=c(1,1), tck=c(1,0)),
> y=list(alternating=c(1,0))
> ),
>   panel=panel.superpose,
>   panel.groups=pg
>   )
> }
>
> "pg" is supposed to pick "g" up from the lexical enclosure. I have no
> idea whether that actually works, because it never gets that far.  A
> typical call to this function looks like so:
>
> > graph(est ~ pro | hemi, sm, obs, "Probe type")
>
> (where 'sm' is a data frame that really does contain all four columns
> 'est', 'pro', 'hemi', and 'obs', pinky swear) and, as it stands above,
> invariably gives me this error:
>
> Error in eval(expr, envir, enclos) : object "est" not found
>
> I tried substitute(x) (as that seems to have cured a similar problem
> with "g") but then x is not a formula and method dispatch fails.
>
> Help?
> zw

It's not lattice, but ggplot2, http://had.co.nz/ggplot2, is designed
to make this easy because you don't have to specify the data set when
creating the plot. e.g.

install.packages("ggplot2", dep=T)
library(ggplot2)

# This is an abstract definition of a plot - it doesn't have any data yet
p <- ggplot(mapping = aes(x=cyl, y=mpg)) + geom_point() +
geom_smooth(method="lm")

mt2 <- mtcars * 2
mt3 <- as.data.frame(mtcars ^ 2)

# Add datasets
p %+% mtcars
p %+% mt2
p %+% mt3
# (the syntax isn't great, but you get the idea)

# Or even changing the default mapping from data to visual properties
p %+% mt3 + aes(x = mpg, y=wt)

Obviously, you can do even more within a function, and the aes call is
relatively easy to create programmatically (although not well
documented currently, so please ask me for more details if you are
interested).
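
As one possible sketch of building the mapping programmatically, here is a
wrapper that takes column names as strings. It uses the .data pronoun from
current ggplot2 releases, which is an assumption about the reader's version
and not the mechanism available when this was written; the function name is
made up.

```r
library(ggplot2)

# Hypothetical wrapper: column names arrive as character strings
scatter_with_fit <- function(data, xvar, yvar) {
  ggplot(data, aes(x = .data[[xvar]], y = .data[[yvar]])) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x) +
    labs(x = xvar, y = yvar)
}

# The same formatting applied to different columns and data sets
p1 <- scatter_with_fit(mtcars, "cyl", "mpg")
p2 <- scatter_with_fit(mtcars * 2, "mpg", "wt")
```

Printing p1 or p2 draws the plot; nothing is rendered until then.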

Hadley



Re: [R] rlm results on trellis plot

2007-06-08 Thread hadley wickham
On 6/7/07, Alan S Barnett <[EMAIL PROTECTED]> wrote:
> How do I add to a trellis plot the best fit line from a robust fit? I
> can use panel.lm to add a least squares fit, but there is no panel.rlm
> function.

It's not trellis, but it's really easy to do this with ggplot2:

install.packages("ggplot2", dependencies = TRUE)
library(ggplot2)
library(MASS)  # provides rlm()

p <- qplot(x, y, data=diamonds)
p + geom_smooth(method="lm")
p + geom_smooth(method="rlm")
p + geom_smooth(method="lm", formula = y ~ poly(x, 3))

See http://had.co.nz/ggplot2/stat_smooth.html for more examples.

Hadley



Re: [R] Barplots: Editing the frequency x-axis names

2007-06-08 Thread hadley wickham
On 6/8/07, Tom.O <[EMAIL PROTECTED]> wrote:
>
> Hi
> I have a timeSeries object (X) with monthly returns. I want to display the
> returns with a barplot, which I can do easily. But my problem is labeling
> the x-axis: if I use the positions from the time series, it gets very messy. I
> have tried rotating the labels and changing the font size, but that doesn't
> do the trick. I think the optimal solution for my purpose is to display only
> every second or third date, or perhaps only every 12th month. But how do I
> do that?

It's quite easy to do that with ggplot2, see below, or
http://had.co.nz/ggplot2/scale_date.html for examples.

df <- data.frame(
 date = seq(Sys.Date(), len=100, by="1 day")[sample(100, 50)],
 price = runif(50)
)

qplot(date, price, data=df, geom="line")
qplot(date, price, data=df, geom="bar", stat="identity")
qplot(date, price, data=df, geom="bar", stat="identity") +
scale_x_date(major="2 months")
qplot(date, price, data=df, geom="bar", stat="identity") +
scale_x_date(major="10 day", format="%d-%m")
qplot(date, price, data=df, geom="bar", stat="identity") +
scale_x_date(major="5 day", format="%d-%m")
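
For the monthly case in the question, a sketch along the same lines is
shown below. It assumes a later ggplot2 release (2.0.0 or newer, as far as
I know), where the spacing and format of date labels are set with the
date_breaks and date_labels arguments instead of major and format. The
data frame is simulated purely for illustration; it is not the poster's
timeSeries object.

```r
library(ggplot2)

set.seed(42)  # simulated monthly returns, for illustration only
returns <- data.frame(
  date = seq(as.Date("2000-01-01"), by = "month", length.out = 60),
  ret  = rnorm(60, mean = 0.005, sd = 0.04)
)

# One bar per month, but an axis label only every 12 months
p <- ggplot(returns, aes(date, ret)) +
  geom_col() +
  scale_x_date(date_breaks = "12 months", date_labels = "%b %Y")
```

print(p) draws the plot; changing date_breaks to "3 months" would label
every third month instead.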

Hadley



Re: [R] Multiple color schemes for barchart (lattice)

2007-06-06 Thread hadley wickham
On 6/6/07, Sarah Hawley <[EMAIL PROTECTED]> wrote:
> Hello R-help.
>
> I am trying to make a stacked barplot where the color of the sections of
> each bar depends on another variable.
>
> > myData[1:11,]
>    score   percent    marker     cellType Malignant
> 1      0 100.0     ESR1 (ER) Bladder.M(5)      TRUE
> 2      0  80.0          PAX8 Bladder.M(5)      TRUE
> 3      1  20.0          PAX8 Bladder.M(5)      TRUE
> 4      0 100.0     ESR1 (ER)   Brain.N(3)     FALSE
> 5      0 100.0          PAX8   Brain.N(3)     FALSE
> 6      3 100.0     ESR1 (ER) Breast.M(11)      TRUE
> 7      0 100.0          PAX8 Breast.M(11)      TRUE
> 8      0  36.36364 ESR1 (ER) Cervix.M(11)      TRUE
> 9      1   9.09091 ESR1 (ER) Cervix.M(11)      TRUE
> 10     2  18.18182 ESR1 (ER) Cervix.M(11)      TRUE
> 11     3  36.36364 ESR1 (ER) Cervix.M(11)      TRUE
>
> palette <- palette(gray(seq(0, 1,len=4)))
> trellis.par.set(list(par.xlab.text=list(cex=0.85)
>                    , superpose.polygon=list(col=palette())
>                    , axis.text=list(cex=0.8)))
>
>
> barchart(percent~cellType|marker
> , groups=score
> , data=myData
> , stack=TRUE
> , xlab='N=Normal/Benign, M=Malignant'
> , ylab='Percentage of Cores Staining'
> , color=palette()
> , auto.key = list(points = FALSE, rectangles = TRUE, space = "top")
> , scales=list(x=list(rot=70))
> , layout=c(1,2))
>
> I would like to make the color scheme of the bar differ according to the
> variable 'Malignant' and add a second color scheme to the key.

It's pretty easy to do this with ggplot2 - see
http://had.co.nz/ggplot2/position_stack.html for some examples.
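
A minimal sketch of one way this might look, using the eleven rows quoted
above. It assumes a later ggplot2 release (geom_col() arrived in 2.2.0, as
far as I know), and the choice to show 'Malignant' through transparency is
only one option among several (a combined fill such as
interaction(score, Malignant) would be another); it is not taken from the
linked page.

```r
library(ggplot2)

# The rows shown in the question
myData <- data.frame(
  score     = c(0, 0, 1, 0, 0, 3, 0, 0, 1, 2, 3),
  percent   = c(100, 80, 20, 100, 100, 100, 100,
                36.36364, 9.09091, 18.18182, 36.36364),
  marker    = c("ESR1 (ER)", "PAX8", "PAX8", "ESR1 (ER)", "PAX8",
                "ESR1 (ER)", "PAX8", rep("ESR1 (ER)", 4)),
  cellType  = c(rep("Bladder.M(5)", 3), rep("Brain.N(3)", 2),
                rep("Breast.M(11)", 2), rep("Cervix.M(11)", 4)),
  Malignant = c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE,
                TRUE, TRUE, TRUE, TRUE)
)

# Stacked bars: grey level encodes score, transparency encodes Malignant.
# Each aesthetic gets its own legend, giving two keys.
p <- ggplot(myData, aes(cellType, percent,
                        fill = factor(score), alpha = Malignant)) +
  geom_col() +
  facet_wrap(~ marker, ncol = 1) +
  scale_fill_grey(name = "score") +
  scale_alpha_manual(values = c(0.5, 1)) +
  labs(x = "N=Normal/Benign, M=Malignant",
       y = "Percentage of Cores Staining") +
  theme(axis.text.x = element_text(angle = 70, hjust = 1))
```

print(p) draws the two-panel chart, one panel per marker.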

Hadley


