[R] barchart (lattice) with text labels
I would like to place the value for each bar in barchart (lattice) at the top of each bar. Something like the following code produces. library(lattice) mypanelfunc - function(x, y, ...) { panel.barchart(x, y, ...) panel.text(x, y, labels=as.character(round(x,2)), ...) } myprepanelfunc - function(x, y, ...) list(xlim=c(0, max(x)+.1)) mydata - expand.grid(a=factor(1:5), b=factor(1:3), c=factor(1:2)) mydata$x - runif(nrow(mydata)) barchart(a~x|b, mydata, groups=c, panel=mypanelfunc, prepanel=myprepanelfunc, adj=c(-0.1,0.5)) However, I cannot figure out how to shift the values to correspond with their respective grouped bar. Has anyone done this type of thing? Thanks. Mark Lyman __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Google Custom Search Engine for R
There appears to be another site (linked from Wikipedia) that lists some of the sites that it searches: http://www.dangoldstein.com/search_r.html --- Steven McKinney [EMAIL PROTECTED] wrote: Whereas R is very generic, CRAN is much less so. I've had very good luck adding CRAN to my search terms, e.g. try to Google cran 3d scatterplot This produces all R-related hits on the first Google page. Hope this helps Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: [EMAIL PROTECTED] tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: [EMAIL PROTECTED] on behalf of Matthew Keller Sent: Fri 2/23/2007 7:51 AM To: Sérgio Nunes Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Google Custom Search Engine for R Hi Sergio, There was a discussion on this board recently about the difficulty of searching for R related material on the web. I think the custom google search engine is a good idea. It would be helpful if we could have access to the full list of websites it is indexing so that we could make suggestions about other sites that are missing. As it is, it only tells us that there are 35 websites, and shows us the first several. Also, you might check out Sasha Goodman's Rseek: http://www.rseek.org/ Have you tried to compare the success of yours with Rseek? All the Best, Matt On 2/23/07, Sérgio Nunes [EMAIL PROTECTED] wrote: Hi, Since R is a (very) generic name, I've been having some trouble searching the web for this topic. Due to this, I've just created a Google Custom Search Engine that includes several of the most relevant sites that have information on R. See it in action at: http://google.com/coop/cse?cx=018133866098353049407%3Aozv9awtetwy This is really a preliminary test. Feel free to add yourself to the project and contribute with suggestions. Sérgio Nunes __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Matthew C Keller Postdoctoral Fellow Virginia Institute for Psychiatric and Behavioral Genetics __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. TV dinner still cooling? Check out Tonight's Picks on Yahoo! TV. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to plot two graphs on one single plot?
you can also call plot(x1,y1) par(new=TRUE) plot(x2,y2,axes=FALSE) but this is more helpful for when you want to plot with multiple y-axes. in general, it is also possible to build up from low-level graphical elements: plot.new() plot.window(xlim=range(x1),ylim=range(y1)) lines(x1,y1) axis(1); axis(2); box() plot.window(xlim=range(x2),ylim=range(y2)) lines(x2,y2) axis(1); axis(2); box() if you want both plots to be on the same scale, you can specify the axes limits a priori, or retrieve the limits from the first figure to call in the second by par(usr). to use these in the second xlim, ylim call, you should set par(xaxs=i,yaxs=i) so they aren't padded above and below the specified limits. (see ?par for xaxs and yaxs) best regards, st --- Steven McKinney [EMAIL PROTECTED] wrote: Broadly speaking, there are a few classes of plotting functions. 1) New graph functions. The plot() function starts a new graph. 2) Add to a graph functions The points(), lines(), text() etc. functions add something to the current graph. 3) Control graph functions par() controls various aspects of the graph. R graphics experts might have some better classification and terminology. So you want your second plotting function to be points() rather than plot(), to add to the existing graph. Try x1=rnorm(25, mean=0, sd=1) y1=dnorm(x1, mean=0, sd=1) x2=rnorm(25, mean=0, sd=1) y2=dnorm(x2, mean=0, sd=1) plot(x1, y1, type='p', xlim=range(x1,x2), ylim=range(y1, y2), xlab='x', ylab='y') points(x2, y2, type='p', col=red, xlab='x', ylab='y') Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: [EMAIL PROTECTED] tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: [EMAIL PROTECTED] on behalf of Yun Zhang Sent: Fri 2/23/2007 7:34 AM To: Henrique Dallazuanna Cc: r-help@stat.math.ethz.ch Subject: Re: [R] How to plot two graphs on one single plot? Thanks. Now R plots two graphs on one plot. Yet they are still on two graphs, vertically parallelized with each other. But what I want to do is actually plotting two distribution on one single graph, using the same x and y axis. Like: | | | (dist2) | (dist 1) | --- Is it possible to do that? Thanks, Yun Henrique Dallazuanna wrote: par(mfrow=c(2,1)) #your plot #after plot par(mfrow=c(1,1)) On 23/02/07, *Yun Zhang* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Hi, I am trying to plot two distribution graph on one plot. But I dont know how. I set them to the same x, y limit, even same x, y labels. Code: x1=rnorm(25, mean=0, sd=1) y1=dnorm(x1, mean=0, sd=1) x2=rnorm(25, mean=0, sd=1) y2=dnorm(x2, mean=0, sd=1) plot(x1, y1, type='p', xlim=range(x1,x2), ylim=range(y1, y2), xlab='x', ylab='y') plot(x2, y2, type='p', col=red, xlab='x', ylab='y') They just dont show up in one plot. Any hint will be very helpful. Thanks, Yun __ R-help@stat.math.ethz.ch mailto:R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná Brasil __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Need a quick answer? Get one in minutes from people who know. Ask your question on www.Answers.yahoo.com __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Overlaying graphics
Not pretty, but you could possibly try: # first map map(#arguments#) xylim = par(usr) # second map out = map(#arguments#, plot=FALSE) par(xaxs=i,yaxs=i) plot.window(xlim=xylim[1:2],ylim=xylim[3:4]) polygon(out) --- Takatsugu Kobayashi [EMAIL PROTECTED] wrote: Rusers: This is a very fundamental and silly question. How can I overlay different maps on the same x and y scales? I do neither want to change X and Y axes nor show different x and y ticks on top of each other. Thanks Taka Indiana University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Making two lines graph ...
Hello! Can anyone help me to build a graph with the alphanumeric values on x-axis, with two lines (preferably doted and solid, or similar) that present values on y-axes. In a toy example, data frame could be like this: x.orig x.num y1 y2 a 1 0.2 0.4 b 2 0.1 0.1 c 3 0.3 0.3 d 4 0.3 0.15 e 5 0.1 0.05 I can make graph only if I use values converted to numeric in x.num, but not original X.orig: matplot(dat$x.num, dat[, c(y1,y2)], type=b, lty=1, ylab=(1) y1, (2) y2) Also, how to make doted and solid instead of coloured lines? Thanks in advance. Sincerely, Petar M __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] pdf with an exact size
The package grid provides very convenient tools for such things: library(grid) pdf(file = square.pdf, paper = a4) pushViewport(viewport()) grid.rect(width = 100, height = 100, default.units = mm) dev.off() works just fine for me (R 2.4.1, MS WIndows XP): the printed output on my inkjet printer is exactly 100 mm x 100 mm. Best, Renaud 2007/2/23, Joerg van den Hoff [EMAIL PROTECTED]: On Fri, Feb 23, 2007 at 04:24:54PM -0200, Alberto Monteiro wrote: Is it possible to create a pdf output file with an (as nearly as possible) exact size? For example, if I want to draw in an A4 paper (210 x 297 mm) a square of 100 x 100 mm, how can I do it? FWIW, about 6 months ago I learned here how to create an exact png image. For example, if I want a 500 x 500 black square in a 1000 x 1000 white png, occupying the center of the png, the procedure is this: png(image.png, width=1000, height=1000, bg=white) par(mar=c(0,0,0,0)) # reset margins plot(0, xlim=c(0, 999), ylim=c(0, 999), col=white) par(usr=c(0, 999, 0, 999)) points(c(250, 250, 749, 749, 250), c(250, 749, 749, 250, 250), type=l, col=black) dev.off() However, I don't know how do this with a pdf monstr... oops... file. Alberto Monteiro going via `bitmap' and using the `pdfwrite' type is probably the better idea since in this case `ghostscript` is used for pdf generation which seems over all to generate cleaner/better pdf-output in comparsion to the R internal pdf-device (I believe there was recently some discussion of this on the list?). joerg __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Renaud LANCELOT Département Systèmes Biologiques du CIRAD CIRAD, Biological Systems Department Campus International de Baillarguet TA 30 / B F34398 Montpellier Tel +33 (0)4 67 59 37 17 Secr. +33 (0)4 67 59 37 37 Fax +33 (0)4 67 59 37 95 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] gsub: replacing a.*a if no occurence of b in .*
I am trying to read a number of XML files using xmlTreeParse(). Unfortunately, some of them are malformed in a way that makes R crash. The problem is that closing tags are sometimes repeated like this: tagvalue1/tagtagvalue2/tagsome garbage/tag/tagtagvalue3/tag I want to preprocess the contents of the XML file using gsub() before feeding them to xmlTreeParse() to clean them up, but I can't figure out how to do it. What I need is something that transforms the example above into: tagvalue1/tagtagvalue2/tagtagvalue3/tag Some kind of /tag.*/tag that only matches if there is no tag in .*. Thanks in advance for you ideas, Uli __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] recovering collums of DF using a text var.list
Hello people, I would like to know how can I use a list of variables (a char list) to have access to the collums from a dataframe to be used in some analysis like, just as example, a ploting task on a for() loop. Of course the code below is just to understand the way. In this example I have a dataframe with several collumns (more then fifty in my case) and I would like do use only some of them. I really need use a var.list! a-seq(1,100,1) b-c(rep(c(1,2,3,4,5),20)) c-rnorm(100,0,1) d-runif(100,0,1) e-c^2 f-c/d g-c-d df-data.frame(cbind(a,b,c,d,e,f,g)) var.list-c(c,f,g) for (myvar in var.list) { plot(density(df$myvar)) # here I need recover df$c , df$f and df$g } Kind regards Miltinho Brazil __ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to plot two graphs on one single plot?
Hi Yun, try this. x1 - rnorm(1, 0.5,0.4018) x2 - rnorm(1, 0.01919,0.3969) d1 - density(x1) d2 - density(x2) plot(range(d1$x,d2$x), range(d1$y, d2$y), type = n, xlab = X, ylab = Y ) lines(d1, col = blue,lwd=2) lines(d2, col = red,lwd=2) Cheers Nguyen -Original Message- From: [EMAIL PROTECTED] on behalf of Yun Zhang Sent: Fri 2/23/2007 7:34 AM To: Henrique Dallazuanna Cc: r-help@stat.math.ethz.ch Subject: Re: [R] How to plot two graphs on one single plot? Thanks. Now R plots two graphs on one plot. Yet they are still on two graphs, vertically parallelized with each other. But what I want to do is actually plotting two distribution on one single graph, using the same x and y axis. Like: | | | (dist2) | (dist 1) | --- Is it possible to do that? Thanks, Yun Henrique Dallazuanna wrote: par(mfrow=c(2,1)) #your plot #after plot par(mfrow=c(1,1)) On 23/02/07, *Yun Zhang* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Hi, I am trying to plot two distribution graph on one plot. But I dont know how. I set them to the same x, y limit, even same x, y labels. Code: x1=rnorm(25, mean=0, sd=1) y1=dnorm(x1, mean=0, sd=1) x2=rnorm(25, mean=0, sd=1) y2=dnorm(x2, mean=0, sd=1) plot(x1, y1, type='p', xlim=range(x1,x2), ylim=range(y1, y2), xlab='x', ylab='y') plot(x2, y2, type='p', col=red, xlab='x', ylab='y') They just dont show up in one plot. Any hint will be very helpful. Thanks, Yun [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Making two lines graph ...
Does this do what you want? x - x.orig x.num y1 y2 a 1 0.2 0.4 b 2 0.1 0.1 c 3 0.3 0.3 d 4 0.3 0.15 e 5 0.1 0.05 x.in - read.table(textConnection(x), header=TRUE) plot(x.in$y2, type='l', ylim=range(x.in$y1, x.in$y2), xaxt='n', col='red', xlab='', ylab='y') lines(x.in$y1, lty=2, col='green') axis(1, labels=as.character(x.in$x.orig), at=seq(nrow(x.in))) On 2/24/07, Petar Milin [EMAIL PROTECTED] wrote: Hello! Can anyone help me to build a graph with the alphanumeric values on x-axis, with two lines (preferably doted and solid, or similar) that present values on y-axes. In a toy example, data frame could be like this: x.orig x.num y1 y2 a 1 0.2 0.4 b 2 0.1 0.1 c 3 0.3 0.3 d 4 0.3 0.15 e 5 0.1 0.05 I can make graph only if I use values converted to numeric in x.num, but not original X.orig: matplot(dat$x.num, dat[, c(y1,y2)], type=b, lty=1, ylab=(1) y1, (2) y2) Also, how to make doted and solid instead of coloured lines? Thanks in advance. Sincerely, Petar M __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] random uniform sample of points on an ellipsoid (e.g. WG
[Apologies if this is a repeated posting for you. Something seems to have gone amiss with my previous attempts to post this reply, as seen from my end] On 22-Feb-07 Roger Bivand wrote: On 21 Feb 2007, Russell Senior wrote: I am interested in making a random sample from a uniform distribution of points over the surface of the earth, using the WGS84 ellipsoid as a model for the earth. I know how to do this for a sphere, but would like to do better. I can supply random numbers, want latitude longitude pairs out. Can anyone point me at a solution? Thanks very much. http://www.csit.fsu.edu/~burkardt/f_src/random_data/random_data.html looks promising, untried. Hmmm ... That page didn't seem to be directly useful, since on my understanding of the code (and comments) listed under subroutine uniform_on_ellipsoid_map(dim_num, n, a, r, seed, x) UNIFORM_ON_ELLIPSOID_MAP maps uniform points onto an ellipsoid. in http://www.csit.fsu.edu/~burkardt/f_src/random_data/random_data.f90 it takes points uniformly distributed on a sphere and then linearly transforms these onto an ellipsoid. This will not give unform density over the surface of the ellipsoid: indeed the example graph they show of points on an ellipse generated in this way clearly appear to be more dense at the ends of the ellipse, and less dense on its sides. See: http://www.csit.fsu.edu/~burkardt/f_src/random_data/ uniform_on_ellipsoid_map.png [all one line] Indeed, if I understand their method correctly, in the case of a horizontal ellipse it is equivalent (modulo rotating the result) to distributing the points uniformly over a circle, and then stretching the circle sideways. This will preserve the vertical distribution (so at the two ends of the major axis it has the same density as on the circle) but diluting the horizontal distribution (so that at the two ends of the minor axis the density isless than on the circle). I did have a notion about this, but sat on it expecting that someone would come up with a slick solution -- which hasn't happened yet. For the application you have in hand, uniform distribution over a sphere is a fairly close approximation to uniform distriobution over the ellipspoid -- but not quite. But a rejection method, applied to points uniform on the sphere, can give you points uniform on the ellipsoid and, because of the close approximation of the sphere to the ellipsoid, you would not be rejecting many points. The outline strategy I had in mind (I haven't worked out details) is based on the following. Consider a point X0 on the sphere, at radial distance r0 from the centre of the sphere (same as the centre of the ellipsoid). Let the radius through that point meet the ellipsoid at a point X1, at radial distance R1. Let dS0 be an element of area at X0 on the sphere, which projects radially onto an element of area dS1 on the ellipsoid. You want all elements dS1 of equal size to be equally likely to receive a random point. Let the angle between the tangent plane to the ellipsoid at X1, and the tangent plane to the sphere at X0, be phi. The the ratio of areas dS1/dS0 is R(X0), say, where R(X0) = dS1/dS0 = r1^2/(r0^2 * cos(phi)) and the smaller this ratio, the less likely you want a point u.d. on the sphere to give rise to a point on the ellipsoid. Now define an acceptance probability P(X0) by P(X0) = R(X0)/sup[R(X)] taking the supremum over X on the sphere. Then sample points X0 unformly on the sphere, accepting each one with probability P(X0), and continue sampling until you have the number of points that you need. Maybe someone has a better idea ... (or code for the above!) Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 24-Feb-07 Time: 13:45:50 -- XFMail -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with weights on lmer function
On 2/23/07, Ronaldo Reis Junior [EMAIL PROTECTED] wrote: Em Quinta 22 Fevereiro 2007 20:36, Andrew Robinson escreveu: Hi Ronaldo, I suggest that you send us a small, well-documented, code example that we can reproduce. It certainly looks as though there is a problem, but given this information it's hard to know what it is! Cheers Andrew Andrew and all R users. Look this example: test-structure(list(subject = structure(c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2), .Label = c(S1, S2), class = factor), time = c(0, 7, 15, 22, 32, 39, 46, 53, 0, 7, 14, 24, 28, 34, 41, 48), noccup = c(0, 1, 2, 1, 6, 4, 3, 3, 0, 18, 21, 14, 7, 14, 12, 8), ntotal = c(100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100)), .Names = c(subject, time, noccup, ntotal), class = data.frame, row.names = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)) I have 2 subject over 8 times. Each subject is compound by 100 pieces. I measure the number of piece occuped in any time. I try this: m1-lmer(noccup/ntotal~time+(time|subject),family=binomial,weights=ntotal) Error in lmer(noccup/ntotal ~ time + (time | subject), family = binomial, : object `weights' of incorrect type I dont understand why this error. m1-lmer(noccup/ntotal~1+(time|subject),family=binomial,weights=ntotal) Error in lmer(noccup/ntotal ~ time + (time | subject), family = binomial, : object `weights' of incorrect type Why weights is of incorrect type? In glm this is correct type. I don't get such an error using lme4 0.9975-10 under R-2.4.1 on i386-apple-darwin8.8.1 Of course the model fit is singular because you are attempting to estimate 3 variance-covariance components from 2 groups. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gsub: replacing a.*a if no occurence of b in .*
Ulrich Keller [EMAIL PROTECTED] writes: I am trying to read a number of XML files using xmlTreeParse(). Unfortunately, some of them are malformed in a way that makes R crash. The problem is that closing tags are sometimes repeated like this: tagvalue1/tagtagvalue2/tagsome garbage/tag/tagtagvalue3/tag I want to preprocess the contents of the XML file using gsub() before feeding them to xmlTreeParse() to clean them up, but I can't figure out how to do it. What I need is something that transforms the example above into: tagvalue1/tagtagvalue2/tagtagvalue3/tag Some kind of /tag.*/tag that only matches if there is no tag in .*. Thanks in advance for you ideas, Hmm, there are things you just cannot do with RE's, and I suspect that this is one of them. Something involving explicit splitting of the strings might work, though. How's this for size? trim - function(x)paste(sub(/tag.*,/tag,x),collapse=tag) sapply(strsplit(x,tag),trim) [1] tagvalue1/tagtagvalue2/tagtagvalue3/tag -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gsub: replacing a.*a if no occurence of b in .*
On Sat, 2007-02-24 at 15:03 +0100, Peter Dalgaard wrote: Ulrich Keller [EMAIL PROTECTED] writes: I am trying to read a number of XML files using xmlTreeParse(). Unfortunately, some of them are malformed in a way that makes R crash. The problem is that closing tags are sometimes repeated like this: tagvalue1/tagtagvalue2/tagsome garbage/tag/tagtagvalue3/tag I want to preprocess the contents of the XML file using gsub() before feeding them to xmlTreeParse() to clean them up, but I can't figure out how to do it. What I need is something that transforms the example above into: tagvalue1/tagtagvalue2/tagtagvalue3/tag Some kind of /tag.*/tag that only matches if there is no tag in .*. Thanks in advance for you ideas, Hmm, there are things you just cannot do with RE's, and I suspect that this is one of them. Something involving explicit splitting of the strings might work, though. How's this for size? trim - function(x)paste(sub(/tag.*,/tag,x),collapse=tag) sapply(strsplit(x,tag),trim) [1] tagvalue1/tagtagvalue2/tagtagvalue3/tag Does this work? XML [1] tagvalue1/tagtagvalue2/tagsome garbage/tag/tagtagvalue3/tag gsub([^]*(/tag){2}, , XML) [1] tagvalue1/tagtagvalue2/tagtagvalue3/tag This looks for any characters != '' that precedes a /tag/tag sequence. It replaces that with . ? Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Making two lines graph ...
Or a possible alternative still using matplot: matplot(DF[, 3:4], type = b, xaxt = n, pch = c(21, 22), col = black, lty = c(dotted, solid), ylab = Y Vals, xlab = Groups) axis(1, at = 1:5, labels = as.character(DF$x.orig)) legend(topright, legend = c(Group 1, Group 2), lty = c(dotted, solid), pch = c(21, 22)) See ?points for different point symbols (pch) if you prefer. HTH, Marc Schwartz On Sat, 2007-02-24 at 08:08 -0500, jim holtman wrote: Does this do what you want? x - x.orig x.num y1 y2 a 1 0.2 0.4 b 2 0.1 0.1 c 3 0.3 0.3 d 4 0.3 0.15 e 5 0.1 0.05 x.in - read.table(textConnection(x), header=TRUE) plot(x.in$y2, type='l', ylim=range(x.in$y1, x.in$y2), xaxt='n', col='red', xlab='', ylab='y') lines(x.in$y1, lty=2, col='green') axis(1, labels=as.character(x.in$x.orig), at=seq(nrow(x.in))) On 2/24/07, Petar Milin [EMAIL PROTECTED] wrote: Hello! Can anyone help me to build a graph with the alphanumeric values on x-axis, with two lines (preferably doted and solid, or similar) that present values on y-axes. In a toy example, data frame could be like this: x.orig x.num y1 y2 a 1 0.2 0.4 b 2 0.1 0.1 c 3 0.3 0.3 d 4 0.3 0.15 e 5 0.1 0.05 I can make graph only if I use values converted to numeric in x.num, but not original X.orig: matplot(dat$x.num, dat[, c(y1,y2)], type=b, lty=1, ylab=(1) y1, (2) y2) Also, how to make doted and solid instead of coloured lines? Thanks in advance. Sincerely, Petar M __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gsub: replacing a.*a if no occurence of b in .*
All these methods do assume that you don't have nested tag's, like so: tagtagfoo/taguseful stuff/tagsome garbage/tag For that you would really need a true parser. So I would double-check to make sure this doesn't happen. Do you have any control on where those XML files are generated though? It sounds to me it might be easier to fix the utility generating those XML files, since it clearly is doing something wrong. On Feb 24, 2007, at 11:07 AM, Gabor Grothendieck wrote: I assume tag is known. This removes any occurrence /tag.*/tag where .* does not contain tag or /tag. The regular expression, re, matches /tag, then does a greedy match (?U) for anything followed by /tag but uses a zero width lookahead subexpression (?=...) for the second /tag so that it it can be rematched again. gsubfn in package gsubfn is like the usual gsub except that instead of replacing the match with a string it passes the match to function f and then replaces the match with the output of f. See the gsubfn home page: http://code.google.com/p/gsubfn/ and vignette. Haris Skiadas Department of Mathematics and Computer Science Hanover College __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gsub: replacing a.*a if no occurence of b in .*
The _question_ assumed that, which is why the answers did too. On 2/24/07, Charilaos Skiadas [EMAIL PROTECTED] wrote: All these methods do assume that you don't have nested tag's, like so: tagtagfoo/taguseful stuff/tagsome garbage/tag For that you would really need a true parser. So I would double-check to make sure this doesn't happen. Do you have any control on where those XML files are generated though? It sounds to me it might be easier to fix the utility generating those XML files, since it clearly is doing something wrong. On Feb 24, 2007, at 11:07 AM, Gabor Grothendieck wrote: I assume tag is known. This removes any occurrence /tag.*/tag where .* does not contain tag or /tag. The regular expression, re, matches /tag, then does a greedy match (?U) for anything followed by /tag but uses a zero width lookahead subexpression (?=...) for the second /tag so that it it can be rematched again. gsubfn in package gsubfn is like the usual gsub except that instead of replacing the match with a string it passes the match to function f and then replaces the match with the output of f. See the gsubfn home page: http://code.google.com/p/gsubfn/ and vignette. Haris Skiadas Department of Mathematics and Computer Science Hanover College __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Multiple comparisons when interacction]
Hello, I send the message again with the data file as txt because it seems not to be accepted as csv in the R-help list. Data comes from a multiyear field experiment in which 4 levels of a treatment (2, 3, 4, 6) are compared to see the effect on yield. It is a randomized complete block design. The SAS code follows: options ls=95; data uno; infile 'data.txt' delimiter=';' firstobs=2; input year plot block treat yield; run; proc mixed data=uno; class treat year block; model yield=block year treat treat*year; lsmeans year treat /pdiff; lsmeans treat*year /slice=year pdiff; ods output diffs=dos; run; data tres; set dos; if year=_year; proc print data=tres; var year _year treat _treat estimate stderr df tvalue probt; run; Data are attached as a file: data.csv. In fact, I do not know if this is the best approach to analyze the data: - Should block be considered as random? We use the same file and randomization every year. We are interested in the long term effect of the treatment. - Data should be considered as repeated measurements over time (years)? In multcomp package: - What is the equivalence between the tests proposed (Sequen, AVE, Changepoint, Williams, Marcus, McDermott) and the tests agronomist are used to do: LSD (least significant difference), Duncan multiple range test, Scheffe, S-N-K (Student-Newman-Keuls)? Thanks a lot for your interest. Jorge Lampurlanés Agronomist Is it possible to do this analysis in R? Yes, it is possible. The syntax isn't in place yet. If you send me the complete SAS code and data for an example using slice, I will duplicate it for you in the multcomp package in R. I will send that to the R-help list and to Torsten and it will bring us one step closer to the syntax. The example I showed before was designed to get the same answer as S-Plus multicomp using the adjust= argument. Rich YEAR;PLOT;BLOC;LEVEL;YIELD 15;1;1;3;3896.00 15;2;1;2;3881.33 15;3;1;4;3394.11 15;4;1;6;3261.11 15;5;2;6;3273.66 15;6;2;4;3568.89 15;7;2;2;3535.33 15;8;2;3;3218.66 15;9;3;3;3311.00 15;10;3;4;3458.77 15;11;3;2;3543.44 15;12;3;6;2684.00 15;13;4;4;3591.77 15;14;4;3;3555.55 15;15;4;2;3511.11 15;16;4;6;2755.55 16;1;1;3;2736.33 16;2;1;2;2769.66 16;3;1;4;2353.11 16;4;1;6;2964.44 16;5;2;6;2631.11 16;6;2;4;2882.77 16;7;2;2;2203.66 16;8;2;3;2636.89 16;9;3;3;2461.55 16;10;3;4;2285.00 16;11;3;2;2361.66 16;12;3;6;2636.89 16;13;4;4;2639.78 16;14;4;3;2596.66 16;15;4;2;2928.44 16;16;4;6;2745.44 17;1;1;3;2751.63 17;2;1;2;2855.81 17;3;1;4;2089.71 17;4;1;6;4096.04 17;5;2;6;4254.52 17;6;2;4;2483.10 17;7;2;2;2920.86 17;8;2;3;3399.48 17;9;3;3;3443.28 17;10;3;4;3447.03 17;11;3;2;3839.87 17;12;3;6;4135.01 17;13;4;4;3549.82 17;14;4;3;3542.12 17;15;4;2;3597.47 17;16;4;6;4017.49 18;1;1;3;2106.00 18;2;1;2;2622.78 18;3;1;4;3023.33 18;4;1;6;3811.44 18;5;2;6;3204.11 18;6;2;4;2739.00 18;7;2;2;2700.33 18;8;2;3;2273.89 18;9;3;3;2131.78 18;10;3;4;3075.00 18;11;3;2;2997.44 18;12;3;6;3010.33 18;13;4;4;2868.22 18;14;4;3;2519.44 18;15;4;2;2855.33 18;16;4;6;3462.55 19;1;1;3;3802.27 19;2;1;2;3987.03 19;3;1;4;3879.28 19;4;1;6;2992.50 19;5;2;6;2221.98 19;6;2;4;3739.80 19;7;2;2;4523.07 19;8;2;3;4460.97 19;9;3;3;4587.39 19;10;3;4;4104.34 19;11;3;2;4018.42 19;12;3;6;2504.81 19;13;4;4;3819.31 19;14;4;3;4163.23 19;15;4;2;3718.99 19;16;4;6;2399.38 20;1;1;3;2320.64 20;2;1;2;1540.83 20;3;1;4;1473.81 20;4;1;6;1271.28 20;5;2;6;1025.46 20;6;2;4;1488.39 20;7;2;2;1728.99 20;8;2;3;2196.10 20;9;3;3;2037.82 20;10;3;4;1745.93 20;11;3;2;1876.27 20;12;3;6;1045.36 20;13;4;4;1376.80 20;14;4;3;2176.18 20;15;4;2;1985.51 20;16;4;6;647.27 21;1;1;3;5183.94 21;2;1;2;3583.61 21;3;1;4;3041.82 21;4;1;6;3645.95 21;5;2;6;3525.90 21;6;2;4;2989.62 21;7;2;2;3174.04 21;8;2;3;3021.54 21;9;3;3;2985.57 21;10;3;4;3099.75 21;11;3;2;3214.95 21;12;3;6;3454.56 21;13;4;4;3168.78 21;14;4;3;3244.26 21;15;4;2;3747.98 21;16;4;6;3060.55 22;1;1;3;2920.53 22;2;1;2;1987.68 22;3;1;4;2235.97 22;4;1;6;2693.58 22;5;2;6;2461.07 22;6;2;4;2255.39 22;7;2;2;2815.35 22;8;2;3;3192.49 22;9;3;3;2909.77 22;10;3;4;2832.01 22;11;3;2;2663.89 22;12;3;6;2010.96 22;13;4;4;2471.86 22;14;4;3;2615.70 22;15;4;2;2981.77 22;16;4;6;1719.13 23;1;1;3;4134.62 23;2;1;2;3276.45 23;3;1;4;3136.52 23;4;1;6;3784.49 23;5;2;6;3817.49 23;6;2;4;3506.78 23;7;2;2;4109.16 23;8;2;3;4009.86 23;9;3;3;4099.24 23;10;3;4;4111.49 23;11;3;2;4388.40 23;12;3;6;4090.12 23;13;4;4;4001.93 23;14;4;3;4195.77 23;15;4;2;4247.79 23;16;4;6;4119.71 24;1;1;3;1157.99 24;2;1;2;1008.99 24;3;1;4;822.09 24;4;1;6;565.01 24;5;2;6;518.99 24;6;2;4;889.10 24;7;2;2;1048.63 24;8;2;3;1396.62 24;9;3;3;1238.06 24;10;3;4;1061.40 24;11;3;2;1198.28 24;12;3;6;761.29 24;13;4;4;805.48 24;14;4;3;1172.09 24;15;4;2;1055.87 24;16;4;6;527.04__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained,
Re: [R] gsub: replacing a.*a if no occurence of b in .*
On Feb 24, 2007, at 11:37 AM, Gabor Grothendieck wrote: The _question_ assumed that, which is why the answers did too. Oh yes, I totally agree, the file snippet the OP provided did indeed assume that, though nothing in the text of his question did, so I wasn't entirely clear whether the actual file that is going to be processed has this form or not. So I just wanted to make sure the OP is aware of this limitation, in case the actual file is more problematic. But most importantly, I wanted to suggest a reevaluation, if possible, of the process that generates these XML's, and perhaps fixing that, instead of patching the problem after it has been created. Haris Skiadas Department of Mathematics and Computer Science Hanover College __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gsub: replacing a.*a if no occurence of b in .*
Charilaos Skiadas wrote: On Feb 24, 2007, at 11:37 AM, Gabor Grothendieck wrote: The _question_ assumed that, which is why the answers did too. Oh yes, I totally agree, the file snippet the OP provided did indeed assume that, though nothing in the text of his question did, so I wasn't entirely clear whether the actual file that is going to be processed has this form or not. So I just wanted to make sure the OP is aware of this limitation, in case the actual file is more problematic. But most importantly, I wanted to suggest a reevaluation, if possible, of the process that generates these XML's, and perhaps fixing that, instead of patching the problem after it has been created. Also, I wouldn't tolerate R *crashing* in package code on malformed xml input. Jeff -- http://biostat.mc.vanderbilt.edu/JeffreyHorner __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Woolf's test, Odds ratio, stratification
Just a general question concerning the woolf test (package vcd), when we have stratified data (2x2 tables) and when the p.value of the woolf-test is below 0.05 then we assume that there is a heterogeneity and a common odds ratio cannot be computed? Does this mean that we have to try to add more stratification variables (stratify more) to make the woolf-test p.value insignificant? Also in the same package there is an oddsratio function that seems to be calculating the odds ratio even when the values are non-integers and even when one cell has a null value (in contrast to the mantel_hanszel or the fisher exact test that do not admit zero values or non-integers) does anyone know what's the difference and how to optain a p.value on that odds ratio? Thanks -- View this message in context: http://www.nabble.com/Woolf%27s-test%2C-Odds-ratio%2C-stratification-tf3284589.html#a9136458 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] barchart (lattice) with text labels
Deepayan Sarkar wrote: On 2/24/07, Mark and Heather Lyman [EMAIL PROTECTED] wrote: I would like to place the value for each bar in barchart (lattice) at the top of each bar. Something like the following code produces. library(lattice) mypanelfunc - function(x, y, ...) { panel.barchart(x, y, ...) panel.text(x, y, labels=as.character(round(x,2)), ...) } myprepanelfunc - function(x, y, ...) list(xlim=c(0, max(x)+.1)) mydata - expand.grid(a=factor(1:5), b=factor(1:3), c=factor(1:2)) mydata$x - runif(nrow(mydata)) barchart(a~x|b, mydata, groups=c, panel=mypanelfunc, prepanel=myprepanelfunc, adj=c(-0.1,0.5)) However, I cannot figure out how to shift the values to correspond with their respective grouped bar. You should look at panel.barchart and try to reproduce the calculations done there. Deepayan That's what I was afraid of. I made a half-hearted attempt at that, but when I didn't work, I hoped there was an easier way. Well, I will give it another try. Mark Lyman __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with weights on lmer function
Hi Ronaldo, Thanks, that's helpful! I also don't get an error. Mind you, I added data=test to the model call. This is my system: sessionInfo() R version 2.4.1 Patched (2006-12-30 r40330) i386-unknown-freebsd6.1 locale: C attached base packages: [1] stats graphics grDevices utils datasets methods [7] base other attached packages: lme4 Matrix lattice 0.9975-13 0.9975-11 0.14-16 Cheers, Andrew On Sat, Feb 24, 2007 at 07:46:09AM -0600, Douglas Bates wrote: On 2/23/07, Ronaldo Reis Junior [EMAIL PROTECTED] wrote: Em Quinta 22 Fevereiro 2007 20:36, Andrew Robinson escreveu: Hi Ronaldo, I suggest that you send us a small, well-documented, code example that we can reproduce. It certainly looks as though there is a problem, but given this information it's hard to know what it is! Cheers Andrew Andrew and all R users. Look this example: test-structure(list(subject = structure(c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2), .Label = c(S1, S2), class = factor), time = c(0, 7, 15, 22, 32, 39, 46, 53, 0, 7, 14, 24, 28, 34, 41, 48), noccup = c(0, 1, 2, 1, 6, 4, 3, 3, 0, 18, 21, 14, 7, 14, 12, 8), ntotal = c(100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100)), .Names = c(subject, time, noccup, ntotal), class = data.frame, row.names = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)) I have 2 subject over 8 times. Each subject is compound by 100 pieces. I measure the number of piece occuped in any time. I try this: m1-lmer(noccup/ntotal~time+(time|subject),family=binomial,weights=ntotal) Error in lmer(noccup/ntotal ~ time + (time | subject), family = binomial, : object `weights' of incorrect type I dont understand why this error. m1-lmer(noccup/ntotal~1+(time|subject),family=binomial,weights=ntotal) Error in lmer(noccup/ntotal ~ time + (time | subject), family = binomial, : object `weights' of incorrect type Why weights is of incorrect type? In glm this is correct type. I don't get such an error using lme4 0.9975-10 under R-2.4.1 on i386-apple-darwin8.8.1 Of course the model fit is singular because you are attempting to estimate 3 variance-covariance components from 2 groups. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Andrew Robinson Department of Mathematics and StatisticsTel: +61-3-8344-9763 University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599 http://www.ms.unimelb.edu.au/~andrewpr http://blogs.mbs.edu/fishing-in-the-bay/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] independent text in panels
Hi Folks, The following gives me the exact plot I want using trellis. xyplot(V56 ~ V12 | subset(frequency, V17 12000) , data=kdata, ylab=Fine Structure Depth [dB], xlab=QSIN score, panel=function(x,y,...) { panel.xyplot(x[HL==N],y[HL==N], pch=16,col=green) panel.xyplot(x[HL==SN],y[HL==SN],pch=16, col=red) panel.lmline(x[HL==N],y[HL==N], col=green) panel.lmline(x[HL==SN],y[HL==SN], col=red) } ) Now I want to add to each panel by writing the result of cor(V56[HL==N],V12[HL==N], use=complete) cor(V56[HL==SN],V12[HL==SN], use=complete) in the specific colors. I have tried things such as: panel.text(4,20,round(cor(V56[HL==N],V12[HL==N], use=pairwise.complete),digits=3),col=green4) But this does not seem to compute correlation based on the grouping variable in each panel. I am not sure that I have given you adequate information. Any help will be much appreciated. Regards, Sumit [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] random uniform sample of points on an ellipsoid (e.g. WG
Hi, Sorry for being a late entrant to this thread, but let me see if I understand the problem. The poster wants to sample from an ellipsoid. Let us call this ellipsoid X'\Gamma X - d^2= 0. There is no loss in assuming that the center is zero, otherwise the same can be done. Let us consider the case when Gamma = I first. Then, let X \sim N_p(0, I) (any radially symmetric distribution will do here), then d*X/||X|| is uniform on the sphere of radius d. How about imitating the same? Let X \sim N_p(0, \Sigma), where \Sigma = \Gamma^{-1} then X restricted to X'\Gamma X = d^2 gives the required uniform density on the ellipsoid. How do we get this easily? I don't think rejection works or is even necessary. Isn't d*X / ||\Gamma^{1/2}X|| enough? Here \Gamma^{1/2} is the square root matrix of \Gamma. Note that any distribution of the kind f(X'\Gamma X) would work, but the multivariate Gaussian is a convenient tool, for which two-lines of R code should be enough. Many thanks and best wishes, Ranjan On Sat, 24 Feb 2007 13:45:56 - (GMT) (Ted Harding) [EMAIL PROTECTED] wrote: [Apologies if this is a repeated posting for you. Something seems to have gone amiss with my previous attempts to post this reply, as seen from my end] On 22-Feb-07 Roger Bivand wrote: On 21 Feb 2007, Russell Senior wrote: I am interested in making a random sample from a uniform distribution of points over the surface of the earth, using the WGS84 ellipsoid as a model for the earth. I know how to do this for a sphere, but would like to do better. I can supply random numbers, want latitude longitude pairs out. Can anyone point me at a solution? Thanks very much. http://www.csit.fsu.edu/~burkardt/f_src/random_data/random_data.html looks promising, untried. Hmmm ... That page didn't seem to be directly useful, since on my understanding of the code (and comments) listed under subroutine uniform_on_ellipsoid_map(dim_num, n, a, r, seed, x) UNIFORM_ON_ELLIPSOID_MAP maps uniform points onto an ellipsoid. in http://www.csit.fsu.edu/~burkardt/f_src/random_data/random_data.f90 it takes points uniformly distributed on a sphere and then linearly transforms these onto an ellipsoid. This will not give unform density over the surface of the ellipsoid: indeed the example graph they show of points on an ellipse generated in this way clearly appear to be more dense at the ends of the ellipse, and less dense on its sides. See: http://www.csit.fsu.edu/~burkardt/f_src/random_data/ uniform_on_ellipsoid_map.png [all one line] Indeed, if I understand their method correctly, in the case of a horizontal ellipse it is equivalent (modulo rotating the result) to distributing the points uniformly over a circle, and then stretching the circle sideways. This will preserve the vertical distribution (so at the two ends of the major axis it has the same density as on the circle) but diluting the horizontal distribution (so that at the two ends of the minor axis the density isless than on the circle). I did have a notion about this, but sat on it expecting that someone would come up with a slick solution -- which hasn't happened yet. For the application you have in hand, uniform distribution over a sphere is a fairly close approximation to uniform distriobution over the ellipspoid -- but not quite. But a rejection method, applied to points uniform on the sphere, can give you points uniform on the ellipsoid and, because of the close approximation of the sphere to the ellipsoid, you would not be rejecting many points. The outline strategy I had in mind (I haven't worked out details) is based on the following. Consider a point X0 on the sphere, at radial distance r0 from the centre of the sphere (same as the centre of the ellipsoid). Let the radius through that point meet the ellipsoid at a point X1, at radial distance R1. Let dS0 be an element of area at X0 on the sphere, which projects radially onto an element of area dS1 on the ellipsoid. You want all elements dS1 of equal size to be equally likely to receive a random point. Let the angle between the tangent plane to the ellipsoid at X1, and the tangent plane to the sphere at X0, be phi. The the ratio of areas dS1/dS0 is R(X0), say, where R(X0) = dS1/dS0 = r1^2/(r0^2 * cos(phi)) and the smaller this ratio, the less likely you want a point u.d. on the sphere to give rise to a point on the ellipsoid. Now define an acceptance probability P(X0) by P(X0) = R(X0)/sup[R(X)] taking the supremum over X on the sphere. Then sample points X0 unformly on the sphere, accepting each one with probability P(X0), and continue sampling until you have the number of points that you need. Maybe someone has a better idea ... (or code for the above!) Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email:
Re: [R] recovering collums of DF using a text var.list
Milton Cezar Ribeiro napsal(a): Hello people, I would like to know how can I use a list of variables (a char list) to have access to the collums from a dataframe to be used in some analysis like, just as example, a ploting task on a for() loop. Of course the code below is just to understand the way. In this example I have a dataframe with several collumns (more then fifty in my case) and I would like do use only some of them. I really need use a var.list! a-seq(1,100,1) b-c(rep(c(1,2,3,4,5),20)) c-rnorm(100,0,1) d-runif(100,0,1) e-c^2 f-c/d g-c-d df-data.frame(cbind(a,b,c,d,e,f,g)) var.list-c(c,f,g) for (myvar in var.list) { plot(density(df$myvar)) # here I need recover df$c , df$f and df$g } Kind regards Miltinho Brazil __ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Is this what you expect? for (myvar in var.list) { plot(density(df[[paste(myvar)]]), main=paste('Density of',myvar)) } It may not be bad idea to make a structure (list, dataframe) consisting of only those variables you want and use some variant of apply() instead of the for-loop. See ?apply, ?tapply, ?sapply Petr -- Petr Klasterecky Dept. of Probability and Statistics Charles University in Prague Czech Republic __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Woolf's test, Odds ratio, stratification
On Sat, 2007-02-24 at 10:30 -0800, francogrex wrote: Just a general question concerning the woolf test (package vcd), when we have stratified data (2x2 tables) and when the p.value of the woolf-test is below 0.05 then we assume that there is a heterogeneity and a common odds ratio cannot be computed? Does this mean that we have to try to add more stratification variables (stratify more) to make the woolf-test p.value insignificant? Also in the same package there is an oddsratio function that seems to be calculating the odds ratio even when the values are non-integers and even when one cell has a null value (in contrast to the mantel_hanszel or the fisher exact test that do not admit zero values or non-integers) does anyone know what's the difference and how to optain a p.value on that odds ratio? Thanks I'll defer comments on your 'vcd' function specific questions to the package authors. However, from a general perspective the Woolf Test, or for that matter, the Breslow-Day Test (+/- Tarone Correction), are tests that are akin to running tests for the homogeneity of variances prior to performing an ANOVA. You are testing to see if the underlying assumptions of the latter test have been violated. If the tests are significant, it is not an _absolute_ contraindication to continuing, but the result should give you pause to consider what else is going on in your data. These tests are commonly applied, for example, in multi-center clinical trials to test for the 'poolability' of the data from the sites into a common data set for analysis. In the case of the ANOVA, having markedly different variances suggests that the differences in the means may not be the most 'interesting' of findings in your data. Similarly, with stratified 2x2 tables, you can perform a CMH test using mantelhaen.test() and you will get a common odds ratio and CI's. However, the common odds ratio may be at best, less interesting than other underlying characteristics and at worst perhaps, misleading. For an example, let's use the UCBAdmissions dataset and some of the examples in ?mantelhaen.test. For specific information on the dataset, see ?UCBAdmissions. In fact, reading the Details on that help page and running the examples, especially the series of 6 mosaicplot()s, should provide additional insight into the findings I present below. So, let's start: First, let's charge ahead and run the CMH test on the data: mantelhaen.test(UCBAdmissions) Mantel-Haenszel chi-squared test with continuity correction data: UCBAdmissions Mantel-Haenszel X-squared = 1.4269, df = 1, p-value = 0.2323 alternative hypothesis: true common odds ratio is not equal to 1 95 percent confidence interval: 0.7719074 1.0603298 sample estimates: common odds ratio 0.9046968 So, we can see that the resultant test statistic is not significant, yet we do get a common OR and CI's. Now, let's do the Woolf test, using the code provided in ?mantelhaen.test: woolf(UCBAdmissions) [1] 0.003427200 This suggests that the assumption of common odds ratios in the CMH test is violated and indeed if we run: apply(UCBAdmissions, 3, function(x) (x[1,1]*x[2,2])/(x[1,2]*x[2,1])) A B C D E F 0.3492120 0.8025007 1.1330596 0.9212838 1.2216312 0.8278727 We can see that the OR's range from 0.3492120 to 1.2216312. So, let's take a look at the 'raw' data and do some further testing. A while ago, I wrote (and believe posted) code to take a data.frame.table object and convert it back to it's raw structure as a non-summarized data frame. Let's do that with UCBAdmissions, since UCBAdmissions is a summarized 3D table: str(UCBAdmissions) table [1:2, 1:2, 1:6] 512 313 89 19 353 207 17 8 120 205 ... - attr(*, dimnames)=List of 3 ..$ Admit : chr [1:2] Admitted Rejected ..$ Gender: chr [1:2] Male Female ..$ Dept : chr [1:6] A B C D ... Here is the code: expand.dft - function(x, na.strings = NA, as.is = FALSE, dec = .) { DF - sapply(1:nrow(x), function(i) x[rep(i, each = x$Freq[i]), ], simplify = FALSE) DF - subset(do.call(rbind, DF), select = -Freq) for (i in 1:ncol(DF)) { DF[[i]] - type.convert(as.character(DF[[i]]), na.strings = na.strings, as.is = as.is, dec = dec) } DF } So: DF - expand.dft(as.data.frame.table(UCBAdmissions)) str(DF) 'data.frame': 4526 obs. of 3 variables: $ Admit : Factor w/ 2 levels Admitted,Rejected: 1 1 1 1 1 1 1 1 1 1 ... $ Gender: Factor w/ 2 levels Female,Male: 2 2 2 2 2 2 2 2 2 2 ... $ Dept : Factor w/ 6 levels A,B,C,D,..: 1 1 1 1 1 1 1 1 1 1 ... Now, let's create logistic regression models on the raw data, first using just the two covariates of Gender and Dept: fit - glm(Admit ~ Gender + Dept, family = binomial, data = DF) summary(fit) Call: glm(formula = Admit ~ Gender + Dept, family =
Re: [R] random uniform sample of points on an ellipsoid (e.g. WG
On 24-Feb-07 Ranjan Maitra wrote: Hi, Sorry for being a late entrant to this thread, but let me see if I understand the problem. The poster wants to sample from an ellipsoid. Let us call this ellipsoid X'\Gamma X - d^2= 0. There is no loss in assuming that the center is zero, otherwise the same can be done. Let us consider the case when Gamma = I first. Then, let X \sim N_p(0, I) (any radially symmetric distribution will do here), then d*X/||X|| is uniform on the sphere of radius d. How about imitating the same? Let X \sim N_p(0, \Sigma), where \Sigma = \Gamma^{-1} then X restricted to X'\Gamma X = d^2 gives the required uniform density on the ellipsoid. How do we get this easily? I don't think rejection works or is even necessary. Isn't d*X / ||\Gamma^{1/2}X|| enough? Here \Gamma^{1/2} is the square root matrix of \Gamma. Note that any distribution of the kind f(X'\Gamma X) would work, but the multivariate Gaussian is a convenient tool, for which two-lines of R code should be enough. Many thanks and best wishes, Ranjan As I understand Rusell Senior's original query, he wants to generate a uniform distribution on the surface of the ellipsoid, not over its interior: I am interested in making a random sample from a uniform distribution of points over the surface of the earth, using the WGS84 ellipsoid as a model for the earth. Your solution, and the solution given in Roger Bivand's reference Section UNIFORM_IN_ELLIPSOID_MAP maps uniform points into an ellipsoid in: http://www.csit.fsu.edu/~burkardt/f_src/random_data/random_data.f90 is valid for uniform points in the interior of an ellipsoid, I think. But, since it is a linear transformation, it is not valid for the points on the surface, as I explain for the case of an ellipse (in particular, it results in higher linear density along the circumference of an ellipse near the ends of the major axis than near the ends of the minor axis). It is also the method adopted for points on the surface in the Section UNIFORM_ON_ELLIPSOID_MAP maps uniform points onto an ellipsoid, and I think this is wrong, as I explained. Ted. On Sat, 24 Feb 2007 13:45:56 - (GMT) (Ted Harding) [EMAIL PROTECTED] wrote: [Apologies if this is a repeated posting for you. Something seems to have gone amiss with my previous attempts to post this reply, as seen from my end] On 22-Feb-07 Roger Bivand wrote: On 21 Feb 2007, Russell Senior wrote: I am interested in making a random sample from a uniform distribution of points over the surface of the earth, using the WGS84 ellipsoid as a model for the earth. I know how to do this for a sphere, but would like to do better. I can supply random numbers, want latitude longitude pairs out. Can anyone point me at a solution? Thanks very much. http://www.csit.fsu.edu/~burkardt/f_src/random_data/random_data.html looks promising, untried. Hmmm ... That page didn't seem to be directly useful, since on my understanding of the code (and comments) listed under subroutine uniform_on_ellipsoid_map(dim_num, n, a, r, seed, x) UNIFORM_ON_ELLIPSOID_MAP maps uniform points onto an ellipsoid. in http://www.csit.fsu.edu/~burkardt/f_src/random_data/random_data.f90 it takes points uniformly distributed on a sphere and then linearly transforms these onto an ellipsoid. This will not give unform density over the surface of the ellipsoid: indeed the example graph they show of points on an ellipse generated in this way clearly appear to be more dense at the ends of the ellipse, and less dense on its sides. See: http://www.csit.fsu.edu/~burkardt/f_src/random_data/ uniform_on_ellipsoid_map.png [all one line] Indeed, if I understand their method correctly, in the case of a horizontal ellipse it is equivalent (modulo rotating the result) to distributing the points uniformly over a circle, and then stretching the circle sideways. This will preserve the vertical distribution (so at the two ends of the major axis it has the same density as on the circle) but diluting the horizontal distribution (so that at the two ends of the minor axis the density isless than on the circle). I did have a notion about this, but sat on it expecting that someone would come up with a slick solution -- which hasn't happened yet. For the application you have in hand, uniform distribution over a sphere is a fairly close approximation to uniform distriobution over the ellipspoid -- but not quite. But a rejection method, applied to points uniform on the sphere, can give you points uniform on the ellipsoid and, because of the close approximation of the sphere to the ellipsoid, you would not be rejecting many points. The outline strategy I had in mind (I haven't worked out details) is based on the following. Consider a point X0 on the sphere, at radial distance r0 from the centre of the sphere (same as the centre of the ellipsoid).
Re: [R] random uniform sample of points on an ellipsoid (e.g. WG
My method is for the surface, not for the interior. The constraint d*X/|| \Gamma^{-1/2}X ||ensures the constraint, no? The uniformity is ensured by the density restricted to satisfy the constraint which makes it a constant. Ranjan On Sat, 24 Feb 2007 22:49:25 - (GMT) (Ted Harding) [EMAIL PROTECTED] wrote: On 24-Feb-07 Ranjan Maitra wrote: Hi, Sorry for being a late entrant to this thread, but let me see if I understand the problem. The poster wants to sample from an ellipsoid. Let us call this ellipsoid X'\Gamma X - d^2= 0. There is no loss in assuming that the center is zero, otherwise the same can be done. Let us consider the case when Gamma = I first. Then, let X \sim N_p(0, I) (any radially symmetric distribution will do here), then d*X/||X|| is uniform on the sphere of radius d. How about imitating the same? Let X \sim N_p(0, \Sigma), where \Sigma = \Gamma^{-1} then X restricted to X'\Gamma X = d^2 gives the required uniform density on the ellipsoid. How do we get this easily? I don't think rejection works or is even necessary. Isn't d*X / ||\Gamma^{1/2}X|| enough? Here \Gamma^{1/2} is the square root matrix of \Gamma. Note that any distribution of the kind f(X'\Gamma X) would work, but the multivariate Gaussian is a convenient tool, for which two-lines of R code should be enough. Many thanks and best wishes, Ranjan As I understand Rusell Senior's original query, he wants to generate a uniform distribution on the surface of the ellipsoid, not over its interior: I am interested in making a random sample from a uniform distribution of points over the surface of the earth, using the WGS84 ellipsoid as a model for the earth. Your solution, and the solution given in Roger Bivand's reference Section UNIFORM_IN_ELLIPSOID_MAP maps uniform points into an ellipsoid in: http://www.csit.fsu.edu/~burkardt/f_src/random_data/random_data.f90 is valid for uniform points in the interior of an ellipsoid, I think. But, since it is a linear transformation, it is not valid for the points on the surface, as I explain for the case of an ellipse (in particular, it results in higher linear density along the circumference of an ellipse near the ends of the major axis than near the ends of the minor axis). It is also the method adopted for points on the surface in the Section UNIFORM_ON_ELLIPSOID_MAP maps uniform points onto an ellipsoid, and I think this is wrong, as I explained. Ted. On Sat, 24 Feb 2007 13:45:56 - (GMT) (Ted Harding) [EMAIL PROTECTED] wrote: [Apologies if this is a repeated posting for you. Something seems to have gone amiss with my previous attempts to post this reply, as seen from my end] On 22-Feb-07 Roger Bivand wrote: On 21 Feb 2007, Russell Senior wrote: I am interested in making a random sample from a uniform distribution of points over the surface of the earth, using the WGS84 ellipsoid as a model for the earth. I know how to do this for a sphere, but would like to do better. I can supply random numbers, want latitude longitude pairs out. Can anyone point me at a solution? Thanks very much. http://www.csit.fsu.edu/~burkardt/f_src/random_data/random_data.html looks promising, untried. Hmmm ... That page didn't seem to be directly useful, since on my understanding of the code (and comments) listed under subroutine uniform_on_ellipsoid_map(dim_num, n, a, r, seed, x) UNIFORM_ON_ELLIPSOID_MAP maps uniform points onto an ellipsoid. in http://www.csit.fsu.edu/~burkardt/f_src/random_data/random_data.f90 it takes points uniformly distributed on a sphere and then linearly transforms these onto an ellipsoid. This will not give unform density over the surface of the ellipsoid: indeed the example graph they show of points on an ellipse generated in this way clearly appear to be more dense at the ends of the ellipse, and less dense on its sides. See: http://www.csit.fsu.edu/~burkardt/f_src/random_data/ uniform_on_ellipsoid_map.png [all one line] Indeed, if I understand their method correctly, in the case of a horizontal ellipse it is equivalent (modulo rotating the result) to distributing the points uniformly over a circle, and then stretching the circle sideways. This will preserve the vertical distribution (so at the two ends of the major axis it has the same density as on the circle) but diluting the horizontal distribution (so that at the two ends of the minor axis the density isless than on the circle). I did have a notion about this, but sat on it expecting that someone would come up with a slick solution -- which hasn't happened yet. For the application you have in hand, uniform distribution over a sphere is a fairly close approximation to uniform distriobution over the ellipspoid -- but not quite. But a