Re: [R] ramdom sampling from a dataset
Try this: x - c(2,5,9,4,5,6,7,8) u - replicate(20, sample(x, replace = TRUE)) t(u) The first argument of replicate() is the number of times to iterate the process. I believe you'll find that it does indeed do random sampling; each row represents a nonparametric bootstrap sample of x. There are other ways of doing this; one is v - matrix(sample(x, 160, replace = TRUE), ncol = 8, byrow = TRUE) HTH, Dennis On Tue, Sep 28, 2010 at 2:03 PM, Michael Larkin mlar...@rsmas.miami.eduwrote: I am trying to get R to pick random integers from my dataset (i.e. bootstrapping) with replacement. However, my attempts at this have been unsuccessful. Here is a basic example of what I am doing: I have a data vector of 8 integers (data= 2,5,9,4,5,6,7,8). I used the sample function and it worked but it only repeated my values in the exact same order. It did not randomly sample them. Here is my code: sample(data, replace=TRUE) Any advice would be greatly appreciated. Mike [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] boxplot
Dear Felipe, Assuming you are not interested in the exact formulae that are used to calculate each of these, from top to bottom it is: minimum, lower quartile, median, upper quartile, maximum If there are dots, these usually indicate outliers. Please read ?boxplot for more details on what R will do when. If you are interested in the specifics of the computations, read ?boxplot.stats Hope that helps, Josh See inline comments to match these things to your terms. On Tue, Sep 28, 2010 at 10:06 PM, Luis Felipe Parra felipe.pa...@quantil.com.co wrote: Hello, does somebody know in a boxplot, what does each element in the boxplot represent? 1. lines at the extremes of the dotted lines? min and max (unless there are outliers AND you have not set range = 0) 2. Extremes of the boxes lower upper quartile 3. Black line in the middle of the box? median 4. notches? Thank you Felipe Parra [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] boxplot
http://www.netmba.com/statistics/plot/box/ ...found via Google search. HTH, Dennis On Tue, Sep 28, 2010 at 10:06 PM, Luis Felipe Parra felipe.pa...@quantil.com.co wrote: Hello, does somebody know in a boxplot, what does each element in the boxplot represent? 1. lines at the extremes of the dotted lines? 2. Extremes of the boxes 3. Black line in the middle of the box? 4. notches? Thank you Felipe Parra [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] A problem with plotting a long expression in ylab ?
My honor. A short question: if there is something in the device that is sensitive to the overlapping of the text, then is it possible to add a warning massage output when the length of the text is longer then the device dimensions? With much respect, Tal Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- On Wed, Sep 29, 2010 at 1:24 AM, Paul Murrell p.murr...@auckland.ac.nzwrote: Hi It is a bug. A fix has been committed. Thanks for the report! Paul On 29/09/2010 10:15 a.m., Ben Bolker wrote: Barry Rowlingsonb.rowlingsonat lancaster.ac.uk writes: My point is that in regular text, ylab plots it where it then goes outside the borders. With the use of expressions - the text just doesn't show up. Originally I thought it was because of my miss-use of expressions, until I figured it was the level of cex.lab I was using. The problem is that when you can't see the text, you don't have a sense of how much to decrease the cex.lab so the text will fit. I hope I was now clearer. Gotcha. Seems to only affect ylab though. Do this: t = expression(paste(test loo(% of 360 *degree, ))) plot(1,xlab=t,ylab=t,main=t) then if I shrink my graphics window I can make the ylab disappear but not the xlab or title. Seems to affect any rotated expressions: plot(1) text(1,1,t,srt=90) text(1,1,t,srt=0) text(1,1,t,srt=45) Now shrink window and watch the rotated expressions vanish! They disappear when they start (or finish) out of the entire graphics device, not the plot region... I cant find anything relating to clipping in the help, and I am on Linux, so see if there's any news about it, try it with R-patched or R-devel and then report a bug after having read all the other stuff about R bug reporting! Barry I don't claim to understand it, but there is something quite fundamental about the properties of the X11() graphics device in R that makes labels that would otherwise overlap, disappear -- if you do 'extreme resizing' with the graphics above, you can see that otherwise-overlapping x- and y-axis tick labels disappear as the graph gets scrunched. This is (apparently) true of X11 graphics on MacOS as well -- Quartz window has a different behavior. Trying with pdf() as well -- for height=2, width=2, only 1 y-axis and 2 x-axis tick labels survive, *but* the x and y labels and the title are all still present (but clipped, of course). [Hmmm. Take my reports above with a grain of salt, I wasn't always using expression()s.] So I would guess that if you reported this as a bug you would be told that it was a poorly documented property of R's X11 graphics model, rather than a bug ... I have no idea where to start looking for more information about what defines this behavior -- if I were desperate to know I would probably try asking Paul Murrell ... I would be very interested to see this discussed on r-devel, if anyone bit ... Ben Bolker __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dr Paul Murrell Department of Statistics The University of Auckland Private Bag 92019 Auckland New Zealand 64 9 3737599 x85392 p...@stat.auckland.ac.nz http://www.stat.auckland.ac.nz/~paul/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] get response from a glm
Hello, The predict.glm is your friend. Type ?predict.glm to see the help page. Michael On 29 September 2010 12:45, xinxin xx xxgr...@hotmail.com wrote: Hi everyone: I am new to R and I have a really basic question. I have already got a generalized linear model from some dataset, say y=b0 + b1X1 + b2X2. Then I want to get the value of y provided, say, X1=1, X2=2. And the confidence Intervals of this y. I know I can just calculate that since I know the model already. But is there some code that can calculate those automatically? Thank you very much! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] String split and concatenation
x - rep(letters[1:3], 2) Are there any ways to transform assign the above as the one shown below to an object? (in exact format; i.e length of 1 class of character), i.e x ('a', 'b', 'c', 'a', 'b', 'c') Highly appreciate for any advice. On Wed, Sep 29, 2010 at 3:33 PM, bill.venab...@csiro.au wrote: dump(x, file = x.R) file.show(x.R) will get you most of the way. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Steven Kang Sent: Wednesday, 29 September 2010 3:11 PM To: r-help@r-project.org Subject: [R] String split and concatenation Hi R users, I desire to transform the following vector consisting of repeated characters x - rep(letters, 3) into this exact format (i.e a single string containing each characters in quotation mark separated by comma between each; al ). (a, b, c, d, a, b, c, d, ..., a, b, c, d, .z) Any advice would be much appreciated. -- Steven [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Steven [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] name ONLY one column // empty.dimnames() in 'sfsmisc'
IC == Ivan Calandra ivan.calan...@uni-hamburg.de on Mon, 27 Sep 2010 16:59:31 +0200 writes: IC Hi, IC I'm not sure it's even possible (and if it is I don't know how, but I'm IC no expert). yes it is possible (in some way), see below. IC But I think it doesn't make much sense to have only one named column. IC Just give it a vector: IC vect_names - c(myname1, myname2, myname3) IC colnames(my_matrix) - vect_names yes, exactly, ... but read on IC HTH, IC Ivan IC Le 9/27/2010 10:26, Lorenzo Cattarino a écrit : Hi R-users I can not change the name of one column only of my matrix. my_matrix - matrix (1:12,ncol=3) colnames(my_matrix)[1] - 'myname' Error in dimnames(x)- dn : length of 'dimnames' [2] not equal to array extent Just read ... and maybe re-read this error message. It is exactly on target: The colnames are the dimnames(.)[2] and they have the wrong length. { I'm too often disappointed that many R users read only the first, or actually the zero-th word of the 'very fine' error messages that we provide you with (most of the time): Such useRs only read up to Error .. and are put off, saying It does not work or Why can't I ... or } Now back to the original question: What you want is to give empty column names to all but the first column. mat - matrix(1:12, 3) colnames(mat) - c(myname, , ) And a note: giving empty column- and row-names is sometimes desirable if just for printing. For that reason, our 'sfsmisc' CRAN package has contained the function empty.dimnames() for many years now (actually longer than R exists; we had it as an S+ function in our goodies collection). Excerpt from its help page: empty.dimnames package:sfsmisc R Documentation Empty Dimnames of an Array. Description: `Remove' all dimension names from an array for compact printing. Usage: empty.dimnames(a) Arguments: a: an ‘array’, especially a matrix. Value: Returns ‘a’ with its dimnames replaced by empty character strings. Author(s): Bill Venables / Martin Maechler, Sept 1993. and here the examples we provide: example(empty.dimnames) empty. empty.dimnames(diag(5)) # looks much nicer 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 empty. (a - matrix(-9:10, 4,5)) [,1] [,2] [,3] [,4] [,5] [1,] -9 -5 -137 [2,] -8 -4048 [3,] -7 -3159 [4,] -6 -226 10 empty. empty.dimnames(a) # nicer, right? -9 -5 -1 3 7 -8 -4 0 4 8 -7 -3 1 5 9 -6 -2 2 6 10 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Validation / Training - test data
Dear List, I have developed two models i want to use to predict a response, one with a binary response and one with a ordinal response. My original plan was to divide the data into test (300 entries) and training (1000 entries) and check the power of the model by looking at the % correct predictions. However i have been told my a colleague that 1300 entries is far too little to partition the data set and i should use the whole data set, and determine the power of the model with scores such as c-value and Brier score and use bootstrapping. I understand how to bootstrap in R however i have never used it on predicted values. My questions are - 1. Using the boot() command how do i use this to test the power of my predictive model? 2. Is it possible to bootstrap brier score or is this not necessary? 3. ( This is a separate point i am struggling with, i thought i would include it here instead of posting again!) I have selected the most likely model with AIC criteria from a set of candidate GLMM models, however as GLMM has no predict function i have used the best model and excluded the random effects and ran it as a glm and used the predict function from here - is this OK? Thanks Sam __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] drop.terms and [.terms ignores intercept
Dear R-help list The functions drop.terms and [.terms ignores if the intercept has been explicitly removed. Is that a deliberate feature? For instance, drop.terms(terms(~a+b-1),1) terms(~a+b-1)[2] R version 2.12.0 Under development (unstable) (2010-09-13 r52905) Best, Niels -- Niels Richard Hansen Web: www.math.ku.dk/~richard Associate Professor Email: niels.r.han...@math.ku.dk Department of Mathematical Sciences nielsrichardhan...@gmail.com University of Copenhagen Skype: nielsrichardhansen.dk Universitetsparken 5 Phone: +45 353 20783 (office) 2100 Copenhagen Ø +45 2859 0765 (mobile) Denmark __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] drop.terms and [.terms ignores intercept
On Sep 29, 2010, at 10:29 , Niels Richard Hansen wrote: The functions drop.terms and [.terms ignores if the intercept has been explicitly removed. Is that a deliberate feature? Perhaps rather an unimplemented one. The root cause is that both functions use reformulate() on the term.labels attribute, and there is no way to specify that you want to reformulate into a no-intercept formula. On the other hand, the modeling code will happily proceed with a no-intercept model even if there is no -1 in formula part of a terms object, e.g. x - terms(y~a+b) attr(x,intercept) - 0 lm(x) Call: lm(formula = x) Coefficients: a b 0.2263 0.4178 formula(x) y ~ a + b so I suppose that there is no really good excuse not to carry the intercept attribute over. As usual, with code as old as this, there is always the risk that something actually relies on current behavior. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] constrained optimization -which package?
Leonardo Monasterio leonardo.monasterio at gmail.com writes: Dear R users, In the function bellow I want to find the maximum value of v, subject to the constrain that the sum of x is equal to 1. I want to maximize: v-t(x)%*%distance%*%x Subject to: sum(x)=1 I do not see why you would take the trouble to use constraint optimization here. Use x_n as a dependent variable, x_n - 1 - sum(x), x an (n-1)-dim. vector, define another function f1(x) - f(c(x, 1-sum(x))) appropriately, and apply optim() without constraints. Hans Werner Where: x is a vector n X 1 distance is a matrix n*n and it is given. (In practice, the number of n can go up to hundreds.) I have a bit of experience using R but not much on optimization techniques or on R optimization packages. I have taken a look at optim and nlminb, but I am quite confused. Do you have any suggestion on how to do it? Thank you very much, Leo Monasterio. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Use R in Visual Basic Environment
Hi Soumen, it depends what you mean by embedded what i have done once is to run a R script in batch mode from an access application, the R code import an access table, with the RODBC package, produce some table and graph which are compiled on the fly in html output, using HWRITER or R2HTML package and i can open this html output from an access form, or it open automatically kind regards Christophe On Tue, Sep 28, 2010 at 12:39 PM, Soumen Pal soumen.4...@gmail.com wrote: I need your kind help regarding the following: I wish to know is there any way to use R in Visual Basic environment. I want to develop a VB application where R can be embedded (R will work as a back end statistical engine). If available, please provide me some source of study materials/articles available on the internet related to this. -- Thanks Regards, Soumen Pal [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] drop.terms and [.terms ignores intercept
On 29/09/10 11.13, peter dalgaard wrote: On Sep 29, 2010, at 10:29 , Niels Richard Hansen wrote: The functions drop.terms and [.terms ignores if the intercept has been explicitly removed. Is that a deliberate feature? Perhaps rather an unimplemented one. The root cause is that both functions use reformulate() on the term.labels attribute, and there is no way to specify that you want to reformulate into a no-intercept formula. On the other hand, the modeling code will happily proceed with a no-intercept model even if there is no -1 in formula part of a terms object, e.g. x- terms(y~a+b) attr(x,intercept)- 0 lm(x) Call: lm(formula = x) Coefficients: a b 0.2263 0.4178 formula(x) y ~ a + b so I suppose that there is no really good excuse not to carry the intercept attribute over. As usual, with code as old as this, there is always the risk that something actually relies on current behavior. Thanks Peter. I expected something like that. As I see it, the intercept (or more often, removal of the intercept) is treated differently from other terms, and information on the presence of an intercept is stored in an attribute of the terms object rather than the formula. This attribute is not copied when using drop.terms or [.terms. I believe it should be, but as you say Peter, some may rely on the way the functions behave now. So basically, you should use code like x - terms(y~a+b-1) z - x[2] attr(z, intercept) - attr(x, intercept) to make sure that the intercept information is preserved. - Niels -- Niels Richard Hansen Web: www.math.ku.dk/~richard Associate Professor Email: niels.r.han...@math.ku.dk Department of Mathematical Sciences nielsrichardhan...@gmail.com University of Copenhagen Skype: nielsrichardhansen.dk Universitetsparken 5 Phone: +45 353 20783 (office) 2100 Copenhagen Ø +45 2859 0765 (mobile) Denmark __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to rotate the x axis lable to an interested angle?
http://r.789695.n4.nabble.com/file/n2718591/QQ%E6%88%AA%E5%9B%BE%E6%9C%AA%E5%91%BD%E5%90%8D--1.png I am a beginner for R,and i could not solve this problem.who can help me to solve it ? how to rotate the x axis lable to an interested angle as the picture show? -- View this message in context: http://r.789695.n4.nabble.com/how-to-rotate-the-x-axis-lable-to-an-interested-angle-tp2718591p2718591.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] 3D Graphics
Dear All, I have made a scatter plot and placed a plane within it using scatterplot3d. However, I have been asked for the data points to be a surface plot or have the plane more closely resemble the data rather than show trends. I have since tried to use the rgl package. Why doesn't this package use the window which already contains graphics? I have one axis with decreasing values and two others with increasing values, hence I get an error message when I've tried using pers3d. Are there anyother ways in which I can create a surface plot? Thank you, Jo -- View this message in context: http://r.789695.n4.nabble.com/3D-Graphics-tp2718658p2718658.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splitting data in to multiple boxplots
Ok, I don't think I was specific enough. The data originally came in this form 1 a 12 2 b 4 3 a 3 4 c 54 5 a 12 6 b 11 7 c 9 8 c 2 . . . . . . . . . Where I sorted by the second column (NB the second column is the categories and they have long names). I would then like separate boxplots for each category. The loop idea would be nice but unfortunately I do not fully understand the answer given. Thanks. -- View this message in context: http://r.789695.n4.nabble.com/Splitting-data-in-to-multiple-boxplots-tp2717491p2718659.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 95% confidence intercal with glm
I am looking to do the same but am a bit confused and apply the inverse link function for your model. i understand up to this point and i understand what this means, however i am unsure why it needs to be done and how you do it - i.e i use family=binomial is this wrong if i use this method to calculate 95% CI? Thanks Sam On 28 Sep 2010, at 14:50, Ben Bolker wrote: zozio32 remy.pascal at gmail.com writes: Hi I had to use a glm instead of my basic lm on some data due to unconstant variance. now, when I plot the model over the data, how can I easily get the 95% confidence interval that sormally coming from: yv - predict(modelVar,list(aveLength=xv),int=c) matlines(xv,yv,lty=c(1,2,2)) There is no interval argument to pass to the predict function when using a glm, so I was wondering if I had to use an other function You need to use predict with se=TRUE; construct the confidence intervals by computing predicted values +- 1.96 times the standard error returned; and apply the inverse link function for your model. If heteroscedasticity is your main problem, and not a specific (known) non-normal distribution, you might consider using the gls function from the nlme package with an appropriate 'weights' argument. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regular expressions: offsets of groups
Bill, Michael, good to see I'm not the only one who sees potential for improvements in the regexpr domain. Adding a subpattern argument is certainly a step in the right direction and would make my life much easier. However, in my application I need to know not only the position of one group but also the position of the overall match in the original string. The ideal solution would provide positions and match lengths for the whole pattern and for all groups if desired. Only this would solve all related issues. One possibility is to have a subpattern argument that accepts a vector of numbers (0 refers to the whole pattern): gregexpr(a+(b+), abcdaabbc, subpattern=c(0,1)) [[1]]: [[1]][[1]]: [1] 1 5 attr(, match.length): [1] 2 4 [[1]][[2]]: [1] 2 7 attr(, match.length): [1] 1 2 A weakness of this solution is that the structure of the return values changes if length(subpattern)1. An alternative is to have a separate function, say ggregepxr for group gregexpr, that returns a list of lists as in the above example. This function would always return positions and match lengths of the whole pattern (group 0) and all groups. The original gregexpr could still have the subpattern argument but it would only accept single numbers. This way the return format of gregexpr remains the same. Best, Titus On Wed, Sep 29, 2010 at 2:42 AM, Michael Bedward michael.bedw...@gmail.com wrote: Ah, that's interesting - thanks Bill. That's certainly on the right track for me (Titus, you too ?) especially if the subpattern argument accepted a vector of multiple group indices. As you say, this is straightforward in C. I'd be happy to (try to) make a patch for the R sources if there was some consensus on the best way to implement it, ie. as a new R function or by extending existing function(s). __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] svg plot and dashed lines
Dear users, When I boxplot(), the lines of the whiskers are dashed. However, when I save in an svg file, the dashed lines of the whiskers are not dashed anymore. How can I have the dashed lines in the svg file? I don't have this problem with a ps file, but I cannot edit such file as easily as an svg file. That's why I'd like to stick to the svg format. Thanks in advance, Ivan df - structure(list(a = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c(A, B), class = factor), b = c(0.904439748839731, -0.855322875817714, -0.957288625102814, 0.130401502975395, -1.27765131101282, -2.08861064654457, 1.10234256081394, -2.05533035069656, -1.04529859053820, -0.0847903566670016, 1.02553030160793, 0.321170740199536, 1.87419854190502, -0.891404432182873, 0.968745913802415, -0.85229752730528, 0.641555656821046, 1.72455661053506, -0.523097596614304, 1.26729031187194)), .Names = c(a, b), row.names = c(NA, -20L), class = data.frame) library(RSvgDevice) devSVG(file=test.svg) boxplot(b~a, data=df) dev.off() -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splitting data in to multiple boxplots
Hi, Something like bb = data.frame(label=c(a,b,a,b,c,a,b,c),val=c(4,2,1,6,4,3,2, 1)) l = split(bb,bb$label) par(mfrow=c(2,2)) lapply(l,function(a) {boxplot(a$val)}) might be what you are looking for Martyn -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of deadlyspider Sent: 29 September 2010 11:02 To: r-help@r-project.org Subject: Re: [R] Splitting data in to multiple boxplots Ok, I don't think I was specific enough. The data originally came in this form 1 a 12 2 b 4 3 a 3 4 c 54 5 a 12 6 b 11 7 c 9 8 c 2 . . . . . . . . . Where I sorted by the second column (NB the second column is the categories and they have long names). I would then like separate boxplots for each category. The loop idea would be nice but unfortunately I do not fully understand the answer given. Thanks. -- View this message in context: http://r.789695.n4.nabble.com/Splitting-data-in-to-multiple-boxplots-tp2 717491p2718659.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This e-mail has been scanned for all viruses by Star.\ _...{{dropped:12}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problems extracting p-value from summary(aov(...))
Hi, I did a aov and used summary to obtain the p-value. I tried many ways to extract the p-value from the summary result but failed. Among others I tried the following: test.summary - summary(aov(data[,1]~time.points+Error(subject/time.points))) test.summary Error: subject Df Sum Sq Mean Sq F value Pr(F) Residuals 9 0.27467 0.030518 Error: subject:time.points Df Sum Sq Mean Sq F value Pr(F) time.points 2 0.018563 0.0092814 3.1777 0.06578 . Residuals 18 0.052574 0.0029208 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 as.matrix(test.summary[[1]][,5]) Error in `[.default`(test.summary[[1]], , 5) : incorrect number of dimensions test.summary$Error: Within[[1]]$Pr(F) NULL test.summary[[2]][,5] Error in `[.default`(test.summary[[2]], , 5) : incorrect number of dimensions Any advise? Cheers -- View this message in context: http://r.789695.n4.nabble.com/Problems-extracting-p-value-from-summary-aov-tp2718726p2718726.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] matrix plot
Hi, Fairly new to R - have done basic plots but now faced with plotting a matrix/table of results -I know what I want but cannot find out how to do it. Basically have individual questions ( x) to which an organization can rate themselves 1-10 (y) what I want to show is a matrix/density type plot (like the matrix corrolation plots I have seen on R graph site) showing for eac question (x) a circle or shape of varying size and/or colour to represent the number of organisations rating themselves for each value of y. Hope this makes sense, any advice would be gratefully received. HH __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Adding two data.frames.
Hi I have two data frames that contains the same sort of data and I want do add them together to get one big data frame, anyone know how to do that? ex: data.frame1 A B C 1 2 3 4 5 6 7 8 9 and data.frame2 A B C 9 8 7 6 5 4 3 2 1 Would then become one big set: data.frame3 A B C 1 2 3 4 5 6 7 8 9 9 8 7 6 5 4 3 2 1 Thx for your help //Joel -- View this message in context: http://r.789695.n4.nabble.com/Adding-two-data-frames-tp2718771p2718771.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adding two data.frames.
Hi, Take a look at rbind() HTH, Ivan Le 9/29/2010 13:24, Joel a écrit : Hi I have two data frames that contains the same sort of data and I want do add them together to get one big data frame, anyone know how to do that? ex: data.frame1 A B C 1 2 3 4 5 6 7 8 9 and data.frame2 A B C 9 8 7 6 5 4 3 2 1 Would then become one big set: data.frame3 A B C 1 2 3 4 5 6 7 8 9 9 8 7 6 5 4 3 2 1 Thx for your help //Joel -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calan...@uni-hamburg.de ** http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regular expressions: offsets of groups
I'd definitely be a customer for it Titus. And it does seem like an obvious hole in regex processing in R that cries out to be filled. Um, ggregexpr isn't the sexiest of function names :) Perhaps we can think of something a little easier ? How is your C coding ? Bill ? Anyone else ? I could have a got at writing some prototype code to test in the next few days, though if someone else with decent C skills is itching to do it please speak up. Michael On 29 September 2010 20:08, Titus von der Malsburg malsb...@gmail.com wrote: Bill, Michael, good to see I'm not the only one who sees potential for improvements in the regexpr domain. Adding a subpattern argument is certainly a step in the right direction and would make my life much easier. However, in my application I need to know not only the position of one group but also the position of the overall match in the original string. The ideal solution would provide positions and match lengths for the whole pattern and for all groups if desired. Only this would solve all related issues. One possibility is to have a subpattern argument that accepts a vector of numbers (0 refers to the whole pattern): gregexpr(a+(b+), abcdaabbc, subpattern=c(0,1)) [[1]]: [[1]][[1]]: [1] 1 5 attr(, match.length): [1] 2 4 [[1]][[2]]: [1] 2 7 attr(, match.length): [1] 1 2 A weakness of this solution is that the structure of the return values changes if length(subpattern)1. An alternative is to have a separate function, say ggregepxr for group gregexpr, that returns a list of lists as in the above example. This function would always return positions and match lengths of the whole pattern (group 0) and all groups. The original gregexpr could still have the subpattern argument but it would only accept single numbers. This way the return format of gregexpr remains the same. Best, Titus On Wed, Sep 29, 2010 at 2:42 AM, Michael Bedward michael.bedw...@gmail.com wrote: Ah, that's interesting - thanks Bill. That's certainly on the right track for me (Titus, you too ?) especially if the subpattern argument accepted a vector of multiple group indices. As you say, this is straightforward in C. I'd be happy to (try to) make a patch for the R sources if there was some consensus on the best way to implement it, ie. as a new R function or by extending existing function(s). __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regular expressions: offsets of groups
On Wed, Sep 29, 2010 at 1:58 PM, Michael Bedward michael.bedw...@gmail.com wrote: How is your C coding ? Bill ? Anyone else ? I could have a got at writing some prototype code to test in the next few days, though if someone else with decent C skills is itching to do it please speak up. We have a skilled C- and R-programmer who could work on it. I'll talk to him. Titus __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 3D Graphics
On 29/09/2010 6:00 AM, JoH wrote: Dear All, I have made a scatter plot and placed a plane within it using scatterplot3d. However, I have been asked for the data points to be a surface plot or have the plane more closely resemble the data rather than show trends. I have since tried to use the rgl package. Why doesn't this package use the window which already contains graphics? The usual graphics drivers don't support OpenGL, which rgl uses. It would probably be possible to implement a standard R graphics device on an rgl canvas, but I don't know anyone who has done that. I have one axis with decreasing values and two others with increasing values, hence I get an error message when I've tried using pers3d. You can always plot against negative x without labels, and add the labels for positive x later (using persp3d and axis3d). Duncan Murdoch Are there anyother ways in which I can create a surface plot? Thank you, Jo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems extracting p-value from summary(aov(...))
Hi: Try this: test.summary[[1]][, 5] It should return a vector of p-values, the last being NA. In your case, since there is only one non-NA p-value, it is enough to do test.summary[[1]][, 5][1] You mean that wasn't obvious? :) Explanation: summary(aovobj) actually returns a list, but that's not obvious from looking at the print method. If you ask for its names, as in names(summary(aovobj)), it returns NULL, further adding to the confusion. However, since it is a list, summary(aovobj)[[1]] returns the ANOVA table, and ANOVA tables in R's modeling functions are typically matrices. The model terms are the rownames of the matrix (although that isn't obvious at first, either). That being the case, the p-values are in column 5 (whose column name is 'Pr(F)' ), so we can extract that column with summary(aovobj)[[1]][, 5] orsummary(aovobj)[[1]][, 'Pr(F)'] To get the first element of that p-value vector, use summary(aovobj)[[1]][, 5][1] HTH, Dennis On Wed, Sep 29, 2010 at 3:47 AM, syrvn ment...@gmx.net wrote: Hi, I did a aov and used summary to obtain the p-value. I tried many ways to extract the p-value from the summary result but failed. Among others I tried the following: test.summary - summary(aov(data[,1]~time.points+Error(subject/time.points))) test.summary Error: subject Df Sum Sq Mean Sq F value Pr(F) Residuals 9 0.27467 0.030518 Error: subject:time.points Df Sum Sq Mean Sq F value Pr(F) time.points 2 0.018563 0.0092814 3.1777 0.06578 . Residuals 18 0.052574 0.0029208 --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 as.matrix(test.summary[[1]][,5]) Error in `[.default`(test.summary[[1]], , 5) : incorrect number of dimensions test.summary$Error: Within[[1]]$Pr(F) NULL test.summary[[2]][,5] Error in `[.default`(test.summary[[2]], , 5) : incorrect number of dimensions Any advise? Cheers -- View this message in context: http://r.789695.n4.nabble.com/Problems-extracting-p-value-from-summary-aov-tp2718726p2718726.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to rotate the x axis lable to an interested angle?
On 09/29/2010 07:19 PM, komais wrote: http://r.789695.n4.nabble.com/file/n2718591/QQ%E6%88%AA%E5%9B%BE%E6%9C%AA%E5%91%BD%E5%90%8D--1.png I am a beginner for R,and i could not solve this problem.who can help me to solve it ? how to rotate the x axis lable to an interested angle as the picture show? Hi komais, Have a look at the staxlab function in the plotrix package. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adding two data.frames.
Thx for the help works very well for me. -- View this message in context: http://r.789695.n4.nabble.com/Adding-two-data-frames-tp2718771p2718901.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix plot
Hi: Are you thinking of a levelplot/heatmap? Some places to look: ?levelplot in the lattice package ?geom_tile in the ggplot2 package ?heatmap2 in the gplots package You can find many more heatmap functions in R; try library(sos)# install first if you don't have it - *very* handy for quick searches findFn('heatmap') There are more than a few packages with heatmap functions... HTH, Dennis On Wed, Sep 29, 2010 at 3:55 AM, hairryharry hairryha...@tesco.net wrote: Hi, Fairly new to R - have done basic plots but now faced with plotting a matrix/table of results -I know what I want but cannot find out how to do it. Basically have individual questions ( x) to which an organization can rate themselves 1-10 (y) what I want to show is a matrix/density type plot (like the matrix corrolation plots I have seen on R graph site) showing for eac question (x) a circle or shape of varying size and/or colour to represent the number of organisations rating themselves for each value of y. Hope this makes sense, any advice would be gratefully received. HH __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Validation / Training - test data
Split sample validation is highly unstable with your sample size. The rms package can help with bootstrapping or cross-validation, assuming you have all modeling steps repreated for each resample. Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Validation-Training-test-data-tp2718523p2718905.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 95% confidence intercal with glm
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 ## from ?glm counts - c(18,17,15,20,10,20,25,13,12) outcome - gl(3,1,9) treatment - gl(3,3) d.AD - data.frame(treatment, outcome, counts) glm.D93 - glm(counts ~ outcome + treatment, family=poisson, data=d.AD) ## predict on 'link' or 'linear predictor' scale, with SEs pp - predict(glm.D93,se.fit=TRUE) etaframe - with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - exp(etaframe) ## inverse-link ## picture pframe - as.data.frame(cbind(obs=d.AD$counts,pframe)) library(plotrix) plot(pframe$obs,ylim=c(5,30)) with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE)) If you're using a binomial model you need 'plogis' rather than 'exp' as your inverse link, and you may need to multiply by the binomial N as appropriate. On 10-09-29 06:07 AM, Sam wrote: I am looking to do the same but am a bit confused and apply the inverse link function for your model. i understand up to this point and i understand what this means, however i am unsure why it needs to be done and how you do it - i.e i use family=binomial is this wrong if i use this method to calculate 95% CI? Thanks Sam On 28 Sep 2010, at 14:50, Ben Bolker wrote: zozio32 remy.pascal at gmail.com writes: Hi I had to use a glm instead of my basic lm on some data due to unconstant variance. now, when I plot the model over the data, how can I easily get the 95% confidence interval that sormally coming from: yv - predict(modelVar,list(aveLength=xv),int=c) matlines(xv,yv,lty=c(1,2,2)) There is no interval argument to pass to the predict function when using a glm, so I was wondering if I had to use an other function You need to use predict with se=TRUE; construct the confidence intervals by computing predicted values +- 1.96 times the standard error returned; and apply the inverse link function for your model. If heteroscedasticity is your main problem, and not a specific (known) non-normal distribution, you might consider using the gls function from the nlme package with an appropriate 'weights' argument. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkyjNKQACgkQc5UpGjwzenNW9gCgjfXin/9dI2y0rnk9wZWu97P8 TVgAn3WsG2ATva5WpLZXS1PXUTMRTTMi =gvAc -END PGP SIGNATURE- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting multiple animal tracks against Date/Time
On Wed, Sep 29, 2010 at 7:52 AM, Struve, Juliane j.str...@imperial.ac.uk wrote: I will post the example again to see if its readable now. My question is why does read.zoo(file=filenames,) work and lapply(filenames, read.zoo,...) does not ? Since I am reading the same file in both statements I just do not know how to interpret Error in strptime(x, format, tz = tz) : invalid 'x' argument. Thank you for all help. Juliane library(chron) library(zoo) #Generate example file Fish_ID=1646 Date - 01/01/2004 00:01:00 Date - as.POSIXct(strptime(Date,format=%m/%d/%Y %H:%M:%S)) R2sqrt -100 Test - data.frame(Fish_ID=Fish_ID,Date=Date,R2sqrt=R2sqrt) write.csv(Test,file=Test) #Read in example file filenames=Test read.zoo(file=filenames, header = TRUE, FUN = as.chron, sep = ,, colClasses = c(NULL, NULL, character, numeric)) lapply(filenames, read.zoo, header = TRUE, FUN = as.chron, sep = ,, colClasses = c(NULL, NULL, character, numeric)) FUN is an argument of lapply so what is actually running is lapply(filenames, FUN = as.chron, ...) rather than lapply(filenames, FUN = read.zoo, ...). It seems the short form usage of lapply won`t work here. Try this instead: lapply(filenames, function(F) read.zoo(F, header = TRUE, sep = ,, FUN = as.chron, colClasses = c(NULL, NULL, character, numeric))) -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix plot
HH- I'm not familiar with the plots you mention, but the following is a quick attempt to create the plot you describe. data-data.frame( org=1:10, q1=sample(1:10,replace=T), q2=sample(1:10,replace=T), q3=sample(1:10,replace=T)) # This generates a random data set like the one you describe. It looks like this: # org q1 q2 q3 #11 9 1 4 #22 1 2 1 #33 1 3 7 #44 10 9 6 # ... n-ncol(data)-1 # NUMBER OF ORGANIZATIONS new.mat-apply(data[,-1],2,function(x){xtabs(~factor(x,levels=1:10))}) plot(rep(1:n,each=10),rep(1:10,n), pch=16, cex=c(new.mat), xaxt=n, xlab=Question, ylab=Score) axis(1,at=1:n) Instead of cex=c(new.mat), you can supply a function of new.mat, say cex=f(new.mat) to specify the size of the dots. Hope that helps. -tgs On Wed, Sep 29, 2010 at 6:55 AM, hairryharry hairryha...@tesco.net wrote: Hi, Fairly new to R - have done basic plots but now faced with plotting a matrix/table of results -I know what I want but cannot find out how to do it. Basically have individual questions ( x) to which an organization can rate themselves 1-10 (y) what I want to show is a matrix/density type plot (like the matrix corrolation plots I have seen on R graph site) showing for eac question (x) a circle or shape of varying size and/or colour to represent the number of organisations rating themselves for each value of y. Hope this makes sense, any advice would be gratefully received. HH __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems extracting p-value from summary(aov(...))
On Sep 29, 2010, at 14:25 , Dennis Murphy wrote: test.summary[[1]][, 5][1] You mean that wasn't obvious? :) Worse, it doesn't actually work... test.summary - summary(npk.aovE) test.summary[[1]][, 5] Error in `[.default`(test.summary[[1]], , 5) : incorrect number of dimensions test.summary$Error: block[[1]][, 5] [1] 0.5252361NA test.summary[[1]][[1]][, 5] [1] 0.5252361NA I.e., you need an extra list extraction operator. The data structure goes List of 2 $ Error: block :List of 1 ..$ :Classes ‘anova’ and 'data.frame':2 obs. of 5 variables: .. ..$ Df : num [1:2] 1 4 .. ..$ Sum Sq : num [1:2] 37 306 ... and the intermediate list has class (summary.aov,listof). I believe the point is that you can have a matrix LHS and get a table for each of its columns. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Validation / Training - test data
Thanks for this, I had used validate(model0, method=boot,B=200) To get a index.corrected Brier score, However i am also wanting to bootstrap the predicted probabilities output from predict(model1, type = response) to get a idea of confidence, or am i best just using se.fit = TRUE and then calculating the 95%CI? Does what i want to do make sense? Thanks On 29 Sep 2010, at 13:38, Frank Harrell wrote: Split sample validation is highly unstable with your sample size. The rms package can help with bootstrapping or cross-validation, assuming you have all modeling steps repreated for each resample. Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Validation-Training-test-data-tp2718523p2718905.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] short captions for xtable?
Hi, For my dissertation, I've made copious use of xtable. I've just gotten stumped however. I'm a fan of extended captions explaining the table, but now I have to assemble a a list of tables and the captions are unwieldy. I presume xtable calls LaTeX's \caption function. Is there a way to include short captions (a la \caption[short caption]{long caption}? Thanks in advance for you help. Cuz -- View this message in context: http://r.789695.n4.nabble.com/short-captions-for-xtable-tp2719000p2719000.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splitting data in to multiple boxplots
Hi, The form of the data is not terribly important neither is whether it was sorted as boxplots are not order dependent are category a is a sorted or not. See below for individual plots with your new data. # read in data dat - read.table(textConnection( 1 a 12 2 b 4 3 a 3 4 c 54 5 a 12 6 b 11 7 c 9 8 c 2 ), header = FALSE) closeAllConnections() # add names since you said they lack them names(dat) - c(id, cat, value) # this makes it so it will ask for input before # changing plots par(ask = TRUE) # run the function boxplot for different values by() each level in dat$cat. # if par(ask = TRUE) was set, it should make you click or hit # return before changing each plot by(dat$value, dat$cat, boxplot) HTH, Josh On Wed, Sep 29, 2010 at 3:01 AM, deadlyspider wrcst...@gmail.com wrote: Ok, I don't think I was specific enough. The data originally came in this form 1 a 12 2 b 4 3 a 3 4 c 54 5 a 12 6 b 11 7 c 9 8 c 2 . . . . . . . . . Where I sorted by the second column (NB the second column is the categories and they have long names). I would then like separate boxplots for each category. The loop idea would be nice but unfortunately I do not fully understand the answer given. Thanks. -- View this message in context: http://r.789695.n4.nabble.com/Splitting-data-in-to-multiple-boxplots-tp2717491p2718659.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 3D Graphics
Duncan Murdoch murdoch.duncan at gmail.com writes: On 29/09/2010 6:00 AM, JoH wrote: Are there anyother ways in which I can create a surface plot? persp() (in base R) is less flexible than rgl::persp3d, doesn't do hidden line removal, etc., but does appear in the regular R graphics window and can be added to in a similar way to scatterplot3d (see the examples). (I believe it will have the same complaint as persp3d about x, y needing to be in increasing order) See also wireframe in the lattice package. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fitting a half-ellipse curve
Dear mailing list, I have following array: X2 Y2 [1,] 422.7900 6.0 [2,] 469.8007 10.5 [3,] 483.9428 11.0 [4,] 532.4917 25.5 [5,] 596.1942 33.5 [6,] 630.8496 40.5 [7,] 733.2996 45.0 [8,] 946.4779 32.0 [9,] 996.8068 35.5 [10,] 1074.3310 23.0 I do afterwards the following: plot.new() plot.window(xlim=c(min(X1)-50,max(X1)+50), ylim=c(min(Y1)-25,max(Y1)+25)) axis(2, cex.axis=1.2) axis(1, cex.axis=1.2) points(X2, Y2, type=p, pch=0, cex=1.2, col=black) title(main=Dyke Geometry Along Strike, cex.main=1.2, font.main=4) title(xlab=distance [m], cex.lab=1.2) title(ylab=half-thickness [m], cex.lab=1.2) box() I would like curve fitting where I proceeded with a non linear-regression with the according function below: nls(formula = Y2 ~ abs(b*abs(1-X2^2/a^2)^(1/2)), start = list( a=282, b=40)) The formula should give the y-positive part of an ellipse in my opinion or I might be completely wrong. In the end I would like to fit a curve of half an ellipse through the data. I might could do this as well with a 2nd order polynomial fit. I am grateful for any suggestions and comments to my problem. Cheers -- Niklaus Hürlimann Doctorant-PhD Université de Lausanne Institut de Minéralogie et Géochimie L'Anthropole CH-1015 Lausanne Suisse E-mail: niklaus.hurlim...@unil.ch Tel:+41(0)21 692 4452 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to rotate the x axis lable to an interested angle?
komais liutao1986 at yahoo.com.cn writes: how to rotate the x axis lable to an interested angle as the picture show? http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-create-rotated-axis-labels_003f __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to rotate the x axis lable to an interested angle?
Thank you very much. -- View this message in context: http://r.789695.n4.nabble.com/how-to-rotate-the-x-axis-lable-to-an-interested-angle-tp2718591p2718963.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems extracting p-value from summary(aov(...))
Hi: On Wed, Sep 29, 2010 at 5:56 AM, peter dalgaard pda...@gmail.com wrote: On Sep 29, 2010, at 14:25 , Dennis Murphy wrote: test.summary[[1]][, 5][1] You mean that wasn't obvious? :) Worse, it doesn't actually work... test.summary - summary(npk.aovE) test.summary[[1]][, 5] Error in `[.default`(test.summary[[1]], , 5) : incorrect number of dimensions test.summary$Error: block[[1]][, 5] [1] 0.5252361NA test.summary[[1]][[1]][, 5] [1] 0.5252361NA I.e., you need an extra list extraction operator. The data structure goes List of 2 $ Error: block :List of 1 ..$ :Classes anova and 'data.frame':2 obs. of 5 variables: .. ..$ Df : num [1:2] 1 4 .. ..$ Sum Sq : num [1:2] 37 306 ... and the intermediate list has class (summary.aov,listof). I believe the point is that you can have a matrix LHS and get a table for each of its columns. I presume we're using about the npk data in the MASS package. library(MASS) npk.aov - aov(yield ~ block + N*P*K, data = npk) npk.aovE - summary(npk.aov) str(npk.aovE) List of 1 $ :Classes anova and 'data.frame': 8 obs. of 5 variables: ..$ Df : num [1:8] 5 1 1 1 1 1 1 12 ..$ Sum Sq : num [1:8] 343.3 189.3 8.4 95.2 21.3 ... ..$ Mean Sq: num [1:8] 68.7 189.3 8.4 95.2 21.3 ... ..$ F value: num [1:8] 4.447 12.259 0.544 6.166 1.378 ... ..$ Pr(F) : num [1:8] 0.01594 0.00437 0.4749 0.0288 0.26317 ... - attr(*, class)= chr [1:2] summary.aov listof npk.aovE[[1]][, 5] [1] 0.015938790 0.004371812 0.474904093 0.028795054 0.263165283 0.168647879 [7] 0.862752086 NA npk.aovE[[1]][, 5][1] [1] 0.01593879 npk.aovE[[1]][, 'Pr(F)'] [1] 0.015938790 0.004371812 0.474904093 0.028795054 0.263165283 0.168647879 [7] 0.862752086 NA npk.aovE[[1]][, 'Pr(F)'][1] [1] 0.01593879 What am I missing? Regards, Dennis -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting multiple animal tracks against Date/Time
I will post the example again to see if its readable now. My question is why does read.zoo(file=filenames,) work and lapply(filenames, read.zoo,...) does not ? Since I am reading the same file in both statements I just do not know how to interpret Error in strptime(x, format, tz = tz) : invalid 'x' argument. Thank you for all help. Juliane library(chron) library(zoo) #Generate example file Fish_ID=1646 Date - 01/01/2004 00:01:00 Date - as.POSIXct(strptime(Date,format=%m/%d/%Y %H:%M:%S)) R2sqrt -100 Test - data.frame(Fish_ID=Fish_ID,Date=Date,R2sqrt=R2sqrt) write.csv(Test,file=Test) #Read in example file filenames=Test read.zoo(file=filenames, header = TRUE, FUN = as.chron, sep = ,, colClasses = c(NULL, NULL, character, numeric)) lapply(filenames, read.zoo, header = TRUE, FUN = as.chron, sep = ,, colClasses = c(NULL, NULL, character, numeric)) From: Gabor Grothendieck [ggrothendi...@gmail.com] Sent: 29 September 2010 00:09 To: Struve, Juliane Cc: r-help@r-project.org Subject: Re: [R] plotting multiple animal tracks against Date/Time On Tue, Sep 28, 2010 at 9:30 AM, Struve, Juliane j.str...@imperial.ac.uk wrote: Hi, in this self-contained example the file the same error message appears as when I read in my original results files. library (zoo) library(chron) #generate example data Fish_ID=1646 Date - 01/01/2004 00:01:00 Date - as.POSIXct(strptime(Date,format=%m/%d/%Y %H:%M:%S)) R2sqrt -100 #put into dataframe Test - data.frame(Fish_ID=Fish_ID,Date=Date,R2sqrt=R2sqrt) # write .csv file write.csv(Test,file=Test) #generate list of files filenames=Test #read file(s) into zoo object read.zoo(file=filenames, header = TRUE, FUN = as.chron, sep = ,, colClasses = c(NULL, NULL, character, numeric)) #works fine #read list of files into zoo.object lapply(filenames, read.zoo, header = TRUE, FUN = as.chron, sep = ,, colClasses = c(NULL, NULL, character, numeric))# error Error in strptime(x, format, tz = tz) : invalid 'x' argument Am I missing something ? Thank you for your time and patience. Self contained means anyone else can just copy your code and paste it into their session and see the error message you see. Its likely that your file does not contain what you think it does. -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 95% confidence intercal with glm
Hi Ben and list, Sorry to be a pain! I have followed your code, and modified it - pp - predict(model1,se.fit=TRUE, type = response) etaframe - + with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - plogis(etaframe) pframe My response variable is 0 or 1, the probabilities are all high above my cut-off point of 0.445, i think this may have something to do with you may need to multiply by the binomial N as appropriate. However how do i multiply if my response is 0 or 1? Furthermore, will the approach above work for a ordinal response? Thanks Sam On 29 Sep 2010, at 13:44, Ben Bolker wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 ## from ?glm counts - c(18,17,15,20,10,20,25,13,12) outcome - gl(3,1,9) treatment - gl(3,3) d.AD - data.frame(treatment, outcome, counts) glm.D93 - glm(counts ~ outcome + treatment, family=poisson, data=d.AD) ## predict on 'link' or 'linear predictor' scale, with SEs pp - predict(glm.D93,se.fit=TRUE) etaframe - with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - exp(etaframe) ## inverse-link ## picture pframe - as.data.frame(cbind(obs=d.AD$counts,pframe)) library(plotrix) plot(pframe$obs,ylim=c(5,30)) with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE)) If you're using a binomial model you need 'plogis' rather than 'exp' as your inverse link, and you may need to multiply by the binomial N as appropriate. On 10-09-29 06:07 AM, Sam wrote: I am looking to do the same but am a bit confused and apply the inverse link function for your model. i understand up to this point and i understand what this means, however i am unsure why it needs to be done and how you do it - i.e i use family=binomial is this wrong if i use this method to calculate 95% CI? Thanks Sam On 28 Sep 2010, at 14:50, Ben Bolker wrote: zozio32 remy.pascal at gmail.com writes: Hi I had to use a glm instead of my basic lm on some data due to unconstant variance. now, when I plot the model over the data, how can I easily get the 95% confidence interval that sormally coming from: yv - predict(modelVar,list(aveLength=xv),int=c) matlines(xv,yv,lty=c(1,2,2)) There is no interval argument to pass to the predict function when using a glm, so I was wondering if I had to use an other function You need to use predict with se=TRUE; construct the confidence intervals by computing predicted values +- 1.96 times the standard error returned; and apply the inverse link function for your model. If heteroscedasticity is your main problem, and not a specific (known) non-normal distribution, you might consider using the gls function from the nlme package with an appropriate 'weights' argument. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkyjNKQACgkQc5UpGjwzenNW9gCgjfXin/9dI2y0rnk9wZWu97P8 TVgAn3WsG2ATva5WpLZXS1PXUTMRTTMi =gvAc -END PGP SIGNATURE- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 95% confidence intercal with glm
Hi Ben and list, Sorry to be a pain! I have followed your code, and modified it - pp - predict(model1,se.fit=TRUE, type = response) etaframe - + with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - plogis(etaframe) pframe My response variable is 0 or 1, the probabilities are all high above my cut-off point of 0.445, i think this may have something to do with you may need to multiply by the binomial N as appropriate. However how do i multiply if my response is 0 or 1? Furthermore, will the approach above work for a ordinal response? Thanks Sam On 29 Sep 2010, at 13:44, Ben Bolker wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 ## from ?glm counts - c(18,17,15,20,10,20,25,13,12) outcome - gl(3,1,9) treatment - gl(3,3) d.AD - data.frame(treatment, outcome, counts) glm.D93 - glm(counts ~ outcome + treatment, family=poisson, data=d.AD) ## predict on 'link' or 'linear predictor' scale, with SEs pp - predict(glm.D93,se.fit=TRUE) etaframe - with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - exp(etaframe) ## inverse-link ## picture pframe - as.data.frame(cbind(obs=d.AD$counts,pframe)) library(plotrix) plot(pframe$obs,ylim=c(5,30)) with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE)) If you're using a binomial model you need 'plogis' rather than 'exp' as your inverse link, and you may need to multiply by the binomial N as appropriate. On 10-09-29 06:07 AM, Sam wrote: I am looking to do the same but am a bit confused and apply the inverse link function for your model. i understand up to this point and i understand what this means, however i am unsure why it needs to be done and how you do it - i.e i use family=binomial is this wrong if i use this method to calculate 95% CI? Thanks Sam On 28 Sep 2010, at 14:50, Ben Bolker wrote: zozio32 remy.pascal at gmail.com writes: Hi I had to use a glm instead of my basic lm on some data due to unconstant variance. now, when I plot the model over the data, how can I easily get the 95% confidence interval that sormally coming from: yv - predict(modelVar,list(aveLength=xv),int=c) matlines(xv,yv,lty=c(1,2,2)) There is no interval argument to pass to the predict function when using a glm, so I was wondering if I had to use an other function You need to use predict with se=TRUE; construct the confidence intervals by computing predicted values +- 1.96 times the standard error returned; and apply the inverse link function for your model. If heteroscedasticity is your main problem, and not a specific (known) non-normal distribution, you might consider using the gls function from the nlme package with an appropriate 'weights' argument. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkyjNKQACgkQc5UpGjwzenNW9gCgjfXin/9dI2y0rnk9wZWu97P8 TVgAn3WsG2ATva5WpLZXS1PXUTMRTTMi =gvAc -END PGP SIGNATURE- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] FW: creating a custom background
Hi. I want to create a plot with Pantone654 as the background. The RGB for this color is (0,61,121), which corresponds to a hex of #003D79. How do I specify the bg parameter for this? All Best, Ethan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] FW: creating a custom background
Try this: par(bg = '#003D79') On Wed, Sep 29, 2010 at 11:14 AM, Arenson, Ethan earen...@nbome.org wrote: Hi. I want to create a plot with Pantone654 as the background. The RGB for this color is (0,61,121), which corresponds to a hex of #003D79. How do I specify the bg parameter for this? All Best, Ethan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 95% confidence intercal with glm
On 10-09-29 10:04 AM, Sam wrote: Hi Ben and list, Sorry to be a pain! I have followed your code, and modified it - **You should not use type=response here.** The point is that the (symmetric) confidence intervals are computed on the link/linear predictor scale, and then inverse-link-transformed (in this case, along with the fitted values) -- they then become asymmetric. pp - predict(model1,se.fit=TRUE, type = response) etaframe - + with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - plogis(etaframe) pframe My response variable is 0 or 1, the probabilities are all high above my cut-off point of 0.445, i think this may have something to do with you may need to multiply by the binomial N as appropriate. However how do i multiply if my response is 0 or 1? if 'size=1' (i.e. Bernoulli trials) then you would be multiplying by 1, so don't bother. On 29 Sep 2010, at 13:44, Ben Bolker wrote: ## from ?glm counts - c(18,17,15,20,10,20,25,13,12) outcome - gl(3,1,9) treatment - gl(3,3) d.AD - data.frame(treatment, outcome, counts) glm.D93 - glm(counts ~ outcome + treatment, family=poisson, data=d.AD) ## predict on 'link' or 'linear predictor' scale, with SEs pp - predict(glm.D93,se.fit=TRUE) etaframe - with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - exp(etaframe) ## inverse-link ## picture pframe - as.data.frame(cbind(obs=d.AD$counts,pframe)) library(plotrix) plot(pframe$obs,ylim=c(5,30)) with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE)) If you're using a binomial model you need 'plogis' rather than 'exp' as your inverse link, and you may need to multiply by the binomial N as appropriate. On 10-09-29 06:07 AM, Sam wrote: I am looking to do the same but am a bit confused and apply the inverse link function for your model. i understand up to this point and i understand what this means, however i am unsure why it needs to be done and how you do it - i.e i use family=binomial is this wrong if i use this method to calculate 95% CI? Thanks Sam On 28 Sep 2010, at 14:50, Ben Bolker wrote: zozio32 remy.pascal at gmail.com writes: Hi I had to use a glm instead of my basic lm on some data due to unconstant variance. now, when I plot the model over the data, how can I easily get the 95% confidence interval that sormally coming from: yv - predict(modelVar,list(aveLength=xv),int=c) matlines(xv,yv,lty=c(1,2,2)) There is no interval argument to pass to the predict function when using a glm, so I was wondering if I had to use an other function You need to use predict with se=TRUE; construct the confidence intervals by computing predicted values +- 1.96 times the standard error returned; and apply the inverse link function for your model. If heteroscedasticity is your main problem, and not a specific (known) non-normal distribution, you might consider using the gls function from the nlme package with an appropriate 'weights' argument. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] generalized additive mixed models for ordinal data
Dear R-users, Is there any R-function for fitting generalized additive mixed models for ordinal data? Do they actually make some sense? I can fit a generalized linear mixed model for ordinal data using the function clmm(ordinal) and I'm able to cope with generalized additive model for ordinal data within the package VGAM. But I would like to fit something like: g(\gamma_{ij}) = \theta_{j} + x_{i1} \beta_1 + f(x_{2i}) + u_{i}, where \gamma_{ij} denote the cumulative probability that the i-th observation falls in the j-th category or below. Sorry for the rather out-of-R question, Carlo Giovanni Camarda -- This mail has been sent through the MPI for Demographic ...{{dropped:10}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] short captions for xtable?
On Sep 29, 2010, at 8:31 AM, cuz wrote: Hi, For my dissertation, I've made copious use of xtable. I've just gotten stumped however. I'm a fan of extended captions explaining the table, but now I have to assemble a a list of tables and the captions are unwieldy. I presume xtable calls LaTeX's \caption function. Is there a way to include short captions (a la \caption[short caption]{long caption}? Thanks in advance for you help. Cuz I don't see any obvious way to do that with xtable. You may wish to contact the package author directly and perhaps ask about adding that functionality as an enhancement. In Frank's Hmisc package, there is the latex() function, which has separate arguments for 'caption' and 'caption.lot', so you may wish to look at that approach. HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] nlminb and optim
I am using both nlminb and optim to get MLEs from a likelihood function I have developed. AFAIK, the model I has not been previously used in this way and so I am struggling a bit to unit test my code since I don't have another data set to compare this kind of estimation to. The likelihood I have is (in tex below) \begin{equation} \label{eqn:marginal} L(\beta) = \prod_{s=1}^N \int \prod_{i=1}^K\frac{e^{x_{is}(\theta_s-\beta_i)}} {x_{is}!e^{e^(\theta_s-\beta_i)}} f(\theta)d(\theta) \end{equation} Where I view $\theta$ as a nuisance parameter and so I integrate it out of the likelihood. The goal is to get parameter estimates for $\beta$. The integral cannot be easily evaluated so I approximate it as: \begin{equation} \label{eqn:marginal2} L(\beta) \approx \prod_{s=1}^N \sum_{q=1}^Q \prod_{i=1}^K\frac{e^{x_{is}(\theta_{q}-\beta_i)}} {x_{is}!e^{e^(\theta_{q}-\beta_i)}} w_q \end{equation} \noindent where $\theta_{q}$ is the node at quadrature point $q = 1, \ldots, Q$ and $w_q$ is the weight at quadrature point $q$. For now, I am assuming $f(\theta)$ is Uniform but this may change and that is not a major issue. Now, I have written a function using both nlminb and optim. I have copied my function below where the arguments are the data set, Q for the number of quadrature points, and an option for providing starting values. I am running into some flow control problems and so I hack it a bit and fix a lower bound as you may see in the function. Here is the issue at hand. First, nlminb and optim give different parameter estimates. Optim tells me it converged but nlminb tells me I get relative convergence. If I use different number of quadrature points in optim, I get very different parameter estimates and this should not happen. But, varying the number of quadrature points in nlminb yields the same parameter estimates, but I am not sure the model converged. I have also provided the data set I am working with below as well as sessionInfo(). There may be many issues going on here and am grateful for any reactions you all may have. I *think* my likelihood is written properly and I *think* I am handling flow control issues in a relatively decent way. I may be misinterpreting optim or nlminb reports. In any event, should anyone take the time to review this and offers pointers I would be grateful. Harold fit - function(data, Q, startVal = NULL, ...){ if(!is.null(startVal) length(startVal) != ncol(data) ){ stop(Length of argument startVal not equal to the number of parameters estimated) } if(!is.null(startVal)){ startVal - startVal } else { startVal - rep(0, ncol(data)) } #qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) qq - gauss.quad.prob(Q, dist = 'uniform', alpha = -5, beta=5) rr1 - matrix(0, nrow = Q, ncol = nrow(data)) data - as.matrix(data) L - nrow(data) C - ncol(data) fn - function(b){ b - b[1:C] for(j in 1:Q){ for(i in 1:L){ rr1[j,i] - prod((exp(data[i,]*(qq$nodes[j]-b))) / (factorial(data[i,]) * exp(exp(qq$nodes[j]-b * qq$weights[j] } } rr1[rr1==0] - 1e-10 -sum(log(colSums(rr1))) } #opt - optim(startVal, fn, method = BFGS, hessian = FALSE) opt - nlminb(startVal, fn) #list(coefficients = opt$par, LogLik = -opt$value, Std.Error = sqrt(diag(solve(opt$hessian #list(coefficients = opt$par, LogLik = -opt$value) opt } fit(data, Q = 10) data cindyR cjR ohsR sdhpR [1,] 24 146 9 [2,] 19 14610 [3,] 19 14610 [4,] 17 186 8 [5,] 19 175 8 [6,] 17 176 9 [7,] 13 15511 [8,] 19 136 8 [9,] 21 10411 [10,] 16 155 9 [11,] 14 16510 [12,] 14 13510 [13,] 17 155 8 [14,] 16 184 8 [15,] 16 145 8 [16,] 18 135 8 [17,] 15 146 7 [18,] 16 175 7 [19,] 18 125 8 [20,] 18 94 9 [21,] 17 9410 [22,] 18 124 8 [23,] 15 155 7 [24,] 16 95 8 [25,] 18 115 7 [26,] 18 104 9 [27,] 15 154 7 [28,] 16 114 9 [29,] 11 165 8 [30,] 11 165 8 [31,] 16
Re: [R] 95% confidence intercal with glm
Right, that makes sense, thanks The reason i used type= response was i wanted to convert the predicted probabilities to the response scale, as surely this is the scale at which a 95CI value is most useful for? I.e pp - predict(model1,se.fit=TRUE, type = response) 1 0.68 Probability of point 1 having a response of 1 = 0.68 - # this is above the cutoff therefore this has a response of 1 Then it would be of most use to get the 95CI on this scale to see if the probability straddles the cutoff value? Maybe i am missing something? Thanks On 29 Sep 2010, at 15:27, Ben Bolker wrote: On 10-09-29 10:04 AM, Sam wrote: Hi Ben and list, Sorry to be a pain! I have followed your code, and modified it - **You should not use type=response here.** The point is that the (symmetric) confidence intervals are computed on the link/linear predictor scale, and then inverse-link-transformed (in this case, along with the fitted values) -- they then become asymmetric. pp - predict(model1,se.fit=TRUE, type = response) etaframe - + with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - plogis(etaframe) pframe My response variable is 0 or 1, the probabilities are all high above my cut-off point of 0.445, i think this may have something to do with you may need to multiply by the binomial N as appropriate. However how do i multiply if my response is 0 or 1? if 'size=1' (i.e. Bernoulli trials) then you would be multiplying by 1, so don't bother. On 29 Sep 2010, at 13:44, Ben Bolker wrote: ## from ?glm counts - c(18,17,15,20,10,20,25,13,12) outcome - gl(3,1,9) treatment - gl(3,3) d.AD - data.frame(treatment, outcome, counts) glm.D93 - glm(counts ~ outcome + treatment, family=poisson, data=d.AD) ## predict on 'link' or 'linear predictor' scale, with SEs pp - predict(glm.D93,se.fit=TRUE) etaframe - with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - exp(etaframe) ## inverse-link ## picture pframe - as.data.frame(cbind(obs=d.AD$counts,pframe)) library(plotrix) plot(pframe$obs,ylim=c(5,30)) with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE)) If you're using a binomial model you need 'plogis' rather than 'exp' as your inverse link, and you may need to multiply by the binomial N as appropriate. On 10-09-29 06:07 AM, Sam wrote: I am looking to do the same but am a bit confused and apply the inverse link function for your model. i understand up to this point and i understand what this means, however i am unsure why it needs to be done and how you do it - i.e i use family=binomial is this wrong if i use this method to calculate 95% CI? Thanks Sam On 28 Sep 2010, at 14:50, Ben Bolker wrote: zozio32 remy.pascal at gmail.com writes: Hi I had to use a glm instead of my basic lm on some data due to unconstant variance. now, when I plot the model over the data, how can I easily get the 95% confidence interval that sormally coming from: yv - predict(modelVar,list(aveLength=xv),int=c) matlines(xv,yv,lty=c(1,2,2)) There is no interval argument to pass to the predict function when using a glm, so I was wondering if I had to use an other function You need to use predict with se=TRUE; construct the confidence intervals by computing predicted values +- 1.96 times the standard error returned; and apply the inverse link function for your model. If heteroscedasticity is your main problem, and not a specific (known) non-normal distribution, you might consider using the gls function from the nlme package with an appropriate 'weights' argument. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] getDLLVersion
Hi, I'm just learning to write R extensions in C and to embed R in C. I was trying to get through the example in the help page on calling the .dll directly ( http://cran.r-project.org/doc/manuals/R-exts.html#Calling-R_002edll-directly ). When I compile I consistently get the error that getDLLVersion() as well as get_R_HOME() and getRUser() are undefined references. These are defined in Rembedded.h ln60 as: extern char *getDLLVersion(void), *getRUser(void), *get_R_HOME(void); I included Rembedded.h from the example. I searched the entire R directory for the string getDLLVersion and only found it in Rembedded.h and the help documentation (for the .dll directly example). Am I missing some key file or just not understanding how these functions work? I'm just learning C so this may be a very basic question. If anyone could point me to a good reference on this it would be very helpful. Technical Notes: OS: win7 32bit Compiler: mingw32 R: 2.11.1 Thanks Kyle [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] FW: creating a custom background
Ethan- You need to be more explicit about what you mean by 'background'. Do you mean: (a) the entire plot including margins?, or (b) only the plotting area?, or (c) a different color for both margins and plotting area? If you want (a), the solution is par(bg = '#003D79'). If you want (b), the solution is a little trick I picked up from the R help archives: plot.area.col-function(col){ ttt-par()$usr rect(ttt[1],ttt[3],ttt[2],ttt[4],col=col) } plot(1,1, panel.first=plot.area.col('#003D79') ) If you want (c), you can combine both (a) and (b) as follows: par(bg = 'blue') plot(1,1, panel.first=plot.area.col('red') ) Hope that helps. -tgs If you are using the plot function, and you only want to change the plot area colored On Wed, Sep 29, 2010 at 10:14 AM, Arenson, Ethan earen...@nbome.org wrote: Hi. I want to create a plot with Pantone654 as the background. The RGB for this color is (0,61,121), which corresponds to a hex of #003D79. How do I specify the bg parameter for this? All Best, Ethan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] What's the meaning of Species ~ . in IRIS data
I am refering to a function call like this: data(iris) x - svmlight(Species ~ ., data = iris) I tried to see the content of it by typing: Species ~ . but it gives nothing. How can I see it's content ? - P.Dubois __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What's the meaning of Species ~ . in IRIS data
On 29/09/2010 10:51 AM, Gundala Viswanath wrote: I am refering to a function call like this: data(iris) x- svmlight(Species ~ ., data = iris) I tried to see the content of it by typing: Species ~ . but it gives nothing. How can I see it's content ? str(Species ~ .) will tell you that it's a formula. The help page for the svmlight() function should tell you how it is used. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What's the meaning of Species ~ . in IRIS data
On Wed, 29 Sep 2010, Duncan Murdoch wrote: On 29/09/2010 10:51 AM, Gundala Viswanath wrote: I am refering to a function call like this: data(iris) x- svmlight(Species ~ ., data = iris) I tried to see the content of it by typing: Species ~ . but it gives nothing. How can I see it's content ? str(Species ~ .) will tell you that it's a formula. The help page for the svmlight() function should tell you how it is used. If this is svmlight in klaR, it seems not to. But ?formula will definitely tell you how to interpret the formula. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R crashes when loading rgl package before minqa package
Hej, Calling newuoa (from the minqa package) makes R crash when the package rgl is loaded first. This however only on certain selected data. The data used for testing (saved to 'bugs.R'): xvals = c(1,2,4,5,7,8,9,10,11,12,14,15,16,18,19,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36) yvals = c(857.7597,975.8624,978.2655,979.3034,965.5919,983.8946,992.2512,992.1178,979.5379,974.4269,968.4113,991.5210,977.3361,985.7800,975.5220,974.6880,973.8102,980.7295,982.0034,984.7993,978.4948,970.4351,969.0718,983.7892,976.3637,980.7833,987.1665,976.6000,975.1332,971.0757,989.4693) initpar = c(-5.1471384, -3861.8905839, 979.2616002, 0.2572355, 27.5705764) optimft - function(x) { yft = x[2] + (x[3] - x[2])/((1 + exp(x[1] * (log(xvals) - log(x[4]^x[5]) return(sum((yvals - yft)^2)) } Sequence of commands needed to make the bug appear: Start R source('bugs.R') library(minqa) library(rgl) newuoa(initpar, optimft) = OK Start R source('bugs.R') library(rgl) library(minqa) newuoa(initpar, optimft) = Crash: segfault: address 0x18, cause 'memory not mapped' I found the bug using the package qpcR, where rgl is loaded when loading qpcR while minqa is only loaded later, when needed. Running on Debian squeeze 64 bit. R version: R version 2.11.1 (2010-05-31) x86_64-pc-linux-gnu rgl version: 0.91 minqa version: 1.1.9 Rcpp version: 0.8.6 (loaded by minqa) Kind regards, Gaspard Lequeux __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] generalized additive mixed models for ordinal data
Hi: You could look into the gamm4 package. Its description is: Fit generalized additive mixed models via a version of mgcv's gamm function, using lme4 for estimation via Fabian Scheipl's trick. HTH, Dennis On Wed, Sep 29, 2010 at 7:28 AM, Camarda, Carlo Giovanni cama...@demogr.mpg.de wrote: Dear R-users, Is there any R-function for fitting generalized additive mixed models for ordinal data? Do they actually make some sense? I can fit a generalized linear mixed model for ordinal data using the function clmm(ordinal) and I'm able to cope with generalized additive model for ordinal data within the package VGAM. But I would like to fit something like: g(\gamma_{ij}) = \theta_{j} + x_{i1} \beta_1 + f(x_{2i}) + u_{i}, where \gamma_{ij} denote the cumulative probability that the i-th observation falls in the j-th category or below. Sorry for the rather out-of-R question, Carlo Giovanni Camarda -- This mail has been sent through the MPI for Demographic ...{{dropped:10}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] (OT) Change of email address
Apologies for bothering anyone who may not be interested in this, but some of you will, for instance, have my current email address in your address books, etc. As result of a new policy by Manchester University, retired former staff who no longer contribute actively to research in their former departments are to have their email accounts closed. The address from which I have been mailing to R-help (etc.): ted.hard...@manchester.ac.uk will therefore soon be closed down. I have set up (and subscribed to R-help, R-devel, R-announce) a new email address: ted.hard...@wlandres.net If you have kept a record of my current (manchester.ac.uk) address, please modify this to the new address. Messages sent to the current address will continue to arrive, for another week or two. With thanks, and best wishes to all, Ted. E-Mail: (Ted Harding) ted.hard...@wlandres.net Fax-to-email: +44 (0)870 094 0861 Date: 29-Sep-10 Time: 16:58:18 -- XFMail -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] short captions for xtable?
Thanks. I didn't see any obvious way to do it either. It looks like the caption option has room for additional input that are as yet not designated. I'll take a look at the latex() possibility. Best, cuz -- View this message in context: http://r.789695.n4.nabble.com/short-captions-for-xtable-tp2719000p2719284.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R Spracheinstellung
sehr geehrte Damen und Herren Für unseren Statistik Unterricht brauchen wir das R-Programm. Ich habe das Programm heruntergeladen, deutsch gewählt und installiert. Das Problem ist nach 3mal neu installieren, dass das Programm immer auf englisch kommt. Können Sie mir helfen? Vg Tobias Keller __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R Graphic - Tellis as a potential
Hello all, I have been meaning to learn R for a while and have just subscribed to this list. I am planning to give R a shot at one of my live projects. I am looking to explore graphical features of R on my data below. Sample Data: Cat1 - Cat2 - Cat3 - Cat4 - NumPeople - Salary H - L - H - L - 100 - 5 L - L - L - L - 40 - 3 - H - H - - 100 - 45000 Cat1 through Cat4 are categorical variables containing High, Medium, Low or Blank values and the last two variables are continuous. 1. Is there a good way of graphically representing this data in R? I am looking for total population and average salary within each combination of values. 2. Cat1 and Cat2 are related while Cat3, Cat4 are related. I have a grid in Excel that summarizes by having (Cat1 + Cat2) on the left and (Cat3 + Cat4) on the top. Could we draw this kind of matrix using R? I started looking up this morning and see that this could be done using trellis objects. I will appreciate if anyone has examples similar to this. Thanks, [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems extracting p-value from summary(aov(...))
Hi: Someone off-list (Josh Wiley - thank you) mentioned the Error() term in the OP's ANOVA, which I missed in responding to the post - sorry for the misinformation. Using the npk example with code that Josh showed me, we have, for the following model, npk.aov2 - aov(yield ~ N*P*K + Error(block/P), data = npk) npk.aovE2 - summary(npk.aov2) str(npk.aovE2) List of 3 $ Error: block :List of 1 ..$ :Classes anova and 'data.frame':2 obs. of 5 variables: .. ..$ Df : num [1:2] 1 4 .. ..$ Sum Sq : num [1:2] 37 306 .. ..$ Mean Sq: num [1:2] 37 76.6 .. ..$ F value: num [1:2] 0.483 NA .. ..$ Pr(F) : num [1:2] 0.525 NA ..- attr(*, class)= chr [1:2] summary.aov listof $ Error: block:P:List of 1 ..$ :Classes anova and 'data.frame':3 obs. of 5 variables: .. ..$ Df : num [1:3] 1 1 4 .. ..$ Sum Sq : num [1:3] 8.4 33.1 38.3 .. ..$ Mean Sq: num [1:3] 8.4 33.14 9.57 .. ..$ F value: num [1:3] 0.878 3.464 NA .. ..$ Pr(F) : num [1:3] 0.402 0.136 NA ..- attr(*, class)= chr [1:2] summary.aov listof $ Error: Within :List of 1 ..$ :Classes anova and 'data.frame':5 obs. of 5 variables: .. ..$ Df : num [1:5] 1 1 1 1 8 .. ..$ Sum Sq : num [1:5] 189.282 95.202 21.282 0.482 147.023 .. ..$ Mean Sq: num [1:5] 189.282 95.202 21.282 0.482 18.378 .. ..$ F value: num [1:5] 10.2994 5.1802 1.158 0.0262 NA .. ..$ Pr(F) : num [1:5] 0.0124 0.0524 0.3133 0.8754 NA ..- attr(*, class)= chr [1:2] summary.aov listof - attr(*, class)= chr summary.aovlist In this case, the p-values can be gotten from each component of the list, but one has to read the output from str() carefully. For example, under each $Error component is another component, and it is the subcomponent that needs to be subsetted to grab the p-values: Top level: npk.aovE2[[1]][[1]][, 5] [1] 0.5252361NA Mid-level: npk.aovE2[[2]][[1]][, 5] [1] 0.4017266 0.1362185NA Bottom level: npk.aovE2[[3]][[1]][, 5] [1] 0.01243794 0.05239664 0.31325902 0.87540509 NA BTW, this is one reason why str() is such an important function in R. Thanks to Peter and Josh for pointing out my error; apologies for the thickheadedness. Dennis On Wed, Sep 29, 2010 at 5:56 AM, peter dalgaard pda...@gmail.com wrote: On Sep 29, 2010, at 14:25 , Dennis Murphy wrote: test.summary[[1]][, 5][1] You mean that wasn't obvious? :) Worse, it doesn't actually work... test.summary - summary(npk.aovE) test.summary[[1]][, 5] Error in `[.default`(test.summary[[1]], , 5) : incorrect number of dimensions test.summary$Error: block[[1]][, 5] [1] 0.5252361NA test.summary[[1]][[1]][, 5] [1] 0.5252361NA I.e., you need an extra list extraction operator. The data structure goes List of 2 $ Error: block :List of 1 ..$ :Classes anova and 'data.frame':2 obs. of 5 variables: .. ..$ Df : num [1:2] 1 4 .. ..$ Sum Sq : num [1:2] 37 306 ... and the intermediate list has class (summary.aov,listof). I believe the point is that you can have a matrix LHS and get a table for each of its columns. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] [R-pkgs] caret package version 4.63
Version 4.63 of the caret package is now on CRAN. caret can be used to tune the parameters of predictive models using resampling, estimate variable importance and visualize the results. There are also various modeling and helper functions that can be useful for training models. caret has wrappers to over 99 different models for classification and regression. See the package vignettes or: http://user2010.org/slides/Kuhn.pdf http://www.jstatsoft.org/v28/i05 for more details. Since the last posting to the list: - wrappers for a number of new models were added, notably gam models (from both the gam and mgcv packages) and logic trees - when resampling with train(), class probabilities can now be used to calculate performance (such as the AUC of an ROC curve). A basic summary function, twoClassSummary(), can be used to calculate sensitivity, specificity and the ROC AUC. - repeated k-fold CV and the bootstrap 632 technique are available in train() - pre-processing can not be used within each resampling iteration within train(). - a function for independent component regression (icr) was added - the class for aggregating and visualization resampling results (resamples) has been enhanced with more visualization methods. The class can also work with caret's feature selection routines (rfe() and sbf()) - the print method for train() has been improved - functions can be now be passed to the tuneGrid argument in train() - an existing function that catalogs the existing models available within train(), called modelLookup(), is now available to the users - when parallel processing, more computations are being executed in the worker processes than previously (e.g. performance calcs) The NEWS file has the blow-by-blow list of changes. The package homepage is https://r-forge.r-project.org/projects/caret/ Send questions, collaborations, comments etc to max.kuhn at pfizer.com. Max ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting multiple animal tracks against Date/Time
Hi: This 'works' on the lapply end: # Function to read one file from the list: g - function(x) read.zoo(file = x, header = TRUE, FUN = as.chron, sep = ,, colClasses = c(NULL, NULL, character, numeric)) # Apply to all files in list: lapply(filenames, g) [[1]] (01/01/04 00:01:00) 100 # Just to prove to myself that this works on a list of input files, write.csv(Test,file=Test2) write.csv(Test,file=Test3) filenames - c('Test', 'Test2', 'Test3') lapply(filenames, g) [[1]] (01/01/04 00:01:00) 100 [[2]] (01/01/04 00:01:00) 100 [[3]] (01/01/04 00:01:00) 100 Hope this helps, Dennis On Wed, Sep 29, 2010 at 4:52 AM, Struve, Juliane j.str...@imperial.ac.ukwrote: I will post the example again to see if its readable now. My question is why does read.zoo(file=filenames,) work and lapply(filenames, read.zoo,...) does not ? Since I am reading the same file in both statements I just do not know how to interpret Error in strptime(x, format, tz = tz) : invalid 'x' argument. Thank you for all help. Juliane library(chron) library(zoo) #Generate example file Fish_ID=1646 Date - 01/01/2004 00:01:00 Date - as.POSIXct(strptime(Date,format=%m/%d/%Y %H:%M:%S)) R2sqrt -100 Test - data.frame(Fish_ID=Fish_ID,Date=Date,R2sqrt=R2sqrt) write.csv(Test,file=Test) #Read in example file filenames=Test read.zoo(file=filenames, header = TRUE, FUN = as.chron, sep = ,, colClasses = c(NULL, NULL, character, numeric)) lapply(filenames, read.zoo, header = TRUE, FUN = as.chron, sep = ,, colClasses = c(NULL, NULL, character, numeric)) From: Gabor Grothendieck [ggrothendi...@gmail.com] Sent: 29 September 2010 00:09 To: Struve, Juliane Cc: r-help@r-project.org Subject: Re: [R] plotting multiple animal tracks against Date/Time On Tue, Sep 28, 2010 at 9:30 AM, Struve, Juliane j.str...@imperial.ac.uk wrote: Hi, in this self-contained example the file the same error message appears as when I read in my original results files. library (zoo) library(chron) #generate example data Fish_ID=1646 Date - 01/01/2004 00:01:00 Date - as.POSIXct(strptime(Date,format=%m/%d/%Y %H:%M:%S)) R2sqrt -100 #put into dataframe Test - data.frame(Fish_ID=Fish_ID,Date=Date,R2sqrt=R2sqrt) # write .csv file write.csv(Test,file=Test) #generate list of files filenames=Test #read file(s) into zoo object read.zoo(file=filenames, header = TRUE, FUN = as.chron, sep = ,, colClasses = c(NULL, NULL, character, numeric)) #works fine #read list of files into zoo.object lapply(filenames, read.zoo, header = TRUE, FUN = as.chron, sep = ,, colClasses = c(NULL, NULL, character, numeric))# error Error in strptime(x, format, tz = tz) : invalid 'x' argument Am I missing something ? Thank you for your time and patience. Self contained means anyone else can just copy your code and paste it into their session and see the error message you see. Its likely that your file does not contain what you think it does. -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] (OT) Change of email address
Hello, go to https://stat.ethz.ch/mailman/options/r-help/y...@email and if You do not remember Your password, use Password reminder (down) to mail it to Your address. Then login and do all needed changes. You can replace r-hep in link above with r-sig-phylo, r-announce and so on. And of course replace y...@email above with probable ted.hard...@wlandres.net. Have a nice day! Vojtěch Dne St 29. září 2010 10:58:22 ted.hard...@wlandres.net napsal(a): Apologies for bothering anyone who may not be interested in this, but some of you will, for instance, have my current email address in your address books, etc. As result of a new policy by Manchester University, retired former staff who no longer contribute actively to research in their former departments are to have their email accounts closed. The address from which I have been mailing to R-help (etc.): ted.hard...@manchester.ac.uk will therefore soon be closed down. I have set up (and subscribed to R-help, R-devel, R-announce) a new email address: ted.hard...@wlandres.net If you have kept a record of my current (manchester.ac.uk) address, please modify this to the new address. Messages sent to the current address will continue to arrive, for another week or two. With thanks, and best wishes to all, Ted. E-Mail: (Ted Harding) ted.hard...@wlandres.net Fax-to-email: +44 (0)870 094 0861 Date: 29-Sep-10 Time: 16:58:18 -- XFMail -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vojtěch Zeisek Department of Botany, Faculty of Science, Charles Uni., Prague, CZ Institute of Botany, Academy of Science, Czech Republic Community of the openSUSE GNU/Linux https://www.natur.cuni.cz/faculty-en?set_language=en http://www.ibot.cas.cz/?p=indexamp;site=en http://www.opensuse.org/ http://web.natur.cuni.cz/~zeisek/ signature.asc Description: This is a digitally signed message part. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R Spracheinstellung
G'day Tobias, On Wed, 29 Sep 2010 14:01:10 + Keller Tobias (kelleto0) kelle...@students.zhaw.ch wrote: Für unseren Statistik Unterricht brauchen wir das R-Programm. Ich habe das Programm heruntergeladen, deutsch gewählt und installiert. Das Problem ist nach 3mal neu installieren, dass das Programm immer auf englisch kommt. Können Sie mir helfen? R for Windows FAQ 3.2: http://cran.ms.unimelb.edu.au/bin/windows/base/rw-FAQ.html#I-want-R-in-English_0021 HTH. Cheers, Berwin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] 95% confidence intercal with glm
Dear List and Ben ( I apologise if this has been sent twice, but it is not showing in my sent folder and i have been having trouble with my email of late) Right, that makes sense, thanks The reason i used type= response was i wanted to convert the predicted probabilities to the response scale, as surely this is the scale at which a 95CI value is most useful for? I.e pp - predict(model1,se.fit=TRUE, type = response) 1 0.68 Probability of point 1 having a response of 1 = 0.68 - # this is above the cutoff therefore this has a response of 1 Then it would be of most use to get the 95CI on this scale to see if the probability straddles the cutoff value? Maybe i am missing something? Thanks On 29 Sep 2010, at 15:27, Ben Bolker wrote: On 10-09-29 10:04 AM, Sam wrote: Hi Ben and list, Sorry to be a pain! I have followed your code, and modified it - **You should not use type=response here.** The point is that the (symmetric) confidence intervals are computed on the link/linear predictor scale, and then inverse-link-transformed (in this case, along with the fitted values) -- they then become asymmetric. pp - predict(model1,se.fit=TRUE, type = response) etaframe - + with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - plogis(etaframe) pframe My response variable is 0 or 1, the probabilities are all high above my cut-off point of 0.445, i think this may have something to do with you may need to multiply by the binomial N as appropriate. However how do i multiply if my response is 0 or 1? if 'size=1' (i.e. Bernoulli trials) then you would be multiplying by 1, so don't bother. On 29 Sep 2010, at 13:44, Ben Bolker wrote: ## from ?glm counts - c(18,17,15,20,10,20,25,13,12) outcome - gl(3,1,9) treatment - gl(3,3) d.AD - data.frame(treatment, outcome, counts) glm.D93 - glm(counts ~ outcome + treatment, family=poisson, data=d.AD) ## predict on 'link' or 'linear predictor' scale, with SEs pp - predict(glm.D93,se.fit=TRUE) etaframe - with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - exp(etaframe) ## inverse-link ## picture pframe - as.data.frame(cbind(obs=d.AD$counts,pframe)) library(plotrix) plot(pframe$obs,ylim=c(5,30)) with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE)) If you're using a binomial model you need 'plogis' rather than 'exp' as your inverse link, and you may need to multiply by the binomial N as appropriate. On 10-09-29 06:07 AM, Sam wrote: I am looking to do the same but am a bit confused and apply the inverse link function for your model. i understand up to this point and i understand what this means, however i am unsure why it needs to be done and how you do it - i.e i use family=binomial is this wrong if i use this method to calculate 95% CI? Thanks Sam On 28 Sep 2010, at 14:50, Ben Bolker wrote: zozio32 remy.pascal at gmail.com writes: Hi I had to use a glm instead of my basic lm on some data due to unconstant variance. now, when I plot the model over the data, how can I easily get the 95% confidence interval that sormally coming from: yv - predict(modelVar,list(aveLength=xv),int=c) matlines(xv,yv,lty=c(1,2,2)) There is no interval argument to pass to the predict function when using a glm, so I was wondering if I had to use an other function You need to use predict with se=TRUE; construct the confidence intervals by computing predicted values +- 1.96 times the standard error returned; and apply the inverse link function for your model. If heteroscedasticity is your main problem, and not a specific (known) non-normal distribution, you might consider using the gls function from the nlme package with an appropriate 'weights' argument. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 95% confidence intercal with glm
[I'm a little confused: are you Sam Smith or Chris Mcowen ... ?] This is admittedly a bit confusing, but the best scale on which to compute standard errors is the link scale. It turns out (I hadn't realized this) that predict.glm does give you not-crazy answers when you ask for se.fit=TRUE on the response scale, in which case you can (a) just take +/- 1.96 SE around the response-scale fit and be done. However, these are less accurate than doing what I suggested ((b) computing fit +/- 1.96 SE on the link scale and inverse-link-transforming). Among other things, method a produces confidence intervals that are exactly symmetric (generally not true) and that are not necessarily bounded within the appropriate range -- for example, the lower 95% CI bound for females in the example below is slightly 0 ... ## extended example from ?predict.glm require(graphics) ## example from Venables and Ripley (2002, pp. 190-2.) ldose - rep(0:5, 2) numdead - c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16) sex - factor(rep(c(M, F), c(6, 6))) SF - cbind(numdead, numalive=20-numdead) budworm.lg - glm(SF ~ sex*ldose, family=binomial) summary(budworm.lg) ci.scale - 1.96 plot(c(1,32), c(0,1), type = n, xlab = dose, ylab = prob, log = x, las=1, bty=l) text(2^ldose, numdead/20, as.character(sex), col=c(red,black)[sex]) ld - seq(0, 5, 0.1) pM - predict(budworm.lg, data.frame(ldose=ld, sex=factor(rep(M, length(ld)), levels=levels(sex))), type = response,se.fit=TRUE) lines(2^ld, pM$fit ) matlines(2^ld,pM$fit+outer(pM$se.fit,ci.scale*c(-1,1)),lty=2,col=1) pF - predict(budworm.lg, data.frame(ldose=ld, sex=factor(rep(F, length(ld)), levels=levels(sex))), type = response,se.fit=TRUE) lines(2^ld, pF$fit,col=2) matlines(2^ld,pF$fit+outer(pF$se.fit,ci.scale*c(-1,1)),lty=2,col=2) pM_ex - predict(budworm.lg, data.frame(ldose=ld, sex=factor(rep(M, length(ld)), levels=levels(sex))), type = link,se.fit=TRUE) with(pM_ex, matlines(2^ld,plogis(fit+outer(se.fit,ci.scale*c(-1,1))),lty=3,col=1)) pF_ex - predict(budworm.lg, data.frame(ldose=ld, sex=factor(rep(F, length(ld)), levels=levels(sex))), type = link,se.fit=TRUE) with(pF_ex, matlines(2^ld,plogis(fit+outer(se.fit,ci.scale*c(-1,1))),lty=3,col=2)) legend(bottomright, c(prediction, +/- 2 SE (response scale), +/- 2 SE (link scale)), lty=1:3) On 10-09-29 10:38 AM, Chris Mcowen wrote: Right, that makes sense, thanks The reason i used type= response was i wanted to convert the predicted probabilities to the response scale, as surely this is the scale at which a 95CI value is most useful for? I.e pp - predict(model1,se.fit=TRUE, type = response) 1 0.68 Probability of point 1 having a response of 1 = 0.68 - # this is above the cutoff therefore this has a response of 1 Then it would be of most use to get the 95CI on this scale to see if the probability straddles the cutoff value? Maybe i am missing something? Thanks On 29 Sep 2010, at 15:27, Ben Bolker wrote: On 10-09-29 10:04 AM, Sam wrote: Hi Ben and list, Sorry to be a pain! I have followed your code, and modified it - **You should not use type=response here.** The point is that the (symmetric) confidence intervals are computed on the link/linear predictor scale, and then inverse-link-transformed (in this case, along with the fitted values) -- they then become asymmetric. pp - predict(model1,se.fit=TRUE, type = response) etaframe - + with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - plogis(etaframe) pframe My response variable is 0 or 1, the probabilities are all high above my cut-off point of 0.445, i think this may have something to do with you may need to multiply by the binomial N as appropriate. However how do i multiply if my response is 0 or 1? if 'size=1' (i.e. Bernoulli trials) then you would be multiplying by 1, so don't bother. On 29 Sep 2010, at 13:44, Ben Bolker wrote: ## from ?glm counts - c(18,17,15,20,10,20,25,13,12) outcome - gl(3,1,9) treatment - gl(3,3) d.AD - data.frame(treatment, outcome, counts) glm.D93 - glm(counts ~ outcome + treatment, family=poisson, data=d.AD) ## predict on 'link' or 'linear predictor' scale, with SEs pp - predict(glm.D93,se.fit=TRUE) etaframe - with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - exp(etaframe) ## inverse-link ## picture pframe - as.data.frame(cbind(obs=d.AD$counts,pframe)) library(plotrix) plot(pframe$obs,ylim=c(5,30)) with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE)) If you're using a binomial model you need 'plogis' rather than 'exp' as your inverse link, and
Re: [R] 95% confidence intercal with glm
Thats great thanks very much for your help On 29 Sep 2010, at 17:30, Ben Bolker wrote: [I'm a little confused: are you Sam Smith or Chris Mcowen ... ?] This is admittedly a bit confusing, but the best scale on which to compute standard errors is the link scale. It turns out (I hadn't realized this) that predict.glm does give you not-crazy answers when you ask for se.fit=TRUE on the response scale, in which case you can (a) just take +/- 1.96 SE around the response-scale fit and be done. However, these are less accurate than doing what I suggested ((b) computing fit +/- 1.96 SE on the link scale and inverse-link-transforming). Among other things, method a produces confidence intervals that are exactly symmetric (generally not true) and that are not necessarily bounded within the appropriate range -- for example, the lower 95% CI bound for females in the example below is slightly 0 ... ## extended example from ?predict.glm require(graphics) ## example from Venables and Ripley (2002, pp. 190-2.) ldose - rep(0:5, 2) numdead - c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16) sex - factor(rep(c(M, F), c(6, 6))) SF - cbind(numdead, numalive=20-numdead) budworm.lg - glm(SF ~ sex*ldose, family=binomial) summary(budworm.lg) ci.scale - 1.96 plot(c(1,32), c(0,1), type = n, xlab = dose, ylab = prob, log = x, las=1, bty=l) text(2^ldose, numdead/20, as.character(sex), col=c(red,black)[sex]) ld - seq(0, 5, 0.1) pM - predict(budworm.lg, data.frame(ldose=ld, sex=factor(rep(M, length(ld)), levels=levels(sex))), type = response,se.fit=TRUE) lines(2^ld, pM$fit ) matlines(2^ld,pM$fit+outer(pM$se.fit,ci.scale*c(-1,1)),lty=2,col=1) pF - predict(budworm.lg, data.frame(ldose=ld, sex=factor(rep(F, length(ld)), levels=levels(sex))), type = response,se.fit=TRUE) lines(2^ld, pF$fit,col=2) matlines(2^ld,pF$fit+outer(pF$se.fit,ci.scale*c(-1,1)),lty=2,col=2) pM_ex - predict(budworm.lg, data.frame(ldose=ld, sex=factor(rep(M, length(ld)), levels=levels(sex))), type = link,se.fit=TRUE) with(pM_ex, matlines(2^ld,plogis(fit+outer(se.fit,ci.scale*c(-1,1))),lty=3,col=1)) pF_ex - predict(budworm.lg, data.frame(ldose=ld, sex=factor(rep(F, length(ld)), levels=levels(sex))), type = link,se.fit=TRUE) with(pF_ex, matlines(2^ld,plogis(fit+outer(se.fit,ci.scale*c(-1,1))),lty=3,col=2)) legend(bottomright, c(prediction, +/- 2 SE (response scale), +/- 2 SE (link scale)), lty=1:3) On 10-09-29 10:38 AM, Chris Mcowen wrote: Right, that makes sense, thanks The reason i used type= response was i wanted to convert the predicted probabilities to the response scale, as surely this is the scale at which a 95CI value is most useful for? I.e pp - predict(model1,se.fit=TRUE, type = response) 1 0.68 Probability of point 1 having a response of 1 = 0.68 - # this is above the cutoff therefore this has a response of 1 Then it would be of most use to get the 95CI on this scale to see if the probability straddles the cutoff value? Maybe i am missing something? Thanks On 29 Sep 2010, at 15:27, Ben Bolker wrote: On 10-09-29 10:04 AM, Sam wrote: Hi Ben and list, Sorry to be a pain! I have followed your code, and modified it - **You should not use type=response here.** The point is that the (symmetric) confidence intervals are computed on the link/linear predictor scale, and then inverse-link-transformed (in this case, along with the fitted values) -- they then become asymmetric. pp - predict(model1,se.fit=TRUE, type = response) etaframe - + with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - plogis(etaframe) pframe My response variable is 0 or 1, the probabilities are all high above my cut-off point of 0.445, i think this may have something to do with you may need to multiply by the binomial N as appropriate. However how do i multiply if my response is 0 or 1? if 'size=1' (i.e. Bernoulli trials) then you would be multiplying by 1, so don't bother. On 29 Sep 2010, at 13:44, Ben Bolker wrote: ## from ?glm counts - c(18,17,15,20,10,20,25,13,12) outcome - gl(3,1,9) treatment - gl(3,3) d.AD - data.frame(treatment, outcome, counts) glm.D93 - glm(counts ~ outcome + treatment, family=poisson, data=d.AD) ## predict on 'link' or 'linear predictor' scale, with SEs pp - predict(glm.D93,se.fit=TRUE) etaframe - with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit)) pframe - exp(etaframe) ## inverse-link ## picture pframe - as.data.frame(cbind(obs=d.AD$counts,pframe)) library(plotrix) plot(pframe$obs,ylim=c(5,30)) with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE)) If you're using a
[R] Simplify, several Ave or aggregates
Hello. How can I write this all in one line? mydata is a zoo series, limit is a numeric vector of the same size tmp - ave(coredata(mydata),as.Date(index(mydata)),FUN = function(x) ( (cummax(x)-x )) ) tmp - (tmp limit) final - ave(coredata(tmp),as.Date(index(mydata)),FUN = function(x) cumprod( x) ) I've tried to use two vectors as argument to ave(...) but it seems to accept just one even if I join them into a maxtrix. This is just an example, but any other function could be use. Here I need to compare the value of cummax(mydata)-mydata with a numeric vector and once it surpasses it I'll keep zeros till the end of the day. The cummax is calculated from the beginning of each day. If limit were a single number instead of a vector (with different possible numbers) I could write it: ave(coredata(mydata),as.Date(index(mydata)),FUN = function(x) cumprod( (cummax(x)-x ) limit) ) But I can't introduce there a vector longer than x (with the length of each day) and I don't know how to introduce it as another argument in ave() cheers -- View this message in context: http://r.789695.n4.nabble.com/Simplify-several-Ave-or-aggregates-tp2719363p2719363.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R Spracheinstellung
On Sep 29, 2010, at 18:21 , Berwin A Turlach wrote: G'day Tobias, On Wed, 29 Sep 2010 14:01:10 + Keller Tobias (kelleto0) kelle...@students.zhaw.ch wrote: Für unseren Statistik Unterricht brauchen wir das R-Programm. Ich habe das Programm heruntergeladen, deutsch gewählt und installiert. Das Problem ist nach 3mal neu installieren, dass das Programm immer auf englisch kommt. Können Sie mir helfen? R for Windows FAQ 3.2: http://cran.ms.unimelb.edu.au/bin/windows/base/rw-FAQ.html#I-want-R-in-English_0021 For different values of en, of course (presumably de). And assuming Windows OS. -p HTH. Cheers, Berwin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Validation / Training - test data
It all depends on the ultimate use of the results. Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Validation-Training-test-data-tp2718523p2719370.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Ranked Set Sampling
Dear All; I have searched to see if any code in R was written to calculate mean and variance of a Ranked Set Sample, but did not find any. Is there any package for RSS, or kindly can somebody share a code he/she wrote, I am very grateful and willing to acknowledge that in my work. Thanks much Ahmed [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] move colorkey
On 2010-09-28 18:39, Marlin Keith Cox wrote: When using a wireframe, I need to move the colorkey from the right position (default0 towards the plot. I have also needed to adjust the height and used the code colorkey=list(T,space='right',height=.5) I have looked at documents (within levelplot) but cannot find a way to move the colorkey other than right, left, bottom and top. I do not understand corner interacts with x, y; unimplemented. Is this a way to place a colorkey. keith Try adding a layout.widths setting to your wireframe call: par.settings = list(layout.widths = list(key.right = 0.8)) You might then have to adjust the space on the right with par.settings = list(layout.widths = list(key.right = 0.8, right.padding = 2)) -Peter Ehlers __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] (OT) Change of email address
Ted is well aware of how to change his list email. He was advising people on the list who who have his old email address in their address books to remove it. On 09/29/2010 12:16 PM, Vojtěch Zeisek wrote: Hello, go to https://stat.ethz.ch/mailman/options/r-help/y...@email and if You do not remember Your password, use Password reminder (down) to mail it to Your address. Then login and do all needed changes. You can replace r-hep in link above with r-sig-phylo, r-announce and so on. And of course replace y...@email above with probable ted.hard...@wlandres.net. Have a nice day! Vojtěch Dne St 29. září 2010 10:58:22 ted.hard...@wlandres.net napsal(a): Apologies for bothering anyone who may not be interested in this, but some of you will, for instance, have my current email address in your address books, etc. As result of a new policy by Manchester University, retired former staff who no longer contribute actively to research in their former departments are to have their email accounts closed. The address from which I have been mailing to R-help (etc.): ted.hard...@manchester.ac.uk will therefore soon be closed down. I have set up (and subscribed to R-help, R-devel, R-announce) a new email address: ted.hard...@wlandres.net If you have kept a record of my current (manchester.ac.uk) address, please modify this to the new address. Messages sent to the current address will continue to arrive, for another week or two. With thanks, and best wishes to all, Ted. -- Kevin E. Thorpe Biostatistician/Trialist, Knowledge Translation Program Assistant Professor, Dalla Lana School of Public Health University of Toronto email: kevin.tho...@utoronto.ca Tel: 416.864.5776 Fax: 416.864.3016 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] drawing samples based on a matching variable
On Tue, 28 Sep 2010, L Brown wrote: Hi, everyone. I have what I hope will be a simple coding question. It seems this is a common job, but so far I've had trouble finding the answer in searches. I have two matrices (x and y) with a different number of observations in each. I need to draw a random sample without replacement of observations from x, and then, using a matching variable, draw a sample of equal size from y. It is the matching variable that is hanging me up. For example-- # example matrices. lets assume seed always equals 1. (lets also assume I have assigned variable names A and B to my columns..) set.seed(1) x-cbind(1:10,sample(1:5,10,rep=T)) x [A] [B] [1,]12 [2,]22 [3,]33 [4,]45 [5,]52 [6,]65 [7,]75 [8,]84 [9,]94 [10,] 101 Looks like set.seed(1) was invoked here, too. y-cbind(1:14,sample(1:5,14,rep=T)) y [A] [B] [1,]12 [2,]22 [3,]33 [4,]45 [5,]52 [6,]65 [7,]75 [8,]84 [9,]94 [10,] 101 [11,] 112 [12,] 121 [13,] 134 [14,] 142 #draw random sample of n=4 without replacement from matrix x. x.samp-x[sample(10,4,replace=F),] x.samp [A] [B] [1,]33 [2,]45 [3,]52 [4,]75 Next, I would need to draw four observations from matrix y (without replacement) so that the distribution of y$B is identical to x.samp$B. I'd appreciate any help, and sorry to post such a basic question! Break it down like this: x.levels - sort( unique(x[,2]) ) x.samp.tab - table( factor( x.samp[,2], x.levels ) ) y.rows - split(1:nrow(y),factor( y[,2], x.levels ) ) unlist( mapply( sample, y.rows, x.samp.tab ),use.names=FALSE ) In some cases sample might fail if length( y.rows[[i]] ) x.samp.tab[ i ] you can trace which element that was using 'traceback()' or write a wrapper for sample() that checks that condition. HTH, Chuck LB [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] eha aftreg overall p-value
Dear useRs, I am currently fitting an advanced failure time model using Göran Broström's excellent eha library with the aftreg command. My question: How do I interpret the Overall p-value, that is reported at the very bottom of the output? I already figured out it must be a chi-square test, but I am wondering what a p-value 0.01 means: Does it mean my model fits the actual data correctly OR does it mean my model DOES NOT fit the data correctly? In other words: Is it better to have a small or a large overall p-value? EXAMPLE OUTPUT: Events230 Total time at risk 3340 Max. log. likelihood -380.34 LR test statistic 16.7 Degrees of freedom1 Overall p-value 4.41898e-05 Any advice is highly appreciated!! Thanks Thomas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] subtraction based on two groups in a dataframe
Thanks for the help. Sharad On Mon, Sep 27, 2010 at 9:12 PM, Remko Duursma [via R] ml-node+2716469-935075351-6...@n4.nabble.comml-node%2b2716469-935075351-6...@n4.nabble.com wrote: Try something like this: dfr - read.table(textConnection(plate.id well.id Group HYB rlt1 1 P1 A1 Control SKOV3hyb 0.190 2 P1 A2 Control SKOV3hyb 0.210 3 P1 A3 Control SKOV3hyb 0.205 4 P1 A4 Control SKOV3hyb 0.206 5 P1 A5 Control SKOV3hyb 0.184 385 P1 A1ovca SKOV3hyb 0.184 386 P1 A2ovca SKOV3hyb 0.229 387 P1 A3ovca SKOV3hyb 0.214 388 P1 A4ovca SKOV3hyb 0.226 389 P1 A5ovca SKOV3hyb 0.217 )) difs - lapply(split(dfr,dfr$plate.id), function(x)x$rlt1[x$Group == Control] - x$rlt1[x$Group == ovca]) dfr$Diff - Reduce(c,difs) greetings, Remko -- View message @ http://r.789695.n4.nabble.com/subtraction-based-on-two-groups-in-a-dataframe-tp2716104p2716469.html To unsubscribe from subtraction based on two groups in a dataframe, click herehttp://r.789695.n4.nabble.com/template/TplServlet.jtp?tpl=unsubscribe_by_codenode=2716104code=c2JwdXJvaGl0QGdtYWlsLmNvbXwyNzE2MTA0fDU4ODg0MTYwOQ==. -- View this message in context: http://r.789695.n4.nabble.com/subtraction-based-on-two-groups-in-a-dataframe-tp2716104p2719364.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Table with different digit number
Thank you very much for your solution but it works only in a dataframe object. If I am using an ftable object, it doesn't run. I use, as a workaround, to fill with blank spaces the left of each number, so when I print the table, it appears aligned to right. But, obviously, this doesn't work for the HTML table, because HTML ignores multiple white spaces . Any suggestion? Thanks in advance, Nicola 2010/9/28 Henrique Dallazuanna www...@gmail.com Try this: df[1,] - as.character(df[1,]) On Tue, Sep 28, 2010 at 8:48 AM, Nicola Sturaro Sommacal (Quantide srl) mailingl...@sturaro.net wrote: Hi! I have a table representing both absolute and relative frequency, for example (code to get example data under the signature): ItalyGermany absolute100 105 relative 40.51 41.18 How can I print a different number of decimal digits? I try to transform to as.character, but cells result aligned to left and I don't like this solution. At the end of my work I need to export the table to HTML, so this can be do also with xtable package. Thanks in advance for your help. Nicola Sturaro Sommacal -- Quantide srl http://www.quantide.com This is the code to get the data.frame with the data above: df = data.frame(italy = c(100,40.51), germany = c(105, 41.18)) row.names(df) = c(absolute, relative) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] executing loop
Dear All, I am trying to define a loop for a m*n matrix, where i=1:n and j=1:m. Unlike the usual for loop, i should go in the following way: For j=1, i=1,2,3,n For j=2, i=n,n-1,n-2,..,1 For j=3, i=1,2,3,.n etc. which means i should go in either increasing or decreasing order alternatively. Can anyone please help me in doing this? Thanks, Cassie [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Converting a line by line program into an array to perform summary stats
I am trying to turn several lines of information into a variable. I used the filx function to input my file then the readlines to qualify what I want. Essentially I have data in a file every 10 minutes through a day for several years down a column: date time value 9/28/10 02:00 13 9/28/10 02:10 15 9/28/10 02:20 12 9/28/10 02:30 11 etc. I then use if statements to segment the day into various portions (2:00 to 2:30 for example) and find the average or maximum value for the time. Next I compare this value to another time of the day to see if the price was reached/broken. For example the 2:00 to 2:30 low from above is 11 at 2:30. I would then check the 8:00 to 13:00 time period to see if the 2:00 to 2:30 low occured from 8:00 to 13:00. If it did occur I would call this day a qualifier and check the time 2:30 price occured in the 8:00 to 13:00 period until the price at 13:00. So another example if 2:30 was 11, 8:00 was 13, 8:45 was 11, and 13:00 was 20 I would say this day qualified and the price from the am was first acheived at 8:45 in the 8:00 to 13:00 period and the change from that price to the 13:00 period was 20-11=+9. This example was all background to my problem. Using the filx/readlines methods I wind up with output for the days which qualify, but when I try to create an array variable using the total of each line for a given column (aka summarizing every day that my given criteria were met), something about having used the line by line method doesn't allow me to create an array variable. It only gives me the last value. The variables I declared to fidn the qualifying days are specific to each day vs across all days that qualified. Is it possible to convert this line by line data into an array wih some R code I don't know? I considered converting the data example above into a matrix, but then I run into tricky problems of shortened days and days missing data (the code I used for the readlines method enables me to specify the new days without this problem...I don't have to fill in blanks it just changes days when it hits a new day). Given how specific this problem is I can't find much on the R help logs. Perhaps outputting the line by line output to a file, then reinputting it using a read.csv method would enable me to achieve my desired result but then I have the laborious task of dumping into excel, converting file types, saving as csv, reuploading to R, writing a separate program, etc. Is there an easy fix to my problem? Thanks [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] executing loop
for (j in 1:n) { if (j%%2==0) { iRange = c(n:1) } else iRange = c(1:n) for (i in iRange) { your code } } Peter On Wed, Sep 29, 2010 at 10:40 AM, cassie jones cassiejone...@gmail.com wrote: Dear All, I am trying to define a loop for a m*n matrix, where i=1:n and j=1:m. Unlike the usual for loop, i should go in the following way: For j=1, i=1,2,3,n For j=2, i=n,n-1,n-2,..,1 For j=3, i=1,2,3,.n etc. which means i should go in either increasing or decreasing order alternatively. for (j in 1:n) { if (j%%2==0) { iRange = c(n:1) } else iRange = c(1:n) for (i in iRange) { your code } } Peter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Use R in Visual Basic Environment
Hello On Tue, Sep 28, 2010 at 1:39 PM, Soumen Pal soumen.4...@gmail.com wrote: I need your kind help regarding the following: I wish to know is there any way to use R in Visual Basic environment. I want to develop a VB application where R can be embedded (R will work as a back end statistical engine). If available, please provide me some source of study materials/articles available on the internet related to this. Have you tried searching 'r vba' on Google [1]? At least the second and third links seem relevant. I think you'll want to search for 'vba rexcel' on either Google or Rseek. Regards Liviu [1] http://www.google.com/search?hl=ensource=hpq=r+vbabtnG=Google+Searchaq=faqi=aql=oq=gs_rfai= __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Exponential integral
Is there an R function for evaluating the exponential integral Ei(x) = INT(-inf,x) exp(t)/t dt [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] resampling issue
I am trying to get R to resample my dataset of two columns of age and length data for fish. I got it to work, but it is not resampling every replicate. Instead, it resamples my data once and then repeated it 5 times. Here is my dataset of 9 fish samples with an age and length for each one: Age Length 2 200 5 450 6 600 7 702 8 798 5 453 4 399 1 120 2 202 Here is my code which resamples my data to produce up to 9 different samples and creates a new dataset of 12 samples: testdat-growth[sample(9,12,replace=T),] Now I want R to repeat this procedure 5 times. Here is my code: testdat2 - replicate(5, sample(testdat), simplify=F) testdat2 Here is my output showing that it did it once and then just repeated the values: testdat2 [[1]] Age Length 1 2200 9 2202 8 1120 5 8798 4 7702 6 5453 1.1 2200 4.1 7702 4.2 7702 5.1 8798 4.3 7702 6.1 5453 [[2]] Age Length 1 2200 9 2202 8 1120 5 8798 4 7702 6 5453 1.1 2200 4.1 7702 4.2 7702 5.1 8798 4.3 7702 6.1 5453 [[3]] Age Length 1 2200 9 2202 8 1120 5 8798 4 7702 6 5453 1.1 2200 4.1 7702 4.2 7702 5.1 8798 4.3 7702 6.1 5453 [[4]] Length Age 1 200 2 9 202 2 8 120 1 5 798 8 4 702 7 6 453 5 1.1200 2 4.1702 7 4.2702 7 5.1798 8 4.3702 7 6.1453 5 [[5]] Length Age 1 200 2 9 202 2 8 120 1 5 798 8 4 702 7 6 453 5 1.1200 2 4.1702 7 4.2702 7 5.1798 8 4.3702 7 6.1453 5 Any advice on how to get R to resample each time would be greatly appreciated? Mike __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Exponential integral
Mishra, Srikanta MISHRAS at battelle.org writes: Is there an R function for evaluating the exponential integral Ei(x) = INT(-inf,x) exp(t)/t dt Try the gsl package. Also library(sos); findFn({exponential integral}) although admittedly it doesn't find the Expint page in the gsl package ... __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] resampling issue
I think that you want: replicate(5, growth[sample(9,12,replace=T),], simplify = FALSE) On Wed, Sep 29, 2010 at 3:19 PM, Michael Larkin mlar...@rsmas.miami.eduwrote: I am trying to get R to resample my dataset of two columns of age and length data for fish. I got it to work, but it is not resampling every replicate. Instead, it resamples my data once and then repeated it 5 times. Here is my dataset of 9 fish samples with an age and length for each one: Age Length 2 200 5 450 6 600 7 702 8 798 5 453 4 399 1 120 2 202 Here is my code which resamples my data to produce up to 9 different samples and creates a new dataset of 12 samples: testdat-growth[sample(9,12,replace=T),] Now I want R to repeat this procedure 5 times. Here is my code: testdat2 - replicate(5, sample(testdat), simplify=F) testdat2 Here is my output showing that it did it once and then just repeated the values: testdat2 [[1]] Age Length 1 2200 9 2202 8 1120 5 8798 4 7702 6 5453 1.1 2200 4.1 7702 4.2 7702 5.1 8798 4.3 7702 6.1 5453 [[2]] Age Length 1 2200 9 2202 8 1120 5 8798 4 7702 6 5453 1.1 2200 4.1 7702 4.2 7702 5.1 8798 4.3 7702 6.1 5453 [[3]] Age Length 1 2200 9 2202 8 1120 5 8798 4 7702 6 5453 1.1 2200 4.1 7702 4.2 7702 5.1 8798 4.3 7702 6.1 5453 [[4]] Length Age 1 200 2 9 202 2 8 120 1 5 798 8 4 702 7 6 453 5 1.1200 2 4.1702 7 4.2702 7 5.1798 8 4.3702 7 6.1453 5 [[5]] Length Age 1 200 2 9 202 2 8 120 1 5 798 8 4 702 7 6 453 5 1.1200 2 4.1702 7 4.2702 7 5.1798 8 4.3702 7 6.1453 5 Any advice on how to get R to resample each time would be greatly appreciated? Mike __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] spplot cuts
Hmmm. Maybe a documentation typo in ?spplot. If you follow the documentation through to ?levelplot, you find that cuts: number of levels the range of ‘z’ would be divided into (no mention of actual breakpoints) but: at: numeric vector giving breakpoints along the range of ‘z’. Contours (if any) will be drawn at these heights, and the regions in between would be colored using ‘col.regions’. In the latter case, values outside the range of ‘at’ will not be drawn at all. This serves as a way to limit the range of the data shown, similar to what a ‘zlim’ argument might have been used for. However, this also means that when supplying ‘at’ explicitly, one has to be careful to include values outside the range of ‘z’ to ensure that all the data are shown. So you might try 'at' instead of 'cuts'. -- View this message in context: http://r.789695.n4.nabble.com/spplot-cuts-tp2716237p2719533.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Converting a line by line program into an array to perform summary stats
George Coyle wrote: I am trying to turn several lines of information into a variable. I used the filx function to input my file then the readlines to qualify what I want. Essentially I have data in a file every 10 minutes through a day for several years down a column: date time value 9/28/10 02:00 13 9/28/10 02:10 15 9/28/10 02:20 12 9/28/10 02:30 11 etc. I then use if statements to segment the day into various portions (2:00 to 2:30 for example) and find the average or maximum value for the time. Next . George, it is very difficult to understand what you want, so let' try to chop it down. Let's say you want to find the best way to do some time of processing on what generally called a time series. First, reading this into a matrix will not work, because you have three different column types. The standard workhorse for this type of data is a dataframe, and read.csv gives a data frame. Next, note that your structure is even simpler: you have a time series, probably not an regular one, because I assume your stock quotes are not available all the time with equal intervals, but will stop at night. So the general idea would be: read in a data frame with read.csv convert to a time series. See package zoo for many examples do the processing (which I do not understand) on the time series Dieter -- View this message in context: http://r.789695.n4.nabble.com/Converting-a-line-by-line-program-into-an-array-to-perform-summary-stats-tp2719489p2719542.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] String split and concatenation
paste( '(', paste( ', rep(letters[1:3],2), ', sep=, collapse=','), ')', sep= ) [1] ('a','b','c','a','b','c') If you need the space after the comma then just change ',' to ', '. The outer paste can be replaced with sprintf (and that may be more readable). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Steven Kang Sent: Wednesday, September 29, 2010 2:16 AM To: bill.venab...@csiro.au Cc: r-help@r-project.org Subject: Re: [R] String split and concatenation x - rep(letters[1:3], 2) Are there any ways to transform assign the above as the one shown below to an object? (in exact format; i.e length of 1 class of character), i.e x ('a', 'b', 'c', 'a', 'b', 'c') Highly appreciate for any advice. On Wed, Sep 29, 2010 at 3:33 PM, bill.venab...@csiro.au wrote: dump(x, file = x.R) file.show(x.R) will get you most of the way. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Steven Kang Sent: Wednesday, 29 September 2010 3:11 PM To: r-help@r-project.org Subject: [R] String split and concatenation Hi R users, I desire to transform the following vector consisting of repeated characters x - rep(letters, 3) into this exact format (i.e a single string containing each characters in quotation mark separated by comma between each; al ). (a, b, c, d, a, b, c, d, ..., a, b, c, d, .z) Any advice would be much appreciated. -- Steven [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r- project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Steven [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] New package GAD (General ANOVA Design)
Hi, everyone. GAD package analyses complex ANOVA models with any combination of orthogonal/nested and fixed/random factors, as described by Underwood (1997). There are two restrictions: (i) data must be balanced; (ii) fixed nested factors are not allowed. Homogeneity of variances is checked using Cochran's C test and a posteriori comparisons of means are done using Student-Newman-Keuls (SNK) procedure. Best, Maurício G. Camargo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix plot
Thanks, Thats great just what I was trying to do. HH Thomas Stewart wrote: HH- I'm not familiar with the plots you mention, but the following is a quick attempt to create the plot you describe. data-data.frame( org=1:10, q1=sample(1:10,replace=T), q2=sample(1:10,replace=T), q3=sample(1:10,replace=T)) # This generates a random data set like the one you describe. It looks like this: # org q1 q2 q3 #11 9 1 4 #22 1 2 1 #33 1 3 7 #44 10 9 6 # ... n-ncol(data)-1 # NUMBER OF ORGANIZATIONS new.mat-apply(data[,-1],2,function(x){xtabs(~factor(x,levels=1:10))}) plot(rep(1:n,each=10),rep(1:10,n), pch=16, cex=c(new.mat), xaxt=n, xlab=Question, ylab=Score) axis(1,at=1:n) Instead of cex=c(new.mat), you can supply a function of new.mat, say cex=f(new.mat) to specify the size of the dots. Hope that helps. -tgs On Wed, Sep 29, 2010 at 6:55 AM, hairryharry hairryha...@tesco.net mailto:hairryha...@tesco.net wrote: Hi, Fairly new to R - have done basic plots but now faced with plotting a matrix/table of results -I know what I want but cannot find out how to do it. Basically have individual questions ( x) to which an organization can rate themselves 1-10 (y) what I want to show is a matrix/density type plot (like the matrix corrolation plots I have seen on R graph site) showing for eac question (x) a circle or shape of varying size and/or colour to represent the number of organisations rating themselves for each value of y. Hope this makes sense, any advice would be gratefully received. HH __ R-help@r-project.org mailto:R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Trying to avoid loop structure
Dear R-helpers, I'm trying to associate linear coefficients (intercept and slope) to tens of thousands of observations based on a table with benchmark values. #Example - Value table and their corresponding coefficients (intercept and slope) coef = data.frame(cbind(st=c(1:5),b = runif(5,0.3,5),a = seq(0.5,5,1)))print(coef) #Example of observations to be computedobs = runif(20,1,5)print(obs) #This should be fairly simple - but I can't get the loop structure out of my head. I don't know how to tell the software if the observation is greater than st[i-1] but smaller than st[i], use a[i] and b[i] - otherwise go to the next line.So far, I've tried this: for(t in 1:length(obs)) { for(i in 1:length(coef$st)) if(coef$st[i-1] obs[t] coef$st[i]) x[t] = coef$b[i] + (coef.$a[i] * obs[t]) }# Doesn't work - if statement do not accept double conditions structure (y[-1] x y) and I should put an else statement - but i'm stuck there. I have the feeling I should look into lapply but it doesn't seem like lapply works with if statements. Any tips would be much appreciated, Thanks, Mathieu Beaulieu, Soil, Water Environment Laboratory Institute for Resources, Environment and Sustainability University of British-Columbia 429-2202 Main Mall, Vancouver, BC, V6T 1Z4 Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Trying to avoid loop structure
Mathieu - First of all, you can combine as many conditions as you want in an if statement, using (and) and || (or). So to say coef$st[i-1] obs[t] coef$st[i] use coef$st[i-1] obs[t] obs[t] coef$st[i] So following your logic, you could use x = numeric(length(obs)) for(t in 1:length(obs)) { for(i in 1:length(coef$st)) if(coef$st[i-1] obs[t] obs[t] coef$st[i]) x[t] = coef$b[i] + (coef$a[i] * obs[t]) } Alternatively, you could solve the problem using R: mapply(function(o,i)coef$b[i] + coef$a[i] * o, obs, cut(obs,c(coef$st,6),labels=FALSE) + 1) - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Wed, 29 Sep 2010, Mathieu Beaulieu wrote: Dear R-helpers, I'm trying to associate linear coefficients (intercept and slope) to tens of thousands of observations based on a table with benchmark values. #Example - Value table and their corresponding coefficients (intercept and slope) coef = data.frame(cbind(st=c(1:5),b = runif(5,0.3,5),a = seq(0.5,5,1)))print(coef) #Example of observations to be computedobs = runif(20,1,5)print(obs) #This should be fairly simple - but I can't get the loop structure out of my head. I don't know how to tell the software if the observation is greater than st[i-1] but smaller than st[i], use a[i] and b[i] - otherwise go to the next line.So far, I've tried this: for(t in 1:length(obs)) { for(i in 1:length(coef$st)) if(coef$st[i-1] obs[t] coef$st[i]) x[t] = coef$b[i] + (coef.$a[i] * obs[t]) }# Doesn't work - if statement do not accept double conditions structure (y[-1] x y) and I should put an else statement - but i'm stuck there. I have the feeling I should look into lapply but it doesn't seem like lapply works with if statements. Any tips would be much appreciated, Thanks, Mathieu Beaulieu, Soil, Water Environment Laboratory Institute for Resources, Environment and Sustainability University of British-Columbia 429-2202 Main Mall, Vancouver, BC, V6T 1Z4 Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] A problem with plotting a long expression in ylab ?
Hi On 29/09/2010 8:17 p.m., Tal Galili wrote: My honor. A short question: if there is something in the device that is sensitive to the overlapping of the text, then is it possible to add a warning massage output when the length of the text is longer then the device dimensions? The graphics engine does a lot of this sort of clipping (not just text), but all of it is silent. You certainly wouldn't want to print all clipping warnings to screen and there is no current plan to return clipping information from the graphics engine. The current expectation is that the interested useR/developeR will use the R-level tools, such as strwidth(), to check for overlap, etc. Paul With much respect, Tal Contact Details:--- Contact me: tal.gal...@gmail.com mailto:tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com http://www.talgalili.com (Hebrew) | www.biostatistics.co.il http://www.biostatistics.co.il (Hebrew) | www.r-statistics.com http://www.r-statistics.com (English) -- On Wed, Sep 29, 2010 at 1:24 AM, Paul Murrell p.murr...@auckland.ac.nz mailto:p.murr...@auckland.ac.nz wrote: Hi It is a bug. A fix has been committed. Thanks for the report! Paul On 29/09/2010 10:15 a.m., Ben Bolker wrote: Barry Rowlingsonb.rowlingsonat lancaster.ac.uk http://lancaster.ac.uk writes: My point is that in regular text, ylab plots it where it then goes outside the borders. With the use of expressions - the text just doesn't show up. Originally I thought it was because of my miss-use of expressions, until I figured it was the level of cex.lab I was using. The problem is that when you can't see the text, you don't have a sense of how much to decrease the cex.lab so the text will fit. I hope I was now clearer. Gotcha. Seems to only affect ylab though. Do this: t = expression(paste(test loo(% of 360 *degree, ))) plot(1,xlab=t,ylab=t,main=t) then if I shrink my graphics window I can make the ylab disappear but not the xlab or title. Seems to affect any rotated expressions: plot(1) text(1,1,t,srt=90) text(1,1,t,srt=0) text(1,1,t,srt=45) Now shrink window and watch the rotated expressions vanish! They disappear when they start (or finish) out of the entire graphics device, not the plot region... I cant find anything relating to clipping in the help, and I am on Linux, so see if there's any news about it, try it with R-patched or R-devel and then report a bug after having read all the other stuff about R bug reporting! Barry I don't claim to understand it, but there is something quite fundamental about the properties of the X11() graphics device in R that makes labels that would otherwise overlap, disappear -- if you do 'extreme resizing' with the graphics above, you can see that otherwise-overlapping x- and y-axis tick labels disappear as the graph gets scrunched. This is (apparently) true of X11 graphics on MacOS as well -- Quartz window has a different behavior. Trying with pdf() as well -- for height=2, width=2, only 1 y-axis and 2 x-axis tick labels survive, *but* the x and y labels and the title are all still present (but clipped, of course). [Hmmm. Take my reports above with a grain of salt, I wasn't always using expression()s.] So I would guess that if you reported this as a bug you would be told that it was a poorly documented property of R's X11 graphics model, rather than a bug ... I have no idea where to start looking for more information about what defines this behavior -- if I were desperate to know I would probably try asking Paul Murrell ... I would be very interested to see this discussed on r-devel, if anyone bit ... Ben Bolker __ R-help@r-project.org mailto:R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dr Paul Murrell Department of Statistics The
[R] Understanding linear contrasts in Anova using R
#I am trying to understand how R fits models for contrasts in a #simple one-way anova. This is an example, I am not stupid enough to want #to simultaneously apply all of these contrasts to real data. With a few #exceptions, the tests that I would compute by hand (or by other software) #will give the same t or F statistics. It is the contrast estimates that R produces #that I can't seem to understand. # # In searching for answers to this problem, I found a great PowerPoint slide (I think by John Fox). # The slide pointed to the coefficients, said something like these are coeff. that no one could love, and #then suggested looking at the means to understand where they came from. I have stared # and stared at his means and then my means, but can't find a relationship. # The following code and output illustrates the problem. # Various examples of Anova using R dv - c(1.28, 1.35, 3.31, 3.06, 2.59, 3.25, 2.98, 1.53, -2.68, 2.64, 1.26, 1.06, -1.18, 0.15, 1.36, 2.61, 0.66, 1.32, 0.73, -1.06, 0.24, 0.27, 0.72, 2.28, -0.41, -1.25, -1.33, -0.47, -0.60, -1.72, -1.74, -0.77, -0.41, -1.20, -0.31, -0.74, -0.45, 0.54, -0.98, 1.68, 2.25, -0.19, -0.90, 0.78, 0.05, 2.69, 0.15, 0.91, 2.01, 0.40, 2.34, -1.80, 5.00, 2.27, 6.47, 2.94, 0.47, 3.22, 0.01, -0.66) group - factor(rep(1:5, each = 12)) # Use treatment contrasts to compare each group to the first group. options(contrasts = c(contr.treatment,contr.poly)) # The default model2 - lm(dv ~ group) summary(model2) # Summary table is the same--as it should be # Intercept is Group 1 mean and other coeff. are deviations from that. # This is what I would expect. #summary(model1) # Df Sum Sq Mean Sq F valuePr(F) # group4 62.46 15.6151 6.9005 0.0001415 *** # Residuals 55 124.46 2.2629 #Coefficients: #Estimate Std. Error t value Pr(|t|) #(Intercept) 1.802500.43425 4.151 0.000116 *** #group2 -1.127500.61412 -1.836 0.071772 . #group3 -2.715000.61412 -4.421 4.67e-05 *** #group4 -1.258330.61412 -2.049 0.045245 * #group5 0.086670.61412 0.141 0.888288 # Use sum contrasts to compare each group against grand mean. options(contrasts = c(contr.sum,contr.poly)) model3 - lm(dv ~ group) summary(model3) # Again, this is as expected. Intercept is grand mean and others are deviatoions from that. #Coefficients: # Estimate Std. Error t value Pr(|t|) # (Intercept) 0.7997 0.1942 4.118 0.000130 *** # group11.0028 0.3884 2.582 0.012519 * # group2 -0.1247 0.3884 -0.321 0.749449 # group3 -1.7122 0.3884 -4.408 4.88e-05 *** # group4 -0.2555 0.3884 -0.658 0.513399 #SO FAR, SO GOOD # IF I wanted polynomial contrasts BY HAND I would use #a(i) = -2 -1 0 1 2 for linear contrast(or some linear function of this ) #Effect = Sum(a(j)M(i))# where M = mean #Effect(linear) = -2(1.805) -1(0.675) +0(-.912) +1(.544) +2(1.889) = 0.043 #SS(linear) = n*(Effect(linear)^2)/Sum((a(j)^2)) = 12(.043)/10 = .002 #F(linear) = SS(linear)/MS(error) = .002/2.263 = .001 #t(linear) = sqrt(.001) = .031 # To do this in R I would use order.group - ordered(group) model4 - lm(dv~order.group) summary(model4) # This gives: #Coefficients: # Estimate Std. Error t value Pr(|t|) #(Intercept)0.799670.19420 4.118 0.000130 *** #order.group.L 0.013440.43425 0.031 0.975422 #order.group.Q 2.135190.43425 4.917 8.32e-06 *** #order.group.C 0.110150.43425 0.254 0.800703 #order.group^4 -0.796020.43425 -1.833 0.072202 . # The t value for linear is same as I got (as are others) but I don't understand # the estimates. The intercept is the grand mean, but I don't see the relationship # of other estimates to that or to the ones I get by hand. # My estimates are the sum of (coeff times means) i.e. 0 (intercept), .0425, 7.989, .3483, -6.66 # and these are not a linear (or other nice pretty) function of est. from R. # Just as a check, I forced intercept through zero with with deviation scores or -1 in model. # Now force intercept to 0 by using deviation scores devdv - dv-mean(dv) model5 - lm(devdv~order.group) summary(model5) #Same as above except intercept = 0 # Now do it by removing the intercept model6 - lm(dv~order.group -1) summary(model6) # Weird because coeff = cell means # Estimate Std. Error t value Pr(|t|) #order.group1 1.8025 0.4342 4.151 0.000116 *** #order.group2 0.6750 0.4342 1.554 0.125824 #order.group3 -0.9125 0.4342 -2.101 0.040208 * #order.group4 0.5442 0.4342 1.253 0.215464 #order.group5 1.8892 0.4342 4.350 5.94e-05 *** # BUT note that row labels would suggest that we were not solving for polynomials, # but the contrast matrix is made