Re: [R] Matrix question: obtaining the square root of a positive definite matrix?
On Wed, 24 Jan 2007, gallon li wrote:

> I want to compute B=A^{1/2} such that B*B=A.

According to your subject line, A is positive definite and hence symmetric? The usual definition of a matrix square root involves a transpose, e.g. B'B = A. There are many square roots: were you looking for a symmetric one?

For such an A,

    e <- eigen(A)
    V <- e$vectors
    V %*% diag(e$values) %*% t(V)

recovers A (up to rounding errors), and

    B <- V %*% diag(sqrt(e$values)) %*% t(V)

is such that B %*% B = A. Even that is not unique; e.g. -B is an equally good answer.

[The original question, with A = b and B = a, it seems:]

> a <- matrix(c(1,.2,.2,.2,1,.2,.2,.2,1), ncol=3)
> a
>      [,1] [,2] [,3]
> [1,]  1.0  0.2  0.2
> [2,]  0.2  1.0  0.2
> [3,]  0.2  0.2  1.0
> a %*% a
>      [,1] [,2] [,3]
> [1,] 1.08 0.44 0.44
> [2,] 0.44 1.08 0.44
> [3,] 0.44 0.44 1.08
> b <- a %*% a
>
> I have tried to use singular value decomposition:
>
> c <- svd(b)
> c$u %*% diag(sqrt(c$d))
>            [,1]          [,2]       [,3]
> [1,] -0.8082904  2.043868e-18  0.6531973
> [2,] -0.8082904 -5.656854e-01 -0.3265986
> [3,] -0.8082904  5.656854e-01 -0.3265986
>
> This does not come close to the original a. Can anybody on this forum
> enlighten me on how to get a, which is the square root of b?

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
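The eigendecomposition recipe above can be checked end to end on the poster's 3x3 example. A minimal sketch: since b = a %*% a and a is itself symmetric positive definite, the symmetric root B recovers a (up to rounding).

```r
# symmetric square root via eigendecomposition (sketch of the recipe above)
a <- matrix(c(1, .2, .2, .2, 1, .2, .2, .2, 1), ncol = 3)
b <- a %*% a                 # the matrix whose square root we want
e <- eigen(b)                # b is symmetric, so eigen() gives orthonormal vectors
V <- e$vectors
B <- V %*% diag(sqrt(e$values)) %*% t(V)
all.equal(B %*% B, b)        # TRUE up to rounding
all.equal(B, a)              # TRUE: the unique symmetric positive definite root
```

The poster's svd(b) attempt fails because it omits the right factor: c$u %*% diag(sqrt(c$d)) %*% t(c$v) would be needed for a symmetric b.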
Re: [R] Vector to Matrix transformation
How to suppress the recycling of items in a matrix... instead, NA can be filled?

-----Original Message-----
From: Chuck Cleland
Sent: Tuesday, January 23, 2007 8:00 PM
To: Shubha Vishwanath Karanth
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Vector to Matrix transformation

Shubha Vishwanath Karanth wrote:
> Hi R, I have a vector V1 of unknown length, say n. I need to convert this
> into a matrix C with 5 rows, and the number of columns should be set
> accordingly. I tried:
>
>     C = as.matrix(V1, 5, n/5)
>
> But it is not working... Could somebody help me on this?
> Thanks in advance...

You could try the following:

    matrix(V1, nrow=5)

but note what happens when the length of V1 is not a multiple of 5.

--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
[R] Checking for the existence of an R object
Hi, Is there any way to check whether an R object exists or not? Say, for example, a data frame.

Thanks,
Shubha
Re: [R] Checking for the existence of an R object
see: ?exists

HTH.

On 1/24/07, Shubha Vishwanath Karanth wrote:
> Hi, Is there any way to check whether an R object exists or not? Say,
> for example, a data frame. Thanks, Shubha
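A short sketch of ?exists in use (the object names below are made up for illustration; note that exists() takes the name as a character string):

```r
df <- data.frame(x = 1:3)
exists("df")                                # TRUE: note the quoted name
exists("no_such_object")                    # FALSE
# to insist that the existing object really is a data frame:
exists("df") && is.data.frame(get("df"))    # TRUE
```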
Re: [R] Checking for the existence of an R object
Thanks all of you...

-----Original Message-----
From: talepanda
Sent: Wednesday, January 24, 2007 2:10 PM
To: Shubha Vishwanath Karanth
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Checking for the existence of an R object

> see: ?exists
>
> HTH.
Re: [R] Matrix question: obtaining the square root of a positive definite matrix?
Prof Brian Ripley wrote:
> On Wed, 24 Jan 2007, gallon li wrote:
>> I want to compute B=A^{1/2} such that B*B=A.
> According to your subject line, A is positive definite and hence
> symmetric? The usual definition of a matrix square root involves a
> transpose, e.g. B'B = A. There are many square roots: were you looking
> for a symmetric one? If not, Choleski decomposition by chol() is often
> the expedient way.
> For such an A,
>     e <- eigen(A)
>     V <- e$vectors
>     V %*% diag(e$values) %*% t(V)
> recovers A (up to rounding errors), and
>     B <- V %*% diag(sqrt(e$values)) %*% t(V)
> is such that B %*% B = A. Even that is not unique; e.g. -B is an
> equally good answer.

...and you can flip the sign of the individual root eigenvalues too, and if the eigenvalues are not unique, you can rotate the eigenspace coordinate systems at will and then flip signs.

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])        FAX: (+45) 35327907
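For the non-symmetric chol() route mentioned in the quote, a minimal sketch with a made-up 2x2 positive definite matrix (R's chol() returns the upper-triangular factor):

```r
A <- matrix(c(4, 2, 2, 3), ncol = 2)   # small positive definite matrix
U <- chol(A)                            # upper triangular, with t(U) %*% U == A
all.equal(t(U) %*% U, A)                # TRUE, but U itself is not symmetric
```

This gives a square root in the B'B = A sense, which is often all that is needed (e.g. for generating correlated random numbers).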
Re: [R] Vector to Matrix transformation
non-elegant solution:

    matrix(c(V1, rep(NA, -length(V1) %% 5)), nrow=5)

HTH.

On 1/24/07, Shubha Vishwanath Karanth wrote:
> How to suppress the recycling of items in a matrix... instead, NA can
> be filled?
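The idea is to pad V1 with just enough NAs to reach a multiple of 5 before reshaping; in R, -length(V1) %% 5 is exactly the number of missing cells (R's %% returns a non-negative result here). A quick sketch:

```r
V1 <- 1:7                                        # length is not a multiple of 5
m <- matrix(c(V1, rep(NA, -length(V1) %% 5)), nrow = 5)
m
#      [,1] [,2]
# [1,]    1    6
# [2,]    2    7
# [3,]    3   NA
# [4,]    4   NA
# [5,]    5   NA
```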
[R] Date variable
Dear R users, I did the following with a date variable:

    library(date)
    date = "03/11/05"
    date = as.Date(date, format="%m/%d/%y")
    date
    [1] "2005-03-11"
    s = vector(length=3)
    s[1] = date
    s[1]
    [1] 12853

But here I got s[1] as 12853, which is not what I want. I need s[1] as the original date. Can anyone tell me where the mistake is?

Thanks and regards,
Re: [R] Date variable
For the Date class, the *original* (that is, the contents in memory) is 12853, and "2005-03-11" is one expression of that original. So you have to convert from the original to the character expression, as follows:

    s[1] <- format(date)
    s
    [1] "2005-03-11" "FALSE"      "FALSE"

    s[1] <- as.character(date)
    s
    [1] "2005-03-11" "FALSE"      "FALSE"

BTW, I think

    s = vector("character", length=3)

is preferable for your purpose.

HTH.

On 1/24/07, stat stat wrote:
> But here I got s[1] as 12853, which is not what I want. I need s[1] as
> the original date. Can anyone tell me where the mistake is?
[R] Matrix subsetting {was ... vectorized nested loop...}
Hi Jose,

I'm answering your second batch of questions, since Chuck Berry has already well answered the first one.

>>>>> "Jose" == Jose Quesada <[EMAIL PROTECTED]>
>>>>>     on Tue, 23 Jan 2007 21:46:27 +0100 writes:

    Jose> # example
    Jose> library(Matrix)
    Jose> x <- as(x, "CsparseMatrix")

[..]

    Jose> Also, I have noticed that getting a row from a Matrix object
    Jose> produces a normal array (i.e., it does not inherit the Matrix
    Jose> class).

This is very much on purpose, following the principle of least surprise, so I'm surprised you're surprised... :  The 'Matrix' behavior has been modelled to follow the more than 20 years old 'matrix' behavior:

    > matrix(1:9, 3)[, 2]
    [1] 4 5 6
    > matrix(1:9, 3)[, 2, drop=FALSE]
         [,1]
    [1,]    4
    [2,]    5
    [3,]    6
    > library(Matrix)
    Loading required package: lattice
    > Matrix(1:9, 3)[, 2]
    [1] 4 5 6
    > Matrix(1:9, 3)[, 2, drop = FALSE]
    3 x 1 Matrix of class "dgeMatrix"
         [,1]
    [1,]    4
    [2,]    5
    [3,]    6

But then I should not be surprised, because there has been the R FAQ 7.5, "Why do my matrices lose dimensions?", for quite a while. *And* I think that there is only one thing in the S language about which every knowledgable one agrees that it's a design bug, and that's the fact that 'drop = TRUE' is the default, and not 'drop = FALSE' {but it's not possible to change now, please don't start that discussion!}.

Given what I say above, I wonder if our (new-style) 'Matrix' objects should not behave differently than (old-style) 'matrix' and indeed use a default 'drop = FALSE'. This might break some Matrix-based code though, but then 'Matrix' is young enough, and working Matrix indexing is much younger, and there are only about 4 CRAN/Bioconductor packages depending on 'Matrix'. -- This discussion (about changing this behavior in the Matrix package) should definitely be led on the R-devel mailing list -- CC'ing to R-devel {hence one (but please *only* one!) cross-post}.

    Jose> However, selecting >1 rows does produce a same-class matrix.
    Jose> If I convert the output of selecting one row with as(), am I
    Jose> losing performance? Is there any way to make the resulting
    Jose> vector be a 1-D Matrix object?

yes:  , drop = FALSE , see above.

Martin
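The same `, drop = FALSE` idiom applies to the sparse classes as well; a small sketch (assuming the Matrix package, which ships with R as a recommended package, is available):

```r
library(Matrix)
M <- Matrix(c(1, 0, 0, 2, 0, 3), nrow = 2, sparse = TRUE)  # 2 x 3 sparse matrix
r1 <- M[1, ]                  # plain numeric vector: the Matrix class is dropped
r2 <- M[1, , drop = FALSE]    # stays a 1 x 3 sparse Matrix
class(r1)
class(r2)
```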
[R] how to change the dataframe labels' ?
I import a dataframe composed of 2 variables and 14000 cases. Now I need the labels of the cases sorted by the second variable V2, but if I sort the dataframe according to the second variable:

    mydataframe <- mydataframe[order(mydataframe$V2), ]

I notice that the labels are always the same (that is, not ordered by V2). How do I change them? I tried:

    labels(mydataframe) <- 1:14000

but it doesn't work.

Thanks
domenico
Re: [R] solving a structural equation model using sem or other package
This is an extract from the sem help page, which deals with your situation:

  S: covariance matrix among observed variables; may be input as a
     symmetric matrix, or as a lower- or upper-triangular matrix. S may
     also be a raw (i.e., "uncorrected") moment matrix -- that is, a
     sum-of-squares-and-products matrix divided by N. This form of input
     is useful for fitting models with intercepts, in which case the
     moment matrix should include the mean square and cross-products for
     a unit variable all of whose entries are 1; of course, the raw mean
     square for the unit variable is 1. Raw-moment matrices may be
     computed by raw.moments.

On 24/01/07, Daniel Nordlund wrote:
> I am trying to work my way through the book Singer, JD and Willett, JB,
> "Applied Longitudinal Data Analysis", Oxford University Press, 2003,
> using R. I have the SAS code and S-Plus code from the UCLA site (which
> doesn't include chapter 8 or later problems). In chapter 8 there is a
> structural equation/path model which can be specified for the sem
> package as follows:
>
>     S <- cov(al2)   # al2 contains the variables alc1, alc2, alc3, and cons
>     N <- 1122
>     modelA.ram <- specify.model()
>     f1 -> alc1,    NA,  1
>     f1 -> alc2,    NA,  1
>     f1 -> alc3,    NA,  1
>     f2 -> alc1,    NA,  0
>     f2 -> alc2,    NA,  .75
>     f2 -> alc3,    NA,  1.75
>     cons -> f1,    p0,  1
>     cons -> f2,    p1,  1
>     alc1 <-> alc1, u1,  1
>     alc2 <-> alc2, u2,  1
>     alc3 <-> alc3, u3,  1
>     cons <-> cons, u4,  1
>     f1 <-> f1,     s1,  1
>     f2 <-> f2,     s2,  1
>     f1 <-> f2,     s3,  1
>     modelA <- sem(modelA.ram, S, N, analytic.gradient=FALSE)
>
> An equivalent specification in SAS produces the solution presented in
> the book. The variable cons is a constant vector of 1's. The problem
> with the sem package is that the covariance matrix which includes the
> variable cons is singular; sem says so and will not continue. Is there
> an alternative way to specify this problem for sem to obtain a
> solution? If not, is there another package that would produce a
> solution?
>
> Thanks,
> Dan Nordlund
> Bothell, WA

--
=
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
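Following the help-page extract, the fix is to pass sem a raw ("uncorrected") moment matrix, computed over the data including the unit variable, instead of the singular covariance matrix. A base-R sketch of what that matrix looks like (al2 here is simulated stand-in data, since the real data frame is not shown in the thread):

```r
## raw moment matrix by hand: sum-of-squares-and-products divided by N
set.seed(1)
al2 <- data.frame(alc1 = rnorm(10), alc2 = rnorm(10), alc3 = rnorm(10))
al2$cons <- 1                   # unit variable, needed for the intercepts
X <- as.matrix(al2)
M <- crossprod(X) / nrow(X)     # M[i,j] = sum(X[,i] * X[,j]) / N
M["cons", "cons"]               # 1: the raw mean square of the unit variable
```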
Re: [R] how to change the dataframe labels' ?
rownames()<- is what you want.

dat <- data.frame(V1=sample(10), V2=sample(10))
dat
   V1 V2
1   2  5
2   3  8
3   8  4
4   9  6
5   6  2
6   5  7
7  10  3
8   4  9
9   1 10
10  7  1
dat <- dat[order(dat$V2), ]
dat
   V1 V2
10  7  1
5   6  2
7  10  3
3   8  4
1   2  5
4   9  6
6   5  7
2   3  8
8   4  9
9   1 10
rownames(dat) <- 1:dim(dat)[1]   ## or rownames(dat) <- dat$V2
dat
   V1 V2
1   7  1
2   6  2
3  10  3
4   8  4
5   2  5
6   9  6
7   5  7
8   3  8
9   4  9
10  1 10

HTH

On 1/24/07, domenico pestalozzi wrote:
> I notice that the labels are always the same (that is, not ordered by
> V2). How to change them?
Re: [R] How to generate 'minor' ticks in lattice (qqmath)
Dear Gabor!

Thanks for your hints; as a side-effect I learned a lot about lattice. The working example is:

library(lattice)
library(grid)
numy <- 100
y <- runif(numy, min=0, max=1)
sig <- 0.05
numsig <- length(which(y < sig))
Lower <- 0
Upper <- 1
MajorInterval <- 5   # interval for major ticks
MinorInterval <- 4   # interval within major
Major <- seq(Lower, Upper, (Upper-Lower)/MajorInterval)
Minor <- seq(Lower, Upper, (Upper-Lower)/(MajorInterval*MinorInterval))
labl <- as.character(Major)
trellis.focus("panel", 1, 1, clip.off = TRUE)
qqmath(y, distribution = qunif,
  prepanel = NULL,
  panel = function(x) {
    panel.abline(c(0,1), lty = 2)
    panel.polygon(c(0,0,numsig/numy,numsig/numy,0),
                  c(0,sig,sig,0,0), lwd = 0.75)
    panel.qqmath(x, distribution = qunif, col = 1)
  },
  scales = list(x = list(at = Major), y = list(at = Major),
                tck = c(1,0), labels = labl, cex = 0.9),
  xlab = "uniform [0,1] quantiles", ylab = "runif [0,1]",
  min = 0, max = 1)
trellis.focus("panel", 1, 1, clip.off = TRUE)
panel.axis("bottom", check.overlap = TRUE, outside = TRUE, labels = FALSE,
           tck = .5, at = Minor)
panel.axis("left", check.overlap = TRUE, outside = TRUE, labels = FALSE,
           tck = .5, at = Minor)
trellis.unfocus()

Best regards,
Helmut

--
Helmut Schütz
BEBAC
Consultancy Services for Bioequivalence and Bioavailability Studies
Neubaugasse 36/11
1070 Vienna/Austria
tel/fax +43 1 2311746
Web http://BEBAC.at
BE/BA Forum http://forum.bebac.at
http://www.goldmark.org/netrants/no-word/attach.html
Re: [R] how to change the dataframe labels' ?
Sorry, no: row.names<-() is what you want.

rownames() is for matrices (and arrays); row.names() is for data frames. Using them the other way round usually works but can be very inefficient. From R-devel (where the worst inefficiencies are circumvented):

  The extractor functions try to do something sensible for any
  matrix-like object 'x'. If the object has 'dimnames' the first
  component is used as the row names, and the second component (if any)
  is used for the column names. For a data frame, 'rownames' and
  'colnames' are calls to 'row.names' and 'names' respectively, but the
  latter are preferred.

On Wed, 24 Jan 2007, talepanda wrote:
> rownames()<- is what you want.

--
Brian D. Ripley
[R] Conversion of column matrix into a vector without duplicates
Hi R,

I have a matrix A:

A =
     [,1] [,2]
[1,] "a"  "u"
[2,] "b"  "v"
[3,] "c"  "x"
[4,] "d"  "x"
[5,] "e"  "x"

I want to put the 2nd column of this matrix in a vector without duplicates, i.e. my vector v should be ("u", "v", "x"), whose length is 3. Can anybody help me on this?

Thanks in advance
Shubha
[R] Capturing output from external executables, in windows
Hi,

Any help on the following would be much appreciated. I wish to capture the output (currently going to the console) from an external executable. The executable is successfully run using

    system("program -switch")

and the output is printed to the DOS console. How do I capture this output? I have tried redirecting the output to a text file and then reading this in:

    system("program -switch > textfile.txt")
    data <- scan("textfile.txt")

But this does not seem to work (textfile.txt is not written). It does however work if I invoke the console to be permanent:

    system("cmd /K program -switch > textfile.txt")
    data <- scan("textfile.txt")

Unfortunately, this leaves me with an open console window I have to close manually. Is there a way of doing this (under Windows) using system() or some other command? It appears that pipe() may do it, but I cannot understand the documentation. An example of the appropriate syntax would be an enormous help.

Thanks in advance,
Darren

--
Darren Obbard
Institute of Evolutionary Biology
Ashworth Labs
Kings Buildings
University of Edinburgh, UK
Re: [R] Conversion of column matrix into a vector without duplicates
you need:

    unique(A[, 2])

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message -----
From: Shubha Vishwanath Karanth
To: r-help@stat.math.ethz.ch
Sent: Wednesday, January 24, 2007 2:12 PM
Subject: [R] Conversion of column matrix into a vector without duplicates

> I want to put the 2nd column of this matrix in a vector without
> duplicates, i.e. my vector v should be ("u", "v", "x"), whose length
> is 3. Can anybody help me on this?

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
Re: [R] Conversion of column matrix into a vector without duplicates
?unique

    unique(A[, 2])

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. +32 54/436 185
[EMAIL PROTECTED]
www.inbo.be

"Do not put your faith in what statistics say until you have carefully considered what they do not say." ~ William W. Watt
"A statistical analysis, properly conducted, is a delicate dissection of uncertainties, a surgery of suppositions." ~ M.J. Moroney

-----Original Message-----
From: [EMAIL PROTECTED] On behalf of Shubha Vishwanath Karanth
Sent: Wednesday, 24 January 2007 14:13
To: r-help@stat.math.ethz.ch
Subject: [R] Conversion of column matrix into a vector without duplicates

> I want to put the 2nd column of this matrix in a vector without
> duplicates, i.e. my vector v should be ("u", "v", "x"), whose length
> is 3. Can anybody help me on this?
Re: [R] Conversion of column matrix into a vector without duplicates
Shubha Vishwanath Karanth wrote:
> I want to put the 2nd column of this matrix in a vector without
> duplicates, i.e. my vector v should be ("u", "v", "x"), whose length
> is 3. Can anybody help me on this?

A <- matrix(c("a","b","c","d","e","u","v","x","x","x"), ncol=2)
A[,2]
[1] "u" "v" "x" "x" "x"
unique(A[,2])
[1] "u" "v" "x"
is.vector(unique(A[,2]))
[1] TRUE

You probably could have helped yourself by checking the results of RSiteSearch("duplicate").

--
Chuck Cleland
Re: [R] Date variable
You don't appear to be using any functionality from the date package but are using the Date class, which is built into the base of R. Assuming:

d <- as.Date("03/11/05", "%m/%d/%y")

try one of these:

d3 <- structure(rep(NA, 3), class = "Date")
d3[1] <- d
d3

# or
d3 <- rep(d, 3) + NA
d3[1] <- d
d3

# or
d3 <- rep(NA, 3)
class(d3) <- "Date"
d3[1] <- d
d3

# or
d3 <- vector(length = 3, mode = "numeric") + NA
class(d3) <- "Date"
d3[1] <- d
d3

On 1/24/07, stat stat wrote:
> But here I got s[1] as 12853, which is not what I want. I need s[1] as
> the original date. Can anyone tell me where the mistake is?
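Running any one of these variants shows the Date class being preserved through the assignment; a quick sketch of the first:

```r
d <- as.Date("03/11/05", "%m/%d/%y")
d3 <- structure(rep(NA, 3), class = "Date")   # an all-NA Date vector
d3[1] <- d                                    # [<-.Date keeps the class
d3               # "2005-03-11" NA NA -- prints as dates, not as 12853
format(d3[1])    # "2005-03-11"
```

The key point is that the container is a Date vector from the start, so element assignment does not strip the class the way s = vector(length=3) did in the original question.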
[R] Logistic regression model + precision/recall
Hi,

I am using the logistic regression model lrm (from the Design package). Until now I was using the Area Under the Curve (AUC) for testing my model, but now I have to calculate precision/recall of the model on test cases. For lrm, precision and recall would simply be defined with the help of the two terms below:

True Positive (TP) - number of test cases where class 1 is given probability >= 0.5.
False Positive (FP) - number of test cases where class 0 is given probability >= 0.5.

Precision = TP / (TP + FP)
Recall = TP / (number of positive samples in the test data)

I can write long code with for loops and all, but is there any in-built function, or just a few commands, that would do the task? Any help is appreciated.

regards,
Nitin
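No loops are needed: vectorized comparisons give both quantities in a few lines. A sketch with made-up fitted probabilities (pred) and true labels (y) standing in for the lrm predictions:

```r
pred <- c(0.9, 0.4, 0.7, 0.2, 0.6)   # hypothetical fitted probabilities
y    <- c(1,   0,   0,   0,   1)     # true 0/1 classes
pos  <- pred >= 0.5                  # cases predicted as class 1
TP <- sum(pos & y == 1)              # true positives: 2
FP <- sum(pos & y == 0)              # false positives: 1
precision <- TP / (TP + FP)          # 2/3
recall    <- TP / sum(y == 1)        # 1
```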
Re: [R] How to generate 'minor' ticks in lattice (qqmath)
I think you have a stray trellis.focus statement (just before the qqmath statement). Probably a cut-and-paste problem.

Regards.

On 1/24/07, Helmut Schütz wrote:
> Dear Gabor!
> Thanks for your hints; as a side-effect I learned a lot about lattice.
Re: [R] Capturing output from external executables, in windows
You can try: system(command, show.output.on.console = TRUE) On 24/01/07, Darren Obbard [EMAIL PROTECTED] wrote: Hi, Any help on the following would be much appreciated. I wish to capture the output (currently going to console) from an external executable. The executable is successfully run using system("program -switch") and the output printed to the DOS console. How do I capture this output? I have tried redirecting the output to a text file, and then reading this in: system("program -switch > textfile.txt"); data <- scan("textfile.txt") But this does not seem to work (the textfile.txt is not written). It does however work if I invoke the console to be permanent: system("cmd /K program -switch > textfile.txt"); data <- scan("textfile.txt") Unfortunately, this leaves me with an open console window I have to close manually. Is there a way of doing this (under Windows) using system() or some other command? It appears that pipe() may do it, but I cannot understand the documentation. An example of the appropriate syntax would be an enormous help. Thanks in advance, Darren [EMAIL PROTECTED] -- Darren Obbard Institute of Evolutionary Biology Ashworth Labs Kings Buildings University of Edinburgh, UK -- Henrique Dallazuanna [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [fixed] vectorized nested loop: apply a function that takes two rows
Thanks Charles, Martin, Substantial improvement with the vectorized solution. Here is a quick benchmark:

# The loop-based solution:
nestedCos <- function(x) {
  if (is(x, "Matrix")) {
    cos <- array(NA, c(ncol(x), ncol(x)))
    for (i in 2:ncol(x)) {
      for (j in 1:(i - 1)) {
        cos[i, j] <- cosine(x[, i], x[, j])
      }
    }
  }
  return(cos)
}

# Charles C. Berry's vectorized approach
flatCos <- function(x) {
  res <- crossprod(x, x)
  diagnl <- Diagonal(ncol(x), 1 / sqrt(diag(res)))
  cos <- diagnl %*% res %*% diagnl
  return(cos)
}

Benchmarking:

system.time(for (i in 1:10) nestedCos(x))
(I stopped because it was taking too long)
Timing stopped at: 139.37 3.82 188.76 NA NA
system.time(for (i in 1:10) flatCos(x))
[1] 0.43 0.00 0.48 NA NA

#-- As much as I like to have faster code, I'm still wondering WHY flatCos gets the same results; i.e., why multiplying the inverse square root of the diagonal of x BY x, then BY the diagonal again, produces the expected result. I checked the Wikipedia page for crossprod and other sources, but it still eludes me. I can see that scaling by the square root of the diagonal once makes sense with 'res <- crossprod(x, x) gives your result up to scale factors of sqrt(res[i,i]*res[j,j])', but I still don't see why you need to postmultiply by the diagonal again. Maybe trying to attack a simpler problem might help my understanding: e.g., calculating the cosine of a column to all other columns of x (that is, the inner part of the nested loop). How would that work in a vectorized way? I'm trying to get some general technique that I can reuse later from this excellent answer. Thanks, -Jose I am rusty on 'Matrix', but I see there are crossprod methods for those classes. res <- crossprod(x, x) gives your result up to scale factors of sqrt(res[i,i]*res[j,j]), so something like diagnl <- Diagonal(ncol(x), sqrt(diag(res))) OOPS! Better make that diagnl <- Diagonal(ncol(x), 1 / sqrt(diag(res))) final.res <- diagnl %*% res %*% diagnl should do it.
-- Cheers, -Jose -- Jose Quesada, PhD Research fellow, Psychology Dept. Sussex University, Brighton, UK http://www.andrew.cmu.edu/~jquesada __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to change the dataframe labels' ?
I did not know that. Thanks for the useful information. On 1/24/07, Prof Brian Ripley [EMAIL PROTECTED] wrote: Sorry, no: row.names<-() is what you want. rownames for matrices (and arrays), row.names for data frames. Using them the other way round usually works but can be very inefficient. From R-devel (where the worst inefficiencies are circumvented): The extractor functions try to do something sensible for any matrix-like object 'x'. If the object has 'dimnames' the first component is used as the row names, and the second component (if any) is used for the column names. For a data frame, 'rownames' and 'colnames' are calls to 'row.names' and 'names' respectively, but the latter are preferred. On Wed, 24 Jan 2007, talepanda wrote: rownames()<- is what you want.

dat <- data.frame(V1 = sample(10), V2 = sample(10))
dat
   V1 V2
1   2  5
2   3  8
3   8  4
4   9  6
5   6  2
6   5  7
7  10  3
8   4  9
9   1 10
10  7  1
dat <- dat[order(dat$V2), ]
dat
   V1 V2
10  7  1
5   6  2
7  10  3
3   8  4
1   2  5
4   9  6
6   5  7
2   3  8
8   4  9
9   1 10
rownames(dat) <- 1:dim(dat)[1]  ## or rownames(dat) <- dat$V2
dat
   V1 V2
1   7  1
2   6  2
3  10  3
4   8  4
5   2  5
6   9  6
7   5  7
8   3  8
9   4  9
10  1 10

HTH On 1/24/07, domenico pestalozzi [EMAIL PROTECTED] wrote: I import a dataframe composed of 2 variables and 14000 cases. Now I need the labels of the cases sorted by the second variable V2, but if I sort the dataframe according to the second variable: mydataframe <- mydataframe[order(mydataframe$V2), ] I notice that the labels are always the same (that is, not ordered by V2). How to change them? I tried: labels(mydataframe) <- 1:14000 but it doesn't work. Thanks domenico [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] Problem with ordered probit model in MASS
Dear all, I got this message while using the polr function in MASS: EQ <- as.formula(dep ~ fpta + tcdv + cdta + cmta + prcd + patc + lactifs + excta) Estim <- polr(EQ, don, subset = (cote != 0), method = "probit", na.action = na.omit) Error in polr(EQ, don, subset = (cote != 0), method = "probit", na.action = na.omit) : attempt to find suitable starting values failed In addition: Warning messages: 1: algorithm did not converge in: glm.fit(X, y1, wt, family = binomial("probit"), offset = offset) 2: fitted probabilities numerically 0 or 1 occurred in: glm.fit(X, y1, wt, family = binomial("probit"), offset = offset) How can I initialise starting values? Justin BEM Elève Ingénieur Statisticien Economiste BP 294 Yaoundé. Tél (00237)9597295. ___ Discover a new way to get answers to all your questions! Benefit from the knowledge, opinions and experience of other users on Yahoo! Questions/Réponses [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] keep track of selected observations over time
Dear all, Attached is a description of my data, graph and the problem which I need help with. Hope you have time to open the file and help me out. Many thanks, Jenny - __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Logistic regression model + precision/recall
nitin jindal wrote: Hi, I am using the logistic regression model lrm (from the Design package). Right now I was using the Area Under the Curve (AUC) for testing my model. But now I have to calculate precision/recall of the model on test cases. For lrm, precision and recall would simply be defined with the help of the 2 terms below: True Positive (TP) - number of test cases where class 1 is given probability >= 0.5. False Positive (FP) - number of test cases where class 0 is given probability >= 0.5. Why 0.5? Precision = TP / (TP + FP) Recall = TP / (Number of Positive Samples in test data) Those are improper scoring rules that can be tricked. If the outcome is rare (say 0.02 incidence) you could just predict that no one will have the outcome and be correct 0.98 of the time. I suggest validating the model for discrimination (e.g., AUC) and calibration. Frank Any help is appreciated. I can write a long code with for loops and all, but is there any inbuilt function or just a few commands that would do the task? regards, Nitin [[alternative HTML version deleted]] -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to generate 'minor' ticks in lattice (qqmath)
Yes thanks, just a kind of 'left-over' ;-) Regards, Helmut Gabor Grothendieck wrote: I think you have a stray trellis.focus statement (just before the qqmath statement). Probably a cut and paste problem. Regards. trellis.focus("panel", 1, 1, clip.off = TRUE) -- Helmut Schütz BEBAC Consultancy Services for Bioequivalence and Bioavailability Studies Neubaugasse 36/11 1070 Vienna/Austria tel/fax +43 1 2311746 Web http://BEBAC.at BE/BA Forum http://forum.bebac.at http://www.goldmark.org/netrants/no-word/attach.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Capturing output from external executables, in windows
On 1/24/2007 8:22 AM, Darren Obbard wrote: Hi, Any help on the following would be much appreciated. I wish to capture the output (currently going to console) from an external executable. The executable is successfully run using system("program -switch") and the output printed to the DOS console. How do I capture this output? I have tried redirecting the output to a text file, and then reading this in: system("program -switch > textfile.txt"); data <- scan("textfile.txt") But this does not seem to work (the textfile.txt is not written). The redirection character is normally handled by the shell. The system() function is low level, so it will pass it directly to your program. It does however work if I invoke the console to be permanent: system("cmd /K program -switch > textfile.txt"); data <- scan("textfile.txt") Unfortunately, this leaves me with an open console window I have to close manually. Use the /C option instead of /K; it executes and then terminates. Or use the intern or show.output.on.console arg to system(). Duncan Murdoch Is there a way of doing this (under Windows) using system() or some other command? It appears that pipe() may do it, but I cannot understand the documentation. An example of the appropriate syntax would be an enormous help. Thanks in advance, Darren [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
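For the archives, a minimal sketch of the two approaches suggested in this thread (capture via `intern =` versus letting the shell do the redirection); `"program -switch"` is a stand-in for the real command line, so the calls are left commented out:

```r
## Capture the command's stdout directly as a character vector:
## out <- system("program -switch", intern = TRUE)

## Or let cmd.exe handle the ">" redirection via shell() (Windows), then
## read the file back in:
## shell("program -switch > textfile.txt")
## data <- scan("textfile.txt")
```

Either way no interactive console is kept open, so there is no lingering cmd /K window to close.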
Re: [R] [fixed] vectorized nested loop: apply a function that takes two rows
Jose Quesada wrote: Thanks Charles, Martin, Substantial improvement with the vectorized solution. Here is a quick benchmark:

# The loop-based solution:
nestedCos <- function(x) {
  if (is(x, "Matrix")) {
    cos <- array(NA, c(ncol(x), ncol(x)))
    for (i in 2:ncol(x)) {
      for (j in 1:(i - 1)) {
        cos[i, j] <- cosine(x[, i], x[, j])
      }
    }
  }
  return(cos)
}

# Charles C. Berry's vectorized approach
flatCos <- function(x) {
  res <- crossprod(x, x)
  diagnl <- Diagonal(ncol(x), 1 / sqrt(diag(res)))
  cos <- diagnl %*% res %*% diagnl
  return(cos)
}

Benchmarking:

system.time(for (i in 1:10) nestedCos(x))
(I stopped because it was taking too long)
Timing stopped at: 139.37 3.82 188.76 NA NA
system.time(for (i in 1:10) flatCos(x))
[1] 0.43 0.00 0.48 NA NA

#-- As much as I like to have faster code, I'm still wondering WHY flatCos gets the same results; i.e., why multiplying the inverse square root of the diagonal of x BY x, then BY the diagonal again, produces the expected result. I checked the Wikipedia page for crossprod and other sources, but it still eludes me. I can see that scaling by the square root of the diagonal once makes sense with 'res <- crossprod(x, x) gives your result up to scale factors of sqrt(res[i,i]*res[j,j])', but I still don't see why you need to postmultiply by the diagonal again. Didn't follow this thread too closely, but the point would seem to be that the inner product of two normalized vectors is the cosine of the angle. So basically, you want crossprod(X %*% diagnl, X %*% diagnl) == t(diagnl) %*% t(X) %*% X %*% diagnl I think, BTW, that another version not requiring Matrix is

Cr <- crossprod(X)
D <- sqrt(diag(Cr))
Cr / outer(D, D)

Maybe trying to attack a simpler problem might help my understanding: e.g., calculating the cosine of a column to all other columns of x (that is, the inner part of the nested loop). How would that work in a vectorized way? I'm trying to get some general technique that I can reuse later from this excellent answer.
Thanks, -Jose I am rusty on 'Matrix', but I see there are crossprod methods for those classes. res <- crossprod(x, x) gives your result up to scale factors of sqrt(res[i,i]*res[j,j]), so something like diagnl <- Diagonal(ncol(x), sqrt(diag(res))) OOPS! Better make that diagnl <- Diagonal(ncol(x), 1 / sqrt(diag(res))) final.res <- diagnl %*% res %*% diagnl should do it. -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
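To make the algebra above concrete, here is a small base-R check (an illustrative sketch with random data, not the poster's matrix) that the crossprod-and-rescale trick matches the direct cosine definition, including the one-column-against-all case Jose asks about:

```r
set.seed(1)
X  <- matrix(rnorm(20), nrow = 5)   # 5 observations, 4 columns

Cr <- crossprod(X)                  # all pairwise inner products, t(X) %*% X
D  <- sqrt(diag(Cr))                # the column norms |x_i|
cosmat <- Cr / outer(D, D)          # divide entry (i,j) by |x_i| * |x_j|

## direct definition for one pair of columns:
direct <- sum(X[, 1] * X[, 2]) /
          (sqrt(sum(X[, 1]^2)) * sqrt(sum(X[, 2]^2)))

## cosine of column 1 against all other columns (the "inner loop" case):
col1 <- Cr[, 1] / (D * D[1])
```

Pre-multiplying by the diagonal matrix divides row i by |x_i| and post-multiplying divides column j by |x_j|; those are exactly the two norms in the cosine's denominator, which is why the diagonal has to appear twice.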
[R] double integral with C
Hi all, This is more a C query than an R one: I'm writing C code passing a function F to the adapt Fortran subroutine. I need to integrate over two variables of F, call them x1 and x2. Then I call the C code in R to optimize the integrated F function. For example F could be defined as --- static double marg_like(const double *param, double x1, double x2){...} --- Then I integrate over x1 and x2 with - F77_CALL(adapt)(2,(-5,-5),(5,5),100,1700,F,0.01,1) - So here is my question: how should I define x1 and x2? For the time being I defined them as static variables, i.e. static double x1; static double x2; but I'm pretty sure this is wrong. Any hint? Thanks in advance, Stefano __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Capturing output from external executables, in windows
Is this for RGui under Windows? I will assume so (but 'in windows' is not unambiguous, and there are alternative front-ends like Rterm). Have you consulted the help page for system()?: this is precisely what 'show.output.on.console' is for. You cannot redirect in system (it does say so on the current help page): you need to use shell(). On Wed, 24 Jan 2007, Darren Obbard wrote: Hi, Any help on the following would be much appreciated. I wish to capture the output (currently going to console) from an external executable. The executable is successfully run using system("program -switch") and the output printed to the DOS console. How do I capture this output? I have tried redirecting the output to a text file, and then reading this in: system("program -switch > textfile.txt"); data <- scan("textfile.txt") But this does not seem to work (the textfile.txt is not written). It does however work if I invoke the console to be permanent: system("cmd /K program -switch > textfile.txt"); data <- scan("textfile.txt") Unfortunately, this leaves me with an open console window I have to close manually. Is there a way of doing this (under Windows) using system() or some other command? It appears that pipe() may do it, but I cannot understand the documentation. An example of the appropriate syntax would be an enormous help. system("program -switch", show.output.on.console = TRUE) could it be any easier? (Well, in the next version of R that will be the default, so a little.) Thanks in advance, Darren [EMAIL PROTECTED] -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Logistic regression model + precision/recall
On 1/24/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote: Why 0.5? The probability has to be adjusted based on some hit and trials. I just mentioned it as an example. Those are improper scoring rules that can be tricked. If the outcome is rare (say 0.02 incidence) you could just predict that no one will have the outcome and be correct 0.98 of the time. I suggest validating the model for discrimination (e.g., AUC) and calibration. I just have to calculate precision/recall for a rare outcome. If the positive outcome is rare (say 0.02 incidence) and I predict it to be negative all the time, my recall would be 0, which is bad. So precision and recall can take care of skewed data. Frank [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with ordered probit model in MASS
On Wed, 24 Jan 2007, justin bem wrote: (Again: this is a duplicate post.) Dear all, I got this message while using the polr function in MASS: EQ <- as.formula(dep ~ fpta + tcdv + cdta + cmta + prcd + patc + lactifs + excta) Estim <- polr(EQ, don, subset = (cote != 0), method = "probit", na.action = na.omit) Error in polr(EQ, don, subset = (cote != 0), method = "probit", na.action = na.omit) : attempt to find suitable starting values failed In addition: Warning messages: 1: algorithm did not converge in: glm.fit(X, y1, wt, family = binomial("probit"), offset = offset) 2: fitted probabilities numerically 0 or 1 occurred in: glm.fit(X, y1, wt, family = binomial("probit"), offset = offset) How can I initialise starting values? What do you think the 'start' argument to polr() is for? If you are asking how you find suitable values, I cannot help you as I know nothing about your problem, and failing to find starting values usually means that the model is very far from appropriate. Justin BEM Elève Ingénieur Statisticien Economiste BP 294 Yaoundé. Tél (00237)9597295. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
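For completeness, a hedged sketch of the shape polr()'s `start` argument expects, not a tested fit (the poster's data are not available, and the number of response categories below is a guess): it takes initial values for the coefficients followed by the ordered intercepts (zeta), one fewer than the number of response levels, and the intercepts must be increasing.

```r
library(MASS)  # polr

## With 8 predictors and, say, 4 response categories: 8 + (4 - 1) values.
## Estim <- polr(EQ, don, subset = (cote != 0), method = "probit",
##               na.action = na.omit,
##               start = c(rep(0, 8), -1, 0, 1))
```

That said, the warning about fitted probabilities numerically 0 or 1 suggests separation in the data, which better starting values will not cure.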
Re: [R] Logistic regression model + precision/recall
nitin jindal wrote: On 1/24/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote: Why 0.5? The probability has to adjusted based on some hit and trials. I just mentioned it as an example Using a cutoff is not a good idea unless the utility (loss) function is discontinuous and is the same for every subject (in the medical field utilities are almost never constant). And if you are using the data to find the cutoff, this will require bootstrapping to penalize for the cutoff not being pre-specified. Those are improper scoring rules that can be tricked. If the outcome is rare (say 0.02 incidence) you could just predict that no one will have the outcome and be correct 0.98 of the time. I suggest validating the model for discrimination (e.g., AUC) and calibration. I just have to calculate precision/recall for rare outcome. If the positive outcome is rare ( say 0.02 incidence) and I predict it to be negative all the time, my recall would be 0, which is bad. So, precision and recall can take care of skewed data. No, that is not clear. The overall classification error would only be 0.02 in that case. It is true though that one of the two conditional probabilities would not be good. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
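Frank's rare-outcome point is easy to demonstrate numerically; this is an illustrative sketch with simulated data, not anyone's real test set:

```r
set.seed(2)
y    <- rbinom(10000, 1, 0.02)      # rare outcome, roughly 0.02 incidence
pred <- rep(0, length(y))           # degenerate rule: always predict negative

accuracy <- mean(pred == y)                          # close to 0.98
recall   <- sum(pred == 1 & y == 1) / sum(y == 1)    # exactly 0
```

The degenerate classifier looks excellent on overall accuracy while being useless for the positives, which is the sense in which such rules can be 'tricked'.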
Re: [R] Logistic regression model + precision/recall
On 1/24/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote: nitin jindal wrote: Using a cutoff is not a good idea unless the utility (loss) function is discontinuous and is the same for every subject (in the medical field utilities are almost never constant). And if you are using the data to find the cutoff, this will require bootstrapping to penalize for the cutoff not being pre-specified. Thnx for this info. If I still have to use a cutoff, I will do bootstrapping. I don't know any alternative to this to compute precision/recall for a logistic regression model. No, that is not clear. The overall classification error would only be 0.02 in that case. It is true though that one of the two conditional probabilities would not be good. I forgot to mention that for my data, the overall classification error is non-significant. I am only interested in precision/recall for the rare outcome. nitin Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] mixed effects or fixed effects?
Hi, I am running a learning experiment in which both training subjects and controls complete a pretest and posttest. All analyses are being conducted in R. We are looking to compare two training methodologies, and so have run this experiment twice, once with each methodology. Methodology is a between-subjects factor. Trying to run this analysis with every factor included (ie, subject as a random factor, session nested within group nested within experiment) seems to me (after having tried) to be clumsy and probably uninterpretable. My favoured model for the analysis is a linear mixed-effects model, and to combine the data meaningfully, I have collated all the pretest data for controls and trained subjects from each experiment, and assumed this data to represent a population sample for naive subjects for each experiment. I have also ditched the posttest data for the controls, and assumed the posttest training data to represent a population sample for trained subjects for each experiment. I have confirmed the validity of these assumptions by ascertaining that a) controls and trained listeners did not differ significantly at pretest for either experiment; and b) control listeners did not learn significantly between pretest and posttest (and therefore their posttest data are not relevant). This was done using a linear mixed-effects model for each experiment, with subject as a random factor and session (pretest vs posttest) nested within Group (trained vs control). Therefore, the model I want to use to analyse the data would ideally be a linear mixed-effects model, with subject as a random factor, and session (pre vs post) nested within experiment. Note that my removal of the Group (Trained vs Control) factor simplifies the model somewhat, and makes it more interpretable in terms of evaluating the relative effects of each experiment. What I would like to know is- a) would people agree that this is a meaningful way to combine my data? 
I believe the logic is sound, but am slightly concerned that I am ignoring a whole block of posttest data for the controls (even though this does not account for a significant amount of the variance); and b) given that each of my trained subjects appears twice (once in the pretest and once in the posttest) while the controls appear only once (in the pretest sample), is there any problem with making subject a random factor? Conceptually, I see no problem with this, but I would like to be sure before I finish writing up. Many thanks for your time, Dan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
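Not an answer to the design questions, but for concreteness, one hedged way to write the model described above in nlme; the variable and data-frame names here are placeholders, not from the original post, so the calls are left commented:

```r
library(nlme)

## session (pre vs post) nested within experiment as fixed effects,
## subject as a random intercept:
## fit <- lme(score ~ experiment / session, random = ~ 1 | subject,
##            data = dat, na.action = na.omit)
## anova(fit)
```

The same structure could be written for lme4's lmer as `score ~ experiment/session + (1 | subject)`.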
Re: [R] Logistic regression model + precision/recall
Hi. Thnx a lot. I will try that. nitin On 1/24/07, Tobias Sing [EMAIL PROTECTED] wrote: Maybe ROCR might help you. You can visualize the prec/rec trade-off across the range of all cutoffs; assuming your numerical predictions are in scores and the true class labels are in classes: pred <- prediction(scores, classes) perf <- performance(pred, 'rec', 'prec') plot(perf) HTH, Tobias On 1/24/07, nitin jindal [EMAIL PROTECTED] wrote: Hi, I am using the logistic regression model lrm (from the Design package). Right now I was using the Area Under the Curve (AUC) for testing my model. But now I have to calculate precision/recall of the model on test cases. For lrm, precision and recall would simply be defined with the help of the 2 terms below: True Positive (TP) - number of test cases where class 1 is given probability >= 0.5. False Positive (FP) - number of test cases where class 0 is given probability >= 0.5. Precision = TP / (TP + FP) Recall = TP / (Number of Positive Samples in test data) Any help is appreciated. I can write a long code with for loops and all, but is there any inbuilt function or just a few commands that would do the task? regards, Nitin [[alternative HTML version deleted]] -- Tobias Sing Computational Biology and Applied Algorithmics Max Planck Institute for Informatics Saarbrucken, Germany Phone: +49 681 9325 315 Fax: +49 681 9325 399 http://www.tobiassing.net [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Replace missing values in lapply
I have some matrices stored as elements in a list that I am working with. One example is provided below as TP[[18]]:

TP[[18]]
      level2
level1  1 2 3 4
     1 79 0 0 0
     2  0 0 0 0
     3  0 0 0 0
     4  0 0 0 0

Now, using prop.table on this gives

prop.table(TP[[18]], 1)
      level2
level1   1   2   3   4
     1   1   0   0   0
     2 NaN NaN NaN NaN
     3 NaN NaN NaN NaN
     4 NaN NaN NaN NaN

It is important for the zeros to retain their position as this matrix will subsequently be used in some matrix multiplication and hence must be of dimension 4 by 4 so that it is conformable for multiplication with another matrix. In looking at the structure of the object resulting from prop.table I see NaNs, and so I can do this:

rr <- TP[[18]]
rr[is.na(rr)] <- 0
rr
      level2
level1  1 2 3 4
     1 79 0 0 0
     2  0 0 0 0
     3  0 0 0 0
     4  0 0 0 0

This is exactly what I want for each matrix. But I have multiple matrices stored within the list that need to be changed, and so I am trying to resolve this via lapply, but something is awry (namely the user), and I could use a little help. I was thinking the following function should work, but it doesn't. It reduces each matrix within the list to a 0. PP <- lapply(TP, function(x) x[is.na(x)] <- 0) Am I missing something obvious? Harold [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Replace missing values in lapply
you need to return x in the function within lapply(), e.g., something like lapply(TP, function(x) { x[is.na(x)] <- 0; x }) I hope it works. Best, Dimitris Dimitris Rizopoulos Ph.D. Student Biostatistical Centre School of Public Health Catholic University of Leuven Address: Kapucijnenvoer 35, Leuven, Belgium Tel: +32/(0)16/336899 Fax: +32/(0)16/337015 Web: http://med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm - Original Message - From: Doran, Harold [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent: Wednesday, January 24, 2007 4:40 PM Subject: [R] Replace missing values in lapply I have some matrices stored as elements in a list that I am working with. One example is provided below as TP[[18]]:

TP[[18]]
      level2
level1  1 2 3 4
     1 79 0 0 0
     2  0 0 0 0
     3  0 0 0 0
     4  0 0 0 0

Now, using prop.table on this gives

prop.table(TP[[18]], 1)
      level2
level1   1   2   3   4
     1   1   0   0   0
     2 NaN NaN NaN NaN
     3 NaN NaN NaN NaN
     4 NaN NaN NaN NaN

It is important for the zeros to retain their position as this matrix will subsequently be used in some matrix multiplication and hence must be of dimension 4 by 4 so that it is conformable for multiplication with another matrix. In looking at the structure of the object resulting from prop.table I see NaNs, and so I can do this:

rr <- TP[[18]]
rr[is.na(rr)] <- 0
rr
      level2
level1  1 2 3 4
     1 79 0 0 0
     2  0 0 0 0
     3  0 0 0 0
     4  0 0 0 0

This is exactly what I want for each matrix. But I have multiple matrices stored within the list that need to be changed, and so I am trying to resolve this via lapply, but something is awry (namely the user), and I could use a little help. I was thinking the following function should work, but it doesn't. It reduces each matrix within the list to a 0. PP <- lapply(TP, function(x) x[is.na(x)] <- 0) Am I missing something obvious?
Harold [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
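The failing version is a classic R gotcha worth spelling out: the value of a subassignment expression is its right-hand side, so the anonymous function returns 0 rather than the modified matrix. A small self-contained demonstration (toy matrix, not Harold's data):

```r
TP <- list(matrix(c(1, NaN, NaN, 4), nrow = 2))  # toy stand-in; is.na(NaN) is TRUE

bad  <- lapply(TP, function(x) x[is.na(x)] <- 0)         # each element is just 0
good <- lapply(TP, function(x) { x[is.na(x)] <- 0; x })  # each element is the repaired matrix
```

Here `bad[[1]]` is the number 0, while `good[[1]]` is the original 2 x 2 matrix with the NaNs replaced by zeros.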
Re: [R] Replace missing values in lapply
Perfect, thanks.

-----Original Message----- From: Dimitris Rizopoulos [mailto:[EMAIL PROTECTED] Sent: Wednesday, January 24, 2007 10:49 AM To: Doran, Harold Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Replace missing values in lapply

you need to return x in the function within lapply(), e.g., something like

lapply(TP, function(x) { x[is.na(x)] <- 0; x })

I hope it works. Best, Dimitris

[remainder of quoted message snipped]
Harold
Re: [R] solving a structural equation model using sem or other package
David, Thanks for the help. I missed the significance of the section you quoted below from the help. That does indeed solve the problem. Dan Dan Nordlund Bothell, WA USA -Original Message- From: David Barron [mailto:[EMAIL PROTECTED] Sent: Wednesday, January 24, 2007 3:32 AM To: Daniel Nordlund; r-help Subject: Re: [R] solving a structural equation model using sem or other package This is an extract from the sem help page, which deals with your situation: S covariance matrix among observed variables; may be input as a symmetric matrix, or as a lower- or upper-triangular matrix. S may also be a raw (i.e., ``uncorrected'') moment matrix — that is, a sum-of-squares-and-products matrix divided by N. This form of input is useful for fitting models with intercepts, in which case the moment matrix should include the mean square and cross-products for a unit variable all of whose entries are 1; of course, the raw mean square for the unit variable is 1. Raw-moment matrices may be computed by raw.moments. On 24/01/07, Daniel Nordlund [EMAIL PROTECTED] wrote: I am trying to work my way through the book Singer, JD and Willett, JB, Applied Longitudinal Data Analysis. Oxford University Press, 2003 using R. I have the SAS code and S-Plus code from the UCLA site (doesn't include chapter 8 or later problems). In chapter 8, there is a structural equation/path model which can be specified for the sem package as follows snip An equivalent specification in SAS produces the solution presented in the book. The variable cons is a constant vector of 1's. The problem with the sem package is that the covariance matrix which includes the variable cons is singular and sem says so and will not continue. Is there an alternative way to specify this problem for sem to obtain a solution? If not, is there another package that would produce a solution? 
Thanks, Dan Nordlund Bothell, WA

--
David Barron
Saïd Business School
University of Oxford
Park End Street
Oxford OX1 1HP
Re: [R] Replace missing values in lapply
I wonder if a list of matrices is the best representation? Do your matrices all have the same dimension, as in:

TP <- list(matrix(c(1:3, NA), 2), matrix(c(NA, 1:3), 2))

# Then you could consider representing them as an array:
TPa <- array(unlist(TP), c(2, 2, 2))

# in which case it's just
TPa[is.na(TPa)] <- 0
TPa

On 1/24/07, Doran, Harold [EMAIL PROTECTED] wrote:

[quoted message snipped]
Re: [R] Replace missing values in lapply
I hadn't thought of that. I use the following at one point in my program:

tmp <- with(data, tapply(variable, index, table))

which returns a list, so I just went with lists for the rest of the program. I'm changing the code to arrays now; I think you're right that this may be a better representation. I need to walk through it and see what turns up. Thanks for the recommendation.

-----Original Message----- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: Wednesday, January 24, 2007 11:06 AM To: Doran, Harold Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Replace missing values in lapply

[quoted message snipped]
[R] as.numeric(.1)
Hello, I noticed the following strange behavior under R-2.4.0 (Linux Mandriva 2007):

options("OutDec")
$OutDec
[1] "."

as.numeric(".1")
[1] NA
Warning message:
NAs introduits lors de la conversion automatique  [NAs introduced by coercion]

as.numeric(",1")
[1] 0,1

So I need to use the comma as the decimal separator, at least as input. Moreover, the last output also uses a comma, though the OutDec option was set to ".". Basic arithmetic ops on the command line work OK with decimal dots. I am pretty sure as.numeric(".1") used to work under older versions of R. Could it be a localization problem? I would like to use the dot as the decimal separator both for input and output. Any suggestion? Thank you very much in advance, Yvonnick Noel, Dept of Psychology, U. of Rennes, France

platform i686-pc-linux-gnu
arch i686
os linux-gnu
system i686, linux-gnu
status
major 2
minor 4.0
year 2006
month 10
day 03
svn rev 39566
language R
[R] Fit model to data and use model for data generation
Hi, Suppose I have a set of values x and I want to estimate the distribution of the data. Usually I would use the density command. Now, can I use the resulting density object to generate a number of new values which have the same distribution? Or do I have to use some different function? Regards, Benjamin -- Benjamin Otto, Universitaetsklinikum Eppendorf Hamburg, Institut fuer Klinische Chemie, Martinistrasse 52, 20246 Hamburg
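One common approach (a sketch offered for the archives, not from the thread itself): density() returns the fitted kernel bandwidth in its bw component, so you can draw new values from the estimated distribution by resampling the original data with replacement and adding kernel noise. This assumes the default Gaussian kernel.

```r
set.seed(1)
x <- rnorm(500)          # example data standing in for the real values
d <- density(x)          # kernel density estimate (Gaussian kernel by default)

# Sample from the estimated density: resample the data with replacement,
# then jitter each draw by the fitted kernel bandwidth d$bw.
n <- 1000
xnew <- sample(x, n, replace = TRUE) + rnorm(n, mean = 0, sd = d$bw)
```

xnew then has (approximately) the distribution described by the density estimate.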
Re: [R] as.numeric(.1)
NOEL == NOEL Yvonnick [EMAIL PROTECTED] on Wed, 24 Jan 2007 17:29:14 +0100 writes:

NOEL Hello, I noticed the following strange behavior under R-2.4.0 (Linux Mandriva 2007):
NOEL options("OutDec")
NOEL $OutDec
NOEL [1] "."
NOEL as.numeric(".1")
NOEL [1] NA
NOEL Warning message:
NOEL NAs introduits lors de la conversion automatique
NOEL as.numeric(",1")
NOEL [1] 0,1

Oops! Should not happen, given your getOption("OutDec").

NOEL So I need to use the comma as the decimal separator, at least as input. Moreover, the last output also uses a comma, though the OutDec option was set to ".". Basic arithmetic ops on the command line work OK with decimal dots.

NOEL I am pretty sure as.numeric(".1") used to work under older versions of R. Could it be a localization problem?

Maybe / probably. We cannot easily reproduce. Instead of the output below, can you please give the full sessionInfo() output?

NOEL I would like to use the dot as the decimal separator both for input and output.

Well, definitely! (Using "," is craziness in my eyes!!)

NOEL Any suggestion? Thank you very much in advance, Yvonnick Noel, Dept of Psychology, U. of Rennes, France

NOEL [version details snipped]
[R] Easy to install GNU Emacs for Windows
[Sorry for cross-posting in an attempt to reach as many interested parties as possible.] Users (present or future) of GNU Emacs and ESS on Windows might be interested in my distribution of an easy to install (read: with an installation wizard) version of GNU Emacs 21.3 with the following additions: * ESS 5.3.3, configured to work with the latest stable release of R; * AUCTeX 11.84; * Aspell 0.50.3; * English and French dictionaries for Aspell; * w32-winprint.el, to ease printing under Windows; * htmlize.el, to print in color with w32-winprint.el; * site-start.el, a site-wide configuration file to make everything work. For details: http://vgoulet.act.ulaval.ca/en/emacs/ The plan is to keep my distribution current with R releases. The installation wizard is not all that smart for now, but I'll try to improve it in the future. For example, it creates a HOME environment variable (if it doesn't already exist) but doesn't ask for a value; %USERPROFILE% is used as the default. Comments, criticisms and translations of messages appreciated! [Disclaimer: I am not a Windows user myself. I created this distribution to ease adoption of Emacs by my students.] -- Vincent Goulet, Associate Professor, École d'actuariat, Université Laval, Québec. [EMAIL PROTECTED] http://vgoulet.act.ulaval.ca
Re: [R] as.numeric(.1) + SessionInfo
as.numeric(",1")
NOEL [1] 0,1

Instead of the output below, can you please give the full sessionInfo() output?

Here it is:

R version 2.4.0 (2006-10-03)
i686-pc-linux-gnu

locale:
fr_FR.UTF-8

attached base packages:
[1] methods   stats     graphics  grDevices utils     datasets
[7] base

other attached packages:
      lattice   cairoDevice gWidgetsRGtk2         RGtk2      gWidgets
      0.14-16           1.2         0.0-9         2.8.6        0.0-11

Yvonnick Noel, PhD. Dept of Psychology, U. of Rennes, France
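The fr_FR.UTF-8 locale line above is the likely culprit: if the LC_NUMERIC category is inherited from the French locale, R parses "," as the decimal separator. A possible workaround (a sketch; note that ?Sys.setlocale documents that LC_NUMERIC values other than "C" are unsupported in R, which is exactly why "C" is the safe setting) is:

```r
# Force C numeric conventions so "." is the decimal separator again.
# R itself only supports LC_NUMERIC = "C"; other values cause exactly
# the kind of parsing trouble reported in this thread.
Sys.setlocale("LC_NUMERIC", "C")
as.numeric(".1")
```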
Re: [R] how to write randomforest in r
myat wai wmyonwesit at gmail.com writes: Dear Sir, I want to know how to do the following to get results: 1. My data set is not in R; how do I use my data set in R? 2. How do I use the randomForest function to build a tree with my data set? 3. How do I use this random forest to predict new data? Please reply to me. I want to build a random forest in R and predict new data. My data set is in the attachment file. Like my attachment file, I want to get the results in R as the output. Please help me. Yours sincerely, Myat

I'm afraid your question is far too vague. At the very least you need to (1) indicate what format your data are in (there are very many non-R data formats!) and (2) indicate that you have actually read the documentation for the randomForest function (and the predict.randomForest function, which would probably help you predict new data!), as well as the Introduction to R. Then you can tell us where you got stuck, which will both make it easier for us to help you and perhaps help the authors improve the documentation. (Your attachment appears to have gotten lost somewhere along the way.) If this is too much for you, you will need to find someone (preferably someone at your own institution) who can help you get started with R basics. Reading the posting guide wouldn't hurt either. Good luck, Ben Bolker
[R] Cronbach's alpha
Dear Listers: I used cronbach{psy} to evaluate internal consistency, and some sets of variables gave me alpha = -1.1003, while others gave alpha = -0.2, alpha = 0.89, and so on. I am interested in knowing how to interpret 1. a negative value and 2. a negative value less than -1. I also want to re-mention my previous question about how to evaluate the consistency of a set of variables and about the total correlation (my 2 cents to answer the question). Is there any function in R to do that? Thank you very much! -- Weiwei Shi, Ph.D, Research Scientist, GeneGO, Inc. "Did you always know?" "No, I did not. But I believed..." ---Matrix III
Re: [R] Cronbach's alpha
Weiwei, Something is wrong. Coefficient alpha is bounded between 0 and 1, so negative values are outside the parameter space for a reliability statistic. Recall that reliability is the ratio of true score variance to total score variance, that is

var(t) / (var(t) + var(e))

If all variance is true score variance, then var(e) = 0 and the reliability is var(t)/var(t) = 1. On the other hand, if all variance is measurement error, then var(t) = 0 and the reliability is 0. Here is a function I wrote to compute alpha, along with an example. Maybe try recomputing your statistic using this function and see if you get the same result.

alpha <- function(columns){
  k <- ncol(columns)
  colVars <- apply(columns, 2, var)
  total <- var(apply(columns, 1, sum))
  a <- (total - sum(colVars)) / total * (k/(k-1))
  a
}

data(LSAT, package='ltm')
alpha(LSAT)
[1] 0.2949972

Harold

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Weiwei Shi Sent: Wednesday, January 24, 2007 1:17 PM To: R R Subject: [R] Cronbach's alpha

[quoted message snipped]
[R] n step ahead forecasts
hello, I have a question about making n step ahead forecasts in cases where test and validation sets are available. For instance, I would like to make one step ahead forecasts on the WWWusage data, so I hold out the last 10 observations as the validation set and fit an ARIMA model on the first 90 observations. I then use a for loop to sequentially add 9 of the holdout observations and make 1 step ahead forecasts for the last 10 periods (see example code). In cases where there are relatively few periods to forecast this seems to work fine; however, I am working with a rather large validation set and need to make n step ahead forecasts for many periods, and it takes a very long time. Is there a more efficient way to do this?

vset <- WWWusage[91:100]
pred <- c()
for (i in 0:9) {
  fit <- arima(WWWusage[1:(90+i)], c(3,1,0))
  p <- predict(fit, se.fit = FALSE)
  pred <- c(pred, p)
}
plot(pred, type = "o", col = 2)
lines(vset, type = "o", col = 1)

thanks, Spencer
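One way to speed this up (a sketch for the archives, not a reply from the thread): fit the model once on the training set, then at each forecast origin pass the estimated coefficients through arima()'s fixed argument, so the model is only filtered over the longer series instead of being re-estimated.

```r
# Fit once on the training data.
fit <- arima(WWWusage[1:90], order = c(3, 1, 0))

pred <- numeric(10)
for (i in 0:9) {
  # "Refit" with every coefficient held fixed: no optimisation happens,
  # the model is just filtered over the extended data.
  refit <- arima(WWWusage[1:(90 + i)], order = c(3, 1, 0),
                 fixed = coef(fit), transform.pars = FALSE)
  pred[i + 1] <- predict(refit, n.ahead = 1)$pred
}
```

The coefficients stay frozen at their training-set values, which is usually acceptable for a one-step-ahead validation exercise and avoids 10 (or thousands of) full re-estimations.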
[R] dataframe operation
hi, i have a dataframe a which looks like:

column1, column2, column3
10, 12, 0
NA, 0, 1
12, NA, 50

i want to replace all values in column1 to column3 which do not contain NA with the values of vector b (100, 200, 300). any idea how i can do it? i appreciate any hint. regards, lukas

Lukas Indermaur, PhD student, Eawag / Swiss Federal Institute of Aquatic Science and Technology, ECO - Department of Aquatic Ecology, Überlandstrasse 133, CH-8600 Dübendorf, Switzerland. Phone: +41 (0) 71 220 38 25 Fax: +41 (0) 44 823 53 15 Email: [EMAIL PROTECTED] www.lukasindermaur.ch
Re: [R] dataframe operation
On Wed, 2007-01-24 at 20:27 +0100, Indermaur Lukas wrote:

[quoted question snipped]

Here is one possibility:

sapply(seq(along = colnames(DF)), function(x) ifelse(is.na(DF[[x]]), 100 * x, DF[[x]]))
     [,1] [,2] [,3]
[1,]   10   12    0
[2,]  100    0    1
[3,]   12  200   50

Note that the returned object will be a matrix, so if you need a data frame, just coerce the result with as.data.frame(). HTH, Marc Schwartz
Re: [R] dataframe operation
On Wed, 2007-01-24 at 14:10 -0600, Marc Schwartz wrote:

[earlier messages snipped]

OK, that's what I get for pulling the trigger too fast. Just reverse the logic in the function:

sapply(seq(along = colnames(DF)), function(x) ifelse(!is.na(DF[[x]]), 100 * x, DF[[x]]))
     [,1] [,2] [,3]
[1,]  100  200  300
[2,]   NA  200  300
[3,]  100   NA  300

I misread the query initially. HTH, Marc
Re: [R] dataframe operation
Hint: Try ?subset at the R prompt.

Indermaur Lukas [EMAIL PROTECTED] wrote:

[quoted question snipped]

-- Mike Prager, NOAA, Beaufort, NC * Opinions expressed are personal and not represented otherwise. * Any use of tradenames does not constitute a NOAA endorsement.
[R] RODBC
Hello, I am fairly new to R and its connectivity to MS Access. I just installed RODBC and it seems to be working well, except when I use a date to condition the query. For example, the query below

sqlQuery(channel, "select date from tblUScpi where (date > d2) order by date")

returns the following error:

[1] [RODBC] ERROR: Could not SQLExecDirect
[2] 07001 -3010 [Microsoft][ODBC Microsoft Access Driver] Too few parameters. Expected 1.

I checked that d2 and the elements in date belong to the same class (POSIXt POSIXct). Can anybody help me?
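Two hedged suggestions for the archives (not verified against this particular database): Access raises "Too few parameters. Expected 1" when a name in the SQL string is not a known column, and here d2 is an R variable that Access cannot see; its value has to be pasted into the query text. Access also expects date literals wrapped in # delimiters, and date is a reserved word that may need square brackets. A sketch of building such a query string, with all table and column names taken from the post:

```r
# Format an R Date as an Access #date# literal and bracket the
# reserved column name [date]; d2 here is an example value.
d2 <- as.Date("2007-01-24")
qry <- paste("SELECT [date] FROM tblUScpi WHERE [date] > #",
             format(d2, "%m/%d/%Y"),
             "# ORDER BY [date]", sep = "")
qry
# then: sqlQuery(channel, qry)
```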
[R] Query about extracting subset of datafram
Hi, I have a table read from a MySQL database which is of the kind (clusterid, clockrate). I obtained this table in R as

clockrates_table <- sqlQuery(channel, "select ...")

I have a function within which I wish to extract the clockrate for a given cluster. Although I know that there is just one row per clusterid in the data frame, I am using subset to extract the clockrate:

clockrate <- subset(clockrates_table, clusterid == 15, select = c(clockrate))

Is there any way of extracting the clockrate without using subset? In the help section for subset, it mentioned "see also: [, ...". However, I could find no entry for this when I searched as ?"[", etc. The R manuals also, despite discussing complex libraries, techniques, etc., don't always seem to provide such handy hints/tips and tricks for manipulating data, which is a first stumbling block for newbies like me. I would greatly appreciate it if you could point me to such resources as well, for future reference. Thanks, Lalitha
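For the archives, one way to avoid subset() here (a sketch with made-up data standing in for the database table) is plain data-frame indexing, which is what the "see also: [" entry on the subset help page points at; it is documented under ?"[.data.frame":

```r
# Hypothetical stand-in for the table fetched via sqlQuery()
clockrates_table <- data.frame(clusterid = c(10, 15, 20),
                               clockrate = c(2.0, 2.6, 3.1))

# Direct logical indexing instead of subset(); this returns a plain
# numeric value rather than a one-cell data frame.
clockrate <- clockrates_table$clockrate[clockrates_table$clusterid == 15]
```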
Re: [R] dataframe operation
On Wed, 2007-01-24 at 14:16 -0600, Marc Schwartz wrote:

[earlier messages snipped]

Here is another possibility, which may be faster depending upon the actual size and dims of your initial data frame. Preallocate a matrix of replacement values:

Mat <- matrix(rep(seq(along = colnames(DF)) * 100, each = nrow(DF)), ncol = ncol(DF))
Mat
     [,1] [,2] [,3]
[1,]  100  200  300
[2,]  100  200  300
[3,]  100  200  300

Now do the replacement:

ifelse(!is.na(DF), Mat, NA)
  column1 column2 column3
1     100     200     300
2      NA     200     300
3     100      NA     300

In doing some testing, the above may be about 10 times faster than using sapply() in my first solution, again depending upon the structure of your DF. HTH, Marc
[R] Importing XPORT datasets into R
Hi, I'm experiencing a strange issue while attempting to import some XPORT formatted datasets created using SAS 9.3. I'm using R 2.2.1 with a recently downloaded version of the foreign package, although I don't have the version number handy. I have been able to load some of the XPORT datasets; however, when I increase to the entire dataset, I get the following error:

Error in lookup.xport(file) : file not in SAS transfer format

I am trying to import a dataset of roughly 1000 observations with about 600 variables. If I restrict the dataset somewhat and allow only 300 variables, the import appears to succeed without error. I have not seen a note in the documentation anywhere about a maximum number of variables for a dataset in R or for importing with read.xport, but I suspect this is the issue. For completeness, here is the start of the .xport file (it's all on one line):

HEADER RECORD***LIBRARY HEADER RECORD!!!00 SAS SAS SASLIB 9.1 SunOS

If anyone has any insights into this, I would greatly appreciate it. Thanks! Mike Greene
Re: [R] Cronbach's alpha
Harold and Weiwei -- Actually, alpha *can* go negative, which means that the items are reliably different as opposed to reliably similar. This happens when the sum of the covariances among the items is negative. See the ATS site below for a more thorough explanation:

http://www.ats.ucla.edu/STAT/SPSS/library/negalpha.htm

Hope that helps. cheers, Dave

-- Dave Atkins, PhD, Assistant Professor in Clinical Psychology, Fuller Graduate School of Psychology. Email: [EMAIL PROTECTED] Phone: 626.584.5554

[quoted messages snipped]
Is there any function in R to do that? Thank you very much! -- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc. Did you always know? No, I did not. But I believed... ---Matrix III __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
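As a footnote to the exchange above: a small self-contained simulation (not from the thread; the alpha() function is Harold's, the data are made up) showing how a reverse-keyed item produces a negative alpha, and how re-keying it restores a positive one.

```r
# Harold's alpha(), plus simulated items: two keyed in the direction
# of the latent trait and one keyed against it, so the sum of the
# inter-item covariances goes negative.
alpha <- function(columns) {
  k <- ncol(columns)
  colVars <- apply(columns, 2, var)
  total <- var(apply(columns, 1, sum))
  (total - sum(colVars)) / total * (k / (k - 1))
}

set.seed(42)
trait <- rnorm(200)
items <- cbind(trait + rnorm(200),
               trait + rnorm(200),
               -trait + rnorm(200))   # reverse-keyed item

a_raw   <- alpha(items)                             # negative
a_fixed <- alpha(cbind(items[, 1:2], -items[, 3]))  # positive after re-keying
```

The population values here are roughly -0.75 and +0.75, which illustrates Dave's point: the statistic is not bounded below by 0 once covariances go negative.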
[R] User defined function calls
Hi I have a script processfiles.R that contains, amongst other functions 1) a database access function called get_clockrates which retrieves from a database a table containing columns (clusterid, clockrate) and 45000 rows (one for each clusterid). Clusterid is an integer and clockrate is a float. 2) process_clusterid which takes clusterid as an argument and, after doing some data processing, retrieves the clockrate corresponding to the clusterid. I wish to call get_clockrates only once and keep the data frame returned by it as a GLOBAL which the function process_clusterid can use for each clusterid that it processes. To ensure that clockrates is global, I retrieve it as clockrates <- sqlQuery.. Trust that this is correct. Without the inclusion of the get_clockrates function, I have run this script under R as follows: source("process_files.R"); for (index in c(1:45000)) { try(process_file, silent=TRUE); } How do I get this code to execute get_clockrates only once and subsequently call process_file for each of the 45000 files in turn? I would greatly appreciate your input regarding my query. Thanks Lalitha __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
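A sketch of the usual pattern (mock data stands in for the sqlQuery() call, and the function names follow Lalitha's description, so treat them as assumptions): run the query once at top level, then let each per-cluster call read the resulting global.

```r
# Mock replacement for the database query described above.
get_clockrates <- function() {
  data.frame(clusterid = 1:5, clockrate = c(1.2, 2.4, 3.1, 0.8, 1.6))
}

# Executed once; a top-level assignment makes the data frame visible
# to every function called later in the session.
clockrates <- get_clockrates()

process_clusterid <- function(id) {
  # look the rate up in the global instead of re-querying the database
  clockrates$clockrate[clockrates$clusterid == id]
}

rates <- sapply(1:5, function(i) try(process_clusterid(i), silent = TRUE))
```

In the real script the loop would run over all 45000 clusterids; the key point is only that get_clockrates() is called once, outside the loop.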
Re: [R] Query about extracting subset of datafram
On Wed, 2007-01-24 at 12:31 -0800, lalitha viswanath wrote: Hi I have a table read from a mysql database which is of the kind clusterid clockrate I obtained this table in R as clockrates_table <- sqlQuery(channel, select); I have a function within which I wish to extract the clockrate for a given clusterid. Although I know that there is just one row per clusterid in the data frame, I am using subset to extract the clockrate. clockrate = subset(clockrates_table, clusterid==15, select=c(clockrate)); You don't need the ';', though some will argue that it is a personal preference. Also, the c(...) around 'clockrate' is not needed when only one column is being selected. Is there any way of extracting the clockrate without using subset? You could use: clockrates_table[clockrates_table$clusterid == 15, "clockrate"] or perhaps: with(clockrates_table, clockrate[clusterid == 15]) See ?with If you did not need the conditional, you could of course use: clockrates_table$clockrate or: clockrates_table[["clockrate"]] or: clockrates_table[, "clockrate"] or: with(clockrates_table, clockrate) My personal preference is to use subset(), as for me, it makes the code easier to read. In the help section for subset, it mentioned to see also: "[", ... However I could find no mention of this entry when I searched as ?[, etc. Try: ?"[" or ?Extract Note the placement of the quotes in the first case. The R manuals, despite discussing complex libraries, techniques etc., don't always seem to provide such handy hints/tips and tricks for manipulating data, which is a first stumbling block for newbies like me. I would greatly appreciate it if you could point me to such resources as well, for future reference. If you have not yet, reading the Posting Guide, for which there is a link at the bottom of each e-mail, is a good place to start. Also, see ?RSiteSearch for a function which will enable you to search the e-mail list archives.
HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
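A runnable side-by-side of the idioms discussed above (toy data in place of the MySQL table); note that subset() returns a one-column data frame while the bracket forms return a plain vector.

```r
clockrates_table <- data.frame(clusterid = c(10, 15, 20),
                               clockrate = c(1.1, 2.5, 3.7))

s <- subset(clockrates_table, clusterid == 15, select = clockrate)
b <- clockrates_table[clockrates_table$clusterid == 15, "clockrate"]
w <- with(clockrates_table, clockrate[clusterid == 15])

is.data.frame(s)   # TRUE: subset() keeps the data frame structure
b                  # 2.5, a bare numeric
identical(b, w)    # TRUE
```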
[R] Probabilities calibration error ROCR
Hello, I need to compute the calibration error of posterior class probabilities p(y|x) estimated by using rpart as a classification tree. Namely, I train rpart on a dataset D and then use predict(..., type="prob") to estimate p(y|x). I've found the possibility to do that in the ROCR package, but I cannot find a link to a paper/book which explains the details of the implemented algorithm. Do you know of any reference where I can find the details of the algorithm that computes the calibration error implemented in ROCR (apart from ROCR's source code)? Is there any other function/package I can use to compute the calibration error? thank you, regards, Roberto __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cronbach's alpha
Hi Dave, We had a bit of an off-list discussion on this. You're correct, it can be negative IF the covariance among individual items is negative AND if that covariance term is larger than the sum of the individual item variances. Both of these conditions would be needed to make alpha go negative. Psychometrically speaking, this introduces some question as to whether the items are measuring the same latent trait. That is, if there is a negative covariance among items, but those items are thought to measure a common trait, then (I'm scratching my head) I think we have a dimensionality issue.

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dave Atkins Sent: Wednesday, January 24, 2007 4:08 PM To: R-help@stat.math.ethz.ch Subject: Re: [R] Cronbach's alpha [Dave's reply and the earlier thread, quoted here in full in the original message, appear above; snipped.]

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] dataframe operation
Here is a slight variation on Marc's idea:

isna <- is.na(DF)
DF[] <- replace(100 * col(isna), isna, NA)

On 1/24/07, Marc Schwartz [EMAIL PROTECTED] wrote: On Wed, 2007-01-24 at 14:16 -0600, Marc Schwartz wrote: On Wed, 2007-01-24 at 14:10 -0600, Marc Schwartz wrote: On Wed, 2007-01-24 at 20:27 +0100, Indermaur Lukas wrote: hi i have a dataframe a which looks like:

column1, column2, column3
10, 12, 0
NA, 0, 1
12, NA, 50

i want to replace all values in column1 to column3 which do not contain NA with values of vector b (100,200,300). any idea how i can do it? i appreciate any hint regards lukas

Here is one possibility:

sapply(seq(along = colnames(DF)), function(x) ifelse(is.na(DF[[x]]), 100 * x, DF[[x]]))
     [,1] [,2] [,3]
[1,]   10   12    0
[2,]  100    0    1
[3,]   12  200   50

Note that the returned object will be a matrix, so if you need a data frame, just coerce the result with as.data.frame(). OK, that's what I get for pulling the trigger too fast. Just reverse the logic in the function:

sapply(seq(along = colnames(DF)), function(x) ifelse(!is.na(DF[[x]]), 100 * x, DF[[x]]))
     [,1] [,2] [,3]
[1,]  100  200  300
[2,]   NA  200  300
[3,]  100   NA  300

I misread the query initially. Here is another possibility, which may be faster depending upon the actual size and dims of your initial data frame. Preallocate a matrix of replacement values:

Mat <- matrix(rep(seq(along = colnames(DF)) * 100, each = nrow(DF)), ncol = ncol(DF))
Mat
     [,1] [,2] [,3]
[1,]  100  200  300
[2,]  100  200  300
[3,]  100  200  300

Now do the replacement:

ifelse(!is.na(DF), Mat, NA)
  column1 column2 column3
1     100     200     300
2      NA     200     300
3     100      NA     300

In doing some testing, the above may be about 10 times faster than using sapply() in my first solution, again depending upon the structure of your DF. HTH, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
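For completeness, the replace()/col() one-liner above run end to end on Lukas's example data:

```r
# Lukas's data frame, with NAs where values are missing.
DF <- data.frame(column1 = c(10, NA, 12),
                 column2 = c(12, 0, NA),
                 column3 = c(0, 1, 50))

# col(isna) holds each cell's column index; scale by 100 to get the
# replacement values, then put the NAs back where they were.
isna <- is.na(DF)
DF[] <- replace(100 * col(isna), isna, NA)
DF
#   column1 column2 column3
# 1     100     200     300
# 2      NA     200     300
# 3     100      NA     300
```

Note that `DF[] <-` (rather than `DF <-`) assigns the matrix into the existing data frame, so the result stays a data frame with its original names.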
[R] JOB: LARS Package Developer
Insightful is seeking a Research Scientist with a strong background in statistical methodology, algorithms development, data analysis, and software development. The primary responsibilities are to develop software for high-dimensional regression and machine learning applications using least angle regression (LARS). The official position is listed at: http://www.insightful.com/company/jobdescription.asp?JobID=118 More information about the project is at: http://www.insightful.com/Hesterberg/glars For technical questions contact Tim Hesterberg [EMAIL PROTECTED] To apply, please contact Jill Goldschneider [EMAIL PROTECTED] or Human Resources [EMAIL PROTECTED] Thank you, Jill Jill R. Goldschneider, Ph.D. Director of Research Insightful Corporation 1700 Westlake Ave. N. Suite 500 Seattle WA 98109 (206) 802-2327 (office) (206) 953-9355 (mobile) (206) 802-2500 (fax) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] JOB: LARS internships
Insightful is seeking a pre-doctoral student and an undergraduate student for two internship positions. The primary responsibilities are to assist in the development of software for high-dimensional regression and machine learning applications using least angle regression (LARS). The pre-doctoral candidate should have a background and interest in statistical methodology, algorithms, data analysis, simulation studies and software development and documentation. The candidate should be currently pursuing a Ph.D. degree. The undergraduate intern position requires a person with at least three years of undergraduate training and a solid background in mathematics and interest in statistical methodology, algorithms, data analysis, simulation studies and software development and documentation. The candidate should be currently pursuing a bachelor's degree. The official positions are listed at: http://www.insightful.com/company/jobdescription.asp?JobID=116 http://www.insightful.com/company/jobdescription.asp?JobID=117 More information about the project can be found at: http://www.insightful.com/Hesterberg/glars For technical questions contact Tim Hesterberg [EMAIL PROTECTED] To apply, please contact Jill Goldschneider [EMAIL PROTECTED] or Human Resources [EMAIL PROTECTED] Thank you, Jill Jill R. Goldschneider, Ph.D. Director of Research Insightful Corporation 1700 Westlake Ave. N. Suite 500 Seattle WA 98109 (206) 802-2327 (office) (206) 953-9355 (mobile) (206) 802-2500 (fax) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Text position in Traditional Graphics
R 2.4.1 on Windows XP. Question: In traditional graphics, is it possible to find out the height of a line of text in units that can be used in arithmetic and then in calls to text()? Context: I have written a function that draws a plot and then, depending on whether some arguments are TRUE or FALSE, draws various lines of text in the plot. The text lines may be turned on or off individually by the user. The function uses plot() and several calls to text(). However, I have not found a good way to adjust the Y coordinate of the text for lines after the first. I would like this to work when the graphics device (windows) is opened at (or resized to) a wide range of sizes. The issue is that a line of text takes up a smaller fraction of the total Y span of the plotting region as the window gets larger. It seems this can be done with grid graphics, but although I plan to learn grid, I am hoping that for now, I can do this work with traditional graphics. Thanks! -- Mike Prager, NOAA, Beaufort, NC * Opinions expressed are personal and not represented otherwise. * Any use of tradenames does not constitute a NOAA endorsement. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cronbach's alpha
Hi there: I read that article (thanks, Chuck, et al., for pointing it out). Now I understand how those negative values are generated, since my research subjects should have negative covariance even though they are measuring the same thing. So I am confused about this "same thing" and about whether it is proper to go ahead and use this measurement. To clarify my point, I'll describe my idea here a little bit. My idea is to look for a way to assign a statistic or measurement to a set of variables to see if they act cohesively or coherently for an event. Instead of using simple correlation, which describes variable-to-variable correlation, I wanted to get a total correlation so that I can compare between SETS of variables. Initially I made that word up, but Google helped me find that the statistic exists! So I read into it and posted my original question on total correlation. (Ben, you can find total correlation on Wikipedia.) I was advised to use this alpha since it measures one latent construct, which matches my idea about one event. I have a feeling it is like factor analysis; however, the grouping of variables has been fixed by domain knowledge. Sorry if this is off-topic for the list, but I feel it is very interesting to pursue. Thanks, Weiwei

On 1/24/07, Doran, Harold [EMAIL PROTECTED] wrote: Hi Dave, We had a bit of an off-list discussion on this. [Harold's reply and the earlier thread, quoted here in full in the original message, appear above; snipped.]
[R] modify rectangle color from image
Hi, I need some suggestion on how I could modify the color of some rectangle that I have created using image. In other words, I have a 5x5 matrix, say, m. m <- matrix(rnorm(25), nrow=5) I create a grid of rectangles by: image(m) Now I want to change the color of rectangle (3,3) to blue. I don't know how this could be done, and searching the web has given me no hint. Thanks for your help. -- saurav __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] n step ahead forecasts
Dear Spencer: just my 2 cents: could you change the step from 1 to m, like 5, if you have a very large validation set? I have a feeling that it won't change the result too much, but I am not sure what your endpoint is. weiwei On 1/24/07, sj [EMAIL PROTECTED] wrote: hello, I have a question about making n step ahead forecasts in cases where test and validation sets are available. For instance, I would like to make one step ahead forecasts on the WWWusage data, so I hold out the last 10 observations as the validation set and fit an ARIMA model on the first 90 observations. I then use a for loop to sequentially add 9 of the holdout observations to make 1 step ahead forecasts for the last 10 periods (see example code). In cases where there are relatively few periods to forecast this seems to work fine; however, I am working with a rather large validation set and I need to make n step ahead forecasts for many periods, and it takes a very long time. Is there a more efficient way to do this?

vset <- WWWusage[91:100]
pred <- c()
for (i in 0:9) {
  fit <- arima(WWWusage[1:(90+i)], c(3,1,0))
  p <- predict(fit, se.fit=FALSE)
  pred <- c(pred, p)
}
plot(pred, type="o", col=2)
lines(vset, type="o", col=1)

thanks, Spencer [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc. Did you always know? No, I did not. But I believed... ---Matrix III __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
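One way to cut the cost (an assumption on my part: the per-step re-estimation is the bottleneck, and the coefficients are stable enough to freeze) is to estimate the model once and then re-fit each step with arima()'s 'fixed' argument, which holds the coefficients at their initial estimates and so skips the optimizer:

```r
# Estimate once on the training window.
fit0 <- arima(WWWusage[1:90], c(3, 1, 0))

pred <- numeric(10)
for (i in 0:9) {
  # Same model on the extended series, coefficients held fixed;
  # only the filter state is updated for the new observations.
  fit <- arima(WWWusage[1:(90 + i)], c(3, 1, 0),
               fixed = coef(fit0), transform.pars = FALSE)
  pred[i + 1] <- predict(fit, n.ahead = 1, se.fit = FALSE)
}
```

The forecasts will differ slightly from full re-estimation, since the coefficients no longer adapt as observations arrive; whether that trade-off is acceptable depends on the application.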
Re: [R] Text position in Traditional Graphics
A few days ago Jim Holman [EMAIL PROTECTED] suggested this (Re: [R] How to annotate a graph with non-transparent math labels?) for a similar circumstance. Perhaps it will work in your case: try using strwidth/strheight.

x <- c(0, 1)
plot(x, x, type='l')
dimensions <- matrix(c(strwidth(expression(theta), cex=5),
                       strheight(expression(theta), cex=5)), nrow=1)
symbols(0.5, 0.5, rectangles=dimensions, bg='white', fg='white', add=TRUE, inches=FALSE)
text(0.5, 0.5, expression(theta), cex=5)

~ Charles Annis, P.E. [EMAIL PROTECTED] phone: 561-352-9699 eFax: 614-455-3265 http://www.StatisticalEngineering.com -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mike Prager Sent: Wednesday, January 24, 2007 4:30 PM To: r-help@stat.math.ethz.ch Subject: [R] Text position in Traditional Graphics [Mike's original message, quoted here in full, appears above; snipped.]
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Text position in Traditional Graphics
On Wed, 2007-01-24 at 16:30 -0500, Mike Prager wrote: [Mike's original message, quoted here in full, appears above; snipped.] Mike, you might want to take a look at: ?strheight The one thing to be potentially aware of is that if the plot window is resized, some aspects of drawing text can be subject to alteration. It may take some trial and error to determine how the method you wish to use may be prone to such problems. For example:

plot(1, type = "n")
text(1, 1, "This is a test")
text(1, 1 + strheight("T"), "This is a test")
text(1, 1 + strheight("T") * 2, "This is a test")
text(1, 1 + strheight("T") * 3, "This is a test")

# Now, drag and resize the plot window here

Then run:

text(1, 1 + strheight("T") * 4, "This is a test")

This may behave differently on Windows, but on Linux, when I resize the X window, the last line of text (* 4) is placed between (* 2) and (* 3). HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Text position in Traditional Graphics
Mike Prager [EMAIL PROTECTED] wrote: R 2.4.1 on Windows XP. Question: In traditional graphics, is it possible to find out the height of a line of text in units that can be used in arithmetic and then in calls to text()? [...] I seem to have solved my own question by setting the user scale to the size of the window in inches, converting the point size into inches, and going from there. This works well for all sizes of windows. It doesn't change the spacing when windows are resized, but I can live with that. There is nothing like posting to R-help to stimulate one's own thoughts. -- Mike Prager, NOAA, Beaufort, NC * Opinions expressed are personal and not represented otherwise. * Any use of tradenames does not constitute a NOAA endorsement. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
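A sketch of the approach Mike describes (tying the user coordinate system to the device size in inches, so text heights measure consistently); the "Mg" probe string and the 1.2 leading factor are assumptions, not from the thread:

```r
plot.new()
din <- par("din")                    # device size in inches (width, height)
plot.window(xlim = c(0, din[1]), ylim = c(0, din[2]))
# User units now equal inches, so a measured text height can be used
# directly as a y offset between successive annotation lines.

line_h <- strheight("Mg", units = "user") * 1.2   # one text line plus leading
for (i in 0:3) {
  text(0.5, din[2] - 0.5 - i * line_h,
       paste("annotation line", i + 1), adj = 0)
}
```

As Mike notes, this fixes the spacing at the size the window had when the plot was drawn; resizing afterwards will not re-space existing text.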
Re: [R] modify rectangle color from image
Thanks, Saurav. Saurav Pathak [Wed, Jan 24, 2007 at 04:37:20PM -0500]: + Hi, + + I need some suggestion on how I could modify the color of some + rectangle that I have created using image. + + In other words, I have a 5x5 matrix, say, m. + + m <- matrix(rnorm(25), nrow=5) + + I create a grid of rectangles by: + + image(m) + + Now I want to change the color of rectangle (3,3) to blue. Use rect() for this. -- saurav __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mixed effects or fixed effects?
Hi Dan, this is an interesting and intricate question, but only marginally related to the subject line. On Wed, Jan 24, 2007 at 03:25:39PM +, dan kumpik wrote: Hi, I am running a learning experiment in which both training subjects and controls complete a pretest and posttest. All analyses are being conducted in R. We are looking to compare two training methodologies, and so have run this experiment twice, once with each methodology. Methodology is a between-subjects factor. Trying to run this analysis with every factor included (ie, subject as a random factor, session nested within group nested within experiment) seems to me (after having tried) to be clumsy and probably uninterpretable. My favoured model for the analysis is a linear mixed-effects model, and to combine the data meaningfully, I have collated all the pretest data for controls and trained subjects from each experiment, and assumed this data to represent a population sample for naive subjects for each experiment. I have also ditched the posttest data for the controls, and assumed the posttest training data to represent a population sample for trained subjects for each experiment. I have confirmed the validity of these assumptions by ascertaining that a) controls and trained listeners did not differ significantly at pretest for either experiment; and b) control listeners did not learn significantly between pretest and posttest (and therefore their posttest data are not relevant). This was done using a linear mixed-effects model for each experiment, with subject as a random factor and session (pretest vs posttest) nested within Group (trained vs control). I don't agree with ditching the posttest data for the controls. Although you may have failed to detect a lack of statistically significant learning, that doesn't mean that there isn't enough learning to imperil your inference. 
Also, under an appropriate model, posttest control data could contribute to estimating the variance components, so by discarding data you risk losing power. And by your description of the strategy, you lose balance, but this is not such a problem as far as I am aware. Therefore, the model I want to use to analyse the data would ideally be a linear mixed-effects model, with subject as a random factor, and session (pre vs post) nested within experiment. Note that my removal of the Group (Trained vs Control) factor simplifies the model somewhat, and makes it more interpretable in terms of evaluating the relative effects of each experiment. I see that it simplifies the interpretation but not necessarily in a constructive way! What I would like to know is- a) would people agree that this is a meaningful way to combine my data? I believe the logic is sound, but am slightly concerned that I am ignoring a whole block of posttest data for the controls (even though this does not account for a significant amount of the variance); and b) given that each of my trained subjects appear twice- one in the pretest and once in the posttest, and the controls only appear once- in the pretest sample, is there any problem with making subject a random factor? Conceptually, I see no problem with this, but I would like to be sure before I finish writing up. Many thanks for your time I think that you need to make the model structure match the experiment. I hope that this is useful for you! Andrew -- Andrew Robinson Department of Mathematics and StatisticsTel: +61-3-8344-9763 University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599 http://www.ms.unimelb.edu.au/~andrewpr http://blogs.mbs.edu/fishing-in-the-bay/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Text position in Traditional Graphics
If you are going to take this approach, you may want to look at the cnvrt.coords function in the TeachingDemos package. That may save you a few calculations. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] (801) 408-8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mike Prager Sent: Wednesday, January 24, 2007 3:01 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] Text position in Traditional Graphics Mike Prager [EMAIL PROTECTED] wrote: R 2.4.1 on Windows XP. Question: In traditional graphics, is it possible to find out the height of a line of text in units that can be used in arithmetic and then in calls to text()? [...] I seem to have solved my own question by setting the user scale to the size of the window in inches, converting the point size into inches, and going from there. This works well for all sizes of windows. It doesn't change the spacing when windows are resized, but I can live with that. There is nothing like posting to R-help to stimulate one's own thoughts. -- Mike Prager, NOAA, Beaufort, NC * Opinions expressed are personal and not represented otherwise. * Any use of tradenames does not constitute a NOAA endorsement. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
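A hedged alternative to Mike's manual point-size arithmetic (this is my sketch, not his actual code): base graphics can report text height directly in user coordinates via strheight(), and par("cxy") gives the default character size in the same units, so stacked lines can be placed without converting through inches:

```r
pdf(NULL)                             # off-screen device, just for illustration
plot(1:10)
h <- strheight("Mg", units = "user")  # height of one line of this text
text(5, 5, "first line")
text(5, 5 - 1.2 * h, "second line")   # stack a second line just below it
dev.off()
```

Note that h is tied to the current device and cex, so it should be recomputed after a resize — the same limitation Mike mentions.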
Re: [R] modify rectangle color from image
On Wed, 2007-01-24 at 16:37 -0500, Saurav Pathak wrote: Hi, I need some suggestion on how I could modify the color on some rectangle that I have created using image. In other words, I have a 5x5 matrix, say, m. m <- matrix(rnorm(25), nrow=5) I create a grid of rectangles by: image(m) Now I want to change the color of rectangle (3,3) to blue. I don't know how this could be done, and searching the web has given me no hint. Thanks for your help. Try this:
m <- matrix(rnorm(25), nrow = 5)
image(m)
# Get the plot region coords
USR <- par("usr")
# Calc the length of a side of a square
SIDE <- abs(USR[1] - USR[2]) / 5
# Draw the rect using the appropriate offsets
rect(USR[1] + (SIDE * 2), USR[3] + (SIDE * 2),
     USR[1] + (SIDE * 3), USR[3] + (SIDE * 3),
     col = "blue")
See ?par and review 'usr', then see ?rect. par("usr") gives you the coordinates of the plot region. Then just do the math to calculate the coordinates of each rectangle. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
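Marc's offsets generalize to any cell. A sketch of a reusable wrapper (the function name `color_cell` is mine; like Marc's reply, it approximates cell boundaries from the plot region of a square matrix drawn with image()'s default coordinates):

```r
color_cell <- function(m, i, j, col = "blue") {
  usr <- par("usr")
  side <- (usr[2] - usr[1]) / nrow(m)  # approximate width of one cell
  rect(usr[1] + (i - 1) * side, usr[3] + (j - 1) * side,
       usr[1] + i * side, usr[3] + j * side, col = col)
}

pdf(NULL)                        # off-screen device, just for illustration
m <- matrix(rnorm(25), nrow = 5)
image(m)
color_cell(m, 3, 3)              # recolor cell (3,3)
dev.off()
```

For exact alignment one would instead compute the cell edges from the x/y break points that image() uses.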
Re: [R] Cronbach's alpha
Continuing off topic: 1. The range of alpha is -infinity < alpha < 1. 2. Alpha is NOT reliability. 3. There are trivial examples of alpha < 1 with reliability approaching 1. 4. There are trivial examples of alpha = 0 with reliability approaching 1. 5. Alpha cannot assess dimensionality. Lucke, Joseph F. The alpha and the omega of congeneric test theory: An extension of reliability and internal consistency to heterogeneous tests. Applied Psychological Measurement, 2005, 29(1), 65-81. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Weiwei Shi Sent: Wednesday, January 24, 2007 3:45 PM To: Doran, Harold Cc: R-help@stat.math.ethz.ch; Dave Atkins Subject: Re: [R] Cronbach's alpha Hi, there: I read that article (thanks Chuck, et al., for pointing that out). Now I understand how those negatives are generated, since my research subjects should have negative covariance even though they are measuring the same thing. So, I am confused about this "same thing" and about whether it is proper to go ahead and use this measurement. To clarify my point, I describe my idea here a little bit. My idea is to look for a way to assign a statistic or measurement to a set of variables to see if they act cohesively or coherently for an event. Instead of using simple correlation, which describes var/var correlation, I wanted to get a total correlation so that I can compare between sets of variables. Initially I made that word up, but Google helped me find that the statistic exists! So I read into it and posted my original post on total correlation. (Ben, you can find total correlation on the wiki). I was suggested to use this alpha since it measures one latent construct, which matches my idea about one event. I have a feeling it is like factor analysis; however, the grouping of variables has been fixed by domain knowledge. Sorry if it is an off-list topic, but I feel it is very interesting to go ahead.
Thanks, Weiwei On 1/24/07, Doran, Harold [EMAIL PROTECTED] wrote: Hi Dave We had a bit of an off list discussion on this. You're correct, it can be negative IF the covariance among individual items is negative AND if that covariance term is larger than the sum of the individual item variances. Both of these conditions would be needed to make alpha go negative. Psychometrically speaking, this introduces some question as to whether the items are measuring the same latent trait. That is, if there is a negative covariance among items, but those items are thought to measure a common trait, then (I'm scratching my head) I think we have a dimensionality issue. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dave Atkins Sent: Wednesday, January 24, 2007 4:08 PM To: R-help@stat.math.ethz.ch Subject: Re: [R] Cronbach's alpha Harold & Weiwei-- Actually, alpha *can* go negative, which means that items are reliably different as opposed to reliably similar. This happens when the sum of the covariances among items is negative. See the ATS site below for a more thorough explanation: http://www.ats.ucla.edu/STAT/SPSS/library/negalpha.htm Hope that helps. cheers, Dave -- Dave Atkins, PhD Assistant Professor in Clinical Psychology Fuller Graduate School of Psychology Email: [EMAIL PROTECTED] Phone: 626.584.5554 Weiwei Something is wrong. Coefficient alpha is bounded between 0 and 1, so negative values are outside the parameter space for a reliability statistic. Recall that reliability is the ratio of true score variance to total score variance. That is reliability = var(t) / (var(t) + var(e)). If all variance is true score variance, then var(e)=0 and the reliability is var(t)/var(t)=1. On the other hand, if all variance is measurement error, then var(t) = 0 and reliability is 0. Here is a function I wrote to compute alpha along with an example. Maybe try recomputing your statistic using this function and see if you get the same result.
alpha <- function(columns) {
  k <- ncol(columns)
  colVars <- apply(columns, 2, var)
  total <- var(apply(columns, 1, sum))
  a <- (total - sum(colVars)) / total * (k / (k - 1))
  a
}
data(LSAT, package = 'ltm')
alpha(LSAT)
[1] 0.2949972
Harold -Original Message- From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Weiwei Shi Sent: Wednesday, January 24, 2007 1:17 PM To: R R Subject: [R] Cronbach's alpha Dear Listers: I used cronbach{psy} to evaluate the internal consistency, and some sets of variables gave me alpha=-1.1003, while others gave alpha=-0.2, alpha=0.89, and so on. I am interested in knowing how to interpret 1. a negative value 2. a negative value less than -1. I also want to re-mention my previous question about how to evaluate the consistency of a set of variables and about the total
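Dave's point — that a negative inter-item covariance drives alpha below zero — is easy to demonstrate with Harold's function. This is a constructed illustration (not from the thread): two items measure the same trait, but one is reverse-scored, so their covariance is strongly negative:

```r
alpha <- function(columns) {
  k <- ncol(columns)
  colVars <- apply(columns, 2, var)
  total <- var(apply(columns, 1, sum))
  (total - sum(colVars)) / total * (k / (k - 1))
}

set.seed(1)
trait <- rnorm(200)
items <- cbind(item1 = trait + rnorm(200, sd = 0.3),
               item2 = -trait + rnorm(200, sd = 0.3))  # reverse-scored item
alpha(items)   # strongly negative: the row sums cancel the trait variance
```

Flipping the sign of item2 (i.e., rescoring it) makes the covariance positive and alpha jumps close to 1 — which is why reverse-keyed items should be recoded before computing alpha.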
Re: [R] RODBC
On 1/24/2007 3:25 PM, jgaseff wrote: Hello, I am fairly new to R and its connectivity to MS-Access. I just installed RODBC and it seems to be working well except when I use the date to condition the query. For example the query below
sqlQuery(channel, "select date from tblUScpi where (date < d2) order by date")
returns the following error [1] [RODBC] ERROR: Could not SQLExecDirect [2] 07001 -3010 [Microsoft][ODBC Microsoft Access Driver] Too few parameters. Expected 1. I checked that d2 and the elements in date belong to the same class ("POSIXt" "POSIXct"). Can anybody help me? If d2 is an R variable, you'll need to convert it to SQL syntax for dates (whatever that is; format() can probably do it). R just passes the query string to the database; it doesn't do any variable substitutions for you. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
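A sketch of Duncan's suggestion: build the date literal into the query string yourself. Assumption (mine, not stated in the thread): MS Access / Jet SQL delimits date literals with `#`; the table and column names are taken from the post:

```r
d2 <- as.POSIXct("2006-12-01", tz = "UTC")
# R does no variable substitution, so paste the formatted literal in:
query <- paste0("select date from tblUScpi where (date < ",
                format(d2, "#%Y-%m-%d#"), ") order by date")
query
# "select date from tblUScpi where (date < #2006-12-01#) order by date"
```

The resulting string is then passed to sqlQuery(channel, query) as usual.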
Re: [R] Cronbach's alpha
Even if the grouping of variables has been fixed by domain knowledge, it does not mean there is unidimensionality in your items (at least for the sample of folks you have). For example, math reasoning and math fluency could both be conceptually put into a single math test, but, assuming a random sample of folks and enough items, it would really be measuring two different areas (which then could attenuate the overall alpha). You are right that alpha is similar to latent variable modeling. Here is a reference you might find useful. Miller, M. B. (1995). Coefficient alpha: A basic introduction from the perspectives of classical test theory and structural equation modeling. Structural Equation Modeling, 2(3), 255-273. I am not sure if the R IRT packages can do item-level factor analysis, but the TESTFACT program does (it is the one I have had to use in the past). Also, the R psych package can compute McDonald's omega, which estimates the general factor saturation of a test. Best, Alex On 1/24/07, Weiwei Shi [EMAIL PROTECTED] wrote: Hi, there: I read that article (thanks Chuck, et al., for pointing that out). Now I understand how those negatives are generated, since my research subjects should have negative covariance even though they are measuring the same thing. So, I am confused about this "same thing" and about whether it is proper to go ahead and use this measurement. To clarify my point, I describe my idea here a little bit. My idea is to look for a way to assign a statistic or measurement to a set of variables to see if they act cohesively or coherently for an event. Instead of using simple correlation, which describes var/var correlation, I wanted to get a total correlation so that I can compare between sets of variables. Initially I made that word up, but Google helped me find that the statistic exists! So I read into it and posted my original post on total correlation. (Ben, you can find total correlation on the wiki).
I was suggested to use this alpha since it measures one latent construct, which matches my idea about one event. I have a feeling it is like factor analysis; however, the grouping of variables has been fixed by domain knowledge. Sorry if it is an off-list topic, but I feel it is very interesting to go ahead. Thanks, Weiwei On 1/24/07, Doran, Harold [EMAIL PROTECTED] wrote: Hi Dave We had a bit of an off list discussion on this. You're correct, it can be negative IF the covariance among individual items is negative AND if that covariance term is larger than the sum of the individual item variances. Both of these conditions would be needed to make alpha go negative. Psychometrically speaking, this introduces some question as to whether the items are measuring the same latent trait. That is, if there is a negative covariance among items, but those items are thought to measure a common trait, then (I'm scratching my head) I think we have a dimensionality issue. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dave Atkins Sent: Wednesday, January 24, 2007 4:08 PM To: R-help@stat.math.ethz.ch Subject: Re: [R] Cronbach's alpha Harold & Weiwei-- Actually, alpha *can* go negative, which means that items are reliably different as opposed to reliably similar. This happens when the sum of the covariances among items is negative. See the ATS site below for a more thorough explanation: http://www.ats.ucla.edu/STAT/SPSS/library/negalpha.htm Hope that helps. cheers, Dave -- Dave Atkins, PhD Assistant Professor in Clinical Psychology Fuller Graduate School of Psychology Email: [EMAIL PROTECTED] Phone: 626.584.5554 Weiwei Something is wrong. Coefficient alpha is bounded between 0 and 1, so negative values are outside the parameter space for a reliability statistic. Recall that reliability is the ratio of true score variance to total score variance.
That is reliability = var(t) / (var(t) + var(e)). If all variance is true score variance, then var(e)=0 and the reliability is var(t)/var(t)=1. On the other hand, if all variance is measurement error, then var(t) = 0 and reliability is 0. Here is a function I wrote to compute alpha along with an example. Maybe try recomputing your statistic using this function and see if you get the same result.
alpha <- function(columns) {
  k <- ncol(columns)
  colVars <- apply(columns, 2, var)
  total <- var(apply(columns, 1, sum))
  a <- (total - sum(colVars)) / total * (k / (k - 1))
  a
}
data(LSAT, package = 'ltm')
alpha(LSAT)
[1] 0.2949972
Harold -Original Message- From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Weiwei Shi Sent: Wednesday, January 24, 2007 1:17 PM To: R R Subject: [R] Cronbach's alpha Dear Listers:
Re: [R] Probabilities calibration error ROCR
Roberto, On 1/24/07, Roberto Perdisci [EMAIL PROTECTED] wrote: [...]. Do you know of any reference where I can find the details of the algorithm that computes the calibration error implemented in ROCR (apart from ROCR's source code)? We use it as defined in Caruana & Niculescu-Mizil: Data mining in metric space: An empirical evaluation of supervised learning performance criteria. Knowledge Discovery and Data Mining (KDD) 2004. http://www.cs.cornell.edu/~caruana/perfs.rocai04.revised.rev1.ps Also, have a look at ?performance. HTH, Tobias __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] using non-ASCII strings in R packages
Hello dear useRs and wizaRds, I am currently developing a package that will make an administrative map of Poland available in R plots. Among other things I wanted to include region names in proper Polish so that they can be used in creating graphics etc. I am working on Windows, and when I build the package it complains about non-ASCII characters in R code files. I was wondering what would be the best way to implement them in a platform-independent way so that they can be used in computations as well as in producing PS, PDF and other graphic output. Unfortunately I have limited knowledge of encoding schemes etc. Is it OK to include them in Windows-1250 encoding (the default for the Polish locale, as far as I know)? I believe this problem is frequently confronted for other non-latin1 languages. If this is not the way to go, I would be very grateful for suggestions. Thanks in advance and kind regards, Michal Bojanowski ICS / Department of Sociology Utrecht University Heidelberglaan 2; 3584 CS Utrecht Room 1428 [EMAIL PROTECTED] http://www.fss.uu.nl/soc/bojanowski/ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] poly(x) workaround when x has missing values
Often in practical situations a predictor has missing values, so that poly crashes. For instance:
x <- 1:10
y <- x - 3 * x^2 + rnorm(10)/3
x[3] <- NA
lm( y ~ poly(x,2) )
Error in poly(x, 2) : missing values are not allowed in 'poly'
lm( y ~ poly(x,2) , subset=!is.na(x)) # This does not help?!?
Error in poly(x, 2) : missing values are not allowed in 'poly'
The following function seems to be an okay workaround.
Poly <- function(x, degree = 1, coefs = NULL, raw = FALSE, ...) {
  notNA <- !is.na(x)
  answer <- poly(x[notNA], degree = degree, coefs = coefs, raw = raw, ...)
  THEMATRIX <- matrix(NA, nrow = length(x), ncol = degree)
  THEMATRIX[notNA, ] <- answer
  attributes(THEMATRIX)[c('degree', 'coefs', 'class')] <- attributes(answer)[c('degree', 'coefs', 'class')]
  THEMATRIX
}
lm( y ~ Poly(x,2) )
Call: lm(formula = y ~ Poly(x, 2))
Coefficients: (Intercept) Poly(x, 2)1 Poly(x, 2)2 209.1475.0114.0
and it works when x and y are in a dataframe too:
DAT <- data.frame(x=x, y=y)
lm(y~Poly(x,2), data=DAT)
Call: lm(formula = y ~ Poly(x, 2), data = DAT)
Coefficients: (Intercept) Poly(x, 2)1 Poly(x, 2)2 -119.54 -276.11 -68.24
Is there a better way to do this? My workaround seems a bit awkward. Whoever wrote poly must have had a good reason for not making it deal with missing values? Thanks for any thoughts Jacob Wegelin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
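An alternative to wrapping poly() (my suggestion, not from the post): drop the incomplete rows before the model ever sees them, so poly() receives NA-free input. The cost is that the NA rows are removed from the fit entirely, whereas the Poly() wrapper keeps them in the design matrix as NA rows:

```r
set.seed(1)
x <- 1:10
y <- x - 3 * x^2 + rnorm(10) / 3
x[3] <- NA
DAT <- data.frame(x = x, y = y)

# na.omit() removes the row with the missing predictor up front,
# so poly() is computed on complete cases only.
fit <- lm(y ~ poly(x, 2), data = na.omit(DAT))
coef(fit)   # three coefficients, none NA
```

The subset= argument fails because poly(x, 2) is evaluated on the full variable before the subset is applied.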
[R] Days of the week?
Hi WizaRds, What is the standard way to get the day of the week from a date such as as.Date("2006-12-01")? It looks like fCalendar has some functions but this requires a change in the R locale to GMT. Is there another way? Thanks! Jack. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Days of the week?
?Date seems to say that weekdays() is appropriate for that: weekdays(as.Date("2006-12-01")) see: ?weekdays On 1/25/07, John McHenry [EMAIL PROTECTED] wrote: Hi WizaRds, What is the standard way to get the day of the week from a date such as as.Date("2006-12-01")? It looks like fCalendar has some functions but this requires a change in the R locale to GMT. Is there another way? Thanks! Jack. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Size of data vs. needed memory...rule of thumb?
I have been searching all day and most of last night, but can't find any benchmarking or recommendations regarding R system requirements for very large (2-5GB) data sets to help guide our hardware configuration. If anybody has experience with this they're willing to share, or could anybody point me in a direction that might be productive to research, it would be much appreciated. Specifically: will R simply use as much memory as the OS makes available to it, unlimited? Is there a multi-threaded version of R, or packages? Does the core R package support 64-bit, and should I expect to see any difference in how memory is handled under that version? Is 3GB of memory to 1GB of data a reasonable ballpark? Our testing thus far has been on a 32-bit Windows box w/1GB of RAM and 1 CPU; it appears to indicate something like 3GB of RAM for every 1GB of SQL table (ex-indexes, byte-sized factors). At this point, we're planning on setting up a dual-core 64-bit Linux box w/16GB of RAM for starters, since we generally have summed-down SQL tables of approx 2-5GB. Here are the details, just for context, in case I'm misinterpreting the results, or in case there's some more memory-efficient way to get data into R's binary format than going w/the data.frame.
R session:
library(RODBC)
channel <- odbcConnect("psmrd")
FivePer <- data.frame(sqlQuery(channel, "select * from AUTCombinedWA_BILossCost_5per"))
Error: cannot allocate vector of size 2000 Kb
In addition: Warning messages:
1: Reached total allocation of 1023Mb: see help(memory.size)
2: Reached total allocation of 1023Mb: see help(memory.size)
ODBC connection: Microsoft SQL Server ODBC Driver Version 03.86.1830 Data Source Name: psmrd Data Source Description: Server: psmrdcdw01\modeling Database: OpenSeas_Work1 Language: (Default) Translate Character Data: Yes Log Long Running Queries: No Log Driver Statistics: No Use Integrated Security: Yes Use Regional Settings: No Prepared Statements Option: Drop temporary procedures on disconnect Use Failover Server: No Use ANSI Quoted Identifiers: Yes Use ANSI Null, Paddings and Warnings: Yes Data Encryption: No Please be patient, I'm a new R user (or at least I'm trying to be... at this point I'm mostly a new R-help-reader); I'd appreciate being pointed in the right direction if this isn't the right help list to send this question to... or if this question is poorly worded (I did read the posting guide). Jill Willie Open Seas Safeco Insurance [EMAIL PROTECTED] [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
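Rather than relying on a fixed 3:1 ratio, the in-R footprint can be measured empirically on a sample and scaled up. This is my sketch, not from the thread; note also that sqlQuery() already returns a data.frame, so the extra data.frame() wrapper in the session above only costs an extra copy:

```r
# Measure the footprint of a 100,000-row sample with the same column
# types as the target table, then scale to the full row count.
n <- 1e5
df <- data.frame(num = rnorm(n),                                # 8 bytes/row
                 int = sample.int(100L, n, replace = TRUE),     # 4 bytes/row
                 fac = factor(sample(letters, n, replace = TRUE)))  # 4 bytes/row
print(object.size(df), units = "Mb")
```

As a rule of thumb, numeric columns cost 8 bytes per row and integer/factor columns 4, but temporary copies made while importing can multiply the peak requirement several times, which is consistent with the ~3:1 observation.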
Re: [R] Days of the week?
On 24 January 2007 at 17:39, John McHenry wrote: | What is the standard way to get the day of the week from a date such | as as.Date("2006-12-01")? It looks like fCalendar has some functions | but this requires a change in the R locale to GMT. Is there another way? Yes, go to POSIXlt and extract the wday field (see ?POSIXlt for more):
as.POSIXlt(as.Date("2006-12-01"))$wday
[1] 5
as.POSIXlt(as.Date("2006-12-01") + 0:6)$wday
[1] 5 6 0 1 2 3 4
Dirk -- Hell, there are no rules here - we're trying to accomplish something. -- Thomas A. Edison __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Days of the week?
You can use as.numeric(format(d, "%w")). See ?strptime and also the Help Desk article in R News 4/1. On 1/24/07, John McHenry [EMAIL PROTECTED] wrote: Hi WizaRds, What is the standard way to get the day of the week from a date such as as.Date("2006-12-01")? It looks like fCalendar has some functions but this requires a change in the R locale to GMT. Is there another way? Thanks! Jack. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
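The three suggestions in this thread agree; for the numeric forms, 0 means Sunday:

```r
d <- as.Date("2006-12-01")
weekdays(d)                  # day name, locale-dependent (e.g. "Friday")
as.POSIXlt(d)$wday           # 5
as.numeric(format(d, "%w"))  # 5
```

weekdays() returns a translated name in non-English locales, so the numeric forms are safer for programmatic comparisons.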
Re: [R] Fit model to data and use model for data generation
On Jan 24, 2007, at 10:34 AM, Benjamin Otto wrote: Hi, Suppose I have a set of values x and I want to calculate the distribution of the data. Usually I would use the density command. Now, can I use the resulting density object to generate a number of new values which have the same distribution? Or do I have to use some different function? Regards, Benjamin -- Benjamin Otto Universitaetsklinikum Eppendorf Hamburg Institut fuer Klinische Chemie Martinistrasse 52 20246 Hamburg You could sample from the x's in the density object with probability given by the y's:
### Create a bimodal distribution
x <- c(rnorm(25, -2, 1), rnorm(50, 3, 2))
d <- density(x, n = 1000)
plot(d)
### Sample from the distribution and show the two
### distributions are the same
x.new <- sample(d$x, size = 10, # large n for proof of concept
                replace = TRUE, prob = d$y / sum(d$y))
dx.new <- density(x.new)
lines(dx.new$x, dx.new$y, col = "blue")
Hope this helps, Stephen Rochester, Minnesota, USA __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
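Sampling the grid points d$x with weights d$y discretizes the distribution to the grid values. An equivalent, smoother route (my suggestion, not Stephen's): a kernel density estimate with a Gaussian kernel is a mixture of Gaussians centred at the observations, so an exact draw from it is a resampled data point plus N(0, bw) noise:

```r
set.seed(1)
x <- c(rnorm(25, -2, 1), rnorm(50, 3, 2))
d <- density(x)   # default Gaussian kernel; d$bw is the bandwidth

# Resample the observations, then jitter each by the kernel bandwidth
n <- 10^4
x.new <- sample(x, size = n, replace = TRUE) + rnorm(n, 0, d$bw)
```

This draws continuous values between the grid points and needs no prob weights.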
[R] as.numeric(.1) under RGtk2
Prof Brian Ripley wrote: I can reproduce this via
Sys.setlocale("LC_NUMERIC", "fr_FR")
[1] "fr_FR"
Warning message: setting 'LC_NUMERIC' may cause R to function strangely in: setlocale(category, locale)
as.numeric(",1")
[1] 0,1
as.numeric(".1")
[1] NA
Warning message: NAs introduced by coercion
Assuming you have not done that anywhere, it should not happen. If you have, you were warned. (Have you tried starting R with --vanilla to be sure?) as.numeric() is using strtod, which should only be affected by the locale category LC_NUMERIC, and R itself does not set LC_NUMERIC. So either you or some rogue OS function must have, unless there is a pretty major bug in the OS. (Just using a UTF-8 fr_FR locale does not do it on either of the Linux variants I tried.) Thanks for these helpful indications. This seems to be related to the RGtk2 package:
# Before loading RGtk2
as.numeric(".1")
[1] 0.1
as.numeric(",1")
[1] NA
Warning message: NAs introduits lors de la conversion automatique
# After library loading
library(RGtk2)
as.numeric(".1")
[1] NA
Warning message: NAs introduits lors de la conversion automatique
as.numeric(",1")
[1] 0,1
I send a copy of this post to the RGtk2 package maintainers. Thanks for your help, Yvonnick Noel, PhD. Dpt of Psychology U. of Rennes France __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
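An interim workaround (my suggestion, not from the thread) is to force the numeric locale back to "C" after the offending package has been loaded, restoring "." as the decimal separator:

```r
# Inspect, then reset, the numeric locale; "C" uses "." as decimal point.
Sys.getlocale("LC_NUMERIC")
Sys.setlocale("LC_NUMERIC", "C")
as.numeric(".1")   # 0.1 again
```

This has to be rerun if the package resets the locale later in the session.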
[R] [R-pkgs] version 0.3 of QCA
Dear list members, A new version of the QCA package is now on CRAN. The QCA package implements the Quine-McCluskey algorithm for boolean minimizations, according to Qualitative Comparative Analysis. Along with the additional improvements in version 0.3-1 (soon to be released on CRAN), this code is about 100 times faster than the previous major release (0.2-6). It can now reasonably work with 11 binary variables, finding a complete (and exact) solution in less than 2 minutes. This dramatic increase in speed is due to using a mathematical reduction instead of an algorithmic one. This approach opens the way for _exact_ multi-value minimizations, and an even better (and faster) approach is being sought for future versions. Best, Adrian -- Adrian Dusa Romanian Social Data Archive 1, Schitu Magureanu Bd 050025 Bucharest sector 5 Romania Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101 ___ R-packages mailing list R-packages@stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] as.numeric(.1) under RGtk2
This seems to be a conflict between GTK+ and R. Apparently, GTK+ sets the locale by itself. There is a way to prevent GTK+ from doing that. I will release a hotfix for RGtk2 soon and we'll see if it fixes it. I just need to run gtk_disable_setlocale() before gtk_init_check(). Thanks for reporting this, Michael On 1/24/07, NOEL Yvonnick [EMAIL PROTECTED] wrote: Prof Brian Ripley wrote: I can reproduce this via
Sys.setlocale("LC_NUMERIC", "fr_FR")
[1] "fr_FR"
Warning message: setting 'LC_NUMERIC' may cause R to function strangely in: setlocale(category, locale)
as.numeric(",1")
[1] 0,1
as.numeric(".1")
[1] NA
Warning message: NAs introduced by coercion
Assuming you have not done that anywhere, it should not happen. If you have, you were warned. (Have you tried starting R with --vanilla to be sure?) as.numeric() is using strtod, which should only be affected by the locale category LC_NUMERIC, and R itself does not set LC_NUMERIC. So either you or some rogue OS function must have, unless there is a pretty major bug in the OS. (Just using a UTF-8 fr_FR locale does not do it on either of the Linux variants I tried.) Thanks for these helpful indications. This seems to be related to the RGtk2 package:
# Before loading RGtk2
as.numeric(".1")
[1] 0.1
as.numeric(",1")
[1] NA
Warning message: NAs introduits lors de la conversion automatique
# After library loading
library(RGtk2)
as.numeric(".1")
[1] NA
Warning message: NAs introduits lors de la conversion automatique
as.numeric(",1")
[1] 0,1
I send a copy of this post to the RGtk2 package maintainers. Thanks for your help, Yvonnick Noel, PhD. Dpt of Psychology U. of Rennes France [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R programming question, one dimensional optimization
Hi, I have an optimization problem for x'Ax/x'Bx, where x is a vector and A and B are matrices. I wrote a small program which takes in 2 matrices, a vector and a variable; the program combines the variable and the vector to generate a new vector, then evaluates x'Ax/x'Bx. However, I do not know if there is a way to calculate the x automatically, instead of my typing in different values to get the result and compare. ===
getMultiVal3 <- function(a, b, cc, x) {
  sum1 <- 0; sum2 <- 0      # renamed from 'sum' to avoid masking base::sum
  n <- nrow(a)
  v <- c(x, cc)
  for (i in 1:n) {
    for (j in 1:n) {
      sum1 <- sum1 + a[i, j] * v[i] * v[j]
      sum2 <- sum2 + b[i, j] * v[i] * v[j]
    }
  }
  return(sum1 / sum2)
}
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
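The ratio x'Ax / x'Bx is a generalized Rayleigh quotient: when A and B are symmetric and B is positive definite, its maximum over all x is the largest eigenvalue of B^{-1}A, attained at the corresponding eigenvector, so no search over candidate vectors is needed. A sketch with random positive definite matrices (my example data, not the poster's):

```r
set.seed(1)
A <- crossprod(matrix(rnorm(9), 3))            # symmetric positive definite
B <- crossprod(matrix(rnorm(9), 3)) + diag(3)  # symmetric positive definite

e <- eigen(solve(B, A))       # solve(B, A) computes B^{-1} %*% A
x <- Re(e$vectors[, 1])       # eigenvector of the largest eigenvalue
ratio <- c(t(x) %*% A %*% x) / c(t(x) %*% B %*% x)
all.equal(ratio, Re(e$values[1]))   # the quotient equals that eigenvalue
```

Any multiple of x gives the same ratio, which is why the quotient is scale-free in x.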
Re: [R] poly(x) workaround when x has missing values
Orthogonality of polynomials is not defined if they contain missing values, which seems a good enough reason to me. Put another way, in your solution whether the columns are orthogonal depends on the unknown values of the NAs, and it looks like it is only true if the unknown values are all zero. On Wed, 24 Jan 2007, Jacob Wegelin wrote: Often in practical situations a predictor has missing values, so that poly crashes. For instance:
x <- 1:10
y <- x - 3 * x^2 + rnorm(10)/3
x[3] <- NA
lm( y ~ poly(x,2) )
Error in poly(x, 2) : missing values are not allowed in 'poly'
lm( y ~ poly(x,2) , subset=!is.na(x)) # This does not help?!?
Error in poly(x, 2) : missing values are not allowed in 'poly'
The following function seems to be an okay workaround.
Poly <- function(x, degree = 1, coefs = NULL, raw = FALSE, ...) {
  notNA <- !is.na(x)
  answer <- poly(x[notNA], degree = degree, coefs = coefs, raw = raw, ...)
  THEMATRIX <- matrix(NA, nrow = length(x), ncol = degree)
  THEMATRIX[notNA, ] <- answer
  attributes(THEMATRIX)[c('degree', 'coefs', 'class')] <- attributes(answer)[c('degree', 'coefs', 'class')]
  THEMATRIX
}
lm( y ~ Poly(x,2) )
Call: lm(formula = y ~ Poly(x, 2))
Coefficients: (Intercept) Poly(x, 2)1 Poly(x, 2)2 209.1475.0114.0
and it works when x and y are in a dataframe too:
DAT <- data.frame(x=x, y=y)
lm(y~Poly(x,2), data=DAT)
Call: lm(formula = y ~ Poly(x, 2), data = DAT)
Coefficients: (Intercept) Poly(x, 2)1 Poly(x, 2)2 -119.54 -276.11 -68.24
Is there a better way to do this? My workaround seems a bit awkward. Whoever wrote poly must have had a good reason for not making it deal with missing values? Thanks for any thoughts Jacob Wegelin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D.
Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] rpart question
I make a classification tree like this:
p.t2.90 <- rpart(y ~ aa_three + bas + bcu + aa_ss, data = training,
                 method = "class", control = rpart.control(cp = 0.0001))
Here I want to set weights for the 4 predictors (aa_three, bas, bcu, aa_ss). I know that there is a weights set-up in rpart. Can this set-up satisfy my need? If so, could someone give me an example? Thanks, Aimin Yan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
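A note in partial answer: rpart's `weights` argument takes case weights (one per observation), not per-variable weights. Per-predictor weighting is closer to rpart's `cost` argument, which scales each variable's split improvement. A sketch on rpart's built-in kyphosis data, since the original `training` data are not available (your own formula and data would be substituted):

```r
library(rpart)

# cost: one non-negative value per predictor, in formula order. A split's
# improvement is divided by the variable's cost, so a higher cost makes
# that variable less likely to be chosen for splitting.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class", control = rpart.control(cp = 0.0001),
             cost = c(1, 4, 1))   # splits on Number are 4x more "expensive"
fit$frame$var                     # variables actually used at each node
```

Comparing fit$frame$var with and without the `cost` vector shows how the penalized variable is used less often.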