Re: [R] error code 5 from Lapack routine 'dsyevr'
On Fri, 16 Mar 2007, Richard D. Morey wrote:

> While using the rmvnorm function, I get the error:
>
>   Error in eigen(sigma, sym = TRUE) :
>     error code 5 from Lapack routine 'dsyevr'
>
> The same thing happens when I try the eigen() function on my covariance
> matrix. The matrix is a symmetric 111x111 matrix; well, it is almost
> symmetric, as there are slight deviations from symmetry (the largest is
> 3e-18). I have this in an MCMC loop, and it happens about once in every
> 40 iterations or so. What does this error mean?

See the LAPACK code (in src/modules/lapack). Internal LAPACK errors are
usually problems with arithmetic accuracy, and as such are compiler- and
CPU-specific.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
Tel: +44 1865 272861 (self), +44 1865 272866 (PA); Fax: +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] ARIMA standard error
On Fri, 16 Mar 2007, Gad Abraham wrote:

> Andrew Robinson wrote:
>> Hi Gad, try:
>>   class(a)                 # [1] "Arima"
>>   getAnywhere(print.Arima)
>
> Thanks Andrew. For the record, the standard error is the square root of
> the diagonal of the covariance matrix a$var.coef (itself obtained
> through some magic):
>
>   ses[x$mask] <- round(sqrt(diag(x$var.coef)), digits = digits)

And for the record, ?arima does tell you about the component var.coef, and
also suggests the vcov() method to extract the variance matrix.

--
Brian D. Ripley, Professor of Applied Statistics, University of Oxford
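A minimal sketch of the vcov() route mentioned above, run on the built-in
lh series (the AR(1) model here is only an illustration, not the original
poster's model):

```r
## Fit a small ARIMA model and compare the two ways of getting the
## standard errors of the coefficients.
a <- arima(lh, order = c(1, 0, 0))
se1 <- sqrt(diag(vcov(a)))      # via the vcov() method
se2 <- sqrt(diag(a$var.coef))   # via the var.coef component directly
all.equal(se1, se2)             # TRUE: both give the same standard errors
```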
[R] Defining a floor value in a data.frame on a row-by-row basis
Hi, I have a data frame x, and a vector y which limits x by rows so that no
element of x in that row is smaller than y. For example:

  x <- as.data.frame(t(array(1:9, dim = c(3, 3))))
  y <- c(2, 8, 0)
  x
  #   V1 V2 V3
  # 1  1  2  3
  # 2  4  5  6
  # 3  7  8  9

I want to get this:

  #   V1 V2 V3
  # 1  2  2  3
  # 2  8  8  8
  # 3  7  8  9

I tried:

  x[x < y] <- y

but R wasn't happy. I know I can do this with a loop, but there has to be a
way to avoid it... Thanks in advance.

- Ken
Re: [R] R and clinical studies
Thanks for your answer, which was very helpful. I have another question: I
have read in this document (http://cran.r-project.org/doc/manuals/R-intro.pdf)
that most of the programs written in R are ephemeral and that new releases
are not always compatible with previous releases. What I would like to know
is whether R functions are already validated and, if not, what we should do
to validate an R function.

-- Delphine Fontaine

Quoting Soukup, Mat [EMAIL PROTECTED]:

> Delphine, please see the following message posted a week ago:
> http://comments.gmane.org/gmane.comp.lang.r.general/80175
> HTH, -Mat
>
> -----Original Message-----
> From: Delphine Fontaine
> Sent: Friday, March 09, 2007 8:29 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] R and clinical studies
>
> Does anyone know if the FDA would accept statistical analyses for
> clinical studies performed with R?
[R] help on sigmoid curve fitting
Hi list, I was wondering how I should go about fitting a sigmoid curve to a
dataset; more specifically, how I estimate parameters a and b in the
following equation:

  1 / (1 + exp(-(x - a) * b))

with b the steepness of the sigmoid curve and a the shift of the center of
the sigmoid curve relative to the center of the data. The fit is as a
function of x, the location within the input vector, and y, an ecological
value. So I would like to estimate parameters a and b, plus a goodness of
fit / F-value of some kind. Any suggestions?

Kind regards, Koen
[R] How can i do the same thing in the China map?
The maps package has a function called match.map, which is for map
coloring. Its example follows:

  # filled map showing Republican vote in 1900
  # (figure 6 in the reference)
  data(state, package = "datasets")
  data(votes.repub)
  state.to.map <- match.map("state", state.name)
  x <- votes.repub[state.to.map, "1900"]
  gray.colors <- function(n) gray(rev(0:(n - 1))/n)
  color <- gray.colors(100)[floor(x)]
  map("state", fill = TRUE, col = color); map("state", add = TRUE)

I want to do the same thing with a map of China, but I can't find the
province names of China. Who can help me?

a rookie
Re: [R] expm() within the Matrix package
>>>>> "Gad" == Gad Abraham [EMAIL PROTECTED]
>>>>>     on Fri, 16 Mar 2007 09:48:52 +1100 writes:

  Gad> Laura Hill wrote:
  >> Hi, could anybody give me a bit of advice on some code I'm having
  >> trouble with? I've been trying to calculate the log-likelihood of a
  >> function iterated over a vector of time values, and I seem to be
  >> experiencing difficulty when I use the function expm(). Here's an
  >> example of what I am trying to do:
  >>
  >>   y <- c(5, 10)                    # vector of 2 survival times
  >>   p <- Matrix(c(1, 0), 1, 2)       # 1x2 matrix
  >>   Q <- Matrix(c(1, 2, 3, 4), 2, 2) # 2x2 matrix
  >>   q <- Matrix(c(1, 2), 2, 1)       # 2x1 matrix
  >>   Loglik <- rep(0, length(y))  # empty vector of same length as y
  >>   for (i in 1:length(y)) {
  >>     # calculating log-likelihood for each y[i]
  >>     Loglik[i] <- log((p %*% (expm(Q * y[i]))) %*% q)
  >>   }
  >>
  >> The code works perfectly if I use exp(Q*y[i]) but not for expm().

  Gad> You need to do a type conversion here, because you get the loglik
  Gad> as a 1x1 Matrix, instead of a scalar (after running your code):
  Gad>   log(p %*% expm(Q * y[i]) %*% q)
  Gad>   1 x 1 Matrix of class "dgeMatrix"
  Gad>            [,1]
  Gad>   [1,] 134.5565
  Gad> If you convert to numeric, you can then assign it to Loglik:
  Gad>   Loglik[1] <- as.numeric(log(p %*% expm(Q * y[i]) %*% q))
  Gad>   Loglik[1]
  Gad>   [1] 134.5565

Hmm, I don't think that's Laura's problem (and actually I don't know what
her problem is): assignment of a 1 x 1 matrix to a vector is not a problem,
and hence the as.numeric(.) above really has no effect:

  ll <- 1:2
  (m <- matrix(pi, 1, 1))
  #          [,1]
  # [1,] 3.141593
  ll[1] <- m
  ll
  # [1] 3.141593 2.000000

Martin Maechler, ETH Zurich
Re: [R] expm() within the Matrix package
Laura Hill wrote:
> [the same expm() example quoted in the previous message]

You need to do a type conversion here, because you get the loglik as a 1x1
Matrix, instead of a scalar (after running your code):

  log(p %*% expm(Q * y[i]) %*% q)
  # 1 x 1 Matrix of class "dgeMatrix"
  #          [,1]
  # [1,] 134.5565

If you convert to numeric, you can then assign it to Loglik:

  Loglik[1] <- as.numeric(log(p %*% expm(Q * y[i]) %*% q))
  Loglik[1]
  # [1] 134.5565

--
Gad Abraham
Department of Mathematics and Statistics
The University of Melbourne
Parkville 3010, Victoria, Australia
email: [EMAIL PROTECTED]
web: http://www.ms.unimelb.edu.au/~gabraham
Re: [R] rattle- MSACCESS database problem
Received Thu 08 Mar 2007 2:38am +1100 from j.joshua thomas:

> library(RGtk2)
> library(rattle)
> rattle()
>
> I click the ODBC option and it asks for the DSN; I am a bit confused
> with this. I already put my *.mdb file on the C: drive. I tried putting
> the DSN name as "Microsoft Access Driver" in the appropriate text box,
> but I couldn't locate the table. I tried the other way round: Open,
> then locate the *.mdb on the C: drive; couldn't locate it. I tried
> RODBC as well, but I want to use rattle to data-mine my database. Need
> someone's help.

Did it work with RODBC? Could you send what appears in the Log tab? That
will give me an idea of what is going on.

Regards, Graham
[R] Problem installing R onto Solaris 2.10 system - need advice!!!!!
Dear R-Help friends, I am unable to get the latest version of R (2.4.1) to
compile on my Solaris 10 system. Has anybody else experienced this problem,
and are you able to offer me any advice?

I appreciate your time, many thanks, Jenny Barnes

Here are my current specifications:

  platform        sparc-sun-solaris2.10
  arch            sparc
  os              solaris2.10
  system          sparc, solaris2.10
  status
  major           2
  minor           3.1
  year            2006
  month           06
  day             01
  svn rev         38247
  language        R
  version.string  Version 2.3.1 (2006-06-01)

~~
Jennifer Barnes
PhD student: long range drought prediction
Climate Extremes Group
Department of Space and Climate Physics
University College London
Holmbury St Mary, Dorking, Surrey, RH5 6NT
Tel: 01483 204149  Mob: 07916 139187
Web: http://climate.mssl.ucl.ac.uk
Re: [R] Defining a floor value in a data.frame on a row-by-row basis
* Feng, Ken [CIB-EQTY] [EMAIL PROTECTED] [070316 09:40]:

> Hi, I have a data frame x, and a vector y which limits x by rows so
> that no element of x in that row is smaller than y.
> [example as in the original post]
> I tried x[x < y] <- y but R wasn't happy. I know I can do this with a
> loop, but there has to be a way to avoid it... Thanks in advance.
> - Ken

sapply(x, function(xc) ifelse(xc < y, y, xc))

Uwe Ligges
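A vectorized alternative (a sketch, not from the thread): pmax() recycles
the vector y down the rows of the matrix, which is exactly the row-by-row
floor being asked for:

```r
x <- as.data.frame(t(array(1:9, dim = c(3, 3))))
y <- c(2, 8, 0)
x[] <- pmax(as.matrix(x), y)   # y[i] is recycled across row i
x
#   V1 V2 V3
# 1  2  2  3
# 2  8  8  8
# 3  7  8  9
```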
Re: [R] Complex superscript plot text
* Bob Farmer [EMAIL PROTECTED] [070316 05:57]:

> Hi all: I would like to create a line of plot margin text (using
> mtext()) that features both a superscript and a subset of an object.
> However, I cannot seem to do both things at once, nor have I found a
> way to paste the two results together. (I pull the object subset
> because this is part of a for loop to make multiple graphs.) This
> example doesn't give me what I want from mtext():
>
>   GoodnessOfFits <- c(1, 0.75)
>   graphNumber <- 1   # first iteration of the loop (not shown)
>   x <- seq(-10, 10, 1)
>   y <- x^2
>   plot(x, y)
>   lines(x, predict(lm(y ~ I(x^2))))
>   mtext(text = expression(R^2 * ": " * GoodnessOfFits[graphNumber]))

plot(x, y)
lines(x, predict(lm(y ~ I(x^2))))
mtext(text = substitute(R^2 == GF,
                        list(GF = GoodnessOfFits[graphNumber])), line = 1)

Uwe Ligges

> I recognize that in this example, I could extract the R-squared value
> from the model in each loop; however, this does not apply to my more
> complicated real scenario. Any suggestions? Thanks. --Bob Farmer
Re: [R] Complex superscript plot text
try:

  mtext(substitute(R^2 * ": " * GoodnessOfFits[i], list(i = graphNumber)))

HTH.

On 3/16/07, Bob Farmer [EMAIL PROTECTED] wrote:

> Hi all: I would like to create a line of plot margin text (using
> mtext()) that features both a superscript and a subset of an object.
> [example and details quoted as in the previous message]
> Any suggestions? Thanks. --Bob Farmer
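Another sketch (not from the thread) uses bquote(), which splices the
current value into the plotmath expression, so the number itself appears
in the margin rather than the subscripted object name:

```r
GoodnessOfFits <- c(1, 0.75)
graphNumber <- 2
plot(1:10)
# .(...) inserts the value of GoodnessOfFits[graphNumber] into the label
mtext(bquote(R^2 * ": " * .(GoodnessOfFits[graphNumber])))
```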
Re: [R] help on sigmoid curve fitting
you can use nls: see ?nls and its example. HTH.

On 3/16/07, Hufkens Koen [EMAIL PROTECTED] wrote:

> Hi list, I was wondering how I should go about fitting a sigmoid curve
> to a dataset; more specifically, how I estimate parameters a and b in
> the equation 1 / (1 + exp(-(x - a) * b)), with b the steepness of the
> sigmoid curve and a the shift of its center relative to the center of
> the data. So I would like to estimate parameters a and b, plus a
> goodness of fit / F-value of some kind. Any suggestions?
> Kind regards, Koen
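A sketch of the nls() route on simulated data (the true values 50 and 0.1
and the starting values below are purely illustrative, not from the
poster's dataset):

```r
set.seed(1)
x <- 1:100
y <- 1 / (1 + exp(-(x - 50) * 0.1)) + rnorm(100, sd = 0.05)

# Fit the poster's equation; start values need only be roughly right
fit <- nls(y ~ 1 / (1 + exp(-(x - a) * b)),
           start = list(a = 40, b = 0.05))
coef(fit)       # estimates of a and b
summary(fit)    # standard errors and t-values for each parameter
deviance(fit)   # residual sum of squares, usable for goodness of fit
```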
Re: [R] How can i do the same thing in the China map?
usstata wrote:
> The maps package has a function called match.map, which is for map
> coloring. [the votes.repub example quoted as in the original post]
> I want to do the same thing with a map of China, but I can't find the
> province names of China. Who can help me?

nobody

> a rookie

Tell us who you are, and you may get a more substantial reply.

Ray Brownrigg
[R] gls bug?
Dear R-help, I found that the following code crashes R (version 2.4.0 on
Windows):

  x = rnorm(10, 0.1, 1)
  library(nlme)
  gls(x ~ 0)

I quickly found a work-around for what I was trying to do, but I thought I
should report this. Best wishes, Simon Bond
[R] Problem to add legend to panel.histogram function
Hi R users, the following script works well, but I have a problem adding a
legend to the graphs. For example, I would like to paste one item at the
top right of each graph:

  brown color for the Gumbel density
  green color for the Weibull density

Thank you for your help. (See attached file: compar_densités.txt)

Lassana KOITA
Chargé d'Etudes de Sécurité Aéroportuaire et d'Analyse Statistique /
Project Engineer, Airport Safety Studies & Statistical Analysis
Service Technique de l'Aviation Civile (STAC) / Civil Aviation Technical
Department
Direction Générale de l'Aviation Civile (DGAC) / French Civil Aviation
Headquarters
Tel: 01 49 56 80 60  Fax: 01 49 56 82 14
E-mail: [EMAIL PROTECTED]
http://www.stac.aviation-civile.gouv.fr/
Re: [R] How to use result of approxfun in a package?
* Steve Buyske [EMAIL PROTECTED] [070315 19:05]:

> I am working on a project where we start with 2 long, equal-length
> vectors, massage them in various ways, and end up with a function
> mapping one interval to another; I'll call that function f1. The last
> step in R is to generate f1 as the value of the approxfun function. I
> would like to put f1 into a package, but without having the package
> redo the creation of f1. The value of approxfun is a function; for
> example, I have
>
>   f1
>   function (v)
>   .C("R_approx", as.double(x), as.double(y), as.integer(n),
>      xout = as.double(v), as.integer(length(v)), as.integer(method),
>      as.double(yleft), as.double(yright), as.double(f),
>      NAOK = TRUE, PACKAGE = "base")$xout
>   <environment: 0x17719324>
>
> I don't really understand how this is stored, and in particular, how I
> should handle it so as to include the function f1 in a package. I would
> like the users to be able to load the package and use f1 directly,
> rather than re-generate it using approxfun.

approxfun() makes use of R's lexical scoping: if a function (approxfun)
returns another function (f1), the returned function is stored together
with the environment created at runtime of the first. f1 requires not only
v (its formal argument), but also the objects x, y, n, method, yleft,
yright, and f; all of these live in the environment that was created when
approxfun() ran, referenced as <environment: 0x17719324> in your case. You
can save() the object f1 together with that environment, as in

  save(f1, file = "f1.Rdata")

and reload it later with

  load("f1.Rdata")

In a package, this means you can either save it as data, or put the call
to approxfun() in an R file so that it is executed either when the
installed package is loaded or during installation (if a saved image of
the package is used), depending on the setting in your DESCRIPTION file.
Uwe Ligges

> Thanks, Steve
> ---
> Steve Buyske (rhymes with "nice key")
> Associate Research Professor, Department of Statistics & Biostatistics
> Rutgers University, Hill Center, 110 Frelinghuysen Rd
> Piscataway, NJ 08854
> [EMAIL PROTECTED]
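A small self-contained sketch of the save()/load() route described above
(the file name and data are illustrative):

```r
x <- 1:10
y <- sqrt(x)
f1 <- approxfun(x, y)          # f1 carries x, y etc. in its environment

path <- file.path(tempdir(), "f1.Rdata")
save(f1, file = path)          # the enclosing environment is serialized too
rm(f1, x, y)

load(path)                     # f1 works without recreating x and y
f1(2.5)                        # interpolates between sqrt(2) and sqrt(3)
```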
Re: [R] Problem installing R onto Solaris 2.10 system - need advice!!!!!
Dear Andrew and R-help, here is the error message that we got when trying
to install R v2.4.0 (which we tried to install before this newer version
2.4.1; I'm afraid I didn't save the error message from the latest attempt
with the new version):

  configure
  make
  ...
  f90: CODE: 0 WORDS, DATA: 0 WORDS
  gcc -G -L/usr/local/lib -o stats.so init.o kmeans.o ansari.o
    bandwidths.o chisqsim.o d2x2xk.o fexact.o kendall.o ks.o line.o
    smooth.o prho.o swilk.o ksmooth.o loessc.o isoreg.o Srunmed.o
    Trunmed.o dblcen.o distance.o hclust-utils.o nls.o HoltWinters.o
    PPsum.o arima.o burg.o filter.o mAR.o pacf.o starma.o port.o
    family.o sbart.o bsplvd.o bvalue.o bvalus.o loessf.o ppr.o qsbart.o
    sgram.o sinerp.o sslvrg.o stxwx.o hclust.o kmns.o eureka.o stl.o
    portsrc.o -L../../../../lib -lRblas -lg2c -lm -lgcc_s
  mkdir ../../../../library/stats/libs
  building package 'datasets'
  mkdir ../../../library/datasets
  mkdir ../../../library/datasets/R
  mkdir ../../../library/datasets/data
  Error in dyn.load(x, as.logical(local), as.logical(now)) :
    unable to load shared library
    '/tmp/R-2.4.0/library/stats/libs/stats.so':
    ld.so.1: R: fatal: relocation error: file
    /tmp/R-2.4.0/library/stats/libs/stats.so: symbol __i_abs:
    referenced symbol not found
  Execution halted
  *** Error code 1

Many thanks, Jenny

> Hi Jenny,
> advice: try posting the error message accompanying the failure to
> compile.
> Cheers, Andrew
>
> On Fri, Mar 16, 2007 at 09:19:24AM +0000, Jenny Barnes wrote:
>> Dear R-Help friends, I am unable to get the latest version of R
>> (2.4.1) to compile on my Solaris 10 system; has anybody else
>> experienced this problem, and are you able to offer me any advice?
>> I appreciate your time, many thanks, Jenny Barnes
>> [R version 2.3.1 specifications as given in the original post]
>
> --
> Andrew Robinson
> Department of Mathematics and Statistics    Tel: +61-3-8344-9763
> University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
> http://www.ms.unimelb.edu.au/~andrewpr
> http://blogs.mbs.edu/fishing-in-the-bay/

~~
Jennifer Barnes
PhD student: long range drought prediction
Climate Extremes Group
Department of Space and Climate Physics
University College London
Holmbury St Mary, Dorking, Surrey, RH5 6NT
Tel: 01483 204149  Mob: 07916 139187
Web: http://climate.mssl.ucl.ac.uk
[R] corAR1 in a random effects panel
Hi everyone, I am interested in estimating this type of random-effects
panel:

  y_it = x'_it * beta + u_it + e_it
  u_it = rho * u_{i,t-1} + d_it,    rho in (-1, 1)

where u and e are independent and normally distributed with mean zero, and
d is also independently normally distributed with mean zero. So, I want the
random effects for group i to be correlated in t, following an AR(1)
process. I am using the lme command, including correlation = corAR1:

  lme(asis ~ prec + pobl + gola + entr, random = ~1 | codi,
      correlation = corAR1(0.8, form = ~temp | codi))

where i = codi and t = temp. I am not sure whether the AR(1) process is
applied to the random effects (u_it) or the error term (e_it)... Any idea?
Thanks. G

--
Guillermo Villa
Universidad Carlos III de Madrid, Business Economics Department
Office 6.0.26, Madrid, 126, 28903 Getafe (Madrid), SPAIN
Email: [EMAIL PROTECTED]  Phone: (+34) 916249772  Mobile: (+34) 655112743
Fax: (+34) 916249607  Skype: guillermo.villa
Website: www.guillermovilla.com
[R] Bad points in regression
I have a question; maybe it's better to explain by example:

  alpha <- 0.3
  beta <- 0.4
  sigma <- 0.5
  err <- rnorm(100)
  err[15] <- 5; err[25] <- -4; err[50] <- 10
  x <- 1:100
  y <- alpha + beta * x + sigma * err
  ll <- lm(y ~ x)
  plot(ll)

Now, the graphs clearly show that 15, 25 and 50 are the indexes of the bad
points. How can I retrieve this information from ll?

Alberto Monteiro
[R] Segmentation fault in estimating structural equation models with the SEM package.
Dear R-users, I am running a large number of simulations and estimating a
structural equation model for each one using the sem package. Each run of
my program has around 8000 simulations. Most of the time the program
completes all of them correctly, but sometimes I get a segmentation fault
in the sem routine and my program stops with the following error message:

  *** caught segfault ***
  address (nil), cause 'unknown'

  Traceback:
  1: nlm(if (analytic.gradient) objective.2 else objective.1, start,
       hessian = TRUE, iterlim = maxiter,
       print.level = if (debug) 2 else 0, typsize = typsize, ...)
  2: sem.default(ram = ram, S = S, N = N, param.names = pars,
       var.names = vars, fixed.x = fixed.x, debug = debug, ...)
  3: sem(ram = ram, S = S, N = N, param.names = pars, var.names = vars,
       fixed.x = fixed.x, debug = debug, ...)
  4: sem.mod(modelspec, cor.or.cov.mat, N, obs.variables = comb.set,
       warn = FALSE)
  5: sem(modelspec, cor.or.cov.mat, N, obs.variables = comb.set,
       warn = FALSE)
  6: try(sem(modelspec, cor.or.cov.mat, N, obs.variables = comb.set,
       warn = FALSE), silent = TRUE)
  7: eval.with.vis(expr, envir, enclos)
  8: eval.with.vis(ei, envir)
  9: source("analysis.base.R")
  aborting ...

Does anybody have an idea about how I could prevent this from happening, or
catch it so that my program could continue running? Could I, in some way,
manage the memory myself to prevent this? The model is correctly specified
and clearly identified, but the covariance matrix changes between
simulations (it is a two-latent-variable measurement model with more than 4
observed variables on each LV).

Many thanks, regards, Snaebjorn

P.S. Using R 2.4.1 (2006-06-01) on x86_64 GNU/Linux, with version 0.9-6
(2006/11/22) of the sem package.
---
Snaebjorn Gunnsteinsson
JiVitA Science Fellow
The JiVitA Project (NIPHP-USAID)
Rangpur, Bangladesh
www.jivita.org
Mobile: +880-171-320-2560
[R] Can I scale the labels in a 'persp' graph?
Hi all: I'm using 'persp' for 3D graphics. I need the axis labels to be
smaller than the default. I see in help() the information about par(). I
have written:

  par(..., cex.axis = 0.5, cex.lab = 0.5)
  persp(...)

and the result doesn't change. The question is: can I change the size of
the labels in the persp graph?

Thank you in advance, /salva

From ?par:

  'cex.axis' The magnification to be used for axis annotation relative
      to the current setting of 'cex'. (Some functions such as 'points'
      accept a vector of values which are recycled. Other uses will take
      just the first value if a vector of length greater than one is
      supplied.)

  'cex.lab' The magnification to be used for x and y labels relative to
      the current setting of 'cex'.
Re: [R] Bad points in regression
On 16-Mar-07 11:56:56, Alberto Monteiro wrote:

> [the lm() example as posted, ending with:]
> Now, the graphs clearly show that 15, 25 and 50 are the indexes of the
> bad points. How can I retrieve this information from ll?
> Alberto Monteiro

ll is the output of a linear model fitted by lm(), and so has several
components (see ?lm in the section "Value"), one of which is "residuals"
(which can be abbreviated to "res"). So, in the case of your example,

  which(abs(ll$res) > 2)
  # 15 25 50

extracts the information you want (and the "2" was inspired by looking at
the residuals plot from your plot(ll)).

Best wishes, Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 16-Mar-07  Time: 12:19:02
Re: [R] Bad points in regression
Ted Harding wrote:
> [the lm() example quoted, followed by:]
> ll is the output of a linear model fitted by lm(), and so has several
> components, one of which is "residuals". So which(abs(ll$res) > 2)
> extracts the information you want (and the "2" was inspired by looking
> at the residuals plot from your plot(ll)).

Ok, but how can I grab those points _in general_? What is the criterion
that plot used to mark those points as bad points?

  names(ll)
  #  [1] "coefficients"  "residuals"     "effects"       "rank"
  #  [5] "fitted.values" "assign"        "qr"            "df.residual"
  #  [9] "xlevels"       "call"          "terms"         "model"

None of them include information about those bad points.

Alberto Monteiro
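As far as I know, plot.lm() applies no formal outlier test: it simply
labels the id.n points (3 by default) with the most extreme residuals. So
the same indices can be recovered from the residuals themselves; a sketch
(a seed is added here for reproducibility, the original post had none):

```r
set.seed(1)                    # not in the original post
alpha <- 0.3; beta <- 0.4; sigma <- 0.5
err <- rnorm(100)
err[15] <- 5; err[25] <- -4; err[50] <- 10
x <- 1:100
y <- alpha + beta * x + sigma * err
ll <- lm(y ~ x)

# indices of the 3 largest absolute residuals, as labelled by plot(ll)
sort(order(abs(residuals(ll)), decreasing = TRUE)[1:3])
# [1] 15 25 50
```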
Re: [R] Problem installing R onto Solaris 2.10 system - need advice!!!!!
It seems you are using f90, but FLIBS was computed using g77. That would mean that something has been altered since R was configured, or that the way the Fortran compiler was specified was incorrect. I think you need to start again and make use of only one set of compilers.

On Fri, 16 Mar 2007, Jenny Barnes wrote:
> Dear Andrew and R-help,
> Here is the error message that we got when trying to install R v2.4
> (which we tried to install before this newer version 2.4.1 - I'm afraid
> I didn't save the error message from the latest attempt with the new
> version):
>
>   configure
>   make
>   ...
>   f90: CODE: 0 WORDS, DATA: 0 WORDS
>   gcc -G -L/usr/local/lib -o stats.so init.o kmeans.o ansari.o
>     bandwidths.o chisqsim.o d2x2xk.o fexact.o kendall.o ks.o line.o
>     smooth.o prho.o swilk.o ksmooth.o loessc.o isoreg.o Srunmed.o
>     Trunmed.o dblcen.o distance.o hclust-utils.o nls.o HoltWinters.o
>     PPsum.o arima.o burg.o filter.o mAR.o pacf.o starma.o port.o
>     family.o sbart.o bsplvd.o bvalue.o bvalus.o loessf.o ppr.o qsbart.o
>     sgram.o sinerp.o sslvrg.o stxwx.o hclust.o kmns.o eureka.o stl.o
>     portsrc.o -L../../../../lib -lRblas -lg2c -lm -lgcc_s
>   mkdir ../../../../library/stats/libs
>   building package 'datasets'
>   mkdir ../../../library/datasets
>   mkdir ../../../library/datasets/R
>   mkdir ../../../library/datasets/data
>   Error in dyn.load(x, as.logical(local), as.logical(now)) :
>     unable to load shared library
>     '/tmp/R-2.4.0/library/stats/libs/stats.so':
>     ld.so.1: R: fatal: relocation error: file
>     /tmp/R-2.4.0/library/stats/libs/stats.so: symbol __i_abs:
>     referenced symbol not found
>   Execution halted
>   *** Error code 1
>
> Many thanks, Jenny

Hi Jenny, advice: try posting the error message accompanying the failure to compile. Cheers, Andrew

On Fri, Mar 16, 2007 at 09:19:24AM +, Jenny Barnes wrote:
> Dear R-Help friends, I am unable to get the latest version of R (2.4.1)
> to compile on my Solaris 10 system - has anybody else experienced this
> problem and are you able to offer me any advice?
I appreciate your time, many thanks, Jenny Barnes

Here are my CURRENT specifications:

  platform        sparc-sun-solaris2.10
  arch            sparc
  os              solaris2.10
  system          sparc, solaris2.10
  status
  major           2
  minor           3.1
  year            2006
  month           06
  day             01
  svn rev         38247
  language        R
  version.string  Version 2.3.1 (2006-06-01)

~~
Jennifer Barnes
PhD student: long range drought prediction
Climate Extremes Group
Department of Space and Climate Physics
University College London
Holmbury St Mary, Dorking, Surrey, RH5 6NT
Tel: 01483 204149  Mob: 07916 139187
Web: http://climate.mssl.ucl.ac.uk

--
Andrew Robinson
Department of Mathematics and Statistics     Tel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia  Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
Tel: +44 1865 272861 (self)  +44 1865 272866 (PA)  Fax: +44 1865 272595
Re: [R] Can I scale the labels in a 'persp' graph?
On 3/16/2007 8:02 AM, [EMAIL PROTECTED] wrote:
> Hi all: I'm using 'persp' for 3D graphics. I need the axis labels
> smaller than the default. I see in help() the information about par().
> I have written:
>
>   par(..., cex.axis=0.5, cex.lab=0.5)
>   persp(...)
>
> and the result doesn't change. The question is: can I change the size
> of labels in the persp graph? Thank you in advance: /salva

'cex.axis' The magnification to be used for axis annotation relative to the current setting of 'cex'. (Some functions such as 'points' accept a vector of values which are recycled. Other uses will take just the first value if a vector of length greater than one is supplied.)

'cex.lab' The magnification to be used for x and y labels relative to the current setting of 'cex'.

Those don't appear to be supported by persp, but cex is, e.g.

  x <- 1:10
  y <- 1:10
  z <- outer(x, y, function(x, y) sin((x + y) / 10))
  persp(x, y, z, cex = 0.5)

Duncan Murdoch
Re: [R] Bad points in regression
On Fri, 16 Mar 2007, Alberto Monteiro wrote:
> Ted Harding wrote:
> > [Alberto's example with err[15] <- 5; err[25] <- -4; err[50] <- 10,
> > and the suggestion which(abs(ll$res) > 2)]
>
> Ok, but how can I grab those points _in general_? What is the criterion
> that plot used to mark those points as bad points? names(ll) gives:
>
>   [1] "coefficients"  "residuals"     "effects"       "rank"
>   [5] "fitted.values" "assign"        "qr"            "df.residual"
>   [9] "xlevels"       "call"          "terms"         "model"
>
> None of them includes information about those bad points.

But it is the plot method that you are using, not the object ll. If you examine stats::plot.lm you will see what it does: label the points with the 'id.n' largest (in absolute value) residuals (standardized residuals for types 2 and 3). And ?plot.lm also tells you that.

BTW, 'bad points' seems your own description: it does not appear in the R documentation.

--
Brian D. Ripley
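The labelling rule described above can be checked programmatically. The sketch below is a reconstruction, not the plot.lm source itself, and the seed is added here only so the example is reproducible: it recovers the id.n observations with the largest absolute standardized residuals via rstandard().

```r
# Reconstruct Alberto's example (seed added here for reproducibility)
set.seed(1)
err <- rnorm(100)
err[15] <- 5; err[25] <- -4; err[50] <- 10
x <- 1:100
y <- 0.3 + 0.4 * x + 0.5 * err
ll <- lm(y ~ x)

# The id.n = 3 points that plot.lm labels on the standardized-residual
# plots are those with the largest |rstandard(ll)|
id.n <- 3
extreme <- sort(head(order(abs(rstandard(ll)), decreasing = TRUE), id.n))
extreme
```

With the three injected outliers this should recover observations 15, 25 and 50 without reading them off the plot.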
Re: [R] Bad points in regression
Hi,

you can check ?influence or ?influence.measures to evaluate some regression diagnostics.

Regards, Petr

On 16 Mar 2007 at 9:56, Alberto Monteiro wrote:
> I have a question, maybe it's better to explain by example:
>
>   alpha <- 0.3
>   beta <- 0.4
>   sigma <- 0.5
>   err <- rnorm(100)
>   err[15] <- 5; err[25] <- -4; err[50] <- 10
>   x <- 1:100
>   y <- alpha + beta * x + sigma * err
>   ll <- lm(y ~ x)
>   plot(ll)
>
> Now, the graphs clearly show that 15, 25 and 50 are the indexes of
> the bad points. How can I retrieve this information from ll?
>
> Alberto Monteiro

Petr Pikal [EMAIL PROTECTED]
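As a sketch of how those diagnostics apply to the example in this thread (the seed and the 4/n cutoff for Cook's distance are choices made here for illustration; the cutoff is one common rule of thumb, not part of the original posts):

```r
# Alberto's example again (seed added here for reproducibility)
set.seed(1)
err <- rnorm(100)
err[15] <- 5; err[25] <- -4; err[50] <- 10
x <- 1:100
y <- 0.3 + 0.4 * x + 0.5 * err
ll <- lm(y ~ x)

# Cook's distance is among the diagnostics reported by influence.measures();
# flag observations above the rule-of-thumb cutoff 4/n
cd <- cooks.distance(ll)
flagged <- which(cd > 4 / length(cd))
flagged
```

The three injected outliers should appear in `flagged` (possibly along with a few borderline points, since the cutoff is heuristic).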
Re: [R] R and clinical studies
Delphine Fontaine wrote:
> Thanks for your answer which was very helpful. I have another question:
> I have read in this document
> (http://cran.r-project.org/doc/manuals/R-intro.pdf) that most of the
> programs written in R are ephemeral and that new releases are not
> always compatible with previous releases. What I would like to know is
> if R functions are already validated and if not, what should we do to
> validate an R function?

In the sense in which most people use the term 'validate', it means to show with one or more datasets that the function is capable of producing the right answer. It doesn't mean that it produces the right answer for every dataset, although we hope it does. [As an aside, most errors are in the data manipulation phase, not in the analysis phase.] So I think that instead of validating functions we should spend more effort on validating analyses [and validating analysis file derivation]. Pivotal analyses can be re-done a variety of ways, in R or in separate programmable packages such as Stata.

--
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University
[R] Implementing trees in R
Hi all,

I am rather new to R. Recently I have been trying to implement some tree algorithms in R. I used lists to model tree nodes. I thought something like this would work:

  parent <- list()
  child <- list()
  parent$child1 <- child
  child$parent <- parent

When I tried to check whether a node is its parent's first child using if (node$parent$child1 == node), it always returned FALSE. Then I realized that it does not really work, because parent$child1 <- child actually makes a copy of child instead of referencing it.

I think one possible fix is to keep a list of node objects, and make references using the positions in the list. For example, I think the following would work:

  parent <- list()
  child <- list()
  nodes <- list(parent, child)
  parent$child1 <- 2
  child$parent <- 1

Then the first-child test can be rewritten as if (nodes[[nodes[[nodeId]]$parent]]$child1 == nodeId). However, I would prefer not to implement trees in this way, as it requires inconvenient and error-prone manipulation of node IDs.

May I know if there is a way to make object references to lists? Or are there other ways to implement tree data structures in R? BTW, I checked how hclust was implemented, and noticed that it calls an external Fortran program. I would want a solution not involving any external programs. Thanks.

--
God bless.
Kevin
Re: [R] Bad points in regression
On 16-Mar-07 12:41:50, Alberto Monteiro wrote:
> Ted Harding wrote:
> > [Alberto's example, and the suggestion which(abs(ll$res) > 2)]
>
> Ok, but how can I grab those points _in general_? What is the criterion
> that plot used to mark those points as bad points?

Ahh...! I see what you're after. OK, look at the plot method for lm(): ?plot.lm

  ## S3 method for class 'lm':
  plot(x, which = 1:4,
       caption = c("Residuals vs Fitted", "Normal Q-Q plot",
                   "Scale-Location plot", "Cook's distance plot"),
       panel = points, sub.caption = deparse(x$call), main = "",
       ask = prod(par("mfcol")) < length(which) && dev.interactive(),
       ...,
       id.n = 3, labels.id = names(residuals(x)), cex.id = 0.75)

where (see further down):

  id.n: number of points to be labelled in each plot, starting with
        the most extreme.

and note, in the default parameter-values listing above: id.n = 3. Hence, the 3 most extreme points (according to the criterion being plotted in each plot) are marked in each plot. So, for instance, try plot(ll, id.n=5) and you will get points 10, 15, 25, 28, 50. And so on. But that pre-supposes that you know how many points are exceptional.

What is meant by "extreme" is not stated in the help page ?plot.lm, but can be identified by inspecting the code for plot.lm(), which you can see by entering plot.lm

In your example, if you omit the line which assigns anomalous values to err[15], err[25] and err[50], then you are likely to observe that different points get identified on different plots.

For instance, I just got the following results for the default id.n=3:

  [1] Residuals vs Fitted:       41, 53, 59
  [2] Standardised Residuals:    41, 53, 59
  [3] sqrt(Stand Res) vs Fitted: 41, 53, 59
  [4] Cook's Distance:           59, 96, 97

There are several approaches (with somewhat different outcomes) to identifying outliers. If you apply one of these, you will probably get the identities of the points anyway. Again in the context of your example (where in fact you deliberately set 3 points to have exceptional errors, thus coincidentally the same as the default value 3 of id.n), you could try different values for id.n and inspect the graphs to see whether a given value of id.n marks some points that do not look exceptional relative to the mass of the other points. So, the above plot(ll, id.n=5) gave me one point, 10, on the residuals plot, which apparently belonged to the general distribution of residuals.

Hoping this helps, Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 16-Mar-07  Time: 13:43:54
-- XFMail --
Re: [R] fitting of all possible models
Rense Nieuwenhuis wrote:
> Dear Lukas, although I'm intrigued by the purpose of what you are
> trying to do, as mentioned by some of the other persons on this list,
> I liked the challenge to write such a function. I came up with the
> following during some train-traveling this morning:
>
>   tum <- function(x) {
>     tum <- matrix(data = NA, nrow = 2^x, ncol = x)
>     for (i in 1:x) {
>       tum[, i] <- c(rep(NA, 2^i/2), rep(i + 1, 2^i/2))
>     }
>     return(tum)
>   }
>
>   ###
>
>   all.models <- function(model) {
>     npred <- length(model$coefficients) - 1
>     matr.model <- tum(npred)
>     output <- matrix(data = NA, nrow = 2^npred, ncol = 1)
>     for (t in 2:2^npred) {
>       preds <- names(model$coefficients)
>       interc <- names(model$coefficients)[1]
>       form <- as.formula(paste(". ~",
>         paste(preds[na.omit(matr.model[t, ])], collapse = "+")))
>       model2 <- update(model, form)
>       output[t, ] <- mean(resid(model2)^2)
>     }
>     return(output)
>   }
>
>   ##
>
> As you can see, I used a helper-function (tum, "the ultimate matrix")
> for the actual function. Also, I wrote it using lm instead of glm, but
> I suppose that you can easily alter that. As well, the function now
> returns just some regular fit-measurement. But that is not all that
> essential, I think. The main point is: it works! Using this on my G4
> Mac, with an lm of 10 predictors and 18 cases, it returns the output
> quite fast (1 minute). I hope you can put this to use. It needs some
> easy adapting to your specific needs, but I don't expect that to be a
> problem. If you need help with that, please contact me. I'd appreciate
> to hear from you, if this function is helpful in any way.
>
> sincerely, Rense Nieuwenhuis
>
> On Feb 27, 2007, at 8:46, Indermaur Lukas wrote:
> > Hi, fitting all possible models (GLM) with 10 predictors will result
> > in loads of (2^10 - 1) models. I want to do that in order to get the
> > importance of variables (having an unbalanced variable design) by
> > summing up the AIC-weights of models including the same variable,
> > for every variable separately. It's time consuming and annoying to
> > define all possible models by hand.
> > Is there a command, or easy solution to let R define the set of all
> > possible models itself? I defined models in the following way to
> > process them with a batch job:
> >
> >   # e.g. model 1
> >   preference <- formula(Y ~ Lwd + N + Sex + YY)
> >   # e.g. model 2
> >   preference_heterogeneity <- formula(Y ~ Ri + Lwd + N + Sex + YY)
> >
> > etc. etc. I appreciate any hint. Cheers, Lukas
> >
> > Lukas Indermaur, PhD student
> > eawag / Swiss Federal Institute of Aquatic Science and Technology
> > ECO - Department of Aquatic Ecology
> > Überlandstrasse 133, CH-8600 Dübendorf, Switzerland
> > Phone: +41 (0) 71 220 38 25  Fax: +41 (0) 44 823 53 15
> > Email: [EMAIL PROTECTED]
> > www.lukasindermaur.ch

Dear Lukas,

In order to define the set of all possible models I used the function combos() of the hier.part package. The function all.regs() of the same package fits all the possible models.

Good luck,

Amélie Vaniscotte (PhD)
Department of Environmental Biology, UsC INRA EA3184 MRT
Institute for Environmental Sciences and Technology
University of Franche-Comté
Place Leclerc, 25030 Besançon cedex, FRANCE
Tel.: +33 (0)381 665 714  Fax: +33 (0)381 665 797
[EMAIL PROTECTED]
http://lbe.univ-fcomte.fr/
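For the record, the full set of candidate formulas can also be generated directly in base R with combn(); a sketch, with illustrative predictor names x1..x3 standing in for the real ones:

```r
# All 2^k - 1 non-empty predictor subsets as model formulas.
# x1..x3 are illustrative names; substitute the real predictors.
preds <- c("x1", "x2", "x3")
all_formulas <- unlist(lapply(seq_along(preds), function(k)
  combn(preds, k, FUN = function(v)
    paste("y ~", paste(v, collapse = " + ")))))
length(all_formulas)  # 2^3 - 1 = 7
all_formulas
```

Each string can then be passed to as.formula() and on to lm()/glm(), and the AIC weights accumulated per model, without writing the formulas out by hand.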
Re: [R] Implementing trees in R
Lists are not good for this. There is an example in section 3.3 of the proto vignette of using proto objects for this. That section also references an S4 example, although it's pretty messy with S4. You might want to look at the graph, RBGL and graphviz packages in Bioconductor and the dynamicgraph, mathgraph and sna packages on CRAN.

On 3/16/07, Yuk Lap Yip (Kevin) [EMAIL PROTECTED] wrote:
> Hi all,
>
> I am rather new to R. Recently I have been trying to implement some
> tree algorithms in R. I used lists to model tree nodes. I thought
> something like this would work:
>
>   parent <- list()
>   child <- list()
>   parent$child1 <- child
>   child$parent <- parent
>
> When I tried to check whether a node is its parent's first child using
> if (node$parent$child1 == node), it always returned FALSE. Then I
> realized that it does not really work, because parent$child1 <- child
> actually makes a copy of child instead of referencing it.
>
> I think one possible fix is to keep a list of node objects, and make
> references using the positions in the list. Then the first-child test
> can be rewritten as
> if (nodes[[nodes[[nodeId]]$parent]]$child1 == nodeId). However, I
> would prefer not to implement trees in this way, as it requires
> inconvenient and error-prone manipulation of node IDs.
>
> May I know if there is a way to make object references to lists? Or
> are there other ways to implement tree data structures in R? BTW, I
> checked how hclust was implemented, and noticed that it calls an
> external Fortran program. I would want a solution not involving any
> external programs. Thanks.
[R] corAR1 in a random effects panel
(I am posting the same question again, as someone replied to my previous question with his own question...)

Hi everyone,

I am interested in estimating this type of random effects panel:

  y_it = x'_it * beta + u_it + e_it
  u_it = rho * u_{i,t-1} + d_it,   rho in (-1, 1)

where u and e are independent and normally zero-mean distributed, and d is also independently normally zero-mean distributed. So, I want random effects for group i to be correlated in t, following an AR(1) process. I am using the lme command, including correlation=corAR1:

  lme(asis ~ prec + pobl + gola + entr, random = ~ 1 | codi,
      correlation = corAR1(0.8, form = ~ temp | codi))

  i = codi
  t = temp

I am not sure whether the AR(1) process is applied to the random effects (u_it) or the error term (e_it)... Any idea? Thanks. G

--
Guillermo Villa
Universidad Carlos III de Madrid, Business Economics Department
Office 6.0.26, Madrid, 126, 28903 Getafe (Madrid), SPAIN
Email: [EMAIL PROTECTED]  Phone: (+34) 916249772  Mobile: (+34) 655112743
Fax: (+34) 916249607  Skype: guillermo.villa
Website: www.guillermovilla.com
[R] Hierarchical partitioning
Hello,

I repeat my question concerning hierarchical partitioning: is it possible to run it for a multinomial model? I have adapted the function hier.part (hier.part package) to my multinomial model. I met a problem with the partition() function: I obtain NEGATIVE independent contributions! It seems to be a huge error! Unfortunately, I cannot inspect the code as it calls C code; consequently this function is a black box for me. Could anyone help me? I really need to investigate these independent contributions!

Thanks a lot,

Amélie Vaniscotte (PhD)
Department of Environmental Biology, UsC INRA EA3184 MRT
Institute for Environmental Sciences and Technology
University of Franche-Comté
Place Leclerc, 25030 Besançon cedex, FRANCE
Tel.: +33 (0)381 665 714  Fax: +33 (0)381 665 797
[EMAIL PROTECTED]
http://lbe.univ-fcomte.fr/
Re: [R] Implementing trees in R
Let me rephrase that. Lists do not support references, but they could be used to represent trees.

  list(a = 1, b = list(2, 3, d = list(4, 5)), c = list(4, 5))

is a tree whose top nodes are a, b, c; b contains three nodes (2, 3 and d) and d contains two nodes. However, if you want to do it via references as requested, then lists are not appropriate.

On 3/16/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
> Lists are not good for this. There is an example in section 3.3 of the
> proto vignette of using proto objects for this. That section also
> references an S4 example, although it's pretty messy with S4. You
> might want to look at the graph, RBGL and graphviz packages in
> Bioconductor and the dynamicgraph, mathgraph and sna packages on CRAN.
>
> On 3/16/07, Yuk Lap Yip (Kevin) [EMAIL PROTECTED] wrote:
> > [...]
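A tree stored in that nested-list form is easy to walk recursively; for instance, a small sketch counting its leaves (count_leaves() is an illustrative helper written here, not an existing function):

```r
# A tree as nested lists (top-level nodes a, b, c; b contains a subtree d)
tree <- list(a = 1, b = list(2, 3, d = list(4, 5)), c = list(4, 5))

# Recursive traversal: any non-list element is a leaf
count_leaves <- function(node) {
  if (!is.list(node)) return(1L)
  sum(vapply(node, count_leaves, integer(1)))
}

count_leaves(tree)  # 7
```

The same pattern (recurse on list children, act on leaves) covers depth-first search, pretty-printing, and similar read-only tree operations; it is only in-place mutation that the copy semantics of lists make awkward.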
Re: [R] Implementing trees in R
Hi Gabor,

Thanks for the suggestions. I tried to look for the proto vignette document but could not find it; could you tell me how to reach it? Besides, is it possible to define my own node object type such that the default behavior of the <- operator for its member variables is referencing rather than copying? Any good reference material / similar code examples? Thanks.

Gabor Grothendieck wrote:
> Lists are not good for this. There is an example in section 3.3 of the
> proto vignette of using proto objects for this. That section also
> references an S4 example, although it's pretty messy with S4. You
> might want to look at the graph, RBGL and graphviz packages in
> Bioconductor and the dynamicgraph, mathgraph and sna packages on CRAN.
>
> On 3/16/07, Yuk Lap Yip (Kevin) [EMAIL PROTECTED] wrote:
> > [...]

--
God bless.
Kevin
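For reference, base R environments also give the reference semantics asked about in this thread, without any add-on package; a sketch (new_node() is an illustrative helper written here, not an existing function):

```r
# Nodes as environments: assignment stores a reference, not a copy.
new_node <- function() new.env(parent = emptyenv())

parent <- new_node()
child  <- new_node()
parent$child1 <- child   # both names now refer to the same environment
child$parent  <- parent

# identical() compares environments by reference, so the first-child
# test that failed with lists now succeeds
is_first_child <- identical(child$parent$child1, child)
is_first_child  # TRUE
```

Because `parent$child1 <- child` stores a reference to the child environment rather than copying it, cyclic parent/child links work, and later mutations of a node are visible from every place that holds it.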
[R] ideas to speed up code: converting a matrix of integers to a matrix of normally distributed values
Hi all,

[this is a bit hard to describe, so if my initial description is confusing, please try running my code below]

#WHAT I'M TRYING TO DO
I'd appreciate any help in trying to speed up some code. I've written a script that converts a matrix of integers (usually between 1-10,000 - these represent allele names) into two new matrices of normally distributed values (representing genetic effects), such that a given number in the integer (allele name) matrix always corresponds to the *same* randomly distributed (genetic) effects. For example, every time my function sees the allele name 3782, it converts that integer into the same two effects (e.g., -.372 1.727), which follow normal distributions (these effects can be correlated; below I've set their correlation to .5). I have an entire matrix of integers, and am converting those into two entire matrices of effects.

#HOW I'VE DONE IT SO FAR
To get the correlations between the effects, I've used the mvrnorm function from MASS. To convert the allele names to genetic effects, I've created a function (make.effects) that resets set.seed() to the allele name each time it's called. To get the matrix of genetic effects, I use sapply.

#THE PROBLEM
The problem is that I often need to convert matrices that have 500K cells, and do this over several hundred iterations, so it is quite slow. If anyone has ideas to speed this up (e.g., some clever way to convert a matrix of integers to a matrix of effects without using sapply), I would be forever indebted.
##MY CODE

  library(MASS)

  ## run this example to see what I'm talking about above
  make.effects <- function(x, mn = 0, var = 1, cov = .5) {
    set.seed(x)
    return(mvrnorm(1, mu = c(mn, mn),
                   Sigma = matrix(c(var, cov, cov, var), nrow = 2),
                   empirical = FALSE))
  }

  (alleles <- matrix(c(5400,3309,2080,1080,2080,,389,9362,6372,3787,2798,1009),
                     ncol = 4))
  a12 <- array(sapply(alleles, make.effects),
               dim = c(2, nrow(alleles), ncol(alleles)))
  (a1 <- a12[1,,])
  (a2 <- a12[2,,])
  ## I've set the population correlation = .5; empirical corr = .4635
  cor(as.vector(a1), as.vector(a2))

  ## run this to see that the code begins to get pretty slow with even
  ## a 3000x4 matrix
  system.time({
    alleles <- matrix(rep(c(5400,3309,2080,1080,2080,,389,9362,6372,3787,2798,1009),
                          1000), ncol = 4)
    a12 <- array(sapply(alleles, make.effects),
                 dim = c(2, nrow(alleles), ncol(alleles)))
    a1 <- a12[1,,]
    a2 <- a12[2,,]
  })

--
Matthew C Keller
Postdoctoral Fellow
Virginia Institute for Psychiatric and Behavioral Genetics
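One common way to avoid the per-cell sapply()/set.seed() cost (a sketch written here, not Matthew's code) is to draw the correlated effects once per *unique* allele name and map them back onto the full matrix with match(); when allele names repeat heavily this removes almost all of the mvrnorm calls. The allele values and the top-level seed below are illustrative choices:

```r
library(MASS)  # for mvrnorm

set.seed(42)  # seed chosen here only so the sketch is reproducible
alleles <- matrix(sample(1:50, 3000 * 4, replace = TRUE), ncol = 4)

# Draw one pair of correlated effects per *unique* allele name...
u <- unique(as.vector(alleles))
Sigma <- matrix(c(1, .5, .5, 1), nrow = 2)
eff <- mvrnorm(length(u), mu = c(0, 0), Sigma = Sigma)

# ...then map them back onto the full matrix with match():
idx <- match(alleles, u)
a1 <- matrix(eff[idx, 1], nrow = nrow(alleles))
a2 <- matrix(eff[idx, 2], nrow = nrow(alleles))

# every occurrence of a given allele name gets the same pair of effects
length(unique(a1[alleles == u[1]]))  # 1
```

If the same allele must also map to the same effects across independent runs, the set.seed() trick can be applied once per element of u rather than once per cell; with many repeated alleles that is still a large saving.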
Re: [R] Bad points in regression
(mount soapbox...) While I know the prior discussion represents common practice, I would argue -- perhaps even plead -- that the modern(?? 30 years old now) alternative of robust/resistant estimation be used, especially in the readily available situation of least-squares regression. RSiteSearch(Robust) will bring up numerous possibilities.rrcov and robustbase are at least two packages devoted to this, but the functionality is available in many others (e.g. rlm() in MASS). Bert Gunter Genentech Nonclinical Statistics South San Francisco, CA 94404 650-467-7374 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ted Harding Sent: Friday, March 16, 2007 6:44 AM To: r-help@stat.math.ethz.ch Subject: Re: [R] Bad points in regression On 16-Mar-07 12:41:50, Alberto Monteiro wrote: Ted Harding wrote: alpha - 0.3 beta - 0.4 sigma - 0.5 err - rnorm(100) err[15] - 5; err[25] - -4; err[50] - 10 x - 1:100 y - alpha + beta * x + sigma * err ll - lm(y ~ x) plot(ll) ll is the output of a linear model fiited by lm(), and so has several components (see ?lm in the section Value), one of which is residuals (which can be abbreviated to res). So, in the case of your example, which(abs(ll$res)2) 15 25 50 extracts the information you want (and the 2 was inspired by looking at the residuals plot from your plot(ll)). Ok, but how can I grab those points _in general_? What is the criterium that plot used to mark those points as bad points? Ahh ... ! I see what you're after. OK, look at the plot method for lm(): ?plot.lm ## S3 method for class 'lm': plot(x, which = 1:4, caption = c(Residuals vs Fitted, Normal Q-Q plot, Scale-Location plot, Cook's distance plot), panel = points, sub.caption = deparse(x$call), main = , ask = prod(par(mfcol)) length(which) dev.interactive(), ..., id.n = 3, labels.id = names(residuals(x)), cex.id = 0.75) where (see further down): id.n: number of points to be labelled in each plot, starting with the most extreme. 
and note, in the default parameter-values listing above: id.n = 3 Hence, the 3 most extreme points (according to the criterion being plotted in each plot) are marked in each plot. So, for instance, try plot(ll, id.n=5) and you will get points 10,15,25,28,50. And so on. But that pre-supposes that you know how many points are exceptional. What is meant by "extreme" is not stated in the help page ?plot.lm, but can be identified by inspecting the code for plot.lm(), which you can see by entering plot.lm In your example, if you omit the line which assigns anomalous values to err[15], err[25] and err[50], then you are likely to observe that different points get identified on different plots. For instance, I just got the following results for the default id.n=3: [1] Residuals vs Fitted: 41,53,59 [2] Standardised Residuals: 41,53,59 [3] sqrt(Stand Res) vs Fitted: 41,53,59 [4] Cook's Distance: 59,96,97 There are several approaches (with somewhat different outcomes) to identifying outliers. If you apply one of these, you will probably get the identities of the points anyway. Again in the context of your example (where in fact you deliberately set 3 points to have exceptional errors, thus coincidentally the same as the default value 3 of id.n), you could try different values for id.n and inspect the graphs to see whether a given value of id.n marks some points that do not look exceptional relative to the mass of the other points. So, the above plot(ll, id.n=5) gave me one point, 10, on the residuals plot, which apparently belonged to the general distribution of residuals. Hoping this helps, Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 16-Mar-07 Time: 13:43:54 -- XFMail -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
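For the first panel (Residuals vs Fitted), the "id.n most extreme" rule can be reproduced directly from the raw residuals. A sketch of that idea follows; the helper name top.residuals is mine, not part of plot.lm(), and note that the other panels rank by standardised residuals or Cook's distance instead.

```r
## Hypothetical helper (not in R itself): indices of the n largest
## absolute residuals, mirroring plot.lm()'s default id.n = 3 labelling
## on the residuals-vs-fitted panel only.
top.residuals <- function(fit, n = 3) {
  r <- residuals(fit)
  sort(order(abs(r), decreasing = TRUE)[seq_len(n)])
}

## Rebuild Alberto's example:
set.seed(1)
x <- 1:100
err <- rnorm(100)
err[15] <- 5; err[25] <- -4; err[50] <- 10
y <- 0.3 + 0.4 * x + 0.5 * err
ll <- lm(y ~ x)
top.residuals(ll, 3)   # recovers the three planted outliers: 15 25 50
```

This avoids the hard-coded cutoff in which(abs(ll$res) > 2) and matches the "top n" semantics of id.n.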
[R] Fast lookup in ragged array
Hello, I'm running an algorithm for graph structural cohesion that requires a depth-first search of subgraphs of a rather large network. The algorithm will necessarily be redundant in the subgraphs it recurses to, so to speed up the process I implemented a check at each subgraph to see if it's been searched already. This algorithm is very slow, and takes days to complete on a graph with about 200 nodes. It seems that a significant portion of the computation time is spent looking up the current subgraph in the list of searched subgraphs to see if it is redundant, and I'm wondering if there's a faster way to do this. Currently, the list of searched subgraphs is a list (`theseSearchedBranches`) of unnamed numeric vectors. I'm checking against the list using the following command: if(list(v) %in% theseSearchedBranches){ cat("Branch already searched: passing.\n\n") return(NULL) } v is a sorted numeric vector, with length anywhere between 3 and 200. Because theseSearchedBranches gets quite long as the algorithm progresses, the %in% command is taking maybe one or two seconds per lookup. So to reiterate, I have a list() that looks something like this: [[1]] [1] 0 5 6 11 12 13 14 16 19 21 23 24 26 30 31 39 41 [18] 56 72 75 76 85 95 105 110 134 158 159 165 186 [[2]] [1] 0 5 6 11 12 13 14 16 19 21 23 24 26 30 31 39 41 [18] 56 72 75 76 85 95 105 110 134 147 159 165 186 [[3]] [1] 0 5 6 11 12 13 14 16 19 21 23 24 26 30 31 39 41 [18] 50 56 72 75 85 95 105 110 134 147 158 159 165 186 ... and so on for tens of thousands of entries, and I am trying to find some sort of fast equivalent for %in% to search it. I'm also not adding the vectors to the list in any particular order, as I don't think %in% would know how to take advantage of that anyway. Is there a data structure other than list() that I can use that would be faster? Would it be better to just create a hashed env and add empty variables named 0.5.6.11.12...? 
I know there are fast lookup algorithms out there that could take advantage of the fact that the items being searched are individually ordered numeric vectors, but I can't find any info about R implementations on the mailing lists or help. Is there any data type that implements a b-tree type of lookup scheme? Any help would be greatly appreciated. Thanks, Peter __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
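Peter's last idea (a hashed environment keyed on the collapsed vector) can be sketched as below. This is not from the thread; the dot-separated key format is an assumption, and any unambiguous collapse of the sorted vector would do. Lookups in a hashed environment are close to constant time, versus the linear scan %in% does over the list.

```r
## Sketch: a hashed environment used as a set of already-searched branches.
searched <- new.env(hash = TRUE)

## Key construction is an assumption: collapse the sorted vector to a string.
branch.key <- function(v) paste(v, collapse = ".")

mark.searched <- function(v) assign(branch.key(v), TRUE, envir = searched)
## inherits = FALSE keeps exists() from also searching enclosing frames.
already.searched <- function(v) exists(branch.key(v), envir = searched,
                                       inherits = FALSE)

v <- c(0, 5, 6, 11, 12, 13)
already.searched(v)   # FALSE
mark.searched(v)
already.searched(v)   # TRUE
```

Because the vectors are already sorted before insertion, equal subgraphs always produce equal keys, which is what makes the string key safe.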
Re: [R] how to...
for example: I have got these data, organized in a dataframe.

            sample1 sample2 sample3 sample4 group
replicate1     1.00    0.02    0.35    0.50     A
replicate2     1.00    0.02    1.54    1.11     A
replicate3     1.00    0.02    1.54    1.11     A
replicate4     1.00    0.02    1.54    1.11     A
replicate5     1.00    0.10    0.18    0.72     B
replicate6  1000.00    0.75    0.86    7.26     B
replicate7  1000.00    0.75    0.18    0.36     B
replicate8  1000.00    0.75   12.09    0.74     B
replicate9  1000.00    0.75   12.09    0.84     C
replicate10 1000.00    0.98    0.65    0.50     C
replicate11    2.00    6.00    6.00    2.00     C
replicate12    6.00    6.00    2.00    6.00     C

the first four columns represent the different samples I have to test with ANOVA; the last column is related to the group of each entry. Using aov() I can run a test on each column, but I would like to run the ANOVAs for each column (which in my case are hundreds) in an automated way. I can't set up a working script with the loop in this case, surely because of my scarce knowledge in programming. can you help me? the next problem is how to collect the results in a simple way, for example having them organized in a table such as SAMPLE ANOVA sample1 ok sample2 ok sample3 not significant thank you so much -- Initial Header --- From : Petr Pikal [EMAIL PROTECTED] To : [EMAIL PROTECTED] [EMAIL PROTECTED],R Help R-help@stat.math.ethz.ch Cc : Date : Thu, 15 Mar 2007 20:38:25 +0100 Subject : Re: [R] how to... Hi I suppose you will not get a useful response for such a poorly specified question. For automating procedures on data frames you can either do looping, or use lapply, or maybe do.call can also provide some functionality. If you elaborate on what you did and in what respect it was unsatisfactory, maybe you will get a better answer. Anyway, before your next post you should look at the posting guide. Regards Petr On 15 Mar 2007 at 17:20, [EMAIL PROTECTED] wrote: Date sent: Thu, 15 Mar 2007 17:20:57 +0100 From: [EMAIL PROTECTED] [EMAIL PROTECTED] To: R Help R-help@stat.math.ethz.ch Subject: [R] how to... I have to perform ANOVA's on many different data organized in a dataframe. 
I can run an ANOVA for each sample, but I've got hundreds of data and I would like to avoid manually carrying out each test. In addition, I would like to have the results organized in a simple way, for example in a table, which could be easy to export. thank you for assistance simone __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Petr Pikal [EMAIL PROTECTED] 
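The loop simone is asking for can be written with sapply() over the sample columns. A minimal sketch follows, with made-up data; the data-frame layout mirrors the example above, while the 0.05 cutoff and the "ok"/"not significant" labels are assumptions taken from the desired output table.

```r
## Sketch: run aov() on every sample column against the group factor,
## collect p-values, and build a summary table.
set.seed(42)
dat <- data.frame(sample1 = rnorm(12),  # no group effect
                  sample2 = rnorm(12, mean = rep(c(0, 3, 6), each = 4)),  # strong effect
                  group   = factor(rep(c("A", "B", "C"), each = 4)))

samples <- setdiff(names(dat), "group")
pvals <- sapply(samples, function(s) {
  fit <- aov(dat[[s]] ~ dat$group)
  summary(fit)[[1]][["Pr(>F)"]][1]   # first row = the group term
})

results <- data.frame(SAMPLE = samples,
                      ANOVA  = ifelse(pvals < 0.05, "ok", "not significant"))
results   # one row per sample; write.csv(results, "anova.csv") exports it
```

With hundreds of columns the same sapply() call works unchanged, which is the automation the poster is after.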
Re: [R] Implementing trees in R
1. Here is your example redone using proto: library(proto) parent <- proto() child <- proto(a = 1) parent$child1 <- child child$parent.env <- parent # also just for illustration let's change a parent$child1$a # 1 child$a <- 2 parent$child1$a # 2 2. To redefine $<- use S3 or S4, but it can be done in conjunction with proto like this: # constructor node <- function(. = parent.frame(), ...) structure(proto(...), class = c("node", "proto")) "$<-.node" <- function(this, s, value) { if (s == ".super") parent.env(this) <- value if (is.function(value)) environment(value) <- this if (inherits(value, "node")) parent.env(value) <- this this[[as.character(substitute(s))]] <- value this } p <- node(a = 1) p$child <- node(b = 2) p$child$parent.env() p # same On 3/16/07, Yuk Lap Yip (Kevin) [EMAIL PROTECTED] wrote: Hi Gabor, Thanks for the suggestions. I tried to look for the proto vignette document but could not find it; could you tell me how to reach it? Besides, is it possible to define my own node object type with a default behavior for the <- operator of its member variables being referencing rather than copying? Any good reference material / similar code examples? Thanks. Gabor Grothendieck wrote: Lists are not good for this. There is an example in section 3.3 of the proto vignette of using proto objects for this. That section also references an S4 example, although it's pretty messy with S4. You might want to look at the graph, RBGL and graphviz packages in Bioconductor and the dynamicgraph, mathgraph and sna packages on CRAN. On 3/16/07, Yuk Lap Yip (Kevin) [EMAIL PROTECTED] wrote: Hi all, I am rather new to R. Recently I have been trying to implement some tree algorithms in R. I used lists to model tree nodes. I thought something like this would work: parent <- list(); child <- list(); parent$child1 <- child; child$parent <- parent; When I tried to check whether a node is its parent's first child using if (node$parent$child1 == node), it always returned false. 
Then I realized that it does not really work because parent$child1 <- child actually makes a copy of child instead of referencing it. I think one possible fix is to keep a list of node objects, and make references using the positions in the list. For example, I think the following would work: parent <- list(); child <- list(); nodes <- list(parent, child); parent$child1 <- 2; child$parent <- 1; Then the first child test can be rewritten as if (nodes[[nodes[[nodeId]]$parent]]$child1 == nodeId). However, I would prefer not to implement trees in this way, as it requires the inconvenient and error-prone manipulations of node IDs. May I know if there is a way to make object references to lists? Or are there other ways to implement tree data structures in R? BTW, I checked how hclust was implemented, and noticed that it calls an external Fortran program. I would want a solution not involving any external programs. Thanks. -- God bless. Kevin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- God bless. Kevin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
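Besides proto, one plain-base-R alternative worth noting (not suggested in the thread itself): environments have reference semantics, so they can serve directly as mutable tree nodes. The names below (new.node, add.child) are my own sketch, not an established API.

```r
## Sketch: tree nodes as environments. Assigning an environment copies a
## reference, not the object, so parent/child links stay consistent.
new.node <- function(value = NULL) {
  node <- new.env()
  node$value <- value
  node$parent <- NULL
  node$children <- list()
  node
}

add.child <- function(parent, child) {
  parent$children[[length(parent$children) + 1]] <- child
  child$parent <- parent
  invisible(child)
}

root <- new.node("root")
kid  <- add.child(root, new.node("kid"))

## identical() compares environments by reference, so the "first child"
## test that failed for lists works here:
identical(root$children[[1]], kid)   # TRUE
identical(kid$parent, root)          # TRUE
```

This is essentially what proto wraps with nicer syntax; bare environments need no extra package.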
Re: [R] Implementing trees in R
On 3/16/07, Gabor Grothendieck [EMAIL PROTECTED] wrote: 1. Here is your example redone using proto: library(proto) parent <- proto() child <- proto(a = 1) parent$child1 <- child child$parent.env <- parent This last line should have been: parent.env(child) <- parent # also just for illustration let's change a parent$child1$a # 1 child$a <- 2 parent$child1$a # 2 2. To redefine $<- use S3 or S4 but it can be done in conjunction with proto like this: # constructor node <- function(. = parent.frame(), ...) structure(proto(...), class = c("node", "proto")) "$<-.node" <- function(this, s, value) { if (s == ".super") parent.env(this) <- value if (is.function(value)) environment(value) <- this if (inherits(value, "node")) parent.env(value) <- this this[[as.character(substitute(s))]] <- value this } p <- node(a = 1) p$child <- node(b = 2) p$child$parent.env() p # same On 3/16/07, Yuk Lap Yip (Kevin) [EMAIL PROTECTED] wrote: Hi Gabor, Thanks for the suggestions. I tried to look for the proto vignette document but could not find it, could you tell me how to reach it? Besides, is it possible to define my own node object type with a default behavior for the <- operator of its member variables being referencing rather than copying? Any good reference material/ similar code examples? Thanks. Gabor Grothendieck wrote: Lists are not good for this. There is an example in section 3.3 of the proto vignette of using proto objects for this. That section also references an S4 example although its pretty messy with S4. You might want to look at the graph, RBGL and graphviz packages in Bioconductor and the dynamicgraph, mathgraph and sna packages on CRAN. On 3/16/07, Yuk Lap Yip (Kevin) [EMAIL PROTECTED] wrote: Hi all, I am rather new to R. Recently I have been trying to implement some tree algorithms in R. I used lists to model tree nodes. 
I thought something like this would work: parent <- list(); child <- list(); parent$child1 <- child; child$parent <- parent; When I tried to check whether a node is its parent's first child using if (node$parent$child1 == node), it always returned false. Then I realized that it does not really work because parent$child1 <- child actually makes a copy of child instead of referencing it. I think one possible fix is to keep a list of node objects, and make references using the positions in the list. For example, I think the following would work: parent <- list(); child <- list(); nodes <- list(parent, child); parent$child1 <- 2; child$parent <- 1; Then the first child test can be rewritten as if (nodes[[nodes[[nodeId]]$parent]]$child1 == nodeId). However, I would prefer not to implement trees in this way, as it requires the inconvenient and error-prone manipulations of node IDs. May I know if there is a way to make object references to lists? Or are there other ways to implement tree data structures in R? BTW, I checked how hclust was implemented, and noticed that it calls an external Fortran program. I would want a solution not involving any external programs. Thanks. -- God bless. Kevin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- God bless. Kevin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Duplicated non contiguous element in a list
Hello, Given a vector I would like to rapidly identify duplicated non contiguous elements. Given for example c(1,1,2,3,2,4,5,6,4) I would like to get: FALSE FALSE TRUE FALSE TRUE TRUE FALSE FALSE TRUE In fact I need to check this on the columns of a matrix! I can do that of course with loops, but is there any function already available? Thanks __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] log(1+exp(x)) function
Hello, I would like to know whether R implements efficient approximations of the functions log(1+exp(x)) and log(1-exp(x)). It seems to be the case for the latter one, but I'd like to make sure. Thank you in advance. Feriel Lahlali Graduate student - France. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
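For the record, base R's log1p() and expm1() are the accurate building blocks for both expressions. The guarded form below for large positive x (where exp(x) overflows) is a standard identity, sketched here rather than something base R ships as a single function:

```r
## Sketch: numerically careful log(1+exp(x)) and log(1-exp(x)).
## For x > 0 use the identity log(1+exp(x)) = x + log1p(exp(-x)),
## which avoids overflow of exp(x).
log1pexp <- function(x) ifelse(x > 0, x + log1p(exp(-x)), log1p(exp(x)))

## 1 - exp(x) cancels catastrophically near x = 0; -expm1(x) computes it
## accurately. Only defined for x < 0.
log1mexp <- function(x) log(-expm1(x))

log1pexp(c(-40, 0, 40))   # ~ c(4.2e-18, log(2), 40)
log1mexp(-1e-17)          # ~ log(1e-17); the naive log(1 - exp(x)) gives -Inf here
```

The two helper names are assumptions for this sketch (the same names later appeared in packages, but the poster should not rely on that).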
Re: [R] Fast lookup in ragged array
Well, I hadn't ever seen RBGL before, so that's great. I've been using igraph and sna mainly, but there are a few points lacking between these two. RBGL solves a lot of problems for me! But I'm not sure it will solve this specific problem. Are you suggesting I use RBGL to do a depth-first search of all the subgraphs? For this particular depth-first search I'm not searching every subgraph, but just those that are constructed from a minimal cutset of the parent subgraph. At each level of the search, I have to compute graph cohesion (vertex connectivity), which can take considerable time. A lot of computation time is saved by only searching subgraphs obtained through cutsets. So a complete search of all the subgraphs won't work, but the redundancy I come across is I think unavoidable. The particular algorithm I'm trying to implement is Moody and White's cohesive blocking, in which the end result is a nested set of all subgraphs with a higher cohesion (connectivity) than their parents. (see http://www.santafe.edu/research/publications/workingpapers/00-08-049.pdf ) On Mar 16, 2007, at 11:00 AM, Robert Gentleman wrote: why not just use the graph package and RBGL at www.bioconductor.org Peter McMahan wrote: Hello, I'm running an algorithm for graph structural cohesion that requires a depth-first search of subgraphs of a rather large network. The algorithm will necessarily be redundant in the subgraphs it recurses to, so to speed up the process I implemented a check at each subgraph to see if it's been searched already. This algorithm is very slow, and takes days to complete on a graph with about 200 nodes. It seems that a significant portion of the computation time is spent looking up the current subgraph in the list of searched subgraphs to see if it is redundant, and I'm wondering if there's a faster way to do this. Currently, the list of searched subgraphs is a list (`theseSearchedBranches`) of unnamed numeric vectors. 
I'm checking against the list using the following command: if(list(v) %in% theseSearchedBranches){ cat("Branch already searched: passing.\n\n") return(NULL) } v is a sorted numeric vector, with length anywhere between 3 and 200. Because theseSearchedBranches gets quite long as the algorithm progresses, the %in% command is taking maybe one or two seconds per lookup. So to reiterate, I have a list() that looks something like this: [[1]] [1] 0 5 6 11 12 13 14 16 19 21 23 24 26 30 31 39 41 [18] 56 72 75 76 85 95 105 110 134 158 159 165 186 [[2]] [1] 0 5 6 11 12 13 14 16 19 21 23 24 26 30 31 39 41 [18] 56 72 75 76 85 95 105 110 134 147 159 165 186 [[3]] [1] 0 5 6 11 12 13 14 16 19 21 23 24 26 30 31 39 41 [18] 50 56 72 75 85 95 105 110 134 147 158 159 165 186 ... and so on for tens of thousands of entries, and I am trying to find some sort of fast equivalent for %in% to search it. I'm also not adding the vectors to the list in any particular order, as I don't think %in% would know how to take advantage of that anyway. Is there a data structure other than list() that I can use that would be faster? Would it be better to just create a hashed env and add empty variables named 0.5.6.11.12...? I know there are fast lookup algorithms out there that could take advantage of the fact that the items being searched are individually ordered numeric vectors, but I can't find any info about R implementations on the mailing lists or help. Is there any data type that implements a b-tree type of lookup scheme? Any help would be greatly appreciated. Thanks, Peter __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Robert Gentleman, PhD Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. 
N, M2-B876 PO Box 19024 Seattle, Washington 98109-1024 206-667-7700 [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] MANOVA permutation testing
On Fri, 2007-03-16 at 00:50 +, Tyler Smith wrote: Hi, I've got a dataset with 7 variables for 8 different species. I'd like to test the null hypothesis of no difference among species for these variables. MANOVA seems like the appropriate test, but since I'm unsure of how well the data fit the assumptions of equal variance/covariance and multivariate normality, I want to use a permutation test. I've been through CRAN looking at packages boot, bootstrap, coin, permtest, but they all seem to be doing more than I need. Is the following code an appropriate way to test my hypothesis: result.vect <- c() for (i in 1:1000){ wilks <- summary.manova(manova(maxent~sample(max.spec)), test="Wilks")$stats[1,2] result.vect <- c(result.vect, wilks) } maxent is the data, max.spec is a vector of species names. Comparing result.vect with the Wilks value for the unpermuted data suggests there are very significant differences among species -- but did I do this properly? Hi Tyler, (without knowing more about your data) I think you are almost there, but the R code can be made much more efficient. When you create your result vector, it is of length 0. Each time you add a result, R has to copy the current result object, enlarge it, and so on. This all takes a lot of time. Better to allocate storage first, then fill in each result in turn by replacement. E.g.: Using an example from ?summary.manova tear <- c(6.5, 6.2, 5.8, 6.5, 6.5, 6.9, 7.2, 6.9, 6.1, 6.3, 6.7, 6.6, 7.2, 7.1, 6.8, 7.1, 7.0, 7.2, 7.5, 7.6) gloss <- c(9.5, 9.9, 9.6, 9.6, 9.2, 9.1, 10.0, 9.9, 9.5, 9.4, 9.1, 9.3, 8.3, 8.4, 8.5, 9.2, 8.8, 9.7, 10.1, 9.2) opacity <- c(4.4, 6.4, 3.0, 4.1, 0.8, 5.7, 2.0, 3.9, 1.9, 5.7, 2.8, 4.1, 3.8, 1.6, 3.4, 8.4, 5.2, 6.9, 2.7, 1.9) Y <- cbind(tear, gloss, opacity) rate <- factor(gl(2,10), labels=c("Low", "High")) ## define number of permutations nperm <- 999 ## allocate storage, here we want 999 + 1 for our observed stat res <- numeric(nperm+1) ## do the loop - the seq(along ...) 
bit avoids certain problems for(i in seq(along = res[-1])) { ## here we replace the ith value in the vector res with the stat res[i] <- summary(manova(Y ~ sample(rate)), test = "Wilks")$stats[1,2] } ## now we append the observed stat onto the end of the result vector res ## we also store this in 'obs' for convenience res[nperm+1] <- obs <- summary(manova(Y ~ rate), test = "Wilks")$stats[1,2] ## this is the permutation p-value - the proportion of the nperm ## permutations + 1 that are greater than or equal to the ## observed stat 'obs' sum(res >= obs) / (nperm+1) HTH, G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Duplicated non contiguous element in a list
Maybe not the most efficient, but here's a vector-level solution: non.contig.dupes <- function(x){ is.dup <- x %in% x[duplicated(x)] is.con <- c(x[-length(x)]==x[-1],F) | c(F,x[-length(x)]==x[-1]) is.dup & !is.con} On Mar 16, 2007, at 11:14 AM, Bruno C.. wrote: Hello, Given a vector I would like to rapidly identify duplicated non contiguous elements. Given for example c(1,1,2,3,2,4,5,6,4) I would like to get: FALSE FALSE TRUE FALSE TRUE TRUE FALSE FALSE TRUE In fact I need to check this on the columns of a matrix! I can do that of course with loops, but is there any function already available? Thanks __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
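A self-contained restatement of that vector-level idea ("duplicated somewhere, but not equal to either neighbour"), applied column-wise with apply() as the original question asked:

```r
## Sketch: non-contiguous duplicates, restated runnably.
non.contig.dupes <- function(x) {
  is.dup <- x %in% x[duplicated(x)]          # duplicated anywhere in x
  is.con <- c(x[-length(x)] == x[-1], FALSE) |   # equal to next element
            c(FALSE, x[-length(x)] == x[-1])     # or to previous element
  is.dup & !is.con
}

x <- c(1, 1, 2, 3, 2, 4, 5, 6, 4)
non.contig.dupes(x)
# FALSE FALSE TRUE FALSE TRUE TRUE FALSE FALSE TRUE

## Column-wise over a matrix, which is Bruno's actual use case:
m <- cbind(a = x, b = rev(x))
apply(m, 2, non.contig.dupes)   # one logical column per matrix column
```

Note that x %in% x[duplicated(x)] flags every occurrence of a repeated value, which is what makes the contiguity mask sufficient.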
Re: [R] ideas to speed up code: converting a matrix of integers to a matrix of normally distributed values
Considering that the vast majority of your time is spent in the function mvrnorm (on my system 5.7 out of 6.1 seconds). In your example that is 12000 calls to the function. To improve your speed you have to cut down the number of calls to the function. For example, how many unique integers do you have? Can you do the calls for those and then substitute matching values? Here is what Rprof showed:

            total.time total.pct self.time self.pct
system.time       6.12      99.7      0.00      0.0
as.vector         6.06      98.7      0.18      2.9
FUN               6.06      98.7      0.12      2.0
array             6.06      98.7      0.10      1.6
lapply            6.06      98.7      0.00      0.0
sapply            6.06      98.7      0.00      0.0
eval              6.04      98.4      0.06      1.0
mvrnorm           5.72      93.2      0.34      5.5
eigen             2.58      42.0      0.52      8.5

or another way of looking at it:

 0                           6.1 root
  1.                         6.1 system.time
   2. .                      6.0 eval
    3. . .                   6.0 eval
     4. . . .                6.0 array
      5. . . . .             6.0 as.vector
       6. . . . . .          6.0 sapply
        7. . . . . . .       6.0 lapply
         8. . . . . . . .    6.0 FUN
          9. . . . . . . . . 5.7 mvrnorm
          10. . . . . . . . . . 2.6 eigen
          11. . . . . . . . . . . 1.2 sort.list
          12. . . . . . . . . . . . 1.0 match.arg
          13. . . . . . . . . . . . . 0.7 eval

On 3/16/07, Matthew Keller [EMAIL PROTECTED] wrote: Hi all, [this is a bit hard to describe, so if my initial description is confusing, please try running my code below] #WHAT I'M TRYING TO DO I'd appreciate any help in trying to speed up some code. I've written a script that converts a matrix of integers (usually between 1-10,000 - these represent allele names) into two new matrices of normally distributed values (representing genetic effects), such that a given number in the integer (allele name) matrix always corresponds to the *same* randomly distributed (genetic) effects. For example, every time my function sees the allele name 3782, it converts that integer into the same two effects (e.g., -.372 1.727), which follow normal distributions (these effects can be correlated; below I've set their correlation to .5). I have an entire matrix of integers, and am converting those into two entire matrices of effects. 
#HOW I'VE DONE IT SO FAR To get the correlations between the effects, I've used the mvrnorm function from MASS. To convert the allele names to genetic effects, I've created a function (make.effects) that resets set.seed() to the allele name each time it's called. To get the matrix of genetic effects, I use sapply. #THE PROBLEM The problem is that I often need to convert matrices that have 500K cells, and do this over several hundred iterations, so it is quite slow. If anyone has ideas to speed this up (e.g., some clever way to convert a matrix of integers to a matrix of effects without using sapply), I would be forever indebted. ##MY CODE library(MASS) ##run this example to see what I'm talking about above make.effects <- function(x,mn=0,var=1,cov=.5){ set.seed(x) return(mvrnorm(1,mu=c(mn,mn),Sigma=matrix(c(var,cov,cov,var),nrow=2),empirical=FALSE))} (alleles <- matrix(c(5400,3309,2080,1080,2080,,389,9362,6372,3787,2798,1009),ncol=4)) a12 <- array(sapply(alleles,make.effects),dim=c(2,nrow(alleles),ncol(alleles))) (a1 <- a12[1,,]) (a2 <- a12[2,,]) #I've set the population correlation = .5; empirical corr=.4635 cor(as.vector(a1),as.vector(a2)) ##run this to see that the code begins to get pretty slow with even a 3000x4 matrix system.time({ alleles <- matrix(rep(c(5400,3309,2080,1080,2080,,389,9362,6372,3787,2798,1009),1000),ncol=4) a12 <- array(sapply(alleles,make.effects),dim=c(2,nrow(alleles),ncol(alleles))) a1 <- a12[1,,] a2 <- a12[2,,] }) -- Matthew C Keller Postdoctoral Fellow Virginia Institute for Psychiatric and Behavioral Genetics __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? 
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
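Jim's suggestion (one mvrnorm() call per *unique* allele, then map effects back with match()) might look like the sketch below. The allele values here are illustrative only (one value in the original post's vector was garbled, so this is not a literal reconstruction), and MASS is assumed to be installed:

```r
library(MASS)  # for mvrnorm()

## Same per-allele reproducible effects as the poster's make.effects():
make.effects <- function(x, mn = 0, var = 1, cov = 0.5) {
  set.seed(x)
  mvrnorm(1, mu = c(mn, mn),
          Sigma = matrix(c(var, cov, cov, var), nrow = 2))
}

## Illustrative allele matrix with repeats (2080 and 5400 each occur twice):
alleles <- matrix(c(5400, 3309, 2080, 1080, 2080, 389,
                    9362, 6372, 3787, 2798, 1009, 5400), ncol = 4)

## Compute effects once per unique allele, then broadcast with match():
u   <- unique(as.vector(alleles))
eff <- sapply(u, make.effects)            # 2 x length(u) matrix of effects
idx <- match(alleles, u)                  # position of each cell's allele in u
a1  <- matrix(eff[1, idx], nrow = nrow(alleles))
a2  <- matrix(eff[2, idx], nrow = nrow(alleles))
```

With 500K cells but only thousands of distinct alleles, this cuts the number of mvrnorm() (and set.seed()) calls from one per cell to one per unique value, which is where the profile says the time goes.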
[R] Create coef.table as in summary.glm (with Signif. codes)
Hi, how do I create a coef.table as for example returned by summary.glm? I don't see where the significance codes are assigned during summary.glm. Thank you, Benjamin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
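For what it's worth, the table and the stars are produced by printCoefmat(), which summary.glm's print method calls; the codes themselves come from symnum() with fixed cutpoints. A small sketch (the toy binomial data are made up):

```r
## Sketch: reproduce summary.glm()'s coefficient table and significance codes.
dat <- data.frame(x = 1:6, y = c(1, 2, 4, 6, 8, 9))   # successes out of 10
fit <- glm(cbind(y, 10 - y) ~ x, family = binomial, data = dat)

ct <- summary(fit)$coefficients     # the coefficient-table matrix
printCoefmat(ct, signif.stars = TRUE)   # prints stars plus the Signif. codes legend

## The stars are assigned by symnum() on the p-value column, with the
## same cutpoints the stats package uses:
symnum(ct[, 4], corr = FALSE, na = FALSE,
       cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1),
       symbols = c("***", "**", "*", ".", " "))
```

So the significance codes are never stored in the summary object; they are generated at print time, which is why they don't appear anywhere in summary.glm itself.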
Re: [R] Bad points in regression [Broadcast]
(My turn on the soapbox ...) I'd like to add a bit of caveat to Bert's view. I'd argue (perhaps even plead) that robust/resistant procedures be used with care. They should not be used as a shortcut to avoid careful analysis of data. I recall that in my first course on regression, the professor made it clear that we're fitting models to data, not the other way around. When the model fits badly to (some of) the data, do examine and think carefully about what happened. Verify that bad data are indeed bad, instead of using statistical criteria to make that judgment. A scientific colleague reminded me of this point when I tried to sell him the idea of robust/resistant methods: Don't use these methods as a crutch to stand on badly run experiments (or poorly fitted models). Cheers, Andy From: Bert Gunter (mount soapbox...) While I know the prior discussion represents common practice, I would argue -- perhaps even plead -- that the modern (?? 30 years old now) alternative of robust/resistant estimation be used, especially in the readily available situation of least-squares regression. RSiteSearch("robust") will bring up numerous possibilities. rrcov and robustbase are at least two packages devoted to this, but the functionality is available in many others (e.g. rlm() in MASS). Bert Gunter Genentech Nonclinical Statistics South San Francisco, CA 94404 650-467-7374 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ted Harding Sent: Friday, March 16, 2007 6:44 AM To: r-help@stat.math.ethz.ch Subject: Re: [R] Bad points in regression On 16-Mar-07 12:41:50, Alberto Monteiro wrote: Ted Harding wrote: alpha <- 0.3 beta <- 0.4 sigma <- 0.5 err <- rnorm(100) err[15] <- 5; err[25] <- -4; err[50] <- 10 x <- 1:100 y <- alpha + beta * x + sigma * err ll <- lm(y ~ x) plot(ll) ll is the output of a linear model fitted by lm(), and so has several components (see ?lm in the section "Value"), one of which is residuals (which can be abbreviated to res). 
So, in the case of your example,

which(abs(ll$res) > 2)
15 25 50

extracts the information you want (and the 2 was inspired by looking at the residuals plot from your plot(ll)). Ok, but how can I grab those points _in general_? What is the criterion that plot used to mark those points as bad points? Ahh...! I see what you're after. OK, look at the plot method for lm(): ?plot.lm

## S3 method for class 'lm':
plot(x, which = 1:4,
     caption = c("Residuals vs Fitted", "Normal Q-Q plot",
                 "Scale-Location plot", "Cook's distance plot"),
     panel = points, sub.caption = deparse(x$call), main = "",
     ask = prod(par("mfcol")) < length(which) && dev.interactive(),
     ..., id.n = 3, labels.id = names(residuals(x)), cex.id = 0.75)

where (see further down): id.n: number of points to be labelled in each plot, starting with the most extreme. And note, in the default parameter-values listing above: id.n = 3. Hence, the 3 most extreme points (according to the criterion being plotted in each plot) are marked in each plot. So, for instance, try plot(ll, id.n = 5) and you will get points 10, 15, 25, 28, 50. And so on. But that pre-supposes that you know how many points are exceptional. What is meant by "extreme" is not stated in the help page ?plot.lm, but can be identified by inspecting the code for plot.lm(), which you can see by entering plot.lm. In your example, if you omit the line which assigns anomalous values to err[15], err[25] and err[50], then you are likely to observe that different points get identified on different plots. For instance, I just got the following results for the default id.n = 3: [1] Residuals vs Fitted: 41,53,59 [2] Standardised Residuals: 41,53,59 [3] sqrt(Stand Res) vs Fitted: 41,53,59 [4] Cook's Distance: 59,96,97. There are several approaches (with somewhat different outcomes) to identifying outliers. If you apply one of these, you will probably get the identities of the points anyway.
Again in the context of your example (where in fact you deliberately set 3 points to have exceptional errors, thus coincidentally the same as the default value 3 of id.n), you could try different values for id.n and inspect the graphs to see whether a given value of id.n marks some points that do not look exceptional relative to the mass of the other points. So, the above plot(ll, id.n = 5) gave me one point, 10, on the residuals plot, which apparently belonged to the general distribution of residuals. Hoping this helps, Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 16-Mar-07 Time: 13:43:54
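For readers of the archive, a minimal sketch of Bert's robust alternative applied to the simulated data from this thread (it assumes the MASS package is available; the 2.5 cutoff is an arbitrary illustration, not a rule, and the set.seed call is added here only to make the sketch reproducible):

```r
library(MASS)  # provides rlm()

set.seed(1)  # added for reproducibility; not in the original example
alpha <- 0.3; beta <- 0.4; sigma <- 0.5
err <- rnorm(100)
err[c(15, 25, 50)] <- c(5, -4, 10)  # three gross errors, as in the thread
x <- 1:100
y <- alpha + beta * x + sigma * err

coef(lm(y ~ x))   # least squares: pulled around by the bad points
coef(rlm(y ~ x))  # robust M-estimate: much less affected

# flag large standardised residuals explicitly, rather than relying on id.n:
which(abs(rstandard(lm(y ~ x))) > 2.5)
```

This complements rather than replaces the careful inspection Ted describes: the flagged indices are a starting point for looking at the data, not a verdict.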
[R] ERROR: 'latex' needed but missing on your system.
After successfully building R on Slackware Linux v11.0 I went to make the documentation; the texi files went fine, and then I hopefully issued make dvi after having gotten the warning to the effect of "You cannot build the DVI or PDF manuals during compilation." And, as expected, I got the error: ERROR: 'latex' needed but missing on your system. The problem is that latex is on my system and is in root's path:

/usr/src/R-2.4.1 Super-User echo $PATH
/usr/share/texmf/bin/:/opt/kde/bin/:/uCsr/local/stata{sic}:/usr/local/sbin:/usr/local/bin:/sbin:/usr/sbin:/bin:/usr/bin

I can issue latex from the command line as root (su'd to root, that is) and it will run successfully. Also, whereis latex turns up empty. I did not have this problem on PCLinuxOS 0.93a. Thanks for any suggestions, Joel -- Joel J. Adamson Biostatistician Pediatric Psychopharmacology Research Unit Massachusetts General Hospital Boston, MA 02114 (617) 643-1432 (303) 880-3109 The information transmitted in this electronic communication is intended only for the person or entity to whom it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of or taking of any action in reliance upon this information by persons or entities other than the intended recipient is prohibited. If you received this information in error, please contact the Compliance HelpLine at 800-856-1983 and properly dispose of this information. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Create coef.table as in summary.glm (with Signif. codes)
I have found the function printCoefmat(). Thank you, Benjamin On 3/16/07, Benjamin Dickgiesser [EMAIL PROTECTED] wrote: Hi, how do I create a coef.table as for example returned by summary.glm? I don't see where the significance codes are assigned during summary.glm. Thank you, Benjamin
[R] cumsum over varying column lengths
Folks, I have a matrix historicalReturns, where entry (i, j) is the daily return corresponding to date i and equity j. I also have a matrix startOffset, where entry (1, k) is the row offset in historicalReturns where I entered into equity k. So we have that NCOL(startOffset) = NCOL(historicalReturns). Now I would like to compute, for each column in historicalReturns, the cumulative return 'returnsFromInception' for the equity starting from the startOffset date. Is there a better way than as follows:

n <- NROW(historicalReturns)
returnsFromInception <- matrix(nrow = 1, ncol = NCOL(historicalReturns))
for (i in 1 : NCOL(historicalReturns)) {
    cumReturn <- cumsum(historicalReturns[startOffset[1, i] : n, i])
    returnsFromInception[1, i] <- cumReturn[length(cumReturn)]
}

This works for me, but seems rather inelegant, and I don't like having to use a matrix for returnsFromInception and startOffset, where vectors would work. Thanks for your help. Murali
Re: [R] ideas to speed up code: converting a matrix of integers to a matrix of normally distributed values
Hi, Many thanks to Jim and Martin for their suggestions. Using your ideas, I came up with a solution that indexes rather than uses sapply (which called mvrnorm separately for each cell of the matrix). The trick is to create a key matrix once, and then to use the match() function each time I need to take the results from the key matrix and place them in the appropriate spots in an 'effects' matrix. If anyone is interested, here is a solution which speeds up the process by a factor of 200 (!):

unique.allele.seq <- 1:10000

make.effects <- function(allele.seq, seed, mn = 0, var = 1, cov = .5) {
    set.seed(seed)
    return(mvrnorm(length(allele.seq), mu = c(mn, mn),
                   Sigma = matrix(c(var, cov, cov, var), nrow = 2),
                   empirical = FALSE))
}

effects.key <- make.effects(unique.allele.seq, 123)
(alleles <- matrix(c(5400,3309,2080,1080,2080,,389,9362,6372,3787,2798,1009), ncol = 4))
(indx <- match(alleles, unique.allele.seq))
(a1 <- matrix(effects.key[indx, 1], ncol = ncol(alleles)))
(a2 <- matrix(effects.key[indx, 2], ncol = ncol(alleles)))

# to check timing
system.time({
    alleles <- matrix(rep(c(5400,3309,2080,1080,2080,,389,9362,6372,3787,2798,1009), 1), ncol = 4)
    indx <- match(alleles, unique.allele.seq)
    a1 <- matrix(effects.key[indx, 1], ncol = ncol(alleles))
    a2 <- matrix(effects.key[indx, 2], ncol = ncol(alleles))
})

On 3/16/07, jim holtman [EMAIL PROTECTED] wrote: The vast majority of your time is spent in the function mvrnorm (on my system, 5.7 out of 6.1 seconds); in your example that is 12000 calls to the function. To improve your speed you have to cut down the number of calls to the function. For example, how many unique integers do you have, and can you do the calls for those and then substitute matching values?
Here is what Rprof showed:

            total.time total.pct self.time self.pct
system.time       6.12      99.7      0.00      0.0
as.vector         6.06      98.7      0.18      2.9
FUN               6.06      98.7      0.12      2.0
array             6.06      98.7      0.10      1.6
lapply            6.06      98.7      0.00      0.0
sapply            6.06      98.7      0.00      0.0
eval              6.04      98.4      0.06      1.0
mvrnorm           5.72      93.2      0.34      5.5
eigen             2.58      42.0      0.52      8.5

or another way of looking at it:

 0                            6.1 root
  1.                          6.1 system.time
   2. .                       6.0 eval
    3. . .                    6.0 eval
     4. . . .                 6.0 array
      5. . . . .              6.0 as.vector
       6. . . . . .           6.0 sapply
        7. . . . . . .        6.0 lapply
         8. . . . . . . .     6.0 FUN
          9. . . . . . . . .  5.7 mvrnorm
          10. . . . . . . . . . 2.6 eigen
          11. . . . . . . . . . . 1.2 sort.list
          12. . . . . . . . . . . . 1.0 match.arg
          13. . . . . . . . . . . . . 0.7 eval

On 3/16/07, Matthew Keller [EMAIL PROTECTED] wrote: Hi all, [this is a bit hard to describe, so if my initial description is confusing, please try running my code below] #WHAT I'M TRYING TO DO I'd appreciate any help in trying to speed up some code. I've written a script that converts a matrix of integers (usually between 1-10,000 - these represent allele names) into two new matrices of normally distributed values (representing genetic effects), such that a given number in the integer (allele name) matrix always corresponds to the *same* randomly distributed (genetic) effects. For example, every time my function sees the allele name 3782, it converts that integer into the same two effects (e.g., -.372 1.727), which follow normal distributions (these effects can be correlated; below I've set their correlation to .5). I have an entire matrix of integers, and am converting those into two entire matrices of effects. #HOW I'VE DONE IT SO FAR To get the correlations between the effects, I've used the mvrnorm function from MASS. To convert the allele names to genetic effects, I've created a function (make.effects) that resets set.seed() to the allele name each time it's called. To get the matrix of genetic effects, I use sapply.
#THE PROBLEM The problem is that I often need to convert matrices that have 500K cells, and do this over several hundred iterations, so it is quite slow. If anyone has ideas to speed this up (e.g., some clever way to convert a matrix of integers to a matrix of effects without using sapply), I would be forever indebted.

##MY CODE
library(MASS)

## run this example to see what I'm talking about above
make.effects <- function(x, mn = 0, var = 1, cov = .5) {
    set.seed(x)
    return(mvrnorm(1, mu = c(mn, mn),
                   Sigma = matrix(c(var, cov, cov, var), nrow = 2),
                   empirical = FALSE))
}

(alleles <- matrix(c(5400,3309,2080,1080,2080,,389,9362,6372,3787,2798,1009), ncol = 4))
a12 <- array(sapply(alleles, make.effects), dim = c(2, nrow(alleles), ncol(alleles)))
(a1 <- a12[1,,])
(a2 <- a12[2,,])

# I've set the population correlation = .5; empirical corr = .4635
cor(as.vector(a1), as.vector(a2))
[R] online course - Using R for Basic Statistics
Dr. John Verzani will present his online course, Using R for Introductory Statistics, April 6 - May 4 at statistics.com. Participants can ask questions and exchange comments with Dr. Verzani via a private discussion board throughout the period. This course covers the use of R to summarize and graph data, calculate confidence intervals, test hypotheses, assess goodness-of-fit, and perform linear regression. John Verzani is a member of the faculty at the College of Staten Island of the City University of New York, and the author of Using R for Introductory Statistics (CRC Press), on which this course is based. His research interests and publications are in the area of superprocesses. There are no set hours when you must be online, and we estimate you will need about 10-15 hours per week. Register: http://www.statistics.com/content/courses/advanceddoe/ Peter Bruce [EMAIL PROTECTED]
Re: [R] Fast lookup in ragged array
Peter McMahan [EMAIL PROTECTED] writes: Well, I hadn't ever seen RBGL before, so that's great. I've been using igraph and sna mainly, but there are a few points lacking between these two. RBGL solves a lot of problems for me! But I'm not sure it will solve this specific problem. Are you suggesting I use RBGL to do a depth-first search of all the subgraphs? For this particular depth-first search I'm not searching every subgraph, but just those that are constructed from a minimal cutset of the parent subgraph. At each level of the search, I have to compute graph cohesion (vertex connectivity), which can take considerable time. A lot of computation time is saved by only searching subgraphs obtained through cutsets. So a complete search of all the subgraphs won't work, but the redundancy I come across is I think unavoidable. Perhaps you will need a combination of graph/RBGL and some custom memoization code to keep track of which subgraphs have already been searched. Some suggestions on that front: Don't use a list, use an environment. searchedBranched = new.env(hash=TRUE, parent=emptyenv(), size=X) where X is an estimate of the number of branches you will search. Using an environment implies you will need unique character names for each subgraph. Do you have that? If not, you could concatenate node names. For a 200 node graph, that should be ok. Hope that helps some. + seth -- Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center http://bioconductor.org
Re: [R] Fast lookup in ragged array
Thanks, I'll give it a try. Does R have a limit on variable name length? Also, is it better to over-estimate or under-estimate the size parameter? This won't be too hard to implement, either, as I'm already keeping the list in a specific environment so all the subprocesses can find the same one. On Mar 16, 2007, at 1:37 PM, Seth Falcon wrote: Peter McMahan [EMAIL PROTECTED] writes: Well, I hadn't ever seen RBGL before, so that's great. I've been using igraph and sna mainly, but there are a few points lacking between these two. RBGL solves a lot of problems for me! But I'm not sure it will solve this specific problem. Are you suggesting I use RBGL to do a depth-first search of all the subgraphs? For this particular depth-first search I'm not searching every subgraph, but just those that are constructed from a minimal cutset of the parent subgraph. At each level of the search, I have to compute graph cohesion (vertex connectivity), which can take considerable time. A lot of computation time is saved by only searching subgraphs obtained through cutsets. So a complete search of all the subgraphs won't work, but the redundancy I come across is I think unavoidable. Perhaps you will need a combination of graph/RBGL and some custom memoization code to keep track of which subgraphs have already been searched. Some suggestions on that front: Don't use a list, use an environment. searchedBranched = new.env(hash=TRUE, parent=emptyenv(), size=X) where X is an estimate of the number of branches you will search. Using an environment implies you will need unique character names for each subgraph. Do you have that? If not, you could concatenate node names. For a 200 node graph, that should be ok. Hope that helps some.
+ seth -- Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center http://bioconductor.org
Re: [R] Fast lookup in ragged array
I just tried it and there seems to be a limit of about 256 characters in a variable name:

x <- new.env()
# worked
assign(paste(sample(letters, 256, T), collapse = ''), 123, env = x)
# failed at 257 characters
assign(paste(sample(letters, 257, T), collapse = ''), 123, env = x)
Error in assign(paste(sample(letters, 257, T), collapse = ''), 123, env = x) :
        symbol print-name too long

On 3/16/07, Peter McMahan [EMAIL PROTECTED] wrote: Thanks, I'll give it a try. Does R have a limit on variable name length? Also, is it better to over-estimate or under-estimate the size parameter? This won't be too hard to implement, either, as I'm already keeping the list in a specific environment so all the subprocesses can find the same one. On Mar 16, 2007, at 1:37 PM, Seth Falcon wrote: Peter McMahan [EMAIL PROTECTED] writes: Well, I hadn't ever seen RBGL before, so that's great. I've been using igraph and sna mainly, but there are a few points lacking between these two. RBGL solves a lot of problems for me! But I'm not sure it will solve this specific problem. Are you suggesting I use RBGL to do a depth-first search of all the subgraphs? For this particular depth-first search I'm not searching every subgraph, but just those that are constructed from a minimal cutset of the parent subgraph. At each level of the search, I have to compute graph cohesion (vertex connectivity), which can take considerable time. A lot of computation time is saved by only searching subgraphs obtained through cutsets. So a complete search of all the subgraphs won't work, but the redundancy I come across is I think unavoidable. Perhaps you will need a combination of graph/RBGL and some custom memoization code to keep track of which subgraphs have already been searched. Some suggestions on that front: Don't use a list, use an environment. searchedBranched = new.env(hash=TRUE, parent=emptyenv(), size=X) where X is an estimate of the number of branches you will search.
Using an environment implies you will need unique character names for each subgraph. Do you have that? If not, you could concatenate node names. For a 200 node graph, that should be ok. Hope that helps some. + seth -- Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center http://bioconductor.org -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
Re: [R] Fast lookup in ragged array
Peter McMahan [EMAIL PROTECTED] writes: Thanks, I'll give it a try. Does R have a limit on variable name length? If you are going to have very long names, you might be better off computing a digest of some kind. You could use the digest package to compute an md5sum, or the Ruuid package to generate a GUID. Also, is it better to over-estimate or under-estimate the size parameter? The environment will grow as needed. If you overestimate, you will use more memory than you need to; whether this is a problem depends on whether you have extra memory available. Underestimating means that the underlying hashtable will need to be resized, and this has a performance impact. + seth -- Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center http://bioconductor.org
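For the archive, a sketch of the digest idea above: key the cache environment by a hash of the (possibly very long) subgraph name. The helper names here are made up for illustration, and the digest package is assumed to be installed:

```r
library(digest)  # digest() returns a short md5 hash as a character string

searched <- new.env(hash = TRUE, parent = emptyenv())

# sort the node names first so the key does not depend on vertex order
subgraph.key <- function(nodes) digest(sort(nodes))

mark.searched <- function(nodes) assign(subgraph.key(nodes), TRUE, envir = searched)
already.searched <- function(nodes) exists(subgraph.key(nodes), envir = searched)

mark.searched(c("a", "b", "c"))
already.searched(c("c", "a", "b"))  # TRUE: same subgraph, different order
already.searched(c("a", "b"))       # FALSE: not yet visited
```

The sort() makes the key canonical, so the same vertex set always hits the same cache slot regardless of the order in which the search discovered it.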
Re: [R] cumsum over varying column lengths
This might work for you:

returnsFromInception <- sapply(seq_len(NCOL(historicalReturns)), function(x) {
    cumReturn <- cumsum(historicalReturns[startOffset[1, x] : n, x])
    cumReturn[length(cumReturn)]
})

On 3/16/07, Murali Menon [EMAIL PROTECTED] wrote: Folks, I have a matrix historicalReturns, where entry (i, j) is the daily return corresponding to date i and equity j. I also have a matrix startOffset, where entry (1, k) is the row offset in historicalReturns where I entered into equity k. So we have that NCOL(startOffset) = NCOL(historicalReturns). Now I would like to compute, for each column in historicalReturns, the cumulative return 'returnsFromInception' for the equity starting from the startOffset date. Is there a better way than as follows: n <- NROW(historicalReturns); returnsFromInception <- matrix(nrow = 1, ncol = NCOL(historicalReturns)); for (i in 1 : NCOL(historicalReturns)) { cumReturn <- cumsum(historicalReturns[startOffset[1, i] : n, i]); returnsFromInception[1, i] <- cumReturn[length(cumReturn)] } This works for me, but seems rather inelegant, and I don't like having to use a matrix for returnsFromInception and startOffset, where vectors would work. Thanks for your help. Murali -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
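A footnote for the archive: the last element of cumsum(z) is just sum(z), so the whole loop reduces to a vapply that yields the plain vector Murali asked for. The toy data below stands in for the real objects (the names follow the thread; the values are invented):

```r
# toy stand-ins for the objects described in the question
set.seed(42)
historicalReturns <- matrix(rnorm(20), nrow = 5, ncol = 4)
startOffset <- matrix(c(1, 2, 3, 1), nrow = 1)  # 1-row matrix, as described
n <- NROW(historicalReturns)

# cumulative return since inception, one number per equity column
returnsFromInception <- vapply(
    seq_len(NCOL(historicalReturns)),
    function(j) sum(historicalReturns[startOffset[1, j]:n, j]),
    numeric(1))
```

vapply (rather than sapply) guarantees a numeric vector result, which also addresses the "vectors would work" complaint.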
[R] online course - Using R for Basic Statistics
oops! I gave the wrong link - the correct one is http://www.statistics.com/courses/Rstatistics/ pb Dr. John Verzani will present his online course, Using R for Introductory Statistics April 6 - May 4 at statistics.com. Participants can ask questions and exchange comments with Dr. Verzani via a private discussion board throughout the period. This course covers the use of R to summarize and graph data, calculate confidence intervals, test hypotheses, assess goodness-of-fit, and perform linear regression. John Verzani is a member of the faculty at the College of Staten Island of the City University of New York, and the author of Using R for Introductory Statistics (CRC Press), on which this course is based. His research interests and publications are in the area of superprocesses. There are no set hours when you must be online, and we estimate you will need about 10-15 hours per week. Peter Bruce [EMAIL PROTECTED]
Re: [R] Duplicated non contiguous element in a list
Many thanks for the fast replies. Here are all the answers I got. The winner is Patrick Burns, with:

any(duplicated(rle(x)$values))

followed by Peter McMahan:

non.contig.dupes <- function(x) {
    is.dup <- x %in% x[duplicated(x)]
    is.con <- c(x[-length(x)] == x[-1], F) | c(F, x[-length(x)] == x[-1])
    is.dup & !is.con
}

then mine (size was 6 just for test):

function(z) {
    zz <- t(z)
    risp <- matrix(FALSE, 1, 6)
    for (i in c(1, 2, 3, 4, 5, 6)) {
        if ((zz[i] %in% zz[-c(max(0, i - 1), i, min(i + 1, 6))]) &&
            (((i > 1) && (i < 6) && ((zz[i] != zz[i - 1]) || (zz[i] != zz[i + 1]))) ||
             ((i == 1) && (zz[i] != zz[i + 1])) ||
             ((i == 6) && (zz[i] != zz[i - 1])))) {
            risp[1, i] <- TRUE
        }
    }
    risp
}

ex aequo with Mark Leeds, whose script does not work for all situations:

temp <- diff(inmat)
temp <- temp[temp != 0]
duplicated(temp)

Many thanks to All of You
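For the archive, the winning one-liner works because rle() collapses each contiguous run to a single value first, so any value that duplicated() still finds must recur non-contiguously. A quick check:

```r
# TRUE when some value repeats non-contiguously
non.contig <- function(x) any(duplicated(rle(x)$values))

non.contig(c(1, 1, 2, 3))  # FALSE: the repeat is contiguous
non.contig(c(1, 2, 1, 3))  # TRUE: 1 recurs after a gap
```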
[R] Probably simple function problem
# I have a simple function problem. I thought that I could write a function to modify a couple of vectors, but I am doing something wrong.
# I have a standard cost vector called fuel and some adjustments to the costs called adjusts. The changes are completely dependent on the length of the dataframe newdata. I then need to take the modified vectors and use them later. I need to do this several times, and the only change in the variables is the length of the data.frame.
# Can anyone suggest what I am doing wrong, or am I just misunderstanding what a function is supposed to do?

# Example:
adjusts <- c(.50, .70, .29, .27, .40, .26, 125)
coal <- 1:6
newdata <- 1:10

fuel.costing <- function(fuel, utr, mydata) {
    cppf <- cppm <- fuel
    cppf[2] <- fuel[2] * (1 - utr[2]) * length(mydata) + utr[7] * utr[2] * utr[5]
    cppf[4] <- fuel[2] * (1 - utr[4]) * length(mydata) + utr[7] * utr[4] * utr[6]
    cppm[2] <- fuel[2] * (1 - utr[1]) * length(mydata)
    cppm[4] <- fuel[2] * (1 - utr[3]) * length(mydata)
}

fuel.costing(coal, adjusts, newdata)

## original code for one place
cppf <- cppm <- coal
cppf[2] <- coal[2] * (1 - adjusts[2]) * length(newdata) + adjusts[7] * adjusts[2] * adjusts[5]
cppf[4] <- coal[2] * (1 - adjusts[4]) * length(newdata) + adjusts[7] * adjusts[4] * adjusts[6]
cppm[2] <- coal[2] * (1 - adjusts[1]) * length(newdata)
cppm[4] <- coal[2] * (1 - adjusts[3]) * length(newdata)
label(cppm) <- "cppm - SW coal costs adjusted"
label(cppf) <- "cppf - WW coal costs adjusted"

# Any help or suggestions would be greatly appreciated.
[R] Discriminating between experiments with xyplot (lattice)
Hi, suppose I have data from 3 experiments which show conversion as a function of time for different boundary conditions, e.g. pressure, temperature. I want to plot all experiments as conversion over time, grouped according to the temperature. However, since I have more than one experiment performed at the same temperature (but different pressures), I cannot tell which curve belongs to which experiment. (An example with artificial data of the structure I use is given below. It shows three experiments, where two experiments at temp = 250 but press = 1 and press = 0.5 are plotted within one group.) My question is: Is there a way to identify which curve within the same group belongs to which experiment, e.g. by plotting a label like the experiment number at the end point, or by choosing different symbols for the different experiments - while keeping the color encoding of the groups? Your help is greatly appreciated! Best regards Frank

require(lattice)
## generating the data I need
time <- seq(0, 50, 1)
conversion <- 2 * log(time + 1)
press <- rep(1, 51)
temp <- rep(250, 51)
experiment <- rep(1, 51)
v1 <- as.data.frame(cbind(experiment, time, conversion, press, temp))
conversion <- 2.5 * log(time + 1)
press <- rep(1, 51)
temp <- rep(270, 51)
experiment <- rep(2, 51)
v2 <- as.data.frame(cbind(experiment, time, conversion, press, temp))
conversion <- 1.25 * log(time + 1)
press <- rep(0.5, 51)
temp <- rep(250, 51)
experiment <- rep(3, 51)
v3 <- as.data.frame(cbind(experiment, time, conversion, press, temp))
d <- rbind(v1, v2, v3)

## plotting
xyplot(conversion ~ time, data = d, groups = factor(temp), auto.key = TRUE)
## result: the group for temp = 250 includes two experiments, which cannot be associated with the individual experiments

-- C. u. F. Sarfert Hauffstr. 8 70825 Korntal
[R] scatterplot brushing
Is there a package (other than xgobi which requires an X server) that will do scatterplot brushing? I see a mention in the mail archive of R-orca by Anthony Rossini but it is not in the current list of packages. My OS is Windows XP version 5.1, service pack 2 R version 2.4.1 (2006-12-18) Thanks
Re: [R] Discriminating between experiments with xyplot (lattice)
On 3/16/07, Christel u. Frank Sarfert [EMAIL PROTECTED] wrote: Hi, suppose I have data from 3 experiments which show conversion as a function of time for different boundary conditions, e.g. pressure, temperature. I want to plot all experiments as conversion over time, grouped according to the temperature. However, since I have more than one experiment performed at the same temperature (but different pressures), I cannot tell which curve belongs to which experiment. (An example with artificial data of the structure I use is given below. It shows three experiments, where two experiments at temp = 250 but press = 1 and press = 0.5 are plotted within one group.) My question is: Is there a way to identify which curve within the same group belongs to which experiment, e.g. by plotting a label like the experiment number at the end point, or by choosing different symbols for the different experiments - while keeping the color encoding of the groups? Your help is greatly appreciated! Best regards Frank

require(lattice)
## generating the data I need
time <- seq(0, 50, 1)
conversion <- 2 * log(time + 1)
press <- rep(1, 51)
temp <- rep(250, 51)
experiment <- rep(1, 51)
v1 <- as.data.frame(cbind(experiment, time, conversion, press, temp))
conversion <- 2.5 * log(time + 1)
press <- rep(1, 51)
temp <- rep(270, 51)
experiment <- rep(2, 51)
v2 <- as.data.frame(cbind(experiment, time, conversion, press, temp))
conversion <- 1.25 * log(time + 1)
press <- rep(0.5, 51)
temp <- rep(250, 51)
experiment <- rep(3, 51)
v3 <- as.data.frame(cbind(experiment, time, conversion, press, temp))
d <- rbind(v1, v2, v3)

You want to use make.groups rather than rbind here, so that you retain information on which experiment each row is coming from:

dd <- make.groups(v1, v2, v3)
str(dd)
'data.frame':   153 obs. of  6 variables:
 $ experiment: num  1 1 1 1 1 1 1 1 1 1 ...
 $ time      : num  0 1 2 3 4 5 6 7 8 9 ...
 $ conversion: num  0.00 1.39 2.20 2.77 3.22 ...
 $ press     : num  1 1 1 1 1 1 1 1 1 1 ...
 $ temp      : num  250 250 250 250 250 250 250 250 250 250 ...
 $ which     : Factor w/ 3 levels "v1","v2","v3": 1 1 1 1 1 1 1 1 1 1 ...

Now how you choose to plot this is up to you. One simple possibility is to create a new factor encoding both experiment and temperature, e.g.,

xyplot(conversion ~ time, data = dd, groups = (which:factor(temp))[, drop = TRUE], auto.key = TRUE)

Another is to condition on one, e.g.,

xyplot(conversion ~ time | factor(temp), data = dd, groups = which, auto.key = TRUE)

It is possible to use one variable for color and another for the plotting character, but the code involved would be somewhat less elegant (mostly because the support functions are not already built in). Let me know if you really want that; I can come up with some sample code. -Deepayan
Re: [R] Probably simple function problem
You didn't specify the exact nature of the problem, so I guess that you want it to return the vector cppm. Add return(cppm) as the final line of the function. Here's what I think you want; it replicates your original code.

adjusts <- c(.50, .70, .29, .27, .40, .26, 125)
coal <- 1:6
newdata <- 1:10
fuel.costing <- function(fuel, utr, mydata) {
+     cppf <- cppm <- fuel
+     cppf[2] <- fuel[2] * (1 - utr[2]) * length(mydata) + utr[7] * utr[2] * utr[5]
+     cppf[4] <- fuel[2] * (1 - utr[4]) * length(mydata) + utr[7] * utr[4] * utr[6]
+     cppm[2] <- fuel[2] * (1 - utr[1]) * length(mydata)
+     cppm[4] <- fuel[2] * (1 - utr[3]) * length(mydata)
+     return(cppm)
+ }
my.cppm <- fuel.costing(coal, adjusts, newdata)
my.cppm
[1]  1.0 10.0  3.0 14.2  5.0  6.0

HTH -jason

----- Original Message ----- From: John Kane [EMAIL PROTECTED] To: R R-help r-help@stat.math.ethz.ch Sent: Friday, March 16, 2007 2:00 PM Subject: [R] Probably simple function problem

# I have a simple function problem. I thought that I could write a function to modify a couple of vectors, but I am doing something wrong. # I have a standard cost vector called fuel and some adjustments to the costs called adjusts. The changes are completely dependent on the length of the dataframe newdata. I then need to take the modified vectors and use them later. I need to do this several times, and the only change in the variables is the length of the data.frame. # Can anyone suggest what I am doing wrong, or am I just misunderstanding what a function is supposed to do?
#Example:
adjusts <- c(.50, .70, .29, .27, .40, .26, 125)
coal <- 1:6
newdata <- 1:10
fuel.costing <- function(fuel, utr, mydata) {
    cppf <- cppm <- fuel
    cppf[2] <- fuel[2]*(1-utr[2])*length(mydata) + utr[7]*utr[2]*utr[5]
    cppf[4] <- fuel[2]*(1-utr[4])*length(mydata) + utr[7]*utr[4]*utr[6]
    cppm[2] <- fuel[2]*(1-utr[1])*length(mydata)
    cppm[4] <- fuel[2]*(1-utr[3])*length(mydata)
}
fuel.costing(coal, adjusts, newdata)

## original code for one place
cppf <- cppm <- coal
cppf[2] <- coal[2]*(1-adjusts[2])*length(newdata) + adjusts[7]*adjusts[2]*adjusts[5]
cppf[4] <- coal[2]*(1-adjusts[4])*length(newdata) + adjusts[7]*adjusts[4]*adjusts[6]
cppm[2] <- coal[2]*(1-adjusts[1])*length(newdata)
cppm[4] <- coal[2]*(1-adjusts[3])*length(newdata)
label(cppm) <- "cppm - SW coal costs adjusted"
label(cppf) <- "cppf - WW coal costs adjusted"

# Any help or suggestions would be greatly appreciated.
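[Editor's note: the underlying issue is that R functions cannot modify their arguments in place, so a function that adjusts two vectors must return both. A common pattern — my own sketch, not from the thread — is to return them in a named list:]

```r
## Adjusted version (fuel.costing2 is a hypothetical name) that returns
## BOTH modified vectors instead of relying on side effects:
fuel.costing2 <- function(fuel, utr, mydata) {
    n <- length(mydata)
    cppf <- cppm <- fuel
    cppf[2] <- fuel[2] * (1 - utr[2]) * n + utr[7] * utr[2] * utr[5]
    cppf[4] <- fuel[2] * (1 - utr[4]) * n + utr[7] * utr[4] * utr[6]
    cppm[2] <- fuel[2] * (1 - utr[1]) * n
    cppm[4] <- fuel[2] * (1 - utr[3]) * n
    list(cppf = cppf, cppm = cppm)          # return both vectors
}

res <- fuel.costing2(1:6, c(.50, .70, .29, .27, .40, .26, 125), 1:10)
res$cppm   # [1]  1.0 10.0  3.0 14.2  5.0  6.0  (same as Jason's result)
res$cppf
```

The caller then picks the pieces it needs (res$cppm, res$cppf), so the same function can be reused for each data.frame length without any global assignment.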
[R] Rownames are always character?
Gurus,

Can I rely on the rownames() function, when applied to a matrix, always returning either NULL or an object of type character? It seems that row names can be entered as integers, but as of now (R 2.4.1 on Windows) the function returns character vectors, not numbers (which is my desired result). I am using elements of the returned vector to index the matrix, e.g.,

nams <- rownames(mymat)
for (thisnam in nams) {
    myvec <- mymat[thisnam, ]
    # ... more code ...
}

--
Mike Prager, NOAA, Beaufort, NC
* Opinions expressed are personal and not represented otherwise.
* Any use of tradenames does not constitute a NOAA endorsement.
Re: [R] Implementing trees in R
Gabor, Thanks. That helps a lot.

Gabor Grothendieck wrote:

On 3/16/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:

1. Here is your example redone using proto:

library(proto)
parent <- proto()
child <- proto(a = 1)
parent$child1 <- child
child$parent.env <- parent

This last line should have been:

parent.env(child) <- parent

# also just for illustration let's change a
parent$child1$a # 1
child$a <- 2
parent$child1$a # 2

2. To redefine $<- use S3 or S4, but it can be done in conjunction with proto like this:

# constructor
node <- function(. = parent.frame(), ...)
    structure(proto(...), class = c("node", "proto"))

"$<-.node" <- function(this, s, value) {
    if (s == ".super") parent.env(this) <- value
    if (is.function(value)) environment(value) <- this
    if (inherits(value, "node")) parent.env(value) <- this
    this[[as.character(substitute(s))]] <- value
    this
}

p <- node(a = 1)
p$child <- node(b = 2)
p$child$parent.env()
p # same

On 3/16/07, Yuk Lap Yip (Kevin) [EMAIL PROTECTED] wrote:

Hi Gabor, Thanks for the suggestions. I tried to look for the proto vignette document but could not find it; could you tell me how to reach it? Besides, is it possible to define my own node object type with a default behavior for the <- operator of its member variables being referencing rather than copying? Any good reference material / similar code examples? Thanks.

Gabor Grothendieck wrote:

Lists are not good for this. There is an example in section 3.3 of the proto vignette of using proto objects for this. That section also references an S4 example, although it's pretty messy with S4. You might want to look at the graph, RBGL and graphviz packages in Bioconductor and the dynamicgraph, mathgraph and sna packages on CRAN.

On 3/16/07, Yuk Lap Yip (Kevin) [EMAIL PROTECTED] wrote:

Hi all, I am rather new to R. Recently I have been trying to implement some tree algorithms in R. I used lists to model tree nodes.
I thought something like this would work:

parent <- list()
child <- list()
parent$child1 <- child
child$parent <- parent

When I tried to check whether a node is its parent's first child using if (node$parent$child1 == node), it always returned false. Then I realized that it does not really work, because parent$child1 <- child actually makes a copy of child instead of referencing it. I think one possible fix is to keep a list of node objects, and make references using the positions in the list. For example, I think the following would work:

parent <- list()
child <- list()
nodes <- list(parent, child)
parent$child1 <- 2
child$parent <- 1

Then the first-child test can be rewritten as if (nodes[[nodes[[nodeId]]$parent]]$child1 == nodeId). However, I would prefer not to implement trees in this way, as it requires the inconvenient and error-prone manipulation of node IDs. May I know if there is a way to make object references to lists? Or are there other ways to implement tree data structures in R? BTW, I checked how hclust was implemented, and noticed that it calls an external Fortran program. I would want a solution not involving any external programs. Thanks.

--
God bless.
Kevin
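[Editor's note: one option the thread does not spell out is that environments have reference semantics in R, so they can serve as mutable tree nodes directly, with no proto dependency and no node-ID bookkeeping. A minimal sketch — the names new.node and add.child are my own, purely illustrative:]

```r
## Each node is an environment; environments are passed by reference,
## so linking a child into a parent does NOT copy it.
new.node <- function(value = NULL) {
    node <- new.env()
    node$value    <- value
    node$parent   <- NULL
    node$children <- list()
    node
}

add.child <- function(parent, child) {
    child$parent <- parent
    parent$children[[length(parent$children) + 1]] <- child
    invisible(parent)
}

root <- new.node("root")
kid  <- new.node("kid")
add.child(root, kid)

## identical() on environments compares references, so the first-child
## test from the original post now works as intended:
identical(root$children[[1]], kid)   # TRUE
identical(kid$parent, root)          # TRUE
```

Because the nodes are environments, add.child() mutates root in place even though R has copy-on-modify semantics for lists; this is exactly the referencing behavior the original poster was looking for.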
Re: [R] Rownames are always character?
Mike Prager [EMAIL PROTECTED] wrote:

Gurus,

Can I rely on the rownames() function, when applied to a matrix, always returning either NULL or an object of type character? It seems that row names can be entered as integers, but as of now (R 2.4.1 on Windows) the function returns character vectors, not numbers (which is my desired result).

(To clarify my point on this Friday afternoon: the observed behavior is my desired result. I'm just asking, can I count on it?)

I am using elements of the returned vector to index the matrix, e.g.,

nams <- rownames(mymat)
for (thisnam in nams) {
    myvec <- mymat[thisnam, ]
    # ... more code ...
}

--
Mike Prager, NOAA, Beaufort, NC
* Opinions expressed are personal and not represented otherwise.
* Any use of tradenames does not constitute a NOAA endorsement.
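[Editor's note: a quick toy check of the behavior in question — dimnames, and hence rownames, are stored as character even when assigned from a numeric vector:]

```r
m <- matrix(1:4, nrow = 2)
rownames(m) <- c(10, 20)   # assigned as numbers ...
class(rownames(m))         # ... but stored as "character"
m["10", ]                  # so character indexing retrieves the first row
```

This is because dimnames components are coerced to character on assignment, which is what makes the thisnam-style indexing loop above safe.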
[R] Revisiting multiple plots
Suppose I create a multiple plot with zoo, using:

index <- ISOdatetime(2004, rep(1:2, 5), sample(28, 10), 0, 0, 0)
foo <- zoo(rnorm(10), index)
for (i in 1:9) {
    data <- rnorm(10)
    z1 <- zoo(data, index)
    foo <- cbind(foo, z1)
}
plot(foo)

This creates 10 plots on one device, one for each column in foo. Now I want to go back and use abline to draw a line at the mean on each of my 10 plots. How do I select the appropriate set of axes to draw my line on?

Thanks,
Jonathan
Re: [R] R and clinical studies
I agree that most problems arise in the data management / file derivation phase. From my reading of 21 CFR 11, it appears that this document focuses primarily on data management (as well as on software directly involved in a medical device) rather than on validation of statistical functions. I believe this point has been made previously on the R-help list. With regard to validating functions, I have often wondered how one can validate a function when one cannot see what it is doing. You could certainly compare calculations from one package to the same calculations from another package, but then you must purchase (ouch!) and know how to properly use two software packages instead of one. And I suppose they could both be wrong! Is not peer review the best form of validation? . . . I suspect I may be preaching to the choir here.

I would love nothing more than to migrate our stat group over to R from SAS. Based on my experience with R/Splus, the language seems more extensible and flexible, and it has much better graphics (as has been pointed out many times on this list). It also has available the many contributions of generous R users. However, it has been hard to win pure SAS users over to R (even if it saves the company money!). One can't send the biostat group off to R training like one would to SAS classes. Learning R requires initiative from the user (which is not necessarily a bad thing). I considered encouraging the purchase of Splus as an intermediate step (hoping that its proprietary nature would soothe fears regarding open source software), but that option was not as cheap as I thought.

Regards,
-Cody

Delphine Fontaine wrote:

Thanks for your answer, which was very helpful. I have another question: I have read in this document (http://cran.r-project.org/doc/manuals/R-intro.pdf) that most of the programs written in R are ephemeral and that new releases are not always compatible with previous releases.
What I would like to know is if R functions are already validated and, if not, what should we do to validate an R function?

In the sense in which most persons use the term 'validate', it means to show with one or more datasets that the function is capable of producing the right answer. It doesn't mean that it produces the right answer for every dataset, although we hope it does. [As an aside, most errors are in the data manipulation phase, not in the analysis phase.] So I think that instead of validating functions we should spend more effort on validating analyses [and validating analysis file derivation]. Pivotal analyses can be re-done a variety of ways, in R or in separate programmable packages such as Stata.

--
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
[R] CPU usage on Windows
I'm using R with emacs ESS on Windows. When I create a plot, sometimes R will seem to get stuck in a busy loop, i.e. it will use 100% of my CPU. However the system is still somewhat responsive, and the problem usually goes away if I create a new device with windows(). If I then close this device, making the first device active again, sometimes R will get stuck in the busy loop again. Has anybody heard of this behavior, or, better yet, have a solution?

Thanks,
Jonathan
Re: [R] Revisiting multiple plots
Try this (or use xyplot.zoo and write a panel function for that):

library(zoo)
set.seed(1)
tt <- as.Date(paste(2004, rep(1:2, 5), sample(28, 10), sep = "-"))
foo <- zoo(matrix(rnorm(100), 10), tt)
pnl <- function(x, y, ...) {
    lines(x, y, ...)
    abline(h = mean(y))
}
plot(foo, panel = pnl)

On 3/16/07, Jonathan Wang [EMAIL PROTECTED] wrote:

Suppose I create a multiple plot with zoo, using:

index <- ISOdatetime(2004, rep(1:2, 5), sample(28, 10), 0, 0, 0)
foo <- zoo(rnorm(10), index)
for (i in 1:9) {
    data <- rnorm(10)
    z1 <- zoo(data, index)
    foo <- cbind(foo, z1)
}
plot(foo)

This creates 10 plots on one device, one for each column in foo. Now I want to go back and use abline to draw a line at the mean on each of my 10 plots. How do I select the appropriate set of axes to draw my line on?

Thanks,
Jonathan
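[Editor's note: Gabor's parenthetical lattice alternative might look something like the following — a sketch on my part, assuming a zoo version whose xyplot method forwards a user-supplied panel function, as the current one does:]

```r
library(zoo)
library(lattice)

set.seed(1)
tt  <- as.Date(paste(2004, rep(1:2, 5), sample(28, 10), sep = "-"))
foo <- zoo(matrix(rnorm(100), 10), tt)

## same idea as pnl() above, but expressed with lattice panel primitives;
## the panel function runs once per column/panel, so mean(y) is per series
xyplot(foo, panel = function(x, y, ...) {
    panel.lines(x, y, ...)
    panel.abline(h = mean(y))
})
```

The practical advantage of the lattice route is that each column gets its own panel function call, so per-series annotations (here the mean line) need no manual bookkeeping about which set of axes is active.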
Re: [R] scatterplot brushing
Have a look at iplots. (and rggobi is the updated version of xggobi)

Hadley

On 3/16/07, Richard Valliant [EMAIL PROTECTED] wrote: Is there a package (other than xgobi which requires an X server) that will do scatterplot brushing? I see a mention in the mail archive of R-orca by Anthony Rossini but it is not in the current list of packages. My OS is Windows XP version 5.1, service pack 2; R version 2.4.1 (2006-12-18). Thanks
Re: [R] scatterplot brushing
I should mention that it's very easy to install rggobi and ggobi on windows these days. GGobi has a nice installer, http://www.ggobi.org/downloads, and rggobi is on cran.

Hadley

On 3/16/07, hadley wickham [EMAIL PROTECTED] wrote: Have a look at iplots. (and rggobi is the updated version of xggobi) Hadley On 3/16/07, Richard Valliant [EMAIL PROTECTED] wrote: Is there a package (other than xgobi which requires an X server) that will do scatterplot brushing? I see a mention in the mail archive of R-orca by Anthony Rossini but it is not in the current list of packages. My OS is Windows XP version 5.1, service pack 2; R version 2.4.1 (2006-12-18). Thanks
Re: [R] CPU usage on Windows
On 3/16/2007 6:56 PM, Jonathan Wang wrote: I'm using R with emacs ESS on Windows. When I create a plot, sometimes R will seem to get stuck in a busy loop, i.e. it will use 100% of my CPU. However the system is still somewhat responsive, and the problem usually goes away if I create a new device with windows(). If I then close this device, making the first device active again, sometimes R will get stuck in the busy loop again. Has anybody heard of this behavior, or, better yet, have a solution?

I've heard of a number of problems with Emacs on Windows. I wouldn't recommend using it. As far as I can see, it makes a number of assumptions about the OS that just aren't true about Windows. If you can reproduce the behaviour outside of Emacs, I'll investigate.

Duncan Murdoch
Re: [R] Fast lookup in ragged array
That's a good point. What's the overhead on digests like that? Also, does that open up the possibility, exceedingly small though it may be, of misidentifying a branch as already searched and missing a qualifying subgraph?

On Mar 16, 2007, at 2:02 PM, Seth Falcon wrote:

Peter McMahan [EMAIL PROTECTED] writes: Thanks, I'll give it a try. does R have a limit on variable name length?

If you are going to have very long names, you might be better off computing a digest of some kind. You could use the digest package to compute an md5sum or the Ruuid package to generate a GUID.

Also, is it better to over-estimate or under-estimate the size parameter?

The environment will grow as needed. If you overestimate, you will use more memory than you need to. Whether this is a problem depends if you have extra memory available. Underestimating means that the underlying hashtable will need to be resized and this has a performance impact.

+ seth

--
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org
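[Editor's note: a toy sketch of the digest-keyed environment Seth describes, assuming the CRAN digest package is installed. On the collision question: misidentification with a 128-bit md5 is possible in principle but vanishingly unlikely; a cautious implementation can store the full key alongside the value and compare it on retrieval.]

```r
library(digest)   # CRAN package providing digest(); assumed installed

h <- new.env(hash = TRUE, size = 1000L)

## a stand-in for an arbitrarily long branch-identifier string
key <- paste(rep(letters, 50), collapse = "")

## hash the long key down to a fixed-length, legal variable name
k <- digest(key, algo = "md5")

## store the original key with the value, so a retrieval can verify it
## and rule out a hash collision explicitly
assign(k, list(key = key, value = "searched"), envir = h)

hit <- get(digest(key, algo = "md5"), envir = h)
identical(hit$key, key)   # TRUE: not a collision, genuinely the same branch
```

The per-lookup overhead is one md5 computation over the key string, which is typically negligible next to the graph-search work each lookup is guarding.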
[R] Font in JavaGD() pdf()
Dear r-helpers,

When I do an xYplot and display the result in a JavaGD() window, the font is sans-serif (presumably Helvetica). When I send the figure to a pdf, I get a serif font (presumably Times). How do I ensure that the font in the pdf is indeed the default sans serif?

sessionInfo()
R version 2.4.1 (2006-12-18)
i386-apple-darwin8.8.1

locale:
C

attached base packages:
[1] grid      datasets  stats     graphics  grDevices utils     methods   base

other attached packages:
        coda      gmodels         lme4       Matrix           HH     multcomp
      0.10-7       2.13.1    0.9975-13    0.9975-11       1.18-1      0.991-8
     mvtnorm          vcd   colorspace        Hmisc       xtable latticeExtra
       0.7-5        1.0-2         0.95        3.2-1        1.4-3        0.1-4
     lattice     gridBase         MASS          JGR       iplots       JavaGD
     0.14-16        0.4-3       7.2-32       1.4-15        1.0-5        0.3-6
       rJava
      0.4-14

_____________________________
Professor Michael Kubovy
University of Virginia Department of Psychology
USPS:    P.O. Box 400400, Charlottesville, VA 22904-4400
Parcels: Room 102, Gilmer Hall, McCormick Road, Charlottesville, VA 22903
Office:  B011  +1-434-982-4729
Lab:     B019  +1-434-982-4751
Fax:     +1-434-982-4766
WWW:     http://www.people.virginia.edu/~mk9y/
Re: [R] Discriminating between experiments with xyplot (lattice)
Here's one possibility:

d <- make.groups(v1, v2, v3)

## create interaction dropping unused levels
d$intg <- with(d, which:factor(temp))[, drop = TRUE]

## extract experiment and temp information from levels
intg.expt <- sapply(strsplit(levels(d$intg), ":", fixed = TRUE), "[[", 1)
intg.temp <- sapply(strsplit(levels(d$intg), ":", fixed = TRUE), "[[", 2)

## find a suitable color vector (where colors are repeated when the
## temperature is)
temp.unique <- unique(intg.temp)
col.temp <- rep(trellis.par.get("superpose.symbol")$col,
                length = length(temp.unique))
col.intg <- col.temp[match(intg.temp, temp.unique)]

xyplot(conversion ~ time, data = d, groups = intg,
       intg.expt = intg.expt,
       panel = panel.superpose,
       panel.groups = function(x, y, ..., group.number, intg.expt) {
           panel.xyplot(x, y, ...)
           panel.text(tail(x, 1), tail(y, 1), intg.expt[group.number], pos = 4)
       },
       col = col.intg,
       key = list(text = list(temp.unique),
                  points = list(col = col.temp, pch = 1)))

Hope that helps,
Deepayan

On 3/16/07, Frank Sarfert [EMAIL PROTECTED] wrote:

Hi Deepayan, many thanks for your quick reply. I actually cannot condition on the experiment number, because in my real-life data I would like to condition on pressure, and I have many more experiments (not just 3). So ideally, I would have a plot with a label near the end point of the curve which says which experiment it comes from. So, if you could really write some code - that'd be great! Many thanks in advance!

Best regards
Frank

-----Original Message-----
From: Deepayan Sarkar [EMAIL PROTECTED]
Sent: 16.03.07 22:47:47
To: Christel u. Frank Sarfert [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Discriminating between experiments with xyplot (lattice)

On 3/16/07, Christel u. Frank Sarfert [EMAIL PROTECTED] wrote: Hi, suppose I have data from 3 experiments which show conversion as a function of time for different boundary conditions, e.g. pressure, temperature.
I want to plot all experiments as conversion over time, grouped according to the temperature. However, since I have more than one experiment performed at the same temperature (but different pressures), I end up figuring out which curve belongs to which experiment. (An example with artificial data of the structure I use is given below. It shows three experiments, where two experiments at temp = 250 but press = 1 and press = 0.5 are plotted within one group.) My question is: Is there a way to identify which curve within the same group belongs to which experiment, e.g. by plotting a label like the experiment number at the end point, or by choosing different symbols for the different experiments - while keeping the color encoding of the groups? Your help is greatly appreciated!

Best regards
Frank

require(lattice)

## generating the data I need
time <- seq(0, 50, 1)
conversion <- 2 * log(time + 1)
press <- rep(1, 51)
temp <- rep(250, 51)
experiment <- rep(1, 51)
v1 <- as.data.frame(cbind(experiment, time, conversion, press, temp))

conversion <- 2.5 * log(time + 1)
press <- rep(1, 51)
temp <- rep(270, 51)
experiment <- rep(2, 51)
v2 <- as.data.frame(cbind(experiment, time, conversion, press, temp))

conversion <- 1.25 * log(time + 1)
press <- rep(0.5, 51)
temp <- rep(250, 51)
experiment <- rep(3, 51)
v3 <- as.data.frame(cbind(experiment, time, conversion, press, temp))

d <- rbind(v1, v2, v3)

You want to use make.groups rather than rbind here, so that you retain information on which experiment each row is coming from:

dd <- make.groups(v1, v2, v3)
str(dd)
'data.frame':   153 obs. of  6 variables:
 $ experiment: num  1 1 1 1 1 1 1 1 1 1 ...
 $ time      : num  0 1 2 3 4 5 6 7 8 9 ...
 $ conversion: num  0.00 1.39 2.20 2.77 3.22 ...
 $ press     : num  1 1 1 1 1 1 1 1 1 1 ...
 $ temp      : num  250 250 250 250 250 250 250 250 250 250 ...
 $ which     : Factor w/ 3 levels "v1","v2","v3": 1 1 1 1 1 1 1 1 1 1 ...

Now how you choose to plot this is up to you.
One simple possibility is to create a new factor encoding both experiment and temperature, e.g.,

    xyplot(conversion ~ time, data = dd,
           groups = (which:factor(temp))[, drop = TRUE], auto.key = TRUE)

Another is to condition on one, e.g.,

    xyplot(conversion ~ time | factor(temp), data = dd,
           groups = which, auto.key = TRUE)

It is possible to use one variable for color and another for plotting character, but the code involved would be somewhat less elegant (mostly because the support functions are not already built in). Let me know if you really want that; I can come up with some sample code.

-Deepayan
Re: [R] Rownames are always character?
On Mar 16, 2007, at 6:26 PM, Mike Prager wrote:

Mike Prager [EMAIL PROTECTED] wrote: Gurus, Can I rely on the rownames() function, when applied to a matrix, always returning either NULL or an object of type character? It seems that row names can be entered as integers, but as of now (R 2.4.1 on Windows) the function returns character vectors, not numbers (which is my desired result).

(To clarify my point on this Friday afternoon: the observed behavior is my desired result. I'm just asking, can I count on it?)

I would venture to guess that rownames() would always be returning something that you would then be able to use for indexing, to retrieve particular entries. The help page also implies that the return value will always be a character vector, or NULL:

    If do.NULL is FALSE, a character vector (of length NROW(x) or
    NCOL(x)) is returned in any case, prepending prefix to simple
    numbers, if there are no dimnames or the corresponding component
    of the dimnames is NULL.

I would think you can count on this about as much as you can count on the sum function to always add up its arguments, or something of that sort.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
[R] Fwd: Re: CPU usage on Windows
I can't imagine using Windows without Emacs. In particular, the Windows ports of Emacs are very aware of the operating system and usually make the right assumptions. The type of behavior you are noticing can probably be cured by typing C-g in the *R* buffer in emacs. The most likely cause is that the R process in Emacs is waiting for the plot to finish and is querying the plotting device. Most of that excess CPU usage is from the query loop. The C-g tells Emacs and R to stop waiting. If C-g doesn't stop the 100% CPU utilization, then it is most likely something about the specific plot you are drawing. We will need to see a reproducible example to say more.

Rich

---- Original message ----
Date: Fri, 16 Mar 2007 19:37:14 -0400
From: Duncan Murdoch [EMAIL PROTECTED]
Subject: Re: [R] CPU usage on Windows
To: Jonathan Wang [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch

On 3/16/2007 6:56 PM, Jonathan Wang wrote: I'm using R with emacs ESS on Windows. When I create a plot, sometimes R will seem to get stuck in a busy loop, i.e. it will use 100% of my CPU. Has anybody heard of this behavior, or, better yet, have a solution?

I've heard of a number of problems with Emacs on Windows. I wouldn't recommend using it. As far as I can see, it makes a number of assumptions about the OS that just aren't true about Windows.