Re: [R] Matrix of dummy variables from a factor
Dear Charles,

Try model.matrix(~f)[,-1].

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Charles H. Franklin Sent: Tuesday, December 06, 2005 3:03 PM To: r-help@stat.math.ethz.ch Subject: [R] Matrix of dummy variables from a factor

What is a simple way to convert a factor into a matrix of dummy variables? fm <- lm(y~f), where f is a factor, takes care of this in the estimation. I'd like to save the result of expanding f into a matrix for later use. Thanks.

Charles
--
Charles H. Franklin
Professor, Political Science
University of Wisconsin, Madison
[EMAIL PROTECTED] [EMAIL PROTECTED]
608-263-2022 (voice) 608-265-2663 (fax)

______________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
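For readers of the archive, a minimal sketch of what model.matrix(~f)[,-1] produces (the factor here is invented for illustration):

```r
# A factor with three levels; "a" becomes the reference level
f <- factor(c("a", "b", "c", "a", "b"))

# model.matrix(~ f) gives an intercept column plus one dummy per
# non-reference level; dropping column 1 leaves just the dummies:
D <- model.matrix(~ f)[, -1]
D
#   fb fc
# 1  0  0
# 2  1  0
# 3  0  1
# 4  0  0
# 5  1  0
```

The resulting matrix can be saved and reused, e.g. cbind()-ed into another data set, exactly as Charles asks.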
Re: [R] is it possible to use R to edit an EXCEL spreadsheet so I can create a searchable EXCEL database of R packages?
Dear Bob,

In my copy of the R FAQ (from the Rgui for Windows), there is no space between entries, which, when copied to a text file, look like:

AMORE A MORE flexible neural network package, providing the TAO robust neural network algorithm.
AlgDesign Algorithmic experimental designs. Calculates exact and approximate theory experimental designs for D, A, and I criteria.
etc.

For this, the following works:

packages <- readLines("c:/temp/Packages.txt")
packages <- gsub("^ *", "", gsub(" *$", "", packages))  # get rid of leading and trailing blanks
packages <- matrix(packages, ncol=2, byrow=TRUE)
write.table(packages, "c:/temp/Packages2.txt", sep=",", row.names=FALSE, col.names=FALSE)

producing a comma-separated text file that can be imported into Excel, and that looks like:

AMORE,A MORE flexible neural network package, providing the TAO robust neural network algorithm.
AlgDesign,Algorithmic experimental designs. Calculates exact and approximate theory experimental designs for D, A, and I criteria.
etc.

If you started with blank lines between packages, then (making sure that there is a blank line after the last package in the input file) you could adapt this as:

packages <- readLines("c:/temp/Packages.txt")
packages <- gsub("^ *", "", gsub(" *$", "", packages))
packages <- matrix(packages, ncol=3, byrow=TRUE)
write.table(packages[,1:2], "c:/temp/Packages2.txt", sep=",", row.names=FALSE, col.names=FALSE)

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bob Green Sent: Monday, December 05, 2005 6:58 PM To: r-help@stat.math.ethz.ch Subject: [R] is it possible to use R to edit an EXCEL spreadsheet so I can create a searchable EXCEL database of R packages?

I have copied the R FAQ pertaining to available packages into an EXCEL file.
The entries take the following form:

Design Regression modeling, testing, estimation, validation, graphics, prediction, and typesetting by storing enhanced model design attributes in the fit. Design is a etc.
Devore5 Data sets and sample analyses from Probability and Statistics for Engineering and the Sciences (5th ed) by Jay L. Devore, 2000, Duxbury.

Presently the above format can be represented as:

package name
package description
space

I want to change the format so that the name and description are in adjacent columns and the blank line is removed:

package name | package description
package name | package description

Is this possible in R, so I can create a searchable EXCEL database of R packages? I imagine this would be easily carried out using PERL, though I was interested to know R's capacity to work on textual data.

Any assistance regarding this is appreciated,

Bob Green
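As a self-contained illustration of the reshaping trick John uses above (an in-memory vector stands in for the file, so the paths are not needed):

```r
# Alternating name/description lines, as copied from the FAQ
lines <- c("AMORE ", " A MORE flexible neural network package.",
           "AlgDesign", " Algorithmic experimental designs.")

# Trim leading/trailing blanks, then fold each pair of lines into one row
lines <- gsub("^ *", "", gsub(" *$", "", lines))
tab <- matrix(lines, ncol = 2, byrow = TRUE)
tab
#      [,1]        [,2]
# [1,] "AMORE"     "A MORE flexible neural network package."
# [2,] "AlgDesign" "Algorithmic experimental designs."
```

The key idea is that matrix(..., byrow=TRUE) folds a flat alternating vector into columns, which write.table() can then emit as CSV for Excel.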
[R] FW: Error in structural equation model - The model has negative degrees of freedom
Dear R-help list members,

I forgot to copy my reply to the r-help list. Here's most of it.

John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: John Fox [mailto:[EMAIL PROTECTED] Sent: Sunday, December 04, 2005 5:01 PM To: 'Sunil W' Subject: RE: [R] Error in structural equation model - The model has negative degrees of freedom

Dear Sunil,

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sunil W Sent: Sunday, December 04, 2005 11:03 AM To: R-help@stat.math.ethz.ch Subject: [R] Error in structural equation model - The model has negative degrees of freedom

Hi John, Thanks a lot for the reply. Could you suggest how I can correct this problem? I tried using a correlation matrix instead of raw moments, but still got the same error. I also fixed the parameters v1, v2, v3, and a1 at 1; then it gave me the error that the system is exactly singular.

The model is underidentified regardless of the input matrix. My point was that if you're using raw moments (as opposed to covariances or correlations) you should have a constant term in each equation.

To answer the points that you raised:

1. x1-x6 are not causes; they are just indicators. Does that change my model?

Yes. The arrows should go the other way.

2. Is the size of the dataset going to be a real limitation? How can I take care of that?

. . .

---

Dear Sunil,

There are 7*8/2 = 28 raw moments among the 7 observed variables. Of these, 6*7/2 = 21 are used for the moments among the 6 fixed-exogenous variables, leaving 28 - 21 = 7 df. Your model has 11 free parameters, so the df for the model = 7 - 11 = -4.

Some additional comments: If you're using raw moments, why isn't there a constant variable in the model? Do you really intend x1 -- x6 to be causes, rather than indicators, of m1 and m2? Why are there no normalizing constraints on the latent variables?
Do you really want to fit a model like this to so small a data set?

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sunil W Sent: Wednesday, November 30, 2005 10:08 PM To: r-help@stat.math.ethz.ch Subject: [R] Error in structural equation model - The model has negative degrees of freedom

Hi, I am running a structural equation model with R using the sem command and am getting the following error:

Error in sem.default : The model has negative degrees of freedom = -4

My model is as follows:

s_model = specify.model()
x1 -> m1, b1, NA
x2 -> m1, b2, NA
x3 -> m2, b3, NA
x4 -> m2, b4, NA
x5 -> m2, b5, NA
x6 -> m2, b6, NA
m1 -> y, a1, NA
m2 -> y, a2, NA
m1 <-> m1, v1, NA
m2 <-> m2, v2, NA
y <-> y, v3, NA

x1-x6 are observed independent variables, m1 and m2 are the latent variables, and y is the observed dependent variable. I use the raw.moments command for calculating the covariance matrix, based on data with 147 observations. The command that I use is as follows:

s = sem(s_model, S=R, obs.variables=colnames(R), fixed.x=c('x1','x2','x3','x4','x5','x6'), raw=TRUE)

I would appreciate any help on this; I am new to structural equation models and realize that I may be making a silly error.
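John's degrees-of-freedom accounting can be checked with a few lines of arithmetic (purely illustrative; the counts come from his reply above):

```r
p <- 7                      # observed variables: x1-x6 plus y
moments <- p * (p + 1) / 2  # 28 distinct raw moments
fixed   <- 6 * 7 / 2        # 21 moments among the 6 fixed-exogenous x's
free    <- 11               # free parameters: b1-b6, a1-a2, v1-v3
(moments - fixed) - free    # model df = 7 - 11 = -4
```

A negative value means more free parameters than available moments, hence the underidentification error from sem().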
Re: [R] Error in structural equation model - The model has negative degrees of freedom
Dear Sunil,

There are 7*8/2 = 28 raw moments among the 7 observed variables. Of these, 6*7/2 = 21 are used for the moments among the 6 fixed-exogenous variables, leaving 28 - 21 = 7 df. Your model has 11 free parameters, so the df for the model = 7 - 11 = -4.

Some additional comments: If you're using raw moments, why isn't there a constant variable in the model? Do you really intend x1 -- x6 to be causes, rather than indicators, of m1 and m2? Why are there no normalizing constraints on the latent variables? Do you really want to fit a model like this to so small a data set?

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sunil W Sent: Wednesday, November 30, 2005 10:08 PM To: r-help@stat.math.ethz.ch Subject: [R] Error in structural equation model - The model has negative degrees of freedom

Hi, I am running a structural equation model with R using the sem command and am getting the following error:

Error in sem.default : The model has negative degrees of freedom = -4

My model is as follows:

s_model = specify.model()
x1 -> m1, b1, NA
x2 -> m1, b2, NA
x3 -> m2, b3, NA
x4 -> m2, b4, NA
x5 -> m2, b5, NA
x6 -> m2, b6, NA
m1 -> y, a1, NA
m2 -> y, a2, NA
m1 <-> m1, v1, NA
m2 <-> m2, v2, NA
y <-> y, v3, NA

x1-x6 are observed independent variables, m1 and m2 are the latent variables, and y is the observed dependent variable. I use the raw.moments command for calculating the covariance matrix, based on data with 147 observations. The command that I use is as follows:

s = sem(s_model, S=R, obs.variables=colnames(R), fixed.x=c('x1','x2','x3','x4','x5','x6'), raw=TRUE)

I would appreciate any help on this; I am new to structural equation models and realize that I may be making a silly error.

Thanks,
Sunil
Re: [R] Tiger Mac stalls running Rcmdr program
Dear Maha,

I'm not a Mac user, so I can't offer much help, since you appear to have followed the Mac installation notes. (Rick: I'm copying this reply to you in case you didn't see Maha's message on r-help.) I'm writing primarily to make sure that you know that the Rcmdr doesn't support either structural-equation modeling or meta-analysis.

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Maha Golestaneh Sent: Friday, November 25, 2005 4:25 AM To: r-help@stat.math.ethz.ch Subject: [R] Tiger Mac stalls running Rcmdr program

I am a Macintosh computer (Mac OS X version 10.4.3) user. I would like to run R for structural equation modeling and meta-analysis but am having difficulty using the Rcmdr interface. According to the R Commander installation notes for Tiger Macs, I need to:

1) Install X11.app from the Apple install disks - which I have done
2) Install R.app - which I have done
3) Install the binary package rgl from CRAN - which I have done
4) Install the binary Rcmdr from CRAN - which I have done

I then need to start R and X11, type library(Rcmdr) in the R console, and press return. My computer now is stalling (a colored CD spins), meaning possibly it can't run the Rcmdr (R Commander) program. Please help me install this successfully. Thank you.

Maha

Maha Golestaneh
PhD Candidate, Graduate School of Business
The Woolsack, Pavilion 2, Room 2.20
University of Cape Town, Woolsack Drive
Rondebosch, Cape Town, South Africa
[EMAIL PROTECTED] [EMAIL PROTECTED]
H 011-27-21-685-4050 x222 M 011-27-72-713-0649
Re: [R] type III sums of squares in R
Dear Stefanie and Mike,

To elaborate slightly on Mike's points, the Anova() function in car calculates Type-III (and Type-II) tests differently from SAS. (The difference originates in the fact that SAS uses a deficient-rank parametrization of the model, while R uses a full-rank parametrization; it would be possible to mimic SAS's behaviour more closely, but I think that there are problems with it.) As a consequence, you have to use a contrast-generating function, such as contr.helmert or contr.sum (but not contr.SAS), that provides contrasts that are orthogonal in the row-basis of the model matrix.

I should probably elaborate the warning about Type-III tests in the help page for Anova(), but perhaps it would help to know that the issue is discussed at greater length in the book with which the car package is associated.

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mike Sears Sent: Thursday, November 24, 2005 1:54 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] type III sums of squares in R

If you use the Anova function in the car package and specify contr.sum or contr.SAS for the contrasts for your categorical factors, you will get the same results as outputted by SAS. I've tried this with a variety of data sets and it works.

On Thu November 24 2005 08:27, Stefanie von Felten, IPWIfU wrote:

Hi everyone,

Can someone explain to me how to calculate SAS Type-III sums of squares in R? Not that I would like to use them (I know they are problematic); I would like to know how to calculate them in order to demonstrate that strange things happen when you use them (for a course, for example). I know you can use drop1(lm(), test="F"), but for an lm(y~A+B+A:B), Type-III SSQs are only calculated for the A:B interaction.
If I specify lm(y~A+B), it calculates Type-II SSQ for the main effects (as Type III only differs from Type II if interactions are included). Thus, this approach doesn't help.

Another approach is the Anova(..., type="III") function within library(car). But somehow it produces strange results. Somebody told me I have to change the contrast settings using

options(contrasts=c("contr.helmert", "contr.poly"))

but I had the impression that my results are still impossible. Do the calculations depend on the version of R I use? I am currently using R 2.1.1.

The only thing that seems to work is a trick: specify a separate column AB that codes a new variable for the interaction of A:B. Now you can fit A, B, and AB (as 3 main effects) in 3 different sequential models, with each one of them last once. For the last term you then get Type-III SSQ, which seem to be correct.

Cheers,
Steffi

--
Michael W. Sears, Ph.D.
Postdoctoral Fellow, Department of Biology/MS 314, University of Nevada, Reno, Reno, NV 89557
Assistant Professor, Department of Zoology, Southern Illinois University, Carbondale, IL 62901
phone: 775.784.8008 cell: 775.232.3520
web: http://www.unr.edu/homepage/msears http://www.science.siu.edu/zoology/sears
"Natural selection is a mechanism for generating an exceedingly high degree of improbability." -Sir Ronald Fisher (1890-1962)
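To make John's advice concrete, here is a sketch (the data and factor names are invented for the example) of getting SAS-style Type-III tests with car's Anova() and sum-to-zero contrasts:

```r
library(car)  # provides Anova()

# Illustrative balanced two-way layout (names are made up)
set.seed(1)
dat <- data.frame(A = factor(rep(1:2, each = 20)),
                  B = factor(rep(1:4, times = 10)),
                  y = rnorm(40))

# Contrasts orthogonal in the row-basis are essential for Type III:
options(contrasts = c("contr.sum", "contr.poly"))

mod <- lm(y ~ A * B, data = dat)
Anova(mod, type = "III")
```

With the default contr.treatment contrasts instead, the "Type III" main-effect tests would generally not match SAS, which is the strange behaviour Stefanie observed.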
Re: [R] residuals in logistic regression model
Dear Urania,

The residuals method for glm objects can compute several kinds of residuals; the default is deviance residuals. See ?residuals.glm for details and references.

I hope this helps.
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Urania Sun Sent: Thursday, November 24, 2005 1:36 PM To: r-help@stat.math.ethz.ch Subject: [R] residuals in logistic regression model

In the logistic regression model, there is no residual:

log(pi/(1-pi)) = beta_0 + beta_1*X_1 + ...

But the glm model will return residuals. What are they? How should I understand this? Can we put a residual in the logistic regression model by replacing pi with pi' (the estimated pi)?

log(pi'/(1-pi')) = beta_0 + beta_1*X_1 + ... + ei

Thanks!
Re: [R] residuals in logistic regression model
Dear Urania,

-----Original Message----- From: Urania Sun [mailto:[EMAIL PROTECTED] Sent: Thursday, November 24, 2005 8:52 PM To: John Fox Cc: r-help@stat.math.ethz.ch Subject: Re: [R] residuals in logistic regression model

Thanks, Professor. But is it OK to write residuals on the right-hand side of the logistic-regression formula? Some people said I cannot, since the generalized linear model uses a function to link the expectation to a linear model, so there should not be residuals on the right-hand side. My question is: if residuals do exist (as in the glm model output), why not put them in the formula (for example, if I write the left-hand side as the estimated odds ratio)?

There are several kinds of residuals for generalized linear models, as I mentioned (see ?residuals.glm). The residuals in the glm output are deviance residuals, which are the casewise (signed) components of the residual deviance; differences between y and fitted-y are called response residuals (and aren't generally as useful). The left-hand side of a logit model transformed with the logit link is the log-odds, not the odds or odds ratio. The form of the model to which the response residuals apply has the proportion, not the logit, on the left-hand side. These matters are discussed in the references given in ?residuals.glm, and in many other places, such as Sec. 6.6 of my R and S-PLUS Companion to Applied Regression.

Many thanks! Happy Thanksgiving!

Unfortunately, we celebrate Thanksgiving in Canada in October, probably because the weather here in late November leaves little to be thankful for.

Regards,
John

On 11/24/05, John Fox [EMAIL PROTECTED] wrote:

Dear Urania,

The residuals method for glm objects can compute several kinds of residuals; the default is deviance residuals. See ?residuals.glm for details and references.

I hope this helps.
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Urania Sun Sent: Thursday, November 24, 2005 1:36 PM To: r-help@stat.math.ethz.ch Subject: [R] residuals in logistic regression model

In the logistic regression model, there is no residual:

log(pi/(1-pi)) = beta_0 + beta_1*X_1 + ...

But the glm model will return residuals. What are they? How should I understand this? Can we put a residual in the logistic regression model by replacing pi with pi' (the estimated pi)?

log(pi'/(1-pi')) = beta_0 + beta_1*X_1 + ... + ei

Thanks!
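A short sketch of the distinction John describes, using an invented binary-response fit:

```r
# Toy logistic regression (data invented for illustration)
set.seed(2)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(0.5 * x))
mod <- glm(y ~ x, family = binomial)

# The default residuals for glm objects are deviance residuals ...
dev <- residuals(mod)               # same as residuals(mod, type = "deviance")
# ... while response residuals are simply y minus the fitted probabilities:
resp <- residuals(mod, type = "response")
all.equal(resp, y - fitted(mod))    # TRUE

# The residual deviance is the sum of squared deviance residuals:
all.equal(sum(dev^2), deviance(mod))  # TRUE
```

This shows concretely why the deviance residuals are the "casewise (signed) components of the residual deviance", while the response residuals belong to the model stated on the probability scale.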
Re: [R] _simple_ introduction to bayesian methods?
Dear Joshua,

You might take a look at Lancaster, An Introduction to Modern Bayesian Econometrics.

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Joshua N Pritikin Sent: Tuesday, November 22, 2005 12:13 PM To: r-help@stat.math.ethz.ch Subject: [R] _simple_ introduction to bayesian methods?

I made it through the first chapter ("Background") of Bayesian Data Analysis by Gelman et al. Conceptually, the Bayesian approach appeals to me and my curiosity is piqued. However, the next chapter was much too terse. The math is daunting. Where can I find a gentle introduction? Or which math book(s) do I need to read first?

On page 27, there is mention of introductory books, including Bayesian Methods by Jeff Gill (2002). Just for fun, I took a look at this book to see whether I could get through it. Chapter 1 was inviting, but again, the math got too sophisticated starting from chapter 2.

--
Make April 15 just another day, visit http://fairtax.org
Re: [R] graphing help
Dear Anthony,

You can use xlim in plot() to reverse the x-axis. For example:

x <- 1:10
y <- x
plot(x, y, xlim=rev(range(x)))

(I'm pretty sure, by the way, that this question has been asked before.)

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Anthony Pitts Sent: Tuesday, November 22, 2005 3:04 PM To: r-help@stat.math.ethz.ch Subject: [R] graphing help

Hello all,

Does anyone know how to make R graph the x-axis in descending order? I am all set in creating a vertical bar chart using either plot or xyplot (in lattice), but I need the x-axis in descending order. I have tried resorting the data, but apparently R automatically graphs the x-axis in ascending order. I have multiplied the data used for the x-axis by -1, and this works, but now the tick-mark labels are negative. So does anyone know how to 1) make R graph the x-axis in descending order, or 2) change the tick labels so I can get rid of the negative signs?

Thanks in advance for your help,
Tony
Re: [R] Autoloading R Commander
Dear Stephen,

I believe that this question has been asked before, though possibly privately rather than on the r-help list. A solution (kindly provided, as I recall, by Brian Ripley) is to put the following in an appropriate start-up file. For example, if you *always* want to start the Rcmdr when R starts, this could go in Rprofile.site in R's etc subdirectory. For more detail, see ?Startup, as others have suggested.

local({
    old <- getOption("defaultPackages")
    options(defaultPackages = c(old, "Rcmdr"))
})

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Stephen P. Molnar, Ph.D. Sent: Saturday, November 19, 2005 10:35 AM To: R Subject: [R] Autoloading R Commander

How do I go about autoloading R Commander when I start R?

Thanks in advance.
--
Stephen P. Molnar, Ph.D.    Life is a fuzzy set
Foundation for Chemistry    Stochastic and multivariant
http://www.geocities.com/FoundationForChemistry
Re: [R] Autoloading R Commander
Dear Stephen,

As a brief addendum, this information (and other information) is in the Rcmdr installation notes, at http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html. Sorry I forgot that when I posted my original answer.

John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of John Fox Sent: Saturday, November 19, 2005 11:27 AM To: [EMAIL PROTECTED] Cc: 'R' Subject: Re: [R] Autoloading R Commander

Dear Stephen,

I believe that this question has been asked before, though possibly privately rather than on the r-help list. A solution (kindly provided, as I recall, by Brian Ripley) is to put the following in an appropriate start-up file. For example, if you *always* want to start the Rcmdr when R starts, this could go in Rprofile.site in R's etc subdirectory. For more detail, see ?Startup, as others have suggested.

local({
    old <- getOption("defaultPackages")
    options(defaultPackages = c(old, "Rcmdr"))
})

I hope this helps,
John

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Stephen P. Molnar, Ph.D. Sent: Saturday, November 19, 2005 10:35 AM To: R Subject: [R] Autoloading R Commander

How do I go about autoloading R Commander when I start R?

Thanks in advance.
--
Stephen P. Molnar, Ph.D.    Life is a fuzzy set
Foundation for Chemistry    Stochastic and multivariant
http://www.geocities.com/FoundationForChemistry
Re: [R] loess: choose span to minimize AIC?
Dear Mike,

You could try:

bestLoess <- function(model, criterion=c("aicc", "aicc1", "gcv"), spans=c(.05, .95)){
    criterion <- match.arg(criterion)
    f <- function(span) {
        mod <- update(model, span=span)
        loess.aic(mod)[[criterion]]
    }
    result <- optimize(f, spans)
    list(span=result$minimum, criterion=result$objective)
}

A little experimentation suggests that aicc1 doesn't seem to behave reasonably.

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael Friendly Sent: Thursday, November 17, 2005 9:58 AM To: R-help@stat.math.ethz.ch Subject: [R] loess: choose span to minimize AIC?

Is there an R implementation of a scheme for automatic smoothing-parameter selection with loess, e.g., by minimizing one of the AIC/GCV statistics discussed by Hurvich, Simonoff & Tsai (1998)? Below is a function that calculates the relevant values of AICC, AICC1, and GCV (I think, because I had to guess from the names of the components returned in a loess object). I guess I could use optimize(), or do a simple line search on span=, but I'm not sure how to use loess.aic to write a function that would act as a wrapper for loess() and return the minimizing loess fit for a specified criterion.
loess.aic <- function (x) {
    # extract values from loess object
    if (!(inherits(x, "loess"))) stop("Error: argument must be a loess object")
    span <- x$pars$span
    n <- x$n
    traceL <- x$trace.hat
    sigma2 <- sum(x$residuals^2) / (n - 1)
    delta1 <- x$one.delta
    delta2 <- x$two.delta
    enp <- x$enp
    aicc <- log(sigma2) + 1 + 2 * (2 * (traceL + 1)) / (n - traceL - 2)
    aicc1 <- n * log(sigma2) + n * ((delta1 / (delta2 * (n + enp))) / (delta1^2 / delta2) - 2)
    gcv <- n * sigma2 / (n - traceL)^2
    result <- list(span=span, aicc=aicc, aicc1=aicc1, gcv=gcv)
    return(result)
}

cars.lo <- loess(dist ~ speed, cars)
(values <- loess.aic(cars.lo))

$span
[1] 0.75

$aicc
[1] 6.93678

$aicc1
[1] 167.7267

$gcv
[1] 5.275487

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept., York University
Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street, Toronto, ONT M3J 1P3 CANADA
http://www.math.yorku.ca/SCS/friendly.html
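Putting the two pieces together, usage would look something like this (assuming both functions above have been sourced; no outputs are claimed here):

```r
cars.lo <- loess(dist ~ speed, cars)

# Search span in [0.05, 0.95] for the value minimizing the corrected AIC
best <- bestLoess(cars.lo, criterion = "aicc")
best$span        # the optimal smoothing parameter
best$criterion   # the minimized AICC value

# Refit at the chosen span
cars.best <- update(cars.lo, span = best$span)
```

Note that optimize() assumes the criterion is roughly unimodal over the span interval; for a criterion with multiple local minima, a grid search over spans would be a safer design choice.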
Re: [R] Interpretation of output from glm
Dear Pedro,

The basic point, which relates to the principle of marginality in formulating linear models, applies whether the predictors are factors, covariates, or both. I think that this is a common topic in books on linear models; I certainly discuss it in my Applied Regression, Linear Models, and Related Methods.

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pedro de Barros Sent: Wednesday, November 09, 2005 10:45 AM To: r-help@stat.math.ethz.ch Subject: Re: [R] Interpretation of output from glm

Dear John,

Thanks for the quick reply. I did indeed have these ideas, but somehow floating, and all I could find about this mentioned categorical predictors. Can you suggest a good book where I could try to learn more about this?

Thanks again,
Pedro

At 01:49 09/11/2005, you wrote:

Dear Pedro,

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pedro de Barros Sent: Tuesday, November 08, 2005 9:47 AM To: r-help@stat.math.ethz.ch Subject: [R] Interpretation of output from glm

I am fitting a logistic model to binary data. The response variable is a factor (0 or 1) and all predictors are continuous variables. The main predictor is LT (I expect a logistic relation between LT and the probability of being mature) and the others are variables I expect to modify this relation. I want to test whether all predictors contribute significantly to the fit or not. I fit the full model and get these results:

summary(HMMaturation.glmfit.Full)

Call:
glm(formula = Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom, family = binomial(link = logit), data = HMIndSamples)

Deviance Residuals:
    Min      1Q  Median      3Q     Max
-3.0983 -0.7620  0.2540  0.7202  2.0292

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept) -8.789e-01  3.694e-01  -2.379  0.01735 *
LT           5.372e-02  1.798e-02   2.987  0.00281 **
CondF       -6.763e-02  9.296e-03  -7.275 3.46e-13 ***
Biom        -1.375e-02  2.005e-03  -6.856 7.07e-12 ***
LT:CondF     2.434e-03  3.813e-04   6.383 1.74e-10 ***
LT:Biom      7.833e-04  9.614e-05   8.148 3.71e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 10272.4 on 8224 degrees of freedom
Residual deviance:  7185.8 on 8219 degrees of freedom
AIC: 7197.8

Number of Fisher Scoring iterations: 8

However, when I run anova on the fit, I get:

anova(HMMaturation.glmfit.Full, test='Chisq')

Analysis of Deviance Table

Model: binomial, link: logit
Response: Mature
Terms added sequentially (first to last)

         Df Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL                      8224    10272.4
LT        1   2873.8     8223     7398.7       0.0
CondF     1      0.1     8222     7398.5       0.7
Biom      1      0.2     8221     7398.3       0.7
LT:CondF  1    142.1     8220     7256.3 9.413e-33
LT:Biom   1     70.4     8219     7185.8 4.763e-17

Warning message:
fitted probabilities numerically 0 or 1 occurred in: method(x = x[, varseq <= i, drop = FALSE], y = object$y, weights = object$prior.weights, ...

I am having a little difficulty interpreting these results. The result from the fit tells me that all predictors are significant, while the anova indicates that, besides LT (the main variable), only the interactions of the other terms are significant, but the main effects are not. I believe that in the first output (on the glm object), the significance of all terms is calculated considering each of them alone in the model (i.e., removing all other terms), while the anova output is (as it says) considering the sequential addition of the terms. So, there are 2 questions:

a) Can I tell that the interactions are significant, but not the main effects?

In a model with this structure, the main effects represent slopes over the origin (i.e., where the other variables in the product terms are 0), and aren't meaningfully interpreted as main effects. (Is there even any data near the origin?)

b) Is it legitimate to consider a model where the interactions are considered, but not the main effects CondF and Biom?

Generally, no: that is, such a model is interpretable, but it places strange constraints on the regression surface, namely that the CondF and Biom slopes are 0 over the origin. None of this is specific to logistic regression.
Re: [R] basic mac question
Dear George, I'm afraid that people won't understand the context of your question -- that is, that you're using the Rcmdr GUI, but that the crux of your problem has to do with changing the names of variables in the data editor using R on a Mac. (As you know, I'm not a Mac user and am unable to answer that question. I did answer the first part of your question in our private email correspondence, which is that the New Data Set window goes away only after the data editor is closed.) I hope that, with this clarification, a Mac R user will come to your assistance. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of George Skountrianos Sent: Wednesday, November 09, 2005 6:51 PM To: r-help@stat.math.ethz.ch Subject: [R] basic mac question ok i got r 2.2.0 to work on my mac just fine but im having problems with when i try to create a new data set. when i go to new data...new data set it opens up the window. so i name my set but when i click ok the window does not go away. is this normal. also the data editor will not allow me to name my variables. for example i click on the create new variable columb and row button and i get var1, var2 and then n/a n/a. ok well im able to put numbers into the n/a slots but i cannot change the var1, var2, ect names. any suggestions. ooo i have tiger 10.4. thanks __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Need advice about models with ordinal input variables
Dear Paul, I'll try to answer these questions briefly. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Johnson Sent: Tuesday, November 08, 2005 2:42 PM To: r-help@stat.math.ethz.ch Subject: [R] Need advice about models with ordinal input variables Dear colleagues: I've been storing up this question for a long time and apologize for the length and verbosity of it. I am having trouble in consulting with graduate students on their research projects. They are using surveys to investigate the sources of voter behavior or attitudes. They have predictors that are factors, some ordered, but I am never confident in telling them what they ought to do. Usually, they come in with a regression model fitted as though these variables are numerical, and if one looks about in the social science literature, one finds that many people have published doing the same. I want to ask your advice about some cases. 1. An ordered factor that masquerades as a numerical interval level score. In the research journals, these are the ones most often treated as numerical variables in regressions. For example: Thermometer scores for Presidential candidates range from 0 to 100 in integer units. In my opinion, such scales aren't even really measurements, but they have a prima-facie reasonableness, and I wouldn't hesitate to use them in the absence of something better. What's a better idea? In an OLS model with just one input variable, a plot will reveal if there is a significant nonlinearity. One can recode the assigned values to linearize the final model or take the given values and make a nonlinear model. In the R package acepack I found avas, which works like a rubber ruler and recodes variables in order to make relationships as linear and homoskedastic as possible. I've never seen this used in the social science literature. It works like magic. Take an ugly scatterplot and shazam, out come transformed variables that have a beautiful plot.
But what do you make of these things? There is so much going on in these transformations that interpretation of the results is very difficult. You can't say a one unit increase in x causes a b increase in y. Furthermore, if the model is a survival model, a logistic regression, or other non-OLS model, I don't see how the avas approach will help. As a general matter, the central issue seems to me the functional form of the relationship between the variables, whether or not X really has a unit of measurement. If you use a nonparametric fit (such as avas) or even if you use a complex parametric fit, such as a regression (rather than smoothing) spline, then why not think of the graph of the fit as the best description? I've tried fiddling about with smoothers, treating the input scores as if they were numerical. I got this idea from Prof. Harrell's Regression Modeling Strategies. In his Design package for R, one can include a cubic spline for a variable in a model by replacing x with rcs(x). Very convenient. If the results say the relationship is mostly linear, then we might as well treat the input variable as a numerical score and save some degrees of freedom. But if the higher order terms are statistically significant, it is difficult to know what to do. The best strategy I have found so far is to calculate fitted values for particular inputs and then try to tell a story about them. This seems reasonable, though I'd usually prefer a graph to a table. 2. Ordinal variables with less than 10 values. Consider variables like self-reported ideology, where respondents are asked to place themselves on a 7 point scale ranging from very conservative to very liberal. Or Party Identification on a 7 point scale, ranging (in the US) from Strong Democrat to Strong Republican. It has been quite common to see these thrown into regression models as if they were numerical. 
I've sometimes found it useful to run a regression treating them as unordered factors, and then I attempt to glean a pattern in the coefficients. If the parameter estimates step up by a fixed proportion, then one might think there's no damage from treating them as numerical variables. Since linear (and other similar) models don't make distributional assumptions about the X's (other than independence from the errors), nothing in principle changes. Yesterday, it occurred to us that there should be a significance test to determine if one loses predictive power by replacing the factor-treatment of x with x itself. Is there a non-nested model test that is most appropriate? Actually, the models are nested, so, e.g., for a linear model, you could do an incremental F-test. That is, a linear relationship is a special case of any relationship at all, which is what you get by treating X as a factor. 3. Truly numerical variables that are
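The incremental F-test described in the thread above can be sketched in a few lines with simulated data (the variable names are purely illustrative): fit the model twice, once treating x as numeric and once as an unordered factor, and compare the two fits with anova().

```r
# Linear-in-x model vs. x as an unordered factor: the former is nested
# in the latter, so anova() gives an incremental F-test for departure
# from linearity.
set.seed(1)
x <- rep(1:7, each = 20)              # a 7-point scale
y <- 0.5 * x + rnorm(length(x))       # truly linear relationship
m.linear <- lm(y ~ x)
m.factor <- lm(y ~ factor(x))
anova(m.linear, m.factor)             # F-test on 7 - 2 = 5 numerator df
```

A non-significant F here is evidence that the single-slope (numerical) treatment loses little relative to the fully flexible factor treatment.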
Re: [R] Interpretation of output from glm
Dear Pedro, -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pedro de Barros Sent: Tuesday, November 08, 2005 9:47 AM To: r-help@stat.math.ethz.ch Subject: [R] Interpretation of output from glm Importance: High I am fitting a logistic model to binary data. The response variable is a factor (0 or 1) and all predictors are continuous variables. The main predictor is LT (I expect a logistic relation between LT and the probability of being mature) and the other are variables I expect to modify this relation. I want to test if all predictors contribute significantly for the fit or not. I fit the full model, and get these results summary(HMMaturation.glmfit.Full) Call: glm(formula = Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom, family = binomial(link = logit), data = HMIndSamples) Deviance Residuals: Min 1Q Median 3Q Max -3.0983 -0.7620 0.2540 0.7202 2.0292 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -8.789e-01 3.694e-01 -2.379 0.01735 * LT 5.372e-02 1.798e-02 2.987 0.00281 ** CondF -6.763e-02 9.296e-03 -7.275 3.46e-13 *** Biom -1.375e-02 2.005e-03 -6.856 7.07e-12 *** LT:CondF 2.434e-03 3.813e-04 6.383 1.74e-10 *** LT:Biom 7.833e-04 9.614e-05 8.148 3.71e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 10272.4 on 8224 degrees of freedom Residual deviance: 7185.8 on 8219 degrees of freedom AIC: 7197.8 Number of Fisher Scoring iterations: 8 However, when I run anova on the fit, I get anova(HMMaturation.glmfit.Full, test='Chisq') Analysis of Deviance Table Model: binomial, link: logit Response: Mature Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev P(>|Chi|) NULL 8224 10272.4 LT 1 2873.8 8223 7398.7 0.0 CondF 1 0.1 8222 7398.5 0.7 Biom 1 0.2 8221 7398.3 0.7 LT:CondF 1 142.1 8220 7256.3 9.413e-33 LT:Biom 1 70.4 8219 7185.8 4.763e-17 Warning message: fitted probabilities numerically 0 or 1 occurred in: method(x = x[, varseq <= i, drop = FALSE], y = object$y, weights = object$prior.weights, I am having a little difficulty interpreting these results. The result from the fit tells me that all predictors are significant, while the anova indicates that besides LT (the main variable), only the interaction of the other terms is significant, but the main effects are not. I believe that in the first output (on the glm object), the significance of all terms is calculated considering each of them alone in the model (i.e. removing all other terms), while the anova output is (as it says) considering the sequential addition of the terms. So, there are 2 questions: a) Can I tell that the interactions are significant, but not the main effects? In a model with this structure, the main effects represent slopes over the origin (i.e., where the other variables in the product terms are 0), and aren't meaningfully interpreted as main effects. (Is there even any data near the origin?) b) Is it legitimate to consider a model where the interactions are considered, but not the main effects CondF and Biom? Generally, no: That is, such a model is interpretable, but it places strange constraints on the regression surface -- that the CondF and Biom slopes are 0 over the origin. None of this is specific to logistic regression -- it applies generally to generalized linear models, including linear models. I hope this helps, John __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
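A side note on the distinction drawn in the thread above: summary() tests each coefficient with all other terms retained, anova() adds terms sequentially, and drop1() tests each term as though it were entered last. drop1() also respects marginality, so for a model with interactions it will by default only offer the interaction terms for deletion. A small sketch with simulated data -- the names LT, CondF, and Mature merely echo Pedro's and are not his data:

```r
# Simulate a binary response driven by two predictors and their interaction.
set.seed(2)
n <- 500
LT    <- runif(n, 10, 30)
CondF <- rnorm(n, 50, 10)
eta   <- -1 + 0.1 * (LT - 20) + 0.05 * (CondF - 50) +
         0.01 * (LT - 20) * (CondF - 50)
Mature <- rbinom(n, 1, plogis(eta))
fit <- glm(Mature ~ LT * CondF, family = binomial)
anova(fit, test = "Chisq")   # sequential: LT, then CondF, then LT:CondF
drop1(fit, test = "Chisq")   # marginality: only LT:CondF is a candidate
```

Note that centering LT and CondF before forming the product would also make their "main-effect" coefficients refer to slopes at the means rather than at the origin.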
Re: [R] A Quick and (Very) Dirty Intro to Stats in R
Dear Jarrett, -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jarrett Byrnes Sent: Tuesday, November 08, 2005 6:41 PM To: R-help@stat.math.ethz.ch Subject: [R] A Quick and (Very) Dirty Intro to Stats in R . . . Most importantly, there are still a few holes that need to be filled - if they can be 1) A SIMPLE explanation for how to do mixed models using lme. I am quite unsatisfied with most of what I've seen on the net, and if it even comes close to going over my head, it really won't fly with most folk I know. I've done the best I can, but I know it falls short. Possibly take a look at the appendix on mixed models to my R and S-PLUS Companion to Applied Regression, available at http://socserv.socsci.mcmaster.ca/jfox/Books/Companion/appendix-mixed-models.pdf. This was intended to be a simple explanation. Regards, John __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] OLS variables
Dear Brian, I don't have a strong opinion, but R's interpretation seems more consistent to me, and as Kjetil points out, one can use polym() to specify a full-polynomial model. It occurs to me that ^ and ** could be differentiated in model formulae to provide both. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] Sent: Monday, November 07, 2005 4:05 AM To: Kjetil Brinchmann halvorsen Cc: John Fox; r-help@stat.math.ethz.ch Subject: Re: [R] OLS variables On Sun, 6 Nov 2005, Kjetil Brinchmann halvorsen wrote: John Fox wrote: I assume that you're using lm() to fit the model, and that you don't really want *all* of the interactions among 20 predictors: You'd need quite a lot of data to fit a model with 2^20 terms in it, and might have trouble interpreting the results. If you know which interactions you're looking for, then why not specify them directly, as in lm(y ~ x1*x2 + x3*x4*x5 + etc.)? On the other hand, if you want to include all interactions, say, up to three-way, and you've put the variables in a data frame, then lm(y ~ .^3, data=DataFrame) will do it. This is nice with factors, but with continuous variables and the need for a response-surface type of model, it will not do. For instance, with variables x, y, z in data frame dat lm( y ~ (x+z)^2, data=dat ) gives a model with the terms x, z and x*z, not the square terms. There is a need for a semi-automatic way to get these, for instance, use poly() or polym() as in: lm(y ~ polym(x,z,degree=2), data=dat) This is an R-S difference (FAQ 3.3.2). R's formula parser always takes x^2 = x whereas the S one does so only for factors.
This makes sense if you interpret `interaction' strictly as in John's description - S chose to see an interaction of any two continuous variables as multiplication (something which puzzled me when I first encountered it, as it was not well documented back in 1991). I have often wondered if this difference was thought to be an improvement, or if it is just a different implementation of the Rogers-Wilkinson syntax. Should we consider changing it? -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] OLS variables
Dear Brian, I like the idea of providing support for raw polynomials in poly() and polym(), if only for pedagogical reasons. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] Sent: Monday, November 07, 2005 11:14 AM To: John Fox Cc: r-help@stat.math.ethz.ch; 'Kjetil Brinchmann halvorsen' Subject: RE: [R] OLS variables On Mon, 7 Nov 2005, John Fox wrote: Dear Brian, I don't have a strong opinion, but R's interpretation seems more consistent to me, and as Kjetil points out, one can use polym() to specify a full-polynomial model. It occurs to me that ^ and ** could be differentiated in model formulae to provide both. However, poly[m] only provide orthogonal polynomials, and I have from time to time considered extending them to provide raw polynomials too. Is that a better-supported idea? Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] Sent: Monday, November 07, 2005 4:05 AM To: Kjetil Brinchmann halvorsen Cc: John Fox; r-help@stat.math.ethz.ch Subject: Re: [R] OLS variables On Sun, 6 Nov 2005, Kjetil Brinchmann halvorsen wrote: John Fox wrote: I assume that you're using lm() to fit the model, and that you don't really want *all* of the interactions among 20 predictors: You'd need quite a lot of data to fit a model with 2^20 terms in it, and might have trouble interpreting the results. If you know which interactions you're looking for, then why not specify them directly, as in lm(y ~ x1*x2 + x3*x4*x5 + etc.)? On the other hand, if you want to include all interactions, say, up to three-way, and you've put the variables in a data frame, then lm(y ~ .^3, data=DataFrame) will do it.
This is nice with factors, but with continuous variables and the need for a response-surface type of model, it will not do. For instance, with variables x, y, z in data frame dat lm( y ~ (x+z)^2, data=dat ) gives a model with the terms x, z and x*z, not the square terms. There is a need for a semi-automatic way to get these, for instance, use poly() or polym() as in: lm(y ~ polym(x,z,degree=2), data=dat) This is an R-S difference (FAQ 3.3.2). R's formula parser always takes x^2 = x whereas the S one does so only for factors. This makes sense if you interpret `interaction' strictly as in John's description - S chose to see an interaction of any two continuous variables as multiplication (something which puzzled me when I first encountered it, as it was not well documented back in 1991). I have often wondered if this difference was thought to be an improvement, or if it is just a different implementation of the Rogers-Wilkinson syntax. Should we consider changing it? -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
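(As it happens, later versions of R did gain exactly the extension discussed here: poly() and polym() accept a raw = TRUE argument.) A quick sketch showing that the orthogonal and raw parametrizations are simply different bases for the same fit:

```r
# Quadratic regression two ways: orthogonal polynomial basis vs. raw powers.
set.seed(3)
x <- runif(50)
y <- 1 + 2 * x - 3 * x^2 + rnorm(50, sd = 0.1)
m.orth <- lm(y ~ poly(x, 2))              # orthogonal polynomial columns
m.raw  <- lm(y ~ poly(x, 2, raw = TRUE))  # literal x and x^2 columns
all.equal(fitted(m.orth), fitted(m.raw))  # identical fitted values
```

The coefficients differ between the two fits (they refer to different basis columns), but the fitted regression surface is the same; the raw form is the one whose coefficients match the hand-written polynomial.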
Re: [R] OLS variables
Dear Leaf, I assume that you're using lm() to fit the model, and that you don't really want *all* of the interactions among 20 predictors: You'd need quite a lot of data to fit a model with 2^20 terms in it, and might have trouble interpreting the results. If you know which interactions you're looking for, then why not specify them directly, as in lm(y ~ x1*x2 + x3*x4*x5 + etc.)? On the other hand, if you want to include all interactions, say, up to three-way, and you've put the variables in a data frame, then lm(y ~ .^3, data=DataFrame) will do it. There are many terms in this model, however, if not quite 2^20. The introductory manual that comes with R has information on model formulas in Section 11. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Leaf Sun Sent: Sunday, November 06, 2005 3:11 AM To: r-help@stat.math.ethz.ch Subject: [R] OLS variables Dear all, Is there any simple way in R that I can put all the interactions of the variables in the OLS model? e.g. I have a bunch of variables, x1,x2, x20... I expect them to have interactions (e.g. x1*x2, x3*x4*x5... ) with some combinations (2 way or higher dimensions). Is there any way that I can write the model simpler? Thanks! Leaf __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
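The size of the expanded model can be checked without fitting anything, by counting the term labels the formula generates (a sketch with made-up column names):

```r
# With p predictors, y ~ .^3 expands to all main effects plus all
# two- and three-way interactions.
DataFrame <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10),
                        x3 = rnorm(10), x4 = rnorm(10))
labs <- attr(terms(y ~ .^3, data = DataFrame), "term.labels")
length(labs)   # choose(4,1) + choose(4,2) + choose(4,3) = 14 terms
```

With 20 predictors the same count is 20 + 190 + 1140 = 1350 terms, which makes concrete why John recommends naming the interactions of interest directly.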
Re: [R] cox models
Dear Bertrand, There are only 4 degrees of freedom for a 5-level factor. By default, R creates contrasts for an unordered factor in a model formula using the function contr.treatment(), which, also by default, treats the first level of the factor (class, in your case) as the baseline or reference level. To see the contrasts, enter the command contrasts(igr1$class). A bit more information on contrasts is in Section 11.1.1 of the introductory manual that comes with R. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of bertrand Sent: Sunday, November 06, 2005 4:21 AM To: [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Subject: [R] cox models Hello, i'm a french student of medical oncology and i'm working on breast cancer. I have a variable with the histologic type of tumor which is between 1 and 5. I use as.factor function to make some variable with level between 1 and 5. When i put it in the cox model i have only the level between 2 and 5. The level 1 doesn't appear. I think i have to change the number of level but i don't really know. Please can you help me? Class Levels: 1 2 3 4 5 coxph(formula = Surv(delai.etat, etat) ~ class, data = igr1) coef exp(coef) se(coef) z p class2 -0.093 0.911 0.245 -0.38 7.0e-01 class3 -0.749 0.473 0.286 -2.62 8.9e-03 class4 -0.600 0.549 0.343 -1.75 8.0e-02 class5 -0.874 0.417 0.197 -4.44 8.9e-06 Likelihood ratio test=24.9 on 4 df, p=5.28e-05 n=740 (1 observation deleted due to missingness) Thanks for your help Bertrand __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
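A sketch of the coding John describes, using a made-up 5-level factor: contr.treatment() produces 4 dummy regressors with the first level as baseline, and relevel() changes which level serves as the reference.

```r
class <- factor(rep(1:5, each = 4))   # a 5-level factor like Bertrand's
contrasts(class)                      # 5 x 4 treatment-contrast matrix
dim(contrasts(class))                 # 4 columns -> 4 coefficients, 4 df
class2 <- relevel(class, ref = "3")   # make level 3 the baseline instead
```

Refitting the Cox model with the releveled factor would then report coefficients for levels 1, 2, 4, and 5, each relative to level 3.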
Re: [R] (no subject)
Dear Erez, How about C <- A + B ? Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Erez Sent: Tuesday, November 01, 2005 5:39 AM To: r-help@stat.math.ethz.ch Subject: [R] (no subject) Hi I need an advise if any one can help me. i have mass of data in 2 array A and B: A = 0 1 0 0 1 1 0 0 B = 0 0 0 1 0 1 1 1 and i have 3 rules to merge them into 3rd array C: if A[i] + B[i] == 0 then C[i]=0 if A[i] + B[i] == 1 then C[i]=1 if A[i] + B[i] == 2 then C[i]=2 it looks easy but with the regular way (loop) with large data it takes days (i test it). If any one can advise me what to do i'll be happy. Thanks Erez [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
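John's one-liner works because arithmetic in R is vectorized: the three rules collapse to C[i] = A[i] + B[i], and the addition runs element-wise over the whole vectors in one pass, with no explicit loop. Using Erez's sample data:

```r
A <- c(0, 1, 0, 0, 1, 1, 0, 0)
B <- c(0, 0, 0, 1, 0, 1, 1, 1)
C <- A + B   # element-wise sum; no loop needed
C            # 0 1 0 1 1 2 1 1
```

The same expression scales to millions of elements in a fraction of a second, which is why the explicit for-loop version felt so slow by comparison.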
Re: [R] function effect and standard error
Dear Emilie, This is, I think, the effect() function in the effects package. By the way, the model that you've fit with glm() is just a linear model, and could also have been fit with lm(). For a model with this simple structure, effect() computes the adjusted means for the factor sp, holding other predictors to their average values. These effects are just fitted values under the model, and the standard errors reported are for the fitted values. For details, see the paper at http://www.jstatsoft.org/counter.php?id=75url=v08/i15/effect-displays-revised.pdf. I hope this helps. John On Tue, 1 Nov 2005 12:55:17 -0500 Emilie Berthiaume [EMAIL PROTECTED] wrote: Hi list! I did the following regression: reg1 <- glm(alti~sp + ovent + vivent + nuage, family=gaussian, data=meteo1) I was interested in knowing the effect of the species (sp) in reg1 and so I used the function «effect»: effect.sp <- effect("sp", reg1, se=TRUE) with this output: sp AK BW NH OS RT SS 2.730101 2.885363 2.753774 2.750311 3.084606 2.834390 If I enter the following command: effect.sp$se I get this output: 1253 3100488514 6100 0.07924610 0.06713200 0.11493178 0.13106639 0.05252749 0.04208334 My question is: Do the numbers on the second line of this output represent the standard error? What do the numbers on the top line represent? Thank you, Emilie Berthiaume graduate student Biology Departement Université de Sherbrooke 2500 boul. de l'Université Sherbrooke, Québec J1K 2R1 CANADA Tél: 1-819-821-8000 poste 2059 Fax: 1-819-821-8049 [EMAIL PROTECTED] [[alternative HTML version deleted]] John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
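The adjusted means that effect() reports can be reproduced by hand with predict(): fit the model, then predict at each level of the factor with the other predictors held at their means. A sketch with simulated data -- the names sp, ovent, vivent, and alti only echo Emilie's, and this uses base R rather than the effects package itself:

```r
set.seed(4)
d <- data.frame(sp = factor(rep(c("AK", "BW", "NH"), each = 30)),
                ovent = rnorm(90), vivent = rnorm(90))
d$alti <- 2 + 0.3 * (d$sp == "BW") + 0.5 * d$ovent + rnorm(90, sd = 0.2)
fit <- lm(alti ~ sp + ovent + vivent, data = d)
# One prediction per species level, covariates fixed at their means:
newd <- data.frame(sp = factor(levels(d$sp), levels = levels(d$sp)),
                   ovent = mean(d$ovent), vivent = mean(d$vivent))
predict(fit, newdata = newd, se.fit = TRUE)  # adjusted means and their SEs
```

The $se.fit component here corresponds to the standard errors on the second line of Emilie's output, one per level of sp.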
Re: [R] GLM
Dear Jeff, One way to do this is with by(): summaries <- by(example, Individual, function(data){ mod <- glm(Type~Dwater+Habitat, data=data, family=binomial) list(AIC=AIC(mod), coef=coef(mod)) }) sapply(summaries, function(x) x$coef) rowMeans(sapply(summaries, function(x) x$coef)) sapply(summaries, function(x) x$AIC) I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeff Row Sent: Monday, October 31, 2005 12:01 PM To: r-help@stat.math.ethz.ch Subject: [R] GLM Hello R-users, I am a relatively new user of R and I have a question regarding glm. I want to run the function GLM on multiple individuals separately and then get means for summary statistics such as AIC and coef. Is there a way to do this in R without separately going through each individual and subsetting the dataset, getting the summary statistics and averaging them? For example, in the dataset below I would like to run the function: glm(Type~Dwater+Habitat, data=example, family=binomial()) on each individual separately and then get the average for the coef's Dwater and Habitat for the 3 individuals. example Individual Type Dwater Habitat 1 1 0 40 0 2 1 1 80 1 3 1 0 30 0 4 1 1 90 0 5 2 0 15 1 6 2 1 95 1 7 2 0 20 0 8 2 1 75 1 9 3 0 15 0 10 3 1 60 1 11 3 0 10 1 12 3 1 95 0 Thanks, Jeff. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
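A runnable version of John's sketch, using simulated data rather than Jeff's small table (four observations per individual is too few for three separate logistic fits):

```r
set.seed(5)
example <- data.frame(Individual = factor(rep(1:3, each = 40)),
                      Dwater = runif(120, 0, 100),
                      Habitat = rbinom(120, 1, 0.5))
example$Type <- rbinom(120, 1, plogis(-2 + 0.04 * example$Dwater))
# Fit one logistic regression per individual and collect the summaries:
summaries <- by(example, example$Individual, function(data) {
  mod <- glm(Type ~ Dwater + Habitat, data = data, family = binomial)
  list(AIC = AIC(mod), coef = coef(mod))
})
rowMeans(sapply(summaries, function(x) x$coef))  # average coefficients
mean(sapply(summaries, function(x) x$AIC))       # average AIC
```

Note the grouping variable is given as example$Individual so the code runs without attaching the data frame.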
Re: [R] Problem with llines in lattice
Dear Deepayan, The application in which I encountered the problem is much more complicated, so I'm not sure whether using groups will work there, but I'll give it a try. Thanks for this, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Deepayan Sarkar Sent: Saturday, October 29, 2005 6:01 PM To: John Fox Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Problem with llines in lattice On 10/29/05, John Fox [EMAIL PROTECTED] wrote: Dear r-help list members, I'm experiencing problems getting type="b" (or "o" or "c") to work in llines(). Try, e.g., the following scaled-down example: x <- factor(c("a", "b", "a", "b")) y <- c(1,2,1,2) z <- factor(c("A", "A", "B", "B")) symbols <- 1:2 lines <- 1:2 colors <- 1:2 zvals <- levels(z) xyplot(y~x|z, panel = function(x, y, subscripts, ...) { for (i in 1:length(zvals)) { sub <- z[subscripts] == zvals[i] llines(x[sub], y[sub], lwd = 2, type = "b", col = colors[i], pch = symbols[i], lty = lines[i]) } }) Only the lines (properly coloured with correct line types) appear, but no symbols are plotted. It's a bug in lplot.xy. Fortunately, panel.xyplot will do the same thing, so you can use it instead of llines. One comment: I'm not sure if this would work in your real application, but for your toy example a more direct approach would be xyplot(y ~ x | z, type = 'b', groups = z, col = colors, pch = symbols, lty = lines, lwd = 2) -Deepayan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Problem with llines in lattice
Dear r-help list members, I'm experiencing problems getting type="b" (or "o" or "c") to work in llines(). Try, e.g., the following scaled-down example: x <- factor(c("a", "b", "a", "b")) y <- c(1,2,1,2) z <- factor(c("A", "A", "B", "B")) symbols <- 1:2 lines <- 1:2 colors <- 1:2 zvals <- levels(z) xyplot(y~x|z, panel = function(x, y, subscripts, ...) { for (i in 1:length(zvals)) { sub <- z[subscripts] == zvals[i] llines(x[sub], y[sub], lwd = 2, type = "b", col = colors[i], pch = symbols[i], lty = lines[i]) } }) Only the lines (properly coloured with correct line types) appear, but no symbols are plotted. On the other hand, if I call llines() with type="l" and type="p" separately in the panel function, both lines and points appear: xyplot(y~x|z, panel = function(x, y, subscripts, ...) { for (i in 1:length(zvals)) { sub <- z[subscripts] == zvals[i] llines(x[sub], y[sub], lwd = 2, type = "p", col = colors[i], pch = symbols[i], lty = lines[i]) llines(x[sub], y[sub], lwd = 2, type = "l", col = colors[i], pch = symbols[i], lty = lines[i]) } }) I've looked at both the documentation and code for llines() and lplot.xy(), which llines() calls, and don't see the source of the problem. I'm using R 2.2.0 under Windows XP. I'm pretty sure that this used to work, but I got the same problem in R 2.1.1 when I tried it there. Any help would be appreciated. Thanks, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] lm type of sums of squares
Dear Jarrett, anova() gives sequential sums of squares (as ?anova.lm says). See Anova() in the car package for something similar to Type II and III sums of squares. I hope this helps, John On Fri, 28 Oct 2005 10:05:39 -0700 Jarrett Byrnes [EMAIL PROTECTED] wrote: I'm curious, I realize there are methods for Type II and III sums of squares, and yet, when I've been constructing models with lm, I've noticed that position of the term of the model has not mattered in terms of its p-value. Does lm use sequential Type I sums of squares, or something else? Thanks! -Jarrett __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
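The order-dependence of sequential (Type I) sums of squares is easy to demonstrate with correlated predictors; if the p-values appeared order-invariant, the predictors were probably orthogonal (as in a balanced design). A sketch:

```r
set.seed(6)
x1 <- rnorm(100)
x2 <- x1 + rnorm(100)     # x2 is correlated with x1
y  <- x1 + x2 + rnorm(100)
anova(lm(y ~ x1 + x2))    # x1 entered first: SS(x1), then SS(x2 | x1)
anova(lm(y ~ x2 + x1))    # x2 entered first: each term's SS changes
```

With orthogonal predictors the two tables would agree, which is why term order can seem not to matter.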
Re: [R] Help: partial.cor significance test
Dear Simon, The population partial correlation rho[12|3...p] is 0 when the regression coefficient beta[2] for x[2] from the regression of x[1] on x[2], ..., x[p] is 0. Thus, the usual t-test for a regression coefficient also tests that the partial correlation is 0. Now, the sample partial correlation r[12|3...p] = t/sqrt(t^2 + dfe), where dfe = n - p is the degrees of freedom for error and t is the t-statistic for testing that beta[2] is 0, and thus t = sqrt(dfe*r^2[12|3...p]/(1 - r^2[12|3...p])), so it is easy to compute the (unsigned) t-statistics from the partial correlations. Why one would want to do this, however, is another matter. What would one do with a matrix of 2-sided p-values? Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of sp219 Sent: Wednesday, October 26, 2005 5:09 AM To: r-help@stat.math.ethz.ch Subject: [R] Help: partial.cor significance test Hi, I have been using the partial.cor function in Rcmdr but I was wondering if there is any easy way to get statistical significance tests (two tailed) along with the partial correlation coefficients? Simon Pickett Simon Pickett Centre for Ecology and Conservation Biology University of Exeter in Cornwall Tremough Campus Penryn Cornwall TR10 9EZ UK Tel: 01326371852 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
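John's identities are straightforward to verify numerically with simulated data: take the t-statistic for one coefficient, convert it to the partial correlation, and convert back; the two-sided p-value for the partial correlation is simply the coefficient's p-value.

```r
set.seed(7)
x1 <- rnorm(60); x2 <- rnorm(60); x3 <- rnorm(60)
fit <- lm(x1 ~ x2 + x3)                   # regress x[1] on x[2], x[3]
t   <- coef(summary(fit))["x2", "t value"]
dfe <- df.residual(fit)                   # dfe = n - p = 60 - 3 = 57
r   <- t / sqrt(t^2 + dfe)                # partial correlation r[12|3]
t.back <- sign(r) * sqrt(dfe * r^2 / (1 - r^2))  # recovers t, with its sign
p   <- 2 * pt(-abs(t), dfe)               # two-sided p-value
c(r = r, t = t, t.back = t.back, p = p)
```

Algebraically, r^2 = t^2/(t^2 + dfe) and 1 - r^2 = dfe/(t^2 + dfe), so dfe*r^2/(1 - r^2) = t^2, confirming the round trip.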
Re: [R] making an inicator variable
Dear Jen, There are lots of different ways to do what you want -- you've already had two suggestions -- but you might consider whether you really need to do it. In particular, R will generate its own indicator variables (and other kinds of contrasts) in linear and other statistical models (see Section 11 of the Introduction to R manual that comes with R). It's hard to give really good advice without knowing more about the context. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jennifer Dillon Sent: Tuesday, October 25, 2005 2:52 PM To: r-help@stat.math.ethz.ch Subject: [R] making an inicator variable Hello, I am almost a total novice, and I am sure there must be an easy (and basic) way to turn a variable of 1's and 2's into a variable of zeros and ones. This is in a data frame, but if I could do it with vectors, that's all I need. Can someone tell me how? Thanks so much, Jen __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
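For completeness, here is what the recode Jen asked about looks like in base R (the vector x is a made-up example of 1's and 2's):

```r
# Turning a vector of 1's and 2's into 0's and 1's:
x <- c(1, 2, 2, 1, 2)
x - 1                  # arithmetic recode: 1 -> 0, 2 -> 1
ifelse(x == 2, 1, 0)   # equivalent, and clearer about intent
as.numeric(x == 2)     # also equivalent
```

Any of these works on a column of a data frame as well, e.g. `df$ind <- df$x - 1`.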
Re: [R] String manipulation
Dear Luis, How about gsub("[0-9]", "", x) ? This assumes that x contains the character data and not a factor, as would usually be the case in a data frame. If the variable is really a factor, then use as.character(x) in the call to gsub(). I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Luis Ridao Cruz Sent: Thursday, October 20, 2005 8:24 AM To: r-help@stat.math.ethz.ch Subject: [R] String manipulation R-help, I have a data frame which contains a character string column that is something like: II11 II18 II23 III1 III13 III16 III19 III2 III7 IV10 IV11 IV12 IX16 IX4 V12 V18 V2 V20 V23 V4 VII14 VII18 VII21 VII26 VII28 VII33 VII4 VII48 VII5 I want to apply a function (e.g. mean) by grouping according to the roman part of the string, i.e., by I, by V, by VII, ... and so on. I have looked at string manipulation functions (grep, pmatch, ...) but I can't really get it the way I want. Can anyone help? Thanks in advance. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
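Putting John's suggestion to work on Luis's actual goal: strip the digits to get a grouping factor, then aggregate with tapply(). The measurement values y here are invented, since the original post only showed the labels.

```r
# Group means by the roman-numeral prefix of each label.
x <- c("II11", "II18", "III1", "V12", "V18", "VII14")
y <- c(3.1, 2.8, 5.0, 4.2, 4.4, 6.1)   # made-up measurements
grp <- gsub("[0-9]", "", x)            # "II" "II" "III" "V" "V" "VII"
tapply(y, grp, mean)                   # mean of y within each group
```

The same grp vector can also be used with aggregate() or as a factor in a model formula.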
Re: [R] spliting an integer
Dear Sundar and Dimitri, -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sundar Dorai-Raj Sent: Thursday, October 20, 2005 3:50 PM To: Dimitri Szerman Cc: R-Help Subject: Re: [R] spliting an integer Dimitri Szerman wrote: Hi there, From the vector X of integers, X = c(11999, 122000, 81997) I would like to make these two vectors: Z = c(1999, 2000, 1997) Y = c(1, 12, 8) That is, each entry of vector Z receives the four last digits of each entry of X, and Y receives the rest. Any suggestions? Thanks in advance, Dimitri [[alternative HTML version deleted]] Try: X <- c(11999, 122000, 81997) Y <- X %/% 10000 Z <- X - Y * 10000 Or even X %% 10000 [1] 1999 2000 1997 Regards, John See ?Arithmetic for more details. HTH, --sundar __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to import such data to R?
Dear ronggui, I didn't find any attachments, but using the data lines in your message, and assuming that . represents missing data, the following appears to do what you want: as.data.frame(scan(c:/temp/ronggui.txt, list(year=1, apps=1, top25=1, ver500=1, mth500=1, stufac=1, bowl=1, btitle=1, finfour=1, lapps=1, d93=1, avg500=1, cfinfour=1, clapps=1, cstufac=1, cbowl=1, cavg500=1, cbtitle=1, lapps_1=1, school=, ctop25=1, bball=1, cbball=1,), na.strings=.)) See ?scan for details. I hope this helps, John On Sat, 15 Oct 2005 15:57:42 +0800 ronggui [EMAIL PROTECTED] wrote: the data file has such structure: 1992 6245 49 . . 20 1 0 0 8.739536 0 . . . . . . . . alabama . 0 . 1993 7677 58 . . 15 1 0 0 8.945984 1 . 0 .2064476 -5 0 . 0 8.739536 alabama 9 0 0 1992 13327 57 36 58 16 0 0 0 9.497547 0 47 . . . . . 0 . arizona . 0 . 1993 19860 57 36 58 16 1 1 0 9.896463 1 47 0 .3989162 0 1 0 1 9.497547 arizona 0 1 1 1992 10422 37 28 58 20 0 0 0 9.251675 0 43 . . . . . -1 . arizona state . 0 . --snip- the data descriptions is: variable names: year apps top25 ver500mth500stufacbowl btitle finfour lapps d93 avg500cfinfour clappscstufac cbowl cavg500 cbtitle lapps_1 schoolctop25bball cbball Obs: 118 1. year 1992 or 1993 2. apps # applics for admission 3. top25perc frosh class in 25th high sch percen 4. ver500 perc frosh = 500 on verbal SAT 5. mth500 perc frosh = 500 on math SAT 6. stufac student-faculty ratio 7. bowl = 1 if bowl game in prev year 8. btitle = 1 if men's cnf chmps prev year 9. finfour = 1 if men's final 4 prev year 10. lappslog(apps) 11. d93 =1 if year = 1993 12. avg500 (ver500+mth500)/2 13. cfinfour change in finfour 14. clapps change in lapps 15. cstufac change in stufac 16. cbowlchange in bowl 17. cavg500 change in avg500 18. cbtitle change in btitle 19. lapps_1 lapps lagged 20. school university name 21. ctop25 change in top25 22. bball=1 if btitle or finfour 23. 
cbball change in bball so the each four lines represent one case,can some variables are numeric and some are character. I though the scan can read it in ,but it seems somewhat tricky as the mixed type of variables.any suggestions? the attachmen is the raw data and the description of the data. 2005-10-15 -- Deparment of Sociology Fudan University My new mail addres is [EMAIL PROTECTED] Blog:http://sociology.yculblog.com John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
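The idea behind John's scan() call, in miniature: the types of the elements of the 'what' list (a number for numeric fields, "" for character fields) tell scan() how to read each whitespace-separated field of a record, and na.strings maps "." to NA. The file path and the reduced field list here are illustrative, not the poster's full 23-variable file.

```r
# Reading a mixed numeric/character file with scan(), sketched with a
# hypothetical path and only a few of the fields from the post.
dat <- scan("c:/temp/ronggui.txt",
            what = list(year = 1, apps = 1, school = "", ctop25 = 1),
            na.strings = ".")
dat <- as.data.frame(dat)  # list of parallel vectors -> data frame
```

Because scan() cycles through the 'what' list, records may span several physical lines, which is exactly what the four-lines-per-case layout requires.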
Re: [R] asking the user for data
Dear Iain, There's an ask() function in the sm package that does what you want, but you'll have to compose your message properly: ask(paste("Please enter the z value for", x)). Alternatively, eval(parse(text=readline(paste("Please enter the z value for ", x, ": ", sep="")))) will do what you want, as would as.numeric(readline(paste("Please enter the z value for ", x, ": ", sep=""))). I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Iain Gallagher Sent: Sunday, October 16, 2005 7:34 AM To: r-help@stat.math.ethz.ch Subject: [R] asking the user for data Hello everyone. How do I get R to ask users for data to be entered? Specifically I want to ask for a z score to be entered (the user would look this up in a table) and then use the entered data to compute a Dunn's post-hoc test (post kruskal.test). I've tried the ask function but it's not recognised - maybe I don't have the appropriate library installed. A pointer to the right one would be appreciated. e.g. z <- ask(message="Please enter the z value for x") Any help would be gratefully received. Thanks Iain Gallagher Institute for Infection and Immunology Research Edinburgh University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] SEM with dichotomous indicators
Dear Renata, You can do this by combining the sem() and boot.sem() functions in the sem package with hetcor() in the polycor package; boot.sem() computes standard errors for parameter estimates by bootstrapping. See ?boot.sem for an example of a confirmatory factor-analysis model with ordinal indicators. See ?sem for many examples of structural-equation models with latent variables. Just combine the two. Note that a dichotomous indicator is in effect a two-category ordinal variable and may be handled as such. For example, the polychoric correlation between two dichtomous variables is just their tetrachoric correlation. I hope this helps, John On Mon, 10 Oct 2005 11:23:25 -0300 Renata [EMAIL PROTECTED] wrote: Hello, I'd like to know if there is a way to fit a Structural equation model with dichotomous indicators (ex: problem with a phone solved/ or not) having effects on a ordinal variable. How I do that using R? Do you have an example with the code in R that you can send to me? Thanks a lot! Renata Estrella UFRJ, Brasil, Rio de Janeiro Renata Leite Estrella Assistente de Pesquisa (21) 3978- Ramal 8841 www.lipe.com.br [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] zip package
Dear Sara, It looks to me as if there are three problems here: (1) Is the zip file for the package really at c:/ProgramFiles/R/rw2011/library/lpSolve_1.1.9? That is, isn't there a space in Program Files? (2) You have to specify repos=NULL to install from a local zip file, as ?install.packages tells you. (3) You don't seem to be using the destdir argument correctly; you can omit it. Why not avoid all this and just use the R for Windows menus: Packages - Install package(s) from local zip files? I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sara Mouro Sent: Friday, October 14, 2005 4:45 AM To: r-help@stat.math.ethz.ch Subject: [R] zip package Dear all I can not understand how to install the package lpsolve_1.1.9.zip I have read the FAQ and the help pages carefully, but it still not clear for me. I have tried the following (and obtained the respective error messages): install.packages(c:/ProgramFiles/R/rw2011/library/lpSolve_1. 1.9,destdir= c:/ProgramFiles/R/rw2011/library/lpSolve) Mensagem de aviso: no package 'c:/Program Files/R/rw2011/library/lpSolve_1.1.9' at the repositories in: download.packages(pkgs, destdir = tmpd, available = available, install.packages(lpSolve_1.1.9,destdir=c:/ProgramFiles/R/r w2011/libr ary/ lpSolve) Erro em download.packages(pkgs, destdir = tmpd, available = available, : 'destdir' is not a directory install.packages(lpSolve_1.1.9,destdir=NULL) Mensagem de aviso: no package 'lpSolve_1.1.9' at the repositories in: download.packages(pkgs, destdir = tmpd, available = available, Could you please tell me how to do that or when can I find a simple example of that kind of installation? Thank you so much and sorry for such basic question. 
Sara Mouro [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
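What the reply boils down to, as a single call (this assumes the zip file really sits at the path Sara described, with the space in "Program Files" restored; repos = NULL tells install.packages() the argument is a local file rather than a repository package name):

```r
# Installing a package from a local Windows binary zip file.
install.packages("c:/Program Files/R/rw2011/library/lpSolve_1.1.9.zip",
                 repos = NULL)
```

As John notes, the Packages -> Install package(s) from local zip files menu in the Windows GUI does the same thing without any typing.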
Re: [R] R/S-Plus equivalent to Genstat predict: predictions over averages of covariates
Dear Peter, See the effects package, described in http://www.jstatsoft.org/counter.php?id=75&url=v08/i15/effect-displays-revised.pdf. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dunn Sent: Wednesday, October 05, 2005 9:06 PM To: R-help mailing list Subject: [R] R/S-Plus equivalent to Genstat predict: predictions over averages of covariates Hi all I'm doing some things with a colleague comparing different sorts of models. My colleague has fitted a number of glms in Genstat (which I have never used), while the glm I have been using is only available for R. He has a spreadsheet of fitted means from each of his models obtained from using the Genstat predict function. For example, suppose we fit the model of the type glm.out <- glm( y ~ factor(F1) + factor(F2) + X1 + poly(X2,2) + poly(X3,2), family=...) Then he produces a table like this (made up, but similar): F1(level1) 12.2 F1(level2) 14.2 F1(level3) 15.3 F2(level1) 10.3 F2(level2) 9.1 X1=0 10.2 X1=0.5 10.4 X1=1 10.4 X1=1.5 10.5 X1=2 10.9 X1=2.5 11.9 X1=3 11.8 X2=0 12.0 X2=0.5 12.2 X2=1 12.5 X2=1.5 12.9 X2=2 13.0 X2=2.5 13.1 X2=3 13.5 Each of the numbers is a predicted mean. So when X1=0, on average we predict an outcome of 10.2. To obtain these figures in Genstat, he uses the Genstat predict function. When I asked for an explanation of how it was done (i.e., to make the predictions, what values of the other covariates were used) I was told: So, for a one-dimensional table of fitted means for any factor (or variate), all other variates are set to their average values; and the factor constants (including the first, at zero) are given a weighted average depending on their respective numbers of observations. So for quantitative variables (such as pH), one uses the mean pH in the data set when making the predictions.
Reasonable anmd easy. But for categorical variables (like Month), he implies we use a weighted average of the fitted coefficients for all the months, depending on the proportion of times those factor levels appear in the data. (I hope I explained that OK...) Is there an equivalent way in R or S-Plus of doing this? I have to do it for a number of sites and species, so an automated way would be useful. I have tried searching to no avail (but may not be searching on the correct terms), and tried hard-coding something myself as yet unsuccessfully: The poly terms and the use of the weighted averaging over the factor levels are proving a bit too much for my limited skills. Any assistance appreciated. (Any clarification of what I mean can be provided if I have not been clear.) Thanks, as always. P. version _ platform i386-pc-linux-gnu arch i386 os linux-gnu system i386, linux-gnu status major2 minor1.0 year 2005 month04 day 18 language R -- Dr Peter Dunn | Senior Lecturer in Statistics Faculty of Sciences, University of Southern Queensland Web:http://www.sci.usq.edu.au/staff/dunn Email: dunn at usq.edu.au CRICOS: QLD 00244B | NSW 02225M | VIC 02387D | WA 02521C __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
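A base-R sketch of the "predict at the averages of the other covariates" idea, using Peter's placeholder names (glm.out, dat, X1, X2, F1 are all hypothetical here). This fixes the factor at one level rather than averaging over levels weighted by their frequencies; automating that weighted averaging is precisely what the effects package John points to adds.

```r
# Predicted means over a grid of the focal predictor X1, holding X2 at
# its mean and F1 at its first level (a simplification of Genstat's
# weighted averaging over factor levels).
grid <- data.frame(X1 = seq(0, 3, by = 0.5))
grid$X2 <- mean(dat$X2)
grid$F1 <- factor(levels(dat$F1)[1], levels = levels(dat$F1))
predict(glm.out, newdata = grid, type = "response")
```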
Re: [R] data.frame error using sem package
Dear Suzanne, Take a look at your model specification: p.ram [,1] [,2] [,3] [1,] LM1 -> LSMA LM1 -> LSMA NA [2,] LSMA -> RSMA LSMA -> RSMA NA [3,] RSMA -> RM1 RSMA -> RM1 NA [4,] LSMA <-> LSMA LSMA <-> LSMA NA [5,] RSMA <-> RSMA RSMA <-> RSMA NA [6,] RM1 <-> RM1 RM1 <-> RM1 NA This matrix should have three columns, the first giving the path, the second the name of the corresponding parameter, and the third the start value for the parameter (or NA if you want sem() to compute a start value). You've apparently left out the parameter names. Please see the sem examples for details and the paper at http://socserv.socsci.mcmaster.ca/jfox/sem-package.pdf. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Suzanne Witt Sent: Thursday, October 06, 2005 2:55 PM To: r-help@stat.math.ethz.ch Subject: [R] data.frame error using sem package I am trying to use sem to measure effective connectivity among four brain regions. I have pasted the code that I am trying to run since that seems easier than trying to come up with another example. The input data is time series data taken from SPM; they are each 1x121 columns of numbers. I get the error either when I source the whole code, or if I enter it line by line when I go to get the summary. Thanks, Suzanne library(sem) # Load the region timecourses. lsma1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_LSMA.dat") rsma1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_RSMA.dat") lmc1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_LM1.dat") rmc1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_RM1.dat") # Combine all the timecourses from each session into a single data frame and name the columns appropriately.
lcomf <- cbind(lsma1, rsma1, lmc1, rmc1) names(lcomf) <- c("LSMA", "RSMA", "LM1", "RM1") # Type this at the command line to see a summary of your data str(lcomf) # Set up the structural equation model. p.ram <- matrix(c( 'LM1 -> LSMA', 'LM1 -> LSMA', NA, 'LSMA -> RSMA', 'LSMA -> RSMA', NA, 'RSMA -> RM1', 'RSMA -> RM1', NA, 'LSMA <-> LSMA', 'LSMA <-> LSMA', NA, 'RSMA <-> RSMA', 'RSMA <-> RSMA', NA, 'RM1 <-> RM1', 'RM1 <-> RM1', NA), ncol = 3, byrow = TRUE) # Tell which variables are exogenous ('fixed'). p.fixed <- c('LM1') # Do the fitting for session 1. C <- cor(lcomf) nt <- dim(lcomf)[1] #attach(lcomf) lcomf.results <- sem(p.ram, C, nt, obs.variables = rownames(C), fixed.x = p.fixed) # Check out the results using the summary function summary(lcomf.results) [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
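Per John's diagnosis, a corrected version of the specification matrix might look like the following. The parameter names (b1..b3, v1..v3) are invented for illustration, and the arrow forms are assumptions ("->" for directed paths, "<->" for variances, as in the sem package's RAM notation); the third column's NA lets sem() choose start values.

```r
# Path in column 1, a distinct parameter name in column 2, start value
# (or NA) in column 3 -- the three columns sem() expects.
p.ram <- matrix(c(
  'LM1 -> LSMA',   'b1', NA,
  'LSMA -> RSMA',  'b2', NA,
  'RSMA -> RM1',   'b3', NA,
  'LSMA <-> LSMA', 'v1', NA,
  'RSMA <-> RSMA', 'v2', NA,
  'RM1 <-> RM1',   'v3', NA), ncol = 3, byrow = TRUE)
```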
Re: [R] Rcmdr and scatter3d
Dear Ted, I assumed that since Naiara was using scatter3d(), he wants a 3D dynamic scatterplot. He could add points (actually, spheres) to the rgl graph produced by scatter3d() -- the analog of plot() followed by points() for a 2D graph -- but doing so would be much more work than plotting by groups. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: ecatchpole [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 04, 2005 8:55 PM To: John Fox Cc: 'Naiara S. Pinto'; r-help@stat.math.ethz.ch Subject: Re: [R] Rcmdr and scatter3d Niara, Alternatively, instead of scatter3d, the analogy to hold on in Matlab is to use plot() for the first set of data, then points() for the remainder. See ?plot ?points Ted. On 05/10/05 11:18, John Fox wrote,: Dear Naiara, Combine the data sets and differentiate among them with a factor. Then use the groups argument to scatter3d (see ?scatter3d). If you're using the R Commander to make the plot, the 3D scatterplot dialog box as a plot by groups button. You can also fit colour-coded regression surfaces by group. I've appended a new version of the scatter3d function, not yet in the Rcmdr package, which will also plot data ellipsoids (for the whole data set or by groups). I hope this helps, John --- snip -- John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Naiara S. Pinto Sent: Tuesday, October 04, 2005 6:13 PM To: r-help@stat.math.ethz.ch Subject: [R] Rcmdr and scatter3d Hi folks, I'd like to use scatter3d (which is in R commander) to plot more than one dataset in the same graph, each dataset with a different color. The kind of stuff you would do with holdon in Matlab. 
I read a recent message that was posted to this list with a similar problem, but I couldn't understand the reply. Could someone give me one example? How do you plot subgroups using scatter3d? Thanks a lot! Naiara. Naiara S. Pinto Ecology, Evolution and Behavior 1 University Station A6700 Austin, TX, 78712 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Dr E.A. Catchpole Visiting Fellow Univ of New South Wales at ADFA, Canberra, Australia and University of Kent, Canterbury, England - www.ma.adfa.edu.au/~eac - fax: +61 2 6268 8786 - ph: +61 2 6268 8895 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
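Ted's 2D analogue of Matlab's "hold on", sketched with made-up data: draw the first group with plot(), then overlay each additional group with points(), setting the axis limits up front so nothing falls outside the plot region.

```r
# Two groups on one scatterplot, each in its own colour.
set.seed(1)
g1 <- data.frame(x = rnorm(20), y = rnorm(20))
g2 <- data.frame(x = rnorm(20, mean = 2), y = rnorm(20, mean = 2))
plot(g1$x, g1$y, col = "blue",
     xlim = range(g1$x, g2$x), ylim = range(g1$y, g2$y),
     xlab = "x", ylab = "y")
points(g2$x, g2$y, col = "red")   # overlay, like Matlab's hold on
```

In scatter3d() itself, as John explains, the same effect comes from combining the data sets and using the groups argument rather than plotting twice.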
Re: [R] testing non-linear component in mgcv:gam
Dear Denis, Take a closer look at the anova table: The models provide identical fits to the data. The differences in degrees of freedom and deviance between the two models are essentially zero, 5.5554e-10 and 2.353e-11 respectively. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot Sent: Wednesday, October 05, 2005 8:22 AM To: r-help@stat.math.ethz.ch Subject: [R] testing non-linear component in mgcv:gam Hi, I need further help with my GAMs. Most models I test are very obviously non-linear. Yet, to be on the safe side, I report the significance of the smooth (default output of mgcv's summary.gam) and confirm it deviates significantly from linearity. I do the latter by fitting a second model where the same predictor is entered without the s(), and then use anova.gam to compare the two. I thought this was the equivalent of the default output of anova.gam using package gam instead of mgcv. I wonder if this procedure is correct because one of my models appears to be linear. In fact mgcv estimates df to be exactly 1.0 so I could have stopped there. However I inadvertently repeated the procedure outlined above. I would have thought in this case the anova.gam comparing the smooth and the linear fit would for sure have been not significant. To my surprise, P was 6.18e-09! Am I doing something wrong when I attempt to confirm the non- parametric part a smoother is significant? 
Here is my example case where the relationship does appear to be linear: library(mgcv) This is mgcv 1.3-7 Temp - c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 0.38, 0.62, 0.88, 1.12, 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12, 3.38, 3.62, 3.88, 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62, 5.88, 6.12, 6.38, 6.62, 6.88, 7.12, 8.38, 13.62) N.sets - c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 29, 31, 22, 26, 24, 23, 15, 25, 24, 27, 19, 26, 24, 22, 13, 10, 2, 5, 3, 1, 1, 1, 1, 1) wm.sed - c(0.0, 0.016129032, 0.0, 0.062046512, 0.396459596, 0.189082949, 0.054757925, 0.142810440, 0.168005168, 0.180804428, 0.111439628, 0.128799505, 0.193707937, 0.105921610, 0.103497845, 0.028591837, 0.217894389, 0.020535469, 0.080389068, 0.105234450, 0.070213450, 0.050771363, 0.042074434, 0.102348837, 0.049748344, 0.019100478, 0.005203125, 0.101711864, 0.0, 0.0, 0.014808824, 0.0, 0.22200, 0.16700, 0.0, 0.0, 0.0) sed.gam - gam(wm.sed~s(Temp),weight=N.sets) summary.gam(sed.gam) Family: gaussian Link function: identity Formula: wm.sed ~ s(Temp) Parametric coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 0.084030.01347 6.241 3.73e-07 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Approximate significance of smooth terms: edf Est.rank F p-value s(Temp) 11 13.95 0.000666 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 R-sq.(adj) = 0.554 Deviance explained = 28.5% GCV score = 0.09904 Scale est. = 0.093686 n = 37 # testing non-linear contribution sed.lin - gam(wm.sed~Temp,weight=N.sets) summary.gam(sed.lin) Family: gaussian Link function: identity Formula: wm.sed ~ Temp Parametric coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 0.162879 0.019847 8.207 1.14e-09 *** Temp-0.023792 0.006369 -3.736 0.000666 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 R-sq.(adj) = 0.554 Deviance explained = 28.5% GCV score = 0.09904 Scale est. 
= 0.093686 n = 37 anova.gam(sed.lin, sed.gam, test=F) Analysis of Deviance Table Model 1: wm.sed ~ Temp Model 2: wm.sed ~ s(Temp) Resid. Df Resid. Dev Df Deviance F Pr(F) 1 3.5000e+01 3.279 2 3.5000e+01 3.279 5.5554e-10 2.353e-11 0.4521 6.18e-09 *** Thanks in advance, Denis Chabot __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] transparent surface in rgl
Dear Paul, I don't have experience with rgl.postscript(), which is relatively new, but find that the png graphs produced by rgl.snapshot() are of reasonably good quality and preserve transparency. Perhaps the developers of the rgl package can shed more light on the matter. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Prof. Paul R. Fisher Sent: Wednesday, October 05, 2005 8:32 AM To: r-help@stat.math.ethz.ch Subject: [R] transparent surface in rgl Hi all I am a complete newbie to this list (just subscribed) and a newcomer to R (an S user from olden times). I have been using scatter3d to create a 3d scatter plot with surface. The graphic is created within the rgl package and I have used rgl.postscript to export it so I can generate a publication quality image. My problem is that the plotted surface is no longer transparent in the postscript output ie. the rgl.spheres that are behind the surface disappear in the postscript image. Can't seem to find any info on this anywhere. Am I doing something wrong? Is there an easy fix? Anyway, thanks. Hope I've not broken some netiquette rule sending this. Cheers, Paul Fisher. -- Prof. Paul R. Fisher, Chair in Microbiology, La Trobe University, VIC 3086, AUSTRALIA. Tel. + 61 3 9479 2229 Fax. + 61 3 9479 1222 Email. [EMAIL PROTECTED] Web. http://www.latrobe.edu.au/mcbg/my.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] testing non-linear component in mgcv:gam
Dear Denis, The chi-square test is formed in analogy to what's done for a GLM: The difference in residual deviance for the nested models is divided by the estimated scale parameter -- i.e., the estimated error variance for a model with normal errors. Otherwise, as you point out, the test would be dependent upon the scale of the response. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot Sent: Wednesday, October 05, 2005 9:04 AM To: John Fox Cc: R list Subject: Re: [R] testing non-linear component in mgcv:gam Hi John, Le 05-10-05 à 09:45, John Fox a écrit : Dear Denis, Take a closer look at the anova table: The models provide identical fits to the data. The differences in degrees of freedom and deviance between the two models are essentially zero, 5.5554e-10 and 2.353e-11 respectively. I hope this helps, John This is one of my difficulties. In some examples I found on the web, the difference in deviance is compared directly against the chi- squared distribution. But my y variable has a very small range (between 0 and 0.5, most of the time) so the difference in deviance is always very small and if I compared it against the chi-squared distribution as I have seen done in examples, the non-linear component would always be not significant. Yet it is (with one exception), tested with both mgcv:gam and gam:gam. I think the examples I have read were wrong in this regard, the scale factor seen in mgcv output seems to intervene. But exactly how is still mysterious to me and I hesitate to judge the size of the deviance difference myself. I agree it is near zero in my example. I guess I need to have more experience with these models to better interpret the output... 
Denis -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot Sent: Wednesday, October 05, 2005 8:22 AM To: r-help@stat.math.ethz.ch Subject: [R] testing non-linear component in mgcv:gam Hi, I need further help with my GAMs. Most models I test are very obviously non-linear. Yet, to be on the safe side, I report the significance of the smooth (default output of mgcv's summary.gam) and confirm it deviates significantly from linearity. I do the latter by fitting a second model where the same predictor is entered without the s(), and then use anova.gam to compare the two. I thought this was the equivalent of the default output of anova.gam using package gam instead of mgcv. I wonder if this procedure is correct because one of my models appears to be linear. In fact mgcv estimates df to be exactly 1.0 so I could have stopped there. However I inadvertently repeated the procedure outlined above. I would have thought in this case the anova.gam comparing the smooth and the linear fit would for sure have been not significant. To my surprise, P was 6.18e-09! Am I doing something wrong when I attempt to confirm the non- parametric part a smoother is significant? 
Here is my example case where the relationship does appear to be linear: library(mgcv) This is mgcv 1.3-7 Temp - c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 0.38, 0.62, 0.88, 1.12, 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12, 3.38, 3.62, 3.88, 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62, 5.88, 6.12, 6.38, 6.62, 6.88, 7.12, 8.38, 13.62) N.sets - c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 29, 31, 22, 26, 24, 23, 15, 25, 24, 27, 19, 26, 24, 22, 13, 10, 2, 5, 3, 1, 1, 1, 1, 1) wm.sed - c(0.0, 0.016129032, 0.0, 0.062046512, 0.396459596, 0.189082949, 0.054757925, 0.142810440, 0.168005168, 0.180804428, 0.111439628, 0.128799505, 0.193707937, 0.105921610, 0.103497845, 0.028591837, 0.217894389, 0.020535469, 0.080389068, 0.105234450, 0.070213450, 0.050771363, 0.042074434, 0.102348837, 0.049748344, 0.019100478, 0.005203125, 0.101711864, 0.0, 0.0, 0.014808824, 0.0, 0.22200, 0.16700, 0.0, 0.0, 0.0) sed.gam - gam(wm.sed~s(Temp),weight=N.sets) summary.gam(sed.gam) Family: gaussian Link function: identity Formula: wm.sed ~ s(Temp) Parametric coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 0.084030.01347 6.241 3.73e-07 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Approximate significance of smooth terms: edf Est.rank F p-value s(Temp) 11 13.95 0.000666
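The analogy John describes can be checked directly against the numbers in Denis's anova table: the deviance difference divided by the degrees-of-freedom difference, scaled by the estimated error variance, reproduces the reported F statistic, so the test does not depend on the scale of y.

```r
# F statistic assembled from the quantities in the anova/summary output.
d.dev <- 2.353e-11    # change in deviance between the two models
d.df  <- 5.5554e-10   # change in (effective) degrees of freedom
scale <- 0.093686     # Scale est. from the gam summary
(d.dev / d.df) / scale   # about 0.452, matching F = 0.4521 in the table
```

The apparently significant p-value is thus an artifact of dividing two essentially-zero quantities; the fits really are identical, as John says.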
Re: [R] testing non-linear component in mgcv:gam
Dear Denis, You got me: I would have thought from ?summary.gam that this would be the same as the adjusted R^2 for a linear model. Note, however, that the percentage of deviance explained checks with the R^2 from the linear model, as expected. Maybe you should address this question to the package author. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: Denis Chabot [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 05, 2005 3:33 PM To: John Fox Cc: R list Subject: Re: [R] testing non-linear component in mgcv:gam Thank you everyone for your help, but my introduction to GAM is turning my brain to mush. I thought the one part of the output I understood the best was r-sq (adj), but now even this is becoming foggy. In my original message I mentioned a gam fit that turns out to be a linear fit. By curiosity I analysed it with a linear predictor only with mgcv package, and then as a linear model. The output was identical in both, but the r-sq (adj) was 0.55 in mgcv and 0.26 in lm. In lm I hope that my interpretation that 26% of the variance in y is explained by the linear relationship with x is valid. Then what does r2 mean in mgcv? Denis summary.gam(lin) Family: gaussian Link function: identity Formula: wm.sed ~ Temp Parametric coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 0.162879 0.019847 8.207 1.14e-09 *** Temp-0.023792 0.006369 -3.736 0.000666 *** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 R-sq.(adj) = 0.554 Deviance explained = 28.5% GCV score = 0.09904 Scale est. = 0.093686 n = 37 summary(sed.true.lin) Call: lm(formula = wm.sed ~ Temp, weights = N.sets) Residuals: Min 1Q Median 3Q Max -0.6138 -0.1312 -0.0325 0.1089 1.1449 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 0.162879 0.019847 8.207 1.14e-09 *** Temp-0.023792 0.006369 -3.736 0.000666 *** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 
0.1 1 Residual standard error: 0.3061 on 35 degrees of freedom Multiple R-Squared: 0.285,Adjusted R-squared: 0.2646 F-statistic: 13.95 on 1 and 35 DF, p-value: 0.000666 Le 05-10-05 à 09:45, John Fox a écrit : Dear Denis, Take a closer look at the anova table: The models provide identical fits to the data. The differences in degrees of freedom and deviance between the two models are essentially zero, 5.5554e-10 and 2.353e-11 respectively. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot Sent: Wednesday, October 05, 2005 8:22 AM To: r-help@stat.math.ethz.ch Subject: [R] testing non-linear component in mgcv:gam Hi, I need further help with my GAMs. Most models I test are very obviously non-linear. Yet, to be on the safe side, I report the significance of the smooth (default output of mgcv's summary.gam) and confirm it deviates significantly from linearity. I do the latter by fitting a second model where the same predictor is entered without the s(), and then use anova.gam to compare the two. I thought this was the equivalent of the default output of anova.gam using package gam instead of mgcv. I wonder if this procedure is correct because one of my models appears to be linear. In fact mgcv estimates df to be exactly 1.0 so I could have stopped there. However I inadvertently repeated the procedure outlined above. I would have thought in this case the anova.gam comparing the smooth and the linear fit would for sure have been not significant. To my surprise, P was 6.18e-09! Am I doing something wrong when I attempt to confirm the non- parametric part a smoother is significant? 
Here is my example case where the relationship does appear to be linear: library(mgcv) This is mgcv 1.3-7 Temp - c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 0.38, 0.62, 0.88, 1.12, 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12, 3.38, 3.62, 3.88, 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62, 5.88, 6.12, 6.38, 6.62, 6.88, 7.12, 8.38, 13.62) N.sets - c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 29, 31, 22, 26, 24, 23, 15, 25, 24, 27, 19, 26, 24, 22, 13, 10, 2, 5, 3, 1, 1, 1, 1, 1) wm.sed - c(0.0, 0.016129032, 0.0, 0.062046512, 0.396459596, 0.189082949
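One source of the R-squared discrepancy Denis describes is that summary.lm() computes R^2 from weighted sums of squares when weights are supplied. A hedged hand check, assuming the fitted object sed.true.lin and the data vectors from the messages above:

```r
w   <- N.sets
rss <- sum(w * residuals(sed.true.lin)^2)
tss <- sum(w * (wm.sed - weighted.mean(wm.sed, w))^2)
1 - rss/tss                      # should agree with the printed Multiple R-Squared
summary(sed.true.lin)$r.squared  # the value summary.lm() reports
```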
Re: [R] Rcmdr and scatter3d
(predict(mod, newdata=dat), grid.lines, grid.lines) if (fill) rgl.surface(vals, vals, yhat, color=surface.col[j], alpha=0.5, lit=FALSE) if (grid) rgl.surface(vals, vals, yhat, color=if (fill) grid.col else surface.col[j], alpha=0.5, lit=FALSE, front=lines, back=lines) rgl.texts(0, predict(mod, newdata=data.frame(x=0, z=0, groups=group)), 0, paste(group, ), adj=1, color=surface.col[j]) if (residuals){ yy - y[select.obs] xx - x[select.obs] zz - z[select.obs] fitted - fitted(mod) rgl.lines(as.vector(rbind(xx,xx)), as.vector(rbind(yy,fitted)), as.vector(rbind(zz,zz)), col=surface.col[j]) } } } } } } if (revolutions 0) { for (i in 1:revolutions){ for (angle in seq(1, 360, length=360/speed)) rgl.viewpoint(-angle, fov=fov) } } if (model.summary) return(summaries) else return(invisible(NULL)) } John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Naiara S. Pinto Sent: Tuesday, October 04, 2005 6:13 PM To: r-help@stat.math.ethz.ch Subject: [R] Rcmdr and scatter3d Hi folks, I'd like to use scatter3d (which is in R commander) to plot more than one dataset in the same graph, each dataset with a different color. The kind of stuff you would do with holdon in Matlab. I read a recent message that was posted to this list with a similar problem, but I couldn't understand the reply. Could someone give me one example? How do you plot subgroups using scatter3d? Thanks a lot! Naiara. Naiara S. Pinto Ecology, Evolution and Behavior 1 University Station A6700 Austin, TX, 78712 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
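For a quick version of what Naiara asks, without modifying scatter3d() at all, two groups can be overlaid directly with rgl (hypothetical data; plot3d() and points3d() are standard rgl functions):

```r
library(rgl)
g1 <- matrix(rnorm(30), ncol = 3)            # first group (made-up data)
g2 <- matrix(rnorm(30, mean = 3), ncol = 3)  # second group (made-up data)
plot3d(g1, col = "blue", size = 5)           # opens the rgl device
points3d(g2, col = "red", size = 5)          # added to the same device
```

This is the rgl analogue of Matlab's hold on: successive low-level calls draw into the existing scene.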
Re: [R] plot question when type = b and pch is a vector
Dear tokas, How about:

x <- seq(0.01, 10, length = 20)
xx <- x[7]
x[7] <- NA
plot(c(0, 10), c(-20, 20), type = "n", xlab = "x",
     ylab = expression(2 * alpha * log(x)))
for(i in 1:4){
    lines(x, 2 * i * log(x), lty = 1)
    text(xx, 2 * i * log(xx), i)
}

I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of toka tokas Sent: Sunday, October 02, 2005 5:37 AM To: r-help@stat.math.ethz.ch Subject: [R] plot question when type = b and pch is a vector Dear R users, I've been struggling some days with the following problem: I'm interested in producing the following plot

x <- seq(0.01, 10, length = 20)
plot(c(0, 10), c(-20, 20), type = "n", xlab = "x",
     ylab = expression(2 * alpha * log(x)))
pch. <- rep(NA, length(x))
for(i in 1:4){
    pch.[7] <- as.character(i)
    lines(x, 2 * i * log(x), type = "b", pch = pch., lty = 1)
}

where all the line segments are connected, except from the 7th one where I've put the value of alpha -- in other words I'd like to produce a line plot where the label appears at each line with some white space around it. thanks in advance, tokas __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] generalized linear model for multinomial data?
Dear Hongyu, See multinom() in the nnet package, associated with Venables and Ripley's Modern Applied Statistics with S. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Hongyu Sun Sent: Sunday, October 02, 2005 3:07 PM To: r-help@stat.math.ethz.ch Subject: [R] generalized linear model for multinomial data? Dear All: Does R have the package as in SAS's generalized logits model for nominal response data? I have searched but cannot find the windows package. Many thanks, HS __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
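A minimal multinom() sketch with simulated data (the variable names are made up for illustration):

```r
library(nnet)
set.seed(1)
d <- data.frame(y = factor(sample(c("a", "b", "c"), 100, replace = TRUE)),
                x = rnorm(100))
fit <- multinom(y ~ x, data = d, trace = FALSE)  # generalized logit model
summary(fit)  # coefficients are log-odds relative to the baseline level
```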
Re: [R] Are least-squares means useful or appropriate?
Dear Peter, Doug, and Felipe, My effects package (on CRAN, also see the article at http://www.jstatsoft.org/counter.php?id=75url=v08/i15/effect-displays-revis ed.pdf) will compute and graph adjusted effects of various kinds for linear and generalized linear models -- generalizing so-called least-squares means (or population marginal means or adjusted means). A couple of comments: By default, the all.effects() function in the effects package computes effects for high-order terms in the model, absorbing terms marginal to them. You can ask the effect() function to compute an effect for a term that's marginal to a higher-order term, and it will do so with a warning, but this is rarely sensible. Peter's mention of floating variances (or quasi-variances) in this context is interesting, but what would most like to see, I think, are the quasi-variances for the adjusted effects, that is for terms merged with their lower-order relatives. These, for example, are unaffected by contrast coding. How to define reasonable quasi-variances in this context has been puzzling me for a while. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard Sent: Friday, September 23, 2005 10:23 AM To: Douglas Bates Cc: Felipe; R-help@stat.math.ethz.ch Subject: Re: [R] Are least-squares means useful or appropriate? Douglas Bates [EMAIL PROTECTED] writes: On 9/20/05, Felipe [EMAIL PROTECTED] wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi. My question was just theoric. I was wondering if someone who were using SAS and R could give me their opinion on the topic. I was trying to use least-squares means for comparison in R, but then I found some indications against them, and I wanted to know if they had good basis (as I told earlier, they were not much detailed). Greetings. 
Felipe As Deepayan said in his reply, the concept of least squares means is associated with SAS and is not generally part of the theory of linear models in statistics. My vague understanding of these (I too am not a SAS user) is that they are an attempt to estimate the mean response for a particular level of a factor in a model in which that factor has a non-ignorable interaction with another factor. There is no clearly acceptable definition of such a thing. (PD goes and fetches the SAS manual) Well, yes. it'll do that too, although only if you ask for the lsmeans of A when an interaction like A*B is present in the model. This is related to the tests of main effects when an interaction is present using type III sums of squares, which has been beaten to death repeatedly on the list. In both cases, there seems to be an implicit assumption that categorical variables by nature comes from an underlying fully balanced design. If the interaction is absent from the model, the lsmeans are somewhat more sensible in that they at least reproduce the parameter estimates as contrasts between different groups. All continuous variables in the design will be set to their mean, but values for categorical design variables are weighted inversely as the number of groups. So if you're doing an lsmeans of lung function by smoking adjusted for age and sex you get estimates for the mean of a population of which everyone has the same age and half are male and half are female. This makes some sense, but if you do it for sex adjusting for smoking and age, you are not only forcing the sexes to smoke equally much, but actually adjusting to smoking rates of 50%, which could be quite far from reality. The whole operation really seems to revolve around 2 things: (1) pairwise comparisons between factor levels. This can alternatively be done fairly easily using parameter estimates for the relevant variable and associated covariances. 
You don't really need all the mumbo-jumbo of adjusting to particular values of other variables. (2) plotting effects of a factor with error bars as if they were simple group means. This has some merit since the standard parametrizations are misleading at times (e.g. if you choose the group with the least data as the reference level, std. err. for the other groups will seem high). However, it seems to me that concepts like floating variances (see float() in the Epi package) are more to the point. R is an interactive language where it is a simple matter to fit a series of models and base your analysis on a model that is appropriate. An approach of give me the answer to any possible question about this model, whether or not it make sense is unnecessary. In many ways statistical
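The adjusted effects John describes can be sketched with the effects package; this example assumes the Duncan occupational-prestige data shipped with the car package:

```r
library(car)      # for the Duncan data set
library(effects)
mod <- lm(prestige ~ type + income + education, data = Duncan)
eff <- effect("type", mod)  # adjusted ("least-squares") means for the factor
summary(eff)  # adjusted means with confidence limits
plot(eff)     # effect display with error bars
```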
Re: [R] text editor TinR?
Dear Mihai, It's Tinn-R (with two n's), at http://www.sciviews.org/Tinn-R/. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mihai Nica Sent: Monday, September 19, 2005 4:07 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] text editor TinR? Greetings, Please help me remember the name of the tiny text editor that works with R TinR maybe? I cannot find it at all, and cannot remember it, it is really frustrating... Thanks, Mihai Nica, ABD Jackson State University 170 East Griffith Street G5 Jackson, MS 39201 601 914 0361 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Rcommander and simple chisquare
Dear Christian, From the Rcmdr menus, select Statistics - Summaries - Frequency distributions, and check the Chisquare goodness of fit test box in the resulting dialog. This will bring up a dialog box where you can enter hypothesized probabilities from which expected frequencies will be calculated. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Christian Jost Sent: Thursday, September 15, 2005 5:40 AM To: r-help@stat.math.ethz.ch Subject: [R] Rcommander and simple chisquare In this years biostat teaching I will include Rcommander (it indeed simplifies syntax problems that makes students frequently miss the core statistical problems). But I could not find how to make a simple chisquare comparison between observed frequencies and expected frequencies (eg in genetics where you expect phenotypic frequencies corresponding to 3:1 in standard dominant/recessif alleles). Any idea where this feature might be hidden? Or could it be added to Rcommander? Thanks, Christian. ps: in case I am not making myself clear, can Rcommander be made to perform chisq.test(c(61,39),p=c(0.75,0.25)) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Rcommander and simple chisquare
Dear Christian, The Rcmdr assumes that you have a data frame with the original data, in which the variable in question is a factor. The frequency distribution is constructed for the factor. Thus, in your example, you'd have 100 observations classified on a two-category factor. What you enter directly are the hypothesized probabilities. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: Christian Jost [mailto:[EMAIL PROTECTED] Sent: Thursday, September 15, 2005 11:38 AM To: John Fox; 'Philippe Grosjean' Cc: r-help@stat.math.ethz.ch Subject: RE: [R] Rcommander and simple chisquare Dear John and Philippe, thanks for your replys, I finally found this menu, but I am somewhat at a loss how I should enter the observed frequencies. To take my example below, If I enter a one-column data.frame with the numbers 61 and 39, John's indicated menu is not highlighted. If I add a second column containing some factor, the menu is highlighted by I cannot select the first column. However, if I edit the data and declare the first column to be of type 'character' I can select it in the menu dialog and declare the expected frequencies, but the chisquare output doesn't make any sense. For the moment I cannot make any sense of that :-( Any help most appreciated, or a link to the tutorial/faq that explains such kind of problems. Thanks, Christian. At 11:31 -0400 15/09/05, John Fox wrote: Dear Philippe, This does a chi-square test of independence in a contingency table, not a chi-square goodness-of-fit test (which is done in the Rcmdr via Statistics - Summaries - Frequency distribution). 
Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Philippe Grosjean Sent: Thursday, September 15, 2005 7:32 AM To: Christian Jost Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Rcommander and simple chisquare Hello, Just look at Statistics - Contingency tables. There is an option for making the chi square test there. Best, Philippe Grosjean, ..°})) ) ) ) ) ) ( ( ( ( (Prof. Philippe Grosjean .. Christian Jost wrote: In this years biostat teaching I will include Rcommander (it indeed simplifies syntax problems that makes students frequently miss the core statistical problems). But I could not find how to make a simple chisquare comparison between observed frequencies and expected frequencies (eg in genetics where you expect phenotypic frequencies corresponding to 3:1 in standard dominant/recessif alleles). Any idea where this feature might be hidden? Or could it be added to Rcommander? Thanks, Christian. ps: in case I am not making myself clear, can Rcommander be made to perform chisq.test(c(61,39),p=c(0.75,0.25)) __ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
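John's point (the Rcmdr wants raw cases in a factor, not a column of counts) can be sketched at the command line; the data frame below reproduces Christian's 61:39 example:

```r
pheno <- data.frame(outcome = factor(rep(c("dominant", "recessive"),
                                         c(61, 39))))
table(pheno$outcome)  # the frequency distribution the Rcmdr tabulates
chisq.test(table(pheno$outcome), p = c(0.75, 0.25))  # goodness-of-fit test
```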
Re: [R] Collinearity Diagnostics
Dear Sundar and Antoine, In addition, the vif function in the car package will calculate generalized variance inflation factors. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sundar Dorai-Raj Sent: Tuesday, September 13, 2005 5:16 AM To: Antoine de Bary Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Collinearity Diagnostics Antoine de Bary wrote: Hi, and thanks for your help. In order to do collinearity analysis I downloaded the perturb package. I ran a lm (regression) and then the "calldiag" command on it to get condition numbers, but I get the following message: the variable XY with mode "numeric" was not found (it does the same with all predictors although all variables are numeric and exist). Can anyone tell me how I can get around this problem? Is there another way to obtain "condition numbers"? What about VIF? Please return message to: [EMAIL PROTECTED]

I cannot comment on the perturb package. However, for condition numbers see ?kappa.lm, and for variance inflation factors see ?vif. The latter is in the Design package.

set.seed(1)
x1 <- rnorm(100)
x2 <- x1 + 0.1 * rnorm(100)
y <- rnorm(100)
f <- lm(y ~ x1 + x2)
vif(f)
kappa(f)

HTH, --sundar

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
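A hedged illustration of John's suggestion, reusing Sundar's simulated collinear predictors with vif() from the car package instead of Design:

```r
library(car)
set.seed(1)
x1 <- rnorm(100)
x2 <- x1 + 0.1 * rnorm(100)  # nearly collinear with x1 by construction
y  <- rnorm(100)
vif(lm(y ~ x1 + x2))  # both VIFs are large, flagging the collinearity
```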
Re: [R] is library loaded
Dear Omar, The following function tests whether a package is in the search path (with the package name given in quotes):

packageLoaded <- function(name)
    0 != length(grep(paste("^package:", name, "$", sep = ""), search()))

I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Omar Lakkis Sent: Tuesday, September 13, 2005 2:09 PM To: r-help@stat.math.ethz.ch Subject: [R] is library loaded Is there a way to test if a library has been loaded? is.loaded does not give me what I want. I am looking to say:

if loaded(Rdbi) dbSendQuery(conn, q)
if loaded(RODBC) sqlQuery(conn, q)

I need this to support both unix and windows platforms as I could not find a windows distribution for RdbiPgSQL. I am using R 2.1.0 and postgresql. I will be connecting to the database using Rdbi and RdbiPgSQL and have other developers using windows connect with RODBC, unless someone can suggest a better solution. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] is library loaded
Dear Robert, packageLoaded() may well be a bad name but loadedNamespaces() won't detect a package without a namespace. It therefore seemed safe to me to check the path, which would include both packages with and without namespaces. With respect to loading and attaching, I thought that library() both loaded a package (with or without a namespace) and attached it to the search path, but I must admit that I'm easily confused about these distinctions. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: Robert Gentleman [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 13, 2005 3:53 PM To: John Fox Cc: [EMAIL PROTECTED]; r-help@stat.math.ethz.ch Subject: Re: [R] is library loaded Hi, Almost surely this is a bad name. With the advent of name spaces it is important to distinguish between loading and attaching. This function tests for attached packages. To test for loaded packages we already have loadedNamespaces. Best wishes, Robert John Fox wrote: Dear Omar, The following function tests whether a package is in the search path (with the package name given in quotes): packageLoaded - function(name) 0 != length(grep(paste(^package:, name, $, sep=), search())) I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Omar Lakkis Sent: Tuesday, September 13, 2005 2:09 PM To: r-help@stat.math.ethz.ch Subject: [R] is library loaded Is there a way to test if a library has been loaded? is.loaded does not give me what I want, I am looking to say: if loaded(Rdbi) dbSendQuery(conn, q) if loaded(RODBC) sqlQuery(conn, q) I need this to support both unix and windows platforms as I could not find a windows distribution for RdbiPgSQL. I am using R 2.1.0 and postgresql. 
I will be connecting to the database using Rdbi and RdbiPgSQL and have other developers using windows connect with RODBC, unless someone can suggest a better solution. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Robert Gentleman, PhD Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M2-B876 PO Box 19024 Seattle, Washington 98109-1024 206-667-7700 [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
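The attached/loaded distinction Robert draws can also be checked with two one-liners (both functions are in base R):

```r
"package:tcltk" %in% search()    # TRUE if the package is attached
"tcltk" %in% loadedNamespaces()  # TRUE if its namespace is loaded
```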
[R] [R-pkgs] Internationalized version of the Rcmdr package
Dear R-packages list members, I've recently uploaded to CRAN version 1.1-1 of the Rcmdr (R Commander) package. Based on the tcltk package, the Rcmdr provides a basic-statistics graphical interface to R. The general functionality of the R Commander has not changed much since version 1.0-0. There are some small improvements to the package, but the reason that I'm re-announcing it is that the latest version supports translation into other languages, using the new internationalization and localization features of R 2.1.x. A Catalan translation (courtesy of Manel Salamero), a French translation (courtesy of Philippe Grosjean), a Japanese translations (courtesy of Takaharu Araki), and a Slovenian translation (courtesy of Jaro Lajovic), are provided with the Rcmdr package. I understand that Portuguese, Russian, and Spanish translations are in the works, and these will be incorporated when they are completed. If you're interested in translating the Rcmdr into another language, please contact me. As before, bug reports, suggestions, and other feedback are appreciated. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox ___ R-packages mailing list [EMAIL PROTECTED] https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Plotting an ellipse in 3D
=data.frame(x=0, z=0, groups=group)), 0, paste(group, ), adj=1, color=surface.col[j]) if (residuals){ yy - y[select.obs] xx - x[select.obs] zz - z[select.obs] fitted - fitted(mod) rgl.lines(as.vector(rbind(xx,xx)), as.vector(rbind(yy,fitted)), as.vector(rbind(zz,zz)), col=surface.col[j]) } } } } } } if (revolutions 0) { for (i in 1:revolutions){ for (angle in seq(1, 360, length=360/speed)) rgl.viewpoint(-angle, fov=fov) } } if (model.summary) return(summaries) else return(invisible(NULL)) } John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Duncan Murdoch Sent: Friday, September 09, 2005 8:03 AM To: Mike White Cc: R-help@stat.math.ethz.ch Subject: Re: [R] Plotting an ellipse in 3D Mike White wrote: I have been using the ellipse function from the car package and the covariance matrix to draw an ellipse around a group of points to show the confidence limits. However, the points are actually represented by 3 variables so rather than plot each pairwise combination of variables in 2D I would like to plot the 'ellipse' in 3D using the djmrgl package. Can anyone offer advice on how I can plot the surface of a 3D 'ellipse' using the covariance matrix to define the shape, so that the points inside can also be seen. You should use rgl, rather than djmrgl. It now has most of the same functions plus a lot more. Then you can plot the ellipse as a wireframe or transparent object. See the demo(regression) example for that kind of drawing; demo(shapes3d) for ellipses. (The demo names are from memory, I don't have access to it right now.) Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
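One hedged way to draw the surface Mike asks about, assuming a version of rgl that provides ellipse3d() (made-up data; alpha sets the transparency so the interior points remain visible):

```r
library(rgl)
X <- matrix(rnorm(300), ncol = 3) %*% diag(c(1, 2, 3))  # fake 3-variable data
plot3d(X)                                               # the points
plot3d(ellipse3d(cov(X), centre = colMeans(X), level = 0.95),
       col = "steelblue", alpha = 0.3, add = TRUE)      # translucent ellipsoid
```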
Re: [R] tcltk, X11 protocol error: Bug?
Dear Nicholas, This problem has been reported before (enter "X11 protocol error" on the R site search at http://finzi.psych.upenn.edu/search.html to see the previous threads), but as far as I know, there's no definitive explanation or solution. As well, things appear to work fine, despite the warnings. The way I handle the problem in the Rcmdr package is simply to intercept the warnings. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nicholas Lewin-Koh Sent: Monday, September 05, 2005 5:16 PM To: [EMAIL PROTECTED] Subject: [R] tcltk, X11 protocol error: Bug? Hi, I am having trouble debugging this one. The code is attached below, but it seems to be a problem at the C-tk interface. If I run this once there are no problems; if I run it more than once, I start to get warnings that increase in multiples of 11 every time I run it. Here is a sample session:

source("clrramp2.r")
Loading required package: tcltk
Loading Tcl/Tk interface ...
done
> clrRamp()
> tt <- clrRamp()
> tt
function (n)
{
    x <- ramp(seq(0, 1, length = n))
    rgb(x[, 1], x[, 2], x[, 3], max = 255)
}
<environment: 0x8b8674c>
> image(matrix(1:10), col = tt(10))
> tt <- clrRamp()
There were 22 warnings (use warnings() to see them)
> image(matrix(1:10), col = tt(10))
There were 11 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: X11 protocol error: BadWindow (invalid Window parameter)
2: X11 protocol error: BadWindow (invalid Window parameter)
3: X11 protocol error: BadWindow (invalid Window parameter)
4: X11 protocol error: BadWindow (invalid Window parameter)
5: X11 protocol error: BadWindow (invalid Window parameter)
6: X11 protocol error: BadWindow (invalid Window parameter)
7: X11 protocol error: BadWindow (invalid Window parameter)
8: X11 protocol error: BadWindow (invalid Window parameter)
9: X11 protocol error: BadWindow (invalid Window parameter)
10: X11 protocol error: BadWindow (invalid Window parameter)
11: X11 protocol error: BadWindow (invalid Window parameter)

I am running R-2.1.1 on ubuntu linux 5.04, compiled from source (not the deb package). My version of tcl/tk is 8.4. The code is below. If anyone sees something I am doing foolish let me know; otherwise I will file a bug report.
Nicholas

# File clrramp2.r ##
require(tcltk)

clrRamp <- function(n.col, b.color=NULL, e.color=NULL){

    B.ChangeColor <- function() {
        b.color <- tclvalue(tkcmd("tk_chooseColor", initialcolor=e.color,
                                  title="Choose a color"))
        if (nchar(b.color) > 0){
            tkconfigure(canvas.b, bg=b.color)
            Rmp.Draw()
        }
    }

    E.ChangeColor <- function() {
        e.color <- tclvalue(tkcmd("tk_chooseColor", initialcolor=e.color,
                                  title="Choose a color"))
        ##cat(e.color)
        if (nchar(e.color) > 0){
            tkconfigure(canvas.e, bg=e.color)
            Rmp.Draw()
        }
    }

    Rmp.Draw <- function(){
        cr <- colorRampPalette(c(b.color, e.color), space="Lab",
                               interpolate="spline")
        rmpcol <- cr(n.col)
        #rmpcol <- rgb(rmpcol[,1], rmpcol[,2], rmpcol[,3])
        inc <- 300/n.col
        xl <- 0
        for(i in 1:n.col){
            ##item <- tkitemconfigure(canvas.r, barlst[[i]], fill=rmpcol[i],
            ##                        outline=rmpcol[i])
            #xl <- xl + inc
        }
    }

    save.ramp <- function(){
        cr <- colorRampPalette(c(b.color, e.color), space="Lab",
                               interpolate="spline")
        tkdestroy(tt)
        ##invisible(cr)
    }

    tt <- tktoplevel()
    tkwm.title(tt, "Color Ramp Tool")
    frame <- tkframe(tt)
    bframe <- tkframe(frame, relief="groove", borderwidth=1)
    if(is.null(b.color)) b.color <- "blue"
    if(is.null(e.color)) e.color <- "yellow"
    if(missing(n.col)) n.col <- 100
    canvas.b <- tkcanvas(bframe, width=50, height=25, bg=b.color)
    canvas.e <- tkcanvas(bframe, width=50, height=25, bg=e.color)
    canvas.r <- tkcanvas(tt, width=300, height=50, bg="white")
    BColor.button <- tkbutton(bframe, text="Begin Color", command=B.ChangeColor)
    ##tkgrid(canvas.b, BColor.button)
    EColor.button <- tkbutton(bframe, text="End Color", command=E.ChangeColor)
    killbutton <- tkbutton(bframe, text="Save", command=save.ramp)
    tkgrid(canvas.b, BColor.button, canvas.e, EColor.button)
    tkgrid(bframe)
    tkgrid(frame)
    tkgrid(canvas.r)
    tkgrid(killbutton)
    cr <- colorRampPalette(c(b.color, e.color), space="Lab",
                           interpolate="spline")
    ##rmpcol <- hex(mixcolor(alpha, bc, ec, where="LUV"))
    rmpcol <- cr(n.col)
    inc <- 300/n.col
    xl <- 0
    #barlst <- vector(length=n.col, mode="list")
    barlst <- tclArray()
    for(i in 1:n.col){
        item <- tkcreate(canvas.r, "rect", xl, 0, xl+inc, 50,
                         fill=rmpcol[i], outline=rmpcol[i])
        ##tkaddtag(canvas.r, "point", "withtag", item
Re: [R] Rcmdr extensions
Dear Simone, You'll find instructions for extending the R Commander in Section 4 of the paper at http://socserv.socsci.mcmaster.ca/jfox/Papers/R-commander.pdf. In most instances, you should cause commands to be executed and printed in the Script window and output to appear in the Output window by using the function doItAndPrint(). I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Simone Gabbriellini Sent: Monday, August 29, 2005 9:48 AM To: r-help@stat.math.ethz.ch Subject: [R] Rcmdr extensions Dear List, I am trying to extend Rcmdr with some functions useful to my study... I have successfully added a menu and some submenus to the GUI, and I have placed a .R file in the /etc folder... I am able to call functions in that file, but I cannot see the results: how can I tell Rcmdr to write the output in the output box? Second question: how can I refer to the currently selected Dataset in my .R functions' file? Thank you, Simone Gabbriellini __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
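To make the connection concrete, here is a minimal sketch of what a menu-item callback in a plug-in's .R file might look like. mySummary is a hypothetical function name; the sketch assumes the Rcmdr functions doItAndPrint() and ActiveDataSet() described in the paper cited above are available:

```r
# Hypothetical Rcmdr menu-item callback (sketch; assumes Rcmdr is loaded
# and a data set is active)
mySummary <- function(){
    .activeDataSet <- ActiveDataSet()  # name of the currently selected data set
    # Build the command as a character string; doItAndPrint() echoes it to
    # the Script window, evaluates it, and prints the result in the Output
    # window.
    command <- paste("summary(", .activeDataSet, ")", sep = "")
    doItAndPrint(command)
}
```

The key design point is that the callback constructs the command as text rather than calling summary() directly, so the user sees a reproducible command in the Script window.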
Re: [R] Matrix oriented computing
Dear Mark, For that matter, the loop isn't a whole lot slower (on my 3GHz Win XP system): x <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995) df <- 1:1000 system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE) [1] 0.08 0.00 0.08 NA NA mat <- matrix(0, 1000, 11) system.time(for (i in 1:length(df)) mat[i,] <- qchisq(x, df[i])) [1] 0.09 0.00 0.10 NA NA Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Marc Schwartz (via MN) Sent: Friday, August 26, 2005 10:26 AM To: Peter Dalgaard Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Matrix oriented computing On Fri, 2005-08-26 at 15:25 +0200, Peter Dalgaard wrote: Marc Schwartz [EMAIL PROTECTED] writes: x <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995) df <- c(1:100) mat <- sapply(x, qchisq, df) dim(mat) [1] 100 11 str(mat) num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 4.12e-01 ... outer() is perhaps a more natural first try... It does give the transpose of the sapply approach, though. round(t(outer(x,df,qchisq)),2) should be close. You should likely add dimnames. What I find interesting is that I would have intuitively expected outer() to be faster than sapply().
However: system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE) [1] 0.01 0.00 0.01 0.00 0.00 system.time(mat1 <- round(t(outer(x, df, qchisq)), 2), gcFirst = TRUE) [1] 0.01 0.00 0.01 0.00 0.00 # No round() or t() to test for overhead system.time(mat2 <- outer(x, df, qchisq), gcFirst = TRUE) [1] 0.01 0.00 0.02 0.00 0.00 # Bear in mind the round() on mat1 above all.equal(mat, mat1) [1] Mean relative difference: 4.905485e-05 all.equal(mat, t(mat2)) [1] TRUE Even when increasing the size of 'df' to 1:1000: system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE) [1] 0.16 0.01 0.16 0.00 0.00 system.time(mat1 <- round(t(outer(x, df, qchisq)), 2), gcFirst = TRUE) [1] 0.16 0.00 0.18 0.00 0.00 # No round() or t() to test for overhead system.time(mat2 <- outer(x, df, qchisq), gcFirst = TRUE) [1] 0.16 0.01 0.17 0.00 0.00 It also seems that, at least in this case, t() and round() do not add much overhead. Best regards, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Manually Calculating Odds from POLR Model
Dear Tate, Your question pertains generally to the treatment of factors in model formulas and is not particular to polr(). For a brief explanation, see Section 11.1, Defining statistical models; formulae, and in particular Section 11.1.1 on Contrasts in the manual An Introduction to R, which is distributed with R. More detailed explanations are in texts such as Venables and Ripley, Modern Applied Statistics With S, and my own, An R and S-PLUS Companion to Applied Regression. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: Tate Avery [mailto:[EMAIL PROTECTED] Sent: Monday, August 15, 2005 3:58 PM To: [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Subject: RE: [R] Manually Calculating Odds from POLR Model John, Thank you, the document was very helpful. I can now calculate the same values generated by predict() when I am using purely numeric input data. Another small question arises when I look at the example using 'housing' in the polr() documentation page: Running the example produces the following coefficients...
Coefficients:
   InflMedium      InflHigh TypeApartment    TypeAtrium   TypeTerrace      ContHigh
    0.5663924     1.2888218    -0.5723552    -0.3661912    -1.0910195     0.3602834
Now, if I am trying to perform a prediction and the value for INFL comes in as 'Medium', what is done? And what is done for 'low'? That seems to be the last missing piece in my understanding of how to convert the model values into predictions. Thank you, Tate From: John Fox [EMAIL PROTECTED] To: 'Tate Avery' [EMAIL PROTECTED] CC: r-help@stat.math.ethz.ch Subject: RE: [R] Manually Calculating Odds from POLR Model Date: Fri, 12 Aug 2005 19:22:23 -0400 Dear Tate, If I understand correctly what you're asking, the formulas are on p. 21 of the paper at http://socserv.socsci.mcmaster.ca/jfox/Papers/logit-effect-displays.pdf.
But why do you want to do this when you can get the fitted probabilities from predict()? I hope this helps. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tate Avery Sent: Friday, August 12, 2005 2:50 PM To: r-help@stat.math.ethz.ch Subject: [R] Manually Calculating Odds from POLR Model Hello, I am using polr(...) to generate a model. The summary shows the coefficients and the intercepts. For example: coefficient for x1 = c1 coefficient for x2 = c2 intercept A|B = i1 intercept B|C = i2 I can then run predict(..., type="p") with the model and see the odds for each factor. For example:
    A   B   C
1 0.3 0.5 0.2
2 0.4 0.1 0.5
What I really want to be able to do is take the 2 coefficients, the 2 intercepts, the x1 x2 values and manually calculate the probabilities generated by predict(). I have been searching quite extensively for the underlying calculations that transform the polr output and the input variables into the final output odds. I have tried a number of dead-end roads so far. So, if anyone has any information on how to do this or where I can find out, I would be extremely grateful. Thank you for your time, Tate Avery __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] path analysis
Dear Manuel, Polychoric correlations imply only that the *latent* variables are continuous -- the observed variables are ordered categories. Tetrachoric and point-biserial correlations are special cases respectively of polychoric and polyserial correlations. As long as you're willing to think of the dichotomous variable as the dissection into two categories of a latent continuous variable (and assuming multinormality of the latent variables), you can use the approach that I suggested. This isn't logistic regression, but it's similar to a probit model. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Manel Salamero Sent: Sunday, August 14, 2005 12:34 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] path analysis This solves part of my problem with the independent ordinal variables, but my dependent variable is truly categorical (illness/no illness). Polychoric correlation implies that the data are continuous, which is not the case. Is it possible to implement logistic regression in the path model? Thanks, Manel Salamero -- Original Message -- De: John Fox [EMAIL PROTECTED] Data: Sat, 13 Aug 2005 19:35:24 -0400 Dear Manel, -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of SALAMERO BARO, MANUEL Sent: Saturday, August 13, 2005 2:02 PM To: r-help@stat.math.ethz.ch Subject: [R] path analysis Does someone know if it is possible to perform a path analysis with the sem package (or any other) to explain a dependent *dichotomous* variable? Yes -- you can use the hetcor() function in the polycor package to generate a correlation matrix and boot.sem() in the sem package to get standard errors or confidence intervals. Make sure that the dichotomous variables are represented as factors. See ?boot.sem for an example.
I hope this helps, John __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] path analysis
Dear Manuel and list, I see that I wrote point-biserial when I meant biserial. Sorry, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of John Fox Sent: Sunday, August 14, 2005 1:34 PM To: [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Subject: Re: [R] path analysis Dear Manuel, Polychoric correlations imply only that the *latent* variables are continuous -- the observed variables are ordered categories. Tetrachoric and point-biserial correlations are special cases respectively of polychoric and polyserial correlations. As long as you're willing to think of the dichotomous variable as the dissection into two categories of a latent continuous variable (and assuming multinormality of the latent variables), you can use the approach that I suggested. This isn't logistic regression, but it's similar to a probit model. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Manel Salamero Sent: Sunday, August 14, 2005 12:34 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] path analysis This solves part of my problem with the independent ordinal variables, but my dependent variable is truly categorial (illness/no illness). Polychoric correlation implies that data are continuous, which in not the case. Is possible to implement logistic regression in the path model? 
Thanks, Manel Salamero -- Original Message -- De: John Fox [EMAIL PROTECTED] Data: Sat, 13 Aug 2005 19:35:24 -0400 Dear Manel, -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of SALAMERO BARO, MANUEL Sent: Saturday, August 13, 2005 2:02 PM To: r-help@stat.math.ethz.ch Subject: [R] path analysis Someone knows if it is possible to perform a path analysis with sem package (or any other) to explain a dependent *dichotomus* variable? Yes -- you can use the hetcor() function in the polycor package to generate a correlation matrix and boot.sem() in the sem package to get standard errors or confidence intervals. Make sure that the dichotomous variables are represented as factors. See ?boot.sem for an example. I hope this helps, John __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] path analysis
Dear Manel, -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of SALAMERO BARO, MANUEL Sent: Saturday, August 13, 2005 2:02 PM To: r-help@stat.math.ethz.ch Subject: [R] path analysis Does someone know if it is possible to perform a path analysis with the sem package (or any other) to explain a dependent *dichotomous* variable? Yes -- you can use the hetcor() function in the polycor package to generate a correlation matrix and boot.sem() in the sem package to get standard errors or confidence intervals. Make sure that the dichotomous variables are represented as factors. See ?boot.sem for an example. I hope this helps, John __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
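A sketch of the suggested workflow, under the assumption of a data frame dat whose dichotomous and ordinal variables are coded as factors (the path-model specification itself is omitted, and the sem() call is indicated only in outline):

```r
# Sketch: feeding heterogeneous correlations from polycor into sem
library(polycor)
library(sem)

hc <- hetcor(dat, ML = TRUE)   # polychoric/polyserial/Pearson correlations
R  <- hc$correlations          # correlation matrix for the path analysis

# fit <- sem(model, R, N = nrow(dat))  # model: the path-model specification
# boot.sem(dat, fit)                   # bootstrap standard errors; see ?boot.sem
```

The point of hetcor() here is that it picks the appropriate correlation type for each pair of variables automatically, so ordinal factors and numeric variables can be mixed in one input matrix.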
Re: [R] Issues with tcltk for Tiger Mac OSX
Dear Bernard, The Rcmdr web site has some instructions prepared by Rob Goedman for Mac users at http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html that might prove useful. (I'm not a Mac user myself so can't offer specific advice.) I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bernard Leong Sent: Friday, August 12, 2005 8:51 AM To: r-help@stat.math.ethz.ch Subject: [R] Issues with tcltk for Tiger Mac OSX Dear R-helpers, I have installed the latest version of R 2.1.1 in the Tiger OSX. However, when I load up R and use the following command: library(tcltk) I encounter the following error: Loading Tcl/Tk interface ... Error in dyn.load(x, as.logical(local), as.logical(now)) : unable to load shared library '/Library/Frameworks/R.framework/Resources/library/tcltk/libs/ tcltk.so': dlopen(/Library/Frameworks/R.framework/Resources/library/tcltk /libs/tcltk.so, 10): corrupt binary, library ordinal too big Error: .onLoad failed in 'loadNamespace' for 'tcltk' Error: package/namespace load failed for 'tcltk' I have done a couple of remedies: 1) Tried installing tcl and tk 8.4 from source 2) Tried to install the version of tcltk from the fink sourceforge net. 3) Tried to change the path and it does not work. Does anyone have a solution to this problem as I need tcltk to run limmaGUI? Best regards, Bernard __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Manually Calculating Odds from POLR Model
Dear Tate, If I understand correctly what you're asking, the formulas are on p. 21 of the paper at http://socserv.socsci.mcmaster.ca/jfox/Papers/logit-effect-displays.pdf. But why do you want to do this when you can get the fitted probabilities from predict()? I hope this helps. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tate Avery Sent: Friday, August 12, 2005 2:50 PM To: r-help@stat.math.ethz.ch Subject: [R] Manually Calculating Odds from POLR Model Hello, I am using polr(...) to generate a model. The summary shows the coefficients and the intercepts. For example: coefficient for x1 = c1 coefficient for x2 = c2 intercept A|B = i1 intercept B|C = i2 I can then run predict(..., type="p") with the model and see the odds for each factor. For example:
    A   B   C
1 0.3 0.5 0.2
2 0.4 0.1 0.5
What I really want to be able to do is take the 2 coefficients, the 2 intercepts, the x1 x2 values and manually calculate the probabilities generated by predict(). I have been searching quite extensively for the underlying calculations that transform the polr output and the input variables into the final output odds. I have tried a number of dead-end roads so far. So, if anyone has any information on how to do this or where I can find out, I would be extremely grateful. Thank you for your time, Tate Avery __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
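For reference, polr() fits a proportional-odds model in which each cumulative probability is a logistic function of an intercept minus the linear predictor, so the category probabilities asked about here can be recovered by differencing. A sketch with hypothetical values standing in for the c1, c2, i1, i2 of the question:

```r
# Hypothetical values standing in for the output of summary(polr(...))
c1 <- 0.8; c2 <- -0.5    # coefficients for x1 and x2
i1 <- -1.0; i2 <- 1.5    # intercepts ("zeta") for A|B and B|C
x1 <- 0.3; x2 <- 1.2     # one observation

eta <- c1 * x1 + c2 * x2            # linear predictor
pA  <- plogis(i1 - eta)             # P(Y <= A), which equals P(Y = A)
pB  <- plogis(i2 - eta) - pA        # P(Y = B)
pC  <- 1 - plogis(i2 - eta)         # P(Y = C)
c(A = pA, B = pB, C = pC)           # should match predict(fit, type = "probs")
```

The probabilities necessarily sum to 1, which is a quick sanity check on a hand calculation.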
Re: [R] tcltk programming guide
Dear Simone, tcltk functions correspond closely to Tcl and Tk commands, so documentation for the latter, available at http://wiki.tcl.tk/3109, is helpful. I also found Welch's Practical Programming in Tcl and Tk useful. I expect that you've already seen Peter Dalgaard's two R News articles on the tcltk package. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Simone Gabbriellini Sent: Thursday, August 11, 2005 9:43 AM To: John Zhang Cc: R-help@stat.math.ethz.ch Subject: Re: [R] tcltk programming guide thank you, I knew that link, but I need something more document-oriented, more specific, i.e. if I want to know how to use tkadd, where should I look? or tkinsert and so on... thanx, simone On 11 Aug 2005, at 14:18, John Zhang wrote: Go to http://bioinf.wehi.edu.au/~wettenhall/RTclTkExamples/. There are good examples.
From: Simone Gabbriellini [EMAIL PROTECTED] Date: Thu, 11 Aug 2005 14:04:58 +0200 Subject: [R] tcltk programming guide Dear List, I'm looking for some documentation about the R tcltk package. The one I found in the help doesn't look exhaustive; I need information on the use of the single tk widgets, maybe with some examples. thank you, simone gabbriellini __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Jianhua Zhang Department of Medical Oncology Dana-Farber Cancer Institute 44 Binney Street Boston, MA 02115-6084 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] partial SS for anova
Dear Natalia, See the Anova() function in the car package. Also see the warning in ?Anova about Type III sums of squares. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of NATALIA F TCHETCHERINA Sent: Saturday, July 30, 2005 6:25 PM To: r-help@stat.math.ethz.ch Subject: [R] partial SS for anova Hello, I use the lme4 package. library(lme4) fit=lmer(y ~ time+dye+trt+trt:time + (1|rep), data=dataset, na.action='na.omit') anova(fit) The anova gives sequential F-tests and sequential SS. My question is: how can I get partial F-tests and partial SS? For lm (not lmer) anova(lm(y~x+z)) we can use anova(fit, ssType=3), but it does not work for lmer. Natalia. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
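A sketch of the suggested Anova() usage; it is shown for an ordinary lm() fit, since car's support for lmer fits is limited, and y, dataset, and the predictors stand in for the poster's variables:

```r
# Sketch: partial ("Type II") F-tests with Anova() from the car package
library(car)

mod <- lm(y ~ time + dye + trt + trt:time, data = dataset)
Anova(mod)                  # Type II (partial) tests, the default

# Anova(mod, type = "III") # Type III tests; heed the warning in ?Anova
                           # about the contrasts required for these to be
                           # meaningful
```

Unlike anova(mod), which tests terms sequentially in the order they enter the formula, Anova(mod) tests each term after the others, which is what "partial SS" usually means.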
Re: [R] Unexpected behavior in recode{car}
Dear Tom and David, The source of the problem isn't hard to see if you trace the execution of recode() via debug(): The test for whether the result can be coerced to numeric is faulty. I've fixed the bug and will upload a new version of car to CRAN shortly. In the meantime, you can use ss <- recode(nn, "2='Num2'; 4='Num4'", as.factor=TRUE) Thanks for bringing this bug to my attention. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mulholland, Tom Sent: Thursday, July 28, 2005 2:01 AM To: D. Dailey; r-help@stat.math.ethz.ch Subject: Re: [R] Unexpected behavior in recode{car} require( car ) set.seed(12345) nn <- sample( c( 2, 4 ), size=50, replace=TRUE ) rr <- recode( nn, "2='TWO';4='FOUR'" ) table( rr, exclude=NULL ) ss <- recode( nn, "2='Num2';4='Num4'" ) # Doesn't work as expected table( ss, exclude=NULL ) ss <- recode( nn, "2='Num2';4='Num4'", TRUE ) #? table( ss, exclude=NULL ) tt <- recode( nn, "2='TWO'; 4='Num4'" ) table( tt, exclude=NULL ) uu <- recode( nn, "2='Num2'; 4='FOUR'" ) table( uu, exclude=NULL ) I looked at the code and found it too difficult to immediately decipher. So does making the result a factor cause any real problems? I noticed that the same response happens with any letterset followed by a number recode( nn, "2='Num2'; 4='abc5'" ) Tom -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of D. Dailey Sent: Thursday, 28 July 2005 11:45 AM To: r-help@stat.math.ethz.ch Subject: [R] Unexpected behavior in recode{car} Thanks to the R creators for such a great statistical system. Thanks to the R help list, I have (finally) gotten far enough in R to have a question I hope to be worth posting. I'm using the recode function from John Fox's car package and have encountered some unexpected behavior.
Consider the following example: ## Begin cut-and-paste example require( car ) set.seed(12345) nn <- sample( c( 2, 4 ), size=50, replace=TRUE ) rr <- recode( nn, "2='TWO';4='FOUR'" ) table( rr, exclude=NULL ) ss <- recode( nn, "2='Num2';4='Num4'" ) # Doesn't work as expected table( ss, exclude=NULL ) tt <- recode( nn, "2='TWO'; 4='Num4'" ) table( tt, exclude=NULL ) uu <- recode( nn, "2='Num2'; 4='FOUR'" ) table( uu, exclude=NULL ) ## End cut-and-paste example All but the recoding to ss work as expected: I get a character vector with 23 instances of either FOUR or Num4 and 27 instances of TWO or Num2. But for the ss line, wherein all the strings to be assigned contain a digit, the resulting vector contains all NAs. Using str(), I note that ss is a numeric vector. Is there a tidy way (using recode) to recode numeric values into character strings, all of which contain a digit? I have a workaround for my current project, but it would be nice to be able to use mixed alphanumeric strings in this context. Thanks in advance for any insight you can give into this question. Using R 2.1.1 (downloaded binary) on Windows XP Pro, car version 1.0-17 (installed from CRAN via Windows GUI). Complete version information below:
version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    1.1
year     2005
month    06
day      20
language R
t(t( installed.packages()['car',] ))
         [,1]
Package  car
LibPath  C:/Programs/R/rw2011/library
Version  1.0-17
Priority NA
Bundle   NA
Contains NA
Depends  R (>= 1.9.0)
Suggests MASS, nnet, leaps
Imports  NA
Built    2.1.0
I subscribe to the help list in digest form, so would appreciate being copied directly in addition to seeing responses sent to the list. David Dailey Shoreline, Washington, USA [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] logistic regression: categorical value, and multinomial
Dear Ed, See ?glm for fitting binomial logit models, and ?multinom (in the nnet package) for multinomial logit models. Neither function will handle a character (text) variable as the response, but you could easily convert the variable to a factor. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Haibo Huang Sent: Wednesday, July 27, 2005 11:23 AM To: r-help@stat.math.ethz.ch Subject: [R] logistic regression: categorical value, and multinomial I have two questions: 1. If I want to do a binomial logit, how to handle the categorical response variable? Data for the response variables are not numerical, but text. 2. What if I want to do a multinomial logit, still with categorical response variable? The variable has 5 non-numerical response levels, I have to do it with a multinomial logit. Any input is highly appreciated! Thanks! Ed __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
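A sketch of both fits under the assumption of a data frame d with a character response d$resp and predictors x1 and x2:

```r
# Sketch: logit models with a text response (d, resp, x1, x2 are assumed names)
d$resp <- factor(d$resp)   # convert the character response to a factor first

# Binomial logit (response with two levels)
fit2 <- glm(resp ~ x1 + x2, family = binomial, data = d)

# Multinomial logit (response with five levels)
library(nnet)
fit5 <- multinom(resp ~ x1 + x2, data = d)
```

For glm() with a factor response, the first factor level is treated as "failure" and the rest as "success"; multinom() likewise takes the first level as the baseline category.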
Re: [R] setting elements to NA across an array
Dear Dr Carbon, -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dr Carbon Sent: Wednesday, July 27, 2005 12:19 PM To: [EMAIL PROTECTED] Subject: [R] setting elements to NA across an array Please excuse what is obviously a trivial matter... I have a large 3-d array. I wish to set the third dimension (z) to NA if there are any NA values in the first two dimensions (xy). So, given array foo: foo <- array(data = NA, dim = c(5,5,3)) foo[,,1] <- matrix(rnorm(25), 5, 5) foo[,,2] <- matrix(rnorm(25), 5, 5) foo[,,3] <- matrix(rnorm(25), 5, 5) I'll set two elements to NA foo[2,2,1] <- NA foo[3,5,2] <- NA foo Now I want to set foo[2,2,] <- NA and foo[3,5,] <- NA. How can I build a logical statement to do this? That should work just as you've specified it: That is, elements in all layers in the second row, second column and in the third row, fifth column of the array should be NA. Since you posed the question, I suppose that I'm missing something. I hope this helps, John __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] setting elements to NA across an array
Dear Dr Carbon, Actually, it appears that I'm the one who was being obtuse. To make sure that I'm now interpreting what you want correctly, you'd like to set the elements foo[i,j,] to NA in every layer whenever any one of them is NA. If that's correct, then how about the following? foo[array(apply(foo, c(1,2), function(x) any(is.na(x))), dim(foo))] <- NA Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: Dr Carbon [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 27, 2005 12:45 PM To: John Fox Cc: [EMAIL PROTECTED] Subject: Re: [R] setting elements to NA across an array Sorry for being obtuse. How can I build a logical statement that will set foo[2,2,] <- NA and foo[3,5,] <- NA? Something like, if any row and column are NA, then set that row and column to NA for the entire array On 7/27/05, John Fox [EMAIL PROTECTED] wrote: Dear Dr Carbon, -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dr Carbon Sent: Wednesday, July 27, 2005 12:19 PM To: [EMAIL PROTECTED] Subject: [R] setting elements to NA across an array Please excuse what is obviously a trivial matter... I have a large 3-d array. I wish to set the third dimension (z) to NA if there are any NA values in the first two dimensions (xy). So, given array foo: foo <- array(data = NA, dim = c(5,5,3)) foo[,,1] <- matrix(rnorm(25), 5, 5) foo[,,2] <- matrix(rnorm(25), 5, 5) foo[,,3] <- matrix(rnorm(25), 5, 5) I'll set two elements to NA foo[2,2,1] <- NA foo[3,5,2] <- NA foo Now I want to set foo[2,2,] <- NA and foo[3,5,] <- NA. How can I build a logical statement to do this? That should work just as you've specified it: That is, elements in all layers in the second row, second column and in the third row, fifth column of the array should be NA. Since you posed the question, I suppose that I'm missing something.
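As a sanity check on John's one-liner, here is a small self-contained sketch (simulated data; object names are illustrative):

```r
## build a 5 x 5 x 3 array and blank one cell each in two layers
foo <- array(rnorm(75), dim = c(5, 5, 3))
foo[2, 2, 1] <- NA
foo[3, 5, 2] <- NA

## TRUE wherever any layer has an NA at that row/column position
bad <- apply(foo, c(1, 2), function(x) any(is.na(x)))

## recycling the 5 x 5 logical matrix layer by layer blanks the
## affected cells in every layer
foo[array(bad, dim(foo))] <- NA

sum(is.na(foo))  # 6: each of the two cells is now NA in all three layers
```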
Re: [R] grep help needed
Dear Denis,

I don't believe that anyone fielded your question -- my apologies if I missed a response.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot
Sent: Monday, July 25, 2005 9:46 PM
To: R list
Subject: [R] grep help needed

Hi,

In another thread (PBSmapping and shapefiles) I asked for an easy way to read shapefiles and transform them into data that PBSmapping could use. One person is exploring some ways of doing this, but it is possible I'll have to do this manually. With package maptools I am able to extract the information I need from a shapefile, but it is formatted like this:

[[1]]
           [,1]     [,2]
 [1,] -55.99805 51.68817
 [2,] -56.00222 51.68911
 [3,] -56.01694 51.68911
 [4,] -56.03781 51.68606
 [5,] -56.04639 51.68759
 [6,] -56.04637 51.69445
 [7,] -56.03777 51.70207
 [8,] -56.02301 51.70892
 [9,] -56.01317 51.71578
[10,] -56.00330 51.73481
[11,] -55.99805 51.73840
attr(,"pstart")
attr(,"pstart")$from
[1] 1
attr(,"pstart")$to
[1] 11
attr(,"nParts")
[1] 1
attr(,"shpID")
[1] NA

[[2]]
          [,1]     [,2]
[1,] -57.76294 50.88770
[2,] -57.76292 50.88693
[3,] -57.76033 50.88163
[4,] -57.75668 50.88091
[5,] -57.75551 50.88169
[6,] -57.75562 50.88550
[7,] -57.75932 50.88775
[8,] -57.76294 50.88770
attr(,"pstart")
attr(,"pstart")$from
[1] 1
attr(,"pstart")$to
[1] 8
attr(,"nParts")
[1] 1
attr(,"shpID")
[1] NA

I do not quite understand the structure of this data object (a list of lists, I think)

Actually, it looks like a list of matrices, each with some attributes (which, I gather, aren't of interest to you).

but at this point I resorted to printing it on the console and imported that text into Excel for further cleaning, which is easy enough. I'd like to complete the process within R to save time and to circumvent Excel's limit of around 64000 lines. But I have a hard time figuring out how to clean up this text in R.

If I understand correctly what you want, this seems a very awkward way to proceed.
Why not just extract the matrices from the list, stick on the additional columns that you want, stick the matrices together, name the columns, and then output the data to a file?

M1 <- Data[[1]]  # assuming that the original list is named Data
M2 <- Data[[2]]
M1 <- cbind(1, 1:nrow(M1), M1)
M2 <- cbind(2, 1:nrow(M2), M2)
M <- rbind(M1, M2)
colnames(M) <- c("PID", "POS", "X", "Y")
write.table(M, "Data.txt", row.names=FALSE, quote=FALSE)

It wouldn't be hard to generalize this to any number of matrices and to automate the process.

I hope that this helps,
John

What I need to produce for PBSmapping is a file where each block of coordinates shares one ID number, called PID, and a variable POS indicates the position of each coordinate within a shape. All other lines must disappear. So the above would become:

PID POS X Y
1 1 -55.99805 51.68817
1 2 -56.00222 51.68911
1 3 -56.01694 51.68911
1 4 -56.03781 51.68606
1 5 -56.04639 51.68759
1 6 -56.04637 51.69445
1 7 -56.03777 51.70207
1 8 -56.02301 51.70892
1 9 -56.01317 51.71578
1 10 -56.00330 51.73481
1 11 -55.99805 51.73840
2 1 -57.76294 50.88770
2 2 -57.76292 50.88693
2 3 -57.76033 50.88163
2 4 -57.75668 50.88091
2 5 -57.75551 50.88169
2 6 -57.75562 50.88550
2 7 -57.75932 50.88775
2 8 -57.76294 50.88770

First I imported this text file into R:

test <- read.csv2("test file.txt", header=FALSE, sep=";", colClasses="character")

I used sep=";" to ensure there would be only one variable in this file, as it contains no ";". To remove lines that do not contain coordinates, I used the fact that longitudes are expressed as negative numbers, so with my very limited knowledge of grep searches, I thought of this, which is probably not the best way to go:

a <- rep("-", length(test$V1))
b <- grep(a, test$V1)

this gives me a warning (Warning message: the condition has length > 1 and only the first element will be used in: if (is.na(pattern)) { ) but seems to do what I need anyway

c <- seq(1, length(test$V1))
d <- c %in% b
e <- test$V1[d]

Partial victory, now I only have lines that
look like

[1,] -57.76294 50.88770

But I don't know how to go further: the number in square brackets can be used for variable POS, after removing the square brackets and the comma, but this requires a better knowledge of grep than I have. Furthermore, I don't know how to add a PID (polygon ID) variable, i.e. all lines of a polygon must have the same ID, as in the example above (i.e. each time POS == 1, a new polygon starts and PID needs to be incremented by 1, and PID is kept constant for lines where POS != 1).

Any help will be much appreciated.

Sincerely,
Denis Chabot
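John's note that the two-matrix version is easy to generalize can be sketched as follows (the function name and the assumption that the list is called Data are illustrative, not from the original thread):

```r
## Data is assumed to be the list of coordinate matrices read via maptools
toPBS <- function(Data) {
  pieces <- lapply(seq_along(Data), function(i) {
    M <- Data[[i]]
    ## PID identifies the polygon, POS the vertex position within it
    cbind(PID = i, POS = seq_len(nrow(M)), X = M[, 1], Y = M[, 2])
  })
  do.call(rbind, pieces)
}

## e.g.: write.table(toPBS(Data), "Data.txt", row.names = FALSE, quote = FALSE)
```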
Re: [R] sciviews installation
Dear Philippe,

The development version 1.1-0 of the Rcmdr package (i.e., the one with localization facilities) isn't yet on CRAN. Georges doesn't say which version of the Rcmdr package he's using, but I assume 1.0-2 from CRAN, which I believe should work with SciViews.

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Philippe Grosjean
Sent: Wednesday, July 20, 2005 5:24 AM
To: [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] sciviews installation

Hello,

Several changes in the latest Rcmdr broke the SciViews compatibility. I am working on it... but it is taking longer than expected. Sorry for the inconvenience.

Best,
Philippe

Prof. Philippe Grosjean
Numerical Ecology of Aquatic Systems
Mons-Hainaut University, Pentagone (3D08)
Academie Universitaire Wallonie-Bruxelles
8, av du Champ de Mars, 7000 Mons, Belgium
phone: +32.65.37.34.97, fax: +32.65.37.30.54
email: [EMAIL PROTECTED]
web: http://www.umh.ac.be/~econum
http://www.sciviews.org

ghjmora g mail wrote:

Hello,

1. A few months ago, I had SciViews working fine with R (rw2001) under Windows XP.
2. Now, upgrading to rw2011, the installation seems fine (every package installed), but I find a conflict when launching SciViews: it runs, apparently, but when I try to work (import/export text, for instance), it asks for Rcmdr (Would you like to install it now?).
3. Rcmdr is already installed (with all dependencies) and works well when called directly in the R gui.
4. And it's impossible to make it recognized or to install it under SciViews.

I have all the latest packages, and I am going to get mad. What do you suggest to solve my problem?
Thanks,
Georges Moracchini
Re: [R] 3D ellipsoid confidence region
Dear Melanie,

I think that you're referring to the data.ellipse function in the car package. What you're suggesting is a nice idea, but I haven't done it. I think that it might be easier to draw the ellipsoid using the rgl package rather than scatterplot3d. (See, e.g., the scatter3d function in the Rcmdr package.)

Regards,
John

On Tue, 28 Jun 2005 12:35:35 -0700 Melanie Edwards [EMAIL PROTECTED] wrote:

I am curious if there is code developed to plot confidence regions in 3D. The scatterplot3d function generates the plot I want, but I would like a 3D equivalent to the data.ellipse function. Any help in this direction would be appreciated, be it theoretical, graphical, or otherwise.

Melanie Edwards
Senior Statistician
Exponent
15375 SE 30th PL, Suite 250
Bellevue, WA 98007
Tel: (425) 519-8714
Fax: (425) 643-9827
Cell: (206) 852-5739
www.exponent.com

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/
Re: [R] How to save changed options in Rcmdr
Dear Shige,

You can set appropriate options when R starts up. In R for Windows, the easiest thing to do is probably to put an options() command in the Rprofile file in R's etc directory. For more information on setting Rcmdr options, see ?Commander (also accessible from the Rcmdr Help menu). For more information on R initialization, see ?Startup.

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shige Song
Sent: Thursday, June 23, 2005 12:19 AM
To: r-help@stat.math.ethz.ch
Subject: [R] How to save changed options in Rcmdr

Dear All,

I want to change the default options of Rcmdr; it seemed to work when I made changes and clicked "Exit and Restart R Commander". However, the next time I opened Rcmdr, it automatically restored the default options. Is there a way to change Rcmdr's options permanently?

Thanks!
Shige
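For example, Rcmdr options are supplied as a named list via options(); a sketch of what such an Rprofile entry might look like (the particular option names below are assumptions for illustration -- check ?Commander for the options your Rcmdr version actually supports):

```r
## In R_HOME/etc/Rprofile.site (or a user .Rprofile):
options(Rcmdr = list(
    log.font.size  = 12,    # assumed option name: larger script-window font
    console.output = FALSE  # assumed option name: keep output in the Rcmdr window
))
```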
Re: [R] Another question about Rcmdr
Dear Shige,

I'm afraid that my Linux system isn't up to date (and isn't a Suse system). Since I'm about to leave town, I won't be able to check this out further now. I expect that if the problem were general, I'd have heard of it before, but I can't be sure that's the case. Perhaps someone else has used the Rcmdr under R 2.1.1 on a Linux system and can report success or failure.

I'm sorry that I can't be of more help,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shige Song
Sent: Thursday, June 23, 2005 2:53 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Another question about Rcmdr

Hi,

I downloaded the new R 2.1.1 source, compiled and installed it on my Suse 9.3 box. Then I installed the package Rcmdr. Surprisingly, I cannot type anything into the script window. I uninstalled R and grabbed an rpm version from CRAN, installed it, and installed Rcmdr: same problem! Has anyone else had this problem?

Shige
Re: [R] test for equality of two data sets with multidimensional variables
Dear Petr and Shengzhe,

Also see ?manova.

John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Petr Pikal
Sent: Tuesday, June 21, 2005 7:30 AM
To: wu sz; r-help@stat.math.ethz.ch
Subject: Re: [R] test for equality of two data sets with multidimensional variables

Hi,

searching the CRAN homepage for "hotelling" gave me this response:

"Thanks everyone for your help on this question. I solved the problem by writing a procedure to calculate Hotelling's T^2 for a one-sample, multivariate t-test. Here's how it looks, perhaps it will be useful to others.

data <- cbind(rnorm(50, 0.1, .01), rnorm(50, .1, .01), rnorm(50, .1, .01))
k <- ncol(data)
n <- nrow(data)
xbar <- apply(data, 2, mean)
mubar <- rep(0, k)  # hypothesized means are zero
dbar <- xbar - mubar
v <- var(data)
t2 <- n * dbar %*% solve(v) %*% dbar
F <- (n - k) * t2 / ((n - 1) * k)
P <- 1 - pf(F, k, n - k)

A previous post by Peter B. Mandeville was very helpful, as well as the Johnson/Wichern book on multivariate stats. -S. Schultz"

and this: cran.r-project.org/doc/packages/agce.pdf

CRAN - Package SharedHT2: Shared Hotelling T2 test for small sample microarray experiments ... Derives a Hotelling T2 statistic having an F-distribution using an empirical ...

Maybe this is what you want.

HTH
Petr

On 21 Jun 2005 at 13:00, wu sz wrote:

Hello there,

I have two data sets with 14 variables each, and wish to do the test for equality of their covariance matrices and mean vectors. Normally these tests should be done by a chi-square test (Box's M) and Hotelling's T-square test, respectively. Which R functions could do this kind of test? I just find some functions that work for one dimension, but not for multiple dimensions. Someone suggested bartlett.test, but it seems to work just for one dimension. Do you know which ones could do that, or do I have to do the R programming myself?
Thank you,
Shengzhe

Petr Pikal
[EMAIL PROTECTED]
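The one-sample Hotelling T^2 procedure quoted by Petr, cleaned up as a runnable sketch (the data are simulated, as in the original post; a seed is added for reproducibility):

```r
set.seed(123)
X <- cbind(rnorm(50, 0.1, 0.01), rnorm(50, 0.1, 0.01), rnorm(50, 0.1, 0.01))
k <- ncol(X)
n <- nrow(X)

## deviations of the sample means from the hypothesized means (all zero)
dbar <- colMeans(X) - rep(0, k)

## Hotelling's T^2 and its exact F transformation under multivariate normality
T2    <- n * dbar %*% solve(var(X)) %*% dbar
Fstat <- (n - k) * T2 / ((n - 1) * k)
pval  <- 1 - pf(Fstat, k, n - k)
```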
Re: [R] Factanal loadings as large as 1.2 with promax -- how unusual?
Dear Ben,

To get big factor loadings like this, I'd guess that you have large communalities (not a bad thing, of course, and probably less rather than more likely with crudely measured variables) and strong correlations between factors. I suppose that if the latter get too large, that might suggest that you could get by with fewer factors.

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ben Fairbank
Sent: Monday, June 20, 2005 4:10 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Factanal loadings as large as 1.2 with promax -- how unusual?

I am performing a large (105-variable) factor analysis with factanal, specifying promax rotation. I know that some loadings over 1.0 are not unusual with that rotation, but I have some as large as 1.2, which seems extreme. I am skirting the assumptions of the model by using responses on a 7-point rating scale as data; I may have to go back and compute polychoric correlations instead of product-moment, but before doing that I would like to know if others have had equally large factor loadings using data that are truly interval-level data on continuous scales.

Thanks for suggestions or information,
Ben Fairbank
Re: [R] Problem with multinom ?
Dear Marc,

I get the same results -- same coefficients, standard errors, and fitted probabilities -- from multinom() and glm(). It's true that the deviances differ, but they, I believe, are defined only up to an additive constant:

dt
  output factor  n
1      m    1.2 10
2      f    1.2 12
3      m    1.3 14
4      f    1.3 14
5      m    1.4 15
6      f    1.4 12

dt.m <- multinom(output ~ factor, data=dt, weights=n)
# weights:  3 (2 variable)
initial  value 53.372333
iter  10 value 53.115208
iter  10 value 53.115208
iter  10 value 53.115208
final  value 53.115208
converged

dt2
   m  f factor
1 10 12    1.2
2 14 14    1.3
3 15 12    1.4

dt.b <- glm(cbind(m, f) ~ factor, data=dt2, family=binomial)

summary(dt.m)
Call:
multinom(formula = output ~ factor, data = dt, weights = n)

Coefficients:
               Values Std. Err.
(Intercept) -2.632443  3.771265
factor       2.034873  2.881479

Residual Deviance: 106.2304
AIC: 110.2304

Correlation of Coefficients:
[1] -0.9981598

summary(dt.b)
Call:
glm(formula = cbind(m, f) ~ factor, family = binomial, data = dt2)

Deviance Residuals:
       1        2        3
 0.01932 -0.03411  0.01747

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -2.632      3.771  -0.698    0.485
factor         2.035      2.881   0.706    0.480

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 0.5031047 on 2 degrees of freedom
Residual deviance: 0.0018418 on 1 degrees of freedom
AIC: 15.115

Number of Fisher Scoring iterations: 2

predict(dt.b, type="response")
        1         2         3
0.4524946 0.5032227 0.5538845

predict(dt.m, type="probs")
        1         2         3         4         5         6
0.4524948 0.4524948 0.5032229 0.5032229 0.5538846 0.5538846

These fitted probabilities *are* correct: since the members of each pair (1,2), (3,4), and (5,6) have identical values of factor, they have identical fitted probabilities. (Note, by the way, the large negative correlation between the coefficients, produced by the configuration of factor values.)
I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Marc Girondot
Sent: Saturday, June 11, 2005 3:06 AM
To: John Fox
Cc: r-help@stat.math.ethz.ch
Subject: [R] Problem with multinom ?

Thanks for your response. OK, multinom() is more logical in this context. But a similar problem occurs. Let these data be analyzed using a classical glm with binomial error:

 m  f factor  m theo       f theo       -Ln L model  -Ln L full   intercept    factor
10 12    1.2  0.452494473  0.547505527  1.778835688  1.778648963  2.632455675  -2.034882223
14 14    1.3  0.503222759  0.496777241  1.901401922  1.900820284
15 12    1.4  0.553884782  0.446115218  1.877062369  1.876909821
              Sum -Ln L                 5.557299979  5.556379068
              Deviance                  11.11459996  11.11275814  Residual deviance 0.001841823

If I try to use the multinom() function to analyze these data, the fitted parameters are correct but the residual deviance is not.

dt <- read.table("/try.txt", header=TRUE)
dt
  output factor  n
1      m    1.2 10
2      f    1.2 12
3      m    1.3 14
4      f    1.3 14
5      m    1.4 15
6      f    1.4 12

dt.plr <- multinom(output ~ factor, data=dt, weights=n, maxit=1000)
# weights:  3 (2 variable)
initial  value 53.372333
iter  10 value 53.115208
final  value 53.115208
converged

dt.plr
Call:
multinom(formula = output ~ factor, data = dt, weights = n, maxit = 1000)

Coefficients:
(Intercept)      factor
  -2.632443    2.034873

Residual Deviance: 106.2304
AIC: 110.2304

dt.pr1 <- predict(dt.plr, type="probs")
dt.pr1
        1         2         3         4         5         6
0.4524948 0.4524948 0.5032229 0.5032229 0.5538846 0.5538846

Probabilities for 2, 4 and 6 are not correct, and this explains the non-correct residual deviance obtained in R. Probably the problem I have is due to an incorrect data format... could someone help me?

Thanks. (I know there is a simple way to analyze binomial data, but in the end I want to use multinom() for 5 categories of outputs.)
Thanks a lot Marc -- __ Marc Girondot, Pr Laboratoire Ecologie, Systématique et Evolution Equipe de Conservation des Populations et des Communautés CNRS, ENGREF et Université Paris-Sud 11 , UMR 8079 Bâtiment 362 91405 Orsay Cedex, France Tel: 33 1 (0)1.69.15.72.30 Fax: 33 1 (0)1 69 15 56 96 e-mail: [EMAIL PROTECTED] Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
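John's point that the two deviances differ only by an additive constant can be checked directly: the *difference* in deviance between nested models agrees across the two fitting functions, even though the absolute deviances do not. A sketch, reconstructing Marc's data from the thread:

```r
library(nnet)  # for multinom()

dt <- data.frame(output = factor(rep(c("m", "f"), 3)),
                 factor = rep(c(1.2, 1.3, 1.4), each = 2),
                 n      = c(10, 12, 14, 14, 15, 12))
m1 <- multinom(output ~ factor, data = dt, weights = n, trace = FALSE)
m0 <- multinom(output ~ 1,      data = dt, weights = n, trace = FALSE)

dt2 <- data.frame(m = c(10, 14, 15), f = c(12, 14, 12),
                  factor = c(1.2, 1.3, 1.4))
b1 <- glm(cbind(m, f) ~ factor, data = dt2, family = binomial)

## both differences should be about 0.50, matching the glm output quoted
## in the thread (null deviance 0.5031, residual deviance 0.0018)
deviance(m0) - deviance(m1)
b1$null.deviance - deviance(b1)
```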
Re: [R] Problem with multinom ?
Dear Marc,

-----Original Message-----
From: Marc Girondot [mailto:[EMAIL PROTECTED]]
Sent: Saturday, June 11, 2005 2:16 PM
To: John Fox
Cc: [EMAIL PROTECTED]
Subject: RE: [R] Problem with multinom ?

Dear Marc,

I get the same results -- same coefficients, standard errors, and fitted probabilities -- from multinom() and glm(). It's true that the deviances differ, but they, I believe, are defined only up to an additive constant:

predict(dt.b, type="response")
        1         2         3
0.4524946 0.5032227 0.5538845

predict(dt.m, type="probs")
        1         2         3         4         5         6
0.4524948 0.4524948 0.5032229 0.5032229 0.5538846 0.5538846

These fitted probabilities *are* correct: since the members of each pair (1,2), (3,4), and (5,6) have identical values of factor, they have identical fitted probabilities.

I expected rather to obtain (note the 1- before some terms):

        1           2           3           4           5           6
0.4524948 1-0.4524948   0.5032229 1-0.5032229   0.5538846 1-0.5538846

...

But the fitted probabilities are for each observation in the data set; in your data, these have identical values of factor for each pair and hence identical fitted probabilities. When the response is dichotomous, you get only one of the two fitted probabilities for each observation; for a polytomous response, you get a matrix of fitted probabilities which sum to 1 row-wise.

Regards,
John

Marc

--
Marc Girondot, Pr
Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, ENGREF et Université Paris-Sud 11, UMR 8079
Bâtiment 362, 91405 Orsay Cedex, France
Tel: 33 1 (0)1.69.15.72.30  Fax: 33 1 (0)1 69 15 56 96
e-mail: [EMAIL PROTECTED]
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot
Fax in US: 1-425-732-6934
Re: [R] problem with polr ?
Dear Marc,

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Marc Girondot
Sent: Friday, June 10, 2005 3:44 PM
To: r-help@stat.math.ethz.ch
Subject: [R] problem with polr ?

I want to fit a multinomial model with logit link. For example, let this matrix be analyzed:

male female aborted factor
  10     12       1    1.2
  14     14       4    1.3
  15     12       3    1.4

(this is an example, not the true data, which are far more complex...) I suppose the correct function to analyze these data is polr from the MASS library.

Actually, polr fits the proportional-odds logistic regression model (or a similar ordered probit model), which requires an ordinal response. I don't believe that it makes sense to think of a < f < m. For the multinomial logit model, see the multinom function in the nnet package.

The data have been entered in a text file like this:

output factor  n
m         1.2 10
f         1.2 12
a         1.2  1
m         1.3 14
f         1.3 14
a         1.3  4
m         1.4 15
f         1.4 12
a         1.4  3

However, after having performed the analysis, it appears this is not correct as the 3 outputs per experiment are not linked...

library(MASS)
dt.plr <- polr(output ~ factor, data=dt, weights=n)
dt.pr1 <- predict(dt.plr, type="probs")
dt.pr1
           a         f         m
1 0.09987167 0.4578184 0.4423099
2 0.09987167 0.4578184 0.4423099
3 0.09987167 0.4578184 0.4423099
4 0.09437078 0.4477902 0.4578390
5 0.09437078 0.4477902 0.4578390
6 0.09437078 0.4477902 0.4578390
7 0.08914287 0.4374067 0.4734505
8 0.08914287 0.4374067 0.4734505
9 0.08914287 0.4374067 0.4734505

These are the fitted probabilities of response in each of the three categories. Note that each row sums to 1, as it should. Of course, the model likely doesn't make sense.

--- Another linked problem ---

Also, I don't understand the meaning of the residual deviance that is displayed.
Let's modify the file so that the model can also be analyzed using a binomial model:

output factor  n
m         1.2 10
f         1.2 12
a         1.2  0
m         1.3 14
f         1.3 14
a         1.3  0
m         1.4 15
f         1.4 12
a         1.4  0

dt.plr
Call:
polr(formula = output ~ factor, data = dt, weights = n)

Coefficients:
  factor
2.034848

Intercepts:
       a|f        f|m
-16.306511   2.632410

Residual Deviance: 106.2304
AIC: 112.2304

whereas the corresponding scaled deviance for the binomial model (removing a column) is 1.842e-3...

I'm surprised that you were able to get estimates here at all, since there are no observations in category a; nevertheless, polr is estimating two thresholds, one between the never-observed a and f. I expect that this is a numerical artifact. Note that if you simply remove the rows for a rather than setting the counts to 0, polr will complain that there are only two categories.

I hope this helps,
John

--
Marc Girondot, Pr
Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, ENGREF et Université Paris-Sud 11, UMR 8079
Bâtiment 362, 91405 Orsay Cedex, France
Tel: 33 1 (0)1.69.15.72.30  Fax: 33 1 (0)1 69 15 56 96
e-mail: [EMAIL PROTECTED]
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot
Fax in US: 1-425-732-6934
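Since John's advice for an unordered response is multinom, a minimal sketch along those lines (data reconstructed from the thread; this is an illustration, not code from the original messages):

```r
library(nnet)  # for multinom()

dt <- data.frame(output = c("m", "f", "a", "m", "f", "a", "m", "f", "a"),
                 factor = rep(c(1.2, 1.3, 1.4), each = 3),
                 n      = c(10, 12, 1, 14, 14, 4, 15, 12, 3))

fit <- multinom(output ~ factor, data = dt, weights = n, trace = FALSE)

## one row of fitted probabilities per observation; rows sum to 1
## across the three (unordered) categories
rowSums(predict(fit, type = "probs"))
```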
RE: [R] Forecasting with macroeconomic structural equations models?
Dear Ronaldo,

As you discovered, there are no predict methods in the sem package, either for tsls or sem objects. You might have a look at the systemfit package, which does support predict(). More generally, you should be able to substitute out identities and base predictions on the reduced form of the model. To do so, you'll need forecasts for the exogenous variables in the system.

I hope that this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ronaldo Carpio
Sent: Wednesday, June 08, 2005 9:10 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Forecasting with macroeconomic structural equations models?

Hello,

Is there a package or sample code that shows how to do ex ante forecasts with a macroeconomic structural equations model? I looked at the sem package, which lets you estimate e.g. Klein's model, but I'm not sure how to make simulations using the full set of equations, including the identities.

Thank you,
Ronaldo Carpio
[EMAIL PROTECTED]
RE: [R] chisq.test and anova problems
Dear Richard,

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Richard Mendes
Sent: Monday, June 06, 2005 8:44 AM
To: r-help@stat.math.ethz.ch
Subject: [R] chisq.test and anova problems

We just started to use R and are having some problems that no one in our school could solve. I hope someone here can help me out.

The first problem is with the chi-square test. We want to exclude the missing values from the data. We used na.omit and made two new variables. Now we want to use the chi-square method but get the error "'x' and 'y' must have the same length". How do I use the chi-square method while excluding the missing values? And can I use this method if there is a difference in length?

Take a look at the help file for chisq.test (e.g., ?chisq.test): Missing data are excluded when the contingency table for x and y is formed, so you need do nothing special to get what you want -- just chisq.test(x, y).

The second problem is with anova. In the data set we are working on we have to use this method on multiple variables with a difference in length; can this be done? This is the syntax we used, and the error is stated behind it: anova(lm(test~test1)), and the error states "variable lengths differ".

I'm guessing that test and test1 are two samples on the response variable that you want to compare, though that's not entirely clear from your question. To do a one-way anova via lm(), you should have all of the observations on the response in one variable and a factor giving the group membership of each observation; then anova(lm(response ~ factor)) will give you the one-way ANOVA table. Also take a look at ?aov. Finally, if I've correctly guessed what you're trying to do, then t.test(test, test1) is an alternative. More generally, have you looked at the introductory manual that comes with R?
I hope this helps,
John

I think there has to be a way to use this method on differences in variable lengths. Does anybody know how to do this?

Thanks in advance for your help,
Richard
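A toy sketch of the stacked-response layout John describes (the data here are invented for illustration):

```r
## two samples of unequal length, as in the question
test  <- c(5.1, 4.8, 6.0, 5.5)
test1 <- c(4.2, 4.9, 5.0)

## stack the response into one variable and record group membership
## as a factor -- this is what lm() expects for a one-way ANOVA
response <- c(test, test1)
group    <- factor(rep(c("A", "B"), c(length(test), length(test1))))

anova(lm(response ~ group))  # one-way ANOVA despite unequal group sizes
t.test(test, test1)          # Welch two-sample t test, as John suggests
```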
RE: [R] R GUI for Linux?
Dear Charles,

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of White, Charles E WRAIR-Wash DC
Sent: Monday, May 30, 2005 10:52 AM
To: r-help@stat.math.ethz.ch
Cc: [EMAIL PROTECTED]
Subject: [R] R GUI for Linux?

. . . Rcmdr: There are all sorts of things in FC3 that seem to be tcl/tk related, but Rcmdr doesn't seem to work with them. Since some are part of the base FC3 installation, I'm nervous about replacing them or installing competing software. Potentially conflicting software in FC3 is listed below:

tcl.i386             8.4.7-2        installed
tclx.i386            8.3.5-4        installed
db4-tcl.i386         4.2.52-6       base
postgresql-tcl.i386  7.4.8-1.FC3.1  updates-released
ruby-tcltk.i386      1.8.2-1.FC3.1  updates-released
tcl-devel.i386       8.4.7-2        base
tcl-html.i386        8.4.7-2        base
tclx-devel.i386      8.3.5-4        base
tclx-doc.i386        8.3.5-4        base

. . .

I haven't tried the Rcmdr package with FC3, but I ran it under an older version of Red Hat Linux some time ago, and the current version (Rcmdr 1.0-2) under Quantian. Can you be more specific about the problems that you encountered? Are they general to the tcltk package or specific to the Rcmdr?

I'm sorry that you're experiencing problems,
John
RE: [R] Errors in Variables
Dear Spencer,

The reason that I didn't respond to the original posting (I'm the author of the sem package) is that without additional information (such as the error variance of x), a model with error in both x and y will be underidentified (unless there are multiple indicators of x, which didn't seem to be the case here). I figured that what Jacob had in mind was something like minimizing the least (orthogonal) distance of the points to the regression line (implying, by the way, that x and y are on the same scale or somehow standardized), which isn't doable with sem as far as I'm aware.

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Spencer Graves
Sent: Saturday, May 28, 2005 4:47 PM
To: Eric-Olivier Le Bigot
Cc: r-help@stat.math.ethz.ch; Jacob van Wyk
Subject: Re: [R] Errors in Variables

I'm sorry, I have not followed this thread, but I wonder if you have considered library(sem), structural equation modeling? Errors-in-variables problems are the canonical special case. Also, have you done a search of www.r-project.org -> search -> "R site search" for terms like "errors in variables regression"? This just led me to ODRpack, which is NOT a CRAN package but is apparently available after a Google search. If it were my problem, I'd first try to figure out sem; if that seemed too difficult, I might then look at ODRpack.

Also, have you read the posting guide (http://www.R-project.org/posting-guide.html)? It suggests, among other things, that you provide a toy example that a potential respondent could easily copy from your email, test a few modifications of, and phrase a reply to in a minute or so. This also helps clarify your question, so any respondents are more likely to suggest something that is actually useful to you.
Moreover, many people have reported that they were able to answer their own question in the course of preparing a question for this list using the posting guide. hope this helps. spencer graves Eric-Olivier Le Bigot wrote: I'm interested in this 2D line fitting too! I've been looking, without success, in the list of R packages. It might be possible to implement quite easily some of the formalism that you can find in Numerical Recipes (Fortran 77, 2nd ed.), paragraph 15.3. As a matter of fact, I did this in R but only for a model of the form y ~ x (with a given covariance matrix between x and y). I can send you the R code (preliminary version: I wrote it yesterday), if you want. Another interesting reference might be Am. J. Phys. 60, p. 66 (1992). But, again, you would have to implement things by yourself. All the best, EOL -- Dr. Eric-Olivier LE BIGOT (EOL) CNRS Associate Researcher ~~~o~o o~o~~~ Kastler Brossel Laboratory (LKB) http://www.lkb.ens.fr Université P. M. Curie and Ecole Normale Supérieure, Case 74 4 place Jussieu 75252 Paris CEDEX 05 France ~~~o~o o~o~~~ office : 01 44 27 73 67 fax: 01 44 27 38 45 ECR room: 01 44 27 47 12 x-ray room: 01 44 27 63 00 home: 01 73 74 61 87 For int'l calls: 33 + number without leading 0 On Wed, 25 May 2005, Jacob van Wyk wrote: I hope somebody can help. A student of mine is doing a study on Measurement Error models (errors-in-variables, total least squares, etc.). I have an old reference to a multi archive that contains leiv3: Programs for best line fitting with errors in both coordinates. (The date is October 1989, by B.D. Ripley et al.) I have done a search for something similar in R without success. Has this been implemented in an R package, possibly under some sort of assumptions about variances? I would like my student to apply some regression techniques to data that fit this profile. Any help is much appreciated. (If I have not done my search more carefully - my apologies.) 
Thanks Jacob Jacob L van Wyk Department of Mathematics and Statistics University of Johannesburg APK P O Box 524 Auckland Park 2006 South Africa Tel: +27-11-489-3080 Fax: +27-11-489-2832 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] text-function: adding text in an x,y-plot
Dear Helmut, The problem is that you're implicitly coercing the entire data frame lab to character, producing as.character(lab) [1] "c(1, 3, 4, 5, 6, 7, 8, 9, 10, 2)" The numbers themselves come from the numeric coding of the factor V1 in this data frame; as.numeric(lab$V1) shows you what's going on. Instead you could specify labels=lab$V1 in the text() command. But why put the row names in a data frame in the first place; why not simply use labels=rownames(out.pca)? I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Helmut Kudrnovsky Sent: Sunday, May 29, 2005 11:08 AM To: r-help@stat.math.ethz.ch Subject: [R] text-function: adding text in an x,y-plot Hello R-friends, i have a question to the text-function. a little test-dataset for better understanding: - the dataset was imported with read.table(, header=TRUE) s1-s10 are the samplenames var1 var2 var3 s1 1 1 2 s2 2 3 1 s3 2 2 3 s4 5 4 3 s5 4 2 3 s6 6 3 2 s7 8 5 4 s8 7 2 1 s9 9 3 2 s10 8 6 4 - then i've performed a principal component analysis with prcomp - for displaying the result in an x,y-graph i transformed out.pca$x into the dataframe points points <- out.pca$x points PC1 PC2 PC3 s1 -4.7055777 -0.14781544 -0.3683602 s2 -3.1854599 0.19661476 1.5212455 s3 -3.2687980 0.78193513 -0.6352458 s4 0.2278948 1.00061498 0.2164108 s5 -1.4382847 0.02500633 -0.9114340 s6 0.6216283 -0.68606440 0.2071083 s7 3.4951878 1.17343675 -0.3266629 s8 1.0153619 -2.37274378 0.1978058 s9 3.3673983 -1.82145761 -0.2071739 s10 3.8706492 1.85047328 0.3063065 - with plot(points$PC1, points$PC2, type="n") i start a graph without displaying anything at the x,y-coordinates - i want now to display the samplenames at the right x,y-coordinate; i generated a dataframe called lab with the samplenames lab V1 1 s1 2 s2 3 s3 4 s4 5 s5 6 s6 7 s7 8 s8 9 s9 10 s10 - i've studied the text-helppage 
and so i tried the following: text(pca$PC1, pca$PC2, labels=lab) - in the plot there is now c(1,3,4,5,6,7,8,9,10,2) displayed instead of s1-s10 is there any idea what i'm doing wrong? is there maybe another way of displaying the samplenames at the right x,y-coordinate? greetings from the sunny tirol with thanks in advance helli platform i386-pc-mingw32 arch i386 os mingw32 - win xp system i386, mingw32 status major 2 minor 1.0 year 2005 month 04 day 18 language R __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
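John's suggestion can be sketched end-to-end with invented data (the 10 x 3 matrix and the use of rownames(out.pca$x) are my assumptions about Helmut's setup, not his actual file):

```r
# Sketch: label PCA points with sample names taken from rownames(),
# rather than from a separate data frame of factor labels.
set.seed(1)
m <- matrix(rnorm(30), nrow = 10,
            dimnames = list(paste0("s", 1:10), paste0("var", 1:3)))
out.pca <- prcomp(m)                 # prcomp keeps the rownames on $x
pts <- as.data.frame(out.pca$x)
plot(pts$PC1, pts$PC2, type = "n")   # empty plot, axes only
text(pts$PC1, pts$PC2, labels = rownames(out.pca$x))  # sample names
```

The key point from John's reply: labels= wants a character (or factor) vector, not a whole data frame, and rownames() avoids the factor-coding trap entirely.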
RE: [R] Errors in Variables
Dear Spencer, -Original Message- From: Spencer Graves [mailto:[EMAIL PROTECTED] Sent: Sunday, May 29, 2005 4:13 PM To: John Fox Cc: r-help@stat.math.ethz.ch; 'Jacob van Wyk'; 'Eric-Olivier Le Bigot' Subject: Re: [R] Errors in Variables Hi, John: Thanks for the clarification. I know that the errors in X problem requires additional information, most commonly one of the variances or the correlation. The question I saw (below) indicated he had tried a model of the form y ~ x (with a given covariance matrix ...), which made me think of sem. If he wants the least (orthogonal) distance, could he get it indirectly from sem by calling sem repeatedly giving, say, a variance for x, then averaging the variances of x and y and trying that in sem? I'm not sure how that would work, but it seems similar to averaging the regressions of y on x and x on y. Also, what do you know about ODRpack? It looks like that might solve the least (orthogonal) distance problem. I'm not familiar with ODRpack, but it seems to me that one could fairly simply minimize the sum of squared least distances using, e.g., optim. Regards, John Thanks again for your note, John. Best Wishes, Spencer Graves John Fox wrote: Dear Spencer, The reason that I didn't respond to the original posting (I'm the author of the sem package) is that, without additional information (such as the error variance of x), a model with error in both x and y will be underidentified (unless there are multiple indicators of x, which didn't seem to be the case here). I figured that what Jacob had in mind was something like minimizing the least (orthogonal) distance of the points to the regression line (implying, by the way, that x and y are on the same scale or somehow standardized), which isn't doable with sem as far as I'm aware. 
Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Spencer Graves Sent: Saturday, May 28, 2005 4:47 PM To: Eric-Olivier Le Bigot Cc: r-help@stat.math.ethz.ch; Jacob van Wyk Subject: Re: [R] Errors in Variables I'm sorry, I have not followed this thread, but I wonder if you have considered library(sem), structural equations modeling? Errors in variables problems are the canonical special case. Also, have you done a search of www.r-project.org - search - R site search for terms like errors in variables regression? This just led me to ODRpack, which is NOT a CRAN package but is apparently available after a Google search. If it were my problem, I'd first try to figure out sem; if that seemed too difficult, I might then look at ODRpack. Also, have you read the posting guide! http://www.R-project.org/posting-guide.html? This suggests, among other things, that you provide a toy example that a potential respondent could easily copy from your email, test a few modifications, and phrase a reply in a minute or so. This also helps clarify your question so any respondents are more likely to suggest something that is actually useful to you. Moreover, many people have reported that they were able to answer their own question in the course of preparing a question for this list using the posting guide. hope this helps. spencer graves Eric-Olivier Le Bigot wrote: I'm interested in this 2D line fitting too! I've been looking, without success, in the list of R packages. It might be possible to implement quite easily some of the formalism that you can find in Numerical Recipes (Fortran 77, 2nd ed.), paragraph 15.3. As a matter of fact, I did this in R but only for a model of the form y ~ x (with a given covariance matrix between x and y). 
I can send you the R code (preliminary version: I wrote it yesterday), if you want. Another interesting reference might be Am. J. Phys. 60, p. 66 (1992). But, again, you would have to implement things by yourself. All the best, EOL -- Dr. Eric-Olivier LE BIGOT (EOL)CNRS Associate Researcher ~~~o~o o~o~~~ Kastler Brossel Laboratory (LKB) http://www.lkb.ens.fr Université P. M. Curie and Ecole Normale Supérieure, Case 74 4 place Jussieu 75252 Paris CEDEX 05 France ~~~o~o o~o~~~ office : 01 44 27 73 67 fax: 01 44 27 38 45 ECR room: 01 44 27 47 12 x-ray room: 01 44 27 63 00 home
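John's optim idea from this thread can be sketched as follows; the function name odr.line and the details are my assumptions, not code from the thread. As John notes, this only makes sense when x and y are comparably scaled.

```r
# Sketch: fit the line y = a + b*x by minimizing the sum of squared
# orthogonal (perpendicular) distances to the line, via optim().
# The squared orthogonal distance of (xi, yi) from the line is
# (yi - a - b*xi)^2 / (1 + b^2).
odr.line <- function(x, y) {
  ssq <- function(p)                       # p = c(intercept, slope)
    sum((y - p[1] - p[2] * x)^2 / (1 + p[2]^2))
  b0 <- cov(x, y) / var(x)                 # OLS slope as a start value
  a0 <- mean(y) - b0 * mean(x)
  optim(c(a0, b0), ssq)$par                # c(intercept, slope)
}

set.seed(123)
x <- rnorm(100)
y <- 2 + 0.5 * x + rnorm(100, sd = 0.1)
odr.line(x, y)   # roughly c(2, 0.5)
```

Starting from the OLS estimates keeps the Nelder-Mead search close to the minimum; for real errors-in-variables work the identification caveats John raises still apply.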
RE: [R] precision problem
Dear Omar, Perhaps I'm missing something, but why not just subtract one matrix from the other and test the difference in relation to the precision that you require for the comparison? E.g., to test near equality, something like abs(A - B) < 1e-13. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Omar Lakkis Sent: Wednesday, May 25, 2005 9:09 AM To: r-help@stat.math.ethz.ch Subject: [R] precision problem I have prices that I am finding difficult to compare with ==, < and >, due to precision. For example: the numbers should match with '==', but they differ in the magnitude of 1e-14 due to a bunch of calculations that I run on them. Programming with Java, I am used to implementing a function that compares the difference between the numbers to a predetermined precision factor. This could be very slow when I have two matrices of numbers that I could otherwise compare with a simple '==', '<' or '>' in R. What is the best solution for this problem? Can I control the precision of ==, < and > without having to reimplement the operations in a slow way? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] precision problem
Dear Omar, It wasn't clear to me from your original question that you wanted to test that *all* the corresponding entries were equal, as opposed to each individual entry. In any event, I don't think that you'll find a similar function for testing inequality, so you can do as you suggest, but of course without abs(). Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Omar Lakkis Sent: Wednesday, May 25, 2005 10:08 AM To: r-help@stat.math.ethz.ch Subject: Re: [R] precision problem all.equal is helpful when I am comparing equality of two matrices. However, when I am comparing two individual numbers with < or >, is my best bet doing if (abs(x - y) < tolerance), or is there a function like all.equal that has the same default tolerance? On 5/25/05, Omar Lakkis [EMAIL PROTECTED] wrote: Thank you all. all.equal is very helpful since I am also interested in finding the mismatched prices. On 5/25/05, John Fox [EMAIL PROTECTED] wrote: Dear Omar, Perhaps I'm missing something, but why not just subtract one matrix from the other and test the difference in relation to the precision that you require for the comparison? E.g., to test near equality, something like abs(A - B) < 1e-13. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Omar Lakkis Sent: Wednesday, May 25, 2005 9:09 AM To: r-help@stat.math.ethz.ch Subject: [R] precision problem I have prices that I am finding difficult to compare with ==, < and >, due to precision. For example: the numbers should match with '==', but they differ in the magnitude of 1e-14 due to a bunch of calculations that I run on them. 
Programming with Java, I am used to implementing a function that compares the difference between the numbers to a predetermined precision factor. This could be very slow when I have two matrices of numbers that I could otherwise compare with a simple '==', '<' or '>' in R. What is the best solution for this problem? Can I control the precision of ==, < and > without having to reimplement the operations in a slow way? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
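The advice in this thread can be condensed into a short example; the matrices here are invented, not Omar's price data:

```r
# Floating-point comparison with an explicit tolerance, as suggested above.
A <- matrix(c(0.1 + 0.2, 1), nrow = 1)
B <- matrix(c(0.3, 1), nrow = 1)
A == B                    # first entry is FALSE: 0.1 + 0.2 != 0.3 exactly
abs(A - B) < 1e-13        # elementwise test against a chosen tolerance
isTRUE(all.equal(A, B))   # TRUE: equal within all.equal's default tolerance
```

The elementwise abs(A - B) < tol form is fully vectorized, so it costs no more than == on the same matrices; all.equal answers the all-entries-equal question in one call.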
RE: [R] colors and palettes and things...
Dear Jeff, Some time ago, Don McQueen posted to r-help a nice function for displaying the named colours. You'll find it at http://finzi.psych.upenn.edu/R/Rhelp02a/archive/31607.html. As well, if you're using R for Windows, the various named colours are defined in the file rgb.txt in R's etc directory. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeff D. Hamann Sent: Monday, May 23, 2005 10:35 AM To: r-help@stat.math.ethz.ch Subject: [R] colors and palettes and things... After trying to find if there was a color picker in the FAQs and the help, I thought I would send a post here. I was overwhelmed with all the wonderful color choices R has predefined (discovered after typing in colors()) but can't figure out what they all (by name) look like. Is there a color picker or some other method to display all those colors next to the name? I think I can put together palettes, but another question I have then regards the building of palettes (a list of variable length I can select or create myself other than the ones defined by Palette) so I can pass these colors into functions instead of having to predefine a bunch of colors myself or use the predefined colors like terrain.colors(n)? Are there groups of colors in the colors() that I can group together to make some nice palettes for drawing barplots, etc? Thanks, Jeff. -- Jeff D. Hamann Forest Informatics, Inc. PO Box 1421 Corvallis, Oregon 97339-1421 phone 541-754-1428 fax 541-752-0288 [EMAIL PROTECTED] www.forestinformatics.com __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html
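Don McQueen's function isn't reproduced here, but a minimal chart of the named colours, of the kind discussed in this thread, can be sketched like this (the function name and layout are my own):

```r
# Minimal sketch (not Don McQueen's function): draw a swatch for each
# named colour next to its name.
show.colours <- function(cols = colors()) {
  n <- length(cols)
  op <- par(mar = c(0, 0, 0, 0)); on.exit(par(op))
  plot(0, 0, type = "n", xlim = c(0, 1), ylim = c(0, n + 1),
       axes = FALSE, xlab = "", ylab = "")
  rect(0.05, seq_len(n) - 0.4, 0.25, seq_len(n) + 0.4,
       col = cols, border = NA)               # one swatch per colour
  text(0.3, seq_len(n), cols, adj = 0)        # colour name beside it
}

show.colours(grep("blue", colors(), value = TRUE))  # e.g., all the blues
```

Passing a subset (here, via grep on the names) keeps the chart readable; colors() itself has several hundred entries.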
RE: [R] installing R and Rcmdr onMacOS 10.3.9
Dear Karla, It's likely that you don't have Tcl/Tk installed on your Mac. I know that it's possible to get the Rcmdr working on a Mac, but that doing so requires some additional steps. I'm not a Mac user, so I'm afraid that I can't be of much direct help. If someone who's using the Rcmdr on a Mac is willing to prepare notes about how to get it installed and working, I'd gladly post them on my web site. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Karla Van Meter Sent: Monday, May 23, 2005 11:43 AM To: R-help@stat.math.ethz.ch Subject: [R] installing R and Rcmdr onMacOS 10.3.9 I have been trying to install the new, patched R 2.1.0a along with the Rcmdr on my mac. I got the basic R package installed, but in Project Manager, the tcltk package will not install. When I downloaded the Rcmdr binary package from the Berkeley site, it did not show up on the package installer list until a second attempt, but it still does not show up on the package manager list. On the package manager list, the tcltk package is listed as ''not installed'. When I click on it to install, it won't. I tried the library(tcltk) command, which failed. The following is the output from the last command: library(Rcmdr) Loading required package: tcltk Loading Tcl/Tk interface ... Error in dyn.load(x, as.logical(local), as.logical(now)) : unable to load shared library '/Library/Frameworks/R.framework/Resources/library/tcltk/libs/ tcltk.so': dlcompat: dyld: /Applications/R.app/Contents/MacOS/R can't open library: /usr/X11R6/lib/libX11.6.dylib (No such file or directory, errno = 2) Error: .onLoad failed in 'loadNamespace' for 'tcltk' Error: package 'tcltk' could not be loaded library(tcltk) Loading Tcl/Tk interface ... 
Error in dyn.load(x, as.logical(local), as.logical(now)) : unable to load shared library '/Library/Frameworks/R.framework/Resources/library/tcltk/libs/ tcltk.so': dlcompat: dyld: /Applications/R.app/Contents/MacOS/R can't open library: /usr/X11R6/lib/libX11.6.dylib (No such file or directory, errno = 2) Error: .onLoad failed in 'loadNamespace' for 'tcltk' Error: package/namespace load failed for 'tcltk' I am NOT a developer, but an application-oriented grad student. I'd appreciate any help that will make Rcmdr work. Thanks, Karla Van Meter tel: 707.765.0420 net: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] R annoyances
Dear Uwe, I've often wondered why T and F aren't reserved words in R as TRUE and FALSE are. Perhaps there's some use of T and F as variables, but that seems ill-advised. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Uwe Ligges Sent: Thursday, May 19, 2005 10:08 AM To: Chalasani, Prasad Cc: r-help@stat.math.ethz.ch Subject: Re: [R] R annoyances Chalasani, Prasad wrote: Thanks all for pointing out that I can use mtx[,1,drop=F] Which, for example, won't work for F <- 10.25 --- drop=FALSE ! ^ Uwe Ligges -Original Message- From: Uwe Ligges [mailto:[EMAIL PROTECTED] Sent: Thursday, May 19, 2005 10:49 AM To: Chalasani, Prasad Cc: r-help@stat.math.ethz.ch Subject: Re: [R] R annoyances Chalasani, Prasad wrote: Dear R Folks, I'm a big fan of R, but there are a couple of things that repeatedly annoy me, and I wondered if anyone has neat ways to deal with them. (a) When using apply row-wise to a matrix, it returns the results column-wise, and to preserve the original orientation, I've to do a transpose. E.g. I've to keep doing a transpose, which I consider to be quite annoying. transformed.mtx <- t(apply(mtx, 1, exp)) I'd rather type exp(mtx) (b) When extracting 2 or more columns of a matrix, R returns the result as a matrix, BUT when extracting just one column, it returns a vector/array, rather than a matrix, so I've to keep doing as.matrix, which is annoying. sub.mtx <- as.matrix(mtx[,1]) Of course I could write a suitable function cols <- function(mtx, range) as.matrix(mtx[, range]) but then I lose the syntactic sugar of being able to say [,1]. The docs suggest: mtx[ , 1, drop = FALSE] Uwe Ligges __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html
RE: [R] R annoyances
Dear Jan, Since you can use variables named c, q, or t in any event, I don't see why the existence of functions with these names is much of an impediment. The problem that I see with T and F is that allowing them to be redefined sets a trap for people. If R wants to discourage use of T and F for TRUE and FALSE, then why provide standard global variables by these names? On the other hand, if providing T and F is considered desirable (e.g., for S-PLUS compatibility), then why not make them reserved names? Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: Jan T. Kim [mailto:[EMAIL PROTECTED] Sent: Thursday, May 19, 2005 12:22 PM To: John Fox Subject: Re: [R] R annoyances On Thu, May 19, 2005 at 11:55:22AM -0400, John Fox wrote: Dear Uwe, I've often wondered why T and F aren't reserved words in R as TRUE and FALSE are. Perhaps there's some use of T and F as variables, but that seems ill-advised. Personally, I'd rather argue the other way around: reserved words should be more unique and expressive than just a single letter. In fact, I've found it slightly irritating at times that c, q and t are functions in the base package, as I'm somewhat prone to use all of these as local variable names... Best regards, Jan -- +- Jan T. Kim ---+ |*NEW* email: [EMAIL PROTECTED] | |*NEW* WWW: http://www.cmp.uea.ac.uk/people/jtk | *-= hierarchical systems are for files, not for humans =-* __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
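The trap John describes is easy to demonstrate in a hypothetical session:

```r
# Why T and F are risky: they are ordinary variables, not reserved words.
F <- 10.25               # perfectly legal -- F is now a global variable
identical(F, FALSE)      # FALSE: F no longer means "false"
m <- matrix(1:4, 2, 2)
# m[, 1, drop = F] would now drop the dimension after all,
# since 10.25 is treated as TRUE.
rm(F)                    # F again resolves to the base-package binding
identical(F, FALSE)      # TRUE
# TRUE and FALSE, by contrast, cannot be reassigned:
# TRUE <- 1 is a syntax error.
```

This is Uwe's drop=F example from the thread made concrete: spelling out drop=FALSE is the only form that cannot be broken by a stray assignment.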
RE: [R] Difference between difference of means
Dear Peter, This looks like a two-way ANOVA, with product as one factor and condition as the other. The test that you describe is for the interaction between product and condition. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Zorg Sent: Wednesday, May 04, 2005 9:24 AM To: r-help@stat.math.ethz.ch Subject: [R] Difference between difference of means Hi I would like to compare two differences of means. I have the means of the log of one variable for two conditions for one product, and for another product. The differences look different between both products for the two conditions, but a statistical test is required. Thank you for any hint Peter __ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
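John's suggestion here — test the product-by-condition interaction — might look like the following with invented data (Peter's actual variables are not shown in the thread, so the names and values are assumptions):

```r
# Two-way ANOVA sketch: the product:condition interaction tests whether
# the difference of (log) means across conditions differs between products,
# i.e., the "difference between differences of means".
set.seed(42)
d <- data.frame(
  logy      = rnorm(40),
  product   = gl(2, 20, labels = c("A", "B")),
  condition = gl(2, 10, 40, labels = c("cond1", "cond2"))
)
fit <- lm(logy ~ product * condition, data = d)
anova(fit)   # the product:condition row is the test described above
```

With unequal group sizes, the same caveats about sequential sums of squares discussed elsewhere in this digest apply.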
RE: [R] Anova - adjusted or sequential sums of squares?
Dear Mick, The Anova() function in the car package will compute what are often called type-II and -III sums of squares. Without wishing to reinvigorate the sums-of-squares controversy, I'll just add that the various types of sums of squares correspond to different hypotheses. The hypotheses tested by type-I sums of squares are rarely sensible; that the results vary by the order of the factors is a symptom of this, but sums of squares that are invariant with respect to ordering of the factors don't necessarily correspond to sensible hypotheses. If you do decide to use type-III sums of squares, be careful to use a contrast type (such as contr.sum) that produces an orthogonal row basis for the terms in the model. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of michael watson (IAH-C) Sent: Wednesday, April 20, 2005 9:38 AM To: [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Subject: RE: [R] Anova - adjusted or sequential sums of squares? I guess the real problem is this: As I have a different number of observations in each of the groups, the results *change* depending on which order I specify the factors in the model. This unnerves me. With a completely balanced design, this doesn't happen - the results are the same no matter which order I specify the factors. It's this reason that I have been given for using the so-called type III adjusted sums of squares... Mick -Original Message- From: Douglas Bates [mailto:[EMAIL PROTECTED] Sent: 20 April 2005 15:07 To: michael watson (IAH-C) Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Anova - adjusted or sequential sums of squares? michael watson (IAH-C) wrote: Hi I am performing an analysis of variance with two factors, each with two levels. 
I have differing numbers of observations in each of the four combinations, but all four combinations *are* present (2 of the factor combinations have 3 observations, 1 has 4 and 1 has 5). I have used both anova(aov(...)) and anova(lm(...)) in R and it gave the same result - as expected. I then plugged this into minitab, performed what minitab called a General Linear Model (I have to use this in minitab as I have an unbalanced data set) and got a different result. After a little mining this is because minitab, by default, uses the type III adjusted SS. Sure enough, if I changed minitab to use the type I sequential SS, I get exactly the same results as aov() and lm() in R. So which should I use? Type I sequential SS or Type III adjusted SS? Minitab help tells me that I would usually want to use type III adjusted SS, as type I sequential sums of squares can differ when your design is unbalanced - which mine is. The R functions I am using are clearly using the type I sequential SS. Install the fortunes package and try fortune("Venables") I'm really curious to know why the two types of sum of squares are called Type I and Type III! This is a very common misconception, particularly among SAS users who have been fed this nonsense quite often for all their professional lives. Fortunately the reality is much simpler. There is, by any sensible reckoning, only ONE type of sum of squares, and it always represents an improvement sum of squares of the outer (or alternative) model over the inner (or null hypothesis) model. What the SAS highly dubious classification of sums of squares does is to encourage users to concentrate on the null hypothesis model and to forget about the alternative. This is always a very bad idea and not surprisingly it can lead to nonsensical tests, as in the test it provides for main effects even in the presence of interactions, something which beggars definition, let alone belief. 
-- Bill Venables R-help (November 2000) In the words of the master, there is ... only one type of sum of squares, which is the one that R reports. The others are awkward fictions created for times when one could only afford to fit one or two linear models per week and therefore wanted the output to give results for all possible tests one could conceive, even if the models being tested didn't make sense. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
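Mick's observation — that sequential (type-I) sums of squares depend on the order of the factors when the design is unbalanced — is easy to reproduce with invented data (the design below is mine, not Mick's):

```r
# Order dependence of sequential SS in an unbalanced two-factor design.
set.seed(7)
d <- data.frame(
  y = rnorm(15),
  a = factor(rep(c("a1", "a2"), c(7, 8))),
  b = factor(c(rep(c("b1", "b2"), 7), "b1"))   # non-orthogonal cell counts
)
ss.ab <- anova(lm(y ~ a + b, data = d))["a", "Sum Sq"]  # a entered first
ss.ba <- anova(lm(y ~ b + a, data = d))["a", "Sum Sq"]  # a entered last
c(ss.ab, ss.ba)   # the SS attributed to factor a differs between orders
```

Anova() in the car package, as John notes, gives order-invariant type-II (and, with appropriate contrasts, type-III) tests instead.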
RE: [R] Pearson corelation and p-value for matrix
Dear Dren, Since cor(), on which Andy's solution is based, can compute pairwise-present correlations, you could adapt his function -- you'll have to adjust the df for each pair. Alternatively, you could probably save some time (programming time + computer time) by just using my solution: R <- diag(100) R[upper.tri(R)] <- R[lower.tri(R)] <- .5 library(mvtnorm) X <- rmvnorm(6000, sigma=R) system.time(for (i in 1:50) cor.pvalues(X), gc=TRUE) [1] 518.19 1.11 520.23 NA NA I know that time is money, but nine minutes (on my machine) probably won't bankrupt anyone. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dren Scott Sent: Monday, April 18, 2005 12:41 PM To: 'R-Help' Subject: RE: [R] Pearson corelation and p-value for matrix Hi all, Thanks Andy, Mark and John for all the help. I really appreciate it. I'm new to both R and statistics, so please excuse any gaffes on my part. Essentially what I'm trying to do is to evaluate, for each row, how many other rows would have a p-value < 0.05. So, after I get my N x N p-value matrix, I'll just filter out values that are > 0.05. Each of my datasets (6000 rows x 100 columns) would consist of some NA's. The iterative procedure (cor.pvalues) proposed by John would yield the values, but it would take an inordinately long time (I have 50 of these datasets to process). The solution proposed by Andy, although fast, would not be able to incorporate the NA's. Is there any workaround for the NA's? Or possibly do you think I could try something else? Thanks very much. Dren Liaw, Andy [EMAIL PROTECTED] wrote: From: John Fox Dear Andy, That's clearly much better -- and illustrates an effective strategy for vectorizing (or matricizing) a computation. I think I'll add this to my list of programming examples. It might be a little dangerous to pass ... 
through to cor(), since someone could specify type="spearman", for example. Ah, yes, the ... isn't likely to help here! Also, it will only work correctly if there are no NA's, for example (or else the degrees of freedom would be wrong). Best, Andy Thanks, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: Liaw, Andy [mailto:[EMAIL PROTECTED] Sent: Friday, April 15, 2005 9:51 PM To: 'John Fox'; [EMAIL PROTECTED] Cc: 'R-Help'; 'Dren Scott' Subject: RE: [R] Pearson corelation and p-value for matrix We can be a bit sneaky and `borrow' code from cor.test.default: cor.pval <- function(x, alternative="two.sided", ...) { corMat <- cor(x, ...) n <- nrow(x) df <- n - 2 STATISTIC <- sqrt(df) * corMat / sqrt(1 - corMat^2) p <- pt(STATISTIC, df) p <- if (alternative == "less") { p } else if (alternative == "greater") { 1 - p } else 2 * pmin(p, 1 - p) p } The test: system.time(c1 <- cor.pvals(X), gcFirst=TRUE) [1] 13.19 0.01 13.58 NA NA system.time(c2 <- cor.pvalues(X), gcFirst=TRUE) [1] 6.22 0.00 6.42 NA NA system.time(c3 <- cor.pval(X), gcFirst=TRUE) [1] 0.07 0.00 0.07 NA NA Cheers, Andy From: John Fox Dear Mark, I think that the reflex of trying to avoid loops in R is often mistaken, and so I decided to try to time the two approaches (on a 3GHz Windows XP system). I discovered, first, that there is a bug in your function -- you appear to have indexed rows instead of columns; fixing that: cor.pvals <- function(mat) { cols <- expand.grid(1:ncol(mat), 1:ncol(mat)) matrix(apply(cols, 1, function(x) cor.test(mat[, x[1]], mat[, x[2]])$p.value), ncol = ncol(mat)) } My function is cor.pvalues and yours cor.pvals. 
This is for a data matrix with 1000 observations on 100 variables:

R <- diag(100)
R[upper.tri(R)] <- R[lower.tri(R)] <- .5
library(mvtnorm)
X <- rmvnorm(1000, sigma=R)
dim(X)
[1] 1000  100
system.time(cor.pvalues(X))
[1] 5.53 0.00 5.53   NA   NA
system.time(cor.pvals(X))
[1] 12.66  0.00 12.66    NA    NA

I frankly didn't expect the advantage of my approach to be this large, but there it is.

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: Marc
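Andy's trick, borrowing the t-transform from cor.test.default, can be checked directly against cor.test(). A small sketch with made-up data (the variable names are mine, not from the thread):

```r
## Check that the vectorized t-transform reproduces cor.test() p-values.
set.seed(123)
X <- matrix(rnorm(50 * 4), 50, 4)       # 50 observations on 4 variables
r <- cor(X)                             # full correlation matrix
df <- nrow(X) - 2                       # residual df for each pair
t.stat <- sqrt(df) * r / sqrt(1 - r^2)  # t statistic, as in cor.pval()
p.mat <- 2 * pmin(pt(t.stat, df), 1 - pt(t.stat, df))  # two-sided p-values
p.ref <- cor.test(X[, 1], X[, 2])$p.value
all.equal(p.mat[1, 2], p.ref)           # the two agree
```

The whole p-value matrix costs a single cor() call plus elementwise arithmetic, which is why cor.pval() beats a loop over cor.test() by such a wide margin.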
[R] Discrepancy between gam from gam package and gam in S-PLUS
Dear Trevor,

I've noticed a discrepancy in the degrees of freedom reported by gam() from the gam package in R vs. gam() in S-PLUS. The nonparametric df differ by 1; otherwise (except for things that depend upon the df), the output is the same:

-------- snip --------

*** From R (gam version 0.93):

mod.gam <- gam(prestige ~ lo(income, span=.6), data=Prestige)
summary(mod.gam)

Call: gam(formula = prestige ~ lo(income, span = 0.6), data = Prestige)
Deviance Residuals:
     Min       1Q   Median       3Q      Max
 -17.163   -8.205   -1.998    8.070   32.326

(Dispersion Parameter for gaussian family taken to be 122.5047)

Null Deviance: 29895.43 on 101 degrees of freedom
Residual Deviance: 12004.72 on 97.9939 degrees of freedom
AIC: 785.82

Number of Local Scoring Iterations: 2

DF for Terms and F-values for Nonparametric Effects

                       Df Npar Df Npar F     Pr(F)
(Intercept)             1
lo(income, span = 0.6)  1       2 10.626 6.537e-05 ***
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

*** From S-PLUS 6.2.1 for Windows:

library(car)
mod.gam <- gam(prestige ~ lo(income, span=.6), data=Prestige)
summary(mod.gam)

Call: gam(formula = prestige ~ lo(income, span = 0.6), data = Prestige)
Deviance Residuals:
       Min         1Q     Median        3Q       Max
 -17.16264  -8.205246  -1.997525  8.070375  32.32608

(Dispersion Parameter for Gaussian family taken to be 123.7677)

Null Deviance: 29895.43 on 101 degrees of freedom
Residual Deviance: 12004.72 on 96.99392 degrees of freedom

Number of Local Scoring Iterations: 1

DF for Terms and F-values for Nonparametric Effects

                       Df Npar Df   Npar F        Pr(F)
(Intercept)             1
lo(income, span = 0.6)  1       3 7.018995 0.0002517358

-------- snip --------

I suppose that one of these must be in error.

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Pearson corelation and p-value for matrix
Dear Andy,

Very nice! (My point was that if this is a one-time thing, for Dren to puzzle over it is probably more time-consuming than simply doing it inefficiently.)

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: Liaw, Andy [mailto:[EMAIL PROTECTED]
Sent: Monday, April 18, 2005 2:40 PM
To: 'John Fox'; 'Dren Scott'
Cc: 'R-Help'
Subject: RE: [R] Pearson corelation and p-value for matrix

I believe this will do:

cor.pval2 <- function(x, alternative="two.sided") {
    corMat <- cor(x, use=if (any(is.na(x))) "pairwise.complete.obs" else "all")
    df <- crossprod(!is.na(x)) - 2
    STATISTIC <- sqrt(df) * corMat / sqrt(1 - corMat^2)
    p <- pt(STATISTIC, df)
    p <- if (alternative == "less") {
        p
    } else if (alternative == "greater") {
        1 - p
    } else 2 * pmin(p, 1 - p)
    p
}

Some test:

set.seed(1)
x <- matrix(runif(2e3 * 1e2), 2e3)
system.time(res1 <- cor.pval(t(x)), gcFirst=TRUE)
[1] 17.28  0.77 18.16    NA    NA
system.time(res2 <- cor.pval2(t(x)), gcFirst=TRUE)
[1] 19.51  1.05 20.70    NA    NA
max(abs(res1 - res2))
[1] 0
x[c(1, 3, 6), c(2, 4)] <- NA
x[30, 61] <- NA
system.time(res2 <- cor.pval2(t(x)), gcFirst=TRUE)
[1] 24.48  0.71 25.28    NA    NA

This is a bit slower because of the extra computation for df. One can try to save some computation by computing only with the lower (or upper) triangular part.

Cheers,
Andy

From: John Fox

Dear Dren,

Since cor(), on which Andy's solution is based, can compute pairwise-present correlations, you could adapt his function -- you'll have to adjust the df for each pair. Alternatively, you could probably save some time (programming time + computer time) by just using my solution:

R <- diag(100)
R[upper.tri(R)] <- R[lower.tri(R)] <- .5
library(mvtnorm)
X <- rmvnorm(6000, sigma=R)
system.time(for (i in 1:50) cor.pvalues(X), gc=TRUE)
[1] 518.19   1.11 520.23     NA     NA

I know that time is money, but nine minutes (on my machine) probably won't bankrupt anyone.
Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dren Scott
Sent: Monday, April 18, 2005 12:41 PM
To: 'R-Help'
Subject: RE: [R] Pearson corelation and p-value for matrix

Hi all,

Thanks Andy, Mark and John for all the help. I really appreciate it. I'm new to both R and statistics, so please excuse any gaffes on my part.

Essentially, what I'm trying to do is to evaluate, for each row, how many other rows would have a p-value < 0.05. So, after I get my N x N p-value matrix, I'll just filter out values that are > 0.05. Each of my datasets (6000 rows x 100 columns) would contain some NA's. The iterative procedure (cor.pvalues) proposed by John would yield the values, but it would take an inordinately long time (I have 50 of these datasets to process). The solution proposed by Andy, although fast, would not be able to incorporate the NA's. Is there any workaround for the NA's? Or possibly do you think I could try something else? Thanks very much.

Dren

Liaw, Andy [EMAIL PROTECTED] wrote:

From: John Fox

Dear Andy,

That's clearly much better -- and illustrates an effective strategy for vectorizing (or matricizing) a computation. I think I'll add this to my list of programming examples. It might be a little dangerous to pass ... through to cor(), since someone could specify type="spearman", for example.

Ah, yes, the ... isn't likely to help here! Also, it will only work correctly if there are no NA's (or else the degrees of freedom would be wrong).
Best,
Andy

Thanks,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: Liaw, Andy [mailto:[EMAIL PROTECTED]
Sent: Friday, April 15, 2005 9:51 PM
To: 'John Fox'; [EMAIL PROTECTED]
Cc: 'R-Help'; 'Dren Scott'
Subject: RE: [R] Pearson corelation and p-value for matrix

We can be a bit sneaky and `borrow' code from cor.test.default:

cor.pval <- function(x, alternative="two.sided", ...) {
    corMat <- cor(x, ...)
    n <- nrow(x)
    df <- n - 2
    STATISTIC <- sqrt(df
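The pairwise-df idea in Andy's cor.pval2() -- crossprod(!is.na(x)) counts, for each pair of columns, the rows where both values are observed -- can be verified on a small example. The data and names below are made up for illustration:

```r
## Pairwise-complete df: crossprod(!is.na(x)) gives, for columns i and j,
## the number of rows where both are non-NA; subtracting 2 gives the df.
set.seed(1)
x <- matrix(rnorm(20 * 3), 20, 3)
x[c(2, 5), 1] <- NA                        # introduce some missing values
df <- crossprod(!is.na(x)) - 2             # per-pair residual df
r <- cor(x, use = "pairwise.complete.obs")
t.stat <- sqrt(df) * r / sqrt(1 - r^2)
p <- 2 * pmin(pt(t.stat, df), 1 - pt(t.stat, df))
## compare one entry with cor.test(), which drops incomplete pairs itself
all.equal(p[1, 2], cor.test(x[, 1], x[, 2])$p.value)
```

With 2 NAs in column 1, the (1,2) pair has 18 complete cases and hence 16 df, and the matrix formula reproduces the cor.test() p-value exactly.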
RE: [R] Pearson corelation and p-value for matrix
Dear Andy,

That's clearly much better -- and illustrates an effective strategy for vectorizing (or matricizing) a computation. I think I'll add this to my list of programming examples. It might be a little dangerous to pass ... through to cor(), since someone could specify type="spearman", for example.

Thanks,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: Liaw, Andy [mailto:[EMAIL PROTECTED]
Sent: Friday, April 15, 2005 9:51 PM
To: 'John Fox'; [EMAIL PROTECTED]
Cc: 'R-Help'; 'Dren Scott'
Subject: RE: [R] Pearson corelation and p-value for matrix

We can be a bit sneaky and `borrow' code from cor.test.default:

cor.pval <- function(x, alternative="two.sided", ...) {
    corMat <- cor(x, ...)
    n <- nrow(x)
    df <- n - 2
    STATISTIC <- sqrt(df) * corMat / sqrt(1 - corMat^2)
    p <- pt(STATISTIC, df)
    p <- if (alternative == "less") {
        p
    } else if (alternative == "greater") {
        1 - p
    } else 2 * pmin(p, 1 - p)
    p
}

The test:

system.time(c1 <- cor.pvals(X), gcFirst=TRUE)
[1] 13.19  0.01 13.58    NA    NA
system.time(c2 <- cor.pvalues(X), gcFirst=TRUE)
[1] 6.22 0.00 6.42   NA   NA
system.time(c3 <- cor.pval(X), gcFirst=TRUE)
[1] 0.07 0.00 0.07   NA   NA

Cheers,
Andy

From: John Fox

Dear Mark,

I think that the reflex of trying to avoid loops in R is often mistaken, and so I decided to time the two approaches (on a 3GHz Windows XP system). I discovered, first, that there is a bug in your function -- you appear to have indexed rows instead of columns; fixing that:

cor.pvals <- function(mat) {
    cols <- expand.grid(1:ncol(mat), 1:ncol(mat))
    matrix(apply(cols, 1, function(x)
        cor.test(mat[, x[1]], mat[, x[2]])$p.value),
        ncol = ncol(mat))
}

My function is cor.pvalues and yours cor.pvals.
This is for a data matrix with 1000 observations on 100 variables:

R <- diag(100)
R[upper.tri(R)] <- R[lower.tri(R)] <- .5
library(mvtnorm)
X <- rmvnorm(1000, sigma=R)
dim(X)
[1] 1000  100
system.time(cor.pvalues(X))
[1] 5.53 0.00 5.53   NA   NA
system.time(cor.pvals(X))
[1] 12.66  0.00 12.66    NA    NA

I frankly didn't expect the advantage of my approach to be this large, but there it is.

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: Marc Schwartz [mailto:[EMAIL PROTECTED]
Sent: Friday, April 15, 2005 7:08 PM
To: John Fox
Cc: 'Dren Scott'; R-Help
Subject: RE: [R] Pearson corelation and p-value for matrix

Here is what might be a slightly more efficient way to get to John's question:

cor.pvals <- function(mat) {
    rows <- expand.grid(1:nrow(mat), 1:nrow(mat))
    matrix(apply(rows, 1, function(x)
        cor.test(mat[x[1], ], mat[x[2], ])$p.value),
        ncol = nrow(mat))
}

HTH,
Marc Schwartz

On Fri, 2005-04-15 at 18:26 -0400, John Fox wrote:

Dear Dren,

How about the following?

cor.pvalues <- function(X){
    nc <- ncol(X)
    res <- matrix(0, nc, nc)
    for (i in 2:nc){
        for (j in 1:(i - 1)){
            res[i, j] <- res[j, i] <- cor.test(X[, i], X[, j])$p.value
        }
    }
    res
}

What one then does with all of those non-independent tests is another question, I guess.

I hope this helps,
John

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dren Scott
Sent: Friday, April 15, 2005 4:33 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Pearson corelation and p-value for matrix

Hi,

I was trying to evaluate the Pearson correlation and the p-values for an n x m matrix, where each row represents a vector. One way to do it would be to iterate through each row and find its correlation value (and the p-value) with respect to the other rows. Is there some function by which I can use the matrix as input?
Ideally, the output would be an n x n matrix containing the p-values between the respective vectors. I have tried cor.test for the iterations, but couldn't find a function that would take the matrix as input. Thanks for the help.

Dren

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting
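For Dren's stated goal -- counting, for each vector, how many others correlate with p < 0.05 -- John's double-loop function can be combined with a rowSums() filter. A sketch on made-up data (the threshold is from Dren's post; since his vectors are rows, the function is applied to the transpose):

```r
## cor.pvalues() as posted in the thread, plus a per-row significance count.
cor.pvalues <- function(X) {
    nc <- ncol(X)
    res <- matrix(0, nc, nc)
    for (i in 2:nc) {
        for (j in 1:(i - 1)) {
            res[i, j] <- res[j, i] <- cor.test(X[, i], X[, j])$p.value
        }
    }
    res
}
set.seed(42)
dat <- matrix(rnorm(5 * 30), 5, 30)   # 5 row-vectors of length 30
p <- cor.pvalues(t(dat))              # transpose: row-vectors become columns
n.sig <- rowSums(p < 0.05) - 1        # -1 discounts the zero diagonal
```

Note that the diagonal of the result is 0 by construction (a variable is trivially "significant" against itself), which is why the count subtracts one.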
RE: [R] Pearson corelation and p-value for matrix
Dear Mark,

-----Original Message-----
From: Marc Schwartz [mailto:[EMAIL PROTECTED]
Sent: Friday, April 15, 2005 9:41 PM
To: John Fox
Cc: 'R-Help'; 'Dren Scott'
Subject: RE: [R] Pearson corelation and p-value for matrix

John,

Interesting test. Thanks for pointing that out. You are right, there is a knee-jerk reaction to avoid loops, especially nested loops.

On the indexing of rows, I did that because Dren had indicated in his initial post: "I was trying to evaluate the Pearson correlation and the p-values for an n x m matrix, where each row represents a vector. One way to do it would be to iterate through each row and find its correlation value (and the p-value) with respect to the other rows." So I ran the correlations by row, rather than by column.

That's the second time yesterday that I responded to a posting without reading it carefully enough -- a good lesson for me. I guess that Dren could just apply my solution to the transpose of his matrix -- i.e., cor.pvalues(t(X)).

Sorry,
John

Thanks again. Good lesson.

Marc

On Fri, 2005-04-15 at 21:36 -0400, John Fox wrote:

Dear Mark,

I think that the reflex of trying to avoid loops in R is often mistaken, and so I decided to time the two approaches (on a 3GHz Windows XP system). I discovered, first, that there is a bug in your function -- you appear to have indexed rows instead of columns; fixing that:

cor.pvals <- function(mat) {
    cols <- expand.grid(1:ncol(mat), 1:ncol(mat))
    matrix(apply(cols, 1, function(x)
        cor.test(mat[, x[1]], mat[, x[2]])$p.value),
        ncol = ncol(mat))
}

My function is cor.pvalues and yours cor.pvals.

This is for a data matrix with 1000 observations on 100 variables:

R <- diag(100)
R[upper.tri(R)] <- R[lower.tri(R)] <- .5
library(mvtnorm)
X <- rmvnorm(1000, sigma=R)
dim(X)
[1] 1000  100
system.time(cor.pvalues(X))
[1] 5.53 0.00 5.53   NA   NA
system.time(cor.pvals(X))
[1] 12.66  0.00 12.66    NA    NA

I frankly didn't expect the advantage of my approach to be this large, but there it is.
Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: Marc Schwartz [mailto:[EMAIL PROTECTED]
Sent: Friday, April 15, 2005 7:08 PM
To: John Fox
Cc: 'Dren Scott'; R-Help
Subject: RE: [R] Pearson corelation and p-value for matrix

Here is what might be a slightly more efficient way to get to John's question:

cor.pvals <- function(mat) {
    rows <- expand.grid(1:nrow(mat), 1:nrow(mat))
    matrix(apply(rows, 1, function(x)
        cor.test(mat[x[1], ], mat[x[2], ])$p.value),
        ncol = nrow(mat))
}

HTH,
Marc Schwartz

On Fri, 2005-04-15 at 18:26 -0400, John Fox wrote:

Dear Dren,

How about the following?

cor.pvalues <- function(X){
    nc <- ncol(X)
    res <- matrix(0, nc, nc)
    for (i in 2:nc){
        for (j in 1:(i - 1)){
            res[i, j] <- res[j, i] <- cor.test(X[, i], X[, j])$p.value
        }
    }
    res
}

What one then does with all of those non-independent tests is another question, I guess.

I hope this helps,
John

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dren Scott
Sent: Friday, April 15, 2005 4:33 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Pearson corelation and p-value for matrix

Hi,

I was trying to evaluate the Pearson correlation and the p-values for an n x m matrix, where each row represents a vector. One way to do it would be to iterate through each row and find its correlation value (and the p-value) with respect to the other rows. Is there some function by which I can use the matrix as input? Ideally, the output would be an n x n matrix containing the p-values between the respective vectors. I have tried cor.test for the iterations, but couldn't find a function that would take the matrix as input. Thanks for the help.

Dren

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
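As a footnote to the row-vs-column mix-up discussed in this thread: Marc's row-indexed function and John's column-indexed fix agree exactly once one of them is fed the transpose. A small check (function bodies restated from the thread; the data are invented):

```r
## Marc's version indexes rows; John's fix indexes columns. Running the
## column version on t(mat) reproduces the row version exactly.
cor.pvals.rows <- function(mat) {
    rows <- expand.grid(1:nrow(mat), 1:nrow(mat))
    matrix(apply(rows, 1, function(x)
        cor.test(mat[x[1], ], mat[x[2], ])$p.value),
        ncol = nrow(mat))
}
cor.pvals.cols <- function(mat) {
    cols <- expand.grid(1:ncol(mat), 1:ncol(mat))
    matrix(apply(cols, 1, function(x)
        cor.test(mat[, x[1]], mat[, x[2]])$p.value),
        ncol = ncol(mat))
}
set.seed(7)
m <- matrix(rnorm(4 * 25), 4, 25)     # 4 row-vectors of length 25
all.equal(cor.pvals.rows(m), cor.pvals.cols(t(m)))
```

So, as John suggests to Dren, cor.pvalues(t(X)) is all that is needed to apply a column-oriented function to row-vectors.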