Re: [R] sciviews installation
Dear Philippe,

The development version 1.1-0 of the Rcmdr package (i.e., the one with localization facilities) isn't yet on CRAN. Georges doesn't say which version of the Rcmdr package he's using, but I assume 1.0-2 from CRAN, which I believe should work with SciViews.

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Philippe Grosjean
Sent: Wednesday, July 20, 2005 5:24 AM
To: [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] sciviews installation

Hello,

Several changes in the latest Rcmdr broke SciViews compatibility. I am working on it... but it is taking longer than expected. Sorry for the inconvenience.

Best,
Philippe

Prof. Philippe Grosjean
Numerical Ecology of Aquatic Systems
Mons-Hainaut University, Pentagone (3D08)
Academie Universitaire Wallonie-Bruxelles
8, av du Champ de Mars, 7000 Mons, Belgium
phone: +32.65.37.34.97, fax: +32.65.37.30.54
email: [EMAIL PROTECTED]
web: http://www.umh.ac.be/~econum
http://www.sciviews.org

ghjmora g mail wrote:

Hello,

1. A few months ago, I had SciViews working fine with R (rw2001) under Windows XP.
2. Now, after upgrading to rw2011, everything seems fine (every package installed), but I find a conflict when launching SciViews:
   - it runs, apparently,
   - but when I try to work (import/export In: text, for instance), it asks for Rcmdr ("Would you like to install it now?").
3. Rcmdr is already installed (with all dependencies) and works well when called directly in the R GUI.
4. It's impossible to make it recognized, or to install it, under SciViews.

I have all the latest packages, and I am going to go mad. What do you suggest to solve my problem?
Thanks,
Georges Moracchini

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] grep help needed
Dear Denis,

I don't believe that anyone fielded your question -- my apologies if I missed a response.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot
Sent: Monday, July 25, 2005 9:46 PM
To: R list
Subject: [R] grep help needed

Hi,

In another thread (PBSmapping and shapefiles) I asked for an easy way to read shapefiles and transform them into data that PBSmapping could use. One person is exploring some ways of doing this, but it is possible I'll have to do this manually.

With package maptools I am able to extract the information I need from a shapefile, but it is formatted like this:

[[1]]
            [,1]     [,2]
 [1,] -55.99805 51.68817
 [2,] -56.00222 51.68911
 [3,] -56.01694 51.68911
 [4,] -56.03781 51.68606
 [5,] -56.04639 51.68759
 [6,] -56.04637 51.69445
 [7,] -56.03777 51.70207
 [8,] -56.02301 51.70892
 [9,] -56.01317 51.71578
[10,] -56.00330 51.73481
[11,] -55.99805 51.73840
attr(,"pstart")
attr(,"pstart")$from
[1] 1
attr(,"pstart")$to
[1] 11
attr(,"nParts")
[1] 1
attr(,"shpID")
[1] NA

[[2]]
          [,1]     [,2]
[1,] -57.76294 50.88770
[2,] -57.76292 50.88693
[3,] -57.76033 50.88163
[4,] -57.75668 50.88091
[5,] -57.75551 50.88169
[6,] -57.75562 50.88550
[7,] -57.75932 50.88775
[8,] -57.76294 50.88770
attr(,"pstart")
attr(,"pstart")$from
[1] 1
attr(,"pstart")$to
[1] 8
attr(,"nParts")
[1] 1
attr(,"shpID")
[1] NA

I do not quite understand the structure of this data object (list of lists, I think)

Actually, it looks like a list of matrices, each with some attributes (which, I gather, aren't of interest to you).

but at this point I resorted to printing it on the console and imported that text into Excel for further cleaning, which is easy enough. I'd like to complete the process within R to save time and to circumvent Excel's limit of around 64000 lines. But I have a hard time figuring out how to clean up this text in R.

If I understand correctly what you want, this seems a very awkward way to proceed.
Why not just extract the matrices from the list, stick on the additional columns that you want, stick the matrices together, name the columns, and then output the data to a file?

M1 <- Data[[1]]  # assuming that the original list is named Data
M2 <- Data[[2]]
M1 <- cbind(1, 1:nrow(M1), M1)
M2 <- cbind(2, 1:nrow(M2), M2)
M <- rbind(M1, M2)
colnames(M) <- c("PID", "POS", "X", "Y")
write.table(M, "Data.txt", row.names=FALSE, quote=FALSE)

It wouldn't be hard to generalize this to any number of matrices and to automate the process.

I hope that this helps,
John

What I need to produce for PBSmapping is a file where each block of coordinates shares one ID number, called PID, and a variable POS indicates the position of each coordinate within a shape. All other lines must disappear. So the above would become:

PID POS         X        Y
  1   1 -55.99805 51.68817
  1   2 -56.00222 51.68911
  1   3 -56.01694 51.68911
  1   4 -56.03781 51.68606
  1   5 -56.04639 51.68759
  1   6 -56.04637 51.69445
  1   7 -56.03777 51.70207
  1   8 -56.02301 51.70892
  1   9 -56.01317 51.71578
  1  10 -56.00330 51.73481
  1  11 -55.99805 51.73840
  2   1 -57.76294 50.88770
  2   2 -57.76292 50.88693
  2   3 -57.76033 50.88163
  2   4 -57.75668 50.88091
  2   5 -57.75551 50.88169
  2   6 -57.75562 50.88550
  2   7 -57.75932 50.88775
  2   8 -57.76294 50.88770

First I imported this text file into R:

test <- read.csv2("test file.txt", header=FALSE, sep=";", colClasses="character")

I used sep=";" to ensure there would be only one variable in this file, as it contains no ";". To remove lines that do not contain coordinates, I used the fact that longitudes are expressed as negative numbers, so with my very limited knowledge of grep searches, I thought of this, which is probably not the best way to go:

a <- rep("-", length(test$V1))
b <- grep(a, test$V1)

this gives me a warning

Warning message: the condition has length > 1 and only the first element will be used in: if (is.na(pattern)) {

but seems to do what I need anyway

c <- seq(1, length(test$V1))
d <- c %in% b
e <- test$V1[d]

Partial victory, now I only have lines that
look like

[1,] -57.76294 50.88770

But I don't know how to go further: the number in square brackets can be used for variable POS, after removing the square brackets and the comma, but this requires a better knowledge of grep than I have. Furthermore, I don't know how to add a PID (polygon ID) variable, i.e. all lines of a polygon must have the same ID, as in the example above (i.e. each time POS == 1, a new polygon starts and PID needs to be incremented by 1, and PID is kept constant for lines where POS != 1).

Any help will be much appreciated.

Sincerely,
Denis Chabot
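John's two-matrix recipe above generalizes directly to a list of any length. A minimal sketch, assuming the maptools output is a list of coordinate matrices named Data; the helper name polysToPBS is made up for illustration:

```r
# Generalization of the approach in the reply: tag each matrix in the
# list with its index (PID) and its row position (POS), then stack
# everything into one table.
polysToPBS <- function(Data) {
  blocks <- lapply(seq_along(Data), function(i) {
    M <- Data[[i]]
    cbind(PID = i, POS = seq_len(nrow(M)), X = M[, 1], Y = M[, 2])
  })
  as.data.frame(do.call(rbind, blocks))
}

# Toy example with two small "polygons":
Data <- list(matrix(c(-56.0, 51.7, -56.1, 51.8), 2, 2, byrow = TRUE),
             matrix(c(-57.8, 50.9, -57.7, 50.8), 2, 2, byrow = TRUE))
polysToPBS(Data)
```

The result can then be written out with write.table(), as in the two-matrix version.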
Re: [R] logistic regression: categorical value, and multinomial
Dear Ed,

See ?glm for fitting binomial logit models, and ?multinom (in the nnet package) for multinomial logit models. Neither function will handle a character (text) variable as the response, but you could easily convert the variable to a factor.

John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Haibo Huang
Sent: Wednesday, July 27, 2005 11:23 AM
To: r-help@stat.math.ethz.ch
Subject: [R] logistic regression: categorical value, and multinomial

I have two questions:

1. If I want to do a binomial logit, how do I handle the categorical response variable? Data for the response variable are not numerical, but text.

2. What if I want to do a multinomial logit, still with a categorical response variable? The variable has 5 non-numerical response levels, so I have to do it with a multinomial logit.

Any input is highly appreciated! Thanks!

Ed
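As a hedged illustration of the factor conversion suggested in the reply (the data here are invented for the example):

```r
# Toy data: a character response and one numeric predictor.
resp <- c("yes", "no", "no", "yes", "no", "yes", "yes", "no")
x    <- c(1.2, 0.5, 2.3, 1.8, 0.2, 0.7, 2.0, 0.4)

# glm() will not accept the character vector directly, but a factor
# works; the second level ("yes") is modeled as the "success".
fit <- glm(factor(resp) ~ x, family = binomial)
coef(fit)

# With five response levels, the same conversion works for
# multinom() in the nnet package (names here are placeholders):
# library(nnet)
# fit5 <- multinom(factor(resp5) ~ x1 + x2, data = mydata)
```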
Re: [R] setting elements to NA across an array
Dear Dr Carbon,

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dr Carbon
Sent: Wednesday, July 27, 2005 12:19 PM
To: [EMAIL PROTECTED]
Subject: [R] setting elements to NA across an array

Please excuse what is obviously a trivial matter... I have a large 3-d array. I wish to set the third dimension (z) to NA if there are any NA values in the first two dimensions (x, y). So, given array foo:

foo <- array(data = NA, dim = c(5, 5, 3))
foo[,,1] <- matrix(rnorm(25), 5, 5)
foo[,,2] <- matrix(rnorm(25), 5, 5)
foo[,,3] <- matrix(rnorm(25), 5, 5)

I'll set two elements to NA:

foo[2,2,1] <- NA
foo[3,5,2] <- NA
foo

Now I want to set foo[2,2,] <- NA and foo[3,5,] <- NA. How can I build a logical statement to do this?

That should work just as you've specified it: that is, elements in all layers in the second row, second column and in the third row, fifth column of the array should be NA. Since you posed the question, I suppose that I'm missing something.

I hope this helps,
John
Re: [R] setting elements to NA across an array
Dear Dr Carbon,

Actually, it appears that I'm the one who was being obtuse. To make sure that I'm now interpreting what you want correctly: you'd like to set a cell to NA in every layer if that cell is NA in any layer. If that's correct, then how about the following?

foo[array(apply(foo, c(1,2), function(x) any(is.na(x))), dim(foo))] <- NA

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: Dr Carbon [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 27, 2005 12:45 PM
To: John Fox
Cc: [EMAIL PROTECTED]
Subject: Re: [R] setting elements to NA across an array

Sorry for being obtuse. How can I build a logical statement that will set foo[2,2,] <- NA and foo[3,5,] <- NA? Something like: if any row-and-column cell is NA, then set that row and column to NA for the entire array.

On 7/27/05, John Fox [EMAIL PROTECTED] wrote:

Dear Dr Carbon,

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dr Carbon
Sent: Wednesday, July 27, 2005 12:19 PM
To: [EMAIL PROTECTED]
Subject: [R] setting elements to NA across an array

Please excuse what is obviously a trivial matter... I have a large 3-d array. I wish to set the third dimension (z) to NA if there are any NA values in the first two dimensions (x, y). So, given array foo:

foo <- array(data = NA, dim = c(5, 5, 3))
foo[,,1] <- matrix(rnorm(25), 5, 5)
foo[,,2] <- matrix(rnorm(25), 5, 5)
foo[,,3] <- matrix(rnorm(25), 5, 5)

I'll set two elements to NA:

foo[2,2,1] <- NA
foo[3,5,2] <- NA
foo

Now I want to set foo[2,2,] <- NA and foo[3,5,] <- NA. How can I build a logical statement to do this?

That should work just as you've specified it: that is, elements in all layers in the second row, second column and in the third row, fifth column of the array should be NA. Since you posed the question, I suppose that I'm missing something.
I hope this helps,
John
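The one-liner in the reply can be checked on a small deterministic array; a minimal sketch:

```r
# Demonstrate the masking idiom on a 2 x 2 x 2 array.
foo <- array(1:8, dim = c(2, 2, 2))
foo[1, 2, 1] <- NA            # one NA in the first layer

# apply() over margins (1, 2) gives a row-by-column logical matrix
# marking cells that are NA in any layer; array() recycles that mask
# across all layers, so assignment propagates the NA downward.
mask <- apply(foo, c(1, 2), function(x) any(is.na(x)))
foo[array(mask, dim(foo))] <- NA

foo[1, 2, 2]   # the matching cell in the second layer is now NA too
```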
Re: [R] Unexpected behavior in recode{car}
Dear Tom and David,

The source of the problem isn't hard to see if you trace the execution of recode() via debug(): the test for whether the result can be coerced to numeric is faulty. I've fixed the bug and will upload a new version of car to CRAN shortly. In the meantime, you can use

ss <- recode(nn, "2='Num2'; 4='Num4'", as.factor=TRUE)

Thanks for bringing this bug to my attention.

John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mulholland, Tom
Sent: Thursday, July 28, 2005 2:01 AM
To: D. Dailey; r-help@stat.math.ethz.ch
Subject: Re: [R] Unexpected behavior in recode{car}

require(car)
set.seed(12345)
nn <- sample(c(2, 4), size=50, replace=TRUE)
rr <- recode(nn, "2='TWO';4='FOUR'")
table(rr, exclude=NULL)
ss <- recode(nn, "2='Num2';4='Num4'")  # Doesn't work as expected
table(ss, exclude=NULL)
ss <- recode(nn, "2='Num2';4='Num4'", TRUE)  #?
table(ss, exclude=NULL)
tt <- recode(nn, "2='TWO'; 4='Num4'")
table(tt, exclude=NULL)
uu <- recode(nn, "2='Num2'; 4='FOUR'")
table(uu, exclude=NULL)

I looked at the code and found it too difficult to immediately decipher. So does making the result a factor cause any real problems? I noticed that the same response happens with any letter set followed by a number:

recode(nn, "2='Num2'; 4='abc5'")

Tom

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of D. Dailey
Sent: Thursday, 28 July 2005 11:45 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Unexpected behavior in recode{car}

Thanks to the R creators for such a great statistical system. Thanks to the R help list, I have (finally) gotten far enough in R to have a question I hope to be worth posting.

I'm using the recode function from John Fox's car package and have encountered some unexpected behavior.
Consider the following example:

## Begin cut-and-paste example
require(car)
set.seed(12345)
nn <- sample(c(2, 4), size=50, replace=TRUE)
rr <- recode(nn, "2='TWO';4='FOUR'")
table(rr, exclude=NULL)
ss <- recode(nn, "2='Num2';4='Num4'")  # Doesn't work as expected
table(ss, exclude=NULL)
tt <- recode(nn, "2='TWO'; 4='Num4'")
table(tt, exclude=NULL)
uu <- recode(nn, "2='Num2'; 4='FOUR'")
table(uu, exclude=NULL)
## End cut-and-paste example

All but the recoding to ss work as expected: I get a character vector with 23 instances of either FOUR or Num4 and 27 instances of TWO or Num2. But for the ss line, wherein all the strings to be assigned contain a digit, the resulting vector contains all NAs. Using str(), I note that ss is a numeric vector.

Is there a tidy way (using recode) to recode numeric values into character strings, all of which contain a digit? I have a workaround for my current project, but it would be nice to be able to use mixed alphanumeric strings in this context.

Thanks in advance for any insight you can give into this question.

Using R 2.1.1 (downloaded binary) on Windows XP Pro, car version 1.0-17 (installed from CRAN via Windows GUI). Complete version information below:

version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    1.1
year     2005
month    06
day      20
language R

t(t(installed.packages()['car',]))
         [,1]
Package  car
LibPath  C:/Programs/R/rw2011/library
Version  1.0-17
Priority NA
Bundle   NA
Contains NA
Depends  R (>= 1.9.0)
Suggests MASS, nnet, leaps
Imports  NA
Built    2.1.0

I subscribe to the help list in digest form, so would appreciate being copied directly in addition to seeing responses sent to the list.

David Dailey
Shoreline, Washington, USA
[EMAIL PROTECTED]
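The workaround from the reply, collected into one runnable sketch (it assumes the car package is installed and that its recode() accepts the as.factor=TRUE argument, as stated above):

```r
# With as.factor=TRUE, mixed alphanumeric targets such as 'Num2' are
# returned as factor levels rather than being coerced to numeric,
# sidestepping the faulty numeric-coercion test described above.
library(car)
set.seed(12345)
nn <- sample(c(2, 4), size = 50, replace = TRUE)
ss <- recode(nn, "2='Num2'; 4='Num4'", as.factor = TRUE)
table(ss)   # counts of Num2 and Num4, no NAs
```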
Re: [R] partial SS for anova
Dear Natalia,

See the Anova() function in the car package. Also see the warning in ?Anova about Type III sums of squares.

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of NATALIA F TCHETCHERINA
Sent: Saturday, July 30, 2005 6:25 PM
To: r-help@stat.math.ethz.ch
Subject: [R] partial SS for anova

Hello,

I use the lme4 package:

library(lme4)
fit <- lmer(y ~ time + dye + trt + trt:time + (1|rep), data=dataset, na.action='na.omit')
anova(fit)

The anova gives sequential F-tests and sequential SS. My question is: how can I get partial F-tests and partial SS? For lm (not lmer), as in anova(lm(y ~ x + z)), we can use anova(fit, ssType=3), but that does not work for lmer.

Natalia.
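A hedged sketch of the suggested Anova() usage, on an lm fit with built-in data (support for lmer objects varies by car version, so a plain linear model is shown here):

```r
# Partial ("type II") tests with Anova() from the car package:
# each term is tested after the others, rather than sequentially
# as in anova().
library(car)
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
Anova(fit)               # type-II partial SS and F-tests (the default)
# Anova(fit, type="III") # type III, with the caveats noted in ?Anova
```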
Re: [R] tcltk programming guide
Dear Simone,

The tcltk functions correspond closely to Tcl and Tk commands, so documentation for the latter, available at http://wiki.tcl.tk/3109, is helpful. I also found Welch's Practical Programming in Tcl and Tk useful. I expect that you've already seen Peter Dalgaard's two R News articles on the tcltk package.

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Simone Gabbriellini
Sent: Thursday, August 11, 2005 9:43 AM
To: John Zhang
Cc: R-help@stat.math.ethz.ch
Subject: Re: [R] tcltk programming guide

thank you, I knew that link, but I need something more document-oriented, more specific, i.e. if I want to know how to use tkadd, where should I look? Or tkinsert, and so on...

thanks,
simone

Il giorno 11/ago/05, alle ore 14:18, John Zhang ha scritto:

Go to http://bioinf.wehi.edu.au/~wettenhall/RTclTkExamples/. There are good examples.
From: Simone Gabbriellini [EMAIL PROTECTED]
Date: Thu, 11 Aug 2005 14:04:58 +0200
To: R-help@stat.math.ethz.ch
Subject: [R] tcltk programming guide

Dear List,

I'm looking for some documentation about the R tcltk package. The one I found in the help doesn't look exhaustive; I need information on the use of the single Tk widgets, maybe with some examples.

thank you,
simone gabbriellini

Jianhua Zhang
Department of Medical Oncology
Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115-6084
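As a small sketch of the kind of widget-level usage being asked about (tkinsert and friends), using only the tcltk package; this opens a window, so it assumes an interactive session with Tk available:

```r
# Build a listbox and insert items with tkinsert(), mirroring the
# Tk "listbox insert" subcommand documented on the Tcl/Tk wiki.
library(tcltk)
tt <- tktoplevel()
lb <- tklistbox(tt, height = 4)
tkpack(lb)
for (item in c("alpha", "beta", "gamma")) {
  tkinsert(lb, "end", item)   # append each item at the end of the list
}
```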
Re: [R] Issues with tcltk for Tiger Mac OSX
Dear Bernard,

The Rcmdr web site has some instructions prepared by Rob Goedman for Mac users at http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html that might prove useful. (I'm not a Mac user myself, so I can't offer specific advice.)

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bernard Leong
Sent: Friday, August 12, 2005 8:51 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Issues with tcltk for Tiger Mac OSX

Dear R-helpers,

I have installed the latest version of R 2.1.1 on Tiger OS X. However, when I load up R and use the following command:

library(tcltk)

I encounter the following error:

Loading Tcl/Tk interface ... Error in dyn.load(x, as.logical(local), as.logical(now)) :
  unable to load shared library '/Library/Frameworks/R.framework/Resources/library/tcltk/libs/tcltk.so':
  dlopen(/Library/Frameworks/R.framework/Resources/library/tcltk/libs/tcltk.so, 10): corrupt binary, library ordinal too big
Error: .onLoad failed in 'loadNamespace' for 'tcltk'
Error: package/namespace load failed for 'tcltk'

I have tried a couple of remedies:

1) Installing Tcl and Tk 8.4 from source.
2) Installing the version of tcltk from the Fink project on SourceForge.
3) Changing the path, which does not work.

Does anyone have a solution to this problem, as I need tcltk to run limmaGUI?

Best regards,
Bernard
Re: [R] Manually Calculating Odds from POLR Model
Dear Tate,

If I understand correctly what you're asking, the formulas are on p. 21 of the paper at http://socserv.socsci.mcmaster.ca/jfox/Papers/logit-effect-displays.pdf. But why do you want to do this when you can get the fitted probabilities from predict()?

I hope this helps.
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tate Avery
Sent: Friday, August 12, 2005 2:50 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Manually Calculating Odds from POLR Model

Hello,

I am using polr(...) to generate a model. The summary shows the coefficients and the intercepts. For example:

coefficient for x1 = c1
coefficient for x2 = c2
intercept A|B = i1
intercept B|C = i2

I can then run predict(..., type="p") with the model and see the probabilities for each level. For example:

     A   B   C
1  0.3 0.5 0.2
2  0.4 0.1 0.5

What I really want to be able to do is take the 2 coefficients, the 2 intercepts, and the x1 and x2 values, and manually calculate the probabilities generated by predict().

I have been searching quite extensively for the underlying calculations that transform the polr output and the input variables into the final output probabilities. I have tried a number of dead-end roads so far. So, if anyone has any information on how to do this or where I can find out, I would be extremely grateful.

Thank you for your time,
Tate Avery
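For reference, the manual calculation for a three-level proportional-odds model can be sketched in a few lines of base R. All numeric values below are invented; the parameterization (cumulative logit equals intercept minus linear predictor) is the one used by polr() in MASS:

```r
# logit P(Y <= k) = zeta_k - (c1*x1 + c2*x2)
c1 <- 0.8;  c2 <- -0.5     # hypothetical coefficients for x1, x2
i1 <- 0.3;  i2 <- 1.7      # hypothetical intercepts A|B and B|C
x1 <- 1.0;  x2 <- 2.0      # one observation

eta <- c1 * x1 + c2 * x2   # linear predictor
pA <- plogis(i1 - eta)           # P(Y = A) = P(Y <= A)
pB <- plogis(i2 - eta) - pA      # P(Y = B)
pC <- 1 - plogis(i2 - eta)       # P(Y = C)
c(A = pA, B = pB, C = pC)        # the three probabilities sum to 1
```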
Re: [R] path analysis
Dear Manel,

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of SALAMERO BARO, MANUEL
Sent: Saturday, August 13, 2005 2:02 PM
To: r-help@stat.math.ethz.ch
Subject: [R] path analysis

Does anyone know if it is possible to perform a path analysis with the sem package (or any other) to explain a dependent *dichotomous* variable?

Yes -- you can use the hetcor() function in the polycor package to generate a correlation matrix and boot.sem() in the sem package to get standard errors or confidence intervals. Make sure that the dichotomous variables are represented as factors. See ?boot.sem for an example.

I hope this helps,
John
Re: [R] path analysis
Dear Manuel,

Polychoric correlations imply only that the *latent* variables are continuous -- the observed variables are ordered categories. Tetrachoric and point-biserial correlations are special cases respectively of polychoric and polyserial correlations. As long as you're willing to think of the dichotomous variable as the dissection into two categories of a latent continuous variable (and assuming multinormality of the latent variables), you can use the approach that I suggested. This isn't logistic regression, but it's similar to a probit model.

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Manel Salamero
Sent: Sunday, August 14, 2005 12:34 PM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] path analysis

This solves part of my problem with the independent ordinal variables, but my dependent variable is truly categorical (illness/no illness). Polychoric correlation implies that data are continuous, which is not the case. Is it possible to implement logistic regression in the path model?

Thanks,
Manel Salamero

-- Original Message --
From: John Fox [EMAIL PROTECTED]
Date: Sat, 13 Aug 2005 19:35:24 -0400

Dear Manel,

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of SALAMERO BARO, MANUEL
Sent: Saturday, August 13, 2005 2:02 PM
To: r-help@stat.math.ethz.ch
Subject: [R] path analysis

Does anyone know if it is possible to perform a path analysis with the sem package (or any other) to explain a dependent *dichotomous* variable?

Yes -- you can use the hetcor() function in the polycor package to generate a correlation matrix and boot.sem() in the sem package to get standard errors or confidence intervals. Make sure that the dichotomous variables are represented as factors. See ?boot.sem for an example.
I hope this helps,
John
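The first step of the suggested approach can be sketched as follows, assuming the polycor package is installed and using invented data; hetcor() is assumed to return its result in a component named correlations, as in the package's documentation:

```r
# Heterogeneous correlation matrix: Pearson for numeric pairs,
# biserial/polyserial for numeric-factor pairs, tetrachoric/polychoric
# for factor pairs. The dichotomous variable must be a factor.
library(polycor)
set.seed(1)
d <- data.frame(
  x   = rnorm(200),
  ill = factor(rbinom(200, 1, 0.5))  # dichotomous outcome, as a factor
)
R <- hetcor(d)$correlations  # correlation matrix to feed to sem()
R
```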
Re: [R] path analysis
Dear Manuel and list,

I see that I wrote "point-biserial" when I meant "biserial".

Sorry,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of John Fox
Sent: Sunday, August 14, 2005 1:34 PM
To: [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] path analysis

Dear Manuel,

Polychoric correlations imply only that the *latent* variables are continuous -- the observed variables are ordered categories. Tetrachoric and point-biserial correlations are special cases respectively of polychoric and polyserial correlations. As long as you're willing to think of the dichotomous variable as the dissection into two categories of a latent continuous variable (and assuming multinormality of the latent variables), you can use the approach that I suggested. This isn't logistic regression, but it's similar to a probit model.

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Manel Salamero
Sent: Sunday, August 14, 2005 12:34 PM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] path analysis

This solves part of my problem with the independent ordinal variables, but my dependent variable is truly categorical (illness/no illness). Polychoric correlation implies that data are continuous, which is not the case. Is it possible to implement logistic regression in the path model?
Thanks,
Manel Salamero

-- Original Message --
From: John Fox [EMAIL PROTECTED]
Date: Sat, 13 Aug 2005 19:35:24 -0400

Dear Manel,

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of SALAMERO BARO, MANUEL
Sent: Saturday, August 13, 2005 2:02 PM
To: r-help@stat.math.ethz.ch
Subject: [R] path analysis

Does anyone know if it is possible to perform a path analysis with the sem package (or any other) to explain a dependent *dichotomous* variable?

Yes -- you can use the hetcor() function in the polycor package to generate a correlation matrix and boot.sem() in the sem package to get standard errors or confidence intervals. Make sure that the dichotomous variables are represented as factors. See ?boot.sem for an example.

I hope this helps,
John
Re: [R] Manually Calculating Odds from POLR Model
Dear Tate,

Your question pertains generally to the treatment of factors in model formulas and is not particular to polr(). For a brief explanation, see Section 11.1, "Defining statistical models; formulae", and in particular Section 11.1.1 on "Contrasts", in the manual "An Introduction to R", which is distributed with R. More detailed explanations are in texts such as Venables and Ripley, Modern Applied Statistics with S, and my own, An R and S-PLUS Companion to Applied Regression.

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: Tate Avery [mailto:[EMAIL PROTECTED]
Sent: Monday, August 15, 2005 3:58 PM
To: [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Subject: RE: [R] Manually Calculating Odds from POLR Model

John,

Thank you, the document was very helpful. I can now calculate the same values generated by predict() when I am using purely numeric input data.

Another small question arises when I look at the example using 'housing' in the polr() documentation page. Running the example produces the following coefficients:

Coefficients:
   InflMedium      InflHigh TypeApartment    TypeAtrium   TypeTerrace      ContHigh
    0.5663924     1.2888218    -0.5723552    -0.3661912    -1.0910195     0.3602834

Now, if I am trying to perform a prediction and the value for Infl comes in as 'Medium', what is done? And what is done for 'Low'? That seems to be the last missing piece in my understanding of how to convert the model values into predictions.

Thank you,
Tate

From: John Fox [EMAIL PROTECTED]
To: 'Tate Avery' [EMAIL PROTECTED]
CC: r-help@stat.math.ethz.ch
Subject: RE: [R] Manually Calculating Odds from POLR Model
Date: Fri, 12 Aug 2005 19:22:23 -0400

Dear Tate,

If I understand correctly what you're asking, the formulas are on p. 21 of the paper at http://socserv.socsci.mcmaster.ca/jfox/Papers/logit-effect-displays.pdf.
But why do you want to do this when you can get the fitted probabilities from predict()?

I hope this helps. John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tate Avery Sent: Friday, August 12, 2005 2:50 PM To: r-help@stat.math.ethz.ch Subject: [R] Manually Calculating Odds from POLR Model

Hello,

I am using polr(...) to generate a model. The summary shows the coefficients and the intercepts. For example:

coefficient for x1 = c1
coefficient for x2 = c2
intercept A|B = i1
intercept B|C = i2

I can then run predict(..., type="p") with the model and see the odds for each factor. For example:

     A    B    C
1  0.3  0.5  0.2
2  0.4  0.1  0.5

What I really want to be able to do is take the 2 coefficients, the 2 intercepts, and the x1, x2 values, and manually calculate the probabilities generated by predict(). I have been searching quite extensively for the underlying calculations that transform the polr output and the input variables into the final output odds, and I have tried a number of dead-end roads so far. So, if anyone has any information on how to do this or where I can find out, I would be extremely grateful.

Thank you for your time, Tate Avery

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
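The hand calculation Tate is after follows from polr()'s parameterization, logit P(Y <= j) = zeta_j - eta, where eta is the linear predictor. A hedged sketch (using the housing example from MASS rather than Tate's x1/x2 data, which we don't have):

```r
library(MASS)

# Fit the ordered-logit model from the polr() help page
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)

# Linear predictor for the first observation (drop the intercept column:
# polr absorbs it into the zeta cutpoints)
X   <- model.matrix(Sat ~ Infl + Type + Cont, data = housing)[, -1]
eta <- drop(X[1, ] %*% coef(fit))

# Cumulative probabilities at each cutpoint, then difference to get
# the per-category probabilities
cum   <- plogis(fit$zeta - eta)   # P(Y <= j)
probs <- diff(c(0, cum, 1))       # P(Y == Low), P(Y == Medium), P(Y == High)

# probs should agree with predict(fit, type = "probs")[1, ]
```

With two cutpoints i1, i2 and coefficients c1, c2 as in Tate's notation, this is P(A) = plogis(i1 - eta), P(B) = plogis(i2 - eta) - plogis(i1 - eta), P(C) = 1 - plogis(i2 - eta), with eta = c1*x1 + c2*x2.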
Re: [R] Matrix oriented computing
Dear Marc,

For that matter, the loop isn't a whole lot slower (on my 3GHz Win XP system):

x <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995)
df <- 1:1000
system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE)
[1] 0.08 0.00 0.08 NA NA
mat <- matrix(0, 1000, 11)
system.time(for (i in 1:length(df)) mat[i,] <- qchisq(x, df[i]))
[1] 0.09 0.00 0.10 NA NA

Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Marc Schwartz (via MN) Sent: Friday, August 26, 2005 10:26 AM To: Peter Dalgaard Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Matrix oriented computing

On Fri, 2005-08-26 at 15:25 +0200, Peter Dalgaard wrote:

Marc Schwartz [EMAIL PROTECTED] writes:

x <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995)
df <- c(1:100)
mat <- sapply(x, qchisq, df)
dim(mat)
[1] 100 11
str(mat)
num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 4.12e-01 ...

outer() is perhaps a more natural first try... It does give the transpose of the sapply approach, though. round(t(outer(x, df, qchisq)), 2) should be close. You should likely add dimnames.

What I find interesting is that I would have intuitively expected outer() to be faster than sapply().
However:

system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE)
[1] 0.01 0.00 0.01 0.00 0.00
system.time(mat1 <- round(t(outer(x, df, qchisq)), 2), gcFirst = TRUE)
[1] 0.01 0.00 0.01 0.00 0.00

# No round() or t() to test for overhead
system.time(mat2 <- outer(x, df, qchisq), gcFirst = TRUE)
[1] 0.01 0.00 0.02 0.00 0.00

# Bear in mind the round() on mat1 above
all.equal(mat, mat1)
[1] Mean relative difference: 4.905485e-05
all.equal(mat, t(mat2))
[1] TRUE

Even when increasing the size of 'df' to 1:1000:

system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE)
[1] 0.16 0.01 0.16 0.00 0.00
system.time(mat1 <- round(t(outer(x, df, qchisq)), 2), gcFirst = TRUE)
[1] 0.16 0.00 0.18 0.00 0.00

# No round() or t() to test for overhead
system.time(mat2 <- outer(x, df, qchisq), gcFirst = TRUE)
[1] 0.16 0.01 0.17 0.00 0.00

It also seems that, at least in this case, t() and round() do not add much overhead.

Best regards, Marc

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
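Peter's suggestion to add dimnames makes the resulting table self-describing; a small sketch of the outer() variant with labeled margins:

```r
x  <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995)
df <- 1:100

# outer() evaluates qchisq(p, df) over the full grid; transpose so that
# rows are indexed by df and columns by probability, as with sapply()
mat <- t(outer(x, df, qchisq))
dimnames(mat) <- list(df = df, p = x)

mat["10", "0.5"]   # the chi-squared median on 10 degrees of freedom
```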
Re: [R] Rcmdr extensions
Dear Simone,

You'll find instructions for extending the R Commander in Section 4 of the paper at http://socserv.socsci.mcmaster.ca/jfox/Papers/R-commander.pdf. In most instances, you should cause commands to be executed and printed in the Script window, and output to appear in the Output window, by using the function doItAndPrint().

I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Simone Gabbriellini Sent: Monday, August 29, 2005 9:48 AM To: r-help@stat.math.ethz.ch Subject: [R] Rcmdr extensions

Dear List,

I am trying to extend Rcmdr with some functions useful to my study. I have successfully added a menu and some submenus to the GUI, and I have placed a .R file in the /etc folder. I am able to call functions in that file, but I cannot see the results: how can I tell Rcmdr to write the output in the output box? Second question: how can I refer to the currently selected Dataset in my .R functions file?

Thank you, Simone Gabbriellini

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
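Both of Simone's questions can be sketched in one hypothetical menu callback. doItAndPrint() and ActiveDataSet() are Rcmdr functions per the conventions John describes; the function name and the command built here are illustrative, not part of the package, and this only runs inside a running R Commander session:

```r
# Hypothetical Rcmdr menu callback (assumes the Rcmdr package is loaded):
mySummary <- function(){
    .activeDataSet <- ActiveDataSet()   # name of the data set selected in the GUI
    # Echo the command to the Script window and print its result in the
    # Output window:
    doItAndPrint(paste("summary(", .activeDataSet, ")", sep = ""))
}
```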
Re: [R] tcltk, X11 protocol error: Bug?
Dear Nicholas,

This problem has been reported before (enter "X11 protocol error" in the R site search at http://finzi.psych.upenn.edu/search.html to see the previous threads), but as far as I know there's no definitive explanation or solution. As well, things appear to work fine despite the warnings. The way I handle the problem in the Rcmdr package is simply to intercept the warnings.

I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nicholas Lewin-Koh Sent: Monday, September 05, 2005 5:16 PM To: [EMAIL PROTECTED] Subject: [R] tcltk, X11 protocol error: Bug?

Hi,

I am having trouble debugging this one. The code is attached below, but it seems to be a problem at the C-tk interface. If I run this once there are no problems; if I run it more than once, I start to get warnings that increase in multiples of 11 every time I run it. Here is a sample session:

source("clrramp2.r")
Loading required package: tcltk
Loading Tcl/Tk interface ...
done

clrRamp()
tt <- clrRamp()
tt
function (n)
{
    x <- ramp(seq(0, 1, length = n))
    rgb(x[, 1], x[, 2], x[, 3], max = 255)
}
<environment: 0x8b8674c>
image(matrix(1:10), col = tt(10))
tt <- clrRamp()
There were 22 warnings (use warnings() to see them)
image(matrix(1:10), col = tt(10))
There were 11 warnings (use warnings() to see them)
warnings()
Warning messages:
1: X11 protocol error: BadWindow (invalid Window parameter)
2: X11 protocol error: BadWindow (invalid Window parameter)
...
11: X11 protocol error: BadWindow (invalid Window parameter)

I am running R-2.1.1 on ubuntu linux 5.04, compiled from source (not the deb package). My version of tcl/tk is 8.4. The code is below. If anyone sees something I am doing foolish, let me know; otherwise I will file a bug report.
Nicholas

## File clrramp2.r ##
require(tcltk)

clrRamp <- function(n.col, b.color=NULL, e.color=NULL){

  B.ChangeColor <- function() {
    b.color <- tclvalue(tkcmd("tk_chooseColor",
                              initialcolor=b.color,  # the post had e.color here, apparently a typo
                              title="Choose a color"))
    if (nchar(b.color) > 0){
      tkconfigure(canvas.b, bg=b.color)
      Rmp.Draw()
    }
  }

  E.ChangeColor <- function() {
    e.color <- tclvalue(tkcmd("tk_chooseColor", initialcolor=e.color,
                              title="Choose a color"))
    ## cat(e.color)
    if (nchar(e.color) > 0){
      tkconfigure(canvas.e, bg=e.color)
      Rmp.Draw()
    }
  }

  Rmp.Draw <- function(){
    cr <- colorRampPalette(c(b.color, e.color), space="Lab", interpolate="spline")
    rmpcol <- cr(n.col)
    # rmpcol <- rgb(rmpcol[,1], rmpcol[,2], rmpcol[,3])
    inc <- 300/n.col
    xl <- 0
    for(i in 1:n.col){
      ## item <-
      tkitemconfigure(canvas.r, barlst[[i]], fill=rmpcol[i], outline=rmpcol[i])
      # xl <- xl + inc
    }
  }

  save.ramp <- function(){
    cr <- colorRampPalette(c(b.color, e.color), space="Lab", interpolate="spline")
    tkdestroy(tt)
    ## invisible(cr)
  }

  tt <- tktoplevel()
  tkwm.title(tt, "Color Ramp Tool")
  frame <- tkframe(tt)
  bframe <- tkframe(frame, relief="groove", borderwidth=1)
  if(is.null(b.color)) b.color <- "blue"
  if(is.null(e.color)) e.color <- "yellow"
  if(missing(n.col)) n.col <- 100
  canvas.b <- tkcanvas(bframe, width=50, height=25, bg=b.color)
  canvas.e <- tkcanvas(bframe, width=50, height=25, bg=e.color)
  canvas.r <- tkcanvas(tt, width=300, height=50, bg="white")
  BColor.button <- tkbutton(bframe, text="Begin Color", command=B.ChangeColor)
  ## tkgrid(canvas.b, BColor.button)
  EColor.button <- tkbutton(bframe, text="End Color", command=E.ChangeColor)
  killbutton <- tkbutton(bframe, text="Save", command=save.ramp)
  tkgrid(canvas.b, BColor.button, canvas.e, EColor.button)
  tkgrid(bframe)
  tkgrid(frame)
  tkgrid(canvas.r)
  tkgrid(killbutton)
  cr <- colorRampPalette(c(b.color, e.color), space="Lab", interpolate="spline")
  ## rmpcol <- hex(mixcolor(alpha, bc, ec, where="LUV"))
  rmpcol <- cr(n.col)
  inc <- 300/n.col
  xl <- 0
  # barlst <- vector(length=n.col, mode="list")
  barlst <- tclArray()
  for(i in 1:n.col){
    item <- tkcreate(canvas.r, "rect", xl, 0, xl + inc, 50,
                     fill=rmpcol[i], outline=rmpcol[i])
    ## tkaddtag(canvas.r, "point",
    ## "withtag", item
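John mentions above that the Rcmdr simply intercepts these warnings. A generic sketch of that approach (not the Rcmdr's actual code) using a calling handler that muffles only the X11 messages and lets everything else through:

```r
# Run an expression, swallowing only "X11 protocol error" warnings:
muffleX11 <- function(expr)
    withCallingHandlers(expr,
        warning = function(w) {
            if (grepl("X11 protocol error", conditionMessage(w)))
                invokeRestart("muffleWarning")   # discard just these warnings
        })
```

Other warnings still propagate normally, so genuine problems are not hidden.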
[R] [R-pkgs] Internationalized version of the Rcmdr package
Dear R-packages list members,

I've recently uploaded to CRAN version 1.1-1 of the Rcmdr (R Commander) package. Based on the tcltk package, the Rcmdr provides a basic-statistics graphical interface to R.

The general functionality of the R Commander has not changed much since version 1.0-0. There are some small improvements to the package, but the reason that I'm re-announcing it is that the latest version supports translation into other languages, using the new internationalization and localization features of R 2.1.x. A Catalan translation (courtesy of Manel Salamero), a French translation (courtesy of Philippe Grosjean), a Japanese translation (courtesy of Takaharu Araki), and a Slovenian translation (courtesy of Jaro Lajovic) are provided with the Rcmdr package. I understand that Portuguese, Russian, and Spanish translations are in the works, and these will be incorporated when they are completed. If you're interested in translating the Rcmdr into another language, please contact me.

As before, bug reports, suggestions, and other feedback are appreciated.

John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

___ R-packages mailing list [EMAIL PROTECTED] https://stat.ethz.ch/mailman/listinfo/r-packages

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Plotting an ellipse in 3D
=data.frame(x=0, z=0, groups=group)), 0, paste(group, " "), adj=1, color=surface.col[j])
if (residuals){
    yy <- y[select.obs]
    xx <- x[select.obs]
    zz <- z[select.obs]
    fitted <- fitted(mod)
    rgl.lines(as.vector(rbind(xx,xx)), as.vector(rbind(yy,fitted)),
              as.vector(rbind(zz,zz)), col=surface.col[j])
} } } } } }
if (revolutions > 0) {
    for (i in 1:revolutions){
        for (angle in seq(1, 360, length=360/speed)) rgl.viewpoint(-angle, fov=fov)
    }
}
if (model.summary) return(summaries) else return(invisible(NULL))
}

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Duncan Murdoch Sent: Friday, September 09, 2005 8:03 AM To: Mike White Cc: R-help@stat.math.ethz.ch Subject: Re: [R] Plotting an ellipse in 3D

Mike White wrote: I have been using the ellipse function from the car package and the covariance matrix to draw an ellipse around a group of points to show the confidence limits. However, the points are actually represented by 3 variables, so rather than plot each pairwise combination of variables in 2D I would like to plot the 'ellipse' in 3D using the djmrgl package. Can anyone offer advice on how I can plot the surface of a 3D 'ellipse' using the covariance matrix to define the shape, so that the points inside can also be seen?

You should use rgl rather than djmrgl. It now has most of the same functions plus a lot more. Then you can plot the ellipse as a wireframe or transparent object. See the demo(regression) example for that kind of drawing; demo(shapes3d) for ellipses. (The demo names are from memory, I don't have access to it right now.)

Duncan Murdoch

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
Re: [R] Collineariy Diagnostics
Dear Sundar and Antoine,

In addition, the vif function in the car package will calculate generalized variance-inflation factors.

Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sundar Dorai-Raj Sent: Tuesday, September 13, 2005 5:16 AM To: Antoine de Bary Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Collineariy Diagnostics

Antoine de Bary wrote: Hi, and thanks for your help. In order to do collinearity analysis I downloaded the perturb package. I ran a lm (regression) and on that the "colldiag" command to get condition numbers, but I get the following message: the variable XY with mode "numeric" was not found (it does the same with all predictors, although all variables are numeric and exist). Can anyone tell me how I can get around this problem? Is there another way to get "condition numbers"? What about VIF? Please return message to: [EMAIL PROTECTED]

I cannot comment on the perturb package. However, for condition numbers see ?kappa.lm, and for variance inflation factors see ?vif. The latter is in the Design package.

set.seed(1)
x1 <- rnorm(100)
x2 <- x1 + 0.1 * rnorm(100)
y <- rnorm(100)
f <- lm(y ~ x1 + x2)
vif(f)
kappa(f)

HTH, --sundar

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] is library loaded
Dear Omar,

The following function tests whether a package is in the search path (with the package name given in quotes):

packageLoaded <- function(name)
    0 != length(grep(paste("^package:", name, "$", sep=""), search()))

I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Omar Lakkis Sent: Tuesday, September 13, 2005 2:09 PM To: r-help@stat.math.ethz.ch Subject: [R] is library loaded

Is there a way to test if a library has been loaded? is.loaded does not give me what I want. I am looking to say:

if loaded(Rdbi) dbSendQuery(conn, q)
if loaded(RODBC) sqlQuery(conn, q)

I need this to support both unix and windows platforms, as I could not find a windows distribution for RdbiPgSQL. I am using R 2.1.0 and postgresql. I will be connecting to the database using Rdbi and RdbiPgSQL and have other developers using windows connect with RODBC, unless someone can suggest a better solution.

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
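Putting John's function together with Omar's intended dispatch gives a self-contained sketch (conn and q are placeholders from Omar's message, so the database calls are left commented):

```r
# Test whether a package is attached to the search path:
packageLoaded <- function(name)
    0 != length(grep(paste("^package:", name, "$", sep=""), search()))

# Omar's dispatch between the two database interfaces:
if (packageLoaded("Rdbi")) {
    ## dbSendQuery(conn, q)
} else if (packageLoaded("RODBC")) {
    ## sqlQuery(conn, q)
}
```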
Re: [R] is library loaded
Dear Robert,

packageLoaded() may well be a bad name, but loadedNamespaces() won't detect a package without a namespace. It therefore seemed safe to me to check the path, which would include packages both with and without namespaces. With respect to loading and attaching, I thought that library() both loaded a package (with or without a namespace) and attached it to the search path, but I must admit that I'm easily confused about these distinctions.

Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: Robert Gentleman [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 13, 2005 3:53 PM To: John Fox Cc: [EMAIL PROTECTED]; r-help@stat.math.ethz.ch Subject: Re: [R] is library loaded

Hi,

Almost surely this is a bad name. With the advent of name spaces it is important to distinguish between loading and attaching. This function tests for attached packages. To test for loaded packages we already have loadedNamespaces.

Best wishes, Robert

John Fox wrote: Dear Omar, The following function tests whether a package is in the search path (with the package name given in quotes):

packageLoaded <- function(name)
    0 != length(grep(paste("^package:", name, "$", sep=""), search()))

I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Omar Lakkis Sent: Tuesday, September 13, 2005 2:09 PM To: r-help@stat.math.ethz.ch Subject: [R] is library loaded

Is there a way to test if a library has been loaded? is.loaded does not give me what I want. I am looking to say:

if loaded(Rdbi) dbSendQuery(conn, q)
if loaded(RODBC) sqlQuery(conn, q)

I need this to support both unix and windows platforms, as I could not find a windows distribution for RdbiPgSQL. I am using R 2.1.0 and postgresql.
I will be connecting to the database using Rdbi and RdbiPgSQL and have other developers using windows connect with RODBC, unless someone can suggest a better solution.

-- Robert Gentleman, PhD Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M2-B876 PO Box 19024 Seattle, Washington 98109-1024 206-667-7700 [EMAIL PROTECTED]

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Rcommander and simple chisquare
Dear Christian,

From the Rcmdr menus, select Statistics -> Summaries -> Frequency distributions, and check the "Chi-square goodness-of-fit test" box in the resulting dialog. This will bring up a dialog box where you can enter hypothesized probabilities, from which expected frequencies will be calculated.

Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Christian Jost Sent: Thursday, September 15, 2005 5:40 AM To: r-help@stat.math.ethz.ch Subject: [R] Rcommander and simple chisquare

In this year's biostat teaching I will include Rcommander (it indeed simplifies syntax problems that make students frequently miss the core statistical problems). But I could not find how to make a simple chisquare comparison between observed frequencies and expected frequencies (e.g. in genetics, where you expect phenotypic frequencies corresponding to 3:1 for standard dominant/recessive alleles). Any idea where this feature might be hidden? Or could it be added to Rcommander?

Thanks, Christian.

ps: in case I am not making myself clear, can Rcommander be made to perform chisq.test(c(61,39), p=c(0.75,0.25))

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Rcommander and simple chisquare
Dear Christian,

The Rcmdr assumes that you have a data frame with the original data, in which the variable in question is a factor. The frequency distribution is constructed for the factor. Thus, in your example, you'd have 100 observations classified on a two-category factor. What you enter directly are the hypothesized probabilities.

I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: Christian Jost [mailto:[EMAIL PROTECTED] Sent: Thursday, September 15, 2005 11:38 AM To: John Fox; 'Philippe Grosjean' Cc: r-help@stat.math.ethz.ch Subject: RE: [R] Rcommander and simple chisquare

Dear John and Philippe,

thanks for your replies. I finally found this menu, but I am somewhat at a loss as to how I should enter the observed frequencies. To take my example below: if I enter a one-column data.frame with the numbers 61 and 39, John's indicated menu is not highlighted. If I add a second column containing some factor, the menu is highlighted but I cannot select the first column. However, if I edit the data and declare the first column to be of type 'character', I can select it in the menu dialog and declare the expected frequencies, but the chisquare output doesn't make any sense. For the moment I cannot make any sense of that :-( Any help most appreciated, or a link to the tutorial/faq that explains this kind of problem.

Thanks, Christian.

At 11:31 -0400 15/09/05, John Fox wrote: Dear Philippe, This does a chi-square test of independence in a contingency table, not a chi-square goodness-of-fit test (which is done in the Rcmdr via Statistics -> Summaries -> Frequency distributions).
Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Philippe Grosjean Sent: Thursday, September 15, 2005 7:32 AM To: Christian Jost Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Rcommander and simple chisquare

Hello,

Just look at Statistics -> Contingency tables. There is an option for making the chi-square test there.

Best, Philippe Grosjean

Prof. Philippe Grosjean

Christian Jost wrote: In this year's biostat teaching I will include Rcommander (it indeed simplifies syntax problems that make students frequently miss the core statistical problems). But I could not find how to make a simple chisquare comparison between observed frequencies and expected frequencies (e.g. in genetics, where you expect phenotypic frequencies corresponding to 3:1 for standard dominant/recessive alleles). Any idea where this feature might be hidden? Or could it be added to Rcommander? Thanks, Christian. ps: in case I am not making myself clear, can Rcommander be made to perform chisq.test(c(61,39), p=c(0.75,0.25))

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] text editor TinR?
Dear Mihai,

It's Tinn-R (with two n's), at http://www.sciviews.org/Tinn-R/.

John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mihai Nica Sent: Monday, September 19, 2005 4:07 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] text editor TinR?

Greetings,

Please help me remember the name of the tiny text editor that works with R ("TinR" maybe?). I cannot find it at all, and cannot remember it; it is really frustrating...

Thanks, Mihai Nica, ABD Jackson State University 170 East Griffith Street G5 Jackson, MS 39201 601 914 0361

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Are least-squares means useful or appropriate?
Dear Peter, Doug, and Felipe,

My effects package (on CRAN; also see the article at http://www.jstatsoft.org/counter.php?id=75&url=v08/i15/effect-displays-revised.pdf) will compute and graph adjusted effects of various kinds for linear and generalized linear models -- generalizing so-called least-squares means (or population marginal means, or adjusted means). A couple of comments: By default, the all.effects() function in the effects package computes effects for high-order terms in the model, absorbing terms marginal to them. You can ask the effect() function to compute an effect for a term that's marginal to a higher-order term, and it will do so with a warning, but this is rarely sensible. Peter's mention of floating variances (or quasi-variances) in this context is interesting, but what one would most like to see, I think, are the quasi-variances for the adjusted effects, that is, for terms merged with their lower-order relatives. These, for example, are unaffected by contrast coding. How to define reasonable quasi-variances in this context has been puzzling me for a while.

Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard Sent: Friday, September 23, 2005 10:23 AM To: Douglas Bates Cc: Felipe; R-help@stat.math.ethz.ch Subject: Re: [R] Are least-squares means useful or appropriate?

Douglas Bates [EMAIL PROTECTED] writes: On 9/20/05, Felipe [EMAIL PROTECTED] wrote:

Hi. My question was just theoretical. I was wondering if someone who was using SAS and R could give me their opinion on the topic. I was trying to use least-squares means for comparison in R, but then I found some indications against them, and I wanted to know if they had a good basis (as I told earlier, they were not much detailed). Greetings.
Felipe

As Deepayan said in his reply, the concept of least-squares means is associated with SAS and is not generally part of the theory of linear models in statistics. My vague understanding of these (I too am not a SAS user) is that they are an attempt to estimate the mean response for a particular level of a factor in a model in which that factor has a non-ignorable interaction with another factor. There is no clearly acceptable definition of such a thing.

(PD goes and fetches the SAS manual)

Well, yes, it'll do that too, although only if you ask for the lsmeans of A when an interaction like A*B is present in the model. This is related to the tests of main effects when an interaction is present using type III sums of squares, which has been beaten to death repeatedly on the list. In both cases, there seems to be an implicit assumption that categorical variables by nature come from an underlying fully balanced design.

If the interaction is absent from the model, the lsmeans are somewhat more sensible in that they at least reproduce the parameter estimates as contrasts between different groups. All continuous variables in the design will be set to their mean, but values for categorical design variables are weighted inversely as the number of groups. So if you're doing an lsmeans of lung function by smoking adjusted for age and sex, you get estimates for the mean of a population in which everyone has the same age and half are male and half are female. This makes some sense, but if you do it for sex adjusting for smoking and age, you are not only forcing the sexes to smoke equally much, but actually adjusting to smoking rates of 50%, which could be quite far from reality.

The whole operation really seems to revolve around 2 things: (1) pairwise comparisons between factor levels. This can alternatively be done fairly easily using parameter estimates for the relevant variable and associated covariances.
You don't really need all the mumbo-jumbo of adjusting to particular values of other variables. (2) plotting effects of a factor with error bars as if they were simple group means. This has some merit, since the standard parametrizations are misleading at times (e.g., if you choose the group with the least data as the reference level, the standard errors for the other groups will seem high). However, it seems to me that concepts like floating variances (see float() in the Epi package) are more to the point.

R is an interactive language where it is a simple matter to fit a series of models and base your analysis on a model that is appropriate. An approach of "give me the answer to any possible question about this model, whether or not it makes sense" is unnecessary. In many ways statistical
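John's effects package, mentioned at the top of this thread, computes the adjusted effects being discussed. A hedged sketch of its use (the model and data are illustrative -- Duncan ships with the car package -- and the function names follow the package's API as described above):

```r
library(car)      # for the Duncan occupational-prestige data
library(effects)

# A linear model with a factor ('type') and two covariates
mod <- lm(prestige ~ income + education + type, data = Duncan)

# Adjusted means for 'type', holding the other terms at typical values --
# the effects-package analogue of SAS's lsmeans
eff <- effect("type", mod)
summary(eff)      # estimates with confidence limits
plot(eff)         # effect display with error bars
```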
Re: [R] plot question when type = b and pch is a vector
Dear tokas,

How about:

x <- seq(0.01, 10, length = 20)
xx <- x[7]
x[7] <- NA
plot(c(0, 10), c(-20, 20), type = "n", xlab = "x",
     ylab = expression(2 * alpha * log(x)))
for(i in 1:4){
    lines(x, 2 * i * log(x), lty = 1)
    text(xx, 2 * i * log(xx), i)
}

I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of toka tokas Sent: Sunday, October 02, 2005 5:37 AM To: r-help@stat.math.ethz.ch Subject: [R] plot question when type = "b" and pch is a vector

Dear R users,

I've been struggling some days with the following problem: I'm interested in producing the following plot

x <- seq(0.01, 10, length = 20)
plot(c(0, 10), c(-20, 20), type = "n", xlab = "x",
     ylab = expression(2 * alpha * log(x)))
pch. <- rep(NA, length(x))
for(i in 1:4){
    pch.[7] <- as.character(i)
    lines(x, 2 * i * log(x), type = "b", pch = pch., lty = 1)
}

where all the line segments are connected, except for the 7th one where I've put the value of alpha -- in other words, I'd like to produce a line plot where the label appears at each line with some white space around it.

thanks in advance, tokas

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] generalized linear model for multinomial data?
Dear Hongyu, See multinom() in the nnet package, associated with Venables and Ripley's Modern Applied Statistics with S. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Hongyu Sun Sent: Sunday, October 02, 2005 3:07 PM To: r-help@stat.math.ethz.ch Subject: [R] generalized linear model for multinomial data? Dear All: Does R have a package, like SAS's generalized logits model, for nominal response data? I have searched but cannot find the Windows package. Many thanks, HS __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
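As a minimal illustration of the suggestion, a multinomial logit can be fit like this (a sketch using the built-in iris data, which is not from the original question; trace = FALSE merely suppresses the optimizer's progress output):

```r
library(nnet)  # a recommended package, shipped with standard R installations

# Fit a multinomial logit for a three-level nominal response
fit <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris,
                trace = FALSE)

# One row of coefficients per non-reference response level
coef(fit)

# Predicted class memberships
head(predict(fit, type = "class"))
```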
Re: [R] Rcmdr and scatter3d
(predict(mod, newdata=dat), grid.lines, grid.lines)
if (fill) rgl.surface(vals, vals, yhat, color=surface.col[j], alpha=0.5, lit=FALSE)
if (grid) rgl.surface(vals, vals, yhat, color=if (fill) grid.col else surface.col[j], alpha=0.5, lit=FALSE, front="lines", back="lines")
rgl.texts(0, predict(mod, newdata=data.frame(x=0, z=0, groups=group)), 0, paste(group, " "), adj=1, color=surface.col[j])
if (residuals){
    yy <- y[select.obs]
    xx <- x[select.obs]
    zz <- z[select.obs]
    fitted <- fitted(mod)
    rgl.lines(as.vector(rbind(xx,xx)), as.vector(rbind(yy,fitted)), as.vector(rbind(zz,zz)), col=surface.col[j])
    }
}
}
}
}
}
if (revolutions > 0) {
    for (i in 1:revolutions){
        for (angle in seq(1, 360, length=360/speed)) rgl.viewpoint(-angle, fov=fov)
        }
}
if (model.summary) return(summaries) else return(invisible(NULL))
}

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Naiara S. Pinto Sent: Tuesday, October 04, 2005 6:13 PM To: r-help@stat.math.ethz.ch Subject: [R] Rcmdr and scatter3d

Hi folks, I'd like to use scatter3d (which is in R commander) to plot more than one dataset in the same graph, each dataset with a different color. The kind of stuff you would do with hold on in Matlab. I read a recent message that was posted to this list with a similar problem, but I couldn't understand the reply. Could someone give me one example? How do you plot subgroups using scatter3d? Thanks a lot! Naiara. Naiara S. Pinto Ecology, Evolution and Behavior 1 University Station A6700 Austin, TX, 78712

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Rcmdr and scatter3d
Dear Ted, I assumed that since Naiara was using scatter3d(), he wants a 3D dynamic scatterplot. He could add points (actually, spheres) to the rgl graph produced by scatter3d() -- the analog of plot() followed by points() for a 2D graph -- but doing so would be much more work than plotting by groups. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: ecatchpole [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 04, 2005 8:55 PM To: John Fox Cc: 'Naiara S. Pinto'; r-help@stat.math.ethz.ch Subject: Re: [R] Rcmdr and scatter3d Naiara, Alternatively, instead of scatter3d, the analogy to "hold on" in Matlab is to use plot() for the first set of data, then points() for the remainder. See ?plot ?points Ted. On 05/10/05 11:18, John Fox wrote: Dear Naiara, Combine the data sets and differentiate among them with a factor. Then use the groups argument to scatter3d (see ?scatter3d). If you're using the R Commander to make the plot, the 3D scatterplot dialog box has a plot-by-groups button. You can also fit colour-coded regression surfaces by group. I've appended a new version of the scatter3d function, not yet in the Rcmdr package, which will also plot data ellipsoids (for the whole data set or by groups). I hope this helps, John --- snip -- John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Naiara S. Pinto Sent: Tuesday, October 04, 2005 6:13 PM To: r-help@stat.math.ethz.ch Subject: [R] Rcmdr and scatter3d Hi folks, I'd like to use scatter3d (which is in R commander) to plot more than one dataset in the same graph, each dataset with a different color. The kind of stuff you would do with "hold on" in Matlab. 
I read a recent message that was posted to this list with a similar problem, but I couldn't understand the reply. Could someone give me one example? How do you plot subgroups using scatter3d? Thanks a lot! Naiara. Naiara S. Pinto Ecology, Evolution and Behavior 1 University Station A6700 Austin, TX, 78712 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Dr E.A. Catchpole Visiting Fellow Univ of New South Wales at ADFA, Canberra, Australia and University of Kent, Canterbury, England - www.ma.adfa.edu.au/~eac - fax: +61 2 6268 8786 - ph: +61 2 6268 8895
Re: [R] testing non-linear component in mgcv:gam
Dear Denis, Take a closer look at the anova table: The models provide identical fits to the data. The differences in degrees of freedom and deviance between the two models are essentially zero, 5.5554e-10 and 2.353e-11 respectively. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot Sent: Wednesday, October 05, 2005 8:22 AM To: r-help@stat.math.ethz.ch Subject: [R] testing non-linear component in mgcv:gam Hi, I need further help with my GAMs. Most models I test are very obviously non-linear. Yet, to be on the safe side, I report the significance of the smooth (default output of mgcv's summary.gam) and confirm it deviates significantly from linearity. I do the latter by fitting a second model where the same predictor is entered without the s(), and then use anova.gam to compare the two. I thought this was the equivalent of the default output of anova.gam using package gam instead of mgcv. I wonder if this procedure is correct because one of my models appears to be linear. In fact mgcv estimates df to be exactly 1.0 so I could have stopped there. However I inadvertently repeated the procedure outlined above. I would have thought in this case the anova.gam comparing the smooth and the linear fit would for sure have been not significant. To my surprise, P was 6.18e-09! Am I doing something wrong when I attempt to confirm the non-parametric part of a smoother is significant? 
Here is my example case where the relationship does appear to be linear:

library(mgcv)
This is mgcv 1.3-7

Temp <- c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 0.38, 0.62, 0.88, 1.12, 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12, 3.38, 3.62, 3.88, 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62, 5.88, 6.12, 6.38, 6.62, 6.88, 7.12, 8.38, 13.62)
N.sets <- c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 29, 31, 22, 26, 24, 23, 15, 25, 24, 27, 19, 26, 24, 22, 13, 10, 2, 5, 3, 1, 1, 1, 1, 1)
wm.sed <- c(0.0, 0.016129032, 0.0, 0.062046512, 0.396459596, 0.189082949, 0.054757925, 0.142810440, 0.168005168, 0.180804428, 0.111439628, 0.128799505, 0.193707937, 0.105921610, 0.103497845, 0.028591837, 0.217894389, 0.020535469, 0.080389068, 0.105234450, 0.070213450, 0.050771363, 0.042074434, 0.102348837, 0.049748344, 0.019100478, 0.005203125, 0.101711864, 0.0, 0.0, 0.014808824, 0.0, 0.22200, 0.16700, 0.0, 0.0, 0.0)

sed.gam <- gam(wm.sed ~ s(Temp), weight = N.sets)
summary.gam(sed.gam)

Family: gaussian
Link function: identity
Formula: wm.sed ~ s(Temp)
Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.08403    0.01347   6.241 3.73e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
        edf Est.rank     F  p-value
s(Temp)   1        1 13.95 0.000666 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) = 0.554   Deviance explained = 28.5%
GCV score = 0.09904   Scale est. = 0.093686   n = 37

# testing non-linear contribution
sed.lin <- gam(wm.sed ~ Temp, weight = N.sets)
summary.gam(sed.lin)

Family: gaussian
Link function: identity
Formula: wm.sed ~ Temp
Parametric coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.162879   0.019847   8.207 1.14e-09 ***
Temp        -0.023792   0.006369  -3.736 0.000666 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) = 0.554   Deviance explained = 28.5%
GCV score = 0.09904   Scale est.
= 0.093686   n = 37

anova.gam(sed.lin, sed.gam, test = "F")
Analysis of Deviance Table
Model 1: wm.sed ~ Temp
Model 2: wm.sed ~ s(Temp)
  Resid. Df Resid. Dev         Df   Deviance      F   Pr(>F)
1 3.5000e+01      3.279
2 3.5000e+01      3.279 5.5554e-10  2.353e-11 0.4521 6.18e-09 ***

Thanks in advance, Denis Chabot

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
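The situation Denis describes -- a smooth that collapses to a straight line, so the smooth and linear models fit essentially identically -- is easy to reproduce end to end. Here is a self-contained sketch on simulated data (not Denis's data; all names and values are invented):

```r
library(mgcv)  # a recommended package, shipped with standard R installations

set.seed(1)
x <- runif(100)
y <- 2 * x + rnorm(100, sd = 0.3)   # the true relationship is linear

fit.smooth <- gam(y ~ s(x))
fit.linear <- gam(y ~ x)

# When the smooth shrinks to a line, its effective degrees of freedom
# are close to 1 and the two fits are nearly identical, so the anova
# comparison has near-zero df and deviance differences.
summary(fit.smooth)$edf
anova(fit.linear, fit.smooth, test = "F")
```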
Re: [R] transparent surface in rgl
Dear Paul, I don't have experience with rgl.postscript(), which is relatively new, but find that the png graphs produced by rgl.snapshot() are of reasonably good quality and preserve transparency. Perhaps the developers of the rgl package can shed more light on the matter. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Prof. Paul R. Fisher Sent: Wednesday, October 05, 2005 8:32 AM To: r-help@stat.math.ethz.ch Subject: [R] transparent surface in rgl Hi all I am a complete newbie to this list (just subscribed) and a newcomer to R (an S user from olden times). I have been using scatter3d to create a 3d scatter plot with surface. The graphic is created within the rgl package and I have used rgl.postscript to export it so I can generate a publication quality image. My problem is that the plotted surface is no longer transparent in the postscript output, i.e. the rgl.spheres that are behind the surface disappear in the postscript image. Can't seem to find any info on this anywhere. Am I doing something wrong? Is there an easy fix? Anyway, thanks. Hope I've not broken some netiquette rule sending this. Cheers, Paul Fisher. -- Prof. Paul R. Fisher, Chair in Microbiology, La Trobe University, VIC 3086, AUSTRALIA. Tel. + 61 3 9479 2229 Fax. + 61 3 9479 1222 Email. [EMAIL PROTECTED] Web. http://www.latrobe.edu.au/mcbg/my.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] testing non-linear component in mgcv:gam
Dear Denis, The chi-square test is formed in analogy to what's done for a GLM: The difference in residual deviance for the nested models is divided by the estimated scale parameter -- i.e., the estimated error variance for a model with normal errors. Otherwise, as you point out, the test would be dependent upon the scale of the response. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot Sent: Wednesday, October 05, 2005 9:04 AM To: John Fox Cc: R list Subject: Re: [R] testing non-linear component in mgcv:gam Hi John, On 2005-10-05 at 09:45, John Fox wrote: Dear Denis, Take a closer look at the anova table: The models provide identical fits to the data. The differences in degrees of freedom and deviance between the two models are essentially zero, 5.5554e-10 and 2.353e-11 respectively. I hope this helps, John This is one of my difficulties. In some examples I found on the web, the difference in deviance is compared directly against the chi-squared distribution. But my y variable has a very small range (between 0 and 0.5, most of the time) so the difference in deviance is always very small, and if I compared it against the chi-squared distribution as I have seen done in examples, the non-linear component would always be not significant. Yet it is (with one exception), tested with both mgcv:gam and gam:gam. I think the examples I have read were wrong in this regard; the scale factor seen in mgcv output seems to intervene. But exactly how is still mysterious to me and I hesitate to judge the size of the deviance difference myself. I agree it is near zero in my example. I guess I need to have more experience with these models to better interpret the output... 
Denis -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot Sent: Wednesday, October 05, 2005 8:22 AM To: r-help@stat.math.ethz.ch Subject: [R] testing non-linear component in mgcv:gam Hi, I need further help with my GAMs. Most models I test are very obviously non-linear. Yet, to be on the safe side, I report the significance of the smooth (default output of mgcv's summary.gam) and confirm it deviates significantly from linearity. I do the latter by fitting a second model where the same predictor is entered without the s(), and then use anova.gam to compare the two. I thought this was the equivalent of the default output of anova.gam using package gam instead of mgcv. I wonder if this procedure is correct because one of my models appears to be linear. In fact mgcv estimates df to be exactly 1.0 so I could have stopped there. However I inadvertently repeated the procedure outlined above. I would have thought in this case the anova.gam comparing the smooth and the linear fit would for sure have been not significant. To my surprise, P was 6.18e-09! Am I doing something wrong when I attempt to confirm the non-parametric part of a smoother is significant? 
Here is my example case where the relationship does appear to be linear: library(mgcv) This is mgcv 1.3-7 Temp <- c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 0.38, 0.62, 0.88, 1.12, 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12, 3.38, 3.62, 3.88, 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62, 5.88, 6.12, 6.38, 6.62, 6.88, 7.12, 8.38, 13.62) N.sets <- c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 29, 31, 22, 26, 24, 23, 15, 25, 24, 27, 19, 26, 24, 22, 13, 10, 2, 5, 3, 1, 1, 1, 1, 1) wm.sed <- c(0.0, 0.016129032, 0.0, 0.062046512, 0.396459596, 0.189082949, 0.054757925, 0.142810440, 0.168005168, 0.180804428, 0.111439628, 0.128799505, 0.193707937, 0.105921610, 0.103497845, 0.028591837, 0.217894389, 0.020535469, 0.080389068, 0.105234450, 0.070213450, 0.050771363, 0.042074434, 0.102348837, 0.049748344, 0.019100478, 0.005203125, 0.101711864, 0.0, 0.0, 0.014808824, 0.0, 0.22200, 0.16700, 0.0, 0.0, 0.0) sed.gam <- gam(wm.sed~s(Temp),weight=N.sets) summary.gam(sed.gam) Family: gaussian Link function: identity Formula: wm.sed ~ s(Temp) Parametric coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 0.08403 0.01347 6.241 3.73e-07 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Approximate significance of smooth terms: edf Est.rank F p-value s(Temp) 1 1 13.95 0.000666
Re: [R] testing non-linear component in mgcv:gam
Dear Denis, You got me: I would have thought from ?summary.gam that this would be the same as the adjusted R^2 for a linear model. Note, however, that the percentage of deviance explained checks with the R^2 from the linear model, as expected. Maybe you should address this question to the package author. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: Denis Chabot [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 05, 2005 3:33 PM To: John Fox Cc: R list Subject: Re: [R] testing non-linear component in mgcv:gam Thank you everyone for your help, but my introduction to GAM is turning my brain to mush. I thought the one part of the output I understood the best was r-sq (adj), but now even this is becoming foggy. In my original message I mentioned a gam fit that turns out to be a linear fit. Out of curiosity I analysed it with a linear predictor only with the mgcv package, and then as a linear model. The output was identical in both, but the r-sq (adj) was 0.55 in mgcv and 0.26 in lm. In lm I hope that my interpretation that 26% of the variance in y is explained by the linear relationship with x is valid. Then what does r2 mean in mgcv? Denis

summary.gam(lin)
Family: gaussian
Link function: identity
Formula: wm.sed ~ Temp
Parametric coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.162879   0.019847   8.207 1.14e-09 ***
Temp        -0.023792   0.006369  -3.736 0.000666 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R-sq.(adj) = 0.554   Deviance explained = 28.5%
GCV score = 0.09904   Scale est. = 0.093686   n = 37

summary(sed.true.lin)
Call: lm(formula = wm.sed ~ Temp, weights = N.sets)
Residuals:
    Min      1Q  Median      3Q     Max
-0.6138 -0.1312 -0.0325  0.1089  1.1449
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.162879   0.019847   8.207 1.14e-09 ***
Temp        -0.023792   0.006369  -3.736 0.000666 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1 Residual standard error: 0.3061 on 35 degrees of freedom Multiple R-Squared: 0.285, Adjusted R-squared: 0.2646 F-statistic: 13.95 on 1 and 35 DF, p-value: 0.000666 On 2005-10-05 at 09:45, John Fox wrote: Dear Denis, Take a closer look at the anova table: The models provide identical fits to the data. The differences in degrees of freedom and deviance between the two models are essentially zero, 5.5554e-10 and 2.353e-11 respectively. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot Sent: Wednesday, October 05, 2005 8:22 AM To: r-help@stat.math.ethz.ch Subject: [R] testing non-linear component in mgcv:gam Hi, I need further help with my GAMs. Most models I test are very obviously non-linear. Yet, to be on the safe side, I report the significance of the smooth (default output of mgcv's summary.gam) and confirm it deviates significantly from linearity. I do the latter by fitting a second model where the same predictor is entered without the s(), and then use anova.gam to compare the two. I thought this was the equivalent of the default output of anova.gam using package gam instead of mgcv. I wonder if this procedure is correct because one of my models appears to be linear. In fact mgcv estimates df to be exactly 1.0 so I could have stopped there. However I inadvertently repeated the procedure outlined above. I would have thought in this case the anova.gam comparing the smooth and the linear fit would for sure have been not significant. To my surprise, P was 6.18e-09! Am I doing something wrong when I attempt to confirm the non-parametric part of a smoother is significant? 
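On the lm() side of this comparison, at least, the adjusted R^2 is a simple deterministic function of the multiple R^2, the sample size, and the number of predictors. A quick check on made-up data (not Denis's) is:

```r
set.seed(42)
x <- 1:40
y <- 0.5 * x + rnorm(40, sd = 5)
fit <- lm(y ~ x)

s <- summary(fit)
n <- length(y)
p <- 1  # number of predictors

# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)
r2.adj <- 1 - (1 - s$r.squared) * (n - 1) / (n - p - 1)
all.equal(r2.adj, s$adj.r.squared)  # TRUE
```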
Here is my example case where the relationship does appear to be linear: library(mgcv) This is mgcv 1.3-7 Temp <- c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 0.38, 0.62, 0.88, 1.12, 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12, 3.38, 3.62, 3.88, 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62, 5.88, 6.12, 6.38, 6.62, 6.88, 7.12, 8.38, 13.62) N.sets <- c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 29, 31, 22, 26, 24, 23, 15, 25, 24, 27, 19, 26, 24, 22, 13, 10, 2, 5, 3, 1, 1, 1, 1, 1) wm.sed <- c(0.0, 0.016129032, 0.0, 0.062046512, 0.396459596, 0.189082949
Re: [R] R/S-Plus equivalent to Genstat predict: predictions over averages of covariates
Dear Peter, See the effects package, described in http://www.jstatsoft.org/counter.php?id=75&url=v08/i15/effect-displays-revised.pdf. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dunn Sent: Wednesday, October 05, 2005 9:06 PM To: R-help mailing list Subject: [R] R/S-Plus equivalent to Genstat predict: predictions over averages of covariates Hi all I'm doing some things with a colleague comparing different sorts of models. My colleague has fitted a number of glms in Genstat (which I have never used), while the glm I have been using is only available for R. He has a spreadsheet of fitted means from each of his models obtained from using the Genstat predict function. For example, suppose we fit a model of the type

glm.out <- glm(y ~ factor(F1) + factor(F2) + X1 + poly(X2, 2) + poly(X3, 2), family = ...)

Then he produces a table like this (made up, but similar):

F1(level1)  12.2
F1(level2)  14.2
F1(level3)  15.3
F2(level1)  10.3
F2(level2)   9.1
X1=0    10.2
X1=0.5  10.4
X1=1    10.4
X1=1.5  10.5
X1=2    10.9
X1=2.5  11.9
X1=3    11.8
X2=0    12.0
X2=0.5  12.2
X2=1    12.5
X2=1.5  12.9
X2=2    13.0
X2=2.5  13.1
X2=3    13.5

Each of the numbers is a predicted mean. So when X1=0, on average we predict an outcome of 10.2. To obtain these figures in Genstat, he uses the Genstat predict function. When I asked for an explanation of how it was done (i.e. to make the predictions, what values of the other covariates were used) I was told: So, for a one-dimensional table of fitted means for any factor (or variate), all other variates are set to their average values; and the factor constants (including the first, at zero) are given a weighted average depending on their respective numbers of observations. So for quantitative variables (such as pH), one uses the mean pH in the data set when making the predictions. 
Reasonable and easy. But for categorical variables (like Month), he implies we use a weighted average of the fitted coefficients for all the months, depending on the proportion of times those factor levels appear in the data. (I hope I explained that OK...) Is there an equivalent way in R or S-Plus of doing this? I have to do it for a number of sites and species, so an automated way would be useful. I have tried searching to no avail (but may not be searching on the correct terms), and tried hard-coding something myself as yet unsuccessfully: The poly terms and the use of the weighted averaging over the factor levels are proving a bit too much for my limited skills. Any assistance appreciated. (Any clarification of what I mean can be provided if I have not been clear.) Thanks, as always. P.

version
         _
platform i386-pc-linux-gnu
arch     i386
os       linux-gnu
system   i386, linux-gnu
status
major    2
minor    1.0
year     2005
month    04
day      18
language R

-- Dr Peter Dunn | Senior Lecturer in Statistics Faculty of Sciences, University of Southern Queensland Web: http://www.sci.usq.edu.au/staff/dunn Email: dunn at usq.edu.au CRICOS: QLD 00244B | NSW 02225M | VIC 02387D | WA 02521C __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
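The quantitative-covariate part of what Genstat does is easy to sketch by hand with predict(): build a newdata frame with the factor at each of its levels and the numeric covariates fixed at their means. (Made-up data and names; the weighted averaging over *other* factors that Genstat performs is not shown here -- automating that is what the effects package is for.)

```r
# A hand-rolled version of "fitted means" for one factor, with the
# numeric covariate held at its mean (simulated data, for illustration)
set.seed(1)
d <- data.frame(F1 = factor(rep(c("a", "b", "c"), each = 20)),
                X1 = runif(60))
d$y <- as.numeric(d$F1) + 2 * d$X1 + rnorm(60)

fit <- glm(y ~ F1 + X1, data = d)

# One row per factor level, X1 fixed at its sample mean
nd <- data.frame(F1 = levels(d$F1), X1 = mean(d$X1))
cbind(nd, fitted.mean = predict(fit, newdata = nd, type = "response"))
```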
Re: [R] data.frame error using sem package
Dear Suzanne, Take a look at your model specification:

p.ram
     [,1]            [,2]            [,3]
[1,] LM1 -> LSMA     LM1 -> LSMA     NA
[2,] LSMA -> RSMA    LSMA -> RSMA    NA
[3,] RSMA -> RM1     RSMA -> RM1     NA
[4,] LSMA <-> LSMA   LSMA <-> LSMA   NA
[5,] RSMA <-> RSMA   RSMA <-> RSMA   NA
[6,] RM1 <-> RM1     RM1 <-> RM1     NA

This matrix should have three columns, the first giving the path, the second the name of the corresponding parameter, and the third the start value for the parameter (or NA if you want sem() to compute a start value). You've apparently left out the parameter names. Please see the sem examples for details and the paper at http://socserv.socsci.mcmaster.ca/jfox/sem-package.pdf. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Suzanne Witt Sent: Thursday, October 06, 2005 2:55 PM To: r-help@stat.math.ethz.ch Subject: [R] data.frame error using sem package I am trying to use sem to measure effective connectivity among four brain regions. I have pasted the code that I am trying to run since that seems easier than trying to come up with another example. The input data is time series data taken from SPM; they are each 1x121 columns of numbers. I get the error either when I source the whole code, or if I enter it line by line when I go to get the summary. Thanks, Suzanne

library(sem)

# Load the region timecourses.
lsma1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_LSMA.dat")
rsma1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_RSMA.dat")
lmc1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_LM1.dat")
rmc1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_RM1.dat")

# Combine all the timecourses from each session into a single data frame and name the columns appropriately. 
lcomf <- cbind(lsma1, rsma1, lmc1, rmc1)
names(lcomf) <- c("LSMA", "RSMA", "LM1", "RM1")

# Type this at the command line to see a summary of your data
str(lcomf)

# Set up the structural equation model.
p.ram <- matrix(c(
    'LM1 -> LSMA', 'LM1 -> LSMA', NA,
    'LSMA -> RSMA', 'LSMA -> RSMA', NA,
    'RSMA -> RM1', 'RSMA -> RM1', NA,
    'LSMA <-> LSMA', 'LSMA <-> LSMA', NA,
    'RSMA <-> RSMA', 'RSMA <-> RSMA', NA,
    'RM1 <-> RM1', 'RM1 <-> RM1', NA),
    ncol = 3, byrow = TRUE)

# Tell which variables are exogenous ('fixed').
p.fixed <- c('LM1')

# Do the fitting for session 1.
C <- cor(lcomf)
nt <- dim(lcomf)[1]
#attach(lcomf)
lcomf.results <- sem(p.ram, C, nt, obs.variables = rownames(C), fixed.x = p.fixed)

# Check out the results using the summary function
summary(lcomf.results)

[[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
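For reference, a ram matrix with all three columns filled in, as John describes, might look like the following sketch (the parameter names 'b1', 'v1', etc. are invented for illustration, and the arrow directions are only a guess at the intended model):

```r
# Three columns: path, parameter name, start value (NA = let sem choose).
# Parameter names ("b1", ..., "v3") are made up for illustration.
p.ram <- matrix(c(
    'LM1 -> LSMA',   'b1', NA,
    'LSMA -> RSMA',  'b2', NA,
    'RSMA -> RM1',   'b3', NA,
    'LSMA <-> LSMA', 'v1', NA,
    'RSMA <-> RSMA', 'v2', NA,
    'RM1 <-> RM1',   'v3', NA),
    ncol = 3, byrow = TRUE)
p.ram
```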
Re: [R] zip package
Dear Sara, It looks to me as if there are three problems here: (1) Is the zip file for the package really at c:/ProgramFiles/R/rw2011/library/lpSolve_1.1.9? That is, isn't there a space in Program Files? (2) You have to specify repos=NULL to install from a local zip file, as ?install.packages tells you. (3) You don't seem to be using the destdir argument correctly; you can omit it. Why not avoid all this and just use the R for Windows menus: Packages -> Install package(s) from local zip files? I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sara Mouro Sent: Friday, October 14, 2005 4:45 AM To: r-help@stat.math.ethz.ch Subject: [R] zip package Dear all I can not understand how to install the package lpsolve_1.1.9.zip I have read the FAQ and the help pages carefully, but it is still not clear for me. I have tried the following (and obtained the respective error messages):

install.packages("c:/ProgramFiles/R/rw2011/library/lpSolve_1.1.9",
                 destdir = "c:/ProgramFiles/R/rw2011/library/lpSolve")
Mensagem de aviso: no package 'c:/Program Files/R/rw2011/library/lpSolve_1.1.9' at the repositories in: download.packages(pkgs, destdir = tmpd, available = available,

install.packages("lpSolve_1.1.9",
                 destdir = "c:/ProgramFiles/R/rw2011/library/lpSolve")
Erro em download.packages(pkgs, destdir = tmpd, available = available, : 'destdir' is not a directory

install.packages("lpSolve_1.1.9", destdir = NULL)
Mensagem de aviso: no package 'lpSolve_1.1.9' at the repositories in: download.packages(pkgs, destdir = tmpd, available = available,

Could you please tell me how to do that, or where can I find a simple example of that kind of installation? Thank you so much and sorry for such a basic question. 
Sara Mouro [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
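The menu route John suggests corresponds to a single call; the key point is repos = NULL, which tells install.packages() that its first argument is a local file rather than a package name to look up in a repository (the path below is illustrative only, and the full file name including the .zip extension is required):

```r
# Install from a local zip file on Windows; repos = NULL means
# "this is a file path, not a repository package name".
# The path is hypothetical -- substitute the real location of the zip.
install.packages("C:/Program Files/R/rw2011/library/lpSolve_1.1.9.zip",
                 repos = NULL)
```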
Re: [R] Test for equality of coefficients in multivariate multiple regression
Dear Berwin, Simply stacking the problems and treating the resulting observations as independent will give you the correct coefficients, but incorrect coefficient variances and artificially zero covariances. The approach that I suggested originally -- testing a linear hypothesis using the coefficient estimates and covariances from the multivariate linear model -- seems simple enough. For example, to test that all three coefficients are the same across the two equations,

b <- as.vector(coef(x.mlm))
V <- vcov(x.mlm)
L <- c(1, 0, 0, -1,  0,  0,
       0, 1, 0,  0, -1,  0,
       0, 0, 1,  0,  0, -1)
L <- matrix(L, nrow = 3, byrow = TRUE)
t(L %*% b) %*% solve(L %*% V %*% t(L)) %*% (L %*% b)

The test statistic is chi-square with 3 df under the null hypothesis. (Note: not checked carefully.) (BTW, it's a bit unclear to me how much of this exchange was on r-help, but I'm copying to r-help since at least one of Ulrich's messages referring to alternative approaches appeared there. I hope that's OK.) Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: Berwin A Turlach [mailto:[EMAIL PROTECTED] On Behalf Of Berwin A Turlach Sent: Tuesday, July 18, 2006 9:28 PM To: Andrew Robinson Cc: Ulrich Keller; John Fox Subject: Re: [R] Test for equality of coefficients in multivariate multiple regression G'day all, AR == Andrew Robinson [EMAIL PROTECTED] writes: AR> I suggest that you try to rewrite the model system into a AR> single mixed-effects model, [...] Why a mixed-effect model, wouldn't a fixed effect be o.k. too? 
Something like:

DF <- data.frame(x1 = rep(c(0,1), each = 50), x2 = rep(c(0,1), 50))
tmp <- rnorm(100)
DF$y1 <- tmp + DF$x1*.5 + DF$x2*.3 + rnorm(100, 0, .5)
DF$y2 <- tmp + DF$x1*.5 + DF$x2*.7 + rnorm(100, 0, .5)
x.mlm <- lm(cbind(y1, y2) ~ x1 + x2, data = DF)
coef(x.mlm)
                     y1          y2
(Intercept) -0.08885266 -0.05749196
x1           0.33749086  0.60395258
x2           0.72017894  1.11932077

DF2 <- with(DF, data.frame(y = c(y1, y2)))
DF2$x11 <- with(DF, c(x1, rep(0,100)))
DF2$x21 <- with(DF, c(x2, rep(0,100)))
DF2$x12 <- with(DF, c(rep(0,100), x1))
DF2$x22 <- with(DF, c(rep(0,100), x2))
DF2$x1 <- with(DF, c(x1, x1))
DF2$wh <- rep(c(0,1), each = 100)

fm1 <- lm(y ~ wh + x11 + x21 + x12 + x22, DF2)
fm1
Call: lm(formula = y ~ wh + x11 + x21 + x12 + x22, data = DF2)
Coefficients:
(Intercept)       wh      x11      x21      x12      x22
   -0.08885  0.03136  0.33749  0.72018  0.60395  1.11932

fm2 <- lm(y ~ wh + x1 + x21 + x22, DF2)
anova(fm2, fm1)
Analysis of Variance Table
Model 1: y ~ wh + x1 + x21 + x22
Model 2: y ~ wh + x11 + x21 + x12 + x22
  Res.Df     RSS Df Sum of Sq      F Pr(>F)
1    195 246.919
2    194 246.031  1     0.888 0.6998 0.4039

Cheers, Berwin == Full address Berwin A Turlach Tel.: +61 (8) 6488 3338 (secr) School of Mathematics and Statistics +61 (8) 6488 3383 (self) The University of Western Australia FAX : +61 (8) 6488 1028 35 Stirling Highway Crawley WA 6009 e-mail: [EMAIL PROTECTED] Australia http://www.maths.uwa.edu.au/~berwin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
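Putting the Wald test together with simulated data of the kind Berwin uses gives a runnable end-to-end sketch (seed and effect sizes invented; note that the middle matrix must be inverted for the statistic to be chi-squared):

```r
set.seed(123)
DF <- data.frame(x1 = rep(c(0, 1), each = 50), x2 = rep(c(0, 1), 50))
tmp <- rnorm(100)
DF$y1 <- tmp + DF$x1 * 0.5 + DF$x2 * 0.3 + rnorm(100, 0, 0.5)
DF$y2 <- tmp + DF$x1 * 0.5 + DF$x2 * 0.7 + rnorm(100, 0, 0.5)

x.mlm <- lm(cbind(y1, y2) ~ x1 + x2, data = DF)

b <- as.vector(coef(x.mlm))  # 6 coefficients, stacked equation by equation
V <- vcov(x.mlm)             # 6 x 6 covariance matrix of the coefficients

# H0: each of the 3 coefficients is equal across the two equations
L <- matrix(c(1, 0, 0, -1,  0,  0,
              0, 1, 0,  0, -1,  0,
              0, 0, 1,  0,  0, -1), nrow = 3, byrow = TRUE)

wald <- as.numeric(t(L %*% b) %*% solve(L %*% V %*% t(L)) %*% (L %*% b))
p <- pchisq(wald, df = 3, lower.tail = FALSE)
c(statistic = wald, p.value = p)
```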
Re: [R] equality constraint
Dear Ana Maria, How about replacing the last of the four parameters with the negative of the sum of the others? I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED] Sent: Friday, July 21, 2006 4:31 PM To: r-help@stat.math.ethz.ch Subject: [R] equality constraint

I would like to optimize a (log-)likelihood function subject to linear equality constraints on the parameters. My function has eight parameters. The first, third, and fourth are greater than zero; the second must be less than zero. The problem is that the last four must sum to zero. Function constrOptim handles only inequality constraints. Is there an alternative way to deal with this problem? Ana Maria
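John's suggestion amounts to optimizing over seven free parameters and reconstructing the eighth. A minimal sketch of that reparameterization (the objective here is an arbitrary stand-in, not Ana Maria's likelihood; the sign constraints on the first two parameters are imposed with method = "L-BFGS-B" box bounds):

```r
# hypothetical objective standing in for the negative log-likelihood
negloglik <- function(theta) {
  sum((theta - c(1, -1, 2, 1, 0.5, -0.25, 0.25, -0.5))^2)
}

# optimize over 7 free parameters; the 8th is minus the sum of
# parameters 5-7, so parameters 5-8 always sum to zero
obj <- function(free) {
  theta <- c(free, -sum(free[5:7]))
  negloglik(theta)
}

fit <- optim(c(1, -1, 1, 1, 0, 0, 0), obj, method = "L-BFGS-B",
             lower = c(1e-8, -Inf, 1e-8, 1e-8, rep(-Inf, 3)),
             upper = c(Inf, -1e-8, rep(Inf, 5)))
theta.hat <- c(fit$par, -sum(fit$par[5:7]))
sum(theta.hat[5:8])  # zero by construction
```

The equality constraint is thus eliminated rather than enforced, which is why constrOptim() is not needed for it.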
[R] [R-pkgs] Version 0.9-4 of sem to CRAN
Dear Kurt, I've uploaded a new version (0.9-4) of the sem package to CRAN. I hope that everything is well with you. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox ___ R-packages mailing list [EMAIL PROTECTED] https://stat.ethz.ch/mailman/listinfo/r-packages
Re: [R] Syntax of Levene's test
Dear Paul, The argument y is the response variable and group is a factor defining groups (as ?levene.test says). If you have more than one factor, then you can use interaction() to create from them a factor with levels given by the product set of the levels of the individual factors. Here's an example:

library(car)
data(Moore)
attach(Moore)
levene.test(conformity, interaction(fcategory, partner.status))

Levene's Test for Homogeneity of Variance
      Df F value Pr(>F)
group  5  1.4694 0.2219
      39

levels(interaction(fcategory, partner.status))
[1] "high.high"   "low.high"    "medium.high" "high.low"    "low.low"
[6] "medium.low"

levels(fcategory)
[1] "high"   "low"    "medium"

levels(partner.status)
[1] "high" "low"

I'll add a couple of examples to the help page. I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Smith Sent: Wednesday, August 02, 2006 5:33 AM To: r-help@stat.math.ethz.ch Subject: [R] Syntax of Levene's test

Dear All, I am trying to use Levene's test (of package car), but I do not understand quite well how to use it. '?levene.test' unfortunately does not provide any example. My data are in a data frame and correspond to 4 factors plus response. Could someone please give me an example of how to use the command levene.test(y, group)? Thanks in advance, Paul
Re: [R] Syntax of Levene's test
Dear Paul, Levene's test tests the null hypothesis that the variances are equal, so a small p-value suggests that they are not. Looking at your output, it seems odd that you have as many as 96 groups. John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Smith Sent: Wednesday, August 02, 2006 5:02 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] Syntax of Levene's test

On 8/2/06, John Fox <[EMAIL PROTECTED]> wrote: [...]

Thanks, John. Now I understand how to use levene.test. There is only one question remaining: does the null hypothesis correspond to homogeneity of variances? I.e., should one conclude that

Levene's Test for Homogeneity of Variance
       Df F value    Pr(>F)
group  95  3.5919 < 2.2e-16 ***
      864

tells us that the hypothesis that the variances are equal is (highly) significant? Paul

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Doubt about Student t distribution simulation
Dear Jose, The problem is that you're using the population standard deviation (sigma) rather than the sample SD of each sample [i.e., t[i] = (mean(amo.i) - mu) / (sd(amo.i) / sqrt(n))], so your values should be normally distributed, as they appear to be. A couple of smaller points: (1) Even after this correction, you're sampling from a discrete population (albeit with replacement), and so the values won't be exactly t-distributed. You could draw the samples directly from N(mu, sigma) instead. (2) It would be preferable to make a quantile-comparison plot against the t-distribution, since you'd get a better picture of what's going on in the tails. I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jose Claudio Faria Sent: Friday, August 04, 2006 3:09 PM To: R-help@stat.math.ethz.ch Subject: [R] Doubt about Student t distribution simulation

Dear R list, I would like to illustrate the origin of the Student t distribution using R. So, if (sample.mean - pop.mean) / standard.error(sample.mean) has a t distribution with (sample.size - 1) degrees of freedom, what is wrong with the simulation below?
I think that the theoretical curve should agree with the relative frequencies of the t values calculated:

#== begin options ==============
# parameters
mu    = 10
sigma = 5

# size of sample
n = 3

# repetitions
nsim = 1

# histogram parameter
nchist = 150
#== end options ================

t   = numeric()
pop = rnorm(1, mean = mu, sd = sigma)

for (i in 1:nsim) {
  amo.i = sample(pop, n, replace = TRUE)
  t[i]  = (mean(amo.i) - mu) / (sigma / sqrt(n))
}

win.graph(w = 5, h = 7)
split.screen(c(2,1))

screen(1)
hist(t, main = "histogram", breaks = nchist, col = "lightgray",
     xlab = '', ylab = "Fi", font.lab = 2, font = 2)

screen(2)
hist(t, probability = T, main = 'f.d.p and histogram',
     breaks = nchist, col = 'lightgray',
     xlab = 't', ylab = 'f(t)', font.lab = 2, font = 2)

x = t
curve(dt(x, df = n-1), add = T, col = "red", lwd = 2)

Many thanks for any help, ___ Jose Claudio Faria Brasil/Bahia/Ilheus/UESC/DCET Estatística Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
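Putting John's two corrections together, here is a sketch of the repaired simulation: each t statistic uses its own sample's SD, the samples are drawn directly from N(mu, sigma), and a quantile-comparison plot replaces the histogram (nsim = 10000 is chosen here purely for illustration):

```r
set.seed(123)
mu <- 10; sigma <- 5; n <- 3; nsim <- 10000

t.stat <- replicate(nsim, {
  amo <- rnorm(n, mean = mu, sd = sigma)    # draw directly from N(mu, sigma)
  (mean(amo) - mu) / (sd(amo) / sqrt(n))    # sample SD, not sigma
})

# quantile-comparison plot against t(n - 1), per John's point (2)
qqplot(qt(ppoints(nsim), df = n - 1), t.stat,
       xlab = "Theoretical t quantiles", ylab = "Simulated t statistics")
abline(0, 1, col = "red")
```

The points should now follow the reference line in the middle of the distribution, with the usual scatter in the extreme tails.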
Re: [R] Doubt about Student t distribution simulation
Dear Sundar, Try qq.plot(t, dist="t", df=n-1) from the car package, which includes a 95-percent pointwise confidence envelope that helps you judge how extreme the outliers are relative to expectations. Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: Sundar Dorai-Raj [mailto:[EMAIL PROTECTED] Sent: Friday, August 04, 2006 3:27 PM To: John Fox Cc: [EMAIL PROTECTED]; R-help@stat.math.ethz.ch Subject: Re: [R] Doubt about Student t distribution simulation

Hi, Jose/John, Here's an example to help Jose that highlights John's advice. It also includes set.seed, which should be included in all simulations posted to R-help.

set.seed(42)
mu <- 10
sigma <- 5
n <- 3
nsim <- 1
m <- matrix(rnorm(n * nsim, mu, sigma), nsim, n)
t <- apply(m, 1, function(x) (mean(x) - mu) / (sd(x) / sqrt(n)))
library(lattice)
qqmath(t, distribution = function(x) qt(x, n - 1),
       panel = function(x, ...) {
         panel.qqmath(x, col = "darkblue", ...)
         panel.qqmathline(x, col = "darkred", ...)
       })

With n = 3, expect a few outliers. --sundar

John Fox wrote: Dear Jose, The problem is that you're using the population standard deviation (sigma) rather than the sample SD of each sample [i.e., t[i] = (mean(amo.i) - mu) / (sd(amo.i) / sqrt(n))], so your values should be normally distributed, as they appear to be. A couple of smaller points: (1) Even after this correction, you're sampling from a discrete population (albeit with replacement), and so the values won't be exactly t-distributed. You could draw the samples directly from N(mu, sigma) instead. (2) It would be preferable to make a quantile-comparison plot against the t-distribution, since you'd get a better picture of what's going on in the tails.
Re: [R] polychoric correlation error
Dear Janet, Because you didn't set the value of the random-number-generator seed, your example isn't precisely reproducible, but the problem is apparent anyway:

set.seed(12345)
n <- 100
test.x <- rnorm(n, mean=0, sd=1)
test.c <- test.x + rnorm(n, mean=0, sd=.5)
thresh.x <- c(-2.5, -1, -.5, .5, 1000)
thresh.c <- c(-1, 1, 2, 3, 1000)
discrete.x <- discrete.c <- vector(length=n)
for (i in 1:n) {
  discrete.x[i] <- which.min(thresh.x < test.x[i])
  discrete.c[i] <- which.min(thresh.c < test.c[i])
}

table(discrete.x, discrete.c)
          discrete.c
discrete.x  1  2  3  4  5
         2 12  1  0  0  0
         3  3 12  0  0  0
         4  2 19  2  0  0
         5  0 18 21  9  1

cor(test.x, test.c)
[1] 0.9184189

pc <- polychor(discrete.x, discrete.c, std.err=T, ML=T)
Warning messages:
1: NaNs produced in: log(x)
2: NaNs produced in: log(x)
3: NaNs produced in: log(x)

pc

Polychoric Correlation, ML est. = 0.9077 (0.03314)
Test of bivariate normality: Chisquare = 3.103, df = 11, p = 0.9893

  Row Thresholds
  Threshold Std.Err.
1  -1.12200   0.1609
2  -0.56350   0.1309
3   0.03318   0.1235

  Column Thresholds
  Threshold Std.Err.
1   -0.9389   0.1489
2    0.4397   0.1292
3    1.2790   0.1707
4    2.3200   0.3715

The variables that you've created are indeed bivariate normal, but they are highly correlated, and your choice of cut points makes it hard to estimate the correlation from the contingency table, apparently producing some difficulty in the maximization of the likelihood. Nevertheless, the ML estimates of the correlation and thresholds for the set of data above are pretty good. (In your case, the optimization failed.)
BTW, a more straightforward way to create the categorical variables would be

discrete.x <- cut(test.x, c(-Inf, -2.5, -1, -.5, .5, Inf))
discrete.c <- cut(test.c, c(-Inf, -1, 1, 2, 3, Inf))

I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rosenbaum, Janet Sent: Friday, August 04, 2006 5:49 PM To: r-help@stat.math.ethz.ch Subject: [R] polychoric correlation error

Dear all, I get a strange error when I find polychoric correlations with the ML method, which I have been able to reproduce using randomly generated data. What is wrong? I realize that the data I generated randomly are a bit strange, but it is the only way I could duplicate the error message.

n <- 100
test.x <- rnorm(n, mean=0, sd=1)
test.c <- test.x + rnorm(n, mean=0, sd=.5)
thresh.x <- c(-2.5, -1, -.5, .5, 1000)
thresh.c <- c(-1, 1, 2, 3, 1000)
discrete.x <- discrete.c <- vector(length=n)
for (i in 1:n) {
  discrete.x[i] <- which.min(thresh.x < test.x[i])
  discrete.c[i] <- which.min(thresh.c < test.c[i])
}
pc <- polychor(discrete.x, discrete.c, std.err=T, ML=T)

Error in optim(c(optimise(f, interval = c(-1, 1))$minimum, rc, cc), f, :
  non-finite finite-difference value [1]
In addition: There were 50 or more warnings (use warnings() to see the first 50)

print(pc)
Error in print(pc) : object "pc" not found

warnings()
Warning messages:
1: NaNs produced in: log(x)
2: NA/Inf replaced by maximum positive value
3: NaNs produced in: log(x)

Thanks, Janet
Re: [R] Specifying Path Model in SEM for CFA
Dear Rick, There are a couple of problems here: (1) You've fixed the error-variance parameters for each of the observed variables to 1 rather than defining each as a free parameter to estimate. For example, use

X1 <-> X1, theta1, NA

rather than

X1 <-> X1, NA, 1

The general principle is that if you give a parameter a name, it's a free parameter to be estimated; if you give the name as NA, then the parameter is given a fixed value (here, 1). (There is some more information on this and on error-variance parameters in ?sem.) (2) I believe that the model you're trying to specify -- in which all variables but X6 load on F1, and all variables but X1 load on F2 -- is underidentified. In addition, you've set the metric of the factors by fixing one loading to 0.20 and another to 0.25. That should work but strikes me as unusual, and makes me wonder whether this was what you really intended. It would be more common in a CFA to fix the variance of each factor to 1 and let the factor loadings be free parameters. Then the factor covariance would be their correlation. You should not have to specify start values for free parameters (such as g11, g22, and g12 in your model), though it is not wrong to do so. I would not, however, specify start values that imply a singular covariance matrix among the factors, as you've done; I'm surprised that the program was able to get past the start values to produce a solution. BTW, the Thurstone example in ?sem is for a confirmatory factor analysis (albeit a slightly more complicated one with a second-order factor). There's also an example of a one-factor CFA in the paper at http://socserv.socsci.mcmaster.ca/jfox/Misc/sem/SEM-paper.pdf, though this is for ordinal observed variables.
I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rick Bilonick Sent: Tuesday, August 15, 2006 11:50 PM To: R Help Subject: [R] Specifying Path Model in SEM for CFA

I'm using specify.model for the sem package. I can't figure out how to represent the residual errors for the observed variables for a CFA model. (Once I get this working I need to add some further constraints.) Here is what I've tried:

model.sa <- specify.model()
F1 ->  X1, l11, NA
F1 ->  X2, l21, NA
F1 ->  X3, l31, NA
F1 ->  X4, l41, NA
F1 ->  X5, NA,  0.20
F2 ->  X1, l12, NA
F2 ->  X2, l22, NA
F2 ->  X3, l32, NA
F2 ->  X4, l42, NA
F2 ->  X6, NA,  0.25
F1 <-> F2, g12, 1
F1 <-> F1, g11, 1
F2 <-> F2, g22, 1
X1 <-> X1, NA,  1
X2 <-> X2, NA,  1
X3 <-> X3, NA,  1
X4 <-> X4, NA,  1
X5 <-> X5, NA,  1
X6 <-> X6, NA,  1

This at least converges:

summary(fit.sem)

Model Chisquare = 2147   Df = 10   Pr(>Chisq) = 0
Chisquare (null model) = 2934   Df = 15
Goodness-of-fit index = 0.4822
Adjusted goodness-of-fit index = -0.087387
RMSEA index = 0.66107   90% CI: (NA, NA)
Bentler-Bonnett NFI = 0.26823
Tucker-Lewis NNFI = -0.098156
Bentler CFI = 0.26790
BIC = 2085.1

Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 -5.990  -0.618   0.192   0.165   1.700   3.950

Parameter Estimates
    Estimate  Std Error z value  Pr(>|z|)
l11 -0.245981 0.21863   -1.12510 0.26054748 X1 <--- F1
l21 -0.308249 0.22573   -1.36555 0.17207875 X2 <--- F1
l31  0.202590 0.07910    2.56118 0.01043175 X3 <--- F1
l41 -0.235156 0.21980   -1.06985 0.28468885 X4 <--- F1
l12  0.839985 0.21962    3.82476 0.00013090 X1 <--- F2
l22  0.828460 0.22548    3.67418 0.00023862 X2 <--- F2
l32  0.066722 0.08369    0.79725 0.42530606 X3 <--- F2
l42  0.832037 0.21840    3.80963 0.00013917 X4 <--- F2
g12  0.936719 0.64331    1.45609 0.14536647 F2 <--> F1
g11  2.567669 1.25608    2.04418 0.04093528 F1 <--> F1
g22  1.208497 0.55040    2.19567 0.02811527 F2 <--> F2

Iterations = 59

And it produces the following path diagram:

path.diagram(fit.sem)

digraph "fit.sem" {
  rankdir=LR;
  size="8,8";
  node [fontname="Helvetica" fontsize=14 shape=box];
  edge [fontname="Helvetica" fontsize=10];
  center=1;
  F2 [shape=ellipse]
  F1 [shape=ellipse]
  F1 -> X1 [label="l11"];
  F1 -> X2 [label="l21"];
  F1 -> X3 [label="l31"];
  F1 -> X4 [label="l41"];
  F1 -> X5 [label=""];
  F2 -> X1 [label="l12"];
  F2 -> X2 [label="l22"];
  F2 -> X3 [label="l32"];
  F2 -> X4 [label="l42"];
  F2 -> X6 [label=""];
}

But I don't see the residual error terms that go into each of the observed variables X1 - X6. I've tried:

model.sa <- specify.model()
E1 ->  X1, e1, 1
E2 ->  X2, e2, 1
E3 ->  X3, e3, 1
E4 ->  X4, e4, 1
E5 ->  X5, e5, 1
E6 ->  X6, e6, 1
E1 <-> E1, s1, NA
E2 <-> E2, s2, NA
E3 <-> E3, s3, NA
E4 <-> E4, s4, NA
E5 <-> E5, s5, NA
E6 <-> E6, s6, NA
F1 ->  X1, l11, NA
F1 ->  X2, l21, NA
F1 ->  X3, l31, NA
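Putting John's points together, here is a sketch of what an identified two-factor CFA specification might look like in sem's specify.model() syntax (entered at the console, terminated by a blank line): each variable loads on a single factor, factor variances are fixed to 1, and error variances are free. The particular zero-loading pattern is an illustrative assumption, not Rick's intended model.

```r
library(sem)
model.cfa <- specify.model()
F1 ->  X1, l11, NA
F1 ->  X2, l21, NA
F1 ->  X3, l31, NA
F2 ->  X4, l42, NA
F2 ->  X5, l52, NA
F2 ->  X6, l62, NA
F1 <-> F1, NA,  1     # factor variances fixed to 1 sets the metric
F2 <-> F2, NA,  1
F1 <-> F2, r12, NA    # free covariance = factor correlation
X1 <-> X1, e1,  NA    # free error variances (named parameters)
X2 <-> X2, e2,  NA
X3 <-> X3, e3,  NA
X4 <-> X4, e4,  NA
X5 <-> X5, e5,  NA
X6 <-> X6, e6,  NA
```

The double-headed `<->` arrows are variances/covariances; naming a parameter (e1, r12, ...) makes it free, while NA in the name field fixes it to the given value.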
Re: [R] Specifying Path Model in SEM for CFA
Dear Rick, It's unclear to me what you mean by constraining each column of the factor matrix to sum to one. If you intend to constrain the loadings on each factor to sum to one, sem() won't do that, since it supports only equality constraints, not general linear constraints on the parameters of the model; but why such a constraint would be reasonable in the first place escapes me. More common in confirmatory factor analysis would be to constrain more of the loadings to zero. Of course, one would do this only if it made substantive sense in the context of the research. Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: Rick Bilonick [mailto:[EMAIL PROTECTED] Sent: Wednesday, August 16, 2006 12:07 PM To: John Fox Cc: 'R Help' Subject: Re: [R] Specifying Path Model in SEM for CFA

On Wed, 2006-08-16 at 08:47 -0400, John Fox wrote: [...]
[...]

Thanks for the information. I think I understand how to handle the residual variance after reading the sem help file more carefully. Now I have to figure out how to constrain each column of the factor matrix to sum to one. Maybe this will fix the problem with being under-identified. Rick B.
Re: [R] tkinser
Dear Julie,

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of julie7.josse Sent: Thursday, August 17, 2006 4:01 AM To: r-help Subject: [R] tkinser Importance: High

Dear list, I'd like to know if it is possible to clear my text window after running it. I have added a Menu to my text window, so I can, for example, open a script; then I run it and get the result on the R console. But I'd like the code to disappear automatically after running -- something that cleans my text window.

I think that what you mean is that you'd like to delete the text in the window rather than the window itself. If the text widget is called txt, then you can do

tkdelete(txt, "0.0", "end")

I have a second problem: my text window and all my widgets are not fixed. I use combo boxes, message boxes... and they always move; they appear on the right of my screen, on the bottom, or on the task bar. I'd like my text window not to move at all and, if possible, for my widgets to appear in the same place on my screen. When they appear at the bottom on my task bar, I have to open them each time...

If the top-level Tk window is named tt, then, after creating the window, e.g.,

tkwm.geometry(tt, "-100+100")

will position it 100 pixels by 100 pixels from the top-right corner of the display. More generally, I found it useful to read Welch, Jones, and Hobbs, Practical Programming in Tcl and Tk, to learn these kinds of things. I hope this helps, John

Thanks a lot. Julie.
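A minimal self-contained sketch combining both answers (assuming the standard tcltk package; the widget names and the "Run & clear" button are illustrative, not Julie's actual GUI):

```r
library(tcltk)

tt <- tktoplevel()
tkwm.geometry(tt, "-100+100")   # pin the window 100px from the top-right corner
txt <- tktext(tt, height = 10)
tkpack(txt)

run.and.clear <- function() {
  code <- tclvalue(tkget(txt, "0.0", "end"))    # grab the widget's contents
  eval(parse(text = code), envir = .GlobalEnv)  # run them in the workspace
  tkdelete(txt, "0.0", "end")                   # then clear the text widget
}
tkpack(tkbutton(tt, text = "Run & clear", command = run.and.clear))
```

Because the geometry is set right after the window is created, it should appear in the same place each time rather than wherever the window manager chooses.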
Re: [R] Specifying Path Model in SEM for CFA
Dear Rick, -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rick Bilonick Sent: Thursday, August 17, 2006 7:07 AM To: John Fox Cc: 'R Help'; 'Rick Bilonick' Subject: Re: [R] Specifying Path Model in SEM for CFA . . . I'm trying to build a multivariate receptor model as described by Christensen and Sain (Technometrics, vol 44 (4) pp. 328-337). The model is x_t = Af_t + e_t where A is the matrix of nonnegative source compositions, x_t are the observed pollutant concentrations at time t, and f_t are the unobserved factors. The columns of A are supposed to sum to no more than 100%. They say they are using a latent variable model. If sem can't handle this, do you know of another R package that could? sem() handles only equality constraints among parameters, and this model requires linear inequality constraints. I'm aware of SEM software that handles inequality constraints, but I'm not aware of anything in R that will do it out of the box. One possibility is to write out the likelihood (or fitting function) for your model and perform a bounded optimization using optim(). It would probably be a fair amount of work setting up the problem. Finally, there are tricks that permit the imposition of general constraints and inequality constraints using software, like sem(), that handles only equality constraints. It's probably possible to do what you want using such a trick, but it would be awkward. See the references given in Bollen, Structural Equations with Latent Variables (Wiley, 1989), pp. 401-403. I'm sorry that I can't be of more direct help. John
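A rough sketch of the optim() route John describes, on a toy receptor problem. Everything here is an illustrative assumption, not code from sem or from Christensen and Sain: the fitting function is crude least squares on the sample covariance, nonnegativity of A comes from L-BFGS-B box bounds, the column-sum constraint is imposed by a penalty, and rotational identifiability is ignored entirely.

```r
set.seed(1)
p <- 4; k <- 2; nT <- 200
A.true <- matrix(c(.5, .3, .1, .1,
                   .1, .1, .4, .3), p, k)       # column sums 1.0 and 0.9
f <- matrix(rexp(nT * k), nT, k)                # nonnegative "source" factors
x <- f %*% t(A.true) + matrix(rnorm(nT * p, 0, .05), nT, p)

S <- cov(x)
fitfn <- function(a) {
  A <- matrix(a, p, k)
  # implied covariance, assuming uncorrelated unit-variance factors
  Sigma <- A %*% t(A) + diag(.05^2, p)
  # penalty pushes each column sum of A down to at most 1
  sum((S - Sigma)^2) + 100 * sum(pmax(colSums(A) - 1, 0)^2)
}
fit <- optim(rep(.25, p * k), fitfn, method = "L-BFGS-B",
             lower = 0, upper = 1)              # A >= 0 elementwise
matrix(fit$par, p, k)                           # estimated composition matrix
```

A real application would need a proper likelihood, identifying restrictions on A, and standard errors, which is the "fair amount of work" John mentions.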
Re: [R] Total (un)standardized effects in SEM?
0 0 0
             Alienation67 Alienation71
Anomia67        1.0000000     0.000000
Powerless67     1.0265597     0.000000
Anomia71        0.0000000     1.000000
Powerless71     0.0000000     0.970892
Education       0.0000000     0.000000
SEI             0.0000000     0.000000
Alienation67    0.0000000     0.000000
Alienation71    0.6173153     0.000000

(C <- A[endog, exog, drop=FALSE])   # direct effects, exogenous -> endogenous
                    SES
Anomia67      0.0000000
Powerless67   0.0000000
Anomia71      0.0000000
Powerless71   0.0000000
Education     1.0000000
SEI           0.5163168
Alienation67 -0.5498096
Alienation71 -0.2115088

I <- diag(nrow(B))
IBinv <- solve(I - B)

(Ty <- IBinv - I)   # total effects, endogenous -> endogenous
             Anomia67 Powerless67 Anomia71 Powerless71 Education SEI
Anomia67            0           0        0           0         0   0
Powerless67         0           0        0           0         0   0
Anomia71            0           0        0           0         0   0
Powerless71         0           0        0           0         0   0
Education           0           0        0           0         0   0
SEI                 0           0        0           0         0   0
Alienation67        0           0        0           0         0   0
Alienation71        0           0        0           0         0   0
             Alienation67 Alienation71
Anomia67        1.0000000     0.000000
Powerless67     1.0265597     0.000000
Anomia71        0.6173153     1.000000
Powerless71     0.5993465     0.970892
Education       0.0000000     0.000000
SEI             0.0000000     0.000000
Alienation67    0.0000000     0.000000
Alienation71    0.6173153     0.000000

(Tx <- IBinv %*% C)   # total effects, exogenous -> endogenous
                    SES
Anomia67     -0.5498096
Powerless67  -0.5644124
Anomia71     -0.5509147
Powerless71  -0.5348786
Education     1.0000000
SEI           0.5163168
Alienation67 -0.5498096
Alienation71 -0.5509147

Ty - B   # indirect effects, endogenous -> endogenous
             Anomia67 Powerless67 Anomia71 Powerless71 Education SEI
Anomia67            0           0        0           0         0   0
Powerless67         0           0        0           0         0   0
Anomia71            0           0        0           0         0   0
Powerless71         0           0        0           0         0   0
Education           0           0        0           0         0   0
SEI                 0           0        0           0         0   0
Alienation67        0           0        0           0         0   0
Alienation71        0           0        0           0         0   0
             Alienation67 Alienation71
Anomia67        0.0000000            0
Powerless67     0.0000000            0
Anomia71        0.6173153            0
Powerless71     0.5993465            0
Education       0.0000000            0
SEI             0.0000000            0
Alienation67    0.0000000            0
Alienation71    0.0000000            0

Tx - C   # indirect effects, exogenous -> endogenous
                    SES
Anomia67     -0.5498096
Powerless67  -0.5644124
Anomia71     -0.5509147
Powerless71  -0.5348786
Education     0.0000000
SEI           0.0000000
Alienation67  0.0000000
Alienation71 -0.3394059

These results agree with those in the LISREL manual (and for another example there as well), but I haven't checked the method carefully.
It would, of course, be simple to encapsulate the steps above in a function, but here's a caveat: The idea of indirect and total effects makes sense to me for a recursive model, and for the exogenous variables in a nonrecursive model, where they are the reduced-form coefficients (supposing, of course, that the model makes sense in the first place, which is often problematic), but not for the endogenous variables in a nonrecursive model. That is why I haven't put such a function in the sem package; perhaps I should reconsider. Having said that, I'm ashamed to add that I believe that I was the person who suggested the definition of total and indirect effects currently used for these models. Finally, you can get standardized effects similarly by using standardized structural coefficients. In the sem package, these are computed and printed by standardized.coefficients(). This function doesn't return the standardized A matrix in a usable form, but could be made to do so. Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox
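For reference, the steps above can be wrapped up as John suggests (a sketch only, subject to his caveat about nonrecursive models; A is the matrix of structural coefficients and endog/exog index the endogenous and exogenous variables, as in the example):

```r
# direct, total, and indirect effects from a structural-coefficient matrix A,
# following the computation in the message above
sem.effects <- function(A, endog, exog) {
  B <- A[endog, endog]                 # direct effects, endogenous -> endogenous
  C <- A[endog, exog, drop = FALSE]    # direct effects, exogenous -> endogenous
  IBinv <- solve(diag(nrow(B)) - B)
  Ty <- IBinv - diag(nrow(B))          # total effects, endogenous -> endogenous
  Tx <- IBinv %*% C                    # total effects, exogenous -> endogenous
  list(direct.endog = B,        direct.exog = C,
       total.endog  = Ty,       total.exog  = Tx,
       indirect.endog = Ty - B, indirect.exog = Tx - C)
}
```

Indirect effects fall out as total minus direct, exactly as in the `Ty - B` and `Tx - C` lines above.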
Re: [R] polychor error
Dear Janet, Performing a traceback after the error gives a hint:

tmp.pcc <- polychor(tmp.mat, ML=T, std.err=T)
traceback()
8: stop("at least one element of ", sQuote("lower"), " is larger than ",
       sQuote("upper"))
7: checkmvArgs(lower = lower, upper = upper, mean = mean, corr = corr,
       sigma = sigma)
6: pmvnorm(lower = c(row.cuts[i], col.cuts[j]),
       upper = c(row.cuts[i + 1], col.cuts[j + 1]), corr = R)
5: binBvn(rho, row.cuts, col.cuts)
4: fn(par, ...)
3: function (par) fn(par, ...)(c(0.422816748044139, -2.20343446037221,
       -2.2163627792244, -1.79075055316993, -1.11077161663679,
       -0.323037731981826, 0.664036943094355, -2.71305188847272,
       -1.58338678422633, -0.534011853182102, 0.65365619155084))
2: optim(c(optimise(f, interval = c(-1, 1))$minimum, rc, cc), f,
       control = control, hessian = std.err)
1: polychor(tmp.mat, ML = T, std.err = T)

The parameters are in the order correlation, row thresholds (of which there are 6), column thresholds (of which there are 4), and at this point in the maximization of the likelihood the first row threshold (-2.20343446037221) is *above* the second threshold (-2.2163627792244), causing pmvnorm() to complain. This can happen when the problem is ill-conditioned, since optim() doesn't constrain the order of the thresholds. So, why is the problem ill-conditioned? Here is your contingency table:

tmp.mat
   3  4  5  6  7
1  0  2  0  0  0
2  0  0  1  1  0
3  0  0  5  1  1
4  0  5 10 12  2
5  0  5 27 29 11
6  1  3 20 57 31
7  0  1  9 34 32

There are only two observations in the first row, two in the second row, and one in the first column. You're expecting a lot out of ML to get estimates of the first couple of thresholds for rows and the first for columns. One approach would be to eliminate one or more sparse rows or columns; e.g.,

polychor(tmp.mat[-1,], ML = TRUE, std.err = TRUE)

Polychoric Correlation, ML est. = 0.3932 (0.05719)
Test of bivariate normality: Chisquare = 14.21, df = 19, p = 0.7712

  Row Thresholds
  Threshold Std.Err.
1 -2.4410 0.24270 2 -1.8730 0.14280 3 -1.1440 0.09236 4 -0.3353 0.07408 50.6636 0.07857 Column Thresholds Threshold Std.Err. 1 -2.6500 0.30910 2 -1.6420 0.12100 3 -0.5494 0.07672 40.6508 0.07833 polychor(tmp.mat[,-1], ML = TRUE, std.err = TRUE) Polychoric Correlation, ML est. = 0.4364 (0.05504) Test of bivariate normality: Chisquare = 14.85, df = 17, p = 0.6062 Row Thresholds Threshold Std.Err. 1 -2.4940 0.26020 2 -2.2080 0.19160 3 -1.7850 0.13400 4 -1.1090 0.09113 5 -0.3154 0.07371 60.6625 0.07821 Column Thresholds Threshold Std.Err. 1 -1.6160 0.11970 2 -0.5341 0.07639 30.6507 0.07800 A more defensible alternative would be to collapse sparse rows or columns. BTW, you *can* get estimated standard-errors from polychor() for your original table for the two-step estimator: polychor(tmp.mat, ML = FALSE, std.err = TRUE) Polychoric Correlation, 2-step est. = 0.4228 (0.05298) Test of bivariate normality: Chisquare = 19.22, df = 23, p = 0.6883 That's because the two-step estimator estimates the thresholds from the marginal distributions of the variables rather than from their joint distribution. So, is this a bug or a feature? I suppose that it's a bug to allow the thresholds to get out of order, though to constrain the optimization to prevent that from happening is probably not worth the effort and could cause some strange results. On the other hand, the error tells you something about the data, so maybe it's a feature. I noticed that you posted another version of this question two days before this one. I apologize for the slow response -- I wasn't able to read my email for a few days and it's taken me most of today to catch up. Regards, John original message --- Janet Rosenbaum jrosenba at rand.org Thu Aug 24 00:41:16 CEST 2006 Hi. Does anyone know whether the following error is a result of a bug or a feature? I can eliminate the error by making ML=F, but I would like to see the values of the cut-points and their variance. Is there anything that I can do? 
tmp.vec-c(0, 0, 0 , 0 ,0 , 1, 0, 2, 0 , 0, 5 ,5 ,3 ,1, 0 , 1, 5, 10, 27, 20, 9, 0, 1, 1, 12, 29, 57, 34, 0, 0, 1, 2, 11, 31, 32) tmp.mat-matrix(tmp.vec, nrow=7) rownames(tmp.mat)-1:7 colnames(tmp.mat)-3:7 tmp.pcc-polychor(tmp.mat, ML=T, std.err=T) Error in checkmvArgs(lower = lower, upper = upper, mean = mean, corr = corr, : at least one element of 'lower' is larger than 'upper' Thanks, Janet John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
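John's more defensible alternative -- collapsing sparse categories rather than dropping them -- can be sketched as follows. This is an illustrative example, not code from the thread: the choice of which categories to merge, the object names, and the use of the polycor package (where polychor() lives) are assumptions.

```r
library(polycor)  # assumed home of polychor()

# Rebuild Janet's 7 x 5 table
tmp.vec <- c(0, 0, 0, 0, 0, 1, 0,
             2, 0, 0, 5, 5, 3, 1,
             0, 1, 5, 10, 27, 20, 9,
             0, 1, 1, 12, 29, 57, 34,
             0, 0, 1, 2, 11, 31, 32)
tmp.mat <- matrix(tmp.vec, nrow=7,
                  dimnames=list(1:7, 3:7))

# Merge the two sparse low rows (1 and 2) into one category, and the
# sparse first column ("3") into its neighbour; the thresholds now
# refer to the coarser categories, but no observations are discarded.
collapsed <- rbind("1-2" = colSums(tmp.mat[1:2, ]), tmp.mat[-(1:2), ])
collapsed <- cbind("3-4" = rowSums(collapsed[, 1:2]), collapsed[, -(1:2)])

polychor(collapsed, ML=TRUE, std.err=TRUE)
```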
Re: [R] Type II and III sum of square in Anova (R, car package)
Dear Amasco, A complete explanation of the issues that you raise is awkward in an email, so I'll address your questions briefly. Section 8.2 of my text, Applied Regression Analysis, Linear Models, and Related Methods (Sage, 1997), has a detailed discussion. (1) In balanced designs, so-called Type I, II, and III sums of squares are identical. If the STATA manual says that Type II tests are only appropriate in balanced designs, then that doesn't make a whole lot of sense (unless one believes that Type-II tests are nonsense, which is not the case). (2) One should concentrate not directly on different types of sums of squares, but on the hypotheses to be tested. Sums of squares and F-tests should follow from the hypotheses. Type-II and Type-III tests (if the latter are properly formulated) test hypotheses that are reasonably construed as tests of main effects and interactions in unbalanced designs. In unbalanced designs, Type-I sums of squares usually test hypotheses of interest only by accident. (3) Type-II sums of squares are constructed obeying the principle of marginality, so the kinds of contrasts employed to represent factors are irrelevant to the sums of squares produced. You get the same answer for any full set of contrasts for each factor. In general, the hypotheses tested assume that terms to which a particular term is marginal are zero. So, for example, in a three-way ANOVA with factors A, B, and C, the Type-II test for the AB interaction assumes that the ABC interaction is absent, and the test for the A main effect assumes that the ABC, AB, and AC interactions are absent (but not necessarily the BC interaction, since the A main effect is not marginal to this term). A general justification is that we're usually not interested, e.g., in a main effect that's marginal to a nonzero interaction. (4) Type-III tests do not assume that terms higher-order to the term in question are zero.
For example, in a two-way design with factors A and B, the type-III test for the A main effect tests whether the population marginal means at the levels of A (i.e., averaged across the levels of B) are the same. One can test this hypothesis whether or not A and B interact, since the marginal means can be formed whether or not the profiles of means for A within levels of B are parallel. Whether the hypothesis is of interest in the presence of interaction is another matter, however. To compute Type-III tests using incremental F-tests, one needs contrasts that are orthogonal in the row-basis of the model matrix. In R, this means, e.g., using contr.sum, contr.helmert, or contr.poly (all of which will give you the same SS), but not contr.treatment. Failing to be careful here will result in testing hypotheses that are not reasonably construed, e.g., as hypotheses concerning main effects. (5) The same considerations apply to linear models that include quantitative predictors -- e.g., ANCOVA. Most software will not automatically produce sensible Type-III tests, however.

I hope this helps, John

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Amasco Miralisus Sent: Saturday, August 26, 2006 5:07 PM To: r-help@stat.math.ethz.ch Subject: [R] Type II and III sum of square in Anova (R, car package)

Hello everybody, I have some questions on ANOVA in general and on ANOVA in R particularly. I am not a statistician, so I would appreciate it if you could answer in a simple way. 1. First of all, a more general question. The standard anova() function for lm() or aov() models in R implements Type I (sequential) sums of squares, which are not well suited for unbalanced ANOVA. Therefore it is better to use the Anova() function from the car package, which was programmed by John Fox to use Type II and Type III sums of squares.
Did I get the point? 2. Now a more specific question. Type II sums of squares are not well suited for unbalanced ANOVA designs either (as stated in the STATISTICA help), so is the general rule of thumb to use the Anova() function with Type II SS only for balanced ANOVA and with Type III SS for unbalanced ANOVA? Is this a correct interpretation? 3. I have found a post from John Fox in which he wrote that Type III SS could be misleading if someone uses certain contrasts. What is this about? Could you please advise when it is appropriate to use Type II and when Type III SS? I do not use contrasts for comparisons, just general ANOVA with subsequent Tukey post-hoc comparisons. Thank you in advance, Amasco
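The workflow John describes can be sketched as follows. This is a hedged illustration, not code from the thread: the data frame `mydata`, the factors `A` and `B`, and the response name are assumptions; only the car package's Anova() and its `type` argument are taken from the discussion.

```r
library(car)  # provides Anova()

# Type-II tests: by the marginality principle, the contrast coding is
# irrelevant, so the default contr.treatment is fine.
mod2 <- lm(response ~ A * B, data=mydata)
Anova(mod2, type="II")

# Type-III tests: the contrasts must be orthogonal in the row basis of
# the model matrix -- e.g., contr.sum (or contr.helmert, contr.poly).
# With contr.treatment the "main effect" hypotheses would be wrong.
mod3 <- lm(response ~ A * B, data=mydata,
           contrasts=list(A=contr.sum, B=contr.sum))
Anova(mod3, type="III")
```

With balanced data the two calls give identical sums of squares; they differ only when the design is unbalanced, which is exactly the case the thread is about.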
Re: [R] Total (un)standardized effects in SEM?
Dear Rense, I already wrote a function, appended below, to compute effects for SEMs, as a method for the effects() generic. I've been thinking about adding this to the sem package, possibly with asymptotic standard errors computed by the delta method, as suggested by Michael Sobel. If I do so, I can also incorporate standardized effects. As I mentioned, I have serious doubts about the utility of these ideas, and worry about encouraging them, but I guess that I could say the same thing for SEMs in general. My original purpose in writing the sem package was to have something in R that I use in teaching.

Regards, John

--- snip ---

effects.sem <- function(object, ...){
    A <- object$A
    exog <- apply(A, 1, function(x) all(x == 0))
    endog <- !exog
    B <- A[endog, endog, drop=FALSE]
    C <- A[endog, exog, drop=FALSE]
    I <- diag(nrow(B))
    IBinv <- solve(I - B)
    TotEndog <- IBinv - I
    TotExog <- IBinv %*% C
    result <- list(B=B, C=C, TotEndog=TotEndog, TotExog=TotExog)
    class(result) <- "semEffect"
    result
}

print.semEffect <- function(x, digits=5, ...){
    B <- x$B
    C <- x$C
    cat("\nDirect Effects of Exogenous on Endogenous Variables:\n")
    print(C, digits=digits)
    cat("\n\nDirect Effects of Endogenous Variables on Each Other:\n")
    print(B, digits=digits)
    cat("\n\nIndirect Effects of Exogenous on Endogenous Variables:\n")
    print(x$TotExog - C, digits=digits)
    cat("\n\nIndirect Effects of Endogenous Variables on Each Other:\n")
    print(x$TotEndog - B, digits=digits)
    cat("\n\nTotal Effects of Exogenous on Endogenous Variables:\n")
    print(x$TotExog, digits=digits)
    cat("\n\nTotal Effects of Endogenous Variables on Each Other:\n")
    print(x$TotEndog, digits=digits)
    invisible(x)
}

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rense Nieuwenhuis Sent: Sunday, August 27, 2006 2:13 PM To: r-help@stat.math.ethz.ch Cc: John Fox Subject: Re: [R] Total (un)standardized effects in
SEM? Dear John, thank you very much for your reply. The suggestions you make for calculating the direct and indirect effects are exactly what I was looking for. Although I'm very new to SEM and not at all very experienced in R, I tried to put them together in a function (called decomp) and expanded it to be able to calculate standardized effects as well. For that, I made a few changes to your standardized.coefficients() and added the little function std.matrix(), which converts the output of standardized.coefficients. It's not a very coherent set of functions, but it works (for now). I have performed a preliminary test using LISREL. The results were identical. Again, I would like to thank you for the reply and the work on the sem package. With highest regards, Rense Nieuwenhuis

(Note: The syntax below consists of three functions. The first one decomposes SEM objects. It needs the other two functions to work. I'm sure this can be programmed a lot cleaner, but for now, it serves my needs.)

decomp <- function(x, std=FALSE) {
    if (std == FALSE) {
        A <- x$A                      # unstandardized structural coefficients
    }
    if (std == TRUE) {
        A <- std.matrix(std.coef(x))  # standardized structural coefficients
    }
    exog <- apply(A, 1, function(x) all(x == 0))
    endog <- !exog
    B <- A[endog, endog, drop=FALSE]  # direct effects, endogenous -> endogenous
    C <- A[endog, exog, drop=FALSE]   # direct effects, exogenous -> endogenous
    I <- diag(nrow(B))
    IBinv <- solve(I - B)
    total.endo.endo <- IBinv - I          # total effects, endogenous -> endogenous
    total.exo.endo <- IBinv %*% C         # total effects, exogenous -> endogenous
    ind.endo.endo <- total.endo.endo - B  # indirect effects, endogenous -> endogenous
    ind.exo.endo <- total.exo.endo - C    # indirect effects, exogenous -> endogenous
    temp <- list(
        total.endo.endo = total.endo.endo,
        total.exo.endo = total.exo.endo,
        ind.endo.endo = ind.endo.endo,
        ind.exo.endo = ind.exo.endo)
    return(temp)
}

std.matrix <- function(x) { i <- length(x$D) ii <- length(x
Re: [R] Type II and III sum of square in Anova (R, car package)
Dear Amasco, Again, I'll answer briefly (since the written source that I previously mentioned has an extensive discussion):

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Amasco Miralisus Sent: Monday, August 28, 2006 2:21 PM To: r-help@stat.math.ethz.ch Cc: John Fox; Prof Brian Ripley; Mark Lyman Subject: Re: [R] Type II and III sum of square in Anova (R, car package)

Hello, First of all, I would like to thank everybody who answered my question. Every post has added something to my knowledge of the topic. I now know why Type III SS are so questionable. As I understood from the R FAQ, there is disagreement among statisticians about which SS to use (http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-does-the-output-from-anova_0028_0029-depend-on-the-order-of-factors-in-the-model_003f). However, most commercial statistical packages use Type III as the default (with orthogonal contrasts), just like STATISTICA, from which I am currently trying to migrate to R. This was probably done for the convenience of end-users who are not very experienced in theoretical statistics.

Note that the contrasts are only orthogonal in the row basis of the model matrix, not, with unbalanced data, in the model matrix itself.

I am aware that the same result could be produced using the standard anova() function with Type I sequential SS, supplemented by the drop1() function, but this approach will look quite complicated to people without a substantial background in statistics, like non-math students. I would prefer an easier way, possibly more universal, though also probably more for dummies :) If I am not mistaken, the car package by John Fox, with its nice Anova() function, is a reasonable alternative for anyone who wishes to perform a quick statistical analysis without fear of messing up the model fitting. Of course, orthogonal contrasts have to be specified (for example contr.sum) in the case of Type III SS.
Therefore, I would like to reformulate my questions, to make it easier for you to answer: 1. The first question relates to the answer by Professor Brian Ripley: Did I understand correctly from the suggested paper (Bill Venables' 'exegeses' paper) that there is not much sense in testing main effects if the interaction is significant?

Many are of this opinion. I would put it a bit differently: Properly formulated, tests of main effects in the presence of interactions make sense (i.e., have a straightforward interpretation in terms of population marginal means) but probably are not of interest.

2. If I understood the post by John Fox correctly, I could safely use Anova(., type="III") from car for ANOVA analyses in R, both for balanced and unbalanced designs? Of course, providing the model was fitted with orthogonal contrasts. Something like below:

mod <- aov(response ~ factor1 * factor2, data=mydata,
           contrasts=list(factor1=contr.sum, factor2=contr.sum))
Anova(mod, type="III")

Yes (or you could reset the contrasts option), but why do you appear to prefer the type-III tests to the type-II tests?

It was also said in most of your posts that the decision of which type of SS to use has to be made on the basis of the hypothesis we want to test. Therefore, let's assume that I would like to test the significance of both factors, and if some of them are significant, I plan to use post-hoc tests to explore the difference(s) between levels of the significant factor(s).

Your statement is too vague to imply what kind of tests you should use. I think that people are almost always interested in main effects when interactions to which they are marginal are negligible. In this situation, both type-II and type-III tests are appropriate, and type-II tests would usually be more powerful.
Regards, John

Thank you in advance, Amasco
[R] [R-pkgs] Version 1.2-0 of the Rcmdr package
I've submitted a new, and substantially enhanced, version (1.2-0) of the Rcmdr package to CRAN. Some highlights (from the CHANGES file):

o Added ability to import from Excel, Access or dBase files (contributed by Samir Messad, Renaud Lancelot and Matthieu Lesnoff).

o Added ability to read data from the clipboard (suggested by Graham Smith).

o Added Data -> Active data set -> Stack variables in active data set (suggested by Richard Heiberger).

o Added many probability distributions, courtesy of Miroslav Ristic and G. Jay Kerns, Andy Chang, and Theophilius Boye.

o Added dialogs to sample from probability distributions; divided distributions menus into Continuous and Discrete.

o Added totPercents function. Added total percentages and components of chi-square to two-way tables (suggested by Richard Heiberger).

o Added Statistics -> Summaries -> Count missing observations.

o Added Statistics -> Summaries -> Correlation test, contributed by Stefano Calza.

o The numerical summaries, table of statistics, frequency distributions, numeric-to-factor, and recode dialogs will now process multiple variables in parallel (suggested by Bo Sommerlund).

o Added function numSummaries for neater output from Statistics -> Summaries -> Numerical summaries (motivated by comments from Bo Sommerlund).

o XY conditioning plots dialog added, courtesy of Richard Heiberger.

o Enhancements to 3D scatterplots.

o Small interface improvements (e.g., pressing a letter key in a list box moves to the next entry starting with that letter).

o New Rcmdr options (etc and etcMenus) determine where the Commander will look for its configuration and menu files (motivated by a suggestion by Richard Heiberger).

Rcmdr version 1.2-0 includes revised Catalan and Japanese translations (courtesy of Manel Salamero and Takaharu Araki); as other revised translations become available, I'll submit updated versions of the package to CRAN. Version 1.2-1 will also have a revised introductory manual.
As usual, bug reports, comments, and suggestions are welcome. John

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

___ R-packages mailing list R-packages@stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-packages
Re: [R] Question on Chi-square of null model in sem package
Dear Wei-Wei, As I explained to you in private email yesterday (perhaps you didn't receive my reply?), the problem that you point out is due to a bug in the sem function that I fixed some time ago and then inadvertently reintroduced. Yesterday, I sent a corrected version of the sem package (0.9-5) to CRAN; the source package is there now and I'm sure that the compiled Windows package will appear in due course. Thank you once more for bringing the problem to my attention. John

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Guo Wei-Wei Sent: Monday, September 04, 2006 12:34 AM To: r-help@stat.math.ethz.ch Subject: [R] Question on Chi-square of null model in sem package

Dear all, I met a problem while doing SEM with the sem package: I got a negative chi-square for the null model. Because the theoretical value of a chi-square cannot be negative, I checked the source code of sem.R in the sem package, and I found the chi-square of the null model was computed by the following expression:

result$chisqNull <- (N - 1) * (sum(diag(S %*% diag(1/diag(S)))) + log(prod(diag(S))))

I think the reason for the negative chi-square is the too-small value of prod(diag(S)) for my data. I'm working on a data frame named emc.data from a sample of a 16-item questionnaire. The variances of the items are:

> diag(cov(emc.data))
     EMC1      EMC2      EMC3      EMC4      EMC5      EMC6      EMC7      EMC8
0.3640000 0.2350041 0.2488009 0.2901653 0.3195399 0.3107343 0.3436622 0.2345912
     EMC9     EMC10     EMC11     EMC12     EMC13     EMC14     EMC15     EMC16
0.2621680 0.3230400 0.4039245 0.3803105 0.2773370 0.4348342 0.2757216 0.3405252

The fit indices RMSEA and GFI are good, so I think the problem might be solved by computing the chi-square of the null model in another way. I'm not well trained in maths, so I come for help. Any advice is appreciated.
Best wishes, Wei-Wei
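The algebra behind the null-model chi-square discussed above can be sketched as follows. This is illustrative likelihood arithmetic under the standard independence null model Sigma0 = diag(diag(S)), not the sem package's actual code; the function name is hypothetical.

```r
# ML fit function for the independence model:
#   F0 = log det(Sigma0) + tr(S %*% solve(Sigma0)) - log det(S) - n
# Since tr(S %*% diag(1/diag(S))) = n, the trace term cancels the -n:
chisq.null <- function(S, N) {
  # sum(log(diag(S))) is a numerically safer form of log(prod(diag(S)))
  # when there are many small variances, as in the 16-item data above
  (N - 1) * (sum(log(diag(S))) - log(det(S)))
}
```

By Hadamard's inequality, det(S) <= prod(diag(S)) for a covariance matrix, so this quantity is never negative; a negative value, as Wei-Wei observed, signals missing terms in an implementation rather than a problem with the data.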
Re: [R] problems with R and tckl/tk on Mac OS X
Dear Alex, It's possible, though not clear from the original posting, that Ingo is trying to install the Rcmdr package. If that's the case, then he might find the installation instructions for Mac users at http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html helpful.

Regards, John

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alex Brown Sent: Friday, September 29, 2006 8:48 AM Cc: r-help@stat.math.ethz.ch Subject: Re: [R] problems with R and tckl/tk on Mac OS X

I assume you are running MacOS X 10.4? The Mac version of R does not require X11 or Tcl/Tk, since it uses a separate GUI developed exclusively for the Mac using native widgets and a Quartz-based graphing mechanism. Try downloading the following binary installer, which includes the GUI: http://cran.r-project.org/bin/macosx/R-2.3.1.dmg -Alex Brown

On 29 Sep 2006, at 08:04, Ingo wrote: Dear R-help team, I am trying to run R on my Intel-based Mac. I have installed R, X11 and Tcl/Tk (I thought), but the GUI doesn't run. My system administrator doesn't support R and therefore, I'm a little helpless. What can I do? Regards, Ingo
Re: [R] scatter3d() model.summary coefficients?
Dear Anja, As you suggest, models in scatter3d() are fit via lm() and also mgcv(). scatter3d() rescales the three variables to fit in the unit cube; I believe that the new version of rgl makes the rescaling unnecessary, so eventually I'll probably rework scatter3d() to avoid it. It would be better if ?scatter3d mentioned this; I've made that change in the development version of the package. BTW, a nice thing about R is that the source code is there, so you can look to see what a function does. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of angela baldo Sent: Friday, September 29, 2006 5:23 PM To: r-help@stat.math.ethz.ch Subject: [R] scatter3d() model.summary coefficients? Hello All, I am a R newbie and am probably misinterpreting something really obvious... In the Rcmdr package there is a scatter3d() function that can fit a curve and also provide coefficients for the model. If I'm understanding this right, I think it's calling the lower level stats package function lm(), which is the part that actually does the curve fitting. Anyway, what has me perplexed is that the model summary from scatter3d() has different coefficients than the one generated by lm(). However, the actual surface plotted by scatter3d() looks like the function generated by lm(). In the scatter3d() docs I didn't see anything about transforming the coefficients or changing them somehow - perhaps I have not been looking in the right place? I'm using a Linux box: 2.6.17-1.2187_FC5smp, R version 2.3.1, Rcmdr version 1.2-0, in case that helps. Thanks very much for any enlightenment! anja Here's an example of the output on the same data by both functions. 
If anyone wants the dataset, let me know:

> scatter3d(samples$x1, samples$y, samples$x2, fit="linear",
    residuals=TRUE, bg="white", axis.scales=TRUE, grid=TRUE,
    ellipsoid=FALSE, xlab="x1", ylab="y", zlab="x2", model.summary=TRUE)
$linear

Call:
lm(formula = y ~ x + z)

Residuals:
      Min        1Q    Median        3Q       Max
-0.096984 -0.022303  0.004758  0.029354  0.091188

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.708945   0.007005  101.20   <2e-16 ***
x            0.278540   0.011262   24.73   <2e-16 ***
z           -0.688175   0.011605  -59.30   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.03936 on 105 degrees of freedom
Multiple R-Squared: 0.972, Adjusted R-squared: 0.9715
F-statistic: 1822 on 2 and 105 DF, p-value: < 2.2e-16

> summary(lm(formula=samples$y ~ samples$x1 + samples$x2))

Call:
lm(formula = samples$y ~ samples$x1 + samples$x2)

Residuals:
    Min      1Q  Median      3Q     Max
-7865.0 -1808.6   385.8  2380.5  7394.9

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) 92204.502   1323.217   69.68   <2e-16 ***
samples$x1    225.882      9.133   24.73   <2e-16 ***
samples$x2   -558.076      9.411  -59.30   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3192 on 105 degrees of freedom
Multiple R-Squared: 0.972, Adjusted R-squared: 0.9715
F-statistic: 1822 on 2 and 105 DF, p-value: < 2.2e-16

-- Angela M. Baldo, Computational Biologist, USDA, ARS Plant Genetic Resources Unit / Grape Genetics Research Unit, New York State Agricultural Experiment Station, 630 W. North Street, Geneva, NY 14456-0462 USA, voice 315 787-2413 or 607 254-9413, fax 315 787-2339 or 607 254-9339, [EMAIL PROTECTED], http://www.ars.usda.gov/NAA/Geneva
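John's explanation of the differing coefficients can be checked directly: refitting by lm() after rescaling each variable to [0, 1] (the unit cube that scatter3d() uses internally) should reproduce the coefficients in scatter3d()'s model.summary. A hedged sketch, assuming Angela's `samples` data frame with columns `y`, `x1`, `x2`; the rescaling function is illustrative:

```r
# Map a vector onto [0, 1], as scatter3d() does for each axis
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))

raw  <- lm(y ~ x1 + x2, data=samples)                   # Angela's second output
cube <- lm(rescale01(y) ~ rescale01(x1) + rescale01(x2),
           data=samples)                                # should match scatter3d()

summary(raw)
summary(cube)
```

The intercept and slopes change under the affine rescaling, but the t-values, R-squared, and overall F are invariant, which is exactly the pattern in the two outputs Angela posted (t = 24.73 and -59.30 in both).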
Re: [R] RODBC ERROR on Rcmdr install
Dear Sarosh, As I understand it, RODBC isn't useful on non-Windows systems, since the necessary ODBC drivers aren't available. (Someone will correct me, I'm sure, if I don't have that entirely straight.) The RODBC package is used in the Rcmdr to read Excel and some other files under Windows; in the latest version of the Rcmdr, you won't even see this menu item in non-Windows systems. You should, I believe, be able to install and use the Rcmdr package on your system without RODBC.

I hope this helps, John

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sarosh Jamal Sent: Friday, September 29, 2006 11:53 AM To: R-help@stat.math.ethz.ch Subject: [R] RODBC ERROR on Rcmdr install

I've been trying to install the RODBC dependency for Rcmdr on Rv2.4 on a SunOS9 system. It claims not to be able to create gcc output files (executables) for the installation. This is puzzling since I've been able to install other packages with the same PATH variables and all. Feedback would be much appreciated. Thank you! Sarosh Jamal, Geo Computing IT Specialist, Department of Geography, University of Toronto at Mississauga, e: [EMAIL PROTECTED] p: (905) 569-4497
Re: [R] RODBC ERROR on Rcmdr install
Dear Vittorio, Thanks for the clarification. This raises two questions: (1) If a *nix system has an ODBC driver, can one then read Excel, Access, and dBase data sets via RODBC (which is what the Rcmdr menu item in question provides for)? (2) If so, is there a way for me to detect whether unixODBC is present, or would I have to rely, e.g., on a user-set option? I'm moving this message to the r-devel list.

Regards, John

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of vittorio Sent: Saturday, September 30, 2006 12:40 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] RODBC ERROR on Rcmdr install

At 22:55 on Friday, 29 September 2006, John Fox wrote: As I understand it, RODBC isn't useful on non-Windows systems, since the necessary ODBC drivers aren't available. (Someone will correct me, I'm sure, if I don't have that entirely straight.) The RODBC package is used in the Rcmdr to read Excel and some other files under Windows; in the latest version of the Rcmdr, you won't even see this menu item in non-Windows systems.

As a matter of fact, RODBC can be profitably used under *nix OSes together with unixODBC to connect to many DBs. I've been using RODBC with unixODBC on Linux, FreeBSD and Win XP to connect smoothly to PostgreSQL, MySQL and Oracle (somewhat tricky for me under *nix: you need ** to buy ** a driver), and I know that connections are possible to many other DBs under *nix itself. Ciao Vittorio
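On John's second question, one pragmatic possibility is to probe at run time for the unixODBC command-line tools and for a loadable RODBC. This is a hypothetical helper, not code from the Rcmdr or RODBC packages, and the reliance on the `odbcinst` utility is an assumption about how unixODBC is installed:

```r
# Hypothetical detection helper -- not from Rcmdr itself.
odbcAvailable <- function() {
  if (.Platform$OS.type == "windows") return(TRUE)
  # unixODBC ships the odbcinst utility; Sys.which() returns "" if absent
  nzchar(Sys.which("odbcinst")) &&
    suppressWarnings(require("RODBC", quietly=TRUE))
}
```

A menu item could then be shown or hidden based on the returned value, avoiding a user-set option.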
Re: [R] Linear model with hidden variables
Dear Richard, This looks like a latent-variable structural-equation model of the kind that one can fit with the sem package. If, however, both of the Y's affect all of the Z's, then the model isn't identified. (You also don't say what you assume about the errors from different equations -- are they correlated?)

I hope this helps, John

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Richard A. O'Keefe Sent: Wednesday, October 04, 2006 12:37 AM To: r-help@stat.math.ethz.ch Subject: [R] Linear model with hidden variables

I have some data on a moving vehicle where, amongst other things, it looks as though it would be informative to fit a model with the following structure:

Z = B.Y + error_Z
Y = C.X + error_Y

The X variables are observed predictor variables; 6 of the variables look promising (on the basis of what they mean). The Z variables are observed response variables; there are 4 of them. There is a priori reason to believe, and scatterplots to suggest, that the Z variables are really essentially two-dimensional, so the Y variables are hidden intermediate variables. There should be 2 of them. There are actually two physical candidates for what they might be, but they happen not to have been measured. There are 800 cases. (More precisely, there are 14 periods, each with about 800 samples, and I am interested in fitting a separate model in each period.) A simple least squares fit for this model would minimise the sum of error squares for the Zs. Oh, I do mean there to be constant terms. I suppose I could go back to first principles and work it all out, but has anyone ever done something like this in R? There seems to be every imaginable variation on lm and some that I find unimaginable, so presumably a means to do this is already around somewhere.
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] glm and plot.effects
Dear Michael, Look more closely at the two plots: With rescale.axis=TRUE (the default), the vertical axis is on the scale of the linear predictor (i.e., the logit scale), but the tick marks are labelled on the scale of the response (i.e., the probability scale); this preserves the linear structure of the model. With rescale.axis=FALSE, the vertical axis is on the scale of the response (directly on the probability scale). If you want the vertical axis labelled on the scale of the linear predictor, specify type="link" (the default is "response"). I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael Kubovy Sent: Friday, October 06, 2006 7:19 AM To: R list Subject: [R] glm and plot.effects

Dear R-helpers, I don't see a difference between the following two plots of effect objects, which I understand should be different. What am I missing?
require(doBy)
require(effects)
data(budworm)
m1 <- glm(ndead/20 ~ sex + log(dose), data=budworm, weight=ntotal, family=binomial)
m1.eff <- all.effects(m1)
plot(m1.eff, rescale.axis = FALSE, selection = 2, main = 'rescale = F')
plot(m1.eff, rescale.axis = TRUE, selection = 2, main = 'rescale = T')

R version 2.4.0 (2006-10-03), powerpc-apple-darwin8.7.0; locale: C

attached base packages: [1] grid tools splines methods utils [6] stats graphics grDevices datasets base

other attached packages: effects 1.0-9, doBy 0.8, Hmisc 3.1-1, chron 2.3-8, weaver 1.0.0, codetools 0.0-3, digest 0.2.2, proptest 0.1-1, survival 2.29, lme4 0.9975-0, Matrix 0.9975-1, JGR 1.4-11, iplots 1.0-4, JavaGD 0.3-5, rJava 0.4-10, MASS 7.2-29, lattice 0.14-9

_ Professor Michael Kubovy University of Virginia Department of Psychology USPS: P.O. Box 400400, Charlottesville, VA 22904-4400 Parcels: Room 102, Gilmer Hall, McCormick Road, Charlottesville, VA 22903 Office: B011 +1-434-982-4729 Lab: B019 +1-434-982-4751 Fax: +1-434-982-4766 WWW: http://www.people.virginia.edu/~mk9y/

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
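[Editor's note: the link- vs. response-scale distinction John describes can be seen numerically without plotting at all, since the response scale is just the inverse logit of the linear predictor. A small sketch on simulated grouped binomial data (the data frame and variable names here are invented, loosely patterned on the budworm example):]

```r
## Simulated dose-response data, 20 trials per row.
set.seed(2)
d <- data.frame(dose = rep(2^(0:5), 2),
                sex  = rep(c("M", "F"), each = 6))
d$ndead <- rbinom(12, 20, plogis(-3 + 1.2 * log(d$dose)))

m <- glm(cbind(ndead, 20 - ndead) ~ sex + log(dose),
         family = binomial, data = d)

eta <- predict(m, type = "link")      # linear predictor (logit scale)
p   <- predict(m, type = "response")  # probability scale
## The two scales are related by the inverse link: p = plogis(eta).
all.equal(as.vector(plogis(eta)), as.vector(p))
```

So rescale.axis=TRUE draws on the eta scale with tick marks labelled in p, while rescale.axis=FALSE draws directly on the p scale.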
Re: [R] Differences in estimates calculated in R and SPSS
Dear Katja, If the fitted values are the same and the coefficients differ, then the parametrizations of the models differ. I suspect that SPSS is using sum-to-zero contrasts (contr.sum in R, coded -1, 0, 1), while R uses dummy-coded contrasts (contr.treatment, coded 0, 1) by default. If this is the case, then you should get the same coefficients in R by setting options(contrasts=c("contr.sum", "contr.poly")). There are other ways of doing this as well. See ?options, ?contr.treatment, etc., and section 11.1.1 of the introductory manual that comes with R (and perhaps read something more detailed about how R handles factors in linear models). I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Katja Walentowitz Sent: Saturday, October 14, 2006 11:08 AM To: Dirk Eddelbuettel Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Differences in estimates calculated in R and SPSS

I have compared the models; they are exactly the same. The same for the data. The problem occurs most often when I use the aov-function. When I calculate the fitted values, they are the same, but the intercept is somewhat different. In R the intercept is more or less the mean of the regressand values, as defined in some statistical books -- but still: it is not the exact mean. In SPSS it is not. Thanks again for helping me. Katja

On 10/14/06, Dirk Eddelbuettel [EMAIL PROTECTED] wrote: On 14 October 2006 at 16:59, Katja Walentowitz wrote: | I am a bit confused because I get different estimates for the | regression coefficients from R and SPSS. | Does anyone have an explanation for this? As I am not so | experienced in data analysis, I don't know why this happens. Maybe you are not estimating the same model -- one may have an intercept, and the other doesn't.
Or maybe something happens while you read the data -- start by comparing summary statistics, tabulations and plots of your data. But as you didn't tell us what you estimated, and how, it is a tad hard for us to guess. There is a Posting Guide recommending how to phrase questions and what to supply ... Dirk -- Hell, there are no rules here - we're trying to accomplish something. -- Thomas A. Edison [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
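[Editor's note: the contrast behaviour John describes is easy to demonstrate. The sketch below fits the same one-way model under R's default treatment contrasts and under sum-to-zero contrasts (the parametrization SPSS typically reports): the coefficients differ but the fitted values agree, and under contr.sum the intercept is the mean of the group means, which matches Katja's "more or less the mean" observation. The data are simulated.]

```r
## Same model, two parametrizations.
set.seed(3)
d <- data.frame(g = factor(rep(c("a", "b", "c"), each = 10)),
                y = rnorm(30, mean = rep(c(1, 2, 3), each = 10)))

m.treat <- lm(y ~ g, data = d)   # default: contr.treatment (dummy coding)
old <- options(contrasts = c("contr.sum", "contr.poly"))
m.sum <- lm(y ~ g, data = d)     # sum-to-zero coding, as in SPSS
options(old)                     # restore the previous setting

## Under contr.sum the intercept is the mean of the group means
## (equal to the grand mean here only because the design is balanced):
all.equal(unname(coef(m.sum)[1]), mean(tapply(d$y, d$g, mean)))
## The fitted values are identical under either parametrization:
all.equal(fitted(m.treat), fitted(m.sum))
```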
Re: [R] regression analyses using a vector of means and a variance-covariance matrix
Dear John, I'm not aware of an existing function that will do what you want conveniently, but it's not hard to write one:

reg <- function(V, means, y, x, n){
    # V: covariance matrix of variables
    # means: vector of means
    # y: index of response in V
    # x: indices of regressors in V
    SSP <- (n - 1)*V + n*outer(means, means)
    SSP <- cbind(n*means, SSP)
    SSP <- rbind(c(n, n*means), SSP)
    XtXinv <- solve(SSP[c(1, x+1), c(1, x+1)])
    b <- as.vector(XtXinv %*% SSP[c(1, x+1), y+1])
    R2 <- as.vector(V[y, x] %*% solve(V[x, x]) %*% V[x, y])/V[y, y]
    s2 <- (1 - R2)*V[y, y]*(n - 1)/(n - length(x) - 1)
    Vb <- s2*XtXinv
    list(coef=b, vcov=Vb, R2=R2, s2=s2, SSP=SSP)
    }

For example,

library(car)
data <- Duncan[, c("prestige", "income", "education")]
reg(var(data), colMeans(data), 1, 2:3, nrow(data))

$coef
[1] -6.0646629  0.5987328  0.5458339

$vcov
                          income    education
           18.2494814 -0.15184501 -0.150706025
income     -0.1518450  0.01432027 -0.008518551
education  -0.1507060 -0.00851855  0.009653582

$R2
[1] 0.8281734

$s2
[1] 178.7309

$SSP
               prestige income education
             45    2146   1884      2365
prestige   2146  146028 118229    147936
income     1884  118229 105148    122197
education  2365  147936 122197    163265

summary(lm(prestige ~ income + education, data=Duncan))  # check

Call:
lm(formula = prestige ~ income + education, data = Duncan)

Residuals:
     Min       1Q   Median       3Q      Max
-29.5380  -6.4174   0.6546   6.6051  34.6412

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.06466    4.27194  -1.420    0.163
income       0.59873    0.11967   5.003 1.05e-05 ***
education    0.54583    0.09825   5.555 1.73e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 13.37 on 42 degrees of freedom
Multiple R-Squared: 0.8282, Adjusted R-squared: 0.82
F-statistic: 101.2 on 2 and 42 DF, p-value: < 2.2e-16

Some caveats: This function will work OK for well-conditioned data, since I'm inverting the matrix of sums of squares and products; one could take a more sophisticated approach.
As well, there's a bit of redundancy in the computation of R2, but I didn't have time to figure out how to avoid it. Finally, there's an obvious advantage to computing on the original data -- e.g., one can get residuals. I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of John Sorkin Sent: Saturday, October 14, 2006 3:27 PM To: r-help@stat.math.ethz.ch Subject: [R] regression analyses using a vector of means and a variance-covariance matrix

R 2.2.0, Windows XP. How can I perform a regression analysis using a vector of means and a variance-covariance matrix? I looked at the help screen for lm and did not see any option for using the aforementioned structures as input to lm. Thanks, John

John Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics Baltimore VA Medical Center GRECC, University of Maryland School of Medicine Claude D. Pepper OAIC, University of Maryland Clinical Nutrition Research Unit, and Baltimore VA Center Stroke of Excellence University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) [EMAIL PROTECTED] Confidentiality Statement: This email message, including any attachments, is for the\...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
RE: [R] ANCOVA
Dear Matt, The sequential sums of squares produced by anova() test for g ignoring x (and the interaction), x after g (and ignoring the interaction), and the x:g interaction after g and x. The second and third tests are generally sensible, but the first doesn't adjust for x, which is probably not what you want in general (although you've constructed x to be independent of g, which is typically not the case). Note that you would not usually want to test for equal intercepts in the model that doesn't constrain the slopes to be equal (though you haven't done this): Among other things, 0 is outside the range of x. The Anova() function in the car package will produce so-called type-II tests. I hope that this helps. John

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Matt Oliver Sent: Friday, August 27, 2004 4:10 PM To: [EMAIL PROTECTED] Subject: [R] ANCOVA

Dear R-help list, I am attempting to understand the proper formulation of ANCOVAs in R. I would like to test both parallelism and intercept equality for some data sets, so I have generated an artificial data set to ease my understanding. This is what I have done:

# Limits of random error added to vectors
min <- -0.1
max <- 0.1
x <- c(c(1:10), c(1:10)) + runif(20, min, max)
x1 <- c(c(1:10), c(1:10)) + runif(20, min, max)
y <- c(c(1:10), c(10:1)) + runif(20, min, max)
z <- c(c(1:10), c(11:20)) + runif(20, min, max)
g <- as.factor(c(rep(1, 10), rep(2, 10)))

# x and x1 have similar slopes and similar intercepts,
# x and y have different slopes and different intercepts,
# x and z have similar slopes with different intercepts.

# These are my full effects models
fitx1x <- lm(x1 ~ g + x + x:g)
fityx <- lm(y ~ g + x + x:g)
fitzx <- lm(z ~ g + x + x:g)

anova(fitx1x)
Analysis of Variance Table

Response: x1
          Df  Sum Sq Mean Sq    F value Pr(>F)
g          1   0.002   0.002     0.3348 0.5709
x          1 163.927 163.927 23456.8319 <2e-16 ***
g:x        1   0.002   0.002     0.2671 0.6123
Residuals 16   0.112   0.007
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.'
0.1 ` ' 1

These results confirm that x and x1 do not have significantly different means with respect to g; there is a significant linear relationship between x and x1 independent of g; and there is no evidence that the slopes of x and x1 in the two g groups are different.

anova(fityx)
Analysis of Variance Table

Response: y
          Df  Sum Sq Mean Sq    F value Pr(>F)
g          1   0.012   0.012     1.7344 0.2064
x          1   0.003   0.003     0.4399 0.5166
g:x        1 164.947 164.947 24274.4246 <2e-16 ***
Residuals 16   0.109   0.007
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

These results confirm that x and y do not have significantly different means with respect to g; there is not a significant linear relationship between x and y independent of g; and there is evidence that the slopes of x and y in the two g groups are significantly different.

anova(fitzx)
Analysis of Variance Table

Response: z
          Df Sum Sq Mean Sq    F value Pr(>F)
g          1 501.07  501.07 52709.9073 <2e-16 ***
x          1 165.39  165.39 17398.4057 <2e-16 ***
g:x        1   0.02    0.02     1.7472 0.2048
Residuals 16   0.15    0.01
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

These results confirm that x and z have significantly different means with respect to g; there is a significant linear relationship between x and z independent of g; and there is no evidence that the slopes of x and z in the two g groups are significantly different.

What I don't understand is how to formulate the model so that I can tell if the intercepts between the g groups are different. Also, how would I formulate an ANCOVA if I am dealing with Model II regressions? Any help would be greatly appreciated. Matt

== When you reach an equilibrium in biology, you're dead. - A. Mandell == Matthew J. Oliver Institute of Marine and Coastal Sciences 71 Dudley Road, New Brunswick New Jersey, 08901 http://marine.rutgers.edu/cool/ __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
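[Editor's note: the intercept-equality test Matt asks about is usually done by comparing nested models. With parallel slopes assumed (no g:x term), the g coefficient is the difference in intercepts, so an F test of the common-line model against the parallel-lines model tests equal intercepts directly. A sketch on data simulated the same way as Matt's z variable:]

```r
## Parallel lines with different intercepts (same construction as Matt's z).
set.seed(4)
x <- rep(1:10, 2) + runif(20, -0.1, 0.1)
z <- c(1:10, 11:20) + runif(20, -0.1, 0.1)  # same slope, intercepts differ by ~10
g <- factor(rep(1:2, each = 10))

fit.parallel <- lm(z ~ g + x)   # common slope, separate intercepts
fit.common   <- lm(z ~ x)       # single line (equal intercepts imposed)

## F test of equal intercepts, given equal slopes:
anova(fit.common, fit.parallel)
## car::Anova(fit.parallel) gives the analogous type-II test for g,
## as John suggests.
```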
Re: [R] Rcmdr X11 protocol error message
Dear Peter, On 30 Aug 2004 17:41:10 +0200 Peter Dalgaard [EMAIL PROTECTED] wrote: John Fox [EMAIL PROTECTED] writes: Dear Peter and Michael, Peter: Thank you for fielding this question. Michael initially contacted me directly, and I suggested that he write to the list since I didn't know the possible source of the problem, had never seen it (either under Windows or Linux), and never had a report of it before. Michael: Does this problem occur only with certain Rcmdr dialogs or generally? I'd also be curious to learn whether anyone else has experienced this problem, and whether anyone has run the Rcmdr package under Mandrake Linux without a problem. I can't seem to reproduce it on RH8 with R-1.9.x. I see a couple of other quirks though: If you bring up a dialog and click help, you get the help window OK, but it is not scrollable until you exit the dialog with OK or Cancel (by which time the point is moot either way, presumably). My Linux system badly needs updating, so I can't check this directly now. Does this problem occur with HTML help? As well, setting the Rcmdr grab.focus option to FALSE might get things unstuck. The View data set feature gets stuck in a tight loop and requires a Ctl-C in the console to allow it to display the data and process any events. Does this problem occur when showData() (from the relimp package) is called directly, or only when it's called from the Rcmdr? I'm sorry to impose these questions on you -- in the longer term, I have to update my Linux system so that I can again perform these tests myself. Thank you for your help, John -p -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph.
N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] initialization error with XEmacs init.el file
Dear Melanie, Are you using forward-slashes (/) or double-back-slashes (\\) to separate directories, as I believe you should, and placing quotes around the path? I hope that this helps, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Melanie A. Link-Perez Sent: Monday, August 30, 2004 5:08 PM To: [EMAIL PROTECTED] Subject: [R] initialization error with XEmacs init.el file I have successfully used XEmacs on my home PC for several months. On 8/25/04 I downloaded R version 1.9.1 (rw1091) from one of the R mirror sites, XEmacs installer (setup.exe) from http://www.xemacs.org/Download/win32/, and Jim Fox's latest configuration files (fox-ess-config.zip) from http://socserv.socsci.mcmaster.ca/jfox/Books/Companion/ESS/ to my Dell laptop, which is running MS Windows XP Professional version 2002. Upon launching XEmacs, I get the following error: (1) (initialization/error) An error has occurred while loading c:\.xemacs\init.el: Cannot open load file: executable The init.el file is version 0.5.6. I have triple-checked path names and placement of files in directories, but have not been able to track down this problem. I also tried replacing the init.el file with my older version (0.5.4) to no avail. I have tried to get around this by using Alt-x and shift-R to launch R, but the following appears below the buffer: Searching for program: No such file or directory, C:Program FilesR^Mw1091^HinRterm Certainly, there is no Mw1091 or HinRterm and I don't know where the program is getting that info (with the obvious typos). I am grateful for any assistance that is offered. Thanks, Melanie Link-Perez Miami University __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Rcmdr X11 protocol error message
Dear Michael, A question: Do you observe these problems with tcltk in general or just with the Rcmdr package? Is it possible for you to test a Tcl/Tk program outside of R? Regards, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard Sent: Tuesday, August 31, 2004 2:32 AM To: Michael Bibo Cc: [EMAIL PROTECTED] Subject: Re: [R] Rcmdr X11 protocol error message Michael Bibo [EMAIL PROTECTED] writes: What's happening? Both Java and R are reading events from the X11 event loop ``at the same time''. As a result, R is reading events meant for Java and vice-versa and bad things happen! There are two solutions: * use the Java graphics device which uses the Java graphics facilities to do the rendering; * don't run the GUI and the X11 device at the same time. The second solution is clearly not ideal. What we can do is to compile the X11 device code to be thread-safe. However, the same effect will be seen if we load another X11 event handler (e.g. the Tcl/Tk library). I don't think that is the issue. The X11 device and Tk sit on different connections to the X device, so they shouldn't be seeing each other's events (the event loops can starve each other, though, since they are multiplexed). I'd rather suspect a race condition with the Tk code itself - sending events to a window that is in the process of being destroyed and things like that. Which Tk version are we talking about, by the way? -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Rcmdr X11 protocol error message
Dear Peter and Michael, I installed Quantian on a spare machine that I have and observed the same warning messages that Michael has been reporting. These (and the problem with help files but not with viewing data sets that Peter reported) occurred with version 0.9-11 of Rcmdr but not with an earlier version. Since the code for the Rcmdr package was substantially reworked this summer, that seems to me a good candidate for the source of these problems, though I don't see why the changes should be problematic. I'm afraid that I'm insufficiently familiar with the inner workings of X11 and Tcl/Tk to be much help in figuring out what's wrong. Everything seems to work fine under Windows, as far as I can tell. It occurs to me that if the warning messages are benign, one approach would be to suppress them. I already intercept warnings and present them in dialog boxes; I could grep for X11 protocol error and simply ignore these. That doesn't seem to me a good solution, however. It would be better to understand what's happening. I'm copying this message to Dirk since he's mentioned that he plans to put the newer Rcmdr in Quantian. Dirk: Have you tested with Rcmdr 0.9-11? Thank you. John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard Sent: Tuesday, August 31, 2004 2:37 PM To: Michael Bibo Cc: [EMAIL PROTECTED] Subject: Re: [R] Rcmdr X11 protocol error message Michael Bibo [EMAIL PROTECTED] writes: John Fox jfox at mcmaster.ca writes: Dear Michael, A question: Do you observe these problems with tcltk in general or just with the Rcmdr package? Is it possible for you to test a Tcl/Tk program outside of R? And Peter asked which version of Tcl/Tk. Apparently my system has version 8.4 installed, specifically: Tcl-8.4.5-3-mdk (tclsh8.4) and Tk-8.4.5-3-mdk (libtk8.4.so). As I understand it, these are installed from RPMs on the installation DVD, and I note that they are mandrake specific. 
John - I wasn't sure if I had any other Tcl/Tk applications installed (it's not always obvious when installing from RPMs). I have certainly not encountered these error messages with any other application. I quickly downloaded WISH Supernotepad 1.2.3. This application requires Tcl/Tk 8.4 or greater. There are no graphics windows in this application, but plenty of dialogue boxes. It gave no errors. Is this an appropriate test? If this is a mandrake-configuration-specific problem, it may not be worth spending too much time investigating, as R Commander still works. I can always try re-installing Tcl/Tk from source when/if I have time. I don't think we have evidence that it's the Tcl installation, although it could be (Google suggests that there have been problems with at least some versions, although most references seem rather old). I can't seem to reproduce the effect with SUSE's tk-8.4.6-28 either. If it is a bug in Rcmdr, then we'd want to find it, and you have the only handle on it. BTW, sometimes Tk errors allow you to see a trace of the execution. Would this happen to be one of those situations? -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Rcmdr X11 protocol error message
Dear Michael, I can confirm Michael's observations: no problems until after a graphics device window is opened. Then increasing numbers of warnings each time commands are executed from within the Rcmdr. This happens whether the commands are executed from dialog boxes or from the (upper) script window. The warnings continue even if the graphics window is closed. Exiting from the Commander and restarting it (within the same R session) stops the warnings until another graphics window is opened. John -Original Message- From: Michael Bibo [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 01, 2004 6:21 AM To: John Fox Cc: 'Peter Dalgaard'; [EMAIL PROTECTED]; 'Dirk Eddelbuettel' Subject: Re: [R] Rcmdr X11 protocol error message John Fox wrote: Dear Peter and Michael, I installed Quantian on a spare machine that I have and observed the same warning messages that Michael has been reporting. These (and the problem with help files but not with viewing data sets that Peter reported) occurred with version 0.9-11 of Rcmdr but not with an earlier version. Since the code for the Rcmdr package was substantially reworked this summer, that seems to me a good candidate for the source of these problems, though I don't see why the changes should be problematic. I'm afraid that I'm insufficiently familiar with the inner workings of X11 and Tcl/Tk to be much help in figuring out what's wrong. Everything seems to work fine under Windows, as far as I can tell. It occurs to me that if the warning messages are benign, one approach would be to suppress them. I already intercept warnings and present them in dialog boxes; I could grep for X11 protocol error and simply ignore these. That doesn't seem to me a good solution, however. It would be better to understand what's happening. I'm copying this message to Dirk since he's mentioned that he plans to put the newer Rcmdr in Quantian. Dirk: Have you tested with Rcmdr 0.9-11? Thank you.
John John, Peter and Dirk, I'm glad I'm not the only one with a handle on it now. I've already exceeded my level of knowledge and skill in both Linux and R. But I'm happy to help track it down, if I can be of assistance. As I have said to John, the major value of Rcmdr to me is to help 'sell' R to others in my organisation currently using SPSS, and for this purpose, the Windows version at work is working fine. It is more of an annoyance at home, as it doesn't seem to stop anything working. Peter asked: BTW, sometimes Tk errors allow you to see a trace of the execution. Would this happen to be one of those situations? The short answer is I don't know (see my first line :-)). There is no button on the dialogue box saying 'more details...' or anything that obvious. Is there a log file somewhere on the system or some other way to generate such a trace? I have been experimenting, and the following seem reliable (at least on my system): Error messages don't seem to happen for the first graph generated. The second graph drawn also generates an error dialogue box with the error message repeated about 11 times. The third and subsequent times exactly the same graph is generated leads to an error message repeated about 21 times. (I don't know if the number of repetitions is meaningful). Note that the graphs are generated and visible when the error messages appear. Running any analysis that only writes output to the output window (such as fitting a regression model) generates the error messages before the output is written to the output window. The output appears when the error dialogue box is OK'd. I guess it's more likely to be responding to exiting the dialogue box than writing the output. But this only happens when such error messages have been generated previously in that session by creating a (second) graph. 
If the diagnostic panel plots for a regression model are called, the error messages appear, but this time the plots themselves are not visible in the graphics device (it is blank). When the error messages are OK'd, the plots appear in the device. When exiting Rcmdr, then, and OK'ing the Exit? dialogue box, the final error messages appear, but again only if they have already been generated by a graph call. I don't know if this will be helpful, but I thought reliable observations might give some clues. Regards, Michael [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Rcmdr X11 protocol error message
Dear Michael and Peter, I've made the following changes to the current development version of the Rcmdr package (on my web site, at http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/index.html, but not posted to CRAN): (1) The default for the Rcmdr grab.focus option is now set to FALSE for non-Windows systems. (2) There's a new report.X11.warnings option, which is set to FALSE by default; this suppresses these warnings so that they don't appear in Rcmdr warning dialogs. This just papers over the problem without really solving it. I'm not happy with these solutions, but perhaps they'll help until we discover the source of the problems. (I tested on my Quantian system.) Thanks for your help. John -Original Message- From: Michael Bibo [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 01, 2004 6:21 AM To: John Fox Cc: 'Peter Dalgaard'; [EMAIL PROTECTED]; 'Dirk Eddelbuettel' Subject: Re: [R] Rcmdr X11 protocol error message John Fox wrote: Dear Peter and Michael, I installed Quantian on a spare machine that I have and observed the same warning messages that Michael has been reporting. These (and the problem with help files but not with viewing data sets that Peter reported) occurred with version 0.9-11 of Rcmdr but not with an earlier version. Since the code for the Rcmdr package was substantially reworked this summer, that seems to me a good candidate for the source of these problems, though I don't see why the changes should be problematic. I'm afraid that I'm insufficiently familiar with the inner workings of X11 and Tcl/Tk to be much help in figuring out what's wrong. Everything seems to work fine under Windows, as far as I can tell. It occurs to me that if the warning messages are benign, one approach would be to suppress them. I already intercept warnings and present them in dialog boxes; I could grep for X11 protocol error and simply ignore these. That doesn't seem to me a good solution, however. It would be better to understand what's happening. 
I'm copying this message to Dirk since he's mentioned that he plans to put the newer Rcmdr in Quantian. Dirk: Have you tested with Rcmdr 0.9-11? Thank you. John John, Peter and Dirk, I'm glad I'm not the only one with a handle on it now. I've already exceeded my level of knowledge and skill in both Linux and R. But I'm happy to help track it down, if I can be of assistance. As I have said to John, the major value of Rcmdr to me is to help 'sell' R to others in my organisation currently using SPSS, and for this purpose, the Windows version at work is working fine. It is more of an annoyance at home, as it doesn't seem to stop anything working. Peter asked: BTW, sometimes Tk errors allow you to see a trace of the execution. Would this happen to be one of those situations? The short answer is I don't know (see my first line :-)). There is no button on the dialogue box saying 'more details...' or anything that obvious. Is there a log file somewhere on the system or some other way to generate such a trace? I have been experimenting, and the following seem reliable (at least on my system): Error messages don't seem to happen for the first graph generated. The second graph drawn also generates an error dialogue box with the error message repeated about 11 times. The third and subsequent times exactly the same graph is generated leads to an error message repeated about 21 times. (I don't know if the number of repetitions is meaningful). Note that the graphs are generated and visible when the error messages appear. Running any analysis that only writes output to the output window (such as fitting a regression model) generates the error messages before the output is written to the output window. The output appears when the error dialogue box is OK'd. I guess it's more likely to be responding to exiting the dialogue box than writing the output. But this only happens when such error messages have been generated previously in that session by creating a (second) graph. 
If the diagnostic panel plots for a regression model are called, the error messages appear, but this time the plots themselves are not visible in the graphics device (it is blank). When the error messages are OK'd, the plots appear in the device. When exiting Rcmdr, then, and OK'ing the Exit? dialogue box, the final error messages appear, but again only if they have already been generated by a graph call. I don't know if this will be helpful, but I thought reliable observations might give some clues. Regards, Michael [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
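[Editor's note: the two workarounds John describes in the message above (grab.focus and report.X11.warnings) are set through the Rcmdr entry in R's options list before the package is loaded. A minimal sketch, assuming the option names from John's message; the exact set of recognized options depends on the Rcmdr version in use:]

```r
## Set Rcmdr options before loading the package. Both option names here are
## taken from John's description of the development version; they are
## assumptions, not a documented API for every release.
options(Rcmdr = list(grab.focus = FALSE,          # don't grab keyboard focus
                     report.X11.warnings = FALSE) # suppress X11 protocol warnings
        )
# library(Rcmdr)  # load the package only after the options are set
```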
RE: [R] Rcmdr X11 protocol error message
Dear Michael, -Original Message- From: Michael Bibo [mailto:[EMAIL PROTECTED]] Sent: Saturday, September 04, 2004 9:59 AM To: John Fox Cc: 'Peter Dalgaard'; [EMAIL PROTECTED]; 'Dirk Eddelbuettel' Subject: Re: [R] Rcmdr X11 protocol error message . . . I have another reliable observation with the new version of Rcmdr, which I offer only because it might provide a clue as to what's going on: If you generate a scatterplot for two numerical variables grouped by a factor, the plot is drawn, but the dialogue box remains on screen and is unresponsive. When the graphics device is closed, you get an error message: 'Error: No graphics device active'. If you then try to generate the same scatterplot ungrouped (which previously worked fine), you receive the message 'Warning: display list redraw incomplete'. This only occurs the first time you try to draw the ungrouped scatterplot. Hope it helps. This works fine for me, under both Windows and Quantian. Is it possible that you forgot to left-click to position the legend for the scatterplot by groups (as is indicated at the bottom of the Groups dialog box)? I hope that this helps, John Peter, any hints as to how to generate that trace of the execution you mentioned? Michael [EMAIL PROTECTED]
RE: [R] a little question about R
You can use letters[1:4]. I hope that this helps, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of ?? Sent: Tuesday, September 07, 2004 7:42 PM To: [EMAIL PROTECTED] Subject: [R] a little question about R Hello, sir: Here's a little question about R which needs your help. Thanks in advance. If I want to make a sequence just like a, b, c, d (in other words, a vector consisting of the 4 characters a, b, c, d), how can I do it in a shortcut manner? Yes, I can do it as follows: c("a","b","c","d"), and the result is [1] "a" "b" "c" "d". But I remember there's a shortcut way to do the same thing, something like c("a":"d"). But the system said: 'Error in "a":"d" : NA/NaN argument'. So I wonder what the correct method is (I remember the expression is very similar to c("a":"d")). Thanks from the bottom of my heart. My best regards!
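A quick sketch of the built-in constant John refers to, runnable at the R prompt:

```r
# 'letters' and 'LETTERS' are built-in constants holding the 26
# lower- and upper-case letters; indexing gives any consecutive run.
x <- letters[1:4]
x
# [1] "a" "b" "c" "d"
LETTERS[1:4]
# [1] "A" "B" "C" "D"
```

There is no c("a":"d") shortcut because the `:` operator works on numbers, not character strings.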
RE: [R] R conversion
Dear Mark, See ?factanal for a factor-analysis function in the standard stats package (though factanal does ML factor analysis and not principal axes). Note that help.search("factor analysis") turns this up. With respect to your last question, it's not possible to know the source of the difference without knowing what the difference is. A guess is that specifying proportion=0.9 in SAS causes the program to use communality estimates rather than 1's on the diagonal of the correlation matrix. I hope this helps, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Mark Strivens Sent: Thursday, September 09, 2004 7:26 PM To: [EMAIL PROTECTED] Subject: [R] R conversion I am a newcomer to R trying to convert a SAS program to R. Does anyone know if there is a functional equivalent of the SAS 'FACTOR' procedure? For example, in SAS: proc factor DATA=cor method=principal rotate=varimax proportion=0.9 scree; where 'cor' is a correlation matrix (as from the R 'cor' function). This should get you a list of eigenvalues as well as a factor pattern matrix. Also, why, when I use the 'eigen' function in R, does it seem to give a subtly different answer for the eigenvalues than those generated by the above program? Many thanks for any help. Mark Strivens
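A hedged sketch of the factanal() route John points to, using the ability.cov example from the factanal help page in place of the poster's (unavailable) data:

```r
# ML factor analysis with the default varimax rotation, fitted directly
# to a covariance matrix -- roughly the role PROC FACTOR plays in SAS.
fa <- factanal(factors = 2, covmat = ability.cov)
print(fa$loadings, cutoff = 0.4)

# Eigenvalues of the corresponding correlation matrix, as eigen() reports them:
eigen(cov2cor(ability.cov$cov))$values
```

Note that factanal() reports loadings and uniquenesses rather than the SAS-style eigenvalue/scree output; eigen() on the correlation matrix supplies the latter.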
RE: [R] Proposal for New R List: Criticism? Comments?
Dear Bert, I believe that you've identified an important issue -- and one that's occasionally been discussed on this list previously -- but I'm not sure that another email list is a good solution. Some method of indexing functions in packages that would allow people to more easily locate them (e.g., author-supplied [i.e., not simply standard] keywords for each public object in a package) seems to me a more promising approach. Regards, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Berton Gunter Sent: Thursday, September 09, 2004 11:50 AM To: [EMAIL PROTECTED] Subject: [R] Proposal for New R List: Criticism? Comments? Folks: I would like to propose a new R list, tentatively labeled r-contents. I wish to briefly explain the purpose and format here and solicit public comments, pro or con, so feel free to criticize or suggest a better name and other improvements or alternatives. R presently consists of a suite of about a dozen core recommended packages and several hundred contributed packages comprising thousands -- perhaps tens of thousands -- of functions. Hopefully, this will continue to grow rapidly. No one can possibly keep track of all of this, and it is therefore a daunting task for someone seeking specific functionality to find it, especially when they are relatively new to R. Of course, R and CRAN (and Google and ...) have various search capabilities that help, but these are essentially keyword-based and so require the searcher to guess search terms that are at least reasonably close to function names and keywords. A lot of the time this works, but it can be tedious; some of the time one guesses wrong, and it doesn't work. S-Plus and much other software addresses this by providing a semantically-based Contents Index (or something like it) in their Help functionality. 
I find this quite useful, but creating and maintaining such an index seems to me to be extremely labor intensive, fraught with its own issues (what heading should I look under?), and, I think, not a good fit to the spirit and dynamics of R anyway. Not surprisingly, as a result, many of the questions addressed to r-help are of the form: "I want to do such and such. How do I do it?" While this certainly gives answers, I think the breadth of r-help and its etiquette and posting conventions result in an abruptness in many of our replies ("Read the posting guide!" "Read the help files and do what they say!") that discourages many users -- especially casual ones -- from posting questions, and may thus discourage use of R. Clearly, if true, this is not a good thing; on the other hand, I think that given r-help's purpose and practices, many of these abrupt replies may well be appropriate (I'm a curmudgeon at heart!). Hence, there is a mismatch between user needs and r-help services. To address this mismatch, I would like to propose a new list, r-contents, to serve essentially the same purpose as the S-Plus Contents index. It would be a place for users to post queries **only** of the form "I want to do such and such. How do I do it?" and receive answers that would all be **single phrases** of the form "package suchandsuch" or "?suchandsuchfunction". No further explanations regarding usage would be provided, though users would be free to follow up answers with private questions to the responder, although there should be no expectation of any response. Queries could be framed with as much or as little supporting detail as desired, with the obvious consequence that a more clearly framed question would be more likely to get a (better) response. No other posting conventions (aside from the usual ones regarding civility and adherence to topic) would be expected.
My hope is that such a list would both reduce unnecessary traffic on r-help and satisfy a genuine need in a less threatening way. I can certainly see downsides (I often learn a lot from How can I do this? queries), but I think, on balance, this approach might be useful. So I would like to subject the idea to public scrutiny and criticism, as well as the opportunity for improvement from suggested modifications or alternatives. If it's useful, this will be recognized; if it's not and/or no one is interested, that, too, will be made manifest. I would be especially grateful for the opinions of casual users or newbies, either publicly or privately. Cheers, -- Bert Gunter Genentech Non-Clinical Statistics South San Francisco, CA __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Proposal for New R List: Criticism? Comments?
Dear Brian et al., Jonathan's search site is excellent -- I use it frequently -- and for some reason new users seem unaware of help.search(), which, despite the fact that it searches only in installed packages, I also find very useful. A couple of comments, however: First, if help pages from all packages were available at a central location -- e.g., at CRAN -- help.search() could have an option to search that location. Second, I still feel that it would be useful to provide some other way of searching the space of all available functions. One idea, which I mentioned in an earlier message on this thread, would be a keyword system (again, different from the current set of standard keywords). The keywords could be accessed by help.search() and also compiled into an index. Regards, John -Original Message- From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] Sent: Friday, September 10, 2004 5:26 AM To: Jonathan Baron Cc: Adaikalavan Ramasamy; John Fox; R-help; 'Berton Gunter' Subject: Re: [R] Proposal for New R List: Criticism? Comments? On Fri, 10 Sep 2004, Jonathan Baron wrote: On 09/10/04 03:54, Adaikalavan Ramasamy wrote: There is another issue to be considered. Currently you need to have the relevant packages installed before help.search() bring it up. My work around this is to install all available packages just in case the function I need is nestled in some non-standard packages. I also update them rather frequently. I do this too, at my search site (where frequently=monthly) and you can search functions only, and use Boolean search expressions and phrases. But right now the entire set of packages takes about 885 meg (if I'm reading du correctly), which is less than my very modest collection of digital photos, and a tiny fraction of a 3-year-old standard hard disk. In other words, it is no big deal to install all the packages if you have your own computer. 
I am seeing about 520Mb for all base + CRAN packages under 1.9.1, and it will be rather less under 2.0.0 as more parts are stored compressed. BioC is a lot larger. It is, however, a BIG deal to install *all* the packages, and I am currently 10 short, since they depend on other software that I do not have a licence for or that will not compile (and there are three others I cannot reinstall using the current gcc). On AMD64 and Solaris there are several others, and something like 20 do not install on Windows. (I could use --install-fake as the CRAN checks do, but I have the almost-complete set installed to test R changes, not to test packages.) So I do see some merit in having a full-text search for R help available at some URL, as Jonathan has kindly provided. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
RE: [R] boxplot() from list
Dear Laura, You don't say what kind of objects are in the list, but suppose that they are matrices; here's a scaled-down example using 3 list elements:

M1 <- matrix(rnorm(30*4), 30, 4)
M2 <- matrix(rnorm(30*4), 30, 4)
M3 <- matrix(rnorm(30*4), 30, 4)
L <- list(M1=M1, M2=M2, M3=M3)
par(mfrow=c(3, 4))
lapply(L, function(x) apply(x, 2, boxplot))

I hope that this helps, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Laura Quinn Sent: Sunday, September 12, 2004 10:10 AM To: [EMAIL PROTECTED] Subject: [R] boxplot() from list I have a list containing 48 objects (each with 30 rows and 4 columns, all numeric), and wish to produce 4 boxplot series (with 48 plots in each), one for each column of each object. Basically I want a boxplot from boxplot(mylist[[]][,i]) for i in 1:4. It seems that I can create a boxplot of length 48 from the entire list, but I don't seem able to subscript to return 4 boxplots from the list. I have also tried to create 4 new lists (one for each column of each object) using variations on the following, but none seems to work:

newlist <- oldlist[,1]
newlist <- oldlist[[]][,1]
newlist <- oldlist[[]][,$colone]

Can anyone please offer some insight? Thanks in advance, Laura Quinn Institute of Atmospheric Science School of Earth and Environment University of Leeds Leeds LS2 9JT
RE: [R] mahalanobis distance
Dear Murli, Try ?mahalanobis, which, by the way, is turned up by help.search("mahalanobis"). I hope this helps, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Murli Nair Sent: Sunday, September 12, 2004 3:17 PM To: [EMAIL PROTECTED] Subject: [R] mahalanobis distance Is there a function that calculates the Mahalanobis distance in R? The dist function calculates 'euclidean', 'maximum', 'manhattan', 'canberra', 'binary' or 'minkowski'. Thanks ../Murli
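A minimal sketch of the mahalanobis() function John points to (the data here are simulated purely for illustration):

```r
# Squared Mahalanobis distance of each row of x from the sample mean,
# using the sample covariance matrix.
set.seed(42)
x <- matrix(rnorm(100 * 3), 100, 3)
d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))
head(round(d2, 2))
```

Note that mahalanobis() returns *squared* distances; take sqrt(d2) if ordinary distances are wanted.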
Re: [R] replacing NA's with 0 in a dataframe for specified columns
Dear David, How about the following?

cols <- c(1, 3)
x[, cols][is.na(x[, cols])] <- 0

I hope that this helps, John On Wed, 15 Sep 2004 14:44:53 -0400 David Kane [EMAIL PROTECTED] wrote: I know that there must be a cool way of doing this, but I can't think of it. Let's say I have a dataframe with NA's:

x <- data.frame(a = c(0,1,2,NA), b = c(0,NA,1,2), c = c(NA, 0, 1, 2))
x
   a  b  c
1  0  0 NA
2  1 NA  0
3  2  1  1
4 NA  2  2

I know it is easy to replace all the NA's with zeroes:

x[is.na(x)] <- 0
x
  a b c
1 0 0 0
2 1 0 0
3 2 1 1
4 0 2 2

But how do I do this for just columns a and c, leaving the NA in column b alone? Thanks, Dave Kane

R.version
         _
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    1
minor    9.1
year     2004
month    06
day      21
language R

John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/
Re: [R] Splitting vector into individual elements
Dear Paul, How about do.call(rgb, as.list(offred.rgb))? I hope that this helps, John On Wed, 15 Sep 2004 15:20:24 -0500 (CDT) Paul Roebuck [EMAIL PROTECTED] wrote: Is there a means to split a vector into its individual elements without going the brute-force route for arguments to a predefined function call?

offred.rgb <- c(1, 0, 0) * 0.60

## Brute force style
offred.col <- rgb(offred.rgb[1], offred.rgb[2], offred.rgb[3], names = "offred")

## Desired style
offred.col <- rgb(silver.bullet(offred.rgb), names = "offred")

Neither of my attempts gets it right:

silver.bullet.try1 <- function(x) {
    expr <- cat(x, sep = ",")
    return(parse(text = expr))
}

silver.bullet.try2 <- function(x) {
    expr <- expression(cat(x, sep = ","))
    return(eval(expr))
}

-- SIGSIG -- signature too long (core dumped) John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/
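John's do.call() suggestion, spelled out with the poster's own variables:

```r
# do.call() spreads the list elements out as individual arguments,
# i.e. this evaluates rgb(0.6, 0, 0).
offred.rgb <- c(1, 0, 0) * 0.60
offred.col <- do.call(rgb, as.list(offred.rgb))
offred.col
# [1] "#990000"

# To keep the names= argument as well, append it to the argument list:
do.call(rgb, c(as.list(offred.rgb), names = "offred"))
```

The failed attempts with cat() couldn't work because cat() prints and returns NULL rather than building an expression; do.call() sidesteps string manipulation entirely.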
RE: [R] How do I insert a newline in my title in a plot?
Dear Christos, This works for me, and has many times in the past. Is it possible that you used a forward-slash (/) rather than a back-slash (\)? Alternatively, perhaps this has something to do with using Greek fonts. I hope this helps, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Christos Rodopoulos Sent: Thursday, September 16, 2004 8:00 AM To: '[EMAIL PROTECTED]' Subject: FW: [R] How do I insert a newline in my title in a plot? Yes, I have tried, and nothing. It just shows the string with the \n, just as I typed it. -Original Message- From: Rashid Nassar [mailto:[EMAIL PROTECTED]] Sent: Thursday, 16 September 2004 15:44 To: Christos Rodopoulos Subject: Re: [R] How do I insert a newline in my title in a plot? Have you not tried what you have already suggested: title("this is a title\nIn 2 lines")? On Thu, 16 Sep 2004, Christos Rodopoulos wrote: Hello, I want you to help me with a simple, I think, question: How do I insert a newline into the title of a plot I have made? Is it only done with Hershey fonts (vfont = ??? and so on)? I do not understand Hershey fonts, and I am afraid to use them, since I use Greek fonts (iso8859-7). I probably need something like: title("This is a title\nIn 2 lines"); any help?
RE: [R] Proposal for New R List: Criticism? Comments?
Dear Martin, Thanks for pointing this out -- I'm ashamed to say that I forgot about \concept{} entries. As you say (aside from people stupidly forgetting that they exist), the problem is to get people to use them. How about requiring one or more concept entries for each help file? Regards, John -Original Message- From: Martin Maechler [mailto:[EMAIL PROTECTED] Sent: Friday, September 17, 2004 7:57 AM To: John Fox Cc: 'R-help' Subject: RE: [R] Proposal for New R List: Criticism? Comments? Hi John et al. I'm coming late to this thread (because of vacation), JohnF == John Fox [EMAIL PROTECTED] on Fri, 10 Sep 2004 10:56:51 -0400 writes: JohnF Dear Brian et al., JohnF Jonathan's search site is excellent -- I use it JohnF frequently -- and for some reason new users seem JohnF unaware of help.search(), which, despite the fact JohnF that it searches only in installed packages, I also JohnF find very useful. yes and yes. JohnF A couple of comments, however: First, if help pages JohnF from all packages were available at a central JohnF location -- e.g., at CRAN -- help.search() could have JohnF an option to search that location. Second, I still JohnF feel that it would be useful to provide some other JohnF way of searching the space of all available JohnF functions. One idea, which I mentioned in an earlier JohnF message on this thread, would be a keyword system JohnF (again, different from the current set of standard JohnF keywords). \concept{} was introduced for this JohnF The keywords could be accessed by help.search() and this happens (by default) for \concept{} entries JohnF and also compiled into an index. this doesn't happen yet. The ``real problem'' of course is that package authors need to write all these \concept{} entries before such an index can really become useful. 
Martin Maechler
Re: [R] Issue with predict() for glm models
# #Degrees of Freedom: 39 Total (i.e. Null); 37 Residual #Null Deviance: 55.45 #Residual Deviance: 41.67 AIC: 47.67

# Predicted Probabilities for Test Data #
New.Data <- data.frame(predictors.train1=predictors.test[,1], predictors.train2=predictors.test[,2])
logreg.pred.prob.test <- predict.glm(log.reg, New.Data, type="response")
logreg.pred.prob.test
# [1] 0.51106406 0.15597423 0.04948404 0.03863875 0.35587589 0.71331091
# [7] 0.17320087 0.14176632 0.30966718 0.61878952 0.12525988 0.21271139
#[13] 0.70068113 0.18340723 0.10295501 0.44591568 0.72285161 0.31499339
#[19] 0.65789420 0.42750139 0.14435889 0.93008117 0.70798465 0.80109005
#[25] 0.89161472 0.47480625 0.56520952 0.63981834 0.57595189 0.60075882
#[31] 0.96493393 0.77015507 0.87643986 0.62973986 0.63043351 0.45398955
#[37] 0.80855782 0.90835588 0.54809117 0.11568637

Of course, notice that the vector of predicted probabilities has only 40 elements, while New.Data has 50 rows (since n.test has 25 per group for 2 groups) and thus should yield 50 predicted probabilities. As it turns out, the output is for the training-data predictors and not for the New.Data, as I would like it to be. I should also note that I have made sure that the names of the predictors in New.Data are the same as the names of the predictors within the glm object (i.e., within log.reg), as this is what is done in the example for predict.glm() in the help files. Could someone help me understand either what I am doing incorrectly or what problems there might be within the predict() function? I should mention that I tried the same program using predict.glm() and obtained the same problematic results. Thanks and take care, Joe Joe Rausch, M.A. Psychology Liaison Lab for Social Research 917 Flanner Hall University of Notre Dame Notre Dame, IN 46556 (574) 631-3910 "If we knew what it was we were doing, it would not be called research, would it?"
- Albert Einstein John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/
RE: [R] Issue with predict() for glm models
Dear Uwe, Unless I've somehow messed this up, as I mentioned yesterday, what you suggest doesn't seem to work when the predictor is a matrix. Here's a simplified example:

X <- matrix(rnorm(200), 100, 2)
y <- (X %*% c(1,2) + rnorm(100)) > 0
dat <- data.frame(y=y, X=X)
mod <- glm(y ~ X, family=binomial, data=dat)
new <- data.frame(X = matrix(rnorm(20), 2))
predict(mod, new)
          1           2           3            4           5           6
 1.81224443 -5.92955128  1.98718051 -10.05331521  2.6506    -2.50635812
          7           8           9           10          11          12
 5.63728698 -0.94845276 -3.61657377  -1.63141320  5.03417372  1.80400271
         13          14          15           16          17          18
 9.32876273 -5.32723406  5.29373023  -3.90822713 -10.95065186 4.90038016
. . .
         97          98          99          100
-6.92509812  0.59357486 -1.17205723   0.04209578

Note that there are 100 rather than 10 predicted values. But with individual predictors (rather than a matrix),

x1 <- X[,1]
x2 <- X[,2]
dat.2 <- data.frame(y=y, x1=x1, x2=x2)
mod.2 <- glm(y ~ x1 + x2, family=binomial, data=dat.2)
new.2 <- data.frame(x1=rnorm(10), x2=rnorm(10))
predict(mod.2, new.2)
         1          2          3          4          5          6          7
 6.5723823  0.6356392  4.0291018 -4.7914650  2.1435485 -3.1738096 -2.8261585
         8          9         10
-1.5255329 -4.7087592  4.0619290

works as expected (?). Regards, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Uwe Ligges Sent: Thursday, September 23, 2004 1:33 AM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: [R] Issue with predict() for glm models [EMAIL PROTECTED] wrote: Hello everyone, I am having a problem using the predict (or the predict.glm) function in R. Basically, I run the glm model on a training data set and try to obtain predictions for a set of new predictors from a test data set (i.e., not the predictors that were utilized to obtain the glm parameter estimates). Unfortunately, every time that I attempt this, I obtain the predictions for the predictors that were used to fit the glm model.
I have looked at the R mailing list archives and don't believe I am making the same mistakes that have been made in the past, and I have also tried to closely follow the predict.glm example in the help file. Here is an example of what I am trying to do:

set.seed(545345)

# Necessary Variables #
p <- 2
train.n <- 20
test.n <- 25
mean.vec.1 <- c(1,1)
mean.vec.2 <- c(0,0)
Sigma.1 <- matrix(c(1,.5,.5,1), p, p)
Sigma.2 <- matrix(c(1,.5,.5,1), p, p)

# Load MASS Library #
library(MASS)

# Data to Parameters for Logistic Regression Model #
train.data.1 <- mvrnorm(train.n, mu=mean.vec.1, Sigma=Sigma.1)
train.data.2 <- mvrnorm(train.n, mu=mean.vec.2, Sigma=Sigma.2)
train.class.var <- as.factor(c(rep(1,train.n), rep(2,train.n)))
predictors.train <- rbind(train.data.1, train.data.2)

# Test Data Where Predictions for Probabilities Using Logistic Reg. #
# From Training Data are of Interest #
test.data.1 <- mvrnorm(test.n, mu=mean.vec.1, Sigma=Sigma.1)
test.data.2 <- mvrnorm(test.n, mu=mean.vec.2, Sigma=Sigma.2)
predictors.test <- rbind(test.data.1, test.data.2)

# Run Logistic Regression on Training Data #
log.reg <- glm(train.class.var ~ predictors.train, family=binomial(link="logit"))

Well, you haven't specified the data argument, but have given the two variables directly. Exactly those variables will be used in the predict() step below! If you want the predict() step to work, use something like:

train <- data.frame(class = train.class.var, predictors = predictors.train)
log.reg <- glm(class ~ ., data = train, family=binomial(link="logit"))

log.reg
#Call: glm(formula = train.class.var ~ predictors.train, family = binomial(link = "logit"))
#
#Coefficients:
#      (Intercept)  predictors.train1  predictors.train2
#           0.5105            -0.2945            -1.0811
#
#Degrees of Freedom: 39 Total (i.e.
Null); 37 Residual #Null Deviance: 55.45 #Residual Deviance: 41.67 AIC: 47.67

# Predicted Probabilities for Test Data #
New.Data <- data.frame(predictors.train1=predictors.test[,1], predictors.train2=predictors.test[,2])
logreg.pred.prob.test <-
RE: [R] Issue with predict() for glm models
Dear Uwe, -Original Message- From: Uwe Ligges [mailto:[EMAIL PROTECTED]] Sent: Thursday, September 23, 2004 8:06 AM To: John Fox Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [R] Issue with predict() for glm models John Fox wrote: Dear Uwe, Unless I've somehow messed this up, as I mentioned yesterday, what you suggest doesn't seem to work when the predictor is a matrix. Here's a simplified example:

X <- matrix(rnorm(200), 100, 2)
y <- (X %*% c(1,2) + rnorm(100)) > 0
dat <- data.frame(y=y, X=X)
mod <- glm(y ~ X, family=binomial, data=dat)
new <- data.frame(X = matrix(rnorm(20), 2))
predict(mod, new)

Dear John, the questioner had a 2-column matrix with 40 and one with 50 observations (not a 100-column matrix with 2 observations), and for those matrices it works... Indeed, and in my example the matrix predictor X has 2 columns and 100 rows; I did screw up the matrix for the new data to be used for predictions (in the example I sent today but not yesterday), but even when this is done right -- where the new data has 10 rows and 2 columns -- there are 100 (not 10) predicted values:

X <- matrix(rnorm(200), 100, 2)   # original predictor matrix with 100 rows
y <- (X %*% c(1,2) + rnorm(100)) > 0
dat <- data.frame(y=y, X=X)
mod <- glm(y ~ X, family=binomial, data=dat)
new <- data.frame(X = matrix(rnorm(20), 10, 2))   # corrected -- note 10 rows
predict(mod, new)   # note 100 predicted values
         1          2           3           4           5          6
5.75238091 0.31874587 -3.00515893 -3.77282121 -1.97511221 0.54712914
         7          8           9          10          11          12
1.85091226 4.38465524 -0.41028694 -1.53942869  0.57613555 -1.82761518
. . .
        91         92          93          94          95         96
0.36210780 1.71358713 -9.63612775 -4.54257576 -5.29740468 2.64363405
        97          98         99         100
-4.45478627 -2.44973209 2.51587537 -4.09584837

Actually, I now see the source of the problem: the data frames dat and new don't contain a matrix named X; rather, the matrix is split column-wise:

names(dat)
[1] "y"   "X.1" "X.2"
names(new)
[1] "X.1" "X.2"

Consequently, both glm and predict pick up the X in the global environment (since there is none in dat or new), which accounts for why there are 100 predicted values. Using list() rather than data.frame() produces the originally expected behaviour:

new <- list(X = matrix(rnorm(20), 10, 2))
predict(mod, new)
         1          2          3          4          5          6          7
 5.9373064  0.3687360 -8.3793045  0.7645584 -2.6773842  2.4130547  0.7387318
         8          9         10
-0.4347916  8.4678728 -0.8976054

Regards, John Best, Uwe
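The column-splitting behaviour John identifies can also be sketched the other way around: write the formula in terms of the names data.frame() actually creates, and predict() then behaves. A minimal sketch (the seed is arbitrary, not from the thread):

```r
# data.frame() splits a matrix column X into variables X.1 and X.2, so a
# formula written in terms of those names finds them in newdata as well.
set.seed(1)
X <- matrix(rnorm(200), 100, 2)
y <- as.integer(X %*% c(1, 2) + rnorm(100) > 0)
dat <- data.frame(y = y, X = X)                   # columns: y, X.1, X.2
mod <- glm(y ~ X.1 + X.2, family = binomial, data = dat)
new <- data.frame(X = matrix(rnorm(20), 10, 2))   # columns: X.1, X.2
length(predict(mod, new))                         # 10, as expected
```

This is equivalent to Uwe's glm(class ~ ., data = train) suggestion: either spell out the split names or let `.` pick them up.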
Basically, I run the glm model on a training data set and try to obtain predictions for a set of new predictors from a test data set (i.e., not the predictors that were utilized to obtain the glm parameter estimates). Unfortunately, every time that I attempt this, I obtain the predictions for the predictors that were used to fit the glm model. I have looked at the R mailing list archives and don't believe I am making the same mistakes that have been made in the past and also have tried to closely follow the predict.glm example in the help file. Here is an example
RE: [R] Issue with predict() for glm models
Dear Uwe, -Original Message- From: Uwe Ligges [mailto:[EMAIL PROTECTED] Sent: Thursday, September 23, 2004 11:37 AM To: John Fox Cc: [EMAIL PROTECTED] Subject: Re: [R] Issue with predict() for glm models . . . John, note that I used glm(y ~ .) (the dot!), because the names are automatically chosen to be X.1 and X.2, hence you cannot use X in the formula in this case ... Best, Uwe OK -- I see. I did notice that you used . in the formula but didn't make the proper connection! Thanks, John __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
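Another way to keep a matrix predictor intact inside a data frame is I(), which marks the matrix "as is" so that data.frame() does not split it into X.1 and X.2; here is a quick sketch with fresh simulated data (not the objects from the messages above):

```r
set.seed(1)
X <- matrix(rnorm(200), 100, 2)
y <- drop(X %*% c(1, 2) + rnorm(100)) > 0
dat <- data.frame(y = y, X = I(X))   # I() keeps X as a single matrix column
names(dat)                           # "y" "X" -- not "y" "X.1" "X.2"
mod <- glm(y ~ X, family = binomial, data = dat)
new <- data.frame(X = I(matrix(rnorm(20), 10, 2)))
length(predict(mod, new))            # 10 predictions, as originally expected
```

With I(), the formula y ~ X finds the matrix inside the data frame, so predict() uses the 10 rows of the new data rather than falling back to the X in the global environment.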
[R] RE: using tcltk in R under ESS/XEmacs on Windows
Dear Andy,

Perhaps it's possible, but I've never been able to get tcltk to work properly under ESS/XEmacs on Windows.

Regards,
John

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Liaw, Andy
Sent: Friday, September 24, 2004 3:03 PM
To: [EMAIL PROTECTED]; ESS (Help list)
Subject: using tcltk in R under ESS/XEmacs on Windows

Sorry for the cross-post. Not sure where the problem is... A while back I posted an R function to R-help:

cd <- function (dir = tclvalue(tkchooseDirectory()), saveOld = FALSE,
                loadNew = TRUE) {
    stopifnot(require(tcltk))
    if (saveOld) save.image(compress = TRUE)
    setwd(dir)
    rm(list = ls(all = TRUE, envir = .GlobalEnv), envir = .GlobalEnv)
    if (loadNew && file.exists(".RData")) {
        loaded <- load(".RData", envir = .GlobalEnv)
        return(invisible(loaded))
    }
}

where the default value for the `dir' argument is to run the tcltk directory chooser and get the directory name chosen. (Thanks to Prof. John Fox for the tcltk part!!) While this function works fine under Rgui on Windows, it doesn't work when running R within ESS (5.2.3) and XEmacs (21.4.13). The directory chooser never shows up, and dir just gets the empty string. Does anyone have any idea what could be the problem? I'd very much appreciate any pointers.

Best,
Andy

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/ess-help
__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] RE: using tcltk in R under ESS/XEmacs on Windows
Dear Andy,

I've taken the liberty of copying this reply to the r-help and ess-help lists, since my last reply was misleading. I believe that the problem with tkchooseDirectory() (you can see it by just calling that function directly) is that it brings up a native Windows dialog, and it's that kind of tcltk command that I've had trouble with in ESS/XEmacs under Windows.

Regards,
John

-----Original Message-----
From: Liaw, Andy [mailto:[EMAIL PROTECTED]]
Sent: Friday, September 24, 2004 3:23 PM
To: 'John Fox'
Subject: RE: using tcltk in R under ESS/XEmacs on Windows

Hi John,

demo(tkfaq) did work for me. Maybe that's the exception...

Thanks!
Andy

From: John Fox

Dear Andy,

Perhaps it's possible, but I've never been able to get tcltk to work properly under ESS/XEmacs on Windows.

Regards,
John

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Liaw, Andy
Sent: Friday, September 24, 2004 3:03 PM
To: [EMAIL PROTECTED]; ESS (Help list)
Subject: using tcltk in R under ESS/XEmacs on Windows

Sorry for the cross-post. Not sure where the problem is... A while back I posted an R function to R-help:

cd <- function (dir = tclvalue(tkchooseDirectory()), saveOld = FALSE,
                loadNew = TRUE) {
    stopifnot(require(tcltk))
    if (saveOld) save.image(compress = TRUE)
    setwd(dir)
    rm(list = ls(all = TRUE, envir = .GlobalEnv), envir = .GlobalEnv)
    if (loadNew && file.exists(".RData")) {
        loaded <- load(".RData", envir = .GlobalEnv)
        return(invisible(loaded))
    }
}

where the default value for the `dir' argument is to run the tcltk directory chooser and get the directory name chosen. (Thanks to Prof. John Fox for the tcltk part!!) While this function works fine under Rgui on Windows, it doesn't work when running R within ESS (5.2.3) and XEmacs (21.4.13). The directory chooser never shows up, and dir just gets the empty string. Does anyone have any idea what could be the problem? I'd very much appreciate any pointers.
Best, Andy __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/ess-help -- Notice: This e-mail message, together with any attachments, contains information of Merck Co., Inc. (One Merck Drive, Whitehouse Station, New Jersey, USA 08889), and/or its affiliates (which may be known outside the United States as Merck Frosst, Merck Sharp Dohme or MSD and in Japan, as Banyu) that may be confidential, proprietary copyrighted and/or legally privileged. It is intended solely for the use of the individual or entity named on this message. If you are not the intended recipient, and have received this message in error, please notify us immediately by reply e-mail and then delete it from your system. -- __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Diagnosing trouble with R-2.0, Fedora Core 2, and Rcmdr
Dear Paul,

I'm currently out of town, limiting a bit my ability to answer these questions definitively, but I'll give it a shot (hoping that my memory isn't faulty):

On Mon, 11 Oct 2004 13:08:39 -0500 Paul Johnson [EMAIL PROTECTED] wrote:

Greetings, R-help!

On 2 Fedora Core 2 Linux systems, I've completely erased the previous R and all packages and then installed R-2.0 and installed fresh packages. In using Rcmdr, I see some trouble and I wonder if other people see this and if it is due to the tcl/tk, or R, or Rcmdr. (If readers have not yet tried Rcmdr, I recommend it not just because of the GUI it provides, but also because Prof. Fox has created several very handy commands, like scatter3d, which make it possible to use advanced packages like rgl. Rcmdr provides several very handy wrapper commands that just work and users don't have to fuss over so many details! And, if you are new at R, you can use pull-down menus and then Rcmdr displays the commands that correspond with those menu options. Very educational!)

In no particular order, here are the issues I see with R-2.0 and Rcmdr:

1. In the panel to select a dataset from a package (menu: Data/Data in packages), I can type in the name of a dataset, but the GUI will not let me choose the name of a package from which to select a dataset. Nothing happens when I pick a package name. (With previous R/Rcmdr, I did not have this trouble.)

The problem is that the Rcmdr code doesn't quote the name of the package. I believe that I've fixed this problem, but haven't posted the fixed version either to my web site or to CRAN.

2. When using Rcmdr, some of the output from commands shows in the Rcmdr bottom window, but some is sent to the xterm in which R was started. For example, with a scatter3d() command, if I add the option model.summary=T, the model summary shows in the xterm. But some other models show results in the bottom pane of Rcmdr.
And, it seems to me, there are some commands in which I've noticed some of the output going to the Rcmdr bottom buffer and some to the xterm.

As I recall, I've fixed this problem as well, but perhaps not in a completely general way. Again, I haven't yet released the fixed version.

3. When I want to exit Rcmdr and choose the pull-down exit from Commander and R, I get a Segmentation Fault that closes Rcmdr and R. This I haven't seen before.

4. 3d plots do not display on the screen until I click in the rgl window that displays the graph. This might be a video card/AGP driver problem, I suppose. These systems have the NVidia FX5200 cards and the NVidia proprietary X driver. If other users don't see the same on other kinds of systems, I guess that will tell me where the problem lies.

I believe that I've noticed this problem on a Quantian system.

5. At random times, I get an error tcl/tk window popping up with a message about MASS and categorical variables. It says

Error in data(package=parse(text=packageName)) : 'MASS' must be character string or NULL

It says that over and over; sometimes it has other package names, as in:

Error in data(package=parse(text=packageName)) : 'relimp' must be character string or NULL

These seem to be harmless?

This reflects the first problem, and is produced by the way that the Rcmdr initially suppresses and then reports warnings. I'll check further when I return home.

Sorry for the problems.

John

-- Paul E. Johnson email: [EMAIL PROTECTED] Dept. of Political Science http://lark.cc.ku.edu/~pauljohn 1541 Lilac Lane, Rm 504 University of Kansas Office: (785) 864-9086 Lawrence, Kansas 66044-3177 FAX: (785) 864-5700

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Diagnosing trouble with R-2.0, Fedora Core 2, and Rcmdr
Dear Paul,

I've now had a chance to check into these problems further:

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Paul Johnson
Sent: Monday, October 11, 2004 1:09 PM
To: r help
Subject: [R] Diagnosing trouble with R-2.0, Fedora Core 2, and Rcmdf

Greetings, R-help!

On 2 Fedora Core 2 Linux systems, I've completely erased the previous R and all packages and then installed R-2.0 and installed fresh packages. In using Rcmdr, I see some trouble and I wonder if other people see this and if it is due to the tcl/tk, or R, or Rcmdr. (If readers have not yet tried Rcmdr, I recommend it not just because of the GUI it provides, but also because Prof. Fox has created several very handy commands, like scatter3d, which make it possible to use advanced packages like rgl. Rcmdr provides several very handy wrapper commands that just work and users don't have to fuss over so many details! And, if you are new at R, you can use pull-down menus and then Rcmdr displays the commands that correspond with those menu options. Very educational!)

In no particular order, here are the issues I see with R-2.0 and Rcmdr:

1. In the panel to select a dataset from a package (menu: Data/Data in packages), I can type in the name of a dataset, but the GUI will not let me choose the name of a package from which to select a dataset. Nothing happens when I pick a package name. (With previous R/Rcmdr, I did not have this trouble.)

I had in fact fixed this in the development version of the package (Rcmdr 0.9-12). I've now posted that to my web site, at http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/index.html. I'd like to do some more testing before sending this version to CRAN.

2. When using Rcmdr, some of the output from commands shows in the Rcmdr bottom window, but some is sent to the xterm in which R was started. For example, with a scatter3d() command, if I add the option model.summary=T, the model summary shows in the xterm.
But some other models show results in the bottom pane of Rcmdr. And, it seems to me, there are some commands in which I've noticed some of the output going to the Rcmdr bottom buffer and some to the xterm.

I did in fact try to deal with this problem. The solution that I employed didn't work for scatter3d() with model.summary=TRUE, since the printed summaries were produced as side effects by scatter3d(). That's a bad practice, and I've fixed it; now scatter3d() returns a list of summaries, and the Rcmdr can cope with that. (As you know, it's not my intention that model.summary=TRUE be set from the Rcmdr GUI, but it's better to have it work if you enter this from the script editor.) You imply, however, that there are other cases of this problem. It would help to know what they are -- assuming that they're not fixed in the development version now on my web site.

3. When I want to exit Rcmdr and choose the pull-down exit from Commander and R, I get a Segmentation Fault that closes Rcmdr and R.

I haven't been able to duplicate this problem, to this point testing only under Windows XP. Does this occur only when an rgl graphics device is open? (The Rcmdr should close the rgl device before it exits.)

4. 3d plots do not display on the screen until I click in the rgl window that displays the graph. This might be a video card/AGP driver problem, I suppose. These systems have the NVidia FX5200 cards and the NVidia proprietary X driver. If other users don't see the same on other kinds of systems, I guess that will tell me where the problem lies.

As I said, I believe that I've seen this problem before under Quantian. It would help to know whether it's peculiar to the Rcmdr, scatter3d(), or rgl.

5. At random times, I get an error tcl/tk window popping up with a message about MASS and categorical variables.
It says

Error in data(package=parse(text=packageName)) : 'MASS' must be character string or NULL

It says that over and over; sometimes it has other package names, as in:

Error in data(package=parse(text=packageName)) : 'relimp' must be character string or NULL

These seem to be harmless?

This should go away now that problem 1 is fixed.

Regards,
John

-- Paul E. Johnson email: [EMAIL PROTECTED] Dept. of Political Science http://lark.cc.ku.edu/~pauljohn 1541 Lilac Lane, Rm 504 University of Kansas Office: (785) 864-9086 Lawrence, Kansas 66044-3177 FAX: (785) 864-5700

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Seeking advice on introducing R
Dear Arin,

I have a variety of materials at http://socserv.socsci.mcmaster.ca/jfox/Courses/S-course/index.html that may be helpful.

Regards,
John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of data Analytics
Sent: Wednesday, October 13, 2004 8:19 AM
To: [EMAIL PROTECTED]
Subject: [R] Seeking advice on introducing R

Dear List:

This is in search of advice/opinions for a lecture presentation on R. I have this presentation scheduled on Saturday (16-October-2004); the topic is introduction to R. The audience is a group of researchers and scholars who have either never used R or are beginning to use R, but otherwise have used a package called limdeb, and SPSS. There is also a request to cover Rgnumeric, and Excel bindings to R. I have no knowledge of limdeb.

Would greatly appreciate your advice and opinions, and I look forward to learning from your experiences in introducing R to similar audiences. Also, pointers to web resources would be greatly appreciated.

TIA,
Arin Basu
Kolkata, India

[[alternative HTML version deleted]]

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Testing for normality of residuals in a regression model
Dear Federico,

A problem with applying a standard test of normality to LS residuals is that the residuals are correlated and heteroskedastic even if the standard assumptions of the model hold. In a large sample, this is unlikely to be problematic (unless there's an unusual data configuration), but in a small sample the effect could be nontrivial.

One approach is to use BLUS residuals, which transform the LS residuals into a smaller set of uncorrelated, homoskedastic residuals (assuming the correctness of the model). A search of R resources didn't turn up anything for BLUS, but they shouldn't be hard to compute. This is a standard topic covered in many econometrics texts.

You might consider the alternative of generating a bootstrapped confidence envelope for the QQ plot; the qq.plot() function in the car package will do this for a linear model.

I hope this helps,
John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Federico Gherardini
Sent: Friday, October 15, 2004 7:44 AM
To: [EMAIL PROTECTED]
Subject: [R] Testing for normality of residuals in a regression model

Hi all,

Is it possible to have a test value for assessing the normality of residuals from a linear regression model, instead of simply relying on qqplots? I've tried to use fitdistr to try and fit the residuals with a normal distribution, but fitdistr only returns the parameters of the distribution and the standard errors, not the p-value. Am I missing something?

Cheers,
Federico

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
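For anyone wanting to try the QQ-plot suggestion directly, here is a minimal sketch using the Duncan data that ships with car. Note that the function was called qq.plot() in car versions of that era and is qqPlot() in later versions, so the sketch tries both names:

```r
library(car)  # provides the Duncan occupational-prestige data
mod <- lm(prestige ~ income + education, data = Duncan)
# QQ plot of studentized residuals with a simulated (bootstrap) confidence
# envelope; points outside the envelope suggest non-normal errors
if (exists("qq.plot")) qq.plot(mod, simulate = TRUE) else qqPlot(mod, simulate = TRUE)
```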
RE: [R] one more Rcmdr problem
Dear Adrian, Since I moved the discussion of this problem to the r-devel list, I should have reported back to r-help (1) that the problem was general to tlctk (not just the Rcmdr package), and (2) that Duncan Murdoch made a change to the patched Windows version of R 2.0.0 that made the problem go away. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Adrian Dusa Sent: Friday, October 15, 2004 11:44 AM To: 'Erich Neuwirth'; 'John Fox'; 'Duncan Murdoch' Cc: [EMAIL PROTECTED] Subject: RE: [R] one more Rcmdr problem Brilliant. With R 2.0.0 patched and Rcmdr 0.9-12 all problems are gone. Thank you all, and a special thank you for the R Commander; it saves a lot of effort for teaching purposes. Best regards, Adrian -Original Message- From: Erich Neuwirth [mailto:[EMAIL PROTECTED] Sent: 15 octombrie 2004 18:27 To: Adrian Dusa Cc: [EMAIL PROTECTED] Subject: Re: [R] one more Rcmdr problem I did experience the same problem. After installing R 2.0.0 patched and downloading the source for Rcmdr_0.99-12 from John Fox's Web page http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/ and recompiling the package it now works. Ctrl-C, Ctrl-X, and Ctrl-V, in the text boxes of Rcmdr now work. Adrian Dusa wrote: Hello, I'm using R 2.0.0 with the latest Rcmdr package installed from CRAN, on Windows XP Professional. When trying to copy some commands or results, either from the upper or lower text window, this causes Rcmdr to crash: R for Windows GUI front-end has encountered a problem and needs to close Did anyone have the same problem? I don't think it's my system, as it happened to reinstall my Windows just a few days ago, and the same problem occurred in the former one. Regards, Adrian Adrian Dusa Romanian Social Data Archive 1 Schitu Magureanu Bd. 050025 Bucharest sector 5 Tel. 
+40 21 3126618\ +40 21 3153122/ int.101 -- Erich Neuwirth, Computer Supported Didactics Working Group Visit our SunSITE at http://sunsite.univie.ac.at Phone: +43-1-4277-38624 Fax: +43-1-4277-9386 -- This message was scanned for spam and viruses by BitDefender For more information please visit http://linux.bitdefender.com/ -- This message was scanned for spam and viruses by BitDefender For more information please visit http://linux.bitdefender.com/ __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Testing for normality of residuals in a regression model
Dear Kjetil,

I don't believe that these are BLUS residuals, but since the last n - r effects are projections onto an orthogonal basis for the residual subspace, they should do just fine (as long as the basis vectors have the same length, which I think is the case, but perhaps someone can confirm). The general idea is to transform the LS residuals into an uncorrelated, equal-variance set.

Regards,
John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: Kjetil Brinchmann Halvorsen [mailto:[EMAIL PROTECTED]]
Sent: Friday, October 15, 2004 9:12 AM
To: John Fox
Cc: 'Federico Gherardini'; [EMAIL PROTECTED]
Subject: Re: [R] Testing for normality of residuals in a regression model

John Fox wrote:

Dear Federico,

A problem with applying a standard test of normality to LS residuals is that the residuals are correlated and heteroskedastic even if the standard assumptions of the model hold. In a large sample, this is unlikely to be problematic (unless there's an unusual data configuration), but in a small sample the effect could be nontrivial. One approach is to use BLUS residuals, which transform the LS residuals to a smaller set of uncorrelated, homoskedastic residuals (assuming the correctness of the model).

I'm not sure if these are BLUS residuals, but the following function transforms to a smaller set of independent, homoscedastic residuals and then calls shapiro.test. (I've proposed making this a method of shapiro.test for lm objects, but it was not accepted.)

shapiro.test.lm <- function (obj) {
    eff <- effects(obj)
    rank <- obj$rank
    df.r <- obj$df.residual
    if (df.r < 3)
        stop("Too few residual degrees of freedom for the test.")
    data.name <- deparse(substitute(obj))
    x <- eff[-(1:rank)]
    res <- shapiro.test(x)
    res$data.name <- data.name
    res$method <- paste(res$method, "for residuals of linear model")
    res
}

Kjetil

A search of R resources didn't turn up anything for BLUS, but they shouldn't be hard to compute. This is a standard topic covered in many econometrics texts. You might consider the alternative of generating a bootstrapped confidence envelope for the QQ plot; the qq.plot() function in the car package will do this for a linear model.

I hope this helps,
John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Federico Gherardini
Sent: Friday, October 15, 2004 7:44 AM
To: [EMAIL PROTECTED]
Subject: [R] Testing for normality of residuals in a regression model

Hi all,

Is it possible to have a test value for assessing the normality of residuals from a linear regression model, instead of simply relying on qqplots? I've tried to use fitdistr to try and fit the residuals with a normal distribution, but fitdistr only returns the parameters of the distribution and the standard errors, not the p-value. Am I missing something?

Cheers,
Federico

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

-- Kjetil Halvorsen. Peace is the most effective weapon of mass construction. -- Mahdi Elmandjra

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
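To see the effects()-based idea numerically, here is a self-contained sketch with simulated data (the model and sample size are arbitrary choices for illustration):

```r
set.seed(42)
x <- rnorm(30)
y <- 1 + 2 * x + rnorm(30)
m <- lm(y ~ x)
# the first `rank` elements of effects() belong to the fitted terms; the
# remaining n - rank elements are uncorrelated, equal-variance projections
# of the errors onto an orthogonal basis for the residual subspace
eff <- effects(m)[-(1:m$rank)]
length(eff)        # n - rank = 30 - 2 = 28 transformed residuals
shapiro.test(eff)  # Shapiro-Wilk applied to the transformed residuals
```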
RE: [R] Testing for normality of residuals in a regression model
Dear Federico, The problem is the same with GLS residuals -- even if the GLS transformation produces homoskedastic errors, the residuals will be correlated and heteroskedastic (with this problem tending to disappear in most instances as n grows). The central point is that residuals don't behave quite the same as errors. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Federico Gherardini Sent: Friday, October 15, 2004 11:22 AM To: [EMAIL PROTECTED] Subject: Re: [R] Testing for normality of residuals in a regression model Thank you very much for your suggestions! The residuals come from a gls model, because I had to correct for heteroscedasticity using a weighted regression... can I simply apply one of these tests (like shapiro.test) to the standardized residuals from my gls model? Cheers, Federico __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Testing for normality of residuals in a regression model
Dear Andy,

At the risk of muddying the waters (and certainly without wanting to advocate the use of normality tests for residuals), I believe that your point #4 is subject to misinterpretation: that is, while it is true that t- and F-tests for regression coefficients in large samples retain their validity well when the errors are non-normal, the efficiency of the LS estimates can (depending upon the nature of the non-normality) be seriously compromised, not only absolutely but in relation to alternatives (e.g., robust regression).

Regards,
John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Liaw, Andy
Sent: Friday, October 15, 2004 11:55 AM
To: 'Federico Gherardini'; Berton Gunter
Cc: R-help mailing list
Subject: RE: [R] Testing for normality of residuals in a regression model

Let's see if I can get my stat 101 straight. We learned that linear regression has a set of assumptions:

1. Linearity of the relationship between X and y.
2. Independence of errors.
3. Homoscedasticity (equal error variance).
4. Normality of errors.

Now, we should ask: Why are they needed? Can we get away with less? What if some of them are not met?

It should be clear why we need #1. Without #2, I believe the least squares estimator is still unbiased, but the usual estimates of SEs for the coefficients are wrong, so the t-tests are wrong. Without #3, the coefficients are, again, still unbiased, but not as efficient as they can be. Interval estimates for the prediction will surely be wrong. Without #4, well, it depends: if the residual DF is sufficiently large, the t-tests are still valid because of the CLT. You do need normality if you have small residual DF.

The problem with normality tests, I believe, is that they usually have fairly low power at small sample sizes, so that doesn't quite help.
There's no free lunch: a normality test with good power will usually have good power against a fairly narrow class of alternatives, and almost no power against others (a directional test). How do you decide what to use? Has anyone seen a data set where the normality test on the residuals is crucial in coming up with the appropriate analysis?

Cheers,
Andy

From: Federico Gherardini

Berton Gunter wrote:

Exactly! My point is that normality tests are useless for this purpose for reasons that are beyond what I can take up here.

Thanks for your suggestions, I understand that! Could you possibly give me some (not too complicated!) links so that I can investigate this matter further?

Cheers,
Federico

Hints: Balanced designs are robust to non-normality; independence (especially clustering of subjects due to systematic effects), not normality, is usually the biggest real statistical problem; hypothesis tests will always reject when samples are large -- so what!; trust refers to prediction validity, which has to do with study design and the validity/representativeness of the current data for the future. I know that all the stats 101 texts say to test for normality, but they're full of baloney! Of course, this is free advice -- so caveat emptor!

Cheers,
Bert

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
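Andy's point about low power at small sample sizes is easy to check by simulation; a sketch (the t_5 alternative, the 5% level, and the replication count are arbitrary choices for illustration, not from the thread):

```r
set.seed(1)
# estimated power of the Shapiro-Wilk test against t_5 errors at the 5% level
sw.power <- function(n, reps = 1000)
    mean(replicate(reps, shapiro.test(rt(n, df = 5))$p.value < 0.05))
sw.power(10)   # at n = 10, power is barely above the nominal level
sw.power(100)  # at n = 100, the same alternative is detected far more often
```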
RE: [R] one more Rcmdr problem
Dear Adrian, I've just been able to verify this problem. It occurs for me when I try to copy via Ctrl-C, but not when I use the Rcmdr Edit - Copy menu, nor when I use the right-click context menu. As near as I can tell, the problem is general to all Ctrl-key combinations. I don't have time at the moment to investigate further, but I suspect a problem between tcltk and Rgui. I'm copying this response to the r-devel list, since that might be a better place to pursue the problem. Sorry for the trouble. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Adrian Dusa Sent: Wednesday, October 13, 2004 6:35 AM To: [EMAIL PROTECTED] Subject: [R] one more Rcmdr problem Hello, I'm using R 2.0.0 with the latest Rcmdr package installed from CRAN, on Windows XP Professional. When trying to copy some commands or results, either from the upper or lower text window, this causes Rcmdr to crash: R for Windows GUI front-end has encountered a problem and needs to close Did anyone have the same problem? I don't think it's my system, as it happened to reinstall my Windows just a few days ago, and the same problem occurred in the former one. Regards, Adrian Adrian Dusa Romanian Social Data Archive 1 Schitu Magureanu Bd. 050025 Bucharest sector 5 Tel. +40 21 3126618\ +40 21 3153122/ int.101 -- This message was scanned for spam and viruses by BitDefender For more information please visit http://linux.bitdefender.com/ __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] multinom object :way of plotting??
Dear Camille, You might be interested in a paper (available at http://socserv.socsci.mcmaster.ca/jfox/logit-effect-displays.pdf) that Bob Andersen and I wrote on this topic. The paper deals with graphing multinomial-logit and proportional-odds models. Regards, John On 27 Sep 2004 18:16:53 +0100 Camille Szmaragd [EMAIL PROTECTED] wrote: Dear all, I'm fitting a multinom function to my dataset (multinom(outcome~age+K+D)) and I need to present my results on a poster. Does someone know a nice way of doing that? I think I saw in an archive that you cannot plot a multinom.object, is it true? Thank you by advance for your help, Cheers Camille __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] polr (MASS) and lrm (Design) differences in tests of statistical significance
Dear Paul,

I tried polr() and lrm() on a different problem and (except for the difference in signs for the cut-points/intercepts) got identical results for both coefficients and standard errors. There might be something ill-conditioned about your problem that produces the discrepancy -- I noticed, for example, that some of the upper categories of the response are very sparse. Perhaps the two functions use different forms of the information matrix. I expect that someone else will be able to supply more details. I believe that the t-statistics in the polr() output are actually Wald statistics.

I hope this helps,
John

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Johnson
Sent: Thursday, September 30, 2004 4:41 PM
To: r-help
Subject: [R] polr (MASS) and lrm (Design) differences in tests of statistical significance

Greetings: I'm running R-1.9.1 on Fedora Core 2 Linux. I tested a proportional-odds logistic regression with MASS's polr and Design's lrm. Parameter estimates between the two are consistent, but the standard errors are quite different, and the conclusions from the t and Wald tests are dramatically different. I cranked the abstol argument up quite a bit in the polr fit and it did not make the differences go away. So:

1. Can you help me see why the std. errors in the polr output are so much smaller?
2. Can I hear more opinions on the question of t vs. Wald in making these significance tests? So far, I understand the t is based on the asymptotic normality of the estimate of b, and for finite samples b/se is not exactly distributed as a t. But I also had the impression that the Wald value was an approximation as well.

summary(polr(as.factor(RENUCYC) ~ DOCS + PCT65PLS*RANNEY2 + OLDCRASH + FISCAL2 + PCTMETRO + ADMLICEN, data=elaine1))

Re-fitting to get Hessian

Call:
polr(formula = as.factor(RENUCYC) ~ DOCS + PCT65PLS * RANNEY2 +
    OLDCRASH + FISCAL2 + PCTMETRO + ADMLICEN, data = elaine1)

Coefficients:
                         Value   Std. Error    t value
DOCS               0.004942217  0.002952001   1.674192
PCT65PLS           0.454638558  0.113504288   4.005475
RANNEY2            0.110473483  0.010829826  10.200855
OLDCRASH           0.139808663  0.042245692   3.309418
FISCAL2            0.025592117  0.011465812   2.232037
PCTMETRO           0.018184093  0.007792680   2.333484
ADMLICEN          -0.028490387  0.011470999  -2.483688
PCT65PLS:RANNEY2  -0.008559228  0.001456543  -5.876400

Intercepts:
        Value    Std. Error  t value
2|3      6.6177  0.3019      21.9216
3|4      7.1524  0.2773      25.7938
4|5     10.5856  0.2149      49.2691
5|6     12.2132  0.1858      65.7424
6|8     12.2704  0.1856      66.1063
8|10    13.0345  0.2184      59.6707
10|12   13.9801  0.3517      39.7519
12|18   14.6806  0.5587      26.2782

Residual Deviance: 587.0995
AIC: 619.0995

lrm(RENUCYC ~ DOCS + PCT65PLS*RANNEY2 + OLDCRASH + FISCAL2 + PCTMETRO + ADMLICEN, data=elaine1)

Logistic Regression Model

lrm(formula = RENUCYC ~ DOCS + PCT65PLS * RANNEY2 + OLDCRASH +
    FISCAL2 + PCTMETRO + ADMLICEN, data = elaine1)

Frequencies of Responses
 2  3   4  5 6  8 10 12 18
21 12 149 46 1 10  6  2  2

Frequencies of Missing Values Due to Each Variable
RENUCYC  DOCS  PCT65PLS  RANNEY2  OLDCRASH  FISCAL2  PCTMETRO  ADMLICEN
      5     0         0        6         0        5         0         5

Obs  Max Deriv  Model L.R.  d.f.  P  C      Dxy    Gamma  Tau-a  R2    Brier
249  7e-05      56.58       8     0  0.733  0.465  0.47   0.278  0.22  0.073

                         Coef       S.E.      Wald Z  P
y=3                  -6.617857   6.716688   -0.99   0.3245
y=4                  -7.152561   6.716571   -1.06   0.2869
y=5                 -10.585705   6.74       -1.57   0.1164
y=6                 -12.213340   6.755656   -1.81   0.0706
y=8                 -12.270506   6.755571   -1.82   0.0693
y=10                -13.034584   6.756829   -1.93   0.0537
y=12                -13.980235   6.767724   -2.07   0.0389
y=18                -14.680760   6.786639   -2.16   0.0305
DOCS                  0.004942   0.002932    1.69   0.0918
PCT65PLS              0.454653   0.552430    0.82   0.4105
RANNEY2               0.110475   0.076438    1.45   0.1484
OLDCRASH              0.139805   0.042104    3.32   0.0009
FISCAL2               0.025592   0.011374    2.25   0.0245
PCTMETRO              0.018184   0.007823    2.32   0.0201
ADMLICEN             -0.028490   0.011576   -2.46   0.0138
PCT65PLS * RANNEY2   -0.008559   0.006417   -1.33   0.1822

--
Paul E. Johnson                      email: [EMAIL PROTECTED]
Dept. of Political Science           http://lark.cc.ku.edu/~pauljohn
1541 Lilac Lane, Rm 504
University of Kansas                 Office: (785) 864-9086
Lawrence, Kansas 66044-3177          FAX: (785) 864-5700
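[Editorial note: John's two points -- that polr()'s "t value" column is really a Wald z statistic (estimate divided by standard error), and that polr() and lrm() differ only in the sign convention on the cut-points -- can be checked on a small reproducible example. A minimal sketch using the housing data that ships with MASS; Design is the older name of what is now the rms package, so the lrm() comparison is left as a comment in case it is not installed.]

```r
# Sketch: proportional-odds fit with polr(); the reported "t value"
# is a Wald z statistic, i.e. Value / Std. Error.
library(MASS)

fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
            Hess = TRUE)
ctab <- summary(fit)$coefficients   # columns: Value, Std. Error, t value

wald.z <- ctab[, "Value"] / ctab[, "Std. Error"]
p.vals <- 2 * pnorm(-abs(wald.z))   # two-sided normal approximation
print(cbind(ctab, "p value" = p.vals))

# For comparison (Design, nowadays rms):
# library(rms)
# lrm(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
# Note that lrm() reports the intercepts with the opposite sign from
# polr()'s cut-points, as John describes.
```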