Re: [R] SAS data
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Samuel Okoye Sent: Friday, March 14, 2008 8:21 AM To: [EMAIL PROTECTED] Subject: [R] SAS data Hello, I am trying to read the SAS file MyData.sa7bdat in R! This file is saved under D:\data! I therefore wrote path -D:/SasData sashome - C/Progra, Files/SAS Institute/9_1/SAS sascmd - file.path(sashome, sas.exe) MyData - read.ssd(path, MyData, sascmd=sascmd) The results what I get: SAS failed. SAS program at C:\DOCUME~1\Temp\RtmpcTlKtb\file4eb43288.sas The log file will be file4eb43288.log in the current directory NULL Warning message: SAS return code was 2 in: read.ssd(path, MyData, sascmd = sascmd) Thank you in advance! Sam Sam, If you actually tried to run the code as listed above, it is not surprising that it didn't work. You wrote that the data was in D:/Data, but specified the libname reference as D:/SasData. You specified the path to SAS as C/Progra, Files/SAS Institute/9_1/SAS. I can't believe that that is a valid path to SAS on any system. I you write back to the list with actual code, location of data and sas.exe, and operating system (presumably MS Windows, but what version?), someone should be able to help. Dan Daniel Nordlund Bothell, WA USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] r-metrics website
Hello, it seems www.rmetrics.org has been unavailable for some days. Would anyone know of a mirror? Thanks, Ruud __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Forward Selection with regsubsets
At 15:52 14/03/2008, [EMAIL PROTECTED] wrote: Hi, I would like to perform a forward selection procedure on a data set with 6 observations and 10 predictors. I tried to run it with regsubsets (I set nvmax=number of observations) but I keep getting these warning messages: Warning messages: 1: 5 linear dependencies found in: leaps.setup(x, y, wt = weights, nbest = nbest, nvmax = nvmax, 2: nvmax reduced to 5 in: leaps.setup(x, y, wt = weights, nbest = nbest, nvmax = nvmax, I think the problem was due to inadequate observations, is there any way to perform subset selection in R if the number of observations are less than the predictors? library(fortunes) fortune(Yoda) Evelyn Hall: I would like to know how (if) I can extract some of the information from the summary of my nlme. Simon Blomberg: This is R. There is no if. Only how. -- Evelyn Hall and Simon 'Yoda' Blomberg R-help (April 2005) You have a number of options. 1 - Doing a site search for relaimpo would be a good start. Read the author's related articles first before trying to implement the procedures. 2 - Understanding your data better would help. Are you not curious as to what these dependencies are which R tells you about? 3 - Choosing predictors at random would be fast and in the absence of any clue about why you are doing this could be a good solution. Thanks! Woonyuen Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] stats/debugging question hotelling t-sq
Hi I spent hours looking over my formula. Somehow I cant find the reason why it gives me different answer. help appreciated. x = as.matrix(read.table(http://www.niehs.nih.gov/research/atniehs/core/microarrays/docs/heinloth.txt,1)) x = t(x)#now rows are subjects, cols are genes x = x[order(rownames(x)),]#order by treatment group oxygen, ultra-violet, gamma radiation y = x[26:45,1:10] x = x[2:25,1:10] p = ncol(x); p nx = nrow(x); nx ny = nrow(y); ny n = nx+ny; n # (t(x)-colMeans(x)) %*% t(t(x)-colMeans(x)) T2 = nx*ny/n * t(colMeans(x)-colMeans(y)) %*% solve( ( (nx-1)*cov(x)+(ny-1)*cov(y) )/( n-2 ) ) %*% (colMeans(x)-colMeans(y)); T2 library(ICSNP) HotellingsT2(y,x) http://en.wikipedia.org/wiki/Hotelling's_T-square_distribution http://finzi.psych.upenn.edu/R/Rhelp02a/archive/60962.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Anova
On Sat, 2008-03-15 at 12:48 -0500, daniel jupiter wrote: Hi all, I apologize for what might be a silly question. I am interested in doing a one way anova. This is not too hard in and of itself, either with anova, aov or oneway.test . However, I need to 1) get pvalues, if obj is the result of anova, aov, oneway.test, then str(obj) ## for anova str(summary(obj)) ## for aov str(obj) ## for oneway.test to find the names of the elements of obj that contain the p-values of the various tests/models. In the first two you are looking for the component Pr(F) and the latter is obvious (p.value) For summary(aov) objects, the result is a list so this gets the p-value you need: obj[[1]]$`Pr(F)` or obj[[1]][,5]] for anova then this: obj$`Pr(F)` or obj[,5] note the quoting of the component name using backticks. For oneway.test obj$p.value 2) do a posthoc analysis with Tukey HSD, ?TukeyHSD for the results of aov 3) and have (sometimes) an unbalanced design. See ?lme in package nlme I just can't seem to put all the pieces together. Any suggestions? I'm not sure what the problem is here - you don't say. All of what I say above is documented in the relevant help pages for the various functions and using str() is a basic tenet of using R and looking at returned objects. Ok, you might have needed help with getting the p-values for some of those tests/models, but 2) and 3) are answered on ?aov For what you describe, stick with aov for balanced designs if you want to do TukeyHSD as there is a method for aov objects (otherwise) you'll need to refit the model. For unbalanced designs, check out lme and for that you may need to get/borrow the book by Pinhiero and Bates, reference details of which are given in item [7] on: http://www.r-project.org/doc/bib/R-books.html Thanks in advance, Dan. HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] stats/debugging question hotelling t-sq
Hi! I think the results agree: using simulated data: set.seed(1) library(mvtnorm) x-rmvnorm(44,rep(0,10)) y = x[(26:45)-1,1:10] x = x[(2:25)-1,1:10] p = ncol(x); p nx = nrow(x); nx ny = nrow(y); ny n = nx+ny; n # (t(x)-colMeans(x)) %*% t(t(x)-colMeans(x)) T2 = nx*ny/n * t(colMeans(x)-colMeans(y)) %*% solve( ( (nx-1)*cov(x)+(ny-1)*cov(y) )/( n-2 ) ) %*% (colMeans(x)-colMeans(y)); T2 library(ICSNP) HotellingsT2(y,x) note that the default here returns the test statistic value that is F distributed. So using HotellingsT2(y,x,test=chi) gives the same. Or if you transform your T2 to be F distributed (n-p-1) / ((n-2)*p) * T2 Best wishes, Klaus Original-Nachricht Datum: Sun, 16 Mar 2008 04:09:00 -0700 Von: [EMAIL PROTECTED] An: r-help@r-project.org Betreff: [R] stats/debugging question hotelling t-sq Hi I spent hours looking over my formula. Somehow I cant find the reason why it gives me different answer. help appreciated. x = as.matrix(read.table(http://www.niehs.nih.gov/research/atniehs/core/microarrays/docs/heinloth.txt,1)) x = t(x)#now rows are subjects, cols are genes x = x[order(rownames(x)),]#order by treatment group oxygen, ultra-violet, gamma radiation y = x[26:45,1:10] x = x[2:25,1:10] p = ncol(x); p nx = nrow(x); nx ny = nrow(y); ny n = nx+ny; n # (t(x)-colMeans(x)) %*% t(t(x)-colMeans(x)) T2 = nx*ny/n * t(colMeans(x)-colMeans(y)) %*% solve( ( (nx-1)*cov(x)+(ny-1)*cov(y) )/( n-2 ) ) %*% (colMeans(x)-colMeans(y)); T2 library(ICSNP) HotellingsT2(y,x) http://en.wikipedia.org/wiki/Hotelling's_T-square_distribution http://finzi.psych.upenn.edu/R/Rhelp02a/archive/60962.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Psst! Geheimtipp: Online Games kostenlos spielen bei den GMX Free Games! http://games.entertainment.gmx.net/de/entertainment/games/free __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] pretty formatting of lists
Hello, is there already a function in any R package which does source code formatting of deparsed lists? Let's create the following list: x - list(a = round(rnorm(3), 2), b = round(rnorm(3), 2)) xx -c(aa = round(rnorm(30)), f = function(a) a + b, list(x, x)) Now, I want deparse it in a way that yields something like: list( aa = c(0.25, 0.18, 0.84, -1.25, 0.09, -0.99, 1.64, 1.42, -1.29, -0.14, -1.07, -1.05, 0.98, 0.33, 1.76, -1.66, 0.96, -0.21, -1.29, 0.78, -0.4, -1.63, -0.78, -1.05, 1.27, -1.44, 0.12, -0.4, 0.02, 1.03), f = function (a) a + b, list( a = c(2.22, 0.36, 0.74), b = c(0.46, 0.41, 1.46) ), list( a = c(2.22, 0.36, 0.74), b = c(0.46, 0.41, 1.46) ) ) Thanks a lot! Thomas P. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] pretty formatting of lists
Is this basically what you want? xx -list(aa = round(rnorm(30),2), f = function(a) a + b, a=x, b=x) xx $aa [1] 1.34 -0.21 -0.18 -0.10 0.71 -0.07 -0.04 -0.68 -0.32 0.06 -0.59 0.53 -1.52 0.31 -1.54 -0.30 [17] -0.53 -0.65 -0.06 -1.91 1.18 -1.66 -0.46 -1.12 -0.75 2.09 0.02 -1.29 -1.64 0.45 $f function(a) a + b $a $a$a [1] -0.06 -0.16 -1.47 $a$b [1] -0.48 0.42 1.36 $b $b$a [1] -0.06 -0.16 -1.47 $b$b [1] -0.48 0.42 1.36 On Sun, Mar 16, 2008 at 8:47 AM, Thomas Petzoldt [EMAIL PROTECTED] wrote: Hello, is there already a function in any R package which does source code formatting of deparsed lists? Let's create the following list: x - list(a = round(rnorm(3), 2), b = round(rnorm(3), 2)) xx -c(aa = round(rnorm(30)), f = function(a) a + b, list(x, x)) Now, I want deparse it in a way that yields something like: list( aa = c(0.25, 0.18, 0.84, -1.25, 0.09, -0.99, 1.64, 1.42, -1.29, -0.14, -1.07, -1.05, 0.98, 0.33, 1.76, -1.66, 0.96, -0.21, -1.29, 0.78, -0.4, -1.63, -0.78, -1.05, 1.27, -1.44, 0.12, -0.4, 0.02, 1.03), f = function (a) a + b, list( a = c(2.22, 0.36, 0.74), b = c(0.46, 0.41, 1.46) ), list( a = c(2.22, 0.36, 0.74), b = c(0.46, 0.41, 1.46) ) ) Thanks a lot! Thomas P. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] pretty formatting of lists
jim holtman wrote: Is this basically what you want? xx -list(aa = round(rnorm(30),2), f = function(a) a + b, a=x, b=x) xx $aa [1] 1.34 -0.21 -0.18 -0.10 0.71 -0.07 -0.04 -0.68 -0.32 0.06 -0.59 0.53 -1.52 0.31 -1.54 -0.30 [17] -0.53 -0.65 -0.06 -1.91 1.18 -1.66 -0.46 -1.12 -0.75 2.09 0.02 -1.29 -1.64 0.45 $f function(a) a + b $a $a$a [1] -0.06 -0.16 -1.47 $a$b [1] -0.48 0.42 1.36 $b $b$a [1] -0.06 -0.16 -1.47 $b$b [1] -0.48 0.42 1.36 No, I want a neat source code representation, not simply the printed content -- technically something like the following, but more reader friendly: cat(deparse(xx), fill=60) structure(list(aa = c(0.25, 0.18, 0.84, -1.25, 0.09, -0.99, 1.64, 1.42, -1.29, -0.14, -1.07, -1.05, 0.98, 0.33, 1.76, -1.66, 0.96, -0.21, -1.29, 0.78, -0.4, -1.63, -0.78, -1.05, 1.27, -1.44, 0.12, -0.4, 0.02, 1.03), f = function (a) a + b, structure(list(a = c(2.22, 0.36, 0.74), b = c(0.46, 0.41, 1.46)), .Names = c(a, b)), structure(list(a = c(2.22, 0.36, 0.74), b = c(0.46, 0.41, 1.46)), .Names = c(a, b))), .Names = c(aa, f, , )) I wonder whether a function for some kind of pretty deparsing already exists before I start to write my own solution. Thomas P. On Sun, Mar 16, 2008 at 8:47 AM, Thomas Petzoldt [EMAIL PROTECTED] wrote: Hello, is there already a function in any R package which does source code formatting of deparsed lists? Let's create the following list: x - list(a = round(rnorm(3), 2), b = round(rnorm(3), 2)) xx -c(aa = round(rnorm(30)), f = function(a) a + b, list(x, x)) Now, I want deparse it in a way that yields something like: list( aa = c(0.25, 0.18, 0.84, -1.25, 0.09, -0.99, 1.64, 1.42, -1.29, -0.14, -1.07, -1.05, 0.98, 0.33, 1.76, -1.66, 0.96, -0.21, -1.29, 0.78, -0.4, -1.63, -0.78, -1.05, 1.27, -1.44, 0.12, -0.4, 0.02, 1.03), f = function (a) a + b, list( a = c(2.22, 0.36, 0.74), b = c(0.46, 0.41, 1.46) ), list( a = c(2.22, 0.36, 0.74), b = c(0.46, 0.41, 1.46) ) ) Thanks a lot! Thomas P. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Overloading %*%
Hi, Is it possible to supply a new method for the %*% operator? I need to provide a new method for working on variables of a newly defined class, ad. I've had no problems overloading +, * etc.., using code such as: +.ad - function(a,b = NULL) { # further code here } I've tried to do the same thing with %*%: %*%.ad - function(a,b) { # further code here } However this doesn't work; the new method is never called and the standard %*% operator is used instead. I've had a look at the documentation and it appears to be because the %*% operator is not part of the Math, Ops, Summary or Complex groups. I was wondering if anybody knew of a work-around for this? (I realise that I can just do %*%.ad(a,b) when I want to use the new method, but it would be much better for me if I could find something more transparent.) Thanks, Joe Cainey [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] rgl build warnings and loading error on Linux
Dear useRs, I have several problems in using rgl-0.77 (and recent earlier versions) on Gentoo Linux with a custom-built v. 2.6.22 kernel. Currently I use R-2.6.1. When I build rgl, # R CMD INSTALL /home/liviu/inst/dwn/R/rgl_0.77.tar.gz or install.packages(rgl, dependencies=TRUE, method =wget), I notice the following warning messages: i686-pc-linux-gnu-g++ -I/usr/lib/R/include -I/usr/lib/R/include -I/usr/local/include-fpic -O2 -march=pentium-m -pipe -fomit-frame-pointer -std=gnu99 -c api.cpp -o api.o cc1plus: warning: command line option -std=gnu99 is valid for C/ObjC but not for C++ The warning message itself is repeated during the entire build process. However, the package builds fine, but fails to load: library(rgl) Error in dyn.load(file, ...) : unable to load shared library '/usr/lib/R/library/rgl/libs/rgl.so': /usr/lib/R/library/rgl/libs/rgl.so: undefined symbol: glTexCoordPointer Error : .onLoad failed in 'loadNamespace' for 'rgl' Error: package/namespace load failed for 'rgl' I had the exact same error message with rgl_0.76. Could anyone suggest how to make rgl build correctly? Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Overloading %*%
Joe Cainey jcainey at gmail.com writes: Is it possible to supply a new method for the %*% operator? clipped I've tried to do the same thing with %*%: %*%.ad - function(a,b) { # further code here } However this doesn't work; the new method is never called and the standard %*% operator is used instead. I've had a look at the documentation and it appears to be because the %*% operator is not part of the Math, Ops, Summary or Complex groups. I was wondering if anybody knew of a work-around for this? According to the help page for %*%, it is S4 generic but not S3, so you might make further progress using S4 methods. Thanks, Joe Cainey best, Ken __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rgl build warnings and loading error on Linux
On 16/03/2008 3:38 PM, Liviu Andronic wrote: Dear useRs, I have several problems in using rgl-0.77 (and recent earlier versions) on Gentoo Linux with a custom-built v. 2.6.22 kernel. Currently I use R-2.6.1. When I build rgl, # R CMD INSTALL /home/liviu/inst/dwn/R/rgl_0.77.tar.gz or install.packages(rgl, dependencies=TRUE, method =wget), I notice the following warning messages: i686-pc-linux-gnu-g++ -I/usr/lib/R/include -I/usr/lib/R/include -I/usr/local/include-fpic -O2 -march=pentium-m -pipe -fomit-frame-pointer -std=gnu99 -c api.cpp -o api.o cc1plus: warning: command line option -std=gnu99 is valid for C/ObjC but not for C++ The warning message itself is repeated during the entire build process. However, the package builds fine, but fails to load: library(rgl) Error in dyn.load(file, ...) : unable to load shared library '/usr/lib/R/library/rgl/libs/rgl.so': /usr/lib/R/library/rgl/libs/rgl.so: undefined symbol: glTexCoordPointer Error : .onLoad failed in 'loadNamespace' for 'rgl' Error: package/namespace load failed for 'rgl' I had the exact same error message with rgl_0.76. Could anyone suggest how to make rgl build correctly? It sounds as though it is not finding the OpenGL libs when it configures. That function should be in libGL.so. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] (no subject)
Hi, I am trying to use the Fisher scoring method with a geometric distribution, with p = .07, 100 observations from the geom distrib, and 10 iterations. I cannot quite get the code to work. Can anyone see the mistake? n - 100 p - 0.07 x - rgeom(n, p) s - sum(x) f - function(x, p) p*(1-p)^x L - function(p) p^n*(1-p)^s logL - function(p) n*log(p)+s*(log(1-p)) logLprime - function(p) (n/p)-(s/(1-p)) I - n/p^2*(1-p) iter - 10 p[1] - .06 p[2] - .11 for (i in 1:10) { pnew - p[i]+logLprime(p[i])/I*(p[i]) } [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rgl build warnings and loading error on Linux
On 16 March 2008 at 17:00, Duncan Murdoch wrote: | On 16/03/2008 3:38 PM, Liviu Andronic wrote: | Dear useRs, | | I have several problems in using rgl-0.77 (and recent earlier | versions) on Gentoo Linux with a custom-built v. 2.6.22 kernel. | Currently I use R-2.6.1. | | When I build rgl, | # R CMD INSTALL /home/liviu/inst/dwn/R/rgl_0.77.tar.gz | or | install.packages(rgl, dependencies=TRUE, method =wget), | | I notice the following warning messages: | i686-pc-linux-gnu-g++ -I/usr/lib/R/include -I/usr/lib/R/include | -I/usr/local/include-fpic -O2 -march=pentium-m -pipe | -fomit-frame-pointer -std=gnu99 -c api.cpp -o api.o | cc1plus: warning: command line option -std=gnu99 is valid for C/ObjC | but not for C++ | | The warning message itself is repeated during the entire build | process. However, the package builds fine, but fails to load: | library(rgl) | Error in dyn.load(file, ...) : |unable to load shared library '/usr/lib/R/library/rgl/libs/rgl.so': |/usr/lib/R/library/rgl/libs/rgl.so: undefined symbol: glTexCoordPointer | Error : .onLoad failed in 'loadNamespace' for 'rgl' | Error: package/namespace load failed for 'rgl' | | I had the exact same error message with rgl_0.76. | | Could anyone suggest how to make rgl build correctly? | | It sounds as though it is not finding the OpenGL libs when it | configures. That function should be in libGL.so. For what it is worth, the Debian r-cran-rgl packages uses these Build-Depends which may or may not map into similar Gentoo packages: Build-Depends: debhelper (= 5.0.0), r-base-dev (= 2.6.2), cdbs, \ libgl1-mesa-dev | libgl-dev, libglu1-mesa-dev | libglu-dev, \ libpng12-dev, libx11-dev, libxt-dev, x-dev So try looking for libgl and libglu, possibly in their mesa implementations. Hth, Dirk -- Three out of two people have difficulties with fractions. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Overloading %*%
Thanks, Ken. 1. How can I find S4 methods for a given function given class(es) of objects? The 'showMethods' function lists available generics for a given function; showMethods('%*%') just produced for me a list of 52 different signatures for %*%. However, I don't know how to find the functions with methods for a particular class. The 'methods' function will produce either S3 methods for a given function or S3 functions for a given class. It would help me if the 'methods' help page included See Also and Examples for S4 classes also, but it doesn't. 2. How can I find source code for S4 methods? I tried dumpMethods('%*%', 'mmult.R') and got an apparently empty file of 0 KB. Then I tried 'dumpMethod(%*%, c(x=TsparseMatrix, y=ANY))' and got a file with the following: setMethod(%*%, structure(c(TsparseMatrix, ANY), .Names = c(x, y)), NULL ) Thanks again for your reply regarding %*%. Spencer Graves knoblauch wrote: Joe Cainey jcainey at gmail.com writes: Is it possible to supply a new method for the %*% operator? clipped I've tried to do the same thing with %*%: %*%.ad - function(a,b) { # further code here } However this doesn't work; the new method is never called and the standard %*% operator is used instead. I've had a look at the documentation and it appears to be because the %*% operator is not part of the Math, Ops, Summary or Complex groups. I was wondering if anybody knew of a work-around for this? According to the help page for %*%, it is S4 generic but not S3, so you might make further progress using S4 methods. Thanks, Joe Cainey best, Ken __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to assign text string as object?
I have a problem I cannot get over for a long time. Imagine I have a data frame with 17 colums. 16 of them are craniometric variables of Cervus elaphus and one contains age. The data frame has 83 rows. I want to write a loop which plots the values of each craniometric variable against the age. The names of columns are V1, V2, V3, etc... What I have done till now was writing this: layout(matrix(1:16,4,4)) plot(V1~AGE) plot(V2~AGE) . . . . etc. How to assign a string in the loop so that the program understands it is an object? Thank for your help in advance. Michael __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to assign text string as object?
Take a look at 'matplot'. If you want to loop, try for (i in 1:16) plot(df[[paste(V, i, sep=)]] ~ df$AGE) On 3/16/08, Ing. Michal Kneifl, Ph.D. [EMAIL PROTECTED] wrote: I have a problem I cannot get over for a long time. Imagine I have a data frame with 17 colums. 16 of them are craniometric variables of Cervus elaphus and one contains age. The data frame has 83 rows. I want to write a loop which plots the values of each craniometric variable against the age. The names of columns are V1, V2, V3, etc... What I have done till now was writing this: layout(matrix(1:16,4,4)) plot(V1~AGE) plot(V2~AGE) . . . . etc. How to assign a string in the loop so that the program understands it is an object? Thank for your help in advance. Michael __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to assign text string as object?
You can try this also: sapply(DF[-match(AGE, names(DF))], plot, x=DF$AGE) On 16/03/2008, Ing. Michal Kneifl, Ph.D. [EMAIL PROTECTED] wrote: I have a problem I cannot get over for a long time. Imagine I have a data frame with 17 colums. 16 of them are craniometric variables of Cervus elaphus and one contains age. The data frame has 83 rows. I want to write a loop which plots the values of each craniometric variable against the age. The names of columns are V1, V2, V3, etc... What I have done till now was writing this: layout(matrix(1:16,4,4)) plot(V1~AGE) plot(V2~AGE) . . . . etc. How to assign a string in the loop so that the program understands it is an object? Thank for your help in advance. Michael __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] setAs vs setIs
Hi the list I am fighting with the twins setAs and setIs... Here are some questions and comments (comments to myself but that migth be wrong, it is why I am posting them) 1. Very surprising : using setIs define 'is', 'as-' but not 'as' ??? 2. Using setAs define 'as', 'as-' but not 'is'... What surprise me is that as- can be define by both. I would have thing that setis is for 'is', setAs is for 'as' and 'as-'... Since it is not the case, is there a possibility to set with only one function 'as', 'is' and 'as-' Last point, I get a warning using setAs. I did not manage to find the name of the variable it want me to use... ### Data setClass(B,representation(b=numeric)) setClass(C,representation(c=numeric)) setClass(D,representation(d=numeric)) b - new(B,b=3) c - new(C,c=4) d - new(D,d=5) ### using setIs setIs(C,B, test=function(object){return([EMAIL PROTECTED]0)}, replace=function(from,values){ [EMAIL PROTECTED] - [EMAIL PROTECTED] return(from) } ) is(c,B) #Ok as(c,B) #not ok as(c,B) - b#ok (!) ### using setAs setAs(D,B, function(from,to){to-new(B,[EMAIL PROTECTED]);return(to)}, replace=function(from,values){ [EMAIL PROTECTED][EMAIL PROTECTED]; return(from) } ) is(d,B) # not ok as(d,B) # ok as(d,B)-b # ok Thanks Christophe __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] setAs vs setIs
On Mar 16, 2008, at 8:12 PM, Christophe Genolini wrote: Hi the list I am fighting with the twins setAs and setIs... Here are some questions and comments (comments to myself but that migth be wrong, it is why I am posting them) 1. Very surprising : using setIs define 'is', 'as-' but not 'as' ??? 2. Using setAs define 'as', 'as-' but not 'is'... What surprise me is that as- can be define by both. I would have thing that setis is for 'is', setAs is for 'as' and 'as-'... Since it is not the case, is there a possibility to set with only one function 'as', 'is' and 'as-' Last point, I get a warning using setAs. I did not manage to find the name of the variable it want me to use... ### Data setClass(B,representation(b=numeric)) setClass(C,representation(c=numeric)) setClass(D,representation(d=numeric)) b - new(B,b=3) c - new(C,c=4) d - new(D,d=5) ### using setIs setIs(C,B, test=function(object){return([EMAIL PROTECTED]0)}, replace=function(from,values){ [EMAIL PROTECTED] - [EMAIL PROTECTED] return(from) } ) is(c,B) #Ok as(c,B) #not ok It seems to me your problem here is simply that you did not define a coerce cal in setIs, so it does not know how to turn a C object into a B object, which is what you ask it to do here. It knows how to test if C object is also a B object, because of the test function you provided, and it can do the replacement you ask it in as(c,B) -b because of the replace command you provided, but the third part is missing. Perhaps something like this: setIs(C,B, test=function(object){return([EMAIL PROTECTED]0)}, replace=function(from,values){ [EMAIL PROTECTED] - [EMAIL PROTECTED] return(from) }, coerce=function(from) { new(B,[EMAIL PROTECTED](1/3)) } ) as(c,B) - b#ok (!) ### using setAs setAs(D,B, function(from,to){to-new(B,[EMAIL PROTECTED]);return(to)}, replace=function(from,values){ [EMAIL PROTECTED][EMAIL PROTECTED]; return(from) } ) is(d,B) # not ok as(d,B) # ok as(d,B)-b # ok Thanks Christophe Haris Skiadas Department of Mathematics and Computer Science Hanover College __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Std errors in glm models w/ and w/o intercept
I am doing a reanalysis of results that have previously been published. My hope was to demonstrate the value of adoption of more modern regression methods in preference to the traditional approach of univariate stratification. I have encountered a puzzle regarding differences between I thought would be two equivalent analyses. Using a single factor, I compare poisson models with and without the intercept term. As expected, the estimated coefficient and std error of the estimate are the same for the intercept and the base level of the factor in the two models. The sum of the intercept with each coefficient is equal to the individual factor coefficients in the no- intercept model. The overall model fit statistics are the same. However, the std errors for the other factors are much smaller in the model without the intercept. The offset = log(expected) is based on person-years of follow-up multiplied by the annual mortality experience of persons with known age, gender, and smoking status in a much larger cohort. My logic in removing the intercept was that the offset should be considered the baseline, and that I should estimate each level compared with that baseline. 18.5-24.9 was used as the reference level in the model with intercept. Removing the intercept appears to be a successful strategy. but have I committed any statistical sin? with(bmi, table(BMI,Actual_Deaths)) Actual_Deaths BMI 0 1 2 3 4 5 6 7 11 13 18.5-24.9 311 21 1 0 0 0 0 0 0 0 15.0-18.4 353 33 8 2 0 1 0 0 0 0 25.0-29.9 367 19 0 0 0 0 0 0 0 0 30.0-34.9 349 95 39 17 8 9 3 4 0 1 35.0-39.9 351 90 50 21 20 3 3 2 1 0 40.0-55.0 386 60 15 7 4 0 0 1 0 0 bmi.base - with(bmi,glm(Actual_Deaths ~ BMI + offset(log( MMI_VBT_Expected)), family=poisson)) summary(bmi.base) Call: glm(formula = Actual_Deaths ~ BMI + offset(log(MMI_VBT_Expected)), family = poisson) Deviance Residuals: Min 1Q Median 3Q Max -2.6385 -0.5245 -0.2722 -0.1041 3.4426 Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept) 0.429200.20851 2.058 0.0395 * BMI15.0-18.4 0.316080.24524 1.289 0.1974 BMI25.0-29.9 -0.227950.30999 -0.735 0.4621 BMI30.0-34.9 -0.096690.21506 -0.450 0.6530 BMI35.0-39.9 -0.042900.21455 -0.200 0.8415 BMI40.0-55.0 0.193480.22569 0.857 0.3913 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for poisson family taken to be 1) Null deviance: 1485.0 on 2654 degrees of freedom Residual deviance: 1470.0 on 2649 degrees of freedom AIC: 2760.9 Number of Fisher Scoring iterations: 6 - bmi.no.int - with(bmi,glm(Actual_Deaths ~ BMI + offset(log(MMI_VBT_Expected)) -1 , family=poisson)) summary(bmi.no.int) Call: glm(formula = Actual_Deaths ~ BMI + offset(log(MMI_VBT_Expected)) - 1, family = poisson) Deviance Residuals: Min 1Q Median 3Q Max -2.6385 -0.5245 -0.2722 -0.1041 3.4426 Coefficients: Estimate Std. Error z value Pr(|z|) BMI18.5-24.9 0.429200.20851 2.058 0.0395 * BMI15.0-18.4 0.745290.12910 5.773 7.79e-09 *** BMI25.0-29.9 0.201250.22939 0.877 0.3803 BMI30.0-34.9 0.332510.05270 6.309 2.81e-10 *** BMI35.0-39.9 0.386310.05057 7.639 2.19e-14 *** BMI40.0-55.0 0.622680.08639 7.208 5.67e-13 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for poisson family taken to be 1) Null deviance: 1630.7 on 2655 degrees of freedom Residual deviance: 1470.0 on 2649 degrees of freedom AIC: 2760.9 It does look statistically sensible that an estimate for BMI=40.0- 55.0 with over 100 events should have a much narrower CI than BMI=18.5-24.9 which only has 23 events. Is the model with an intercept term somehow spreading around uncertainty that really belongs to the reference category with its relatively low number of events? -- David Winsemius __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] stepAIC and polynomial terms
Dear all, I have a question regarding the use of stepAIC and polynomial (quadratic to be specific) terms in a binary logistic regression model. I read in McCullagh and Nelder, (1989, p 89) and as far as I remember from my statistics cources, higher-degree polynomial effects should not be included without the main effects. If I understand this correctly, following a stepwise model selection based on AIC should not lead to a model where the main effect of some continuous covariate is removed, but the quadratic term is kept. The question is, should I keep the quadratic term (note, there are other main effects that were retained following the stepwise algorithm) in the final model or should I delete it as well and move on? Or should I retain the main effect as well? To picture it, the initial model to which I called stepAIC is: Call: glm(formula = S ~ FR + Date * age + I(age^2), family = logexposure(ExposureDays = DATA$int), data = DATA) and the final one: Call: glm(formula = S ~ FR + Date + I(age^2), family = logexposure(ExposureDays = DATA$int), data = DATA) Thanks very much in advance for your thoughts and suggestions, Caspar Caspar Hallmann MSc Student WUR The Netherlands [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Overloading %*%
Spencer Graves wrote: Thanks, Ken. 1. How can I find S4 methods for a given function given class(es) of objects? The 'showMethods' function lists available generics for a given function; showMethods('%*%') just produced for me a list of 52 different signatures for %*%. However, I don't know how to find the functions with methods for a particular class. The 'methods' function Use the 'classes' argument to showMethods, either with or without a function as first argument. will produce either S3 methods for a given function or S3 functions for a given class. It would help me if the 'methods' help page included See Also and Examples for S4 classes also, but it doesn't. 2. How can I find source code for S4 methods? I tried use includeDef=TRUE with showMethods, or getMethod to get a specific method, or, for some fun, selectMethod to find the method to which object dispatch occurs. I'm not sure how to dump the method to a file. Martin dumpMethods('%*%', 'mmult.R') and got an apparently empty file of 0 KB. Then I tried 'dumpMethod(%*%, c(x=TsparseMatrix, y=ANY))' and got a file with the following: setMethod(%*%, structure(c(TsparseMatrix, ANY), .Names = c(x, y)), NULL ) Thanks again for your reply regarding %*%. Spencer Graves knoblauch wrote: Joe Cainey jcainey at gmail.com writes: Is it possible to supply a new method for the %*% operator? clipped I've tried to do the same thing with %*%: %*%.ad - function(a,b) { # further code here } However this doesn't work; the new method is never called and the standard %*% operator is used instead. I've had a look at the documentation and it appears to be because the %*% operator is not part of the Math, Ops, Summary or Complex groups. I was wondering if anybody knew of a work-around for this? According to the help page for %*%, it is S4 generic but not S3, so you might make further progress using S4 methods. Thanks, Joe Cainey best, Ken __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Overloading %*%
Dear Martin: This is wonderful. Thank you very much. It would be a great help if your suggestions could be added to See Also and Examples for methods. Thanks again, Spencer Graves Martin Morgan wrote: Spencer Graves wrote: Thanks, Ken. 1. How can I find S4 methods for a given function given class(es) of objects? The 'showMethods' function lists available generics for a given function; showMethods('%*%') just produced for me a list of 52 different signatures for %*%. However, I don't know how to find the functions with methods for a particular class. The 'methods' function Use the 'classes' argument to showMethods, either with or without a function as first argument. will produce either S3 methods for a given function or S3 functions for a given class. It would help me if the 'methods' help page included See Also and Examples for S4 classes also, but it doesn't. 2. How can I find source code for S4 methods? I tried use includeDef=TRUE with showMethods, or getMethod to get a specific method, or, for some fun, selectMethod to find the method to which object dispatch occurs. I'm not sure how to dump the method to a file. Martin dumpMethods('%*%', 'mmult.R') and got an apparently empty file of 0 KB. Then I tried 'dumpMethod(%*%, c(x=TsparseMatrix, y=ANY))' and got a file with the following: setMethod(%*%, structure(c(TsparseMatrix, ANY), .Names = c(x, y)), NULL ) Thanks again for your reply regarding %*%. Spencer Graves knoblauch wrote: Joe Cainey jcainey at gmail.com writes: Is it possible to supply a new method for the %*% operator? clipped I've tried to do the same thing with %*%: %*%.ad - function(a,b) { # further code here } However this doesn't work; the new method is never called and the standard %*% operator is used instead. I've had a look at the documentation and it appears to be because the %*% operator is not part of the Math, Ops, Summary or Complex groups. I was wondering if anybody knew of a work-around for this? According to the help page for %*%, it is S4 generic but not S3, so you might make further progress using S4 methods. Thanks, Joe Cainey best, Ken __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] stepAIC and polynomial terms
At 08:50 PM 3/16/2008, caspar wrote: Dear all, I have a question regarding the use of stepAIC and polynomial (quadratic to be specific) terms in a binary logistic regression model. I read in McCullagh and Nelder, (1989, p 89) and as far as I remember from my statistics cources, higher-degree polynomial effects should not be included without the main effects. If I understand this correctly, following a stepwise model selection based on AIC should not lead to a model where the main effect of some continuous covariate is removed, but the quadratic term is kept. The question is, should I keep the quadratic term (note, there are other main effects that were retained following the stepwise algorithm) in the final model or should I delete it as well and move on? Or should I retain the main effect as well? To picture it, the initial model to which I called stepAIC is: Call: glm(formula = S ~ FR + Date * age + I(age^2), family = logexposure(ExposureDays = DATA$int), data = DATA) and the final one: Call: glm(formula = S ~ FR + Date + I(age^2), family = logexposure(ExposureDays = DATA$int), data = DATA) Thanks very much in advance for your thoughts and suggestions, Caspar 1. You should only exclude age as a linear term if you have sound causal reason for believing a pure quadratic component is solely present. Based on your example, you probably don't have this. 2. You also need to work about interactions. 3. An alternative to your polynomial approach to such a causal variable as age is to categorize age into 5 or 10 year intervals, and see how the fit breaks down by these levels. 4. You should plot your data vs. age to see what the dependence is. Frequently curve is flat up to a certain age, and then linear thereafter. This gives rise to a pseudo-quadratic relationship. You should be able to fit it better with the split plus a linear term. 5. Think about how age should affect your response before trying models. Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail: [EMAIL PROTECTED] Least Cost Formulations, Ltd.URL: http://lcfltd.com/ 824 Timberlake Drive Tel: 757-467-0954 Virginia Beach, VA 23464-3239Fax: 757-467-2947 Vere scire est per causas scire __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to loop through all the columns in dataframe
--- jim holtman [EMAIL PROTECTED] wrote: Glad I could help. You might want to post it back to R-Help so that others can see what was done. On Sun, Mar 16, 2008 at 6:02 PM, Felipe Carrillo [EMAIL PROTECTED] wrote: Jim: I owe you man, this is great,I never thought that I could acomplish this task. Now I can estimate confidence intervals for mydf and I will be done,,Thanks again Jim.. This should do what you want: (you had td and pd reversed in your example) xd - c(2.2024,2.4216,1.4672,1.4817,1.4957,1.4431,1.5676) td - c(0.017046,0.018504,0.012157,0.012253,0.012348,0.011997,0.012825) pd - c(160524,163565,143973,111956,89677,95269,81558) mydf-data.frame(xd,pd,td) trans-t(mydf) trans [,1][,2][,3] [,4] [,5] [,6] [,7] xd 2.20240e+00 2.42160e+00 1.46720e+00 1.48170e+00 1.4957e+00 1.4431e+00 1.5676e+00 pd 1.60524e+05 1.63565e+05 1.43973e+05 1.11956e+05 8.9677e+04 9.5269e+04 8.1558e+04 td 1.70460e-02 1.85040e-02 1.21570e-02 1.22530e-02 1.2348e-02 1.1997e-02 1.2825e-02 varA- 0.036084 covAB- (-0.013046) varB- 0.0052628 # create the sequences to test against i.seq - lapply(seq(ncol(trans) - 1), function(x) x:(ncol(trans) - 1)) x - lapply(i.seq, function(.col){ + # compute the 3 columns of data + cbind(xp=varA + trans[1, .col[1]] * covAB + trans[1, .col + 1] * covAB + trans[1, .col[1]] * trans[1, .col + 1] * varB, + pd=trans[2, .col[1]] * trans[2, .col + 1], + td=trans[3, .col[1]] * trans[3, .col + 1]) + }) # rbind for the output z - do.call(rbind, x) # add the covariance z - cbind(z, cov=z[, 'xp'] * z[, 'pd'] / z[, 'td']) z Felipe D. Carrillo Fishery Biologist Department of the Interior US Fish Wildlife Service California, USA Be a better friend, newshound, and -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? Felipe D. Carrillo Fishery Biologist Department of the Interior US Fish Wildlife Service California, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] stepAIC and polynomial terms
There is absolutely no reason to remove age altogether. Notice that typically age and age^2 are highly correlated. To see this, consider 100 people with mean age 35 and 95% tolerance limite between 15 and 55: age - rnorm(100, 35, 10) cor(age, age^2) [1] 0.9898186 So if you use raw age and I(age^2) as predictors, it's really just the luck of the draw which gets selected (usually), and they will do much the same job when it comes to prediction, of course. So what are McCullagh and Nelder on about? One way to look at it is as a policy issue. In a mathematical sense you would think that whether you used age as the predictor or (age - 35) (years away from mid-life) should not make any difference, and if in you model selection procedure it does make a difference, then something arbitrary is going on, and any arbitrariness in this game is often a precursor of trouble to come. Consider the correlations again: cor(age-35, (age-35)^2) [1] -0.1302315 One way to *encourage* the linear term to be chosen ahead of the quadratic term is, in fact, to mean correct the predictor: sAge - age - mean(age) and use sAge and I(sAge^2) as your predictors. I expect this will favour the linear term over the quadratic and you will be led to a model that has no quadratic term, even if, in a strictly mathematical sense, the starting models were entirely equivalent. (Beware if you do this, though, you make things difficult when it comes to prediction.) You draw attention to a bit of a gap in the software, in my view. In variable selection with functions line stepAIC you would like to be able to specify a set of marginality constraints (to use the McCullagh and Nelder term) that you would like the model sifting process to respect, in order to ensure invariance with respect to groups of transformations that are natural to the problem. In this case you would like to declare that 1 is marginal to age which in turn is marginal to (age^2), to ensure invariance with respect to the action of the location and scale group, as seems natural. Why should changing the origin and unit of measurement have any consequences for the model selection process? Notice that in the case of factors and interactions this happens already: main effect terms will not be dropped if interactions involving them are still present. It's a similar argument. The same feature, ideally, should be available for other cases where marginality issues are at stake, but doing that seems to be a tricky problem. Using it might be trickier still. People would have to think about group invariance properties and that's foreign to most people... Bill Venables. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of caspar Sent: Monday, 17 March 2008 11:50 AM To: r-help@r-project.org Subject: [R] stepAIC and polynomial terms Dear all, I have a question regarding the use of stepAIC and polynomial (quadratic to be specific) terms in a binary logistic regression model. I read in McCullagh and Nelder, (1989, p 89) and as far as I remember from my statistics cources, higher-degree polynomial effects should not be included without the main effects. If I understand this correctly, following a stepwise model selection based on AIC should not lead to a model where the main effect of some continuous covariate is removed, but the quadratic term is kept. The question is, should I keep the quadratic term (note, there are other main effects that were retained following the stepwise algorithm) in the final model or should I delete it as well and move on? Or should I retain the main effect as well? To picture it, the initial model to which I called stepAIC is: Call: glm(formula = S ~ FR + Date * age + I(age^2), family = logexposure(ExposureDays = DATA$int), data = DATA) and the final one: Call: glm(formula = S ~ FR + Date + I(age^2), family = logexposure(ExposureDays = DATA$int), data = DATA) Thanks very much in advance for your thoughts and suggestions, Caspar Caspar Hallmann MSc Student WUR The Netherlands [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] mean for each row
Hi r-users, How do find the mean for each row? Thank you in advance for your help. 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 Day Totals 10.0 0.0 0.0 0.0 0.6 0.0 8.4 0.0 29.4 0.0 38.4 20.0 0.0 1.8 0.0 22.4 0.0 0.2 0.4 0.8 0.0 25.6 37.8 0.0 0.0 17.6 1.4 0.0 0.0 0.0 0.0 0.0 26.8 42.2 0.8 0.4 0.0 0.2 11.2 1.4 33.2 0.0 0.0 49.4 50.2 1.8 0.0 1.0 0.0 0.2 0.0 12.2 0.0 19.2 34.6 60.0 0.0 0.0 1.0 0.0 0.0 0.0 2.2 0.0 14.6 17.8 70.0 0.0 0.0 0.0 3.6 0.2 0.0 2.0 0.0 0.26.0 80.0 0.0 10.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 10.0 90.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00.0 10 1.0 1.4 0.0 0.0 0.0 1.0 0.4 0.0 0.0 0.03.8 11 0.0 8.2 0.0 0.0 0.0 0.0 8.4 0.0 0.0 0.4 17.0 12 10.8 0.8 0.0 0.0 0.0 0.0 1.0 0.0 0.0 4.2 16.8 13 32.8 0.0 0.0 0.8 0.0 0.0 0.0 0.0 0.2 0.2 34.0 14 1.0 0.0 1.6 0.2 0.0 0.0 0.0 0.0 0.0 0.02.8 15 0.0 0.0 0.0 2.2 1.4 0.0 0.0 0.0 0.0 0.03.6 16 0.6 0.0 0.0 0.6 22.0 0.0 0.0 0.0 0.0 0.0 23.2 17 0.0 0.0 0.0 0.0 0.0 0.0 2.8 0.0 0.0 0.02.8 18 2.8 0.0 0.0 0.0 0.0 0.0 8.2 0.0 0.0 8.2 19.2 19 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.0 0.20.4 20 0.0 8.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.08.2 21 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00.0 22 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.40.4 23 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.2 0.0 0.01.2 24 0.0 0.0 0.0 0.0 1.0 0.0 0.0 2.0 8.2 0.0 11.2 25 3.2 0.0 0.0 0.6 0.6 0.0 0.0 0.0 11.8 0.0 16.2 26 0.0 0.0 26.2 0.0 12.6 0.0 0.0 2.2 0.0 0.0 41.0 27 0.2 0.0 10.6 0.0 1.2 0.0 0.0 1.8 0.0 0.0 13.8 28 0.0 4.0 0.0 5.8 0.0 0.0 0.0 0.0 0.0 0.09.8 29 0.2 12.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 12.6 30 0.0 2.6 0.0 0.0 2.2 0.0 0.0 0.0 0.0 0.04.8 31 0.0 0.0 0.6 0.0 0.8 0.0 0.0 0.0 4.8 0.06.2 Year Totals 62.8 40.2 51.2 29.8 70.0 12.8 30.8 57.2 55.2 47.6 457.6 Be a better friend, newshound, and __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mean for each row
How about rowSums(x)/ncol(x), where x is the matrix? On Mon, Mar 17, 2008 at 1:48 PM, Roslina Zakaria [EMAIL PROTECTED] wrote: Hi r-users, How do find the mean for each row? Thank you in advance for your help. 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 Day Totals 10.0 0.0 0.0 0.0 0.6 0.0 8.4 0.0 29.4 0.0 38.4 20.0 0.0 1.8 0.0 22.4 0.0 0.2 0.4 0.8 0.0 25.6 37.8 0.0 0.0 17.6 1.4 0.0 0.0 0.0 0.0 0.0 26.8 42.2 0.8 0.4 0.0 0.2 11.2 1.4 33.2 0.0 0.0 49.4 50.2 1.8 0.0 1.0 0.0 0.2 0.0 12.2 0.0 19.2 34.6 60.0 0.0 0.0 1.0 0.0 0.0 0.0 2.2 0.0 14.6 17.8 70.0 0.0 0.0 0.0 3.6 0.2 0.0 2.0 0.0 0.26.0 80.0 0.0 10.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 10.0 90.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00.0 10 1.0 1.4 0.0 0.0 0.0 1.0 0.4 0.0 0.0 0.03.8 11 0.0 8.2 0.0 0.0 0.0 0.0 8.4 0.0 0.0 0.4 17.0 12 10.8 0.8 0.0 0.0 0.0 0.0 1.0 0.0 0.0 4.2 16.8 13 32.8 0.0 0.0 0.8 0.0 0.0 0.0 0.0 0.2 0.2 34.0 14 1.0 0.0 1.6 0.2 0.0 0.0 0.0 0.0 0.0 0.02.8 15 0.0 0.0 0.0 2.2 1.4 0.0 0.0 0.0 0.0 0.03.6 16 0.6 0.0 0.0 0.6 22.0 0.0 0.0 0.0 0.0 0.0 23.2 17 0.0 0.0 0.0 0.0 0.0 0.0 2.8 0.0 0.0 0.02.8 18 2.8 0.0 0.0 0.0 0.0 0.0 8.2 0.0 0.0 8.2 19.2 19 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.0 0.20.4 20 0.0 8.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.08.2 21 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00.0 22 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.40.4 23 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.2 0.0 0.01.2 24 0.0 0.0 0.0 0.0 1.0 0.0 0.0 2.0 8.2 0.0 11.2 25 3.2 0.0 0.0 0.6 0.6 0.0 0.0 0.0 11.8 0.0 16.2 26 0.0 0.0 26.2 0.0 12.6 0.0 0.0 2.2 0.0 0.0 41.0 27 0.2 0.0 10.6 0.0 1.2 0.0 0.0 1.8 0.0 0.0 13.8 28 0.0 4.0 0.0 5.8 0.0 0.0 0.0 0.0 0.0 0.09.8 29 0.2 12.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 12.6 30 0.0 2.6 0.0 0.0 2.2 0.0 0.0 0.0 0.0 0.04.8 31 0.0 0.0 0.6 0.0 0.8 0.0 0.0 0.0 4.8 0.06.2 Year Totals 62.8 40.2 51.2 29.8 70.0 12.8 30.8 57.2 55.2 47.6 457.6 Be a better friend, newshound, and __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- 彭河森 Hesen Peng Department of Statistics, Fudan University, Shanghai, P. R. C. Tel: Mobile 86-13052231416 Dormitory 86-21-65647724 Home 86-23-68207912 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.