Re: [R] Stepwise logistic regression....take too long...
On Sun, 20 Apr 2008, Marko Milicic wrote:

> Dear R helpers,
>
> I'm trying to build a logistic regression model on a large dataset with 360 factors and 850 observations. All 360 factors are known to be good predictors of the outcome variable but I have to find best model with maximum 10 factors. I tried to fit the full model and use the stepAIC function to get the best model but unfortunately the process takes too long to complete (more than 4 hours)...
>
> Is this expected behaviour of the stepAIC function from the MASS package, or am I doing something wrong?

Both. Work out how many fits you need to do backwards elimination. (It is tens of thousands.)

'I have to find best model with maximum 10 factors' looks like a homework problem to me. Where does the round number 10 come from?

Also, unless almost all the 'factors' have only two levels this looks like over-fitting for a single model, let alone after model dredging.

> Any suggestions? Thanks

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
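A rough way to check the "tens of thousands" figure is sketched below. It assumes 360 candidate terms and that elimination is pushed all the way down to 10 terms; stepAIC itself stops on AIC rather than at a fixed size, so the count is only illustrative.

```r
# Rough count of model fits needed by backward elimination.
# Assumption: 360 candidate terms, one term dropped per step, stop at 10 terms.
n_terms <- 360
n_keep  <- 10

# At a step with k terms in the model, each of the k single-term
# deletions has to be refitted before the best one can be chosen.
fits_per_step <- (n_keep + 1):n_terms   # k = 11, ..., 360

n_fits <- sum(fits_per_step)
n_fits
```

Each of those fits is itself an iterative glm fit on 850 observations, which is why the run time adds up.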
Re: [R] Choice of notch size in R
Alex Reynolds wrote:

> Is there a way to modify the choice of notch size [1] in R's boxplot routine from outlining a 5% significance region, to say 1% or lower?

Not easily. If you look inside boxplot.stats you'll find the hardcoded constant 1.58, and the documentation has the following text:

    The notches (if requested) extend to '+/-1.58 IQR/sqrt(n)'. This seems to be based on the same calculations as the formula with 1.57 in Chambers _et al._ (1983, p. 62), given in McGill _et al._ (1978, p. 16). They are based on asymptotic normality of the median and roughly equal sample sizes for the two medians being compared, and are said to be rather insensitive to the underlying distributions of the samples. The idea appears to be to give roughly a 95% confidence interval for the difference in two medians.

Judging from the wording, either the theory is unclear, or the author of the help page was not up to speed. My bets are on the former, but as things stand, the best way forward is to reproduce the calculations leading up to 1.58 and substitute 0.05 by 0.01 in the appropriate place.

> Thanks, Alex
>
> [1] McGill, Tukey, and Larsen. Variations of Box Plots, The American Statistician, Vol. 32, No. 1, 12-16.

--
Peter Dalgaard, Øster Farimagsgade 5, Entr.B
Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
University of Copenhagen, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
([EMAIL PROTECTED])
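The calculation behind the constant can be sketched as follows. The three constants are the ones usually attributed to McGill et al. (1978); reading them this way, and the rule used to carry the result over to 1%, are assumptions and not something the R help page documents. The function name notch_const is hypothetical.

```r
# Sketch of where 1.58 may come from (assumed reading of McGill et al. 1978).
se_ratio  <- 1.25   # approx. sqrt(pi/2): SE(median) / SE(mean) under normality
iqr_sigma <- 1.35   # approx. IQR / sigma for a normal distribution

# Hypothetical helper: notch half-width constant for a given normal multiplier
notch_const <- function(z_mult) z_mult * se_ratio / iqr_sigma

# The 5% case uses the multiplier 1.7, a compromise between
# qnorm(0.975) (about 1.96) and qnorm(0.975) / sqrt(2) (about 1.39).
c_05 <- notch_const(1.7)

# Assumed 1% analogue: keep the same compromise ratio 1.7 / qnorm(0.975)
# and swap in the 1% normal quantile.
c_01 <- notch_const(qnorm(0.995) * 1.7 / qnorm(0.975))

round(c(c_05 = c_05, c_01 = c_01), 2)
```

The 5% value lands between the 1.57 and 1.58 quoted in the help page, which suggests the rounding differs between sources.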
Re: [R] Choice of notch size in R
Alex Reynolds <reynolda at u.washington.edu> writes:

> Is there a way to modify the choice of notch size [1] in R's boxplot routine from outlining a 5% significance region, to say 1% or lower?

Not directly from boxplot, because it is hardwired to indirectly call fivenum, not quantile. Check bxp instead, which is the workhorse for boxplot and is more flexible in the parameters passed.

Dieter
Re: [R] Equivalent of intervals() in lmer
kedar nadkarni <nadkarnikedar at gmail.com> writes:

> I have been trying to obtain confidence intervals for the fit after having used lmer by using intervals(), but this does not work. intervals() is associated with lme but not with lmer(). What is the equivalent for intervals() in lmer()?

ci in Gregory Warnes' package gmodels can do this. However, think twice if you really need lmer. Why not lme? It is well documented and has many features that are currently not in lmer.

Dieter
Re: [R] Matched pairs with two data frames
David,

thanks for your comment, the code and the link. You are right: "arbitrary" is a better word than "exact" pair matching. I took the term "one-to-one exact matching" from the paper "MatchIt: Nonparametric Preprocessing for Parametric Causal Inference" (p. 6): http://gking.harvard.edu/matchit/docs/matchit.pdf

> Is it really the case that SPSS would give the output that you describe without any warnings about non-uniqueness?

The output I described does indeed cause the SPSS message "Warning # duplicate key in a file"; however, the result is what I need (discarding the lines with missing values in V3 and V4). But I will check this again with the treat/control data from my example here.

Kind regards
Udo

Quoting David Winsemius [EMAIL PROTECTED]:

> Udo [EMAIL PROTECTED] wrote in news:[EMAIL PROTECTED]:
>
>> Daniel, thank you! I want to perform the simplest way of matching: a one-to-one exact match (by age and school): for every case in "treat" find ONE case (if there is one) in "control". The cases in "control" that could be matched should be tagged as not available or taken away (deleted) from the control pool (thus, the used ones are not replaced).
>>
>>     # treatment group
>>     treat <- data.frame(age=c(1,1,2,2,2,4), school=c(10,10,20,20,20,11), out1=c(9.5,2.3,3.3,4.1,5.9,4.6))
>>     # control group
>>     control <- data.frame(age=c(1,1,1,1,3,2), school=c(10,10,10,10,33,20), out2=c(1.1,2,3.5,4.9,5.2,6.5))
>>     # one-to-one exact matching algorithm
>>     matched.data.frame <- ?
>>
>> In my example I matched the cases by hand to make things clear. Case 1 from "treat" was matched with case 1 from "control", 2 with 2 and 3 with 6. Cases 4, 5 and 6 could not be matched, because there is no partner in "control". Thus my matched example data frame has 3 cases.
>
> Is it really the case that SPSS would give the output that you describe without any warnings about non-uniqueness? How could they live with themselves after such arbitrary behavior? This link is evidence that SPSS may not behave as you allege.
>
> http://kb.iu.edu/data/afit.html
>
> If you really want to persist in what cannot possibly be called one-to-one exact matching, but instead arbitrary convenience matching, then you need to construct a function that sequentially marches through "treat", grabs the first match (perhaps with something like):
>
>     matched.first <- merge(treat[1,], control, by=c("age","school"))[1,]
>     matched.first
>       age school out1 out2
>     1   1     10  9.5  1.1
>
> ... except that the 1's would be replaced with an index variable, then mark that control as "taken", perhaps by using all of the variables as identifiers, and then attempt match/marking for each successive case among (taken == FALSE) controls.
>
> --
> David Winsemius

Udo KN G Ö I
Clinic for Child and Adolescent Psychiatry
Philipps University of Marburg / Germany
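The sequential procedure described above (walk through treat, take the first still-free control with the same age and school, mark it as taken) could look roughly like this. The helper name match_first_available is made up for illustration; it is a sketch of the idea, not code from the thread.

```r
treat <- data.frame(age = c(1, 1, 2, 2, 2, 4),
                    school = c(10, 10, 20, 20, 20, 11),
                    out1 = c(9.5, 2.3, 3.3, 4.1, 5.9, 4.6))
control <- data.frame(age = c(1, 1, 1, 1, 3, 2),
                      school = c(10, 10, 10, 10, 33, 20),
                      out2 = c(1.1, 2, 3.5, 4.9, 5.2, 6.5))

# Hypothetical helper: first-come-first-served matching without replacement.
match_first_available <- function(treat, control, by = c("age", "school")) {
  taken   <- rep(FALSE, nrow(control))       # controls already used
  partner <- rep(NA_integer_, nrow(treat))   # control row matched to each case
  for (i in seq_len(nrow(treat))) {
    same <- !taken
    for (v in by) same <- same & control[[v]] == treat[[v]][i]
    hit <- which(same)
    if (length(hit) > 0) {
      partner[i]    <- hit[1]   # grab the first free match ...
      taken[hit[1]] <- TRUE     # ... and remove it from the pool
    }
  }
  ok <- !is.na(partner)
  cbind(treat[ok, , drop = FALSE],
        control[partner[ok], setdiff(names(control), by), drop = FALSE])
}

matched <- match_first_available(treat, control)
matched
```

Which control ends up paired with which case depends on row order, which is exactly the arbitrariness discussed above.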
[R] rbugs on linux and wine
Hi List,

I am trying an example from pumps{rbugs} with .Renviron in $HOME adjusted for my box: WINE=/usr/bin/wine, BUGS=/usr/local/bin/WinBugs14/winbugs.exe

    data(pumps)
    pumps.data <- list(t = pumps$t, x = pumps$x, N = nrow(pumps))
    pumps.model <- file.path(.path.package("rbugs"), "bugs/model", "pumps.bug")
    #file.show(pumps.model)
    pumps.inits <- file.path(.path.package("rbugs"), "bugs/inits", "pumps.txt")
    #file.show(pumps.inits)
    inits <- list(dget(pumps.inits))
    parameters <- c("theta", "alpha", "beta")
    pumps.sim <- rbugs(data = pumps.data, inits, parameters, pumps.model,
                       n.chains = 1, n.iter = 1000,
                       workingDir = "tmp", bugsWorkingDir = "tmp",
                       #WINE = "/usr/bin/wine",
                       #BUGS = "/usr/local/bin/openbugs/winbugs.exe",
                       useWine = TRUE)
    ## End(Not run)

I get this error:

    in rbugs(data=pumps.data,inits,parameters, pumps.model,n.chains=1: wine executable does not exists.

However,

    /usr/bin/wine /usr/local/bin/WinBugs14/winbugs.exe

starts up winbugs. I am running opensuse 10.3 with R 2.6.2 (2007-11-26) and winbugs 1.4.3 (6th August, 2007).

Any ideas on what goes wrong?

Thanks
Herry

Dr Alexander Herr - Herry
CSIRO, Sustainable Ecosystems
Gungahlin Homestead, Bellenden Street
GPO Box 284, Crace, ACT 2601
Phone/www (02) 6242 1542; 6242 1705 (fax); 0408679811 (mob)
home: www.csiro.au/people/Alexander.Herr
Webadmin ABS: http://ausbats.org.au
Sustainable Ecosystems: www.cse.csiro.au
Re: [R] How to insert a vector or matrix into an existing matrix
On Sun, Apr 20, 2008 at 08:16:11PM +, David Winsemius wrote:

> Gabor Csardi [EMAIL PROTECTED] wrote in news:[EMAIL PROTECTED]:
>
>> Hmm, my understanding is different,
>>
>>     m <- matrix(sample(10*10), ncol=10)
>>     m2 <- rbind( m[1:5,], 1:10, m[6:10,] )
>>     m3 <- cbind( m[,1:8], 1:10, m[,9:10] )
>
> I read the question the same way and, in response to the part of the question asking for no temporary matrix, offer this refinement on your suggestion:
>
>     m <- rbind( m[1:5,], 1:10, m[6:10,] )   # row insertion
>
> or ...
>
>     # not to be followed by, but rather instead column insertion ..
>     m <- cbind( m[,1:8], 1:10, m[,9:10] )

There might be something wrong with my eyes, but where is the refinement here? Your lines are literally the same as mine. There is no temporary matrix here, m2 and m3 are the results; he wanted either between row 5 and 6 _OR_ column 8 and 9.

Oh, if you mean that we immediately put back the result into 'm', then 1) it does not really matter, R will create a temporary matrix internally anyway, 2) I assumed that the user can figure this out him/herself.

G.

> --
> David Winsemius
>
>> G.
>>
>> On Sun, Apr 20, 2008 at 10:21:47AM -0300, Henrique Dallazuanna wrote:
>>> If I understand:
>>>
>>>     m <- matrix(sample(10*10), ncol=10)
>>>     m[5:6, 8:9] <- 1:4
>>>
>>> On 4/18/08, Ng Stanley [EMAIL PROTECTED] wrote:
>>>> Hi, Is there any function to insert a vector or matrix into an existing matrix, say between row 5 and 6 or column 8 and 9, without creating a temporary matrix? Thanks Stanley

--
Csardi Gabor [EMAIL PROTECTED] UNIL DGM
Re: [R] Function redefinition - not urgent, but I am curious
> Suppose I write:
>
>     f1 <- function(x) x + 1
>     f2 <- function(x) 2 * f1(x)
>     f2(10)   # 22
>     f1 <- function(x) x - 1
>     f2(10)   # 18
>
> This is quite obvious. But is there any way to define f2 in such a way that we freeze the definition of f1?

    f1 <- function(x) x+1
    f1frozen <- f1
    f2 <- function(x) 2*f1frozen(x)
    f2(10)   # 22
    f1 <- function(x) x-1
    f2(10)   # 22

Regards,
Richie.

Mathematical Sciences Unit
HSL
Re: [R] Function redefinition - not urgent, but I am curious
[EMAIL PROTECTED] wrote:

>> Suppose I write:
>>
>>     f1 <- function(x) x + 1
>>     f2 <- function(x) 2 * f1(x)
>>     f2(10)   # 22
>>     f1 <- function(x) x - 1
>>     f2(10)   # 18
>>
>> This is quite obvious. But is there any way to define f2 in such a way that we freeze the definition of f1?
>
>     f1 <- function(x) x+1
>     f1frozen <- f1
>     f2 <- function(x) 2*f1frozen(x)
>     f2(10)   # 22
>     f1 <- function(x) x-1
>     f2(10)   # 22

    f1 <- function(x) x+1
    f2 <- local({f1 <- f1; function(x) 2 * f1(x)})
    f2(10)
    [1] 22
    f1 <- function(x) x-1
    f2(10)
    [1] 22

--
Peter Dalgaard, Øster Farimagsgade 5, Entr.B
Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
University of Copenhagen, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
([EMAIL PROTECTED])
[R] UTF-8 or Unicode on Windows PC
Dear all,

is it possible to set up RGUI or JGR on a Windows PC to use UTF-8 encoding? I looked for it in the mailing lists and in the documentation, but I couldn't figure it out.

My problem is, e.g., to split a given string containing German and Russian words into characters. Example:

    a <- "asdШas"
    strsplit(a, NULL)
    [[1]]
    [1] "a" "s" "d" "Ш" "a" "s"

This works on every Mac or Linux computer, but I didn't find a way for Windows. I tried to set options(encoding) to UTF-8 and I tried to use the Perl mode in strsplit, but I had no success. At least by using JGR I was able to type Russian and see my text correctly, but strsplit failed. I set RGUI to a Unicode font, no success. I tried to save a script file in UTF-8 or UTF-16 and I tried to run source(FILE, encoding=***), no success.

Is there really no way to use a Windows PC and R to work with Unicode texts?

Many thanks in advance for each hint,
--Hans
[R] Data labels in barchart (lattice)
Dear all,

I use the barchart function (lattice) for plotting stacked barcharts. The data is a summary table (data frame) of Likert-scale evaluations (strongly agree, agree ... strongly disagree) of different issues, constructed as follows (L1 = percentage of strongly agree evaluations, L4 = percentage of strongly disagree evaluations):

    ID      L1  L2  L3  L4  DN
    Issue1  25  40  35   0   0
    Issue2  15  30  22  28   5
    ...

What I have so far not achieved is adding data labels to each sub-bar of a 100% bar. What I would like to have is something like this:

    Issue1: |###25%###OO40%OOXXX35%XXX
    Issue2: | (similar)
    ...

What should I do in order to display data labels?

Thanks in advance,
Kimmo
Re: [R] UTF-8 or Unicode on Windows PC
You didn't tell us your R version (or your locale). Windows has no UTF-8 locales, so a lot of work has had to be done to allow Unicode chars to be handled on Windows. Please look into 2.7.0 RC, and in particular its CHANGES file at

https://svn.r-project.org/R/branches/R-2-7-branch/src/gnuwin32/CHANGES

On Mon, 21 Apr 2008, Hans-Joerg Bibiko wrote:

> Dear all,
>
> is it possible to set up RGUI or JGR on Windows PC to UTF-8 encoding? I looked for it in mailing lists and in the documentation, but I couldn't figure out it.
>
> My problem is e.g. to split a given string containing German and Russian words into characters. example:
>
>     a <- "asdШas"
>     strsplit(a, NULL)
>     [[1]]
>     [1] "a" "s" "d" "Ш" "a" "s"
>
> works on each Mac or Linux computer, but I didn't find a way for Windows. I tried to set options(encoding) to UTF-8, I tried to use the Perl mode in strsplit, but I had no success. At least by using JGR I was able to type Russian and see my text correctly but strsplit failed. I set RGUI to a Unicode font, no success. I tried to save a script file in UTF-8 or UTF-16 and I tried to run source(FILE, encoding=***), no success.
>
> Is there really no way to use a Windows PC and R to work with Unicode texts?
>
> Many thanks in advance for each hint,
> --Hans

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] matrix problem
Hi Everyone,

I am running into a problem with matrices. I use R version 2.4.1 and an older version. The problem is this:

    m <- matrix(ncol=3, nrow=4)
    m[,1:3] <- runif(n=4)

That does what I expect; it fills up the rows of the matrix with the data vector:

    m
              [,1]      [,2]      [,3]
    [1,] 0.2083071 0.2083071 0.2083071
    [2,] 0.5865763 0.5865763 0.5865763
    [3,] 0.7901782 0.7901782 0.7901782
    [4,] 0.8298317 0.8298317 0.8298317

But this doesn't work:

    m[1:4,] <- runif(n=3)
    m
               [,1]       [,2]       [,3]
    [1,] 0.96864939 0.11656740 0.06182311
    [2,] 0.11656740 0.06182311 0.96864939
    [3,] 0.06182311 0.96864939 0.11656740
    [4,] 0.96864939 0.11656740 0.06182311

I want it to fill up the columns of the matrix with the data vector.

Maybe there is a better way to do what I want. I need to do both of the above. The matrices are large, so I need a fast method.

Thanks very much for any help.

Bill Simpson
Re: [R] graphics history
On Windows you can go to the graphics window and click History -> Recording. You may also want to have a look at recordPlot() and replayPlot().

Regards
Søren

From: [EMAIL PROTECTED] on behalf of Norbert NEUWIRTH
Sent: Mon 21-04-2008 10:59
To: r-help@r-project.org
Subject: [R] graphics history

> Dear useRs and developeRs,
>
> I am afraid it is a very basic question, but I did not find anything alike in the literature. The R standard graphics device offers the opportunity to activate the history of plots drawn within the current session. The user can scroll back and see the last graphs (or the same graph with some changes in parameters). I have not yet found out how to activate the history by code. Any ideas?
>
> Thanks and best regards,
> Norbert
>
> --
> Mag. Norbert Neuwirth
> Österreichisches Institut für Familienforschung (ÖIF) - Universität Wien
> Austrian Institute for Family Studies - University of Vienna
> http://www.oif.ac.at/
> e-mail: [EMAIL PROTECTED]
> tel: +43-1-4277-489-11
> fax: +43-1-4277-9-489
> address: A-1010 Wien, Grillparzerstraße 7/9
Re: [R] UTF-8 or Unicode on Windows PC
On 21 Apr 2008, at 11:33, Prof Brian Ripley wrote:

> You didn't tell us your R version (or your locale). Windows has no UTF-8 locales, so a lot of work has had to be done to allow Unicode chars to be handled on Windows.

It was more or less a general question on R running on Windows PCs. Normally I'm using R on a Mac or Linux. But some of my students asked for Unicode support in the Windows RGUI.

> Please look into 2.7.0 RC, and in particular its CHANGES file at
> https://svn.r-project.org/R/branches/R-2-7-branch/src/gnuwin32/CHANGES

This is really good news! I would like to express my gratitude toward anyone who was/is involved in that development!

Is it possible to download a compiled snapshot of 2.7.0 for Windows XP?

Thanks a lot,
--Hans
Re: [R] UTF-8 or Unicode on Windows PC
On Mon, 21 Apr 2008, Hans-Joerg Bibiko wrote:

> On 21 Apr 2008, at 11:33, Prof Brian Ripley wrote:
>
>> You didn't tell us your R version (or your locale). Windows has no UTF-8 locales, so a lot of work has had to be done to allow Unicode chars to be handled on Windows.
>
> It was more or less a general question on R running on Windows PCs. Normally I'm using R on a Mac or Linux. But some of my students asked for the Unicode support for Windows' RGUI.
>
>> Please look into 2.7.0 RC, and in particular its CHANGES file at
>> https://svn.r-project.org/R/branches/R-2-7-branch/src/gnuwin32/CHANGES
>
> These are really good news! I would like to express my gratitude toward anyone who was/is involved in that development!

Thanks for the thanks.

> Is it possible to download a compiled snapshot of 2.7.0 for Windows XP?

Yes, http://cran.r-project.org/bin/windows/base/rtest.html

And it is due for release tomorrow.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] means and variances of several groups in the matrix
kathie wrote:

> Dear R users,
>
> I have 32 observations in data x. After sorting this, I want to compute means and variances of 3 groups divided by nr. Actually, the number of groups is flexible. Any suggestion will be greatly appreciated.

Hi Kathryn,

One way (there are many others) is to use the brkdn function in the prettyR package. You have to create a grouping variable, but it's pretty easy...

    group1 <- rep(1:3, times=nr[1,])
    group2 <- rep(1:3, times=nr[2,])
    y.df <- data.frame(y=y, group1=group1, group2=group2)
    brkdn(y~group1, y.df, num.desc=c("mean","var"))
    brkdn(y~group2, y.df, num.desc=c("mean","var"))

Jim
[R] change column names of several data frames
Dear all,

I have several data frames for which I want to change the column names. Example data:

    data.1 <- data.frame(x1 = rnorm(5))
    data.2 <- data.frame(x1 = rnorm(5))
    ...

What I want to achieve:

    names(data.1) <- "y1"
    names(data.2) <- "y1"
    ...

Is it possible to achieve this with a loop or any of the apply functions?

Some (out of several...) unsuccessful attempts using for-loops instead:

    for(i in 1:2) names(get(paste("data", i, sep = "."))) <- "y1"
    for(i in 1:2) assign(paste("data", i, sep="."), names(get(paste("natal", i, sep = "."))) <- "y1")

Thanks in advance!

/ Henrik Pärn
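One possible approach, given only as a sketch: a replacement call such as names(...) <- needs a variable on its left-hand side, not the result of get(), so each data frame can be fetched into a temporary variable, renamed there, and written back with assign(). The name "y1" and the data.1/data.2 objects follow the example above.

```r
data.1 <- data.frame(x1 = rnorm(5))
data.2 <- data.frame(x1 = rnorm(5))

for (nm in paste("data", 1:2, sep = ".")) {
  d <- get(nm)        # fetch a copy of the data frame by name
  names(d) <- "y1"    # rename the column(s) of the copy
  assign(nm, d)       # store the copy back under the original name
}

names(data.1)
names(data.2)
```

Keeping the data frames in a list and using lapply() would avoid get()/assign() altogether, at the price of changing how the objects are stored.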
[R] Odp: means and variances of several groups in the matrix
Hi

[EMAIL PROTECTED] wrote on 21.04.2008 09:03:30:

> Dear R users,
>
> I have 32 observations in data x. After sorting this, I want to compute means and variances of 3 groups divided by nr. Actually, the number of groups is flexible. Any suggestion will be greatly appreciated.
>
> Kathryn Lord
>
> ---
>     x = rnorm(32)
>     y = sort(x)
>     nr = matrix(c(12,11,10,10,10,11), 2, 3)
>     nr
>          [,1] [,2] [,3]
>     [1,]   12   10   10   <- sum=32
>     [2,]   11   10   11   <- sum=32
>
> For the 1st row in nr, index of y = (1,..,12, 13,...,23, 24,...32). I want to compute means and variances for 3 groups (1st group is 1 through 12; 2nd group is 13-23; 3rd group is 24-32).
>
> For the 2nd row in nr, index of y = (1,..,11, 12,...,22, 23,...32). Also, I want to compute means and variances for 3 groups (1st group is 1 through 11; 2nd group is 12-22; 3rd group is 23-32).

If you know that your vector is sorted you can do

    sapply(split(y, rep(1:3, times=nr[1,])), mean)

Regards
Petr
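The same idea extended to both statistics and to every row of nr might look like this. The helper name group_stats and the seed are illustrative choices; the group boundaries are taken from the counts in nr.

```r
set.seed(1)   # arbitrary seed, only so the example is reproducible
y  <- sort(rnorm(32))
nr <- matrix(c(12, 11, 10, 10, 10, 11), 2, 3)

# Hypothetical helper: mean and variance of consecutive blocks of a sorted vector
group_stats <- function(y, sizes) {
  g <- rep(seq_along(sizes), times = sizes)   # block label for each observation
  stopifnot(length(g) == length(y))           # block sizes must add up
  rbind(mean = tapply(y, g, mean),
        var  = tapply(y, g, var))
}

# One 2 x (number of groups) matrix per row of nr
res <- lapply(seq_len(nrow(nr)), function(i) group_stats(y, nr[i, ]))
res
```

Because the labels are built with seq_along(sizes), the number of groups follows the number of columns of nr and does not have to be 3.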
Re: [R] problem in caluclaring the multiple regression
man4ish wrote:

> I am trying to calculate the regression for the following input data stored in the file 'data.txt'. I am reading this and storing it in the variable i. Then I am trying to get the predicted value using f1 as dependent and the others, f2 to f10, as independent variables. It is giving the following error. Also I want to get one predicted value for each row (y). What should I do? Please help me out, I will be thankful to you.
>
>     i <- read.table("data.txt", header=FALSE)
>     i
>             V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11
>     1 molecule f1 f2 f3 f4 f5 f6 f7 f8  f9 f10
>     2       m1  3  7  0  3  0  0  0  1   1   1
>     3       m2  2  7  0  2  0  2  0  1   0   1
>     4       m3  0  0  0  3  0  0  0  3   1   0
>     5       m4  3  7  0  1  3  0  0  0   0   1
>     attach(i)
>     out <- lm(y~x1+x2+x3+x4+x5+x6+x7+x8+x9+x10)
>     Error in eval(expr, envir, enclos) : object "y" not found

The explanation for the error message is trivial: there are no variables in the data frame with the names you specify in the call to lm(). The first row in the frame contains what are probably the intended names, read in as data. Use header=TRUE in read.table().

On the other hand there are several other problems as well: the contents of the dependent variable, as well as the number of independent variables compared with the number of rows in the frame.

Tom
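A minimal sketch of the header fix, using the four rows shown above inlined via the text argument instead of a file. The choice of f2 and f4 as predictors is purely illustrative: with four rows, a model with all nine predictors cannot be estimated.

```r
txt <- "molecule f1 f2 f3 f4 f5 f6 f7 f8 f9 f10
m1 3 7 0 3 0 0 0 1 1 1
m2 2 7 0 2 0 2 0 1 0 1
m3 0 0 0 3 0 0 0 3 1 0
m4 3 7 0 1 3 0 0 0 0 1"

# header = TRUE turns the first row into column names,
# so f1 ... f10 are read as numeric columns rather than as data.
d <- read.table(text = txt, header = TRUE)
names(d)

# Illustrative model only: two predictors chosen arbitrarily.
fit <- lm(f1 ~ f2 + f4, data = d)

# One fitted (predicted) value per row of d
fitted(fit)
```

Passing data = d to lm() also removes the need for attach().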
[R] How can we predict the value of dependent variable using independent variable
Hi,

I am trying to predict the value of a dependent variable from independent variables using R. For example, y is dependent and x1, x2, x3, ..., xn are independent variables, so how can I predict the value of y using x1, x2, x3, ..., xn?

    y x1 x2 x3 x4 x5 x6 6 0 1 231 2 5 1 34 56 8 8 4 6 9 00 1 3 5 72 1 0 3 4 5 67 82 1 1 0 111 2 2

y (predicted) values = ?
Re: [R] Choice of notch size in R
> Is there a way to modify the choice of notch size [1] in R's boxplot routine from outlining a 5% significance region, to say 1% or lower?

Yes, but it's not as simple as specifying the significance level. You'll have to update the function boxplot.stats, specifically the line

    conf <- if (do.conf) stats[3] + c(-1.58, 1.58) * iqr/sqrt(n)

Then you either need to modify boxplot to find your version of boxplot.stats, or call bxp (the low-level plotting function that boxplot calls) directly.

Regards,
Richie.

Mathematical Sciences Unit
HSL
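The bxp route can be sketched without editing boxplot.stats at all: compute the statistics with plot = FALSE, overwrite the conf component, and hand the result to bxp. The helper name set_notch and the constant 2.07 are illustrative assumptions, not values taken from R or from this thread.

```r
set.seed(42)   # arbitrary seed, only for reproducibility
x <- rnorm(60)
g <- gl(2, 30, labels = c("a", "b"))

b <- boxplot(x ~ g, plot = FALSE)   # statistics only, nothing is drawn

# Hypothetical helper: recompute notch limits as median +/- k * IQR / sqrt(n)
set_notch <- function(b, k) {
  med  <- b$stats[3, ]
  iqr  <- b$stats[4, ] - b$stats[2, ]   # hinge spread, as used by boxplot.stats
  half <- k * iqr / sqrt(b$n)
  b$conf <- rbind(med - half, med + half)
  b
}

b_wide <- set_notch(b, 2.07)   # 2.07 is only an example of a wider notch
# bxp(b_wide, notch = TRUE)    # draw the boxes with the modified notches
```

Calling set_notch(b, 1.58) reproduces the conf matrix that boxplot returns by default, which is a quick check that the formula matches.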
Re: [R] Re ad From EXCEL
ermimi wrote:

> Hello!!! I have read a lot about how to read data from an Excel file, but I haven't found the necessary information to read the data. Now I can create a channel:
>
>     channel <- odbcConnectExcel("file.xls")
>
> but I don't know how to read the data. I hope that you can help me. Thank you very much.

You are making an attempt at the most complex way of doing this. The simplest by far is (a) to read the data from the clipboard with the read.table() function, or (b) to save the spreadsheet as a .csv file and use the same function to read the file with the appropriate arguments for separators etc. I tend to use the latter approach. In any case, once imported into R, the frame should be carefully checked against the contents of the spreadsheet.

Tom
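Option (b) might look like the sketch below. A small temporary file stands in for the sheet exported from Excel; the file and its column names are made up for the example.

```r
# Stand-in for a spreadsheet saved from Excel as CSV (hypothetical contents)
csv_file <- tempfile(fileext = ".csv")
writeLines(c("id,value", "1,2.5", "2,3.7"), csv_file)

# read.csv is read.table with defaults suited to comma-separated files
d <- read.csv(csv_file, header = TRUE)
str(d)   # check column types against the spreadsheet

# Option (a), Windows only: copy the cells in Excel, then
# d <- read.table("clipboard", header = TRUE, sep = "\t")
```

In locales where Excel writes semicolon-separated files with decimal commas, read.csv2 is the matching variant.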
Re: [R] matrix problem
On 4/21/2008 5:54 AM, William Simpson wrote:

> Hi Everyone,
>
> I am running into a problem with matrices. I use R version 2.4.1 and an older version. The problem is this:
>
>     m <- matrix(ncol=3, nrow=4)
>     m[,1:3] <- runif(n=4)
>
> That does what I expect; it fills up the rows of the matrix with the data vector
>
>     m
>               [,1]      [,2]      [,3]
>     [1,] 0.2083071 0.2083071 0.2083071
>     [2,] 0.5865763 0.5865763 0.5865763
>     [3,] 0.7901782 0.7901782 0.7901782
>     [4,] 0.8298317 0.8298317 0.8298317
>
> But this doesn't work:
>
>     m[1:4,] <- runif(n=3)
>     m
>                [,1]       [,2]       [,3]
>     [1,] 0.96864939 0.11656740 0.06182311
>     [2,] 0.11656740 0.06182311 0.96864939
>     [3,] 0.06182311 0.96864939 0.11656740
>     [4,] 0.96864939 0.11656740 0.06182311
>
> I want it to fill up the columns of the matrix with the data vector.

Does this help?

    matrix(runif(4), ncol=3, nrow=4)
               [,1]       [,2]       [,3]
    [1,] 0.60226296 0.60226296 0.60226296
    [2,] 0.74104084 0.74104084 0.74104084
    [3,] 0.70955138 0.70955138 0.70955138
    [4,] 0.03136881 0.03136881 0.03136881

    matrix(runif(3), ncol=3, nrow=4, byrow=TRUE)
              [,1]      [,2]      [,3]
    [1,] 0.7008625 0.8348078 0.1003123
    [2,] 0.7008625 0.8348078 0.1003123
    [3,] 0.7008625 0.8348078 0.1003123
    [4,] 0.7008625 0.8348078 0.1003123

> Maybe there is a better way to do what I want. I need to do both of the above. The matrices are large, so I need a fast method.
>
> Thanks very much for any help.
>
> Bill Simpson

--
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
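For filling an existing matrix in place, a sketch along the same lines: assignment into a matrix recycles the right-hand side in column-major order, which is why a length-3 vector wraps around a 4-row matrix. Repeating each element nrow(m) times first gives the row-wise pattern. The vectors v4 and v3 are fixed values chosen so the result is easy to read.

```r
m  <- matrix(NA_real_, nrow = 4, ncol = 3)

v4 <- c(0.1, 0.2, 0.3, 0.4)
m[] <- v4                        # recycled down the columns: each column equals v4
by_col <- m

v3 <- c(10, 20, 30)
m[] <- rep(v3, each = nrow(m))   # each element repeated 4 times: each row equals v3
by_row <- m

by_col
by_row
```

Both assignments are vectorised, so they should scale to large matrices without an explicit loop.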
Re: [R] RPy
Doran, Harold wrote:
> I'm curious if there are users of RPy on this list. I've recently created a GUI front end using Tkinter for some Python scripts I've written for some of our internal operations, and I am quite pleased with how this program works. Currently, I can use py2exe to create an executable that allows this GUI to appear and all Python scripts to run even if the user doesn't have Python on their machine. So (maybe) in theory if I can link RPy to my GUI to run R, I can then use py2exe to compile it, and that would allow the user to run the functions even if R isn't on the machine.

You're confounding having something on a machine with having something installed on a machine. py2exe works by bundling all of Python with the exe file, so in a sense the target machine does have Python on it, just not installed in C:\Python in the usual way. If you give someone four different py2exe programs, they end up having four lots of Python.

For py2exe to work with R so that people wouldn't have to install R, py2exe would have to bundle up all of R in the exe file. So that's the R binary, the .dll, every library package needed and so on. In a word, 'ick!'.

> I realize this is a broad question and has no minimal commented code. But if anyone has some experience using RPy, Tkinter and R, I can come up with a small example to see if we could work out a possible way to use Tkinter to run R.

RPy is the way to go, but you will have to get your users to install R, Python and RPy. It's only a few clicks and they only have to do it once. Personally I've used PyQt to create Python programs with Qt GUIs that call R, and it all works very nicely.

Barry
Re: [R] Reg. consensus ranking
Mallika,

I am not sure exactly what you mean by consensus approach. One easy thing you can do is compute the Pareto front, which is the set of non-dominated models. A model is dominated or covered if another model exists which is unambiguously better according to the given scores. So, this method allows you to eliminate uninteresting models as a first step.

    ## assume high scores are good
    "%covers%" <- function(a, b) all(a >= b) && any(a > b)

    > 1:5 %covers% 0:4
    [1] TRUE
    > 1:5 %covers% 4:0
    [1] FALSE

Say you have a matrix with a row for each model and their scores in columns.

    foo <- matrix(nrow=1000, ncol=8)
    colnames(foo) <- paste("variable", 1:ncol(foo), sep="")
    rownames(foo) <- paste("model", 1:nrow(foo), sep="")
    foo[] <- rnorm(length(foo))

    ## compute the set of dominated models (SLOW)
    ## (for any serious application, write this in C)
    dominated <- function(data) {
        apply(data, 1, function(rowi)
            any(apply(data, 1, function(rowj) rowj %covers% rowi)))
    }
    nondom <- !dominated(foo)

    > sum(nondom)
    [1] 505

So in this case, only about half the cases can be eliminated. But hopefully your scores will agree more than these random numbers do, so you will get a bit further. I do think that 20 indicators is probably too many to get a useful result from a consensus approach, so you might want to look at subsets of indicators. You should also consider the uncertainty inherent in the models and indicators when comparing them.

An extension is to work with the cover matrix, which records which models are dominated by which others. This defines a graph (as in graph theory), and you can plot it as a Hasse diagram to see groupings etc. Take the transitive reduction first.

Here's a good reference:

Patil, G.P. and C. Taillie (2004), Multiple indicators, partially ordered sets, and linear extensions: Multi-criterion ranking and prioritization, Environmental and Ecological Statistics, 11, 199-228.

and maybe (cough)

Andrews, F. (2005). Representing Uncertainty in Ranking by Single or Multiple Indicators. In Zerger, A. and Argent, R.M. (eds) MODSIM 2005 International Congress on Modelling and Simulation. Modelling and Simulation Society of Australia and New Zealand, December 2005, pp. 2456-2462. ISBN: 0-9758400-2-9.
http://www.mssanz.org.au/modsim05/papers/andrews.pdf

On Mon, Apr 21, 2008 at 8:17 AM, Mallika Veeramalai [EMAIL PROTECTED] wrote:
> Dear All,
> I have a list of models (1000) which have variable scores from 20 different methods. I would like to rank the models using a consensus approach based on high scores from the different methods. Is there any function available in R for this purpose? I will appreciate any pointers in this regard.
> Thank you very much in advance,
> Mallika
>
> Mallika Veeramalai, PhD, Postdoctoral Associate,
> Bioinformatics and Systems Biology, Prof. Adam Godzik Lab,
> Burnham Institute for Medical Research, La Jolla, San Diego, CA 92037, US.
> phone: +1 858 646 3100 ext: 3627 (work)
> Fax: +1 858 795 5249
> Web: http://bioinformatics.burnham.org/~mallika/
> Email: [EMAIL PROTECTED] [EMAIL PROTECTED]

--
Felix Andrews / 安福立
PhD candidate
Integrated Catchment Assessment and Management Centre
The Fenner School of Environment and Society
The Australian National University (Building 48A), ACT 0200
Beijing Bag, Locked Bag 40, Kingston ACT 2604
http://www.neurofractal.org/felix/
3358 543D AAC6 22C2 D336 80D9 360B 72DD 3E4C F5D8
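To make the Pareto-front idea concrete, here is a small self-contained sketch on a toy score matrix. The four models and their two scores are hypothetical, chosen so the result can be checked by eye; the functions follow the definitions given in the message.

```r
# TRUE when a is at least as good as b everywhere and strictly better somewhere
`%covers%` <- function(a, b) all(a >= b) && any(a > b)

# Flag each row that is covered by at least one other row (O(n^2) comparisons)
dominated <- function(scores) {
  apply(scores, 1, function(rowi)
    any(apply(scores, 1, function(rowj) rowj %covers% rowi)))
}

# Hypothetical scores: four models, two indicators, high is good
scores <- rbind(a = c(3, 3), b = c(1, 1), c = c(4, 0), d = c(3, 3))
front <- rownames(scores)[!dominated(scores)]
print(front)
```

Model b is covered by a, while c survives because it is best on the first indicator. Models a and d tie, and a tie does not count as covering, so both stay on the front.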
Re: [R] Overall p-value from a factor in a coxph fit
Prof. Paul, Prof. Frank,

Thank you very much for helping me out. The Design package did the trick.

Here is what the anova table looks like without using the Design package:

    > anova(Fit1)
    Analysis of Deviance Table
     Cox model: response is Surv(Time, cancer)
    Terms added sequentially (first to last)

             Df Deviance Resid. Df Resid. Dev
    NULL                     16783     5341.8
    relativ   1      0.0     16782    14995.0
    hormone   3    939.4     16779    14055.6
    . . .

As you see, no p-values are reported.

Here is how it looks after using Design:

    > anova(Fit1)
    Wald Statistics          Response: Surv(Time, cancer)

     Factor   Chi-Square d.f. P
     relativ  6.08       1    0.0137
     hormone  8.68       3    0.0339
    . . .

Regards,
Kare

On Fri, 2008-04-18 at 11:03 -0500, Frank E Harrell Jr wrote:
> Paul Johnson wrote:
>> On Fri, Apr 18, 2008 at 3:06 AM, Kåre Edvardsen [EMAIL PROTECTED] wrote:
>>> Hi all.
>>> If I run the simple regression when x is a categorical variable ( x <- factor(x) ):
>>>
>>>     MyFit <- coxph( Surv(start, stop, event) ~ x )
>>>
>>> how can I get the overall p-value on x, other than for each dummy variable? anova(MyFit) does NOT provide that information as previously suggested on the list.
>>
>> It should work... Here's a self-contained example showing that anova does give the desired significance test for an lm model.
>>
>>     > y <- rnorm(100)
>>     > x <- gl(5,20)
>>     > mod <- lm(y~x)
>>     > anova(mod)
>>     Analysis of Variance Table
>>
>>     Response: y
>>               Df  Sum Sq Mean Sq F value Pr(>F)
>>     x          4   6.575   1.644  1.5125 0.2047
>>     Residuals 95 103.237   1.087
>>
>> If you provide a similar self-contained example leading up to a coxph, I would be glad to investigate your question. You don't give enough information for me to tell which version of coxph you are running, and from what package.
>>
>> Suppose I guess that you are using the coxph from the package survival. If so, it appears to me there is a bug in that package at the moment. The methods anova.coxph and drop1.coxph did exist until very recently. There is a thread in r-help (which I found by typing RSiteSearch("anova.coxph")) discussing recent troubles with anova.coxph.
>>
>> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/118481.html
>>
>> As you see from the discussion in that thread, there used to be an anova method for coxph, and in the version of survival I have now, there is no such method. The version I have is 2.34-1, Date: 2008-03-31.
>>
>> Here's what I see after I run example(coxph) in order to create some coxph objects on which I can test the diagnostics:
>>
>>     > drop1(test2)
>>     Error in terms.default(terms1) : no terms component
>>     > anova(test2)
>>     Error in UseMethod("anova") : no applicable method for "anova"
>>
>> In that survival package, I do find anova.survreg, but not anova.coxph.
>>
>> If you are using the survival package, I'd suggest you contact Thomas Lumley directly, since he maintains it.
>>
>> I think if you had reported the exact error you saw, it would have been easier for me to diagnose the trouble.
>>
>> HTH
>> pj
>
> In the meantime you can do
>
>     library(Design)
>     f <- cph( . . . )
>     anova(f)   # multiple d.f. Wald statistics including tests of nonlinearity
>
> cph uses coxph, but anova.Design is separate from the survival package.
>
> Frank
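For readers without the Design package, a multiple-d.f. Wald statistic for a factor can also be computed by hand from the coefficients and their covariance matrix. The following is only a sketch under stated assumptions: the data are simulated, the factor name x is hypothetical, and it assumes a survival version in which coef() and vcov() work on coxph fits. It is not claimed to reproduce anova.Design exactly.

```r
library(survival)  # recommended package shipped with standard R installations

set.seed(1)
n <- 200
x <- gl(4, 50)                                   # hypothetical 4-level factor
time <- rexp(n, rate = exp(0.3 * (as.integer(x) - 1)))
status <- rep(1, n)                              # no censoring, for simplicity

fit <- coxph(Surv(time, status) ~ x)

# Joint Wald test of all dummy coefficients belonging to x
idx  <- grep("^x", names(coef(fit)))
b    <- coef(fit)[idx]
V    <- vcov(fit)[idx, idx]
stat <- drop(t(b) %*% solve(V) %*% b)
p    <- pchisq(stat, df = length(b), lower.tail = FALSE)
c(chisq = stat, df = length(b), p = p)
```

In a model with several terms, idx would select only the coefficients of the factor of interest, which is what gives one p-value per factor rather than one per dummy variable.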
[R] Re: matrix problem
Hi

Not sure what you want to do. You can set dimensions on your vector:

    vec <- 1:12
    dim(vec) <- c(3,4)

you can repeat your vector n times:

    vec <- rep(1:4,3)
    dim(vec) <- c(4,3)

or you can use the byrow option:

    vec <- 1:12
    matrix(vec, nrow=4, ncol=3, byrow=TRUE)

Petr Pikal
[EMAIL PROTECTED]
724008364, 581252140, 581252257

[EMAIL PROTECTED] wrote on 21.04.2008 11:54:03:
> Hi Everyone,
> I am running into a problem with matrices. I use R version 2.4.1 and an older version. The problem is this:
>
>     m <- matrix(ncol=3,nrow=4)
>     m[,1:3] <- runif(n=4)
>
> That does what I expect; it fills up the rows of the matrix with the data vector:
>
>     m
>               [,1]      [,2]      [,3]
>     [1,] 0.2083071 0.2083071 0.2083071
>     [2,] 0.5865763 0.5865763 0.5865763
>     [3,] 0.7901782 0.7901782 0.7901782
>     [4,] 0.8298317 0.8298317 0.8298317
>
> But this doesn't work:
>
>     m[1:4,] <- runif(n=3)
>     m
>                [,1]       [,2]       [,3]
>     [1,] 0.96864939 0.11656740 0.06182311
>     [2,] 0.11656740 0.06182311 0.96864939
>     [3,] 0.06182311 0.96864939 0.11656740
>     [4,] 0.96864939 0.11656740 0.06182311
>
> I want it to fill up the columns of the matrix with the data vector.
> Maybe there is a better way to do what I want. I need to do both of the above. The matrices are large, so I need a fast method.
> Thanks very much for any help.
> Bill Simpson
Re: [R] How can we predict the value of dependent variable using independent variable
Generally lm and predict.lm will solve this kind of problem, see

    ?lm
    ?predict.lm

But with your given data there is nothing to predict, since you have 6 independent variables and 6 observations. So you have a complete system of linear equations, which you can solve, see ?solve.

HTH.

man4ish wrote:
> Hi,
> I am trying to predict the value of a dependent variable from the independent variables using R. For example, y is dependent and x1, x2, x3, ..., xn are independent variables, so how can I predict the value of y using x1, x2, x3, ..., xn?
>
> y x1 x2 x3 x4 x5 x6 6 0 1 231 2 5 1 34 56 8 8 4 6 9 00 1 3 5 72 1 0 3 4 5 67 82 1 1 0 111 2 2
>
> y (predicted) values = ?
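A minimal sketch of the lm/predict workflow mentioned above, on simulated data with more observations than predictors. The variable names, the coefficients and the noise level are all assumptions made for illustration and are unrelated to the poster's data.

```r
set.seed(123)
n <- 50
d <- data.frame(x1 = runif(n), x2 = runif(n))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(n, sd = 0.1)   # simulated, hypothetical model

fit <- lm(y ~ x1 + x2, data = d)

# Predict y for new values of the independent variables
new_x <- data.frame(x1 = c(0.2, 0.8), x2 = c(0.5, 0.1))
pred <- predict(fit, newdata = new_x)
pred
```

The newdata frame must use the same column names as the predictors in the model formula; predict() then returns one fitted value per row.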
[R] another matrix question
Hi everyone,

I would like to do the following. Given matrix m and matrix n, I would like to compute

    mn[i,j] = m[i,j] + n[i,j]

if either of these elements is 0. (In other words, whichever number is nonzero.) Else I want

    mn[i,j] = (m[i,j] + n[i,j])/2

I need a fast method.

Thanks very much for any help.
Bill Simpson
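One vectorised way to express this rule, offered as a sketch rather than the only answer: divide the element-wise sum by 2 only where both entries are nonzero. The two small matrices are hypothetical test inputs.

```r
m <- matrix(c(0, 2, 4, 0), nrow = 2)
n <- matrix(c(3, 0, 6, 0), nrow = 2)

# TRUE where both entries are nonzero; used as 0/1 in the arithmetic below
both <- (m != 0) & (n != 0)

# Divisor is 2 where both are nonzero, 1 otherwise
mn <- (m + n) / (1 + both)
mn
```

Where both entries are zero the result is 0, which seems the natural reading of the rule. No loop over elements is involved, so this should scale to large matrices.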
[R] means and variances of several groups in the matrix
Dear R users,

I have 32 observations in data x. After sorting them, I want to compute the means and variances of 3 groups whose sizes are given by nr. Actually, the number of groups is flexible. Any suggestion will be greatly appreciated.

Kathryn Lord

---

    x=rnorm(32)
    y=sort(x)
    nr=matrix(c(12,11,10,10,10,11),2,3)

    > nr
         [,1] [,2] [,3]
    [1,]   12   10   10    (sum = 32)
    [2,]   11   10   11    (sum = 32)

For the 1st row in nr, the index of y = (1,...,12, 13,...,22, 23,...,32). I want to compute means and variances for 3 groups (1st group is 1 through 12; 2nd group is 13-22; 3rd group is 23-32).

For the 2nd row in nr, the index of y = (1,...,11, 12,...,21, 22,...,32). Again, I want to compute means and variances for 3 groups (1st group is 1 through 11; 2nd group is 12-21; 3rd group is 22-32).
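A possible approach, sketched under the assumption that each row of nr lists consecutive group sizes that sum to length(y): build a grouping vector with rep() and summarise with tapply(). The helper name group_stats is made up for this example.

```r
set.seed(1)
y <- sort(rnorm(32))
nr <- matrix(c(12, 11, 10, 10, 10, 11), 2, 3)

# Means and variances of consecutive blocks of y with the given sizes
group_stats <- function(sizes, y) {
  g <- rep(seq_along(sizes), times = sizes)   # group label for each element
  rbind(mean = tapply(y, g, mean), var = tapply(y, g, var))
}

# One 2 x (number of groups) summary per row of nr
res <- lapply(seq_len(nrow(nr)), function(i) group_stats(nr[i, ], y))
res[[1]]
```

Because the grouping vector is derived from the sizes, the same code works for any number of groups per row.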
[R] graphics history
dear useRs and developeRs,

I am afraid it is a very basic question, but I did not find anything like it in the literature. The R standard graphics device offers the option of activating a history of the plots drawn within the current session. The user can scroll back and see the last graphs (or the same graph with some changes in parameters). I have not yet found out how to activate the history by code. Any ideas?

Thanks and best regards,
Norbert

--
Mag. Norbert Neuwirth
Österreichisches Institut für Familienforschung (ÖIF) - Universität Wien
Austrian Institute for Family Studies - University of Vienna
http://www.oif.ac.at
e-mail: [EMAIL PROTECTED]
tel: +43-1-4277-489-11
fax: +43-1-4277-9-489
address: A-1010 Wien, Grillparzerstraße 7/9
Re: [R] Equivalent of intervals() in lmer
To help Kedar a bit, here is one way:

    recall <- c(10, 13, 13, 6, 8, 8, 11, 14, 14, 22, 23, 25, 16, 18, 20,
                15, 17, 17, 1, 1, 4, 12, 15, 17, 9, 12, 12, 8, 9, 12)
    fr <- data.frame(rcl = recall, time = factor(rep(c(1, 2, 5), 10)),
                     subj = factor(rep(1:10, each = 3)))
    (fr.lmer <- lmer(rcl ~ time + (1 | subj), fr))
    require(gmodels)
    ci(fr.lmer)

Now I have a problem to which I would very much appreciate having a solution. The model fr.lmer gives an SE of 1.8793 for the (Intercept) and 0.3507 for the other levels. The reason is that the first takes account of the variability of the effect of subjects. Or, using simulation:

                 Estimate CI lower  CI upper Std. Error p-value
    (Intercept) 11.107202 6.458765 15.208065  2.1587362   0.004
    time2        2.012064 1.301701  2.795128  0.3743050   0.000
    time5        3.206834 2.502870  3.939791  0.3694384   0.000

Now if I need to draw CI bars around the three means, it seems to me that they should be roughly 11, 13, and 14.2, each \pm 0.75, because I'm trying to estimate the variability of patterns within subjects, and am not interested in the subject-to-subject variation in the mean for the purposes of prediction. This is what the authors of the paper cited below call, on p. 402, a narrow [as opposed to a broad] inference space.

My question: ***How do I extract the three narrow CIs from the lmer?***

@ARTICLE{BlouinRiopelle2005,
  author = {Blouin, David C. and Riopelle, Arthur J.},
  title = {On confidence intervals for within-subjects designs},
  journal = {Psychological Methods},
  year = {2005},
  volume = {10},
  pages = {397--412},
  number = {4},
  month = dec,
  abstract = {Confidence intervals (CIs) for means are frequently advocated as alternatives to null hypothesis significance testing (NHST), for which a common theme in the debate is that conclusions from CIs and NHST should be mutually consistent. The authors examined a class of CIs for which the conclusions are said to be inconsistent with NHST in within-subjects designs and a class for which the conclusions are said to be consistent. The difference between them is a difference in models. In particular, the main issue is that the class for which the conclusions are said to be consistent derives from fixed-effects models with subjects fixed, not mixed models with subjects random. Offered is mixed model methodology that has been popularized in the statistical literature and statistical software procedures. Generalizations to different classes of within-subjects designs are explored, and comments on the future direction of the debate on NHST are offered.},
  url = {http://search.epnet.com/login.aspx?direct=truedb=pdhan=met104397}
}

_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O. Box 400400, Charlottesville, VA 22904-4400
Parcels: Room 102, Gilmer Hall, McCormick Road, Charlottesville, VA 22903
Office: B011  +1-434-982-4729
Lab: B019  +1-434-982-4751
Fax: +1-434-982-4766
WWW: http://www.people.virginia.edu/~mk9y/

On Apr 21, 2008, at 2:24 AM, Dieter Menne wrote:
> kedar nadkarni <nadkarnikedar at gmail.com> writes:
>> I have been trying to obtain confidence intervals for the fit after having used lmer by using intervals(), but this does not work. intervals() is associated with lme but not with lmer(). What is the equivalent of intervals() in lmer()?
>
> ci in Gregory Warnes' package gmodels can do this. However, think twice if you really need lmer. Why not lme? It is well documented and has many features that are currently not in lmer.
>
> Dieter
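One possible reading of the "narrow inference space", sketched in base R only: treat subjects as fixed effects, so that subject-to-subject variation is removed from the error term, and build intervals for the time means from the residual mean square. This is an assumption about what is wanted, not the method of the cited paper and not an lmer-based answer. The names fit_fixed, mse and half_width are introduced here for illustration.

```r
recall <- c(10, 13, 13, 6, 8, 8, 11, 14, 14, 22, 23, 25, 16, 18, 20,
            15, 17, 17, 1, 1, 4, 12, 15, 17, 9, 12, 12, 8, 9, 12)
fr <- data.frame(rcl = recall,
                 time = factor(rep(c(1, 2, 5), 10)),
                 subj = factor(rep(1:10, each = 3)))

# Subjects as fixed effects: between-subject variation leaves the error term
fit_fixed <- lm(rcl ~ time + subj, data = fr)

time_means <- tapply(fr$rcl, fr$time, mean)
mse <- sum(residuals(fit_fixed)^2) / df.residual(fit_fixed)
half_width <- qt(0.975, df.residual(fit_fixed)) * sqrt(mse / nlevels(fr$subj))

cbind(mean  = time_means,
      lower = time_means - half_width,
      upper = time_means + half_width)
```

With these data the half-width for a single mean comes out near 0.52. Using the standard error of a difference between two time levels (the 0.3507 reported by the model) with the same t quantile gives about 0.74 instead.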
[R] [R-pkgs] Upload HardyWeinberg package (1.1)
Hi all,

I've uploaded to CRAN a new version of the HardyWeinberg package. This package has routines for performing graphical significance tests (based on the ternary plot) for Hardy-Weinberg equilibrium of bi-allelic marker data.

Jan.

--
Jan Graffelman
Dpt. of Statistics and Operations Research
Universitat Politecnica de Catalunya
Av. Diagonal 647, 6th floor
08028 Barcelona, Spain
tel: +34-93-4011739
fax: +34-93-4016575
email: [EMAIL PROTECTED]
www: http://www-eio.upc.es/~jan/
Re: [R] Equivalent of intervals() in lmer
On 4/21/08, Michael Kubovy [EMAIL PROTECTED] wrote:
> To help Kedar a bit, here is one way:
>
>     recall <- c(10, 13, 13, 6, 8, 8, 11, 14, 14, 22, 23, 25, 16, 18, 20,
>                 15, 17, 17, 1, 1, 4, 12, 15, 17, 9, 12, 12, 8, 9, 12)
>     fr <- data.frame(rcl = recall, time = factor(rep(c(1, 2, 5), 10)),
>                      subj = factor(rep(1:10, each = 3)))
>     (fr.lmer <- lmer(rcl ~ time + (1 | subj), fr))
>     require(gmodels)
>     ci(fr.lmer)
>
> Now I have a problem to which I would very much appreciate having a solution. The model fr.lmer gives an SE of 1.8793 for the (Intercept) and 0.3507 for the other levels. The reason is that the first takes account of the variability of the effect of subjects. Or, using simulation:
>
>                  Estimate CI lower  CI upper Std. Error p-value
>     (Intercept) 11.107202 6.458765 15.208065  2.1587362   0.004
>     time2        2.012064 1.301701  2.795128  0.3743050   0.000
>     time5        3.206834 2.502870  3.939791  0.3694384   0.000
>
> Now if I need to draw CI bars around the three means, it seems to me that they should be roughly 11, 13, and 14.2, each \pm 0.75, because I'm trying to estimate the variability of patterns within subjects, and am not interested in the subject-to-subject variation in the mean for the purposes of prediction.

If you want to examine the three means then you should fit the model as

    lmer(rcl ~ time - 1 + (1 | subj), fr)

> This is what the authors of the paper cited below call, on p. 402, a narrow [as opposed to a broad] inference space.
>
> My question: ***How do I extract the three narrow CIs from the lmer?***
>
> @ARTICLE{BlouinRiopelle2005,
>   author = {Blouin, David C. and Riopelle, Arthur J.},
>   title = {On confidence intervals for within-subjects designs},
>   journal = {Psychological Methods},
>   year = {2005},
>   volume = {10},
>   pages = {397--412},
>   number = {4},
>   month = dec,
>   abstract = {Confidence intervals (CIs) for means are frequently advocated as alternatives to null hypothesis significance testing (NHST), for which a common theme in the debate is that conclusions from CIs and NHST should be mutually consistent. The authors examined a class of CIs for which the conclusions are said to be inconsistent with NHST in within-subjects designs and a class for which the conclusions are said to be consistent. The difference between them is a difference in models. In particular, the main issue is that the class for which the conclusions are said to be consistent derives from fixed-effects models with subjects fixed, not mixed models with subjects random. Offered is mixed model methodology that has been popularized in the statistical literature and statistical software procedures. Generalizations to different classes of within-subjects designs are explored, and comments on the future direction of the debate on NHST are offered.},
>   url = {http://search.epnet.com/login.aspx?direct=truedb=pdhan=met104397}
> }
>
> _
> Professor Michael Kubovy
> University of Virginia
> Department of Psychology
> USPS: P.O. Box 400400, Charlottesville, VA 22904-4400
> Parcels: Room 102, Gilmer Hall, McCormick Road, Charlottesville, VA 22903
> Office: B011  +1-434-982-4729
> Lab: B019  +1-434-982-4751
> Fax: +1-434-982-4766
> WWW: http://www.people.virginia.edu/~mk9y/
>
> On Apr 21, 2008, at 2:24 AM, Dieter Menne wrote:
>> kedar nadkarni <nadkarnikedar at gmail.com> writes:
>>> I have been trying to obtain confidence intervals for the fit after having used lmer by using intervals(), but this does not work. intervals() is associated with lme but not with lmer(). What is the equivalent of intervals() in lmer()?
>>
>> ci in Gregory Warnes' package gmodels can do this. However, think twice if you really need lmer. Why not lme? It is well documented and has many features that are currently not in lmer.
>>
>> Dieter
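The effect of dropping the intercept with "- 1" can be seen without lme4. In this sketch an ordinary lm stands in for the mixed model, purely to show how the parameterisation changes: with "time - 1" the coefficients are the three time means themselves rather than a baseline plus two differences. The lmer fit suggested above has a random subject term that this stand-in ignores.

```r
recall <- c(10, 13, 13, 6, 8, 8, 11, 14, 14, 22, 23, 25, 16, 18, 20,
            15, 17, 17, 1, 1, 4, 12, 15, 17, 9, 12, 12, 8, 9, 12)
fr <- data.frame(rcl = recall,
                 time = factor(rep(c(1, 2, 5), 10)),
                 subj = factor(rep(1:10, each = 3)))

# Default parameterisation: intercept (time 1) plus differences from it
coef(lm(rcl ~ time, data = fr))

# Cell-means parameterisation: one coefficient per time level
cell_means <- coef(lm(rcl ~ time - 1, data = fr))
cell_means
```

Confidence intervals reported for the coefficients of the second form then refer to the means directly, which is the point of the suggested refit.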
[R] Creation of dialog box
Dear list members,

My student is creating some functions to implement median polish kriging (a prediction method in geostatistics). She wants to create a dialog box (to input some data) and a menu. For this she is using the winMenuAddItem and winDialogString commands in a function. But winDialogString provides only one string field to fill in. Which command should she use to create a dialog box with several string fields?

Regards,
Ingrida
[R] Symbolic Integration in R
This may be a question for R-devel, but I'm not sure. Symbolic differentiation is implemented in R (maybe not for extremely complex expressions), which proves that it can be done. I know that it can be done in C++ (SymbolicC++). Do you think symbolic integration can be programmed in R using just the R language, without resorting to external sources? Does anyone know, or can anyone estimate, the amount of resources needed to incorporate symbolic integration capabilities (of course, R being open source, this would be a completely benevolent act on the part of the experts)? The idea is appealing, since it would contribute to the completeness of R as a self-contained mathematical computing system (like Mathematica and others).
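For reference, the symbolic differentiation mentioned above is available in base R through D(). A short sketch with an arbitrary example function, including a finite-difference check of the result:

```r
f  <- expression(sin(x)^2 * exp(-x))
df <- D(f[[1]], "x")      # symbolic derivative, returned as an unevaluated call
df

# Numerical check of the symbolic result at x = 0.5
x <- 0.5
h <- 1e-6
numeric_slope <- (eval(f[[1]], list(x = x + h)) -
                  eval(f[[1]], list(x = x - h))) / (2 * h)
c(symbolic = eval(df), numeric = numeric_slope)
```

Base R also provides integrate() for numerical quadrature, but nothing that returns a symbolic antiderivative.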
[R] optFederov/AlgDesign - help avail?
Hello,

We need to generate optimal (fractional) designs for discrete choice applications, where we will be using logistic regression or multinomial logit as the modeling technique. It looks like optFederov, in the AlgDesign package, may work, but we are not sure whether this algorithm works when the variable of interest is binary or nominal.

Is anyone here an expert in this area, or interested in consulting with us on this topic (if so, email me and we can arrange it)? Or can anyone confirm or deny that optFederov can work in the discrete case?

Thanks!
Re: [R] change column names of several data frames
Henrik Parn <henrik.parn at bio.ntnu.no> writes:
> Dear all,
> I have several data frames for which I want to change the column names.
>
> Example data:
>
>     data.1 <- data.frame(x1 = rnorm(5))
>     data.2 <- data.frame(x1 = rnorm(5))

Use lists. I.e.:

    data <- list()
    data[[1]] <- data.frame(x1 = rnorm(5))
    data[[2]] <- data.frame(x1 = rnorm(5))

> . .
> What I want to achieve:
>
>     names(data.1) <- "y1"
>     names(data.2) <- "y1"
> . .
> Is it possible to achieve this with a loop or any of the apply functions?
>
> Some (out of several...) unsuccessful attempts using for-loops instead:
>
>     for(i in 1:2) names(get(paste("data", i, sep = "."))) <- "y1"
>     for(i in 1:2) assign(paste("data", i, sep="."), names(get(paste("natal", i, sep = "."))) <- "y1")
>
> Thanks in advance!
> / Henrik Pärn

--
Antonio, Fabio Di Narzo
Ph.D. student at Department of Statistical Sciences
University of Bologna, Italy
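Once the data frames live in a list, the renaming becomes a single lapply() call. A minimal sketch; the list name data and the new column name "y1" follow the thread, the rest is illustrative.

```r
set.seed(1)
data <- list(data.frame(x1 = rnorm(5)), data.frame(x1 = rnorm(5)))

# Rename the column(s) of every data frame in one pass
data <- lapply(data, function(d) { names(d) <- "y1"; d })

sapply(data, names)
```

The anonymous function must return the modified data frame, which is why it ends with d.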
Re: [R] graphics history
On Mon, 21 Apr 2008, Duncan Murdoch wrote:

> On 21/04/2008 4:59 AM, Norbert NEUWIRTH wrote:
>> dear useRs and developeRs,
>> I am afraid it is a very basic question, but I did not find anything like it in the literature. The R standard graphics device offers the option of activating a history of the plots drawn within the current session. The user can scroll back and see the last graphs (or the same graph with some changes in parameters). I have not yet found out how to activate the history by code. Any ideas?
>
> When you open the window, use windows(record=TRUE). If you want this to happen by default, write your own wrapper for the windows() graphics device:
>
>     windows <- function(..., record=TRUE) grDevices::windows(..., record=record)

Or in 2.7.0, windows.options(record=TRUE) sets this for the session. You can set this in a .Rprofile file by (untested, I am not on Windows)

    setHook(packageEvent("grDevices", "onLoad"),
            function(...) grDevices::windows.options(record=TRUE))

> One thing I'd like to do, but didn't have time to implement before 2.7.0, is to have the history set to some finite size, e.g. a default might be the last 3 or 10 plots. The problem with record=TRUE is that it keeps a record of all the plots, so memory use just increases and increases.

Why not just start up another device with record=FALSE?

> Duncan Murdoch

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
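The windows(record=TRUE) route discussed in this thread is specific to the Windows graphics device. As a different, portable technique (swapped in here for illustration, not what the thread proposes), plots can be kept by code with recordPlot() and redrawn with replayPlot(). The sketch uses an off-screen pdf device so it runs anywhere; the list name history is made up.

```r
pdf(NULL)                                # off-screen device, no file written
dev.control(displaylist = "enable")      # needed so plots can be recorded

history <- list()
plot(1:10, main = "first");  history[[1]] <- recordPlot()
plot(10:1, main = "second"); history[[2]] <- recordPlot()

replayPlot(history[[1]])                 # redraw the first plot on the current device
invisible(dev.off())
```

Keeping only the last few elements of such a list would bound memory use, which is the concern raised about record=TRUE.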
Re: [R] Symbolic Integration in R
yacas has symbolic integration and the Ryacas package interfaces to it, although that portion of yacas is not very mature. There are examples in the vignette:

    library(Ryacas)
    vignette("Ryacas")

On Mon, Apr 21, 2008 at 7:09 AM, francogrex [EMAIL PROTECTED] wrote:
> This may be a question for R-devel, but I'm not sure. Symbolic differentiation is implemented in R (maybe not for extremely complex expressions), which proves that it can be done. I know that it can be done in C++ (SymbolicC++). Do you think symbolic integration can be programmed in R using just the R language, without resorting to external sources? Does anyone know, or can anyone estimate, the amount of resources needed to incorporate symbolic integration capabilities (of course, R being open source, this would be a completely benevolent act on the part of the experts)? The idea is appealing, since it would contribute to the completeness of R as a self-contained mathematical computing system (like Mathematica and others).
Re: [R] graphics history
On 21/04/2008 4:59 AM, Norbert NEUWIRTH wrote:
> dear useRs and developeRs,
> I am afraid it is a very basic question, but I did not find anything like it in the literature. The R standard graphics device offers the option of activating a history of the plots drawn within the current session. The user can scroll back and see the last graphs (or the same graph with some changes in parameters). I have not yet found out how to activate the history by code. Any ideas?

When you open the window, use windows(record=TRUE). If you want this to happen by default, write your own wrapper for the windows() graphics device:

    windows <- function(..., record=TRUE) grDevices::windows(..., record=record)

One thing I'd like to do, but didn't have time to implement before 2.7.0, is to have the history set to some finite size, e.g. a default might be the last 3 or 10 plots. The problem with record=TRUE is that it keeps a record of all the plots, so memory use just increases and increases.

Duncan Murdoch
Re: [R] change column names of several data frames
Hi Henrik, afaIcs this should work:
    for (v in sprintf("data.%d", 1:n)) { f = get(v); names(f) = whatever; assign(v, f) }
-- Best wishes Wolfgang -- Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber
On 21/04/2008 13:10, Antonio, Fabio Di Narzo wrote: Henrik Parn henrik.parn at bio.ntnu.no writes: Dear all, I have several data frames for which I want to change the column names. Example data:
    data.1 <- data.frame(x1 = rnorm(5))
    data.2 <- data.frame(x1 = rnorm(5))
Use lists. I.e.:
    data <- list()
    data[[1]] <- data.frame(x1 = rnorm(5))
    data[[2]] <- data.frame(x1 = rnorm(5))
    . .
What I want to achieve:
    names(data.1) <- "y1"
    names(data.2) <- "y1"
    . .
Is it possible to achieve this with a loop or any of the apply functions? Some (out of several...) unsuccessful attempts using for-loops instead:
    for(i in 1:2) names(get(paste("data", i, sep = "."))) <- "y1"
    for(i in 1:2) assign(paste("data", i, sep="."), names(get(paste("natal", i, sep = "."))) <- "y1")
Thanks in advance! / Henrik Pärn
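Both suggestions in this thread (the get()/assign() loop and the "use lists" advice) can be sketched as follows. The object names data.1 to data.3 and the new column names are made up for illustration.

```r
# Hypothetical example data: three one-column data frames data.1 .. data.3
n <- 3
for (i in seq_len(n)) {
  assign(paste("data", i, sep = "."), data.frame(x1 = rnorm(5)))
}

# Approach 1: get()/assign() loop over the object names
for (v in sprintf("data.%d", 1:n)) {
  f <- get(v)          # fetch the data frame by name
  names(f) <- "y1"     # rename its (single) column
  assign(v, f)         # write the modified copy back under the same name
}

# Approach 2: collect the data frames in a list and rename them in one pass
data_list <- mget(sprintf("data.%d", 1:n))
data_list <- lapply(data_list, function(d) {
  names(d) <- "z1"
  d
})

names(data.1)
sapply(data_list, names)
```

The attempts in the question fail because names(get(...)) <- value tries to assign into the result of a function call rather than into a named object; fetching, modifying and re-assigning avoids that.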
[R] ANCOVA
R version 2.6.2, PowerBook G4. Hello R users, I am trying to perform an ANCOVA using the glm function. I have a dataset with continuous and categorical explanatory variables, and my response variable is binary categorical. I get:
    Error: NA/NaN/Inf in foreign function call (arg 4)
    In addition: Warning messages:
    1: In Ops.factor(y, mu) : - not meaningful for factors
    2: In Ops.factor(eta, offset) : - not meaningful for factors
    3: In Ops.factor(y, mu) : - not meaningful for factors
My dataset contains NAs, but if I try to use na.exclude I get the same error message. I thought the function should work with my dataset. What am I doing wrong? Thanks in advance for your help. Birgit
Birgit Lemcke Institut für Systematische Botanik Zollikerstrasse 107 CH-8008 Zürich Switzerland Ph: +41 (0)44 634 8351 [EMAIL PROTECTED] 175 years of UZH: «staunen.erleben.begreifen. Naturwissenschaft zum Anfassen.» (marvel, experience, understand: hands-on science). MNF anniversary event for young and old, 19 April 2008, 10:00 to 02:00, Campus Irchel, Winterthurerstrasse 190, 8057 Zürich. Further information: http://www.175jahre.uzh.ch/naturwissenschaft
Re: [R] another matrix question
Hi, Given matrix m and matrix n, I would like to compute mn[i,j] = m[i,j] + n[i,j] if either of these elements is 0. (In other words, whichever number is nonzero.) Else I want mn[i,j] = (m[i,j] + n[i,j])/2. I need a fast method.
    m <- matrix(c(0,1,2,3,4,0,5,6,0),nrow=3,ncol=3)
    n <- matrix(c(1,2,0,3,0,4,0,5,6),nrow=3,ncol=3)
    mn <- ifelse(m==0 | n==0, m+n, (m+n)/2)
Hope that helps, -- Julien Barnier Groupe de recherche sur la socialisation ENS-LSH - Lyon, France
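The ifelse() answer can be checked on the example matrices, together with a purely arithmetic variant that divides by 2 only where both entries are nonzero. The arithmetic form is an alternative not proposed in the thread; whether it is actually faster on large matrices would need timing.

```r
m <- matrix(c(0, 1, 2, 3, 4, 0, 5, 6, 0), nrow = 3, ncol = 3)
n <- matrix(c(1, 2, 0, 3, 0, 4, 0, 5, 6), nrow = 3, ncol = 3)

# Version from the reply: sum where either entry is 0, mean otherwise
mn_ifelse <- ifelse(m == 0 | n == 0, m + n, (m + n) / 2)

# Arithmetic alternative: the divisor is 2 where both are nonzero, else 1
mn_arith <- (m + n) / (1 + (m != 0 & n != 0))

mn_ifelse
all.equal(mn_ifelse, mn_arith)
```

Note that where both entries are 0 the result is 0 under either version.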
[R] Trend test for survival data
Hello, is there an R package that provides a log-rank trend test for survival data in >=3 treatment groups? Or are there any comparable trend tests for survival data in R? Thanks a lot Markus -- Dipl. Inf. Markus Kreuz Universitaet Leipzig Institut fuer medizinische Informatik, Statistik und Epidemiologie (IMISE) Haertelstr. 16-18 D-04107 Leipzig Tel. +49 341 97 16 276 Fax. +49 341 97 16 109 email: [EMAIL PROTECTED]
[R] Deleting rows with missing data
Hi folks: I have a data set with variables v1, v2, ... v10. Can anyone tell me how to create a new data set in which the entire row is deleted if, say, v5 is missing (NA) on that row? Thanks, Charles
Re: [R] ANCOVA
Dear Birgit, My guess is that you forgot to specify the argument family=binomial in the call to glm(). Had you included the commands that you used as well as the error that was produced, it wouldn't be necessary to guess. I hope this helps, John
On Mon, 21 Apr 2008 14:23:13 +0200 Birgit Lemcke [EMAIL PROTECTED] wrote: R version 2.6.2, PowerBook G4. Hello R users, I am trying to perform an ANCOVA using the glm function. I have a dataset with continuous and categorical explanatory variables, and my response variable is binary categorical. I get:
    Error: NA/NaN/Inf in foreign function call (arg 4)
    In addition: Warning messages:
    1: In Ops.factor(y, mu) : - not meaningful for factors
    2: In Ops.factor(eta, offset) : - not meaningful for factors
    3: In Ops.factor(y, mu) : - not meaningful for factors
My dataset contains NAs, but if I try to use na.exclude I get the same error message. I thought the function should work with my dataset. What am I doing wrong? Thanks in advance for your help. Birgit
Birgit Lemcke Institut für Systematische Botanik Zollikerstrasse 107 CH-8008 Zürich Switzerland Ph: +41 (0)44 634 8351 [EMAIL PROTECTED]
John Fox, Professor Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/
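A minimal sketch of the suggested fix, on simulated data (all variable names and the simulation itself are assumptions for illustration, not Birgit's data): a factor response with a continuous and a categorical predictor, fitted once without and once with family = binomial.

```r
set.seed(42)
n <- 200
d <- data.frame(
  x = rnorm(n),                                          # continuous covariate
  g = factor(sample(c("a", "b", "c"), n, replace = TRUE)) # categorical covariate
)
# Binary factor response whose probability depends on x
d$y <- factor(ifelse(runif(n) < plogis(0.5 + 1.2 * d$x), "yes", "no"))
d$x[sample(n, 10)] <- NA                                 # a few missing values

# Without a family argument glm() assumes gaussian errors; with a factor
# response this is expected to fail, as reported in the thread.
gaussian_fit <- try(suppressWarnings(glm(y ~ x + g, data = d)), silent = TRUE)
inherits(gaussian_fit, "try-error")

# With family = binomial the model is a logistic regression
fit <- glm(y ~ x + g, data = d, family = binomial, na.action = na.exclude)
summary(fit)$coefficients
```

With na.exclude, fitted(fit) and residuals(fit) are padded with NA so that they line up with the rows of the original data.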
Re: [R] How to insert a vector or matrix into an existing matrix
Gabor Csardi [EMAIL PROTECTED] wrote in news:[EMAIL PROTECTED]: On Sun, Apr 20, 2008 at 08:16:11PM +, David Winsemius wrote: Gabor Csardi [EMAIL PROTECTED] wrote in news:[EMAIL PROTECTED]: Hmm, my understanding is different,
    m <- matrix(sample(10*10), ncol=10)
    m2 <- rbind( m[1:5,], 1:10, m[6:10,] )
    m3 <- cbind( m[,1:8], 1:10, m[,9:10] )
I read the question the same way and, in response to the part of the question asking for no temporary matrix, offer this refinement on your suggestion:
    m <- rbind( m[1:5,], 1:10, m[6:10,] )   # row insertion
or (not followed by, but rather instead) column insertion:
    m <- cbind( m[,1:8], 1:10, m[,9:10] )
There might be something wrong with my eyes, but where is the refinement here? Your lines are literally the same as mine. There is no temporary matrix here; m2 and m3 are the results. He wanted the insertion either between rows 5 and 6 _OR_ between columns 8 and 9. Oh, if you mean that we immediately put the result back into 'm', then 1) it does not really matter, R will create a temporary matrix internally anyway,
Am I correct in assuming that after the creation of m by way of a temporary matrix, the temporary matrix would then be available for garbage collection, whereas if both m and m2 were created, more memory would be occupied by the two objects? -- David Winsemius
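The rbind()/cbind() idiom discussed above, as a small self-contained sketch. The helper insert_row() is hypothetical (it is not from the thread or from any package); it only wraps the same idiom so that the insertion position becomes a parameter.

```r
m <- matrix(1:100, ncol = 10)

# Insert a row between rows 5 and 6, or a column between columns 8 and 9
with_row <- rbind(m[1:5, ], 101:110, m[6:10, ])
with_col <- cbind(m[, 1:8], 101:110, m[, 9:10])

# Hypothetical helper: insert vector v as a new row after row `after`
# (after = 0 prepends, after = nrow(m) appends)
insert_row <- function(m, v, after) {
  top    <- m[seq_len(after), , drop = FALSE]
  bottom <- m[seq_len(nrow(m) - after) + after, , drop = FALSE]
  rbind(top, v, bottom, deparse.level = 0)
}

dim(with_row)
dim(with_col)
all(insert_row(m, 101:110, 5) == with_row)
```

Assigning the result back to m, as in the refinement above, lets the old matrix be garbage collected once nothing else refers to it.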
[R] Regression inclusion of variable, effect on coefficients
Hello dear R users! I know this question is not strictly R-help, yet maybe some of the statistics gurus can help me out. I have a sample of data, all from the same population. Say my regression equation is
    m1 <- lm(y ~ x1 + x2 + x3)
and I also fit
    m2 <- lm(y ~ x1 + x2 + x3 + x4)
I want to study the effect of the information in x4. I would hypothesize that the coefficient estimate for x1 goes down as I introduce x4, because x4 conveys some (but not all) of the information conveyed by x1. Of course x1 and x4 are correlated; however, multicollinearity does not appear to be a problem, as the variance inflation factors are rather low (around 1.5 or so). Basically, I want to study the interplay between x1 and x4 when x4 is introduced into the regression equation, and whether my hypothesis is correct, i.e. that once the information in x4 is taken into account, less of the variation is explained by x1. I observe that when x4 is introduced into the regression, the coefficient estimate for x1 goes down and the associated p-value becomes bigger, i.e. x1 becomes comparatively less significant. However, x4 itself is not significant. Still, the observation is in line with my theoretical argument. The question is now simple: how can I work this out? I know this is likely a dumb question, but I would really appreciate some links or help. Regards Thiemo
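One standard way to look at this question is the omitted-variable decomposition: in ordinary least squares, the x1 coefficient of the short model equals the x1 coefficient of the long model plus the x4 coefficient times the x1 coefficient from an auxiliary regression of x4 on the other predictors. The sketch below checks this on simulated data; the data-generating process and all numbers are assumptions for illustration only, not Thiemo's data.

```r
set.seed(1)
n  <- 500
x4 <- rnorm(n)
x1 <- 0.6 * x4 + rnorm(n)        # x1 and x4 are correlated by construction
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- 1 + 0.5 * x1 + 0.3 * x2 - 0.2 * x3 + 0.4 * x4 + rnorm(n)

m1  <- lm(y ~ x1 + x2 + x3)        # short model, x4 omitted
m2  <- lm(y ~ x1 + x2 + x3 + x4)   # long model
aux <- lm(x4 ~ x1 + x2 + x3)       # how x4 is "explained" by the others

delta <- coef(aux)[["x1"]]
b1_short <- coef(m1)[["x1"]]
b1_long  <- coef(m2)[["x1"]]
b4_long  <- coef(m2)[["x4"]]

# Algebraic identity of OLS: b1_short = b1_long + b4_long * delta
c(short = b1_short, reconstructed = b1_long + b4_long * delta)

# Nested-model F test for the contribution of x4
anova(m1, m2)
```

The change in the x1 coefficient is therefore entirely determined by the x4 estimate and the partial association between x1 and x4, whether or not x4 is individually significant.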
Re: [R] graphics history
On 4/21/2008 8:16 AM, Prof Brian Ripley wrote: On Mon, 21 Apr 2008, Duncan Murdoch wrote: On 21/04/2008 4:59 AM, Norbert NEUWIRTH wrote: dear useRs and developeRs, I am afraid it is a very basic question, but I did not find anything like it in the literature. The R standard graphics device offers the option to activate the history of plots drawn within the current session. The user can scroll back and see the last graphs (or the same graph with some changes in parameters). I have not yet found out how to activate the history by code. Any ideas?
When you open the window, use windows(record=TRUE). If you want this to happen by default, write your own wrapper for the windows() graphics device:
    windows <- function(..., record=TRUE) grDevices::windows(..., record=record)
Or in 2.7.0, windows.options(record=TRUE) sets this for the session. You can set this in a .Rprofile file by (untested, I am not on Windows)
    setHook(packageEvent("grDevices", "onLoad"), function(...) grDevices::windows.options(record=TRUE))
One thing I'd like to do, but didn't have time to implement before 2.7.0, is to have the history set to some finite size, e.g. a default might be the last 3 or 10 plots. The problem with record=TRUE is that it keeps a record of all the plots, so memory use just increases and increases.
Why not just start up another device with record=FALSE?
I'd like to have recording always on, but I don't need an infinite history. But this isn't urgent enough to have prodded me into writing it before now. Duncan Murdoch
Re: [R] ANCOVA
Hello John, I am really sorry about that. I wanted to include the code but I forgot, and you are completely right: I forgot the family argument. Thanks for the help. B.
On 21.04.2008 at 14:50, John Fox wrote: Dear Birgit, My guess is that you forgot to specify the argument family=binomial in the call to glm(). Had you included the commands that you used as well as the error that was produced, it wouldn't be necessary to guess. I hope this helps, John
On Mon, 21 Apr 2008 14:23:13 +0200 Birgit Lemcke [EMAIL PROTECTED] wrote: R version 2.6.2, PowerBook G4. Hello R users, I am trying to perform an ANCOVA using the glm function. I have a dataset with continuous and categorical explanatory variables, and my response variable is binary categorical. I get:
    Error: NA/NaN/Inf in foreign function call (arg 4)
    In addition: Warning messages:
    1: In Ops.factor(y, mu) : - not meaningful for factors
    2: In Ops.factor(eta, offset) : - not meaningful for factors
    3: In Ops.factor(y, mu) : - not meaningful for factors
My dataset contains NAs, but if I try to use na.exclude I get the same error message. I thought the function should work with my dataset. What am I doing wrong? Thanks in advance for your help. Birgit
John Fox, Professor Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/
Birgit Lemcke Institut für Systematische Botanik Zollikerstrasse 107 CH-8008 Zürich Switzerland Ph: +41 (0)44 634 8351 [EMAIL PROTECTED]
Re: [R] Deleting rows with missing data
?na.omit Weidong Gu, Department of Medicine University of Alabama, Birmingham 1900 University Blvd., Birmingham, Alabama 35294 Email: [EMAIL PROTECTED] PH: (205)-975-9053 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Charles Vetterli Sent: Monday, April 21, 2008 7:44 AM To: R-help@r-project.org Subject: [R] Deleting rows with missing data Hi folks: I have a data set v1, v2, ... v10. Can anyone tell me how to create a new data set where the entire row is deleted if, say, v5, is missing, is NA on that row? Thanks, Charles
Re: [R] How to insert a vector or matrix into an existing matrix
On Mon, Apr 21, 2008 at 12:50:08PM +, David Winsemius wrote: [...] Am I correct in assuming that after the creation of m by way of a temporary matrix, the temporary matrix would then be available for garbage collection, whereas if both m and m2 were created, more memory would be occupied by the two objects?
Of course, yes. We just had a different interpretation of "without a temporary matrix". G.
-- David Winsemius
-- Csardi Gabor [EMAIL PROTECTED] UNIL DGM
Re: [R] Deleting rows with missing data
?complete.cases On Mon, Apr 21, 2008 at 8:43 AM, Charles Vetterli [EMAIL PROTECTED] wrote: Hi folks: I have a data set v1, v2, ... v10. Can anyone tell me how to create a new data set where the entire row is deleted if, say, v5, is missing, is NA on that row? Thanks, Charles -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
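The two pointers given in this thread (?na.omit and ?complete.cases) behave differently from "drop the row only if v5 is NA", so a small sketch may help. The data frame below is made up for illustration.

```r
# Hypothetical data with missing values in different columns
d <- data.frame(
  v1  = c(1, NA, 3, 4),
  v5  = c(10, 20, NA, 40),
  v10 = c("a", "b", "c", NA)
)

# Drop rows only where v5 is missing (what the question asks for)
only_v5 <- d[!is.na(d$v5), ]

# Drop rows with a missing value in ANY column: two equivalent selections
any_na  <- na.omit(d)
same    <- d[complete.cases(d), ]

# Drop rows with a missing value in a chosen subset of columns
by_cols <- d[complete.cases(d[, c("v1", "v5")]), ]

nrow(only_v5); nrow(any_na); nrow(by_cols)
```

na.omit() additionally records the dropped rows in an "na.action" attribute, which is why the comparison above is on the selected rows rather than with identical().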
Re: [R] Equivalent of intervals() in lmer
Douglas Bates bates at stat.wisc.edu writes: If you want to examine the three means then you should fit the model as lmer(rcl ~ time - 1 + (1 | subj), fr)
True, but for the notorious error bars in plots that reviewers always request, the 0.35 is probably more relevant than the 1.87. I think that is justified in this case, but in most non-orthogonal designs with three or more factors, where we have a mixture of between- and within-subject factors, there is no clear solution. What should one do when required to produce error bars that reasonably mirror p-values? It's easier with British journals in the medical field, which often have statistical professionals as reviewers, but many American journals with their amateur physician/statisticians (why no t-test on raw data?) drive me nuts. Dieter
    #-
    library(lme4)
    recall <- c(10, 13, 13, 6, 8, 8, 11, 14, 14, 22, 23, 25, 16, 18, 20, 15, 17, 17, 1, 1, 4, 12, 15, 17, 9, 12, 12, 8, 9, 12)
    fr <- data.frame(rcl = recall, time = factor(rep(c(1, 2, 5), 10)), subj = factor(rep(1:10, each = 3)))
    fr.lmer <- lmer(rcl ~ time - 1 + (1 | subj), fr)
    summary(fr.lmer)
    fr.lmer <- lmer(rcl ~ time + (1 | subj), fr)
    summary(fr.lmer)
    --
    Fixed effects:
          Estimate Std. Error t value
    time1   11.000      1.879   5.853
    time2   13.000      1.879   6.918
    time5   14.200      1.879   7.556

    Fixed effects:
                Estimate Std. Error t value
    (Intercept)  11.0000     1.8793   5.853
    time2         2.0000     0.3507   5.703
    time5         3.2000     0.3507   9.125
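One possible way to get intervals for the three time means that reflect only within-subject variability is to treat subjects as fixed rather than random, which is the class of intervals the Blouin and Riopelle abstract cited in this thread associates with fixed-effects models. The sketch below is an assumption about what such a "narrow" interval could look like in base R; it is not the lmer-based method asked for, and it is not taken from that paper.

```r
recall <- c(10, 13, 13, 6, 8, 8, 11, 14, 14, 22, 23, 25, 16, 18, 20,
            15, 17, 17, 1, 1, 4, 12, 15, 17, 9, 12, 12, 8, 9, 12)
fr <- data.frame(rcl  = recall,
                 time = factor(rep(c(1, 2, 5), 10)),
                 subj = factor(rep(1:10, each = 3)))

# Subjects as FIXED effects with sum-to-zero contrasts, no intercept:
# the time coefficients are then the time means averaged over subjects.
fit_fixed <- lm(rcl ~ time - 1 + subj, data = fr,
                contrasts = list(subj = "contr.sum"))

time_rows <- grep("^time", names(coef(fit_fixed)))
means <- coef(fit_fixed)[time_rows]
ci    <- confint(fit_fixed)[time_rows, ]   # 95% intervals from the residual variance

cbind(mean = means, ci)
```

These intervals ignore subject-to-subject variation in overall level, so they say nothing about generalizing to new subjects; that is the trade-off being debated in the thread.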
[R] estimate of overdispersion with glm.nb
Dear R users, I am trying to fully understand the difference between estimating overdispersion with glm.nb() from MASS and with glm(..., family = quasipoisson). It seems that (i) the coefficient estimates are different and (ii) the summary() method for glm.nb suggests that the dispersion is taken to be one: "Dispersion parameter for Negative Binomial(0.9695) family taken to be 1", which is not what I expected. The code I used is pasted below:
    x <- rep(seq(0,23,by=1),50);
    s <- rep(seq(1,2,length=50*24),1);
    tmp <- cbind.data.frame(y=rnbinom(length(x),mu=8*(sin(2*pi*x/24)+2),size = 1),x=x,s=s);
    tmp.glm.qp <- glm(y~factor(x)-1,data = tmp, family=quasipoisson, offset=log(s));
    tmp.glm.nb <- glm.nb(y~factor(x)-1 +offset(log(s)),data = tmp);
On a more advanced topic, I was also hoping to compare a model with a global estimate of overdispersion with one that allows overdispersion to be estimated separately for each level of the factor x. Can I achieve that in glm, or do I need to employ a mixed-effects model? Thanks! Markus
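The two fits parameterize overdispersion differently, which may explain the summary line quoted above. In the quasi-Poisson fit the variance is phi * mu, and phi is what summary() reports as the dispersion. In the negative binomial fit the variance is mu + mu^2 / theta; the overdispersion lives in theta, and given theta the GLM dispersion is fixed at 1. The sketch below reruns a version of the posted simulation (the seed is an added assumption) and pulls out both quantities.

```r
library(MASS)   # for glm.nb()

set.seed(123)
x  <- rep(0:23, 50)
s  <- seq(1, 2, length.out = length(x))
mu <- 8 * (sin(2 * pi * x / 24) + 2)
tmp <- data.frame(y = rnbinom(length(x), mu = mu, size = 1), x = x, s = s)

fit_qp <- glm(y ~ factor(x) - 1, data = tmp, family = quasipoisson,
              offset = log(s))
fit_nb <- glm.nb(y ~ factor(x) - 1 + offset(log(s)), data = tmp)

phi_hat   <- summary(fit_qp)$dispersion  # Var(y) = phi * mu
theta_hat <- fit_nb$theta                # Var(y) = mu + mu^2 / theta

c(phi = phi_hat, theta = theta_hat, se_theta = fit_nb$SE.theta)

# Largest difference between the two sets of coefficient estimates
max(abs(coef(fit_qp) - coef(fit_nb)))
```

A plausible reason for the differing coefficients is that the two variance functions weight observations differently once the offset varies within a level of x; without the offset both fits would reproduce the group means.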
Re: [R] Equivalent of intervals() in lmer
Thanks Doug, You write: If you want to examine the three means then you should fit the model as lmer(rcl ~ time - 1 + (1 | subj), fr)
I do just that (which is what Dieter just sent). But the CIs are much too big compared to the CIs for differences between means (which should be bigger than the CIs on the means themselves). If you write the model as ~ 1 - time, then the CIs are roughly of the same (large) size. But I'm really interested in the CIs on the means that capture the variability *within* subjects. I believe that this is what experimentalists in psychology need (and they have long debated which analysis correctly produces these error bars). The theory is not about generalizing to people, but about generalizing to responses to different situations within people. The article by Blouin and Riopelle (2005) is the only one I know of that tries to do this within the framework of LMEMs, and it's couched in terms of SAS. For the moment I wonder if the solution is not to use CIs based on the two low SEs produced by the ~ 1 - time model, and to treat them as least-significant-difference intervals.
_ Professor Michael Kubovy University of Virginia Department of Psychology USPS: P.O. Box 400400, Charlottesville, VA 22904-4400 Parcels: Room 102, Gilmer Hall, McCormick Road, Charlottesville, VA 22903 Office: B011, +1-434-982-4729 Lab: B019, +1-434-982-4751 Fax: +1-434-982-4766 WWW: http://www.people.virginia.edu/~mk9y/
On Apr 21, 2008, at 7:56 AM, Douglas Bates wrote: On 4/21/08, Michael Kubovy [EMAIL PROTECTED] wrote: To help Kedar a bit: Here is one way:
    recall <- c(10, 13, 13, 6, 8, 8, 11, 14, 14, 22, 23, 25, 16, 18, 20, 15, 17, 17, 1, 1, 4, 12, 15, 17, 9, 12, 12, 8, 9, 12)
    fr <- data.frame(rcl = recall, time = factor(rep(c(1, 2, 5), 10)), subj = factor(rep(1:10, each = 3)))
    (fr.lmer <- lmer(rcl ~ time + (1 | subj), fr))
    require(gmodels)
    ci(fr.lmer)
Now I have a problem to which I would very much appreciate having a solution: The model fr.lmer gives a SE of 1.8793 for the (Intercept) and 0.3507 for the other levels. The reason is that the first takes account of the variability of the effect of subjects. Or using simulation:
                 Estimate CI lower  CI upper Std. Error p-value
    (Intercept) 11.107202 6.458765 15.208065  2.1587362   0.004
    time2        2.012064 1.301701  2.795128  0.3743050   0.000
    time5        3.206834 2.502870  3.939791  0.3694384   0.000
Now if I need to draw CI bars around the three means, it seems to me that they should be roughly 11, 13, and 14.2, each \pm 0.75, because I'm trying to estimate the variability of patterns within subjects, and am not interested in the subject-to-subject variation in the mean for the purposes of prediction.
If you want to examine the three means then you should fit the model as lmer(rcl ~ time - 1 + (1 | subj), fr)
This is what the authors of the paper cited below call, on p. 402, a narrow [as opposed to a broad] inference space. My question: ***How do I extract the three narrow CIs from the lmer?***
@ARTICLE{BlouinRiopelle2005, author = {Blouin, David C. and Riopelle, Arthur J.}, title = {On confidence intervals for within-subjects designs}, journal = {Psychological Methods}, year = {2005}, volume = {10}, pages = {397--412}, number = {4}, month = dec, abstract = {Confidence intervals (CIs) for means are frequently advocated as alternatives to null hypothesis significance testing (NHST), for which a common theme in the debate is that conclusions from CIs and NHST should be mutually consistent. The authors examined a class of CIs for which the conclusions are said to be inconsistent with NHST in within-subjects designs and a class for which the conclusions are said to be consistent. The difference between them is a difference in models. In particular, the main issue is that the class for which the conclusions are said to be consistent derives from fixed-effects models with subjects fixed, not mixed models with subjects random. Offered is mixed model methodology that has been popularized in the statistical literature and statistical software procedures. Generalizations to different classes of within-subjects designs are explored, and comments on the future direction of the debate on NHST are offered.}, url = {http://search.epnet.com/login.aspx?direct=true&db=pdh&an=met104397} }
Re: [R] ANCOVA error again
Hello R users! I got an error message again. I used this code:
    with(FemMal85_Sex, { ModelFemMal85 <- glm(Sex ~ outLatTep_like_other * outLatTep_like_conduplicate * outLatTep_keeled_winged * spathellae_conspicuous * spathellae_inconspicuous_absent * InfSpath_persistence * InfSpath_caducuous * bractsSpacing_lax * bractsSpacing_imbricate * InfType_sparsely_paniculate * InfType_racemose * InfType_paniculate * InfType_globose * bracApexShape_truncate * bracApexShape_rounded * bracApexShape_obtuse * bracApexShape_acute * bracApexShape_acuminate * bracApexShape_apiculate * bracApexShape_aciculate * BracUpperMarg_like_rest * BracUpperMarg_memebranous * BracUpperMarg_honeycombed_cells * InfSpathText_coriaceous * InfSpathText_hyaline * InfSpathText_chartaceous * InfSpathText_cartilaginous * InfSpathText_membranous * spikShapeSide_linear * spikShapeSide_oblong * spikShapeSide_square * spikShapeSide_elliptical * spikShapeSide_ovate * spikShapeSide_obovate * spikShapeSide_obtriangular * spikShapeSide_orbicular * spikShapeSide_undifferentiated * SpikApexShape_truncate * SpikApexShape_rounded * SpikApexShape_obtuse * SpikApexShape_acute * SpikApexShape_undifferentiated * BracShape_linear * BracShape_oblong * BracShape_square * BracShape_elliptical * BracShape_ovate * BracShape_obovate * BracShape_orbicular * BracText_bony * BracText_coriaceous * BracText_hyline * BracText_chartaceous * BracText_cartilaginous * BracText_membranous * BracText_centrChartaceousMargMembranous * TepText_bony * TepText_coriaceous * TepText_chartaceous * TepText_cartilaginous * TepText_membranous * InfLengthMin * InfLengthMax * InfWidthMin * InfWidthMax * SpathellaeLengthMin * SpathellaeLengthMax * SpikLengthMin * SpikLengthMax * FlowNumbSpikMin * FlowNumbSpikMax * BracLengthMin * BracLengthMax * FlowLengthMin * FlowLengthMax * InfSpathLengthToSpikMin * InfSpathLengthToSpikMax * TepInOutMin * TepInOutMax * BracLengthtoFlowMin * BracLengthtoFlowMax * BracMargMin * BracMargMax * BracAwnToBodyMin * BracAwnToBodyMax, na.action=na.exclude, family=binomial) })
and got this error message:
    *** caught segfault ***
    address 0xbf7fffb0, cause 'memory not mapped'
    Traceback:
    1: terms.formula(formula, data = data)
    2: terms(formula, data = data)
    3: model.frame.default(formula = Sex ~ outLatTep_like_other * outLatTep_like_conduplicate * ... * BracAwnToBodyMax, drop.unused.levels = TRUE)
    4: model.frame(formula = Sex ~ outLatTep_like_other * outLatTep_like_conduplicate * ... * BracAwnToBodyMax, drop.unused.levels = TRUE)
    5: eval(expr, envir, enclos)
    6: eval(mf, parent.frame())
    7: glm(Sex ~ outLatTep_like_other * outLatTep_like_conduplicate * ... * BracAwnToBodyMax, family = binomial)
    8: eval.with.vis(expr, envir, enclos)
    9: eval.with.vis(ei, envir)
    10: source("/Users/birgitlemcke/Job/Doktorarbeit/R/Protokolle_Codes/Protokoll21.04.08.R")
    Possible actions:
    1: abort (with core dump, if enabled)
    2: normal R exit
    3: exit R without saving workspace
    4: exit R saving workspace
    Selection:
(Where "..." appears, I deleted some of the 85 variables.) What does this message mean? Thanks a lot in advance. B.
On 21.04.2008 at 14:50, John Fox wrote: Dear Birgit, My guess is that you forgot to specify the argument family=binomial in the call to glm(). Had you included the commands that you used as well as the error that was produced, it wouldn't be necessary to guess. I hope this helps, John
On Mon, 21 Apr 2008 14:23:13 +0200 Birgit Lemcke [EMAIL PROTECTED] wrote: R version 2.6.2, PowerBook G4. Hello R users, I am trying to perform an ANCOVA using the glm function. I have a dataset with continuous and categorical explanatory variables, and my response variable is binary categorical. I get:
    Error: NA/NaN/Inf in foreign function call (arg 4)
    In addition: Warning messages:
    1: In Ops.factor(y, mu) : - not meaningful for factors
    2: In Ops.factor(eta, offset) : - not meaningful for factors
    3: In Ops.factor(y, mu) : - not meaningful for factors
My dataset contains NAs, but if I try to use na.exclude I get the same error message. I thought the function should work with my dataset. What am I doing wrong? Thanks in advance for your help. Birgit
Birgit Lemcke Institut für Systematische Botanik Zollikerstrasse 107 CH-8008 Zürich Switzerland Ph: +41 (0)44 634 8351 [EMAIL PROTECTED]
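A plausible reading of the traceback (the crash is inside terms.formula) is that the formula itself is the problem: in R's formula language, a * b * c expands to all main effects plus all interactions, so k variables joined by * give 2^k - 1 terms. The sketch below counts terms for a small made-up formula and shows how a main-effects formula can be built from a vector of names. The variable names a, b, c, d and y are placeholders.

```r
# Full crossing of 4 variables: 4 main effects + 6 two-way + 4 three-way + 1 four-way
f_small <- y ~ a * b * c * d
n_terms <- length(attr(terms(f_small), "term.labels"))
n_terms

# The same rule applied to 85 variables gives an astronomically large number
2^85 - 1

# Main effects only, built programmatically from a vector of names
vars   <- c("a", "b", "c", "d")
f_main <- reformulate(vars, response = "y")
f_main
n_main <- length(attr(terms(f_main), "term.labels"))

# Main effects plus all two-way interactions
f_two <- y ~ (a + b + c + d)^2
n_two <- length(attr(terms(f_two), "term.labels"))

c(full = n_terms, main = n_main, two_way = n_two)
```

With 85 predictors, joining the names with + (for example via reformulate(), as above) keeps the model to 85 main-effect terms before factor expansion.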
Re: [R] How to do survival analysis with time-related IVs?
Tianxu [EMAIL PROTECTED] wrote in news:[EMAIL PROTECTED]: I am wondering how to do survival analysis with time-related IVs in R. For example,
See section 4 of Fox's contribution: http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-cox-regression.pdf http://cran.r-project.org/doc/contrib/Fox-Companion/cox-regression.txt
[snip example] And then run logistic regression using censor as the DV?
I was initially unclear about why you were running logistic regression on censor, but perhaps you are trying to assess the non-informative censoring assumption? The R function that immediately comes to mind is glm( , family=binomial). The Design package (a companion to Hmisc) also has lrm(...) and associated summary and diagnostic functions. -- David Winsemius
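A common way to handle a time-varying covariate in a Cox model is the counting-process layout, where each subject contributes one row per interval over which the covariate is constant and the model is fitted with Surv(start, stop, event). The sketch below builds such a layout from simulated data; the simulation, the variable names and the single 0-to-1 switch per subject are all assumptions for illustration and are not taken from the thread or from the appendix linked above.

```r
library(survival)

set.seed(1)
n <- 200
switch_time <- runif(n, 1, 5)          # time at which covariate x changes 0 -> 1
end_time    <- rexp(n, 0.2) + 0.1      # end of follow-up for each subject
event       <- rbinom(n, 1, 0.7)       # 1 = event at end_time, 0 = censored
changed     <- end_time > switch_time  # did x switch before follow-up ended?

# First interval: from 0 until the switch (or until end of follow-up)
first <- data.frame(id = seq_len(n), start = 0,
                    stop  = ifelse(changed, switch_time, end_time),
                    event = ifelse(changed, 0, event),
                    x = 0)
# Second interval, only for subjects whose covariate switched
second <- data.frame(id = which(changed),
                     start = switch_time[changed],
                     stop  = end_time[changed],
                     event = event[changed],
                     x = 1)

long <- rbind(first, second)
long <- long[order(long$id, long$start), ]

fit <- coxph(Surv(start, stop, event) ~ x, data = long)
summary(fit)$coefficients
```

Because x is unrelated to the hazard in this simulation, its coefficient should be near zero; the point is only the data layout and the call.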
[R] Labelling a secondary axis in R
Hello, How can I label a secondary axis in R? At the moment it's labelled as c(-100,200). Obviously I would like it to be more sensible. Here is the code I am using:
    newx = -100+37.5*((1:9)-1)
    axis(4,at=newx,labels=(newx+100)/3750)
Thanks, Rob -- View this message in context: http://www.nabble.com/Labelling-a-secondary-axis-in-R-tp16807708p16807708.html Sent from the R help mailing list archive at Nabble.com.
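A label such as c(-100,200) is probably the default axis title that plot() derives from the expression passed to it. One way around it, sketched below under that assumption, is to suppress the default titles on the overlaid plot and write the right-hand label with mtext(). The plotted data and label texts are made up; only newx and the axis() call come from the question.

```r
op <- par(mar = c(5, 4, 4, 5) + 0.1)   # widen the right margin for the label

# Primary plot with its own left-hand axis (made-up data)
plot(1:9, (1:9)^2, type = "b", xlab = "x", ylab = "primary axis")

# Overlay a second plot on a different scale, without axes or titles
par(new = TRUE)
newx <- -100 + 37.5 * ((1:9) - 1)
plot(1:9, newx, type = "l", lty = 2, axes = FALSE, xlab = "", ylab = "")

# Secondary axis on the right, with rescaled tick labels and its own title
axis(4, at = newx, labels = (newx + 100) / 3750)
mtext("secondary axis (rescaled)", side = 4, line = 3)

par(op)
```

The line argument of mtext() controls how far the label sits from the axis, so it may need adjusting to the margin size.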
Re: [R] Equivalent of intervals() in lmer
Sorry, I meant to say: For the moment I wonder if the solution is not to use CIs based on the two low SEs produced by the ~ time model, and to treat them as least-significant-difference intervals.
_ Professor Michael Kubovy University of Virginia Department of Psychology USPS: P.O. Box 400400, Charlottesville, VA 22904-4400 Parcels: Room 102, Gilmer Hall, McCormick Road, Charlottesville, VA 22903 Office: B011, +1-434-982-4729 Lab: B019, +1-434-982-4751 Fax: +1-434-982-4766 WWW: http://www.people.virginia.edu/~mk9y/
On Apr 21, 2008, at 9:23 AM, Michael Kubovy wrote: Thanks Doug, You write: If you want to examine the three means then you should fit the model as lmer(rcl ~ time - 1 + (1 | subj), fr)
I do just that (which is what Dieter just sent). But the CIs are much too big compared to the CIs for differences between means (which should be bigger than the CIs on the means themselves). If you write the model as ~ 1 - time, then the CIs are roughly of the same (large) size. But I'm really interested in the CIs on the means that capture the variability *within* subjects. I believe that this is what experimentalists in psychology need (and they have long debated which analysis correctly produces these error bars). The theory is not about generalizing to people, but about generalizing to responses to different situations within people. The article by Blouin and Riopelle (2005) is the only one I know of that tries to do this within the framework of LMEMs, and it's couched in terms of SAS. For the moment I wonder if the solution is not to use CIs based on the two low SEs produced by the ~ 1 - time model, and to treat them as least-significant-difference intervals.
On Apr 21, 2008, at 7:56 AM, Douglas Bates wrote:

On 4/21/08, Michael Kubovy [EMAIL PROTECTED] wrote:

To help Kedar a bit: Here is one way:

    recall <- c(10, 13, 13, 6, 8, 8, 11, 14, 14, 22, 23, 25, 16, 18, 20,
                15, 17, 17, 1, 1, 4, 12, 15, 17, 9, 12, 12, 8, 9, 12)
    fr <- data.frame(rcl = recall,
                     time = factor(rep(c(1, 2, 5), 10)),
                     subj = factor(rep(1:10, each = 3)))
    (fr.lmer <- lmer(rcl ~ time + (1 | subj), fr))
    require(gmodels)
    ci(fr.lmer)

Now I have a problem to which I would very much appreciate having a solution: The model fr.lmer gives a SE of 1.8793 for the (Intercept) and 0.3507 for the other levels. The reason is that the first took account of the variability of the effect of subjects. Or using simulation:

                 Estimate  CI lower  CI upper Std. Error p-value
    (Intercept) 11.107202  6.458765 15.208065  2.1587362   0.004
    time2        2.012064  1.301701  2.795128  0.3743050   0.000
    time5        3.206834  2.502870  3.939791  0.3694384   0.000

Now if I need to draw CI bars around the three means, it seems to me that they should be roughly 11, 13, and 16.2, each \pm 0.75, because I'm trying to estimate the variability of patterns within subjects, and am not interested in the subject to subject variation in the mean for the purposes of prediction.

If you want to examine the three means then you should fit the model as

    lmer(rcl ~ time - 1 + (1 | subj), fr)

This is what the authors of the paper cited below call (on p. 402) a narrow [as opposed to a broad] inference space.

My question: ***How do I extract the three narrow CIs from the lmer?***

    @ARTICLE{BlouinRiopelle2005,
      author = {Blouin, David C. and Riopelle, Arthur J.},
      title = {On confidence intervals for within-subjects designs},
      journal = {Psychological Methods},
      year = {2005},
      volume = {10},
      pages = {397--412},
      number = {4},
      month = dec,
      abstract = {Confidence intervals (CIs) for means are frequently advocated as alternatives to null hypothesis significance testing (NHST), for which a common theme in the debate is that conclusions from CIs and NHST should be mutually consistent. The authors examined a class of CIs for which the conclusions are said to be inconsistent with NHST in within-subjects designs and a class for which the conclusions are said to be consistent. The difference between them is a difference in models. In particular, the main issue is that the class for which the conclusions are said to be consistent derives from fixed-effects models with subjects fixed, not mixed models with subjects random. Offered is mixed model methodology that has been popularized in the statistical literature and statistical software procedures. Generalizations to different classes of within-subjects designs are explored, and comments on the future direction of the debate on NHST are offered.},
      url = {http://search.epnet.com/login.aspx?direct=true&db=pdh&an=met104397}
    }

On Apr 21, 2008, at 2:24 AM, Dieter Menne wrote:

kedar nadkarni nadkarnikedar at gmail.com writes: I have been trying to obtain
[R] External regressors in GARCH
Hello to all R users,

to my knowledge, neither garch() in tseries nor garchFit() in fGarch supports including external regressors in the regression, which, for example, arima() in stats can do by setting xreg. Is there a package or any other way that can do this? To be precise, I want to estimate a variance equation that goes like this:

    h_t = arch_t + garch_t + dummy1_t + dummy2_t + v_t

Any advice appreciated!

Radovan Fiser
--
Institute of Economic Studies, Prague
http://ies.fsv.cuni.cz/
radekf.net bikeri.cz - kostelnibriza.cz - fiserovi.cz - hcsgang.com
Re: [R] Labelling a secondary axis in R
On 4/21/2008 9:02 AM, Nakamura wrote:

Hello, How can I label a secondary axis in R? At the moment it's labelled as c(-100,200). Obviously I would like it to be more sensible. Here is the code I am using

    newx = -100+37.5*((1:9)-1)
    axis(4,at=newx,labels=(newx+100)/3750)

I don't understand your question. When I run this code:

    newx = -100+37.5*((1:9)-1)
    plot(1:9, newx)
    axis(4,at=newx,labels=(newx+100)/3750)

I get labels on side 4 which are 0, 0.01, ..., 0.08. I think we need a complete example to see the problem.

Duncan Murdoch
[R] pairs diagram of qq plots?
Hello everyone,

for some exploratory analysis I would like to compare the distribution of an observable WERT pairwise between several samples identified by STICHPROBE (which differ in size).

    str(stichproben_o1o4_20080327ff[c("STICHPROBE", "WERT")])
    'data.frame': 6087 obs. of 2 variables:
     $ STICHPROBE: num 9 9 2 2 7 3 2 3 8 6 ...
     $ WERT      : num 165 184 110 131 87 111 210 88 159 198 ...

A good way to compare two distributions is a Q-Q or Tukey mean-difference (tmd) plot. I would like to arrange these qq or tmd plots in a matrix as the pairs() function does. Can pairs() be made to produce tmd plots directly instead of plain scatter plots, or will I have to do the tmd processing in a separate step and pass only the preprocessed xy data to pairs()?

Another problem is the representation of the data with respect to pairs(). My data.frame identifies the sample of each measurement in column STICHPROBE. It does not have one column for each sample (note again that the samples differ in size). From what I understand about pairs(), it requires a separate column for each variable. The reshape() function should be able to change the representation, but the best I can achieve is a wide data frame with multiple columns (as desired) but no rows:

    reshape(stichproben_o1o4_20080327ff[c("STICHPROBE", "WERT")], timevar="STICHPROBE", direction="wide")
    [1] WERT.9 WERT.2 WERT.7 WERT.3 WERT.8 WERT.6 WERT.1 WERT.4 WERT.0 WERT.5
    <0 rows> (or 0-length row.names)

Best regards
Stefan
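One possible way to get a pairs()-like matrix of Q-Q plots without reshaping to wide format is to split WERT by STICHPROBE and loop over pairs of samples. The sketch below runs on simulated data: the data frame `d`, the four sample labels and the normal distribution are assumptions for illustration, not the poster's data. qqplot() from base R accepts samples of different sizes.

```r
# Simulated stand-in for the poster's data (assumption: long format,
# one row per measurement, sample label in STICHPROBE).
set.seed(1)
d <- data.frame(STICHPROBE = sample(0:3, 200, replace = TRUE),
                WERT       = rnorm(200, mean = 150, sd = 40))

# One numeric vector per sample; the vectors may differ in length.
samples <- split(d$WERT, d$STICHPROBE)
k <- length(samples)

# k x k grid: sample labels on the diagonal, Q-Q plots elsewhere.
op <- par(mfrow = c(k, k), mar = c(2, 2, 1, 1))
for (i in seq_len(k)) {
  for (j in seq_len(k)) {
    if (i == j) {
      plot.new()
      text(0.5, 0.5, names(samples)[i], cex = 2)
    } else {
      qqplot(samples[[j]], samples[[i]], xlab = "", ylab = "")
      abline(0, 1, lty = 2)  # reference line for identical distributions
    }
  }
}
par(op)
```

A mean-difference version would replace the qqplot() call by plotting the difference of the paired quantiles against their mean; the grid logic stays the same.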
Re: [R] Labelling a secondary axis in R
Apologies for the private mail; Nabble has not yet updated the thread, so I cannot write another post in it.

I think I have confused things. I don't mean the labels are incorrect. They are fine. What I am referring to is a title for the secondary axis, which is currently shown as c(-100,200). Obviously this isn't very useful. I will put some reproducible code on the post when it updates.

Thanks,
Rob
[R] Array within an array
Hello,

I need help to build an array within an array, i.e., I have this:

    tt[,,c(1,2)]

    , , metadados.class_7.R

                   CS       WRC       LRA
    Inicial 1.000     1.000     1.000
    Final   0.5974482 0.6095162 0.5866560
    Indep   0.4335460 0.4799575 0.4169591
    Inicial 0.9925572 0.9925572 0.9925572
    Final   0.6079745 0.6004785 0.5708134
    Indep   0.4335460 0.4799575 0.4169591
    Inicial 0.2395003 0.2395003 0.2395003
    Final   0.2906433 0.3400851 0.3616162
    Indep   0.4335460 0.4799575 0.4169591

    , , metadados.class_stat.R

                   CS       WRC       LRA
    Inicial 1.000     1.000     1.000
    Final   0.6978175 0.711     0.5665584
    Indep   0.6079365 0.6289683 0.5211580
    Inicial 1.000     1.000     0.9973485
    Final   0.6978175 0.711     0.5641775
    Indep   0.6079365 0.6289683 0.5211580
    Inicial 0.2873016 0.2873016 0.2690476
    Final   0.4988095 0.5591270 0.2951840
    Indep   0.6079365 0.6289683 0.5211580

How can I divide the third dimension into three more?

Thank you!
Carla Rebelo
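One reading of the question is that the nine rows are three blocks of (Inicial, Final, Indep) and that each block should get its own index. Under that assumption, the 9 x 3 x 2 array can be given a fourth dimension simply by changing its dimensions, because R stores arrays in column-major order. The sketch below uses dummy values; the names `tt4`, `stage` and `block1` to `block3` are made up for illustration.

```r
# Dummy array with the same shape and dimnames as the posted one.
tt <- array(seq_len(9 * 3 * 2), dim = c(9, 3, 2),
            dimnames = list(rep(c("Inicial", "Final", "Indep"), 3),
                            c("CS", "WRC", "LRA"),
                            c("metadados.class_7.R", "metadados.class_stat.R")))

# Row r of tt is stage s of block b, with r = s + 3 * (b - 1).
# Because rows vary fastest in memory, splitting the first dimension
# of length 9 into (3 stages, 3 blocks) needs no reordering.
tt4 <- array(tt, dim = c(3, 3, 3, 2),
             dimnames = list(stage   = c("Inicial", "Final", "Indep"),
                             block   = paste0("block", 1:3),
                             measure = dimnames(tt)[[2]],
                             file    = dimnames(tt)[[3]]))

# Row 5 of the original is "Final" in the second block.
tt4["Final", "block2", "LRA", "metadados.class_7.R"]
tt[5, "LRA", "metadados.class_7.R"]
```

If the block index should come last instead, `aperm(tt4, c(1, 3, 4, 2))` would reorder the dimensions.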
Re: [R] ANCOVA error again
On Mon, 2008-04-21 at 15:43 +0200, Birgit Lemcke wrote:

Hello R users! I got again an error message.

Something here is causing compiled code to segfault (crash). I don't know what the problem is here exactly --- I'll let those much more acquainted with R look into that --- but you seem to be using R's model formulae in a non-standard way.

You don't need with() wrapping your call to glm(), just include a data frame as the data argument:

    ModelFemMal85 <- glm(Sex ~ .^2, data = FemMal85_Sex, na.action = na.exclude, family = binomial)

This will do what you appear to have attempted below (all main effects plus first-order interactions). It is a simpler call, so see if it works in R without causing the segfault.

However, I would consider what on earth you are going to do with such a huge number of coefficients in the model --- over 3500 if I interpreted your formula correctly and assuming that the variables are all continuous. You do have many, many more than 3500 observations?

If you are trying to predict the sex of individuals, why not try some of the classification techniques available in R? A simple technique would be a classification tree (packages rpart and party for example). These will help with feature selection and do include interactions, though not in exactly the same way you have done so here. Bagging, boosting or randomForests could be used to improve predictions (or make them more stable).

Check out the Machine Learning and Environmetrics Task Views for additional info and pointers to relevant R packages/functions.

My two pennies worth,
G

I used this code:

    with (FemMal85_Sex, {
    ModelFemMal85 <- glm (Sex~outLatTep_like_other*outLatTep_like_conduplicate
      *outLatTep_keeled_winged*spathellae_conspicuous*spathellae_inconspicuous_absent
      *InfSpath_persistence*InfSpath_caducuous*bractsSpacing_lax*bractsSpacing_imbricate
      *InfType_sparsely_paniculate*InfType_racemose*InfType_paniculate*InfType_globose
      *bracApexShape_truncate*bracApexShape_rounded*bracApexShape_obtuse
      *bracApexShape_acute*bracApexShape_acuminate*bracApexShape_apiculate
      *bracApexShape_aciculate*BracUpperMarg_like_rest*BracUpperMarg_memebranous
      *BracUpperMarg_honeycombed_cells*InfSpathText_coriaceous*InfSpathText_hyaline
      *InfSpathText_chartaceous*InfSpathText_cartilaginous*InfSpathText_membranous
      *spikShapeSide_linear*spikShapeSide_oblong*spikShapeSide_square
      *spikShapeSide_elliptical*spikShapeSide_ovate*spikShapeSide_obovate
      *spikShapeSide_obtriangular*spikShapeSide_orbicular*spikShapeSide_undifferentiated
      *SpikApexShape_truncate*SpikApexShape_rounded*SpikApexShape_obtuse
      *SpikApexShape_acute*SpikApexShape_undifferentiated*BracShape_linear
      *BracShape_oblong*BracShape_square*BracShape_elliptical*BracShape_ovate
      *BracShape_obovate*BracShape_orbicular*BracText_bony*BracText_coriaceous
      *BracText_hyline*BracText_chartaceous*BracText_cartilaginous*BracText_membranous
      *BracText_centrChartaceousMargMembranous*TepText_bony*TepText_coriaceous
      *TepText_chartaceous*TepText_cartilaginous*TepText_membranous
      *InfLengthMin*InfLengthMax*InfWidthMin*InfWidthMax
      *SpathellaeLengthMin*SpathellaeLengthMax*SpikLengthMin*SpikLengthMax
      *FlowNumbSpikMin*FlowNumbSpikMax*BracLengthMin*BracLengthMax
      *FlowLengthMin*FlowLengthMax*InfSpathLengthToSpikMin*InfSpathLengthToSpikMax
      *TepInOutMin*TepInOutMax*BracLengthtoFlowMin*BracLengthtoFlowMax
      *BracMargMin*BracMargMax*BracAwnToBodyMin*BracAwnToBodyMax,
      na.action=na.exclude,family=binomial)})

and got this error message:

    *** caught segfault ***
    address 0xbf7fffb0, cause 'memory not mapped'

    Traceback:
     1: terms.formula(formula, data = data)
     2: terms(formula, data = data)
     3: model.frame.default(formula = Sex ~ outLatTep_like_other * outLatTep_like_conduplicate *... * BracAwnToBodyMax, drop.unused.levels = TRUE)
     4: model.frame(formula = Sex ~ outLatTep_like_other * outLatTep_like_conduplicate *... * BracAwnToBodyMax, drop.unused.levels = TRUE)
     5: eval(expr, envir, enclos)
     6: eval(mf, parent.frame())
     7: glm(Sex ~ outLatTep_like_other * outLatTep_like_conduplicate ** BracAwnToBodyMax, family = binomial)
     8: eval.with.vis(expr, envir, enclos)
     9: eval.with.vis(ei, envir)
    10: source("/Users/birgitlemcke/Job/Doktorarbeit/R/Protokolle_Codes/Protokoll21.04.08.R")

    Possible actions:
    1: abort (with core dump, if enabled)
    2: normal R exit
    3: exit R without saving workspace
    4: exit R saving workspace
    Selection:

... I deleted here some of the 85 variables

What does this
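The two formulas in this exchange expand quite differently, which may be relevant to the crash in terms.formula(): chaining variables with `*` requests interactions of every order, while `.^2` stops at pairwise interactions. The sketch below counts the terms for four made-up predictors (`a`, `b`, `c`, `e` and the data frame `d` are illustrative names only) and then does the arithmetic for 85 predictors.

```r
# Toy data frame: a binary response and four numeric predictors.
set.seed(1)
d <- data.frame(y = rbinom(20, 1, 0.5),
                a = rnorm(20), b = rnorm(20), c = rnorm(20), e = rnorm(20))

# a * b * c * e expands to all main effects and all interactions
# up to fourth order: 2^4 - 1 terms.
full_terms <- attr(terms(y ~ a * b * c * e), "term.labels")

# .^2 expands to main effects plus pairwise interactions only:
# 4 + choose(4, 2) terms.
pair_terms <- attr(terms(y ~ .^2, data = d), "term.labels")

length(full_terms)
length(pair_terms)

# The same counts for p = 85 predictors.
p <- 85
n_pairwise <- p + choose(p, 2)   # main effects plus two-way interactions
n_full     <- 2^p - 1            # every interaction order
n_pairwise
n_full
```

With 85 predictors the pairwise model has 3655 terms (consistent with the "over 3500" quoted above), whereas the fully crossed formula asks for roughly 3.9e25 terms, far more than any formula expansion could hold.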
Re: [R] Labelling a secondary axis in R
If you just want the title, look at ?mtext.

Charles Annis, P.E.
[EMAIL PROTECTED]
phone: 561-352-9699
eFax: 614-455-3265
http://www.StatisticalEngineering.com
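A minimal sketch of the mtext() suggestion, built on the code already posted in this thread. The margin settings and the title string "rescaled value" are placeholders chosen for illustration; the right margin is widened so that the title has room outside the secondary axis.

```r
newx <- -100 + 37.5 * ((1:9) - 1)

# Widen the right margin (4th element) to make room for the axis title.
op <- par(mar = c(5, 4, 4, 5) + 0.1)

plot(1:9, newx, ylab = "newx")

# Secondary axis on side 4 with rescaled tick labels, as in the thread.
rescaled <- (newx + 100) / 3750
axis(4, at = newx, labels = rescaled)

# Title for the secondary axis, drawn in the margin on side 4.
mtext("rescaled value", side = 4, line = 3)

par(op)
```

Passing an explicit `ylab` to plot() also avoids a default axis title that is just the deparsed y expression, which may be where a title like c(-100,200) came from.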
Re: [R] Read From EXCEL - Question
Erich:

Past posts on this list have pointed out various pros and cons of different methods of data transfer to R from Excel, in particular loss of precision, formatting problems, etc. Do you have any comments on the degree to which any of these alternatives may be susceptible to, or immune from, these difficulties?

-- Bert Gunter
Genentech Nonclinical Statistics

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Erich Neuwirth
Sent: Sunday, April 20, 2008 2:18 AM
To: r-help@r-project.org
Subject: Re: [R] Read From EXCEL

To transfer data from Excel to R you have at least 3 options.

RODBC is platform-independent. You can use it to read Excel files on any platform where you have an ODBC driver for Excel installed.

xlsReadWrite is available only on Windows. It has a function read.xls which reads data from Excel worksheets into data frames or matrices. It does not need Excel installed.

RExcel (available through package RExcelInstaller on CRAN) needs Excel. Among other things, it allows you to select a range in Excel and transfer it into R through operations available on additional Excel menus. RExcel not only allows data transfer, it also allows you to use R functions in Excel macros and even in Excel worksheet functions. RExcel (and related software) has its own website at http://rcom.univie.ac.at. It also has its own mailing list, which can be reached through this website.

ermimi wrote:

Hello! I have read a lot about how to read data from an Excel file, but I haven't found the information I need to read the data. Now, I can create a channel:

    channel <- odbcConnectExcel("file.xls")

but I don't know how to read the data. I hope that you can help me. Thank you very much.

--
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459
Re: [R] ANCOVA error again
Hello Gavin,

thanks for your answer. If I use it without with() I get the same error. The with() construct was only something to try for functions that do not have a data argument. I am still learning, and therefore I sometimes just try things.

It is understood that I am on the way to simplifying the model once I have it for the whole lot. I don't want to predict the gender. I would like to know which of my variables are the strongest at dividing everything into the already existing groups: male and female. If all my variables were continuous, I could probably have used a discriminant function analysis, but most of the variables are categorical.

My plan is to delete, in each case, one of the interacting variables and then compare the models with the remaining variables using a chi-square test. But I am always open to suggestions, because I am still not very good at statistics.

At present I still have the same error message and don't know how to fix this.

Greets
B.
[R] Bar Charts
Hello,

I'm trying to create a bar chart. I have a file with two different columns. I have created bar charts before where I was reading from only one column, but now I wish to read from two columns. Here is the code I use when creating bar charts from one column:

    satisfaction <- read.table("C://project/graphs/satisfaction/reason.csv", sep=",", header=TRUE)
    barplot(table(satisfaction$reason), cex.names=1.0, col="blue",
            ylab="Number of people surveyed", border="blue",
            density=c(10,20,30,40,50), main="Results")

Does the code above need to change much to create 2 bars in my graph? Here is the file I am working with. I'm basically looking to present in a graph the number of females that are managers or clerical staff.

    gender  relationship
    Female  Manager
    Female  Clerical
    Female  Clerical
    Female  Manager
    Female  Manager
    Female  Clerical
    Female  Manager
    Female  Manager
    Female  Clerical
    Female  Clerical
    Female  Clerical
    Female  Manager
    Female  Manager
    Female  Clerical

Thank you in advance.

Best Regards,
John Hogan.
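One way to get one bar per job category is to cross-tabulate the two columns and pass the resulting table to barplot(). The sketch below types in the fourteen posted rows directly rather than reading the CSV file; the data frame name `staff` and the plot labels are chosen here for illustration.

```r
# The posted data, entered by hand (the original is read from a CSV file).
staff <- data.frame(
  gender = rep("Female", 14),
  relationship = c("Manager", "Clerical", "Clerical", "Manager", "Manager",
                   "Clerical", "Manager", "Manager", "Clerical", "Clerical",
                   "Clerical", "Manager", "Manager", "Clerical")
)

# Rows are genders, columns are job categories.
counts <- table(staff$gender, staff$relationship)
counts

# beside = TRUE draws the genders side by side within each category,
# which becomes useful once the file also contains male staff.
barplot(counts, beside = TRUE, col = "blue", border = "blue",
        ylab = "Number of people surveyed", main = "Results",
        legend.text = rownames(counts))
```

With a single gender in the data, `barplot(table(staff$relationship))` would give the same two bars.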
[R] Avoiding a loop
Dear R-users,

I've been working with three different data sets (X, Y and Z) with the same dimensions (i.e., n \times k). What I needed to do was to build a fourth data set, FINAL, whose first row is X's first row, whose second row is Y's first row, whose third row is Z's first row, and so on. My code is below. Is it possible to avoid the loop?

Thanks in advance,
Jorge

    # ----- Code starts here
    # Seed and matrices X, Y and Z
    set.seed(123)
    X=matrix(rnorm(300),ncol=5)
    Y=matrix(rpois(300,10),ncol=5)
    Z=matrix(rexp(300,1),ncol=5)

    # First three rows, all five columns
    X[1:3,1:5]
    Y[1:3,1:5]
    Z[1:3,1:5]

    # FINAL: first nine rows
    res=NULL
    for(i in 1:nrow(X)) res=rbind(res,X[i,],Y[i,],Z[i,])
    FINAL=data.frame(from.data=c('X','Y','Z'),res)
    FINAL[1:9,1:6]
    # ----- Code ends here
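A loop-free sketch of the same interleaving: stack the three matrices once with rbind() and then reorder the rows with a single index vector. The names `stacked`, `idx` and `res_loop` are introduced here for illustration, and the loop version is kept only to check that both approaches agree.

```r
set.seed(123)
X <- matrix(rnorm(300), ncol = 5)
Y <- matrix(rpois(300, 10), ncol = 5)
Z <- matrix(rexp(300, 1), ncol = 5)
n <- nrow(X)

# Stack once: rows 1..n are X, n+1..2n are Y, 2n+1..3n are Z.
stacked <- rbind(X, Y, Z)

# Build the row order 1, n+1, 2n+1, 2, n+2, 2n+2, ...
# matrix(..., nrow = n) puts X, Y, Z row numbers in three columns;
# transposing and flattening reads them off one triple at a time.
idx <- as.vector(t(matrix(seq_len(3 * n), nrow = n)))
res <- stacked[idx, ]

FINAL <- data.frame(from.data = rep(c("X", "Y", "Z"), times = n), res)
FINAL[1:9, 1:6]

# Original loop, for comparison only.
res_loop <- NULL
for (i in seq_len(n)) res_loop <- rbind(res_loop, X[i, ], Y[i, ], Z[i, ])
all.equal(res, res_loop, check.attributes = FALSE)
```

Growing `res` with rbind() inside a loop copies the matrix on every iteration, so the indexed version should also scale better for large n.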
[R] logit GLM without intercept
Dear Statisticians,

I would like to analyse my data with a GLM with binomial error distribution and logit link function. The point is that I want a model fitted without intercept, i.e. the fitted curve should start at y=0.5 for x=0. I tried it with the following code:

    glm(value~0+ppm, binomial)

Does this code yield the correct model, or is there another possibility? I'd appreciate it very much if you could help me out with this. I attached some example data.

Thanks, all the best
Robert

Robert Junker
Department of Animal Ecology and Tropical Biology
University of Würzburg
Biozentrum, Am Hubland
97074 Würzburg, Germany

    ppm          value
    65.85986417  1
    65.85986417  1
    65.85986417  0
    65.85986417  0
    65.85986417  1
    65.85986417  0
    65.85986417  0
    65.85986417  1
    65.85986417  1
    65.85986417  1
    659.4188035  1
    659.4188035  0
    659.4188035  0
    659.4188035  1
    659.4188035  0
    659.4188035  0
    659.4188035  0
    659.4188035  0
    659.4188035  1
    659.4188035  0
    659.4188035  0
    659.4188035  0
    659.4188035  1
    659.4188035  0
    659.4188035  1
    659.4188035  1
    659.4188035  1
    659.4188035  0
    659.4188035  1
    659.4188035  1
    1245.665143  0
    1245.665143  1
    1245.665143  1
    1245.665143  1
    1245.665143  0
    1245.665143  1
    1245.665143  1
    1245.665143  1
    1245.665143  0
    1245.665143  1
    1245.665143  0
    1245.665143  1
    1245.665143  1
    1245.665143  0
    1245.665143  1
    1245.665143  1
    1245.665143  0
    1245.665143  1
    1245.665143  0
    1245.665143  0
    5823.423892  0
    5823.423892  1
    5823.423892  0
    5823.423892  0
    5823.423892  1
    5823.423892  0
    5823.423892  0
    5823.423892  0
    5823.423892  0
    5823.423892  1
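With a logit link, dropping the intercept fixes the linear predictor at 0 when ppm = 0, and the inverse logit of 0 is 0.5, so the formula `value ~ 0 + ppm` does pin the curve at 0.5 there. The sketch below checks this numerically. It rebuilds the posted data from per-dose tallies counted by hand from the listing above (6 of 10, 9 of 20, 12 of 20 and 3 of 10 ones), so treat those counts as an assumption; the object names are illustrative.

```r
# Posted data, rebuilt from hand-counted tallies per ppm level.
d <- data.frame(
  ppm   = rep(c(65.85986417, 659.4188035, 1245.665143, 5823.423892),
              times = c(10, 20, 20, 10)),
  value = c(rep(c(1, 0), c(6, 4)),
            rep(c(1, 0), c(9, 11)),
            rep(c(1, 0), c(12, 8)),
            rep(c(1, 0), c(3, 7)))
)

# No-intercept model: logit(p) = beta * ppm, so p = 0.5 at ppm = 0.
fit0 <- glm(value ~ 0 + ppm, family = binomial, data = d)

# Model with intercept, for comparison.
fit1 <- glm(value ~ ppm, family = binomial, data = d)

# Fitted probability at ppm = 0 under the no-intercept model.
p0 <- predict(fit0, newdata = data.frame(ppm = 0), type = "response")
p0

# Likelihood-ratio comparison: does the data support adding an intercept?
anova(fit0, fit1, test = "Chisq")
```

Whether forcing the curve through 0.5 is sensible is a subject-matter question; the anova() comparison only indicates how much fit is lost by the constraint.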
[R] Problem with graphic exhibition in Java using JRI
Dear all,

I am working on a Java class which calls R functions. To do that, I use JRI. Everything works fine so far. The only problem I have is the following: when I call the command plot(...) from Java, a blank R Graphics window appears, and if I click on it the program stops responding. Could anyone please tell me what I should do to fix this problem?

Thanks in advance.
Re: [R] ANCOVA error again
On Mon, 2008-04-21 at 17:21 +0200, Birgit Lemcke wrote:

Hello Gavin, thanks for your answer. If I use it without with I get back the same error. The with thing was only to try out for functions that do not contain a data argument. I still try to learn and therefore I sometimes just try.

OK, as per Prof. Ripley's off-list reply to us both, the R developers and helpeRs on the list can't diagnose and fix the segfault without a reproducible example, or the data and the exact code to reproduce the segfault. R shouldn't segfault, so this is something that could potentially be fixed, but not without a reproducible example.

It is understood that I am on the way to simplify the model once I have it for the whole lot. I don't want to predict the gender. I would like to know which of my variables are the strongest to divide all into the already existing groups: male and female. In the case that all my variables would be continuous, I could have probably used a discriminant function analysis, but most of the variables are categorical.

That is what I meant --- you /are/ trying to predict sex: it is known, and you want to find rules that allow you to assign unknowns to one of the two sexes. Discriminant analysis (LDA) is one technique in the broad topic of classification (not to be confused with clustering; ecologists often call clustering classification), or supervised learning. Here categorical variables can be handled just fine using classification trees. A good introduction from the ecologist's point of view is:

    Glenn De'ath and Katharina E. Fabricius. Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology, Volume 81, Issue 11 (November 2000), pp. 3178-3192. DOI: 10.1890/0012-9658(2000)081[3178:CARTAP]2.0.CO;2

And, in the same journal, the use of randomForests is introduced in:

    Cutler et al. (2007). Random forests for classification in ecology. Ecology 88(11), 2783-2792. DOI: 10.1890/07-0539.1

Then take a look at:

    Andy Liaw and Matthew Wiener. Classification and regression by randomForest. R News, 2(3):18-22, December 2002.

for an intro to using randomForest in R if you want to give that a try. See the variable importance example in that newsletter for one approach that could be used instead of your multiple testing idea. You might also want to take a look at:

    Glenn De'ath. Boosted trees for ecological modeling and prediction. Ecology, Volume 88, Issue 1 (January 2007), pp. 243-251. DOI: 10.1890/0012-9658(2007)88[243:BTFEMA]2.0.CO;2

My plan is to delete in each case one of the interacting variables and then compare the models with the left-over variables using a chi-square test.

That sounds like the definition of data dredging to me ;-)

But I am always open for suggestions, because I am still not very good in statistics. Presently I still have the same error message and don't know how to fix this.

Unless you know C and the R internals very well, you can't fix this. You can try a different approach, such as the classification/supervised learning one I provide references for. You are on a hiding to nothing if you proceed with your current approach...

All the best,
G
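To make the classification-tree suggestion concrete, here is a small sketch with the rpart package, which handles categorical and continuous predictors together. Everything in it is simulated: the variables `shape`, `texture` and `length`, and the rule that generates `Sex`, are invented for illustration and have nothing to do with the real plant data.

```r
library(rpart)

# Simulated stand-in data: two categorical and one continuous predictor.
set.seed(42)
n <- 200
d <- data.frame(
  shape   = factor(sample(c("linear", "ovate", "oblong"), n, replace = TRUE)),
  texture = factor(sample(c("bony", "membranous"), n, replace = TRUE)),
  length  = rnorm(n, mean = 10, sd = 2)
)

# Invented rule: sex depends on shape and length, not on texture.
d$Sex <- factor(ifelse(d$shape == "ovate" | d$length > 12, "female", "male"))

# Classification tree using all predictors.
fit <- rpart(Sex ~ ., data = d, method = "class")
print(fit)

# Which variables the tree relied on most.
fit$variable.importance

# Proportion of training cases classified correctly.
accuracy <- mean(predict(fit, type = "class") == d$Sex)
accuracy
```

Training accuracy is optimistic; for real data a cross-validated error (see `printcp(fit)`) or a held-out test set would give a fairer picture.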
Re: [R] Design and analysis of mixture experiments
A summary, for those interested and for posterity...

Thanks to Christos Hatzis, who is correct: the package 'AlgDesign' (which I'd overlooked) has gen.mixture, which "Creates a candidate list of mixture variables".

    gen.mixture(4, c("egg", "flour", "butter"))

Thanks also to a private e-mailer who suggested approaches on the lines of

    egg <- flour <- butter <- seq(0, 100, length=2)
    cakemix <- expand.grid(egg=egg, flour=flour, butter=butter)
    cakemix <- cakemix[apply(cakemix, 1, sum) > 0, ]
    cakemix <- as.data.frame(t(apply(cakemix, 1, function(x) x/sum(x))))
    cakemix <- cakemix[!duplicated(cakemix), ]
    cakemix

However, I was looking for something much more sophisticated, on the lines of the pages following the link in my first post, http://www.itl.nist.gov/div898/handbook/pri/section5/pri54.htm, at least including extreme vertices designs.

In addition to the design, analysis of the resulting data can be (is!) complicated by the redundancy in the design variables; one must be quite sophisticated (by my standards!) in specifying models to be fitted to the data and in interpreting the results.

I was hoping that R had a package to make the whole thing easier, but I guess I have to agree with the private e-mailer that

> In fact, R is pretty ropey at this kind of classical industrial experimental
> designs. You can cobble designs together, but there seems to be nothing that
> straightforwardly generates things like central composites, Box-Behnken etc.
> Obviously still waiting for someone to write the package.

I suppose this is a consequence of R's origins in academia, and the academic background of many of the major contributors (to whom all praise!). I'm sure I could do this work in R (or even contribute a package!) if only I had the necessary skills and time. Never mind, I have other tools which will do the job. It's just a little iconoclastic finding needs that R can't meet :-O

Regards to all.

Keith Jewell mailto:[EMAIL PROTECTED]
telephone (direct) +44 (0)1386 842055
Released by Mr. K. Jewell

-----Original Message-----
From: Christos Hatzis [mailto:[EMAIL PROTECTED]
Sent: 17 April 2008 17:24
To: Jewell, Keith; r-help@r-project.org
Subject: RE: [R] Design and analysis of mixture experiments

The place to look is the CRAN Task View 'ExperimentalDesign'. There are several packages there related to design and analysis of experiments. The package 'AlgDesign' appears to have a function for generating mixture designs, and there might be others in other packages.

Good luck!
-Christos

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Thursday, April 17, 2008 11:45 AM
To: r-help@r-project.org
Subject: [R] Design and analysis of mixture experiments

Hi,

I'm interested in experimental design and data analysis on mixtures, like cake recipes where the sum of the components is fixed; e.g. http://www.itl.nist.gov/div898/handbook/pri/section5/pri54.htm.

I can't believe that R doesn't have facilities to design and analyse such experiments, but I haven't been able to find them (I have looked quite hard!). Can anyone point me in the right direction?

Thanks in advance,

Keith Jewell mailto:[EMAIL PROTECTED]
telephone (direct) +44 (0)1386 842055
Released by Mr. K. Jewell
Re: [R] UTF-8 or Unicode on Windows PC
On 21 Apr 2008, at 12:33, Prof Brian Ripley wrote: Is it possible to download a compiled snapshot of 2.7.0 for Windows XP? Yes, http://cran.r-project.org/bin/windows/base/rtest.html And it is due for release tomorrow. Many thanks! I can see the progress :) But please forgive my incompetence. I'm not so familiar with Windows. If I start e.g. RGUI by using: Rgui.exe LC_CTYPE=ja I can type Japanese, Russian, and German. strsplit works perfectly! ;) But if I type for instance a German umlaut 'ü' it comes out as 'u'. OK, it is due to the fact I didn't set up Rgui in UTF-8 mode. But how can I do this? My data are written in many different languages, and I want to do some statistics. R version 2.7.0 RC (2008-04-19 r45391) i386-pc-mingw32 locales: all to German_Germany.1252 LC_CTYPE=Japanese_Japan.932 ### There are some minor issues. I set Rgui's font to Arial Unicode. This works but I have some troubles to place my cursor, caused by the issue that Arial Unicode is not a monospaced font. If I start up Rgui in German, I can see the localized menu items, but for each non-ASCII character I see cryptic things. It seems to me that the localized strings are written in UTF-8, and Rgui expects ANSI characters. ### Nevertheless, thanks a lot! --Hans __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data labels in barchart (lattice)
On 4/21/08, K. Elo [EMAIL PROTECTED] wrote:

> Dear all,
>
> I use the barchart function (lattice) for plotting stacked bar charts. The
> data is a summary table (data frame) of Likert-scale evaluations (strongly
> agree, agree, ..., strongly disagree) of different issues, constructed as
> follows (L1 = percentage of "strongly agree" evaluations, L4 = percentage of
> "strongly disagree" evaluations):
>
>   ID      L1  L2  L3  L4  DN
>   Issue1  25  40  35   0   0
>   Issue2  15  30  22  28   5
>   ...
>
> What I have so far not achieved is adding data labels to each sub-bar of a
> 100% bar. What I would like to have is something like this:
>
>   Issue1: |###25%###OO40%OOXXX35%XXX
>   Issue2: | (similar)
>   ...
>
> What should I do in order to display data labels?

Write your own panel function (which may or may not be a simple exercise depending on your level of expertise in R). You could use panel.barchart as a starting point. Basically, you need to insert some calls to panel.text() (or something equivalent) after the calls to panel.rect() that draw the bars.

-Deepayan
Re: [R] Regression inclusion of variable, effect on coefficients
This is not a dumb question. This is a serious problem, and it depends on what you know or assume about the relationship between x1 and x4. If you assume a linear interaction, you might want to introduce an interaction term into the model, for example.

Uwe Ligges

Thiemo Fetzer wrote:

> Hello dear R users!
>
> I know this question is not strictly R-help, yet maybe some of the gurus in
> statistics can help me out. I have a sample of data, all from the same
> population. Say my regression equation is now this:
>
>     m1 <- lm(y ~ x1 + x2 + x3)
>
> I also regress on
>
>     m2 <- lm(y ~ x1 + x2 + x3 + x4)
>
> The thing is that I want to study the effect of information x4. I would
> hypothesize that the coefficient estimate for x1 goes down as I introduce x4,
> as x4 conveys some of the information conveyed by x1 (but not only that). Of
> course x1 and x4 are correlated; however, multicollinearity does not appear to
> be a problem, as the variance inflation factors are rather low (around 1.5 or
> so).
>
> I basically want to study the interplay between x1 and x4 when introducing x4
> into the regression equation, and whether my hypothesis is correct, i.e. that
> once I consider the information x4, not so much of the variation is explained
> via x1 anymore.
>
> I observe that when x4 is introduced into the regression, the coefficient
> estimate for x1 goes down and the associated p-value becomes bigger, i.e. x1
> becomes comparatively less significant. However, x4 is not significant. Yet
> the observation is in line with my theoretical argument.
>
> The question is now simple: how can I work this out? I know this is likely a
> dumb question, but I would really appreciate some links or help.
>
> Regards
> Thiemo
Re: [R] Avoiding a loop
Will this do it for you:

    # Seed and data frames X, Y and Z
    set.seed(123)
    X=matrix(rnorm(300),ncol=5)
    Y=matrix(rpois(300,10),ncol=5)
    Z=matrix(rexp(300,1),ncol=5)

    index <- seq(1, by=3, length=nrow(X))
    FINAL <- matrix(ncol=5, nrow=3*nrow(X))
    FINAL[index,] <- X
    FINAL[index + 1,] <- Y
    FINAL[index + 2,] <- Z

On Mon, Apr 21, 2008 at 11:44 AM, Jorge Ivan Velez [EMAIL PROTECTED] wrote:

> Dear R-users,
>
> I've been working with three different data sets (X, Y and Z) with the same
> dimension (i.e., n \times k). What I needed to do was to build a 4th data set,
> FINAL, whose first row is X's first row, whose second row is Y's first row,
> whose third row is Z's first row, and so on. My code is below. Is it possible
> to avoid the loop?
>
> Thanks in advance,
> Jorge
>
> # - Code starts here
> # Seed and data frames X, Y and Z
> set.seed(123)
> X=matrix(rnorm(300),ncol=5)
> Y=matrix(rpois(300,10),ncol=5)
> Z=matrix(rexp(300,1),ncol=5)
>
> # First three rows and five columns
> X[1:3,1:5]
> Y[1:3,1:5]
> Z[1:3,1:5]
>
> # FINAL's first nine rows and six columns
> res=NULL
> for(i in 1:nrow(X)) res=rbind(res,X[i,],Y[i,],Z[i,])
> FINAL=data.frame(from.data=c('X','Y','Z'),res)
> FINAL[1:9,1:6]
> # - Code ends here

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
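For readers who want a one-step variant of the same idea, here is a minimal sketch. It is not the code from either post, only an alternative under the same assumptions (three matrices of identical dimension): cbind() the matrices, transpose, and refill by row. The from.data label column mirrors the one in the original question.

```r
# Interleave the rows of three equally sized matrices without a loop.
set.seed(123)
X <- matrix(rnorm(300), ncol = 5)
Y <- matrix(rpois(300, 10), ncol = 5)
Z <- matrix(rexp(300, 1), ncol = 5)

# cbind() puts X[i, ], Y[i, ] and Z[i, ] side by side in row i; transposing
# and refilling by row cuts each long row into three consecutive rows.
FINAL <- matrix(t(cbind(X, Y, Z)), ncol = ncol(X), byrow = TRUE)

# Label every row with the matrix it came from.
FINAL <- data.frame(from.data = rep(c("X", "Y", "Z"), times = nrow(X)), FINAL)
FINAL[1:9, 1:6]
```

Rows 1, 4, 7, ... come from X, rows 2, 5, 8, ... from Y and rows 3, 6, 9, ... from Z, which is the same layout the index-based assignment produces.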
Re: [R] Creation of dialog box
There is a package in the R sources called windlgs, in .../src/gnuwin32/windlgs, which has examples, but you may want to go a way that is cross-platform and avoid Windows-specific programming altogether.

Best wishes,
Uwe Ligges

Ingrida B wrote:

> Dear list members,
>
> My student is creating some functions to implement median polish kriging (a
> prediction method in geostatistics). She wants to create a dialog box (to
> input some data) and a menu. For this she is using the winMenuAddItem and
> winDialogString commands in a function. But winDialogString provides just one
> string field to fill in. Which commands must she use to create a dialog box
> with several string fields?
>
> Regards
> Ingrida
Re: [R] UTF-8 or Unicode on Windows PC
On Mon, 21 Apr 2008, Hans-Joerg Bibiko wrote:

> On 21 Apr 2008, at 12:33, Prof Brian Ripley wrote:
> > > Is it possible to download a compiled snapshot of 2.7.0 for Windows XP?
> > Yes, http://cran.r-project.org/bin/windows/base/rtest.html
> > And it is due for release tomorrow.
>
> Many thanks! I can see the progress :)
>
> But please forgive my incompetence. I'm not so familiar with Windows.
>
> If I start e.g. RGUI by using:
>   Rgui.exe LC_CTYPE=ja
> I can type Japanese, Russian, and German. strsplit works perfectly! ;)
> But if I type for instance a German umlaut 'ü' it comes out as 'u'.
> OK, it is due to the fact I didn't set up Rgui in UTF-8 mode.

Entering at the keyboard in more than one language is close to impossible (not quite, as 'Japanese' covers a few but you need a Japanese keyboard to do it). You can't change the language of Windows just by setting locales.

> But how can I do this? My data are written in many different languages, and I
> want to do some statistics.

You can read in files in known encodings, though.

> R version 2.7.0 RC (2008-04-19 r45391)
> i386-pc-mingw32
> locales: all to German_Germany.1252
> LC_CTYPE=Japanese_Japan.932
>
> ###
> There are some minor issues.
> I set Rgui's font to Arial Unicode. This works, but I have some trouble
> placing my cursor, because Arial Unicode is not a monospaced font.

Right, and you are warned not to do that. You must use a fixed-width font, and for CJK characters, one in the standard single/double spacing. (See for example the comments in Rconsole and rw-FAQ 3.5. The GUI preferences dialog only offers fixed-width fonts, so you have to work quite hard to do anything else.)

> If I start up Rgui in German, I can see the localized menu items, but for each
> non-ASCII character I see cryptic things. It seems to me that the localized
> strings are written in UTF-8, and Rgui expects ANSI characters.

Argh, yes, that was an error by the translator in marking the file -- thanks, I just have time to fix it. (RGui does not expect ANSI, but all of R does expect translations to be in the encoding they are declared to be -- this was declared as ISO-8859-1.)

> ###
>
> Nevertheless, thanks a lot!
>
> --Hans

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] Adding number of non-NAs to boxplot
    boxplot(x[,c(2,15,28,41,54,67,80,93,106)], ylab="mg/s",
            names=c("RM215", "RM202", "RM198", "RM190", "RM185", "RM179",
                    "RM148", "RM119", "RM61"))

This is the code I am using to make a standard box plot. Is there a way to get the number of non-NA observations plotted onto the graph easily? I can always go in and extract the numbers from the output of boxplot:

    d <- boxplot(x[,c(2,15,28,41,54,67,80,93,106)], ylab="mg/s",
                 names=c("RM215", "RM202", "RM198", "RM190", "RM185", "RM179",
                         "RM148", "RM119", "RM61"))
    d$n

but then I am still confused about how to get this information into the graph. I could use a legend, but that seems suboptimal. I would like to have the counts under the names, like

    RM215
    n=24

I can provide data, but this seems more of a graph construction question than an analysis one.

Thanks in advance
Stephen

--
Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals.
-K. Mullis
[R] Using the 'by' function withing a 'for' loop
Dear R experts,

I am trying to optimize my script, because right now it requires a lot of memory. The goal is to generate four plots on one page. Every plot corresponds to the means and SEMs calculated for a given variable on different days. In order to obtain the means and SEMs I apply the 'by' function. The way I have done it so far is like this:

  Read the data.
  Generate a summary of the mean and SEM of a variable for every day.
  Plot the mean and SEM of that variable.
  Repeat the same process for the other 3 variables.

I tried to optimize the code by using a for loop; the code is below.

    # Reading the data
    dato <- read.csv('mydata.csv')
    names(dato) <- c("id","day","tx","var1","var2","var3","var4")
    dato <- dato[,1:7]

    # Specify variables to be plotted
    variable <- c('var1','var2','var3','var4')

    # Define parameters of the window: margins, number of plots in the panel
    windows(height=9, width=9, rescale='fixed')
    par(mfrow=c(2,2), xpd=T, bty='l', omi=c(0.8,0.25,1.2,0.15),
        mai=c(1.1,0.8,0.3,0.3))

    for (k in variable) {
      dat <- dato[!is.na(k),]
      summ <- by(dat, dat[,c("tx","day")], function(x) {
        mn <- mean(x$k)
        std <- sd(x$k)
        n <- length(x$k)
        se <- std/sqrt(n)
        lowb <- mn-se
        upb <- mn+se
        data.frame(tx=x$tx[1], day=x$day[1], mn=mn, std=std,
                   lowb=lowb, upb=upb, se=se)
      })
      summ <- do.call(rbind, summ)

      # Defining x axis range
      xmax <- unique(max(summ$day, na.rm=TRUE))
      xmin <- unique(min(summ$day, na.rm=TRUE))
      yaxmin <- unique(min(summ$lowb))
      yaxmax <- unique(max(summ$upb))
      plot(1, 1, type='n', xlab='Day', xlim=c(xmin,xmax),
           ylim=c(yaxmin,yaxmax), ylab=k, las=1, cex.lab=1,
           xaxp=c(xmin,xmax,diff(range(c(xmin,xmax)))))
      points(summ$day, summ$mn)
    }

Here, variable is a vector that specifies all the variables I want to plot. But I am getting the following error:

    Error in var(as.vector(x), na.rm = na.rm) : 'x' is empty
    In addition: Warning message:
    In mean.default(x$k) : argument is not numeric or logical: returning NA

Could someone please show me how to structure my code to achieve my final goal, which is to simplify it? I am attaching a csv file in case you want to run my code.

Thank you very much in advance for your time and help,
Judith
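A likely cause of the error in the loop above is that x$k looks for a column literally named "k" (which does not exist, so mean() and sd() receive NULL), and that is.na(k) tests the character string rather than the column it names. The sketch below shows the usual remedy, indexing with [[k]]. It is only an illustration: the data are made up to stand in for mydata.csv, only two of the four variables are generated, and summarise_var is a hypothetical helper name, not something from the post.

```r
# Made-up data standing in for mydata.csv (column names follow the post).
set.seed(42)
dato <- data.frame(
  id   = 1:40,
  day  = rep(c(1, 7, 14, 21), each = 10),
  tx   = rep(c("a", "b"), times = 20),
  var1 = rnorm(40),
  var2 = rnorm(40)
)
dato$var1[3] <- NA
variable <- c("var1", "var2")

# Mean and standard error of column k for every tx/day combination.
summarise_var <- function(dat, k) {
  dat <- dat[!is.na(dat[[k]]), ]            # dat[[k]], not is.na(k)
  pieces <- by(dat, dat[, c("tx", "day")], function(x) {
    mn <- mean(x[[k]])                      # x[[k]], not x$k
    se <- sd(x[[k]]) / sqrt(nrow(x))
    data.frame(tx = x$tx[1], day = x$day[1], mn = mn, se = se,
               lowb = mn - se, upb = mn + se)
  })
  do.call(rbind, pieces)
}

summaries <- lapply(variable, function(k) summarise_var(dato, k))
names(summaries) <- variable

# One panel per variable: means with error bars of one SE either side.
par(mfrow = c(1, 2))
for (k in variable) {
  s <- summaries[[k]]
  plot(s$day, s$mn, xlab = "Day", ylab = k, las = 1,
       ylim = range(s$lowb, s$upb))
  segments(s$day, s$lowb, s$day, s$upb)
}
```

Keeping the summary step in a function means the plotting loop only handles drawing, and the per-variable summaries stay available for inspection afterwards.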
Re: [R] Re ad From EXCEL
    library(RODBC)
    fp0 = "C:\\Documents and Settings\\myname\\My Documents\\Research\\"  # upper file path
    fp1 = "PIname\\PIproject\\Working00\\"   # middle file path
    fp2 = "DateData\\Dates.xls"              # lower file path
    dbase = file.path(paste(fp0, fp1, fp2, sep=""))  # create complete file path by pasting them together
    varList = "ID, ADM_DATE, DISCHARGE_DATE, ICU_DATE, TRANSFER_DATE, READMIT_ICU"  # make variable list
    varType = c(rep(T,5),F)          # I forget what this was for
    ch0 = odbcConnectExcel(dbase)    # open channel
    ch0
    #odbcGetInfo(ch0)                # check for channel information
    #sqlTables(ch0)                  # check data information
    Dates.data = sqlQuery(ch0, paste("SELECT", varList, "FROM A"), as.is=varType)
                                     # get the data from XL sheet A
                                     # (here the data happen to be dates, another
                                     # complication, but ignored here)
    close(ch0)                       # close channel
    # data now stored in Dates.data

Joe

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of ermimi
Sent: Saturday, April 19, 2008 5:06 PM
To: r-help@r-project.org
Subject: [R] Re ad From EXCEL

Hello!!! I have read a lot about how to read data from an Excel file, but I haven't found the information I need to read the data. Now I can create a channel:

    channel <- odbcConnectExcel("file.xls")

but I don't know how to read the data. I hope that you can help me. Thank you very much.

--
View this message in context: http://www.nabble.com/Read-From-EXCEL-tp16787900p16787900.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Avoiding a loop
Thank you so much to Jim and Mark for their advice. I have now solved the problem using a new approach.

Best,
Jorge
[R] Analysis of Epidemiological Data Using R
Hi everyone,

I'm studying the manual "Analysis of Epidemiological Data Using R and Epicalc" by Virasakdi Chongsuvivatwong and Edward McNeil, and I can't find the data files that they use in some examples. These are the names:

  Chapter7.Rdata, Chapter8.Rdata, Chapter9.Rdata

Can somebody tell me how I can get these files?

Thanks!
José

   O__     José Bustos M.
  c/ /'_ --- Master Applied Stat Program
 (*) \(*) -- University of Concepción
Re: [R] Re ad From EXCEL
Hi there,

Try this:

    # Function to read data in R from Excel
    FromExcel=function(yourfile,spreadsheet){
      require(RODBC)
      channel=odbcConnectExcel(yourfile)
      sqlTables(channel)
      mydata=sqlFetch(channel, spreadsheet)
      attach(mydata)
      mydata
    }

    mydata=FromExcel("C:/mydata/2008/yourfile.xls","yourspreadsheet")
    mydata[1:10,1:10]

FromExcel is not as efficient as it could be, but it works for me every time.

I hope this helps,
Jorge
Re: [R] Data labels in barchart (lattice)
Hi again,

Deepayan Sarkar wrote (21.4.2008):

> Write your own panel function (which may or may not be a simple exercise
> depending on your level of expertise in R). You could use panel.barchart as a
> starting point. Basically, you need to insert some calls to panel.text() (or
> something equivalent) after calls to panel.rect() that draw the bars.

Thanks to Deepayan for his quick answer. Well, I am quite familiar with R programming, so programming would not be the issue. What is an issue is that I do not (yet) fully understand how the panel function interacts with the calling barchart function (or vice versa). My stacked bar is built from five variables [barchart(ID ~ L1+L2+L3+L4+DN ...)], so the problem is that the data label to be displayed is one of L1, L2, L3, L4 or DN (for ID, see my first posting).

So the question is: how could I use the current data value used for drawing the sub-bar as an argument/variable in the panel function?

Many thanks again and greetings,
Kimmo
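A rough sketch of how the values reach the panel function, offered as an illustration rather than as the approach anyone in this thread actually used. With an extended formula such as ID ~ L1+L2+L3+L4+DN, lattice stacks the columns into one x vector and passes a groups factor (one level per column) plus subscripts to the panel function, so the value behind each sub-bar is simply the matching element of x. The data frame likert and the function name panel.labelled are made up for the example; the percentages are the two rows shown in the first posting.

```r
library(lattice)

# The two example rows from the first posting.
likert <- data.frame(
  ID = factor(c("Issue1", "Issue2")),
  L1 = c(25, 15), L2 = c(40, 30), L3 = c(35, 22),
  L4 = c(0, 28),  DN = c(0, 5)
)

# Draw the usual stacked bars, then print each value at the centre of its
# sub-bar. 'stack' and the other settings arrive through '...'.
panel.labelled <- function(x, y, groups, subscripts, ...) {
  panel.barchart(x, y, groups = groups, subscripts = subscripts, ...)
  g   <- groups[subscripts]     # which column (L1, ..., DN) each value is from
  bar <- as.numeric(y)          # vertical position of each bar
  for (b in unique(bar)) {
    sel  <- which(bar == b)
    sel  <- sel[order(g[sel])]  # same stacking order as the bars
    w    <- x[sel]
    mid  <- cumsum(w) - w / 2   # centre of every sub-bar
    keep <- w > 0               # skip empty sub-bars
    panel.text(mid[keep], rep(b, sum(keep)), labels = paste0(w[keep], "%"))
  }
}

p <- barchart(ID ~ L1 + L2 + L3 + L4 + DN, data = likert,
              stack = TRUE, panel = panel.labelled, xlab = "Percent")
print(p)
```

The label positions assume horizontal bars with non-negative values; for vertical bars the roles of x and y in the panel function would be swapped.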
[R] jpeg legend space issues
Dear R community, I am printing a jpeg file (using plot) and my y-axis label becomes partly cut (at the left) by a very close margin of document. See example: http://www.igm.jhmi.edu/~gehret/progr_collect_data/beta.jpg Can you please help me fix this? I tried din and fig in the parameter and this did not help... (fig permits for a larger margin, but the text is still not plotted correctly). Thank you and wishing you an excellent day! Georg. *** Georg Ehret Johns Hopkins Baltimore, USA [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adding number of non-NAs to boxplot
Have a look at the addtable2plot function in the plotrix package. It should do what you want.
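If the counts only need to appear under the group names, base graphics may be enough. The following is a minimal sketch with made-up data (three of the RM columns, with some NAs inserted), not the poster's data set: boxplot(..., plot = FALSE) returns the per-group number of non-NA observations in its n component, which can be pasted into two-line axis labels.

```r
# Made-up data: three groups, some observations missing.
set.seed(1)
dat <- data.frame(RM215 = rnorm(30), RM202 = rnorm(30), RM198 = rnorm(30))
dat$RM215[1:6] <- NA
dat$RM198[1:2] <- NA

# Compute the box statistics without drawing; $n holds the non-NA counts.
stats <- boxplot(dat, plot = FALSE)

# Two-line labels: group name on top, count underneath.
labels <- paste0(names(dat), "\nn=", stats$n)

# Shift the tick labels down a little so the second line fits.
op <- par(mgp = c(3, 2, 0))
boxplot(dat, names = labels, ylab = "mg/s")
par(op)
```

The same labels could also be placed with mtext() or axis() if finer control over their position is needed.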
[R] Change the core code
Hi,

A pretty basic question: is it possible to change the code of a function within a library? If so, what should I do? I work with R on Linux (Ubuntu).

Thanks a lot
[R] finding an unknown distribution
Hi,

I need to analyze the influences of several factors on a variable that is a measure of fecundity, consisting of 73 observations ranging from 0 to 5. The variable is continuous and highly positively skewed, and none of the typical transformations was able to normalize the data. Thus, I was thinking of analyzing these data using a generalized linear model, where I can specify a distribution other than normal. I'm thinking it may fit a gamma or exponential distribution, but I'm not sure whether the data meet the assumptions of those distributions, because their definitions are too complex for my understanding!

I tried to use R to assess the fit to a particular distribution. I used the fitdistr function from the MASS package and was able to obtain an estimate of the rate for the exponential distribution. But I couldn't get the gamma to work. If I don't provide initial estimates it says

    Error in optim (... initial value in 'vmmin' is not finite)

and if I provide some initial values it says

    Error in optim (... non-finite finite-difference value [1])

I then tried to test the fit of the exponential distribution using the Kolmogorov-Smirnov goodness-of-fit test (ks.test), but I got the warning message "cannot compute correct p-values with ties". This is strange, given that the details for ks.test say that continuous variables do not generate ties.

I'll greatly appreciate any ideas on how to proceed with this.

Thanks,
Andrea
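One plausible explanation, offered as an assumption since the data were not posted: the observations range from 0 to 5, and an exact zero makes the gamma log-likelihood non-finite, which is the kind of situation in which optim stops with the 'vmmin' message. The sketch below uses simulated values (73 draws from a gamma distribution, not the fecundity data) to show fitdistr working on strictly positive data and failing once a single zero is introduced.

```r
library(MASS)

# Simulated stand-in for the fecundity measure: 73 strictly positive values.
set.seed(7)
y <- rgamma(73, shape = 2, rate = 1.5)

# Both fits succeed on strictly positive data.
fit_exp <- fitdistr(y, "exponential")
fit_gam <- fitdistr(y, "gamma")
AIC(fit_exp, fit_gam)   # lower AIC indicates the better-fitting candidate

# Replace one value with an exact zero and the gamma fit no longer starts.
y0 <- y
y0[1] <- 0
res <- try(fitdistr(y0, "gamma"), silent = TRUE)
inherits(res, "try-error")
```

If zeros are genuine observations, a distribution that allows them (or a two-part model) may be more appropriate than shifting the data. The ks.test warning about ties usually means that some recorded values are repeated, for instance through rounding, even though the underlying quantity is continuous.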
Re: [R] Analysis of Epidemiological Data Using R
José Ignacio Bustos Melo wrote:

> Hi everyone,
>
> I'm studying the manual "Analysis of Epidemiological Data Using R and Epicalc"
> by Virasakdi Chongsuvivatwong and Edward McNeil, and I can't find the data
> files that they use in some examples. These are the names:
>
>   Chapter7.Rdata, Chapter8.Rdata, Chapter9.Rdata
>
> Can somebody tell me how I can get these files?

As far as I can tell, they are byproducts of working through the examples in the relevant sections. See pp. 78, 88, and 96.

> Thanks!
> José

--
   O__      Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])                             FAX: (+45) 35327907
Re: [R] Regression inclusion of variable, effect on coefficients
Hello :) I am happy to hear that I am not necessarily asking stupid questions. The thing is, that I have data on x1 and x4 for the whole sample. However, theoretically, it is clear that the informational content of x1 is not as high as of x4. x4 provides more accurate information to the subjects participating in the game, as it has been experimentally and theoretically shown that the x1 is biased. So the experimentators introduced x4 in response to the biased x1. Both prevail however together, so that the subjects have available information on x1 and x4. Theoretically, I argued that the relative importance of x1 on y will decrease in light that information x4 is available, as x4 is more accurate. With a simple regression, however, I do not find significant relationships. For x1 it has been empirically and theoretically shown that it has a positive effect on y. The same should hold for x4. There is no necessary theoretical argument as how x1 and x4 interact mathematically, as they both are a measure of the same thing. Yet, x4 is more accurate and contains even more information. It could be any kind of interaction. They are positively correlated, which is also reasonable. Could you suggest me a simple interaction model, with which I could try my luck? Thanks a lot Thiemo -Original Message- From: Uwe Ligges [mailto:[EMAIL PROTECTED] Sent: Montag, 21. April 2008 18:54 To: Thiemo Fetzer Cc: r-help@r-project.org Subject: Re: [R] Regression inclusion of variable, effect on coefficients This is not a dump question. This is a serious problem and it depends on what you know or assume about the relastionship between x1 and x4. If you assume linear interaction, you might want to introduce some interaction term to the model for example. Uwe Ligges Thiemo Fetzer wrote: Hello dear R users! I know this question is not strictly R-help, yet, maybe some of the guru's in statistics can help me out. I have a sample of data all from the same population. 
Say my regression equation is now this: m1 <- lm(y ~ x1 + x2 + x3) I also fit m2 <- lm(y ~ x1 + x2 + x3 + x4) The thing is that I want to study the effect of information x4. I would hypothesize that the coefficient estimate for x1 goes down as I introduce x4, as x4 conveys some of the information conveyed by x1 (but not only that). Of course x1 and x4 are correlated; however, multicollinearity does not appear to be a problem, as the variance inflation factors are rather low (around 1.5 or so). I basically want to study how x1 and x4 interact when x4 is introduced into the regression equation, and whether my hypothesis is correct, i.e. that once I consider the information in x4, less of the variation is explained by x1. I observe that when I introduce x4 into the regression, the coefficient estimate for x1 goes down and the associated p-value becomes bigger, i.e. x1 becomes comparatively less significant. However, x4 is not significant. Yet the observation is in line with my theoretical argument. The question is now simple: how can I work this out? I know this is likely a dumb question, but I would really appreciate some links or help. Regards Thiemo
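A minimal sketch of the comparison described in this message, run on simulated data. The sample size, seed, and all coefficients in the data-generating step are assumptions made purely for illustration; they are not taken from the thread. The sketch fits both models, puts the x1 estimates side by side, and uses an F test for the nested models to check whether x4 adds explanatory power.

```r
set.seed(1)                        # arbitrary seed, only for reproducibility
n  <- 200                          # hypothetical sample size
x4 <- rnorm(n)                     # the more accurate signal (assumed)
x1 <- 0.6 * x4 + rnorm(n)          # noisier signal, positively correlated with x4
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- 0.3 * x1 + 0.5 * x4 + 0.2 * x2 + rnorm(n)
d  <- data.frame(y, x1, x2, x3, x4)

m1 <- lm(y ~ x1 + x2 + x3, data = d)        # without x4
m2 <- lm(y ~ x1 + x2 + x3 + x4, data = d)   # with x4

# Estimate and standard error of x1 in each model
x1_change <- rbind(without_x4 = coef(summary(m1))["x1", 1:2],
                   with_x4    = coef(summary(m2))["x1", 1:2])
print(x1_change)

# F test for the nested models: does adding x4 improve the fit?
cmp <- anova(m1, m2)
print(cmp)
```

In this simulated setting the x1 estimate shrinks once x4 enters, because x1 no longer absorbs the part of x4 it is correlated with. Whether real data behave the same way depends on the true relationship between the two variables.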
Re: [R] Regression inclusion of variable, effect on coefficients
Hello! I was thinking again about the possible interaction between x1 and x4. Theoretically it makes sense that the influence of x4 on y is stronger the less informative x1 is. It can be argued that the higher x1 is, the less informative it is. How could I incorporate this relationship into the model? Thanks a lot for your help in advance, Thiemo
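One common way to let the effect of x4 depend on the level of x1 is a product (interaction) term in the model formula. The sketch below is an illustration under stated assumptions: the data are simulated, the coefficient values and seed are invented, and the centred variables x1c and x4c are helper names introduced here. With the term x1c:x4c in the model, the slope of x4 at a given x1 is the x4 coefficient plus the interaction coefficient times x1.

```r
set.seed(2)                        # arbitrary seed, only for reproducibility
n  <- 300                          # hypothetical sample size
x1 <- rnorm(n)
x4 <- 0.5 * x1 + rnorm(n)          # positively correlated with x1 (assumed)
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- 0.3 * x1 + 0.4 * x4 + 0.25 * x1 * x4 + 0.2 * x2 + rnorm(n)
d  <- data.frame(y, x1, x2, x3, x4)

# Centre the interacting variables so the main effects are slopes at the mean
d$x1c <- d$x1 - mean(d$x1)
d$x4c <- d$x4 - mean(d$x4)

# x1c * x4c expands to x1c + x4c + x1c:x4c
m3 <- lm(y ~ x1c * x4c + x2 + x3, data = d)
print(coef(summary(m3)))

# Implied slope of x4 at a low and a high value of x1
b        <- coef(m3)
x1_vals  <- quantile(d$x1c, c(0.1, 0.9))
x4_slope <- b[["x4c"]] + b[["x1c:x4c"]] * x1_vals
print(x4_slope)
```

A positive x1c:x4c estimate would match the hypothesis that x4 matters more when x1 is high. A linear interaction is only one possible functional form, as noted in the reply quoted earlier in the thread.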
Re: [R] finding an unknown distribution
andrea previtali wrote: Hi, I need to analyze the influences of several factors on a variable that is a measure of fecundity, consisting of 73 observations ranging from 0 to 5. The variable is continuous and highly positively skewed, and none of the typical transformations was able to normalize the data. Thus, I was thinking of analyzing these data using a generalized linear model, where I can specify a distribution other than the normal. I'm thinking it may fit a gamma or exponential distribution. But I'm not sure whether the data meet the assumptions of those distributions because their definitions are too complex for my understanding! Roughly, the exponential distribution is the model of a random variable describing the time/distance between two independent events that occur at the same constant rate. The gamma distribution is the model of a random variable that can be thought of as the sum of exponential random variables. I don't think fecundity data, the count of reproductive cells, qualifies as a random variable to be modeled by either of these distributions. If the count of reproductive cells is very large, and you are modeling this count as a function of animal size, such as length, you should consider the lognormal distribution, since the count of cells grows multiplicatively (volumetrically) with the increase in length. In that case you can model your response variable using glm with family=gaussian(link=log). Rubén
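A minimal sketch of how the candidate models mentioned in this exchange can be fitted in R. The data are simulated, and the covariate name size, the seed, and the coefficients are assumptions for illustration only. Two points are worth keeping in mind: glm with gaussian(link = "log") models the log of the mean with additive normal errors, which is not the same model as a lognormal fit via lm on log(y); and both the Gamma family and log(y) require strictly positive responses, so exact zeros in the data would need separate handling.

```r
set.seed(3)                                  # arbitrary seed
n    <- 73                                   # number of observations in the question
size <- runif(n, 1, 3)                       # hypothetical covariate
mu   <- exp(-1 + 0.8 * size)                 # assumed mean on the original scale
fecundity <- rgamma(n, shape = 2, rate = 2 / mu)  # right-skewed, strictly positive
d <- data.frame(fecundity, size)

# Normal errors around exp(linear predictor)
fit_gauss_log <- glm(fecundity ~ size, family = gaussian(link = "log"), data = d)

# Gamma errors with the same log link (variance grows with the mean)
fit_gamma_log <- glm(fecundity ~ size, family = Gamma(link = "log"), data = d)

# Lognormal model: normal errors on the log scale
fit_lognormal <- lm(log(fecundity) ~ size, data = d)

# Coefficients side by side (all on the log scale)
coef_table <- cbind(gauss_log = coef(fit_gauss_log),
                    gamma_log = coef(fit_gamma_log),
                    lognormal = coef(fit_lognormal))
print(coef_table)
```

Residual plots (for example plot(fit_gamma_log)) are usually more informative for choosing among these than the definitions of the distributions.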
Re: [R] graphics history
Duncan Murdoch [EMAIL PROTECTED] wrote: One thing I'd like to do, but didn't have time to implement before 2.7.0, is to have history set to some finite size, e.g. a default might be the last 3 or 10 plots. The problem with record=TRUE is that it keeps a record of all the plots, so memory use just increases and increases. Why not just start up another device with record=FALSE? I'd like to have recording always on, but I don't need an infinite history. But this isn't urgent enough to have prodded me into writing it before now. A finite size would be nice. I've been using this code in scripts: graphics.off() windows(record = TRUE) .SavedPlots <- NULL Not exactly the same thing, but it limits memory use. Are there side effects that could bite me? -- Mike Prager, NOAA, Beaufort, NC * Opinions expressed are personal and not represented otherwise. * Any use of tradenames does not constitute a NOAA endorsement.
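The finite history discussed here can be approximated in user code with recordPlot() and replayPlot(), which work on any device with its display list enabled. This is a different technique from the windows(record = TRUE) history in the message, and the helper make_plot_history() below is a hypothetical name introduced for this sketch. It keeps only the most recent max_plots recordings, so memory use stays bounded.

```r
# Hypothetical helper: a bounded store of recorded plots
make_plot_history <- function(max_plots = 3) {
  plots <- list()
  list(
    save = function() {
      # Append a snapshot of the current device, then drop the oldest if needed
      plots[[length(plots) + 1]] <<- recordPlot()
      if (length(plots) > max_plots) plots <<- tail(plots, max_plots)
      invisible(length(plots))
    },
    size   = function() length(plots),
    replay = function(i) replayPlot(plots[[i]])
  )
}

pdf(NULL)                               # null device so the sketch runs anywhere
dev.control(displaylist = "enable")     # recording needs the display list

plot_hist <- make_plot_history(max_plots = 3)
for (k in 1:5) {
  plot(seq_len(k), main = paste("plot", k))
  plot_hist$save()
}
n_kept <- plot_hist$size()              # only the last 3 of the 5 plots remain
plot_hist$replay(n_kept)                # redraw the most recent one
invisible(dev.off())
```

On an interactive device the pdf(NULL) and dev.control() lines can be dropped, since screen devices normally have the display list enabled already.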
Re: [R] Change the core code
I often just download the source, find the appropriate function, create a file with an alternate version of it (i.e. plotMeans becomes plot.Means) and modify it to suit. I load all custom functions like that from my .Rprofile. I guess you _could_ recompile the library, but that might break other things later on. Hence the better choice of your own custom library of scripts. -Jarrett threshold wrote: Hi, pretty basic question: is it possible to change the code of a function within a library? If so, what should I do? I work on R under Linux (Ubuntu), thanks a lot
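The approach described above can be sketched as follows. The example uses stats::sd only because it is short; sd_pop is a hypothetical name for the modified copy, and the file path in the comment is likewise an assumption. The packaged function is left untouched, so other code that relies on it keeps working.

```r
# Typing a function's name without parentheses prints its source
stats::sd

# Modified copy under a new name: divides by n instead of n - 1
sd_pop <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  sqrt(sum((x - mean(x))^2) / length(x))
}

x <- c(1, 2, 3, 4)
original <- sd(x)        # sample standard deviation from the stats package
modified <- sd_pop(x)    # population version from the custom copy
print(c(original = original, modified = modified))

# To load such functions in every session, save them in a file
# (for example ~/R/custom_functions.R, a hypothetical path) and add
#   source("~/R/custom_functions.R")
# to ~/.Rprofile.
```

Giving the copy a new name avoids masking the original, which makes it obvious at the call site which version is in use.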