Re: [R] Q. regarding optim()
On Fri, 21 Jul 2006, N. Goodacre wrote:

> Dear R mailing group,
>
> The second argument to optim() is a function whose parameters are to be
> optimized. The description of this function given in the help file is the
> following:
>
>   fn: A function to be minimized (or maximized), with first argument the
>       vector of parameters over which minimization is to take place. It
>       should return a scalar result.
>
> Let's say the second argument is x, a vector of x values ...

But it says 'parameters'. Please look at the examples on that help page.
optim() is concerned with optimizing functions of more than one parameter.
In a statistical setting 'x' may be the data, but e.g. for fitting a gamma
distribution c(scale, shape) are the parameters.

> I would have thought that fn should return a vector of y values for the x
> values entered as the second argument. If the function just takes one
> value at a time and outputs a scalar, how can I specify for fn which x
> value, of the vector x, to take?
>
> Sincerely,
> Norman Goodacre

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,            Tel: +44 1865 272861 (self)
1 South Parks Road,                   +44 1865 272866 (PA)
Oxford OX1 3TG, UK               Fax: +44 1865 272595

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
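To make Ripley's distinction concrete, here is a small sketch (not from the original thread; the data and starting values are made up) of fitting a gamma distribution by maximum likelihood: optim() varies the parameter vector, while the data x are passed through the `...` mechanism and stay fixed.

```r
# Fit a gamma distribution by ML: the first argument of fn is the
# *parameter* vector; the data 'x' are fixed and passed via '...'.
set.seed(1)
x <- rgamma(200, shape = 2, scale = 3)   # simulated data

negll <- function(logpar, x) {
  # optimize on the log scale to keep shape and scale positive
  -sum(dgamma(x, shape = exp(logpar[1]), scale = exp(logpar[2]), log = TRUE))
}

fit <- optim(c(0, 0), negll, x = x)
exp(fit$par)   # estimates of c(shape, scale), roughly c(2, 3)
```

Note that fn returns a single scalar (the negative log-likelihood summed over all the data), which is exactly what optim() needs.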
Re: [R] Parameterization puzzle
R does not know that poly(age,2) and poly(age,1) are linearly dependent.
(And indeed they only are for some functions 'poly'.) I cannot reproduce
your example ('l' is missing), but perhaps

  glm(deaths ~ poly(age,2) + poly(age,1)*Smoke + offset(l), poisson)

was your intention?

On Fri, 21 Jul 2006, Murray Jorgensen wrote:

> Consider the following example (based on an example in Pat Altham's GLM
> notes):
>
>   pyears <- scan()
>   18793 52407 10673 43248 5710 28612 2585 12663 1462 5317
>
>   deaths <- scan()
>   2 32 12 104 28 206 28 186 31 102
>
>   Smoke <- gl(2, 1, 10, labels = c("No", "Yes"))
>   Age <- gl(5, 2, 10, labels = c("35-44", "45-54", "55-64", "65-74", "75-84"),
>             ordered = TRUE)
>   mod1.glm <- glm(deaths ~ Age * Smoke + offset(l), family = poisson)
>   summary(mod1.glm)
>   age <- as.numeric(Age)
>   mod2.glm <- aso1.glm <- glm(deaths ~ poly(age,2) + Smoke +
>                               poly(age,1):Smoke + offset(l), family = poisson)
>   summary(mod2.glm)
>
> The business part of the summary for the first model
>
>                   Estimate Std. Error z value Pr(>|z|)
>   (Intercept)     -5.92706    0.16577 -35.754  < 2e-16 ***
>   Age.L            4.06490    0.47414   8.573  < 2e-16 ***
>   Age.Q           -1.08293    0.41326  -2.620 0.008781 **
>   Age.C            0.24158    0.31756   0.761 0.446816
>   Age^4            0.04244    0.23061   0.184 0.853986
>   SmokeYes         0.61916    0.17296   3.580 0.000344 ***
>   Age.L:SmokeYes  -1.31234    0.49267  -2.664 0.007729 **
>   Age.Q:SmokeYes   0.39043    0.43008   0.908 0.363976
>   Age.C:SmokeYes  -0.29593    0.33309  -0.888 0.374298
>   Age^4:SmokeYes  -0.03682    0.24432  -0.151 0.880218
>
> inspires me to fit the second model, which omits the nonsignificant terms;
> however this produces the summary
>
>                          Estimate Std. Error z value Pr(>|z|)
>   (Intercept)             -5.8368     0.1213 -48.103  < 2e-16 ***
>   poly(age, 2)1            3.9483     0.1755  22.497  < 2e-16 ***
>   poly(age, 2)2           -1.0460     0.1448  -7.223 5.08e-13 ***
>   SmokeYes                 0.5183     0.1262   4.106 4.02e-05 ***
>   SmokeNo:poly(age, 1)     1.3755     0.4340   3.169  0.00153 **
>   SmokeYes:poly(age, 1)        NA         NA      NA       NA
>
> Why do we get a SmokeNo:poly(age, 1) term? Can I re-express mod2.glm so
> that this term does not appear?
>
> Cheers, Murray Jorgensen

--
Brian D. Ripley, [EMAIL PROTECTED]
[R] insert insertRow?
Dear all,

Searching for a command to insert a row between other rows of a data frame,
I found that there seems to be no such command in base R. There is, however,
a very simple function insertRow in the micEcon package that makes use of
rbind(). I wondered whether the following micEcon functions could be
included in the base package:

  insertRow
  insertCol

Since the functions already exist, I would gather this is not a very big
effort. It would save people who just want to insert rows and columns easily
from having to install another two packages (since micEcon needs systemfit)
or define the functions themselves.

I hope my suggestion will be taken into consideration. Whether it is adopted
or not, I still think R is a fantastic statistical package that I love to
use.

Greetings,
Edwin Commandeur
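For readers who want this today, a minimal sketch of such a helper (this is not the micEcon implementation, just one possible rbind()-based version; it assumes 1 <= r <= nrow(df)):

```r
# Insert 'newrow' so that it becomes row 'r' of 'df'
# (assumes 1 <= r <= nrow(df); appending at the end needs a separate rbind)
insertRow <- function(df, newrow, r) {
  rbind(df[seq_len(r - 1), , drop = FALSE],
        newrow,
        df[seq(r, nrow(df)), , drop = FALSE])
}

df <- data.frame(x = c(1, 3), y = c("a", "c"))
insertRow(df, data.frame(x = 2, y = "b"), 2)   # new middle row: x = 2, y = "b"
```

The split-and-rbind idea is the same one the poster describes; row names come out slightly mangled, which a production version would tidy up.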
[R] Merge two dataframes of different column length and row length by two columns at a time
Hello,

I have two dataframes of the following structures:

  str(a)
  'data.frame': 1354896 obs. of 14 variables:
   $ V1 : int 0 1 2 3 4 5 6 7 8 9 ...
   $ V2 : int 0 0 0 0 0 0 0 0 0 0 ...
   $ V3 : int 74 12305 103 12337 46 57 12446 90 12097 79 ...
   $ V4 : num 11.8 1529.2 17.8 1579.4 6.7 ...
   $ V5 : int 88 11040 104 11557 56 58 11040 74 10991 81 ...
   $ V6 : num 15.5 1921.3 20.3 1888.2 12.6 ...
   $ V7 : int 87 8793 90 10142 67 64 9264 73 8589 71 ...
   $ V8 : num 16.0 1750.6 15.2 1783.7 11.0 ...
   $ V9 : int 68 11279 93 11831 43 61 11234 82 10919 76 ...
   $ V10: num 11.5 1999.5 39.0 1842.2 5.0 ...
   $ V11: int 110 12456 92 12063 60 59 12393 82 11831 77 ...
   $ V12: num 21.4 1974.7 33.9 1689.9 10.6 ...
   $ V13: int 81 10887 101 10874 62 74 5 79 10789 93 ...
   $ V14: num 19.5 1812.3 31.7 1524.1 11.9 ...

  str(b)
  'data.frame': 1213901 obs. of 4 variables:
   $ V1: int 0 1 2 3 5 6 7 8 9 10 ...
   $ V2: int 0 0 0 0 0 0 0 0 0 0 ...
   $ V3: Factor w/ 54676 levels "str23","str53",..: 54676 54676 54676 ...
   $ V4: Factor w/ 3 levels "Match","NoMatch",..: 2 2 2 2 2 2 2 2 2 2 ...

I want to merge these dataframes by V1 and V2 of a and b. The combination of
V1 and V2 is a unique key. Note that b is smaller than a. Any suggestions to
solve this problem?

Gunther Höning
Diplom Physiker / Bioinformatiker
Langenbeckstraße 1
55131 Mainz
[EMAIL PROTECTED]
Re: [R] Correspondence analysis with R
On Thu, 2006-07-20 at 14:38, Emanuele Mazzola wrote:

> Hello everybody, I'm having some trouble performing correspondence
> analysis with R for Mac OS X. Does anyone know of a useful package? Also,
> once I have found the coordinates of the points representing row and
> column profiles, how do I put them in a single figure with labels
> identifying each one of them? This thing is getting me almost crazy...
> Thank you in advance for answering, bye
> Emanuele

A summary of packages/functions providing ordination methods, including CA,
is in the Environmetrics Task View on CRAN, e.g.:

  http://www.stats.bris.ac.uk/R/src/contrib/Views/Environmetrics.html

HTH

G
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
*Note new Address and Fax and Telephone numbers from 10th April 2006*
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson                  [t] +44 (0)20 7679 0522
ECRC                           [f] +44 (0)20 7679 0565
UCL Department of Geography
Pearson Building               [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street
London, UK                     [w] http://www.ucl.ac.uk/~ucfagls/cv/
WC1E 6BT                       [w] http://www.ucl.ac.uk/~ucfagls/
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Re: [R] Loss of numerical precision from conversion to list ?
Duncan Murdoch [EMAIL PROTECTED] writes:

> R tries to use the maximum precision (64 bit mantissa) in the floating ...
> Or perhaps your problem has nothing to do with this; I didn't really look
> at it in detail.

It hasn't. I was off speculating about sum vs rowSums too, but:

  num.v <- rowSums(((lambda - lambda0) * mu * w.k.sq[, -(K+1)]) / (1 + lambda * mu))

Inside this we have mu * w.k.sq[, -(K+1)]. mu is a vector of length 27, and
w.k.sq has 10 rows and 28 *columns*. Column-major storage and vector
recycling kick in... If mu has identical elements (never mind the
magnitude), of course, the recycling doesn't matter.

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr. B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark
~~~~~~~~~~ - ([EMAIL PROTECTED])        Ph: (+45) 35327918, FAX: (+45) 35327907
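The pitfall Dalgaard describes is easy to see on a toy matrix (dimensions chosen for illustration, not the poster's 10 x 28 case): `*` between a matrix and a vector recycles the vector down the columns, not along the rows.

```r
# '*' recycles the vector in column-major order, i.e. down the columns
m <- matrix(1, nrow = 2, ncol = 3)
v <- c(10, 20, 30)

m * v                # first column is (10, 20): rows are NOT (10, 20, 30)
sweep(m, 2, v, "*")  # row-wise scaling: every row is (10, 20, 30)
```

When the vector length divides the number of matrix elements (as 27 divides into a 10 x 28 matrix's 280), there is not even a recycling warning, which is why this mistake is silent.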
Re: [R] Correspondence analysis with R -follow up
On Thu, 2006-07-20 at 16:40, Emanuele Mazzola wrote:

> Hi all, thank you for your answers; I've tried both cca from the vegan
> library and dudi.coa from the ade4 library. One last question: my work is
> mainly with contingency tables, like the one I'm posting here:
>
>   acciaieria <- c(.41, .02, .44, .04, .09)
>   laminatoio <- c(.34, .28, .26, .01, .11)
>   fonderia   <- c(.48, .05, .34, .08, .05)
>   leghe      <- c(.45, .19, .25, .03, .08)
>   pct <- cbind(acciaieria, laminatoio, fonderia, leghe)
>   pct <- data.frame(pct, row.names = c("normale", "preparaz.", "manutenz.",
>                                        "installaz.", "trasferim."))
>
> BUT... cca and dudi.coa seem to return quite different graphical results;
> where am I going wrong? Do they do the same for you with the table above?
> Thank you very much again! Bye
> Emanuele

Hi,

I haven't used ade4 at all, but as long as you did the CA correctly using
the ade4 functions, I doubt the plotted results differ in anything but
cosmetic ways, or perhaps in the scalings used. You are plotting two bits of
information in the biplot and you can only represent one of those optimally;
or you could compromise and plot with symmetric scaling. Or you could plot
them in a multitude of ways - there are whole books on biplots!

Take a look at the argument 'scaling' in ?plot.cca (for vegan). Try plotting
your CA with different scalings and see if that better matches the CA from
ade4, e.g.:

  library(vegan)
  mod <- cca(pct)
  plot(mod)              # default is scaling = 2
  plot(mod, scaling = 1)
  plot(mod, scaling = 3)

If using vegan, you might want to look at Jari Oksanen's vegan tutorial for
more info on using his functions:

  http://cc.oulu.fi/~jarioksa/opetus/metodi/vegantutor.pdf

If this doesn't help your understanding or problem, post back with the ade4
and vegan code you are using and I'll have a look.

G
[R] failed installing rgl
Dear all,

I have tried installing rgl with the usual command:

  R CMD INSTALL rgl_0.67-2.tar.gz

Unlike the last time, when I successfully installed this package, this time
there was a failure:

  ...
  g++ -I/usr/lib/R/include -I/usr/lib/R/include -I -DHAVE_PNG_H
      -I/usr/include/libpng12 -I/usr/local/include -fpic -O2 -g -pipe -Wall
      -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
      --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic
      -fasynchronous-unwind-tables -c api.cpp -o api.o
  In file included from glgui.hpp:9,
                   from gui.hpp:11,
                   from rglview.h:10,
                   from Device.hpp:11,
                   from DeviceManager.hpp:9,
                   from api.cpp:14:
  opengl.hpp:23:20: error: GL/glu.h: No such file or directory
  Disposable.hpp:13: warning: 'struct IDisposeListener' has virtual
      functions but non-virtual destructor
  types.h:77: warning: 'class DestroyHandler' has virtual functions but
      non-virtual destructor
  gui.hpp:56: warning: 'class gui::WindowImpl' has virtual functions but
      non-virtual destructor
  gui.hpp:90: warning: 'class gui::GUIFactory' has virtual functions but
      non-virtual destructor
  pixmap.h:39: warning: 'class PixmapFormat' has virtual functions but
      non-virtual destructor
  api.cpp: In function 'void rgl_user2window(int*, int*, double*, double*,
      double*, double*, int*)':
  api.cpp:707: error: 'gluProject' was not declared in this scope
  api.cpp: In function 'void rgl_window2user(int*, int*, double*, double*,
      double*, double*, int*)':
  api.cpp:735: error: 'gluUnProject' was not declared in this scope
  make: *** [api.o] Error 1
  chmod: cannot access `/usr/lib/R/library/rgl/libs/*': No such file or directory
  ERROR: compilation failed for package 'rgl'
  ** Removing '/usr/lib/R/library/rgl'
  ...

It is clear that glu.h cannot be found during installation and, indeed, I
have checked and there is no glu.h under /usr/include/GL. Any suggestion on
how I could proceed? I am using Fedora Core 5. Typing glxinfo suggests that
GLU 1.3 is installed. But where is glu.h, then?

Many thanks,
J
--
Dr James Foadi
Department of Physics
University of York
York YO10 5DD
email: [EMAIL PROTECTED]
Tel: 0044 1904 434622
Mobile: 0044 7740 678548
Re: [R] Parameterization puzzle
G'day all,

>>>>> "BDR" == Prof Brian Ripley <[EMAIL PROTECTED]> writes:

  BDR> R does not know that poly(age,2) and poly(age,1) are linearly
  BDR> dependent.

Indeed, I also thought that this is the reason for the problem.

  BDR> (And indeed they only are for some functions 'poly'.)

I am surprised about this. I should probably read the help page of 'poly'
once more, and more carefully.

  BDR> I cannot reproduce your example ('l' is missing), [...]

My guess is that 'l' is 'pyears'. At least, I worked under that assumption.
Interestingly, on my machine (using R 2.3.1, 2.2.1 and 2.1.1) I cannot fit
any of the Poisson GLMs that Murray tried. I always get the error message:

  Error: no valid set of coefficients has been found: please supply starting values

But I have to investigate this further. I can fit binomial models that give
me similar answers.

  BDR> [...] but perhaps
  BDR>   glm(deaths ~ poly(age,2) + poly(age,1)*Smoke + offset(l), poisson)
  BDR> was your intention?

In this parameterisation a 'poly(age,1)' term will appear among the
coefficients with an estimated value of NA, since it is aliased with
'poly(age, 2)1'. So I don't believe that this was Murray's intention. The
only suggestion I can come up with is:

  summary(glm(cbind(deaths, l - deaths) ~ age*Smoke + I(age^2), family = binomial))
  [...]
  Coefficients:
                Estimate Std. Error z value Pr(>|z|)
  (Intercept)  -10.79895    0.45149 -23.918  < 2e-16 ***
  age            2.37892    0.20877  11.395  < 2e-16 ***
  SmokeYes       1.44573    0.37347   3.871 0.000108 ***
  I(age^2)      -0.19706    0.02749  -7.168  7.6e-13 ***
  age:SmokeYes  -0.30850    0.09756  -3.162 0.001566 **
  [...]

This doesn't use orthogonal polynomials any more, but I don't see how you
can fit the model that Murray wants to fit using orthogonal polynomials,
given the way R's model language operates. So I guess the Poisson GLM that
Murray wants to fit is:

  glm(deaths ~ age*Smoke + I(age^2) + offset(l), family = poisson)

Cheers,
        Berwin

========================== Full address ===========================
Berwin A Turlach                     Tel.: +61 (8) 6488 3338 (secr)
School of Mathematics and Statistics       +61 (8) 6488 3383 (self)
The University of Western Australia  FAX : +61 (8) 6488 1028
35 Stirling Highway
Crawley WA 6009              e-mail: [EMAIL PROTECTED]
Australia                    http://www.maths.uwa.edu.au/~berwin
Re: [R] Merge two dataframes of different column length and row length by two columns at a time
On Fri, Jul 21, 2006 at 09:33:56AM +0200, Gunther Höning wrote:

> I have two dataframes of the following structures:
> [...]
> I want to merge these dataframes by V1 and V2 of a and b. The combination
> of V1, V2 is a unique key. Note that b is smaller than a. Any suggestions
> to solve this problem?

Use merge().

cu
        Philipp

--
Dr. Philipp Pagel                               Tel. +49-8161-71 2131
Dept. of Genome Oriented Bioinformatics         Fax. +49-8161-71 2186
Technical University of Munich
Science Center Weihenstephan
85350 Freising, Germany
and
Institute for Bioinformatics / MIPS             Tel. +49-89-3187 3675
GSF - National Research Center                  Fax. +49-89-3187 3585
for Environment and Health
Ingolstädter Landstrasse 1
85764 Neuherberg, Germany
http://mips.gsf.de/staff/pagel
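Pagel's one-liner deserves a worked sketch. On toy frames shaped like the poster's (the real ones have over a million rows; the values here are made up), merging on the compound key c("V1", "V2") looks like this:

```r
# Toy stand-ins for the poster's data frames 'a' and 'b'
a <- data.frame(V1 = c(0, 1, 2), V2 = c(0, 0, 0), V3 = c(74, 12305, 103))
b <- data.frame(V1 = c(0, 2),    V2 = c(0, 0),    V4 = c("Match", "NoMatch"))

merge(a, b, by = c("V1", "V2"))                 # inner join: keys present in both (2 rows)
merge(a, b, by = c("V1", "V2"), all.x = TRUE)   # left join: keep every row of 'a',
                                                # V4 is NA where 'b' has no match
```

Since b is the smaller frame, the default inner join drops rows of a with no counterpart in b; `all.x = TRUE` keeps them.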
Re: [R] Parameterization puzzle
On Fri, 21 Jul 2006, Berwin A Turlach wrote:

>   BDR> R does not know that poly(age,2) and poly(age,1) are linearly
>   BDR> dependent.
>
> Indeed, I also thought that this is the reason for the problem.
>
>   BDR> (And indeed they only are for some functions 'poly'.)
>
> I am surprised about this. I should probably read the help page of 'poly'
> once more, and more carefully.

My point was perhaps simpler: if you or I or Murray had a function poly() in
our workspace, that one would be found, I think. (I have not checked the
ramifications of namespaces here, but I believe that would be the intention:
formulae are evaluated in their defining environment.) So only when the
model matrix is set up could the linear dependence be known (and there is
nothing in the system poly() to tell model.matrix). What is sometimes called
extrinsic aliasing is left to the fitting function, which seems to do a
sensible job provided the main effect is in the model. Indeed, including
interactions without main effects (as Murray did) often makes the results
hard to interpret.

>   BDR> I cannot reproduce your example ('l' is missing), [...]
>
> My guess is that 'l' is 'pyears'. At least, I worked under that
> assumption.

Apparently l = log(pyears), which was my uncertain guess.

> Interestingly, on my machine (using R 2.3.1, 2.2.1 and 2.1.1) I cannot
> fit any of the Poisson GLMs that Murray tried. I always get the error
> message:
>
>   Error: no valid set of coefficients has been found: please supply
>   starting values

Related to the offset, I believe.

> But I have to investigate this further. I can fit binomial models that
> give me similar answers.
>
>   BDR> [...] but perhaps
>   BDR>   glm(deaths ~ poly(age,2) + poly(age,1)*Smoke + offset(l), poisson)
>   BDR> was your intention?
>
> In this parameterisation a 'poly(age,1)' term will appear among the
> coefficients with an estimated value of NA, since it is aliased with
> 'poly(age, 2)1'. So I don't believe that this was Murray's intention.
>
> [...]

--
Brian D. Ripley, [EMAIL PROTECTED]
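Extrinsic aliasing as Ripley describes it is easy to see in a toy fit (an illustrative example, not from the thread): the fitting function, not model.matrix(), discovers the dependence and reports NA for the redundant coefficient.

```r
# A term that is an exact linear combination of an earlier column is
# detected only during fitting and gets an NA coefficient.
set.seed(42)
x <- 1:10
y <- rnorm(10)
fit <- lm(y ~ x + I(2 * x))   # I(2 * x) is collinear with x
coef(fit)                     # the 'I(2 * x)' entry is NA
```

This is exactly the pattern in Murray's second summary: the SmokeYes:poly(age, 1) row is aliased with columns already in the model, so it shows up as NA.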
Re: [R] Correspondence analysis with R -follow up
Hi,

> Hi all, thank you for your answers; I've tried both cca from the vegan
> library and dudi.coa from the ade4 library. One last question: my work is
> mainly with contingency tables, like the one I'm posting here:
>
>   acciaieria <- c(.41, .02, .44, .04, .09)
>   laminatoio <- c(.34, .28, .26, .01, .11)
>   fonderia   <- c(.48, .05, .34, .08, .05)
>   leghe      <- c(.45, .19, .25, .03, .08)
>   pct <- cbind(acciaieria, laminatoio, fonderia, leghe)
>   pct <- data.frame(pct, row.names = c("normale", "preparaz.", "manutenz.",
>                                        "installaz.", "trasferim."))

The data frame pct is not a contingency table.

> BUT... cca and dudi.coa seem to return quite different graphical results;
> where am I going wrong? Do they do the same for you with the table above?

The graphical results are similar. See the argument 'method' in the function
scatter.coa (library ade4):

  method: an integer between 1 and 3
    1 - Rows and columns with the coordinates of lambda variance
    2 - Rows variance 1 and columns by averaging
    3 - Columns variance 1 and rows by averaging

An example:

  require(ade4)
  data(rpjdl)
  coa1 <- dudi.coa(rpjdl$fau, scannf = FALSE, nf = 4)
  require(vegan)
  coa2 <- cca(rpjdl$fau)

  # biplot representations
  par(mfrow = c(2, 2))
  plot(coa2, type = "text")
  ?scatter.coa
  scatter(coa1, method = 1)
  scatter(coa1, method = 2)
  scatter(coa1, method = 3)

Hope this helps :)

Pierre
Re: [R] failed installing rgl
On FC5:

  roc% rpm -q --file /usr/include/GL/glu.h
  mesa-libGLU-devel-6.4.2-6.FC5.3

Do check what the R-admin manual says about -devel RPMs.

On Thu, 20 Jul 2006, James Foadi wrote:

> Dear all,
> I have tried installing rgl with the usual command:
>
>   R CMD INSTALL rgl_0.67-2.tar.gz
>
> [...]
>
> It is clear that glu.h cannot be found during installation and, indeed, I
> have checked and there is no glu.h under /usr/include/GL. Any suggestion
> on how I could proceed? I am using Fedora Core 5. Typing glxinfo suggests
> that GLU 1.3 is installed. But where is glu.h, then?
>
> Many thanks,
> J

--
Brian D. Ripley, [EMAIL PROTECTED]
Re: [R] Loss of numerical precision from conversion to list ?
Thank you both very much for your help. Peter Dalgaard is right -- I didn't
consider the fact that elementwise multiplication is column-wise rather than
row-wise. Sorry for taking up time with such a trivial mistake.

-------- Original Message --------
Date: 21 Jul 2006 10:07:31 +0200
From: Peter Dalgaard [EMAIL PROTECTED]
To: Duncan Murdoch [EMAIL PROTECTED]
Subject: Re: [R] Loss of numerical precision from conversion to list ?

> It hasn't. I was off speculating about sum vs rowSums too, but:
>
>   num.v <- rowSums(((lambda - lambda0) * mu * w.k.sq[, -(K+1)]) / (1 + lambda * mu))
>
> Inside this we have mu * w.k.sq[, -(K+1)]. mu is a vector of length 27,
> and w.k.sq has 10 rows and 28 *columns*. Column-major storage and vector
> recycling kick in...
>
> [...]
Re: [R] failed installing rgl
Subject: Re: [R] failed installing rgl
Date: Friday 21 July 2006 10:08
From: Prof Brian D Ripley [EMAIL PROTECTED]
To: James Foadi [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch

> On FC5:
>
>   roc% rpm -q --file /usr/include/GL/glu.h
>   mesa-libGLU-devel-6.4.2-6.FC5.3
>
> Do check what the R-admin manual says about -devel RPMs.

I have followed Brian Ripley's advice to read the R-admin manual. Appendix A
reports:

  Remember that some package management systems (such as RPM and deb) make a
  distinction between the user version of a package and the development
  version. The latter usually has the same name but with the extension
  '-devel' or '-dev': you need both versions installed.

In fact, glu.h was not installed because only the user version of GLU was on
my system. So I downloaded the development version of GLU
(mesa-libGLU-devel-6.4.2-6.FC5.3.i386.rpm) and installed it. After that, the
rgl package installed fine.

Thank you, Brian.

J
[R] from character to numeric over multiple columns
Hi All,

I have a data frame of three columns, all of which are names. The names in
the three cols overlap up to a point, so I might have *Harry* in all three
columns, *Tom* in cols 1 and 3 and *Bob* in col 3 only.

  harry  paul   bob
  anita  harry  tom
  frank  jack   harry
  pete   ben    tom

I want to turn the names into numbers, BUT I want the numeric code for, say,
*Harry* to be the same in all columns. Doing

  cbind(as.numeric(col1), as.numeric(col2), as.numeric(col3))

does not work because the factors are different in each column, hence the
same name gets a different number in different columns.

Ideas?

Cheers,
Federico
--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG
Tel +44 (0)20 7594 1602     Fax (+44) 020 7594 3193
f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com
[R] [Fwd: Re: Parameterization puzzle]
Bother! This cold has made me accident-prone. I meant to hit Reply-all.
Clarification below.

-------- Original Message --------
Subject: Re: [R] Parameterization puzzle
Date: Fri, 21 Jul 2006 19:10:03 +1200
From: Murray Jorgensen [EMAIL PROTECTED]
To: Prof Brian Ripley [EMAIL PROTECTED]

Apologies for a non-self-contained example. Here is what I should have sent:

  pyears <- scan()
  18793 52407 10673 43248 5710 28612 2585 12663 1462 5317

  deaths <- scan()
  2 32 12 104 28 206 28 186 31 102

  l <- log(pyears)
  Smoke <- gl(2, 1, 10, labels = c("No", "Yes"))
  Age <- gl(5, 2, 10, labels = c("35-44", "45-54", "55-64", "65-74", "75-84"),
            ordered = TRUE)
  mod1.glm <- glm(deaths ~ Age * Smoke + offset(l), family = poisson)
  summary(mod1.glm)
  age <- as.numeric(Age)
  mod2.glm <- aso1.glm <- glm(deaths ~ poly(age,2) + Smoke +
                              poly(age,1):Smoke + offset(l), family = poisson)
  summary(mod2.glm)

Cheers, Murray Jorgensen

Prof Brian Ripley wrote:
> R does not know that poly(age,2) and poly(age,1) are linearly dependent.
> (And indeed they only are for some functions 'poly'.) I cannot reproduce
> your example ('l' is missing), but perhaps
>
>   glm(deaths ~ poly(age,2) + poly(age,1)*Smoke + offset(l), poisson)
>
> was your intention?
>
> [...]

--
Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: [EMAIL PROTECTED]                              Fax 7 838 4155
Phone +64 7 838 4773 wk    Home +64 7 825 0441    Mobile 021 1395 862
[R] [R-pkgs] tsDyn and RTisean packages on CRAN
Dear R users, I've just uploaded 2 packages on CRAN, RTisean and tsDyn, both for time series analysis (joint research projects with members of the Statistics Department, University of Udine). Brief descriptions follow. RTisean is an R interface to TISEAN executables (http://www.mpipks-dresden.mpg.de/~tisean/). TISEAN is a suite of C and Fortran routines for nonlinear time series analysis, coded and documented by Rainer Hegger, Holger Kantz and Thomas Schreiber. In RTisean, almost all TISEAN routines are wrapped in a conventional R function (which silently calls TISEAN executables), with online help and examples (thanks to Gianluca Gazzola). tsDyn is a package for nonlinear time series modelling. At this point the package focuses on Nonlinear Autoregressive Models, often indicated as NLAR (as a major reference, see Tong (1990)). Currently available are threshold (SETAR and LSTAR), neural networks (NNET), and additive autoregressive (AAR) models. An experimental cross-platform tcltk GUI is included for model selection. Explorative and diagnostic tools are also available. A vignette is included for a clearer presentation of the package contents. Comments and suggestions appreciated Antonio, Fabio Di Narzo Dipartimento di Statistica Università degli studi di Bologna. ___ R-packages mailing list [EMAIL PROTECTED] https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] from character to numeric over multiple columns
Are the columns factors or character? I'll try to write code that copes with both:

nm <- unique(c(as.character(col1), as.character(col2), as.character(col3)))
DF[] <- lapply(DF, function(x) match(x, nm))

On Fri, 21 Jul 2006, Federico Calboli wrote: Hi All, I have a data frame of three columns, all of which are names. The names in the three cols overlap up to a point, so I might have *Harry* in all three columns, *Tom* in cols 1 and 3 and *Bob* in col 3 only.

harry paul  bob
anita harry tom
frank jack  harry
tom   pete  ben

I want to turn the names into numbers, BUT I want the numeric code for, say, *Harry*, to be the same in all columns. Doing

cbind(as.numeric(col1), as.numeric(col2), as.numeric(col3))

does not work because the factors are different in each column, hence they get a different number even though the name is the same. Ideas? Cheers Federico

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] from character to numeric over multiple columns
Prof Brian Ripley wrote: Are the columns factors or character? I'll try to write code that copes with both:

nm <- unique(c(as.character(col1), as.character(col2), as.character(col3)))
DF[] <- lapply(DF, function(x) match(x, nm))

Cheers, it worked. Federico

-- Federico C. F. Calboli Department of Epidemiology and Public Health Imperial College, St Mary's Campus Norfolk Place, London W2 1PG Tel +44 (0)20 7594 1602 Fax (+44) 020 7594 3193 f.calboli [.a.t] imperial.ac.uk f.calboli [.a.t] gmail.com __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
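[Editor's note] For readers who want to try the technique from this thread, here is a self-contained sketch. The data frame is reconstructed from the example names in the question; match() against a shared name table gives every name the same code in every column:

```r
# Hypothetical reconstruction of the poster's data frame.
DF <- data.frame(col1 = c("harry", "anita", "frank", "tom"),
                 col2 = c("paul",  "harry", "jack",  "pete"),
                 col3 = c("bob",   "tom",   "harry", "ben"))

# One shared lookup table of all names, then code each column against it.
nm   <- unique(unlist(lapply(DF, as.character)))
DF[] <- lapply(DF, function(x) match(as.character(x), nm))

DF   # "harry" now maps to the same integer in all three columns
```

The as.character() coercion makes the same code work whether the columns were read in as factors or as character vectors.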
[R] glm cannot find valid starting values
glm(S ~ -1 + Mdif, family = quasipoisson(link = identity), start = strt, sdat)

gives error:

Error in glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, : cannot find valid starting values: please specify some

strt is set to be the coefficient for a similar fit glm(S ~ -1 + I(Mdif + 1), ... i.e. (Mdif + 1) is a vector similar to Mdif. The error appears to occur when some values of Mdif are negative, though I have not had this problem with simulated datasets. Any solutions greatly appreciated (full details and data below).

Dan Bebber Department of Plant Sciences University of Oxford
OS: WinXP Pro SP2 and Win ME (tried both) Processor: Dual Intel Xeon and Pentium 4 (tried both) R version: 2.3.0 and 2.3.1 (tried both)

#Full details (can be pasted directly into R):
#Data:
sdat <- data.frame(
S = c(0, 0, 0, 0, 28, 0, 1, 7, 0, 0, 39, 2, 0, 0, 40, 0, 0, 0, 0, 0, 0, 15, 0, 0, 3, 0, 0, 0, 2, 0, 3, 0, 30, 0, 20, 0, 1, 0, 0, 1, 21, 0, 0, 4, 14, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 3, 0, 2, 5, 0, 0, 0, 0, 0, 0, 0, 25, 0, 5, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 10, 0, 0, 0, 0, 0, 9, 1, 1, 1, 1, 0, 0, 3, 0, 27, 7, 0, 0, 0, 0, 0, 1, 1, 4, 2, 1, 2, 4, 1, 4, 6, 12, 4, 6, 3, 4, 0, 4, 0, 6, 1, 3, 1, 4, 4, 1, 1, 2, 1, 6, 1, 0, 0, 1, 0, 1, 0, 0, 6, 0, 0, 0, 0, 2, 2, 3, 1, 6, 2, 2, 1, 1, 4, 4, 3, 3, 7, 1, 3, 5, 6, 4, 0, 1, 4, 2, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 4, 0, 0, 1, 0, 2, 0, 0, 1, 2, 0, 4, 0, 2, 0, 3, 0, 2, 2, 0, 4, 0, 2, 0, 1, 2, 3, 0, 0, 0, 0, 3, 1, 0, 0, 0, 3, 2, 1, 2, 1, 1, 2, 0, 0, 3, 3, 1, 0, 1, 1, 2, 0, 1, 0, 1, 0, 1, 3, 2, 0, 0, 2, 1, 0, 1, 0, 0, 0, 0, 0, 0, 4, 2, 4, 0, 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 3, 2, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 4, 1, 0, 0, 1),
M = 620+c(0,cumsum(sdat$S[-328])))

#S is the (unknown) number of N individuals that irreversibly change state in a time interval t. The data provided are a subset of the full data set.
#M is the cumulative sum of individuals that have changed state up to t-1.
#Assume that the rate of state change is constant (S ~ kM), but the distribution of S is clustered.
#The goal is to estimate N.
#N can be estimated by fitting:
qpglm <- glm(S ~ M, family = quasipoisson(link = identity), sdat)
summary(qpglm)
N.est <- -coef(qpglm)[1]/coef(qpglm)[2]
N.est
#i.e. x-intercept is minus intercept/slope
#To estimate confidence limits on N.est, fit models without intercept to N.est - M + x, where x is an integer. The model will have the lowest deviance when x = 0.
x <- 0
Mdif <- N.est - M + x
qpglm2 <- glm(S ~ -1 + Mdif, family = quasipoisson(link = identity), sdat)
summary(qpglm2)
#Use analysis of deviance to estimate confidence limits on N.est:
disp <- summary(qpglm)$dispersion
dfres <- df.residual(qpglm)
dev.res <- deviance(qpglm)
#From MASS4, p. 210, assume that changes in deviance scaled by dispersion as |x| increases have an F distribution
dev.crit <- dev.res + qf(0.95, 1, dfres)*disp
dev.crit
#values of x for which the deviance = dev.crit give approximate 95% confidence limits on N.est.
#The error occurs when x = -91.7:
x <- -91.7
sdat$Mdif <- N.est - sdat$M + x
strt <- coef(glm(S ~ -1 + I(Mdif+1), family = quasipoisson(link = identity), sdat))
qpglm2 <- glm(S ~ -1 + Mdif, family = quasipoisson(link = identity), start = strt, sdat)
#The problem is that this interferes with optimization to find values of x for which deviance = dev.crit

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] unexpected results
Hi, I'm just learning R and am encountering some unexpected results following a guide on the internet. If anyone can help that would be great - I think it is something about the way the data has been read in! I've read a comma-delimited text data file that was saved from Excel:

jacs.data <- read.table("/Users/natha/Desktop/JACSdata2.txt", header=TRUE, sep=",")

This seemed to work fine, but then I start to get unusual results that don't seem right: The guide says that names(file.data) will give output like "ID" "JURIS" "RESCODE", but I get "ID.JURIS.RESCODE". The guide says that file.data[5,1] will give me the data for the 5th subject but I get:

[1] 7\tACT\t\t\tSUMMONS\tACTCC321.001\tA.C.T. - MINOR THEFT (REPLACEMENT VALUE $2000 OR LESS)\ etc - which seems scrambled

The guide says that file.data[var5>0] will give me the data for all subjects who meet that condition (i.e. greater than 0 on var5), but I get:

Error in "[.data.frame"(jacs.data, offend > 0) : object "offend" not found

can anyone help? Thanks nathan

-- View this message in context: http://www.nabble.com/unexpected-results-tf1979786.html#a5432210 Sent from the R help forum at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] rpart unbalanced data
Hello all, I am currently working with rpart to classify vegetation types by spectral characteristics, and am coming up with poor classifications because some vegetation types have only 15 observations while others have over 100. I have attempted to supply prior weights for the dataset, though this does not improve the classification greatly. Could anyone supply some hints about how to improve a classification for a badly unbalanced dataset? Thank you, Helen Mills Poulos __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] glm cannot find valid starting values
On Fri, 21 Jul 2006, Dan Bebber wrote: glm(S ~ -1 + Mdif, family=quasipoisson(link=identity), start=strt, sdat) gives error: Error in glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, : cannot find valid starting values: please specify some strt is set to be the coefficient for a similar fit glm(S ~ -1 + I(Mdif + 1),... i.e. (Mdif + 1) is a vector similar to Mdif. The error appears to occur when some values of Mdif are negative, though I have not had this problem with simulated datasets. Right: if Mdif contains both positive and negative values there are no coefficients that are valid for that model (you are bound to predict negative means). You often do better to take etastart from another fit than start, but that will not help here, I believe. BTW, your example cannot be pasted in as 'sdat' self-references. It could be fixed, but I gave up at that point. Any solutions greatly appreciated (full details and data below). Dan Bebber Department of Plant Sciences University of Oxford OS: WinXP Pro SP2 and Win ME (tried both) Processor: Dual Intel Xeon and Pentium 4 (tried both) R version: 2.3.0 and 2.3.1 (tried both) #Full details (can be pasted directly into R): #Data: sdat - data.frame( S = c(0, 0, 0, 0, 28, 0, 1, 7, 0, 0, 39, 2, 0, 0, 40, 0, 0, 0, 0, 0, 0, 15, 0, 0, 3, 0, 0, 0, 2, 0, 3, 0, 30, 0, 20, 0, 1, 0, 0, 1, 21, 0, 0, 4, 14, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 3, 0, 2, 5, 0, 0, 0, 0, 0, 0, 0, 25, 0, 5, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 10, 0, 0, 0, 0, 0, 9, 1, 1, 1, 1, 0, 0, 3, 0, 27, 7, 0, 0, 0, 0, 0, 1, 1, 4, 2, 1, 2, 4, 1, 4, 6, 12, 4, 6, 3, 4, 0, 4, 0, 6, 1, 3, 1, 4, 4, 1, 1, 2, 1, 6, 1, 0, 0, 1, 0, 1, 0, 0, 6, 0, 0, 0, 0, 2, 2, 3, 1, 6, 2, 2, 1, 1, 4, 4, 3, 3, 7, 1, 3, 5, 6, 4, 0, 1, 4, 2, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 4, 0, 0, 1, 0, 2, 0, 0, 1, 2, 0, 4, 0, 2, 0, 3, 0, 2, 2, 0, 4, 0, 2, 0, 1, 2, 3, 0, 0, 0, 0, 3, 1, 0, 0, 0, 3, 2, 1, 2, 1, 1, 2, 0, 0, 
3, 3, 1, 0, 1, 1, 2, 0, 1, 0, 1, 0, 1, 3, 2, 0, 0, 2, 1, 0, 1, 0, 0, 0, 0, 0, 0, 4, 2, 4, 0, 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 3, 2, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 4, 1, 0, 0, 1), M = 620+c(0,cumsum(sdat$S[-328]))) #S is the (unknown) number of N individuals that irreversibly change state in a time #interval t. The data provided are a subset of the full data set. #M is the cumulative sum of individuals that have changed state up to t-1. #Assume that the rate of state change is constant (S ~ kM), but the #distribution of S is clustered. #The goal is to estimate N. #N can be estimated by fitting: qpglm - glm(S ~ M, family = quasipoisson(link = identity), sdat) summary(qpglm) N.est - -coef(qpglm)[1]/coef(qpglm)[2] N.est #i.e. x-intercept is minus intercept/slope #To estimate confidence limits on N.est, fit models without intercept to #N.est - M + x, where x is an integer. The model will have the lowest deviance #when x = 0. x - 0 Mdif - N.est - M + x qpglm2 - glm(S ~ -1 + Mdif, family = quasipoisson(link = identity), sdat) summary(qpglm2) #Use analysis of deviance to estimate confidence limits on N.est: disp - summary(qpglm)$dispersion dfres - df.residual(qpglm) dev.res - deviance(qpglm) #From MASS4, p. 210, assume that changes in deviance scaled by #dispersion as |x| increases have an F distribution dev.crit - dev.res+qf(0.95,1,dfres)*disp dev.crit #values of x for which the deviance = dev.crit give approximate 95% confidence limits #on N.est. 
#The error occurs when x = -91.7: x - -91.7 sdat$Mdif - N.est - sdat$M + x strt - coef(glm(S ~ -1 + I(Mdif+1), family = quasipoisson(link = identity), sdat)) qpglm2 - glm(S ~ -1 + Mdif, family = quasipoisson(link = identity), start=strt, sdat) #The problem is that this interferes with optimization to find values of x for which #deviance = dev.crit __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rpart unbalanced data
Dear Helen, You may want to have a look at http://www.togaware.com/datamining/survivor/Predicting_Fraud.html Greets, Diego Kuonen [EMAIL PROTECTED] wrote: Hello all, I am currently working with rpart to classify vegetation types by spectral characteristics, and am comming up with poor classifications based on the fact that I have some vegetation types that have only 15 observations, while others have over 100. I have attempted to supply prior weights to the dataset, though this does not improve the classification greatly. Could anyone supply some hints about how to improve a classification for a badly unbalanced datase? Thank you, Helen Mills Poulos -- Dr. ès sc. Diego Kuonen, CEOphone +41 (0)21 693 5508 Statoo Consulting fax+41 (0)21 693 8765 PO Box 107 mobile +41 (0)78 709 5384 CH-1015 Lausanne 15 email [EMAIL PROTECTED] web http://www.statoo.info skype Kuonen.Statoo.Consulting - | Statistical Consulting + Data Analysis + Data Mining Services | - + Are you drowning in information and starving for knowledge? + + Have you ever been Statooed? http://www.statoo.biz + __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] unexpected results
On 7/21/2006 7:36 AM, nathan wrote: Hi, I'm just learning R and am encountering some unexpected results following a guide on the internet. If anyone can help that would be great - I think it is something about the way the data has been read in! I've read a comma-delimited text data file that was saved from Excel:

jacs.data <- read.table("/Users/natha/Desktop/JACSdata2.txt", header=TRUE, sep=",")

This seemed to work fine, but then I start to get unusual results that don't seem right: The guide says that names(file.data) will give output like "ID" "JURIS" "RESCODE", but I get "ID.JURIS.RESCODE". The guide says that file.data[5,1] will give me the data for the 5th subject but I get: [1] 7\tACT\t\t\tSUMMONS\tACTCC321.001\tA.C.T. - MINOR THEFT (REPLACEMENT VALUE $2000 OR LESS)\ etc - which seems scrambled

The \t values are tabs. I think your file was tab delimited, not comma delimited. R thinks it has only one column, because it didn't find any commas.

The guide says that file.data[var5>0] will give me the data for all subjects who meet that condition (i.e. greater than 0 on var5), but I get: Error in "[.data.frame"(jacs.data, offend > 0) : object "offend" not found

It looks like you typed jacs.data[offend > 0]. There are two problems: 1. You want to select rows matching the condition, so you need another comma, i.e. jacs.data[offend > 0, ] (the empty entry after the comma means all columns). 2. You need to have a variable named offend outside the data frame. The error message makes it look as though you don't. If offend is a column in the data frame, then you would use jacs.data[jacs.data$offend > 0, ] or subset(jacs.data, offend > 0). Duncan Murdoch

can anyone help? Thanks nathan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] unexpected results
On Fri, 2006-07-21 at 04:36 -0700, nathan wrote: Hi, I'm just learning R and am encountering some unexpected results following a guide on the internet. If anyone can help that would be great - I think it is something about the way the data has been read in! I've read a comma-delimited text data file that was saved from Excel:

jacs.data <- read.table("/Users/natha/Desktop/JACSdata2.txt", header=TRUE, sep=",")

This seemed to work fine, but then I start to get unusual results that don't seem right: The guide says that names(file.data) will give output like "ID" "JURIS" "RESCODE", but I get "ID.JURIS.RESCODE". The guide says that file.data[5,1] will give me the data for the 5th subject but I get: [1] 7\tACT\t\t\tSUMMONS\tACTCC321.001\tA.C.T. - MINOR THEFT (REPLACEMENT VALUE $2000 OR LESS)\ etc - which seems scrambled. The guide says that file.data[var5>0] will give me the data for all subjects who meet that condition (i.e. greater than 0 on var5), but I get: Error in "[.data.frame"(jacs.data, offend > 0) : object "offend" not found. can anyone help? Thanks nathan

It would appear that your data file is NOT comma delimited, but TAB delimited. The \t characters in the output above support this, since you asked for just the first column for the 5th subject and it appears that you got the entire row. Re-run the import using:

jacs.data <- read.table("/Users/natha/Desktop/JACSdata2.txt", header=TRUE, sep = "\t")

so that you are indicating that the delimiter is a TAB character, not a comma. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] unexpected results
Hi On 21 Jul 2006 at 4:36, nathan wrote: Date sent: Fri, 21 Jul 2006 04:36:53 -0700 (PDT) From: nathan [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Subject: [R] unexpected results

Hi, I'm just learning R and am encountering some unexpected results following a guide on the internet. If anyone can help that would be great - I think it is something about the way the data has been read in! I've read a comma-delimited text data file that was saved from Excel:

jacs.data <- read.table("/Users/natha/Desktop/JACSdata2.txt", header=TRUE, sep=",")

This seemed to work fine, but then I start to get unusual results that

I do not think so. Probably the separation character of your file is not "," as you set in sep=",".

don't seem right: The guide says that names(file.data) will give output like "ID" "JURIS" "RESCODE", but I get "ID.JURIS.RESCODE".

e.g. ID.JURIS.RESCODE was read into one column. The best way to copy data from Excel (if you have Excel): select your data in Excel including the first row with headers, Ctrl-C, go to R and write

mydata <- read.delim("clipboard")

or look at JACSdata2.txt to see what the separator between fields is and set it in your read.table command accordingly. From the output below I suppose your txt file is tab delimited, so read.delim() shall capture it. HTH Petr

The guide says that file.data[5,1] will give me the data for the 5th subject but I get: [1] 7\tACT\t\t\tSUMMONS\tACTCC321.001\tA.C.T. - MINOR THEFT (REPLACEMENT VALUE $2000 OR LESS)\ etc - which seems scrambled. The guide says that file.data[var5>0] will give me the data for all subjects who meet that condition (i.e. greater than 0 on var5), but I get: Error in "[.data.frame"(jacs.data, offend > 0) : object "offend" not found. can anyone help? Thanks nathan -- View this message in context: http://www.nabble.com/unexpected-results-tf1979786.html#a5432210 Sent from the R help forum at Nabble.com.
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Petr Pikal [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
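[Editor's note] The three replies above all point the same way: check the actual delimiter before importing. A quick sketch of how one might verify this (my own suggestion, not from the thread; the file path is the poster's):

```r
# Peek at the first raw line of the file to see what separates the fields.
first <- readLines("/Users/natha/Desktop/JACSdata2.txt", n = 1)
grepl("\t", first)   # TRUE would confirm a tab-delimited file

# read.delim() is read.table() preconfigured with sep = "\t" and header = TRUE.
jacs.data <- read.delim("/Users/natha/Desktop/JACSdata2.txt")
```

After a correct import, names(jacs.data) should show the separate column names the guide promised rather than one fused "ID.JURIS.RESCODE" column.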
[R] Reading data with blank elements
Hi, I have a dataset saved in *.csv format, that contains 13 columns (the first column being the title name and the rest experiments) and about 2500 rows. Not all columns in a row have data in them, e.g.:

BS00,-0.084,0.0136,-0.1569,-0.6484,1.103,1.7859,0.40287,0.5368,0.08461,-0.1935,-0.147974,0.30685
BS01,0.491270283,0.875826172,,
BS02,0.090794476,0.225858954,,,0.32643,0.34317,0.133145295,,,0.115832599,0.47636458,
BS03,0.019828221,-0.095735935,-0.122767219,-0.0676,0.002533,-0.1510361,0.736247,2.053192,-0.423658,0.4591219,1.1245015,
BS04,-0.435189342,-0.041595955,-0.781281128,-1.923036,-3.2301671020.152322609,-1.495513519,,

I am using R to perform a correlation, but I am getting an error while trying to read the data as

person.data <- read.table("datafile.csv", header=TRUE, sep=',', row.names=1)

Error in scan (file = file, what = what, sep = sep, quote = quote, dec = dec, : line 1919 did not have 13 elements Execution halted

The error looks as though there is a problem with the last element being not read when it is blank. I could introduce terms like NA for the blank elements but I do not want to do that because this will hinder my future analysis. Can someone suggest a solution to overcome this problem while reading the data, or is there something that I have missed to make the data readable? Thank you in advance. PS: The data was imported from an experiment and saved in an Excel sheet as a *.csv and then used. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reading data with blank elements
Use

x <- count.fields("datafile.csv", sep=",")

on your file. This will tell you the number of columns that R thinks are on each line. Then do:

table(x)

to see if all the lines have the same number of columns. If not, then find the lines which don't:

which(x != legal.length)

and then look at your input data for those lines.

On 7/21/06, Ahamarshan jn [EMAIL PROTECTED] wrote: Hi, I have a dataset saved in *.csv format, that contains 13 columns (the first column being the title name and the rest experiments) and about 2500 rows. Not all columns in a row have data in them, e.g. BS00,-0.084,0.0136,-0.1569,-0.6484,1.103,1.7859,0.40287,0.5368,0.08461,-0.1935,-0.147974,0.30685 BS01,0.491270283,0.875826172,, BS02,0.090794476,0.225858954,,,0.32643,0.34317,0.133145295,,,0.115832599,0.47636458, BS03,0.019828221,-0.095735935,-0.122767219,-0.0676,0.002533,-0.1510361,0.736247,2.053192,-0.423658,0.4591219,1.1245015, BS04,-0.435189342,-0.041595955,-0.781281128,-1.923036,-3.2301671020.152322609,-1.495513519,, I am using R to perform a correlation, but I am getting an error while trying to read the data as person.data <- read.table("datafile.csv", header=TRUE, sep=',', row.names=1) Error in scan (file = file, what = what, sep = sep, quote = quote, dec = dec, : line 1919 did not have 13 elements Execution halted The error looks as though there is a problem with the last element being not read when it is blank. I could introduce terms like NA for the blank elements but I do not want to do that because this will hinder my future analysis. Can someone suggest a solution to overcome this problem while reading the data, or is there something that I have missed to make the data readable? Thank you in advance. PS: The data was imported from an experiment and saved in an Excel sheet as a *.csv and then used.
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reading data with blank elements
See if this works:

read.csv("datafile.csv", row.names = 1, fill = TRUE)

On 7/21/06, Ahamarshan jn [EMAIL PROTECTED] wrote: Hi, I have a dataset saved in *.csv format, that contains 13 columns (the first column being the title name and the rest experiments) and about 2500 rows. Not all columns in a row have data in them, e.g. BS00,-0.084,0.0136,-0.1569,-0.6484,1.103,1.7859,0.40287,0.5368,0.08461,-0.1935,-0.147974,0.30685 BS01,0.491270283,0.875826172,, BS02,0.090794476,0.225858954,,,0.32643,0.34317,0.133145295,,,0.115832599,0.47636458, BS03,0.019828221,-0.095735935,-0.122767219,-0.0676,0.002533,-0.1510361,0.736247,2.053192,-0.423658,0.4591219,1.1245015, BS04,-0.435189342,-0.041595955,-0.781281128,-1.923036,-3.2301671020.152322609,-1.495513519,, I am using R to perform a correlation, but I am getting an error while trying to read the data as person.data <- read.table("datafile.csv", header=TRUE, sep=',', row.names=1) Error in scan (file = file, what = what, sep = sep, quote = quote, dec = dec, : line 1919 did not have 13 elements Execution halted The error looks as though there is a problem with the last element being not read when it is blank. I could introduce terms like NA for the blank elements but I do not want to do that because this will hinder my future analysis. Can someone suggest a solution to overcome this problem while reading the data, or is there something that I have missed to make the data readable? Thank you in advance. PS: The data was imported from an experiment and saved in an Excel sheet as a *.csv and then used. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
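[Editor's note] Putting the two suggestions above together, a diagnostic session might look like the sketch below (my own composition; the file name is the poster's, and 13 is the expected field count from the question):

```r
# Count how many comma-separated fields R sees on each line.
x <- count.fields("datafile.csv", sep = ",")

table(x)         # all 13? or do some lines fall short?
which(x != 13)   # line numbers to inspect in the raw file (e.g. line 1919)

# If short lines are just trailing blanks dropped by the exporting program,
# fill = TRUE pads them with NA instead of stopping with an error.
person.data <- read.csv("datafile.csv", row.names = 1, fill = TRUE)
```

Note that read.csv() already fixes sep = "," and header = TRUE, so only row.names and fill need to be supplied.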
[R] positive semi-definite matrix
I have a covariance matrix that is not positive semi-definite, and I need it to be, via some sort of adjustment. Is there any R routine or package to help me do this? Thanks, Roger __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] glm cannot find valid starting values
Brian Ripley wrote: BTW, your example cannot be pasted in as 'sdat' self-references. It could be fixed, but I gave up at that point.

Oh dear, I'm very sorry. I forgot to run rm(list=ls(all=TRUE)) before testing. The corrected code is:

#Data:
S <- c(0, 0, 0, 0, 28, 0, 1, 7, 0, 0, 39, 2, 0, 0, 40, 0, 0, 0, 0, 0, 0, 15, 0, 0, 3, 0, 0, 0, 2, 0, 3, 0, 30, 0, 20, 0, 1, 0, 0, 1, 21, 0, 0, 4, 14, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 3, 0, 2, 5, 0, 0, 0, 0, 0, 0, 0, 25, 0, 5, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 10, 0, 0, 0, 0, 0, 9, 1, 1, 1, 1, 0, 0, 3, 0, 27, 7, 0, 0, 0, 0, 0, 1, 1, 4, 2, 1, 2, 4, 1, 4, 6, 12, 4, 6, 3, 4, 0, 4, 0, 6, 1, 3, 1, 4, 4, 1, 1, 2, 1, 6, 1, 0, 0, 1, 0, 1, 0, 0, 6, 0, 0, 0, 0, 2, 2, 3, 1, 6, 2, 2, 1, 1, 4, 4, 3, 3, 7, 1, 3, 5, 6, 4, 0, 1, 4, 2, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 4, 0, 0, 1, 0, 2, 0, 0, 1, 2, 0, 4, 0, 2, 0, 3, 0, 2, 2, 0, 4, 0, 2, 0, 1, 2, 3, 0, 0, 0, 0, 3, 1, 0, 0, 0, 3, 2, 1, 2, 1, 1, 2, 0, 0, 3, 3, 1, 0, 1, 1, 2, 0, 1, 0, 1, 0, 1, 3, 2, 0, 0, 2, 1, 0, 1, 0, 0, 0, 0, 0, 0, 4, 2, 4, 0, 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 3, 2, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 4, 1, 0, 0, 1)
M <- 620 + c(0, cumsum(S[-328]))
sdat <- data.frame(S, M)

#S is the number of N individuals that irreversibly change state in a time interval t. The data provided are a subset of the full data set.
#M is the cumulative sum of individuals that have changed state up to t-1.
#Assume that the rate of state change is constant (S ~ kM), but the distribution of S is clustered.
#N can be estimated by fitting:
qpglm <- glm(S ~ M, family = quasipoisson(link = identity), sdat)
summary(qpglm)
N.est <- -coef(qpglm)[1]/coef(qpglm)[2]
N.est
#i.e. x-intercept is minus intercept/slope
#To estimate confidence limits on N.est, fit models without intercept to N.est - M + x, where x is an integer. The model will have the lowest deviance when x = 0.
x <- 0
Mdif <- N.est - M + x
qpglm2 <- glm(S ~ -1 + Mdif, family = quasipoisson(link = identity), sdat)
summary(qpglm2)
#Use analysis of deviance to estimate confidence limits on N.est:
disp <- summary(qpglm)$dispersion
dfres <- df.residual(qpglm)
dev.res <- deviance(qpglm)
#From MASS4, p. 210, assume that changes in deviance scaled by dispersion as |x| increases have an F distribution
dev.crit <- dev.res + qf(0.95, 1, dfres)*disp
dev.crit
#values of x for which the deviance = dev.crit give approximate 95% confidence limits on N.est.
#The error occurs when x = -91.7:
x <- -91.7
sdat$Mdif <- N.est - sdat$M + x
strt <- coef(glm(S ~ -1 + I(Mdif+1), family = quasipoisson(link = identity), sdat))
qpglm2 <- glm(S ~ -1 + Mdif, family = quasipoisson(link = identity), start = strt, sdat)

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reading data with blank elements
Hi! On Fri, Jul 21, 2006 at 05:43:03AM -0700, Ahamarshan jn wrote: I have a dataset saved in *.csv format, that contains [...] BS00,-0.084,0.0136,-0.1569,-0.6484,1.103,1.7859,0.40287,0.5368,0.08461,-0.1935,-0.147974,0.30685 BS01,0.491270283,0.875826172,, BS02,0.090794476,0.225858954,,,0.32643,0.34317,0.133145295,,,0.115832599,0.47636458, BS03,0.019828221,-0.095735935,-0.122767219,-0.0676,0.002533,-0.1510361,0.736247,2.053192,-0.423658,0.4591219,1.1245015, BS04,-0.435189342,-0.041595955,-0.781281128,-1.923036,-3.2301671020.152322609,-1.495513519,, person.data-read.table(datafile.csv,header=TRUE,sep=',',row.names=1) Error in scan (file = file, what = what, sep = sep, quote = quote, dec = dec, : line 1919 did not have 13 elements Execution halted R does handle empty elements fine. The error message you quote occurs if a row does not contain the expected number of elements (empty or not.) Did you have a look at row 1919? Does it really contain the same number of separators (commas) as the other ones? Some programs handle empty elements at the end of a row in a 'lazy' way and simply ommit them. If this is the case you can use the option 'fill=TRUE' to tell read.table that you want it to silently pad short rows with empty elements. Another 'popular' reason for funny errors with read.table is the unexpected occurence of quotation or comment characters in the data... cu Philipp -- Dr. Philipp PagelTel. +49-8161-71 2131 Dept. of Genome Oriented Bioinformatics Fax. +49-8161-71 2186 Technical University of Munich Science Center Weihenstephan 85350 Freising, Germany and Institute for Bioinformatics / MIPS Tel. +49-89-3187 3675 GSF - National Research Center Fax. 
+49-89-3187 3585 for Environment and Health Ingolstädter Landstrasse 1 85764 Neuherberg, Germany http://mips.gsf.de/staff/pagel
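As a minimal, self-contained sketch of the fill=TRUE behaviour described above (the data here are invented; a real call would pass the file name instead of a text connection):

```r
# The second row is one field short, the way some spreadsheet exports
# write rows whose trailing cells are empty.
txt <- "id,x,y,z
BS00,1.2,3.4,5.6
BS01,0.5,0.7"
dat <- read.table(textConnection(txt), header = TRUE, sep = ",",
                  row.names = 1, fill = TRUE)
dim(dat)                  # 2 rows, 3 data columns
is.na(dat["BS01", "z"])   # the padded element is NA
```

Without fill=TRUE the same input stops with the familiar "line 2 did not have 4 elements" error.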
Re: [R] positive semi-definite matrix
On 7/21/2006 8:59 AM, roger bos wrote: I have a covariance matrix that is not positive semi-definite and I need it to be via some sort of adjustment. Is there any R routine or package to help me do this? I think you need to be more specific about what you have and what you want, but if the matrix is symmetric and nearly positive semi-definite (but not exactly because of rounding error), you might try something like

fixit <- function(A) {
  eig <- eigen(A, symmetric = TRUE)
  eig$values <- pmax(0, eig$values)
  return(eig$vectors %*% diag(eig$values) %*% t(eig$vectors))
}

Rounding error means this is not guaranteed to be positive semi-definite, but it will be very close. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help
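For illustration, a small self-contained check of the eigenvalue-clipping idea above; the 2x2 matrix is invented (its eigenvalues are 2.001 and -0.001, so it is slightly indefinite):

```r
fixit <- function(A) {
  eig <- eigen(A, symmetric = TRUE)
  eig$values <- pmax(0, eig$values)        # clip negative eigenvalues to zero
  eig$vectors %*% diag(eig$values) %*% t(eig$vectors)
}
A <- matrix(c(1, 1.001, 1.001, 1), 2, 2)   # "correlation" matrix gone slightly indefinite
min(eigen(A, symmetric = TRUE)$values)     # -0.001
B <- fixit(A)
min(eigen(B, symmetric = TRUE)$values)     # ~0, up to rounding error
```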
[R] (no subject)
Hi, I am having problems when trying to apply factor analysis to a covariance matrix: factanal(covmat=strip1cmcd, factors=5, control=list(lower=0.0264)) Error in optim(start, FAfn, FAgr, method = "L-BFGS-B", lower = lower, : non-finite value supplied by optim In addition: Warning message: NaNs produced in: sqrt(diag(cv)) I have searched for possible solutions in messages posted previously. I think that: "Zeroing the workspace is not a requirement of the original L-BFGS-B code that I can see. Given that it was originally in Fortran, and Fortran often does zero, it seems a likely symptom, but it does mean that a variable is being used uninitialized somewhere in the code (converted to C). It would be better to leave vect alone and to zero the workspace with a memset call in lbfgsb. (Incidentally, I don't know why S_alloc does not use memset -- we do require standard C and use it in several other places.) -- Brian D. Ripley" could be relevant to my case. The question is: how can I zero the workspace with a memset call in lbfgsb? I have read the R help for Windows and tried to zero the workspace from RGui.exe mim.mem.size but it is not working. Any suggestions will be very welcome indeed. Regards Dr. Gabriela Munoz-Melendez Environmental Processes and Systems Research Group Department of Earth Science and Engineering Imperial College London SW7 2AZ Tel: +44 (0)207 594 7369 Fax: +44 (0)207 594 7354 E-mail: [EMAIL PROTECTED]
Re: [R] positive semi-definite matrix
Duncan == Duncan Murdoch [EMAIL PROTECTED] on Fri, 21 Jul 2006 09:44:42 -0400 writes: Duncan On 7/21/2006 8:59 AM, roger bos wrote: I have a covariance matrix that is not positive semi-definite and I need it to be via some sort of adjustment. Is there any R routine or package to help me do this? Duncan I think you need to be more specific about what you have and what you want, Duncan but if the matrix is symmetric and nearly positive semi-definite (but Duncan not exactly because of rounding error), you might try something like

Duncan fixit <- function(A) {
Duncan   eig <- eigen(A, symmetric = TRUE)
Duncan   eig$values <- pmax(0, eig$values)
Duncan   return(eig$vectors %*% diag(eig$values) %*% t(eig$vectors))
Duncan }

Duncan Rounding error means this is not guaranteed to be positive Duncan semi-definite, but it will be very close. A slightly more general and stable version of the above is available via sfsmisc::posdefify(.) :

install.packages("sfsmisc")
?posdefify

Martin Maechler, ETH Zurich
[R] : Arial font for text in lattice plots under Linux
Hi, I have been asked by a publisher to change the font style of a lattice plot in my manuscript. I have consulted documentation on trellis graphics and the excellent book "R Graphics", but am still not sure how I could create plots with Arial as the font style for text in the plot. I am running R (Version 2.3.1 (2006-06-01)) under Debian Linux. I have msttcorefonts installed. Will it be possible to do this with jpeg() as a device? Or with postscript()? Peter
Re: [R] Weibull distribution
Dear Leaf, I modified your code as follows:

gamma.fun <- function(mu, sd, start=100) {
  f.fn <- function(alpha) abs(sd^2 - mu^2/(gamma(1+1/alpha))^2 * (gamma(1+2/alpha) - (gamma(1+1/alpha))^2))
  alpha <- optim(start, f.fn)
  beta <- mu/gamma(1+1/alpha$par)
  return(list=c(a=alpha$par, b=beta))
}

Now it works properly. First, I added an abs(). You tried to solve an equation by means of the R function optim(), which finds a minimum. That is why you can find the solution of f(x)=a through minimization of abs(f(x)-a). Second, I deleted the optim method "BFGS" from the optim() call, because it is not appropriate in this case. BFGS seeks to make the gradient (here the first derivative) zero, and in your case f(x) converges to a constant for big x, which means f'(x) is approximately 0 for big x, which is why BFGS stops almost immediately after the start value. The default method of optim() (Nelder-Mead) is more appropriate, since it does not need the first derivative and works only with function values. Best regards, Valentin --- Leaf Sun [EMAIL PROTECTED] wrote: Hi all, By definition, the mean and variance of the two-parameter Weibull distribution are functions of the shape and scale parameters (see www.wikipedia.org). I was wondering, given the mean and sd, could we parameterize the distribution? I tried this in R:

gamma.fun <- function(mu, sd, start=100) {
  f.fn <- function(alpha) sd^2 - mu^2/(gamma(1+1/alpha))^2 * (gamma(1+2/alpha) - (gamma(1+1/alpha))^2)
  alpha <- optim(start, f.fn, method='BFGS')
  beta <- mu/gamma(1+1/alpha$par)
  return(list=c(a=alpha$par, b=beta))
}

But the problems come up here: 1) the returned values of a and b are only related to the input mean, and have nothing to do with the sd. For instance, when I apply a mean mu = 3, whether I use sd=2 or sd=4, the function returns the same scale and shape values.
gamma.fun(3,4,10)
       a        b
5.112554 3.263178
gamma.fun(3,2,10)
       a        b
5.112554 3.263178

2) the start value determines the results: if I apply mean = 3 and sd = 2 with a start of 10, it returns alpha close to 10; if I use start = 100, it returns alpha close to 100.

gamma.fun(3,2,10)
       a        b
5.112554 3.263178
gamma.fun(3,2,100)
       a        b
99.71 3.017120

Since I am not a statistician, I guess there must be some theoretical reason for what is wrong here. So I am looking forward to some correction and advice. Thanks a lot in advance! Leaf
Re: [R] Order-restricted inference
On Thu, 20 Jul 2006, Bruno L. Giordano wrote: If R packages are not found for my purpose, can somebody please point me to some recent monographs on order-restricted inference? One more recent book is: Robertson, T., Wright, F., Dykstra, R. (1988). Order restricted statistical inference. New York: John Wiley and Sons. -thomas Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
Re: [R] positive semi-definite matrix
Take a look at make.positive.definite in the corpcor package. The implementation is very similar to what Duncan suggested. Regards, Francisco Dr. Francisco J. Zagmutt College of Veterinary Medicine and Biomedical Sciences Colorado State University From: Duncan Murdoch [EMAIL PROTECTED] To: roger bos [EMAIL PROTECTED] CC: RHelp r-help@stat.math.ethz.ch Subject: Re: [R] positive semi-definite matrix Date: Fri, 21 Jul 2006 09:44:42 -0400 On 7/21/2006 8:59 AM, roger bos wrote: I have a covariance matrix that is not positive semi-definite and I need it to be via some sort of adjustment. Is there any R routine or package to help me do this? I think you need to be more specific about what you have and what you want, but if the matrix is symmetric and nearly positive semi-definite (but not exactly because of rounding error), you might try something like

fixit <- function(A) {
  eig <- eigen(A, symmetric = TRUE)
  eig$values <- pmax(0, eig$values)
  return(eig$vectors %*% diag(eig$values) %*% t(eig$vectors))
}

Rounding error means this is not guaranteed to be positive semi-definite, but it will be very close. Duncan Murdoch
Re: [R] positive semi-definite matrix
There is a paper by N.J. Higham (SIAM J Matrix Anal, 1998) on a modified Cholesky decomposition of a symmetric and not necessarily positive definite matrix (say, A), with an important goal of producing a small-normed perturbation of A (say, delA) that makes (A + delA) positive definite. http://epubs.siam.org/sam-bin/dbq/article/30289 There is also an algorithm in Gill, Murray and Wright's text - Practical Optimization (section 4.4.2). These may be relevant to your problem. I am not sure if these algorithms have been implemented in R, for example, in the matrix library. Ravi. -- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -- -Original Message- From: [EMAIL PROTECTED] [mailto:r-help- [EMAIL PROTECTED] On Behalf Of Duncan Murdoch Sent: Friday, July 21, 2006 9:45 AM To: roger bos Cc: RHelp Subject: Re: [R] positive semi-definite matrix On 7/21/2006 8:59 AM, roger bos wrote: I have a covariance matrix that is not positive semi-definite and I need it to be via some sort of adjustment. Is there any R routine or package to help me do this? I think you need to be more specific about what you have and what you want, but if the matrix is symmetric and nearly positive semi-definite (but not exactly because of rounding error), you might try something like

fixit <- function(A) {
  eig <- eigen(A, symmetric = TRUE)
  eig$values <- pmax(0, eig$values)
  return(eig$vectors %*% diag(eig$values) %*% t(eig$vectors))
}

Rounding error means this is not guaranteed to be positive semi-definite, but it will be very close.
Duncan Murdoch
Re: [R] Weibull distribution
On Fri, 21 Jul 2006, Valentin Dimitrov wrote: Dear Leaf, I modified your code as follows:

gamma.fun <- function(mu, sd, start=100) {
  f.fn <- function(alpha) abs(sd^2 - mu^2/(gamma(1+1/alpha))^2 * (gamma(1+2/alpha) - (gamma(1+1/alpha))^2))
  alpha <- optim(start, f.fn)
  beta <- mu/gamma(1+1/alpha$par)
  return(list=c(a=alpha$par, b=beta))
}

Now it works properly. First, I added an abs(). You tried to solve an equation by means of the R function optim(), which finds a minimum. That is why you can find the solution of f(x)=a through minimization of abs(f(x)-a). Second, I deleted the optim method "BFGS" from the optim() call, because it is not appropriate in this case. optim() is not appropriate at all in this case -- its help page says to use optimize() for one-dimensional problems. In fact, in one dimension there isn't any need to resort to optimization when you really want root-finding, and uniroot() is more appropriate than optimize(). -thomas
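A sketch of the uniroot() approach suggested above, under the standard Weibull moment relations mu = beta*gamma(1+1/alpha) and sd^2 = beta^2*(gamma(1+2/alpha) - gamma(1+1/alpha)^2); the function name and the bracketing interval are my own choices, not from the thread:

```r
# Solve for the Weibull shape alpha whose coefficient of variation
# matches sd/mu, then recover the scale beta from the mean.
weibull.par <- function(mu, sd) {
  cv2 <- (sd / mu)^2
  # f is monotone decreasing in alpha and changes sign on (0.1, 100)
  f <- function(alpha) gamma(1 + 2/alpha) / gamma(1 + 1/alpha)^2 - 1 - cv2
  alpha <- uniroot(f, c(0.1, 100), tol = 1e-10)$root
  beta <- mu / gamma(1 + 1/alpha)
  c(shape = alpha, scale = beta)
}
p <- weibull.par(3, 2)
# Round trip: the implied mean reproduces the input exactly by construction.
p["scale"] * gamma(1 + 1/p["shape"])   # 3
```

Unlike the optim(start, ...) version, the result does not depend on a start value, and changing sd changes the answer.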
[R] Problems with character spacing in windows metafiles...
This problem was posted a couple of years ago here: http://tolstoy.newcastle.edu.au/R/help/04/01/0231.html Using Windows XP with R 2.3.1, I am now experiencing the same problem again: when a plot is saved and/or copied as a WMF, the labels do not have the correct character spacing. I am trying to insert the WMF into an MS Word 2003 document (my first mistake, I know), but even when the WMF is opened with other graphics software, the problem remains. It must be something with the way the WMF is written. The strange thing is that I was able to do this on another computer with a slightly older version of R (2.2.x?), and did not have the problem. The version of Microsoft Word was the same on both, and the code did not matter (any basic plot with any font had its text labels squeezed). I've done some searching and haven't come up with any good solutions. For Windows users, the WMF is the most convenient format for saving vector graphics. EPS is not an option for me. Any help would be greatly appreciated. Cheers, Dan
Re: [R] : Arial font for text in lattice plots under Linux
This is a question about devices, not lattice per se, so it applies to all forms of plotting. For jpeg, you can specify the fonts via the 'fonts' argument: see ?X11. For postscript, you can convert the fonts to Type 1 via e.g. ttf2pt1, or you can use ttf2afm to make .afm files and a postscript interpreter that can handle TrueType fonts. On Fri, 21 Jul 2006, Peter Ho wrote: Hi, I have been asked by a publisher to change the font style of a lattice plot in my manuscript. I have consulted documentation on trellis graphics and the excellent book "R Graphics", but am still not sure how I could create plots with Arial as the font style for text in the plot. I am running R (Version 2.3.1 (2006-06-01)) under Debian Linux. I have msttcorefonts installed. Will it be possible to do this with jpeg() as a device? Or with postscript()? Peter -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] Reading data with blank elements
On Fri, 2006-07-21 at 05:43 -0700, Ahamarshan jn wrote: Hi, I have a dataset saved in *.csv format, that contains 13 columns (the first column being the title name and the rest experiments) and about 2500 rows. Not all columns in the row have data in them, i.e. for example: BS00,-0.084,0.0136,-0.1569,-0.6484,1.103,1.7859,0.40287,0.5368,0.08461,-0.1935,-0.147974,0.30685 BS01,0.491270283,0.875826172,, BS02,0.090794476,0.225858954,,,0.32643,0.34317,0.133145295,,,0.115832599,0.47636458, BS03,0.019828221,-0.095735935,-0.122767219,-0.0676,0.002533,-0.1510361,0.736247,2.053192,-0.423658,0.4591219,1.1245015, BS04,-0.435189342,-0.041595955,-0.781281128,-1.923036,-3.2301671020.152322609,-1.495513519,, I am using R to perform a correlation, but I am getting an error while trying to read the data:

person.data <- read.table("datafile.csv", header=TRUE, sep=',', row.names=1)
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
  line 1919 did not have 13 elements
Execution halted

The error looks as though there is a problem with the last element not being read when it is blank. I could introduce terms like NA for the blank elements, but I do not want to do that because it will hinder my future analysis. Can someone suggest a solution to overcome this problem while reading the data? Or is there something I have missed to make the data readable? Thank you in advance. PS: The data was imported from an experiment, saved in an Excel sheet as a *.csv, and then used. You have already had other replies, to which I would add: be sure to read Chapter 8 in the R Data Import/Export manual regarding importing Excel files and other options besides exporting to a CSV file.
In addition, the issue of Excel generating CSV files with the last column missing on some rows is a known issue and is reported in the MSKB here: http://support.microsoft.com/default.aspx?scid=kb;EN-US;q77295 Even though the latest version of Excel listed in the article as being relevant is 97, I had this problem with 2000 and 2003 as well. I would instead use OpenOffice.org's Calc to do the export when this was required. Calc did not have this problem. HTH, Marc Schwartz
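As a quick diagnostic before re-exporting, count.fields() reports the number of separated fields on every line, so any line that does not show the expected 13 is a culprit. A toy illustration (the three-line example is invented; with a real file, pass the file name instead of a text connection):

```r
# The middle line is one field short, as in the Excel export problem.
lines <- c("a,1,2", "b,3", "c,4,5")
n <- count.fields(textConnection(lines), sep = ",")
n               # 3 2 3
which(n != 3)   # 2
```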
[R] Multcomp plotting
I am using the multcomp package for doing multiple comparisons. Since the data I am handling is huge, the number of comparisons is also large. I am interested in:

1. Breaking down my plots to get rid of the clutter that happens when plotting the entire data set. How do I pass only part of the data to the plot function?

fungus.cirec <- simint(Fungus.yield ~ Habitat, data=fungus, conf.level=0.95, type=c("Tukey"))
plot(fungus.cirec) # This plots the entire data. I want to plot part of the data only

2. I am also interested in getting rid of the field name associated with each categorical variable. Here is what part of the data looks like:

Habitat Fungus.yield
Birch 20.83829053
Birch 22.9718181
Birch 22.28216829
Birch 24.23136797
Birch 22.32147961
Birch 20.30783598
Oak 27.24047258
Oak 29.7730014
Oak 30.12608508
Oak 25.76088669
Oak 30.14750974
Hornbeam 17.05307949
Hornbeam 15.32805111
Hornbeam 18.26920177
Hornbeam 21.30987049
Hornbeam 21.7173223

When it plots, it labels a comparison as HabitatBirch-HabitatOak, for example. How do I get rid of the field name Habitat in the plot?

3. How do I tell the method to mark the significant comparisons (i.e. those that do not intersect the zero line)?

Thanks ../Murli
[R] seeking robust test for equality of variances w/ observation weights
Hello R community, I am looking for a robust test for equality of variances that can take observation weights. I realize I can do the F-test with weighted variances, but I've read that this test is not very robust. So I thought about maybe adding a weights argument to John Fox's code for the Levene test (in the car library, levene.test), substituting a weighted.mean for his median function and also including the observation weights in his lm run -- after all, Levene's original test used the mean, not the median. I asked John about it and he doesn't know what the properties of this weighted Levene test would be. Does anyone have any thoughts or suggestions, or know of a robust weighted hypothesis test for equality of variances? Thank you in advance for any advice you can provide, Alexis Diamond [EMAIL PROTECTED]
[R] A statistical question
Dear Users, I have two particular problems that I need to solve. I am doing an analysis of a survey about sexuality. (Don't read any further if you don't like the subject.) Problem (1): In the survey there are two questions about monogamy. (1) Are you monogamous? (At this moment.) (2) Have you been in contact with men and/or women? (Past / Present) For some other inferences I need to extract historical data out of these questions about monogamy, like past monogamy (which hasn't been asked). This should be possible. Is there a reasonable way out of here? Problem (2): I have to do some geographical testing. I have to check the geographical distribution of respondents and some other properties. Could you advise me what to read / use? An R package with help is sufficient (I hope). Thanks, Wilfred
Re: [R] positive semi-definite matrix
Ravi == Ravi Varadhan [EMAIL PROTECTED] on Fri, 21 Jul 2006 11:33:23 -0400 writes: Ravi There is a paper by N.J. Higham (SIAM J Matrix Anal, Ravi 1998) on a modified cholesky decomposition of Ravi symmetric and not necessarily positive definite matrix Ravi (say, A), with an important goal of producing a Ravi small-normed perturbation of A (say, delA), that Ravi makes (A + delA) positive definite. Ravi http://epubs.siam.org/sam-bin/dbq/article/30289 Ravi There is also an algorithm in Gill, Murray and Ravi Wright's text - Practical Optimization (section Ravi 4.4.2). Thanks a lot, Ravi, for the interesting references, in the past I once had looked for such things but did not find any --- most probably because I used wrong keywords. Ravi These may be relevant to your problem. I am not sure Ravi if these algorithms have been implemented in R, for Ravi example, in the matrix library. O... ! It's Matrix and `package', yes `package', yes `package' .. but no, it hasn't been implemented there yet, AFAIK. OTOH, it's not a bad idea to do there, since it's building on the LDL' cholesky factorization which we are using in Matrix in other places anyway. Thanks again for your help! Martin Maechler, ETH Zurich Ravi -- Ravi Ravi Varadhan, Ph.D. Ravi Assistant Professor, The Center on Aging and Health Ravi Division of Geriatric Medicine and Gerontology Ravi Johns Hopkins University Ravi Ph: (410) 502-2619 Ravi Fax: (410) 614-9625 Ravi Email: [EMAIL PROTECTED] Ravi Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html Ravi -- -Original Message- From: [EMAIL PROTECTED] [mailto:r-help- [EMAIL PROTECTED] On Behalf Of Duncan Murdoch Sent: Friday, July 21, 2006 9:45 AM To: roger bos Cc: RHelp Subject: Re: [R] positive semi-definite matrix On 7/21/2006 8:59 AM, roger bos wrote: I have a covariance matrix that is not positive semi-definite matrix and I need it to be via some sort of adjustment. Is there any R routine or package to help me do this? 
I think you need to be more specific about what you have and what you want, but if the matrix is symmetric and nearly positive semi-definite (but not exactly because of rounding error), you might try something like

fixit <- function(A) {
  eig <- eigen(A, symmetric = TRUE)
  eig$values <- pmax(0, eig$values)
  return(eig$vectors %*% diag(eig$values) %*% t(eig$vectors))
}

Rounding error means this is not guaranteed to be positive semi-definite, but it will be very close. Duncan Murdoch
Re: [R] Order-restricted inference
I found this other more recent monograph: Silvapulle and Sen (October 2004), Constrained Statistical Inference: Order, Inequality, and Shape Constraints, Wiley-Interscience. http://ca.wiley.com/WileyCDA/WileyTitle/productCd-0471208272,descCd-reviews.html Thanks for the tip, Bruno - Original Message - From: Thomas Lumley [EMAIL PROTECTED] To: Bruno L. Giordano [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Sent: Friday, July 21, 2006 11:22 AM Subject: Re: [R] Order-restricted inference On Thu, 20 Jul 2006, Bruno L. Giordano wrote: If R packages are not found for my purpose, can somebody please point me to some recent monographs on order-restricted inference? One more recent book is: Robertson, T., Wright, F., Dykstra, R. (1988). Order restricted statistical inference. New York: John Wiley and Sons. -thomas Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle ~~ Bruno L. Giordano, Ph.D. CIRMMT Schulich School of Music, McGill University 555 Sherbrooke Street West Montréal, QC H3A 1E3 Canada http://www.music.mcgill.ca/~bruno/
Re: [R] seeking robust test for equality of variances w/ observation weights
You can always bootstrap any robust spread measure (e.g. mad or higher-efficiency versions from robustbase or other packages). -- Bert Gunter Genentech Non-Clinical Statistics South San Francisco, CA "The business of the statistician is to catalyze the scientific learning process." - George E. P. Box -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alexis Diamond Sent: Friday, July 21, 2006 9:56 AM To: r-help@stat.math.ethz.ch Subject: [R] seeking robust test for equality of variances w/ observation weights Hello R community, I am looking for a robust test for equality of variances that can take observation weights. I realize I can do the F-test with weighted variances, but I've read that this test is not very robust. So I thought about maybe adding a weights argument to John Fox's code for the Levene test (in the car library, levene.test), substituting a weighted.mean for his median function and also including the observation weights in his lm run -- after all, Levene's original test used the mean, not the median. I asked John about it and he doesn't know what the properties of this weighted Levene test would be. Does anyone have any thoughts or suggestions, or know of a robust weighted hypothesis test for equality of variances? Thank you in advance for any advice you can provide, Alexis Diamond [EMAIL PROTECTED]
[R] intersect of list elements
Hi, I have a list of several vectors, for example:

vectorlist
$vector.a.1
[1] "a" "b" "c"
$vector.a.2
[1] "a" "b" "d"
$vector.b.1
[1] "e" "f" "g"

I can use intersect to find elements that appear in $vector.a.1 and $vector.a.2:

intersect(vectorlist[[1]], vectorlist[[2]])
[1] "a" "b"

I would like to use grep to get the vectors by their names matching an expression and to find the intersects between those vectors. For the first step:

vectorlist[grep("vector.a", names(vectorlist))]
$vector.a.1
[1] "a" "b" "c"
$vector.a.2
[1] "a" "b" "d"

Unfortunately, I cannot pass the two vectors as arguments to intersect:

intersect(vectorlist[grep("vector.a", names(vectorlist))])
Error in unique(y[match(x, y, 0)]) : argument "y" is missing, with no default

I am running R Version 2.3.1 (2006-06-01). Could someone help me to solve this? Cheers, Georg
Re: [R] intersect of list elements
FAQ 7.21. But there are perhaps slicker ways. -- Bert Gunter Genentech Non-Clinical Statistics South San Francisco, CA -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Georg Otto Sent: Friday, July 21, 2006 10:39 AM To: r-help@stat.math.ethz.ch Subject: [R] intersect of list elements Hi, I have a list of several vectors, for example:

vectorlist
$vector.a.1
[1] "a" "b" "c"
$vector.a.2
[1] "a" "b" "d"
$vector.b.1
[1] "e" "f" "g"

I can use intersect to find elements that appear in $vector.a.1 and $vector.a.2:

intersect(vectorlist[[1]], vectorlist[[2]])
[1] "a" "b"

I would like to use grep to get the vectors by their names matching an expression and to find the intersects between those vectors. For the first step:

vectorlist[grep("vector.a", names(vectorlist))]
$vector.a.1
[1] "a" "b" "c"
$vector.a.2
[1] "a" "b" "d"

Unfortunately, I cannot pass the two vectors as arguments to intersect:

intersect(vectorlist[grep("vector.a", names(vectorlist))])
Error in unique(y[match(x, y, 0)]) : argument "y" is missing, with no default

I am running R Version 2.3.1 (2006-06-01). Could someone help me to solve this? Cheers, Georg
Re: [R] intersect of list elements
Georg Otto wrote: [...]

Will this work for you?

vectorlist <- list(vector.a.1 = c("a", "b", "c"),
                   vector.a.2 = c("a", "b", "d"),
                   vector.b.1 = c("e", "f", "g"))

intersect2 <- function(...) {
  args <- list(...)
  nargs <- length(args)
  if (nargs <= 1) {
    if (nargs == 1 && is.list(args[[1]])) {
      do.call("intersect2", args[[1]])
    } else {
      stop("cannot evaluate intersection with fewer than 2 arguments")
    }
  } else if (nargs == 2) {
    intersect(args[[1]], args[[2]])
  } else {
    intersect(args[[1]], intersect2(args[-1]))
  }
}

vector.a <- vectorlist[grep("vector.a", names(vectorlist))]
intersect2(vector.a)
intersect2(vectorlist)

HTH,

--sundar
Re: [R] intersect of list elements
The following assumes that within each component of vectorlist the vector elements are unique. In that case the first two lines define vectorlist and perform the grep, as in your post. Elements of the intersection must occur n times, where n is the number of components of vectorlist that match the grep, and those elements are extracted in the last line.

# from your post
vectorlist <- list(vector.a.1 = c("a", "b", "c"),
                   vector.a.2 = c("a", "b", "d"),
                   vector.b.1 = c("e", "f", "g"))
idx <- grep("vector.a", names(vectorlist))

# get intersection
names(which(table(unlist(vectorlist[idx])) == length(idx)))

On 7/21/06, Georg Otto wrote: [...]
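For completeness: later R releases (2.6.0 and up, so after this thread) added Reduce(), which folds intersect() over the grep-matched components directly. A sketch on the thread's example data:

```r
# n-way intersection by folding intersect() over the matched components
# (Reduce() is available from R 2.6.0 onwards)
vectorlist <- list(vector.a.1 = c("a", "b", "c"),
                   vector.a.2 = c("a", "b", "d"),
                   vector.b.1 = c("e", "f", "g"))

idx <- grep("vector.a", names(vectorlist))
common <- Reduce(intersect, vectorlist[idx])
common  # "a" "b"
```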
[R] connection to X11 problem
Dear List,

I am a new Mac user and I am having problems generating png (or jpeg) files using the GUI version of R. I installed R-2.3.1.dmg (custom install with everything selected) and X11User.pkg, but I am still getting the following X11 connection error when I try to generate a png (or a jpeg):

Error in X11(paste("png::", filename, sep = ""), width, height, pointsize, :
        unable to start device PNG
In addition: Warning message:
unable to open connection to X11 display ''

I tried to set the DISPLAY variable using the command Sys.putenv(DISPLAY=":0"), but I am still running into the same problem. Is there anything else I need to do or install in order to use X11? I am using an Intel Core Duo processor and OS X 10.4.7.

Thank you for your help,
Agnes
Re: [R] Problems with character spacing in windows metafiles...
I finally determined the problem, but not the specific cause. For some reason, the WMF is resized when saved or copied from R. If you compare the same plot pasted into PowerPoint as both a bitmap and a WMF, the WMF has its height increased ~14% and its width decreased ~5% consistently. This is true whether you copy/paste or save the WMF. At least, that is what occurs on my Windows XP machine with Office 2003 and R 2.3.1 installed.
Re: [R] connection to X11 problem
On Fri, 2006-07-21 at 11:47 -0700, Paquet, Agnes wrote: [...]

I don't use a Mac, but the following might be helpful:

http://cran.r-project.org/bin/macosx/RMacOSX-FAQ.html#X11-window-server-_0028optional_0029
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/77424.html
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/46097.html

Also note that there is an R-sig-Mac e-mail list:

https://stat.ethz.ch/mailman/listinfo/r-sig-mac

HTH,

Marc Schwartz
Re: [R] positive semi-definite matrix
Martin, You are most welcome. I apologize for my faux pas. I really did mean to say Matrix package, but got sloppy! There is also another (more recent) article by Higham: http://www.maths.man.ac.uk/~nareports/narep369.pdf Best, Ravi. -- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -- -Original Message- From: Martin Maechler [mailto:[EMAIL PROTECTED] Sent: Friday, July 21, 2006 1:14 PM To: Ravi Varadhan Cc: 'Duncan Murdoch'; 'roger bos'; 'RHelp'; [EMAIL PROTECTED] Subject: Re: [R] positive semi-definite matrix Ravi == Ravi Varadhan [EMAIL PROTECTED] on Fri, 21 Jul 2006 11:33:23 -0400 writes: Ravi There is a paper by N.J. Higham (SIAM J Matrix Anal, Ravi 1998) on a modified cholesky decomposition of Ravi symmetric and not necessarily positive definite matrix Ravi (say, A), with an important goal of producing a Ravi small-normed perturbation of A (say, delA), that Ravi makes (A + delA) positive definite. Ravi http://epubs.siam.org/sam-bin/dbq/article/30289 Ravi There is also an algorithm in Gill, Murray and Ravi Wright's text - Practical Optimization (section Ravi 4.4.2). Thanks a lot, Ravi, for the interesting references, in the past I once had looked for such things but did not find any --- most probably because I used wrong keywords. Ravi These may be relevant to your problem. I am not sure Ravi if these algorithms have been implemented in R, for Ravi example, in the matrix library. O... ! It's Matrix and `package', yes `package', yes `package' .. but no, it hasn't been implemented there yet, AFAIK. OTOH, it's not a bad idea to do there, since it's building on the LDL' cholesky factorization which we are using in Matrix in other places anyway. Thanks again for your help! Martin Maechler, ETH Zurich Ravi -- Ravi Ravi Varadhan, Ph.D. 
Ravi Assistant Professor, The Center on Aging and Health
Ravi Division of Geriatric Medicine and Gerontology
Ravi Johns Hopkins University
Ravi Ph: (410) 502-2619
Ravi Fax: (410) 614-9625
Ravi Email: [EMAIL PROTECTED]
Ravi Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:r-help-[EMAIL PROTECTED]] On Behalf Of Duncan Murdoch
Sent: Friday, July 21, 2006 9:45 AM
To: roger bos
Cc: RHelp
Subject: Re: [R] positive semi-definite matrix

On 7/21/2006 8:59 AM, roger bos wrote:
> I have a covariance matrix that is not a positive semi-definite matrix and I need it to be via some sort of adjustment. Is there any R routine or package to help me do this?

I think you need to be more specific about what you have and what you want, but if the matrix is symmetric and nearly positive semi-definite (but not exactly, because of rounding error), you might try something like

fixit <- function(A) {
  eig <- eigen(A, symmetric = TRUE)
  eig$values <- pmax(0, eig$values)
  return(eig$vectors %*% diag(eig$values) %*% t(eig$vectors))
}

Rounding error means this is not guaranteed to be positive semi-definite, but it will be very close.

Duncan Murdoch
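To see Duncan's eigenvalue-clipping suggestion in action, here is a small self-contained check on a symmetric but indefinite matrix (the example matrix is made up for illustration; as the Higham references in the thread note, clipping negative eigenvalues of a symmetric matrix to zero yields the nearest positive semi-definite matrix in the Frobenius norm):

```r
# Duncan Murdoch's suggestion: clip negative eigenvalues to zero.
# Assumes A is symmetric, possibly indefinite due to rounding.
fixit <- function(A) {
  eig <- eigen(A, symmetric = TRUE)
  eig$values <- pmax(0, eig$values)
  eig$vectors %*% diag(eig$values) %*% t(eig$vectors)
}

A <- matrix(c(1, 2,
              2, 1), 2, 2)          # eigenvalues 3 and -1: not PSD
B <- fixit(A)
eigen(B, symmetric = TRUE)$values   # now 3 and (numerically) 0
```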
[R] multicomp plotting
I am using the multcomp package for doing multiple comparisons. Since the data I am handling is huge, the number of comparisons is also large. I am interested in:

1. Breaking down my plots to get rid of the clutter that happens when plotting the entire data set. How do I pass only part of the data to the plot function?

fungus.cirec <- simint(Fungus.yield ~ Habitat, data = fungus, conf.level = 0.95, type = c("Tukey"))
plot(fungus.cirec)  # This plots the entire data. I want to plot part of the data only

2. I am also interested in getting rid of the field name associated with each categorical variable. Here is what part of the data looks like:

Habitat   Fungus.yield
Birch     20.83829053
Birch     22.9718181
Birch     22.28216829
Birch     24.23136797
Birch     22.32147961
Birch     20.30783598
Oak       27.24047258
Oak       29.7730014
Oak       30.12608508
Oak       25.76088669
Oak       30.14750974
Hornbeam  17.05307949
Hornbeam  15.32805111
Hornbeam  18.26920177
Hornbeam  21.30987049
Hornbeam  21.7173223

It plots labels as HabitatBirch-HabitatOak, for example. How do I get rid of the field name Habitat in the plot?

3. How do I tell the method to mark the significant comparisons (i.e. those that do not intersect the zero line)?

Thanks,
../Murli
[R] character to vector
I have an object of mode character that contains a long sequence of letters. How can I convert this object into a vector with each element of the vector containing a single letter? Essentially, I want to break the long string of letters into many individual letters. Thanks for the help. Alex
[R] table elements testing
Hi everybody, I'm dealing with some percentage tables, of which I should test row-wise whether the entries are significantly equal or not. Namely, on row 1, test H0: element 1 = element 2, H0: element 1 = element 3, ..., H0: element 2 = element 3, ..., H0: element n-1 = element n. The same on the other rows. Does anybody know how this can be done in a quick way? I don't have large matrices, but it seems quite tedious... Thank you very much in advance, Emanuele
Re: [R] character to vector
On Fri, 2006-07-21 at 15:15 -0500, [EMAIL PROTECTED] wrote: [...]

> letters
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q"
[18] "r" "s" "t" "u" "v" "w" "x" "y" "z"

> MyChar <- paste(letters, collapse = "")
> MyChar
[1] "abcdefghijklmnopqrstuvwxyz"

> MyVec <- unlist(strsplit(MyChar, ""))
> MyVec
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q"
[18] "r" "s" "t" "u" "v" "w" "x" "y" "z"

See ?strsplit and ?unlist, and of course ?paste for the reverse operation as above.

HTH,

Marc Schwartz
[R] One arm survival sample calculations
Does anyone know of a function that computes the necessary sample size to reject a null median survival in favor of a given alternative median survival in a one-arm trial?

Cody Hamilton, Ph.D
Institute for Health Care Research and Improvement
Baylor Health Care System
(214) 265-3618
Re: [R] Weibull distribution
Thanks for the suggestion! I switched to optimize(),

al <- optimize(f.fn, lower = 0.1, upper = 100, tol = 0.001)

the warnings were gone, and it works stably. But when I tried

al <- uniroot(f.fn, lower = 0.1, upper = 100, tol = 0.001)

an error occurred: f() values at end points not of opposite sign. The error seems to me to say that there is no root within the interval. I was not able to solve this problem. Thanks!

Leaf

----- Original Message -----
From: Thomas Lumley
Sent: 2006-07-21, 09:35:11
To: Valentin Dimitrov
Subject: Re: [R] Weibull distribution

On Fri, 21 Jul 2006, Valentin Dimitrov wrote:

> Dear Leaf, I modified your code as follows:
>
> gamma.fun <- function(mu, sd, start = 100) {
>   f.fn <- function(alpha) {
>     abs(sd^2 - mu^2/(gamma(1+1/alpha))^2 * (gamma(1+2/alpha) - (gamma(1+1/alpha))^2))
>   }
>   alpha <- optim(start, f.fn)
>   beta <- mu/gamma(1 + 1/alpha$par)
>   return(list = c(a = alpha$par, b = beta))
> }
>
> Now it works properly. First, I added an abs(). You tried to solve an equation by means of the R function optim(), which finds a minimum. That's why you can find the solution of f(x)=a through minimization of abs(f(x)-a). Second, I deleted the optim method "BFGS" from the optim() call, because it is not appropriate in this case.

optim() is not appropriate at all in this case -- its help page says to use optimize() for one-dimensional problems. In fact, in one dimension there isn't any need to resort to optimization when you really want root-finding, and uniroot() is more appropriate than optimize().

        -thomas
Re: [R] optim()
Dear Spencer,

Thank you very much for your helpful reply. I was trying to reproduce a table in one paper. After I modified my code according to your suggestion, I was able to get results that are very close to those in the paper. It seems the starting values of the parameters to be optimized are very crucial, so I will get different optimal values for different starting vectors. How can I be sure the min value returned by optim() is the true optimal value? I am also curious why you chose the constant penalty to handle the constraint in the first place. Why not use the Lagrange multiplier method to eliminate the constraint? Thanks again. I am grateful for your help.

Best regards,
Iris

On 7/18/06, Spencer Graves wrote:

I have had good luck translating constrained into unconstrained problems and then optimizing the unconstrained problem. Have you tried something like the following:

Define z = c(z1, z2, z3), where p1 = 1/(1+exp(-z1)), etc. This translates the constraints on the p's to

G(z) = P*(f1(z)-r12*f2(z))^2 - f1(z)

where f1(z) = f1(p1(z1), p2(z2), p3(z3)), and similarly for f2(z), and where P = a penalty term and r12 = (1-c)*k1/(c*(1-k1)).

Can f2(z) ever go outside (0, 1)? If yes, I would modify G(z) by adding a term like (min(0, f2(z), 1-f2(z)))^2.

If I haven't made a math error, your problem should translate into this form. I first solve this problem for z with P small, like 1. Then, after I've got a solution for that, I increase P to 2, then 10, then 100, etc., until the penalty is so great that the desired equality has been effectively achieved. With 'P' fixed, 'optim' should handle this kind of problem handily. To learn how, I suggest you work through the examples on the ?optim help page. I'd ignore the gradient, at least initially. A silly math error in computing the gradient can delay a solution unnecessarily. If you need to solve thousands of problems like this for different values of k1 and 'c', I might later program the gradient.
However, I would not do that initially. Also, if you are not already familiar with Venables and Ripley (2002) Modern Applied Statistics with S, 4th ed. (Springer -- or an earlier edition), I would encourage you to spend some quality time with this book. It can help you with 'optim', with contour plots, etc.

Hope this helps,
Spencer Graves

Iris Zhao wrote:

Dear all, I am working on an optimization problem and have some trouble running optim(). I have two functions (f1, f2) and 4 unknown parameters (p1, p2, p3, p4). Both f1 and f2 are functions of p1, p2, and p3, denoted by f1(p1, p2, p3) and f2(p1, p2, p3) respectively. The goal is to maximize f1(p1, p2, p3) subject to two constraints:

(1) c = k1*p4/(k1*p4+(1-k1)*f1(p1,p2,p3)), where c and k1 are some known constants
(2) p4 = f2(p1, p2, p3)

In addition, each parameter ranges from 0 to 1, and both f1 and f2 involve integrations. I tried to use Lagrange multipliers to eliminate the two equality constraints and then use optim() to find the maximum value and optimal parameter estimates. So I let fn be

f1 + lambda1*(c - k1*p4/(k1*p4+(1-k1)*f1(p1,p2,p3))) + lambda2*(p4 - f2(p1,p2,p3))

The error message I got was: Error in fn(par, ...) : recursive default argument reference. I wonder whether the current built-in functions in R can do this type of job. Any suggestion will be greatly appreciated.

Iris
Re: [R] connection to X11 problem
On 21-Jul-06 Paquet, Agnes wrote: [...]

Like Marc, I don't use a Mac either. But the underlying BSD OS is basically similar to Linux. On Linux, my primary X11 DISPLAY envvar would be ":0.0", so (at a guess) I suggest you try

Sys.putenv(DISPLAY=":0.0")

Hoping this helps!
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 21-Jul-06 Time: 23:34:05
-- XFMail --
Re: [R] Parameterization puzzle
Thanks to Brian and Berwin for their help. I faced a double problem in that I not only wanted to fit the model but I also wanted to do so in such a way that it would seem natural for a classroom example. The moral seems to be that I should do the orthogonal polynomial stuff outside the model formula. Here then is my solution:

pyears <- scan()
18793 52407 10673 43248 5710 28612 2585 12663 1462 5317

deaths <- scan()
2 32 12 104 28 206 28 186 31 102

l <- log(pyears)
Smoke <- gl(2, 1, 10, labels = c("No", "Yes"))
Age <- gl(5, 2, 10, labels = c("35-44", "45-54", "55-64", "65-74", "75-84"), ordered = TRUE)
mod1.glm <- glm(deaths ~ Age * Smoke + offset(l), family = poisson)
summary(mod1.glm)
age <- as.numeric(Age)
age1 <- poly(age, 2)[, 1]
age2 <- poly(age, 2)[, 2]
mod2.glm <- glm(deaths ~ age1 + age2 + Smoke + age1:Smoke + offset(l), family = poisson)
summary(mod2.glm)

The final summary then comes out looking like this:

              Estimate Std. Error z value Pr(>|z|)
(Intercept)    -5.8368     0.1213 -48.103  < 2e-16 ***
age1            5.3238     0.4129  12.893  < 2e-16 ***
age2           -1.0460     0.1448  -7.223 5.08e-13 ***
SmokeYes        0.5183     0.1262   4.106 4.02e-05 ***
age1:SmokeYes  -1.3755     0.4340  -3.169  0.00153 **

which is just what I wanted. Cheers, Murray Jorgensen

Prof Brian Ripley wrote:

On Fri, 21 Jul 2006, Berwin A Turlach wrote:

G'day all,

BDR> R does not know that poly(age,2) and poly(age,1) are linearly dependent.

Indeed, I also thought that this is the reason for the problem.

BDR> (And indeed they only are for some functions 'poly'.)

I am surprised about this. I should probably read the help page of 'poly' once more, and more carefully.

My point was perhaps simpler: if you or I or Murray had a function poly() in our workspace, that one would be found, I think. (I have not checked the ramifications of namespaces here, but I believe that would be the intention, that formulae are evaluated in their defining environment.)
So only when the model matrix is set up could the linear dependence be known (and there is nothing in the system poly() to tell model.matrix). What is sometimes called extrinsic aliasing is left to the fitting function, which seems to do a sensible job provided the main effect is in the model. Indeed, including interactions without main effects (as Murray did) often makes the results hard to interpret.

BDR> I cannot reproduce your example ('l' is missing), [...]

My guess is that 'l' is 'pyears'. At least, I worked under that assumption.

Apparently l = log(pyears), which was my uncertain guess.

Interestingly, on my machine (using R 2.3.1, 2.2.1 and 2.1.1) I cannot fit any of the Poisson GLMs that Murray tried. I always get the error message: Error: no valid set of coefficients has been found: please supply starting values

Related to the offset, I believe.

But I have to investigate this further. I can fit binomial models that give me similar answers.

BDR> [...] but perhaps glm(deaths ~ poly(age,2) + poly(age,1)*Smoke + offset(l), poisson) was your intention?

In this parameterisation a 'poly(age,1)' term will appear among the coefficients with an estimated value of NA, since it is aliased with 'poly(age, 2)1'. So I don't believe that this was Murray's intention. The only suggestion I can come up with is:

summary(glm(cbind(deaths, l-deaths) ~ age*Smoke + I(age^2), family = binomial))
[...]
Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  -10.79895    0.45149 -23.918  < 2e-16 ***
age            2.37892    0.20877  11.395  < 2e-16 ***
SmokeYes       1.44573    0.37347   3.871 0.000108 ***
I(age^2)      -0.19706    0.02749  -7.168  7.6e-13 ***
age:SmokeYes  -0.30850    0.09756  -3.162 0.001566 **
[...]

Which doesn't use orthogonal polynomials anymore. But I don't see how you can fit the model that Murray wants to fit using orthogonal polynomials, given the way R's model language operates.
So I guess the Poisson GLM that Murray wants to fit is:

glm(deaths ~ age*Smoke + I(age^2) + offset(l), family = poisson)

Cheers,
Berwin

== Full address
Berwin A Turlach                      Tel.: +61 (8) 6488 3338 (secr)
School of Mathematics and Statistics        +61 (8) 6488 3383 (self)
The University of Western Australia   FAX : +61 (8) 6488 1028
35 Stirling Highway Crawley WA 6009   e-mail: [EMAIL PROTECTED]
Australia                             http://www.maths.uwa.edu.au/~berwin

--
Dr Murray Jorgensen    http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: [EMAIL PROTECTED]    Fax +64 7 838 4155
Phone +64 7 838 4773 wk    Home +64 7 825 0441    Mobile 021 1395 862
Re: [R] connection to X11 problem: problem fixed
Hi, I finally managed to make it work (I just needed to have an X11 window open). Thank you very much for your help. Agnes

-----Original Message-----
From: Ted Harding
Sent: Friday, July 21, 2006 3:34 PM
To: Paquet, Agnes
Cc: r-help@stat.math.ethz.ch
Subject: RE: [R] connection to X11 problem

[...]
Re: [R] unexpected results
Thanks, you are right about it being a tab-delimited file - I should have spotted that. I am now getting an error saying that line 4 did not have 27 elements, but I will fiddle around and try to work it out - I'm guessing that because I have some empty fields it's causing problems. Anyway, thanks for the different bits of help.
[R] Putting x into the right subset
Dear all, I'm sorry I have to bother you with this newbie stuff. Within a loop I obtain pairs of values x <- c(a, b). An empty set M is defined before the loop (as a list or whatever). Now I want to do the following: if there is a vector y in M that contains at least one of the values of x, then x shall be concatenated to y. If y doesn't contain any of the values in x, then x shall be put into M as a new vector. I first imagined it to be trivial, but I'm having a hard time with the if-statement. I tried to define a function contains(value, vector), which returns FALSE or TRUE, but R complains that I tried to execute a non-function... Could anybody help a despairing newbie, please! Thanks a lot, John
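The test John describes can be written with any() and %in% rather than a hand-rolled contains(). A minimal sketch of the whole update step (the function name add_pair is made up, and union() is used for the concatenation, so duplicates are dropped when merging):

```r
# M: a list of vectors; x: a pair c(a, b)
add_pair <- function(M, x) {
  for (i in seq(along = M)) {
    if (any(x %in% M[[i]])) {       # does y = M[[i]] contain a value of x?
      M[[i]] <- union(M[[i]], x)    # then concatenate x to y
      return(M)
    }
  }
  c(M, list(x))                     # otherwise put x into M as a new vector
}

M <- list()
M <- add_pair(M, c(1, 2))   # no overlap: new vector
M <- add_pair(M, c(2, 3))   # shares 2: merged into the first vector
M <- add_pair(M, c(7, 8))   # disjoint: added as a new vector
M                           # list(c(1, 2, 3), c(7, 8))
```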
Re: [R] Weibull distribution
It seems to me that not for all values of mu and sd there is a Weibull distribution with mean=mu and variance=sd^2. the programm with optimize(f.fn) finds always a solution, but this is not necessarily what we need, because the minimum of (abs(f(x)) is not always 0. Suppose f(x)=2+x^2, then optimize(x) finds x=0, but x=0 is not a root of f(x)=0. That's why I agree with Thomas Lumley, that uniroot() could be more appropriate than optim and optimize. Best regards, Valentin --- Leaf Sun [EMAIL PROTECTED] wrote: Thanks for the suggestion! I switched to optimize(), al - optimize(f.fn, lower = 0.1, upper =100,tol=0.001); the warnings were gone and it works stably. But when I tried al - uniroot(f.fn, lower = 0.1, upper =100,tol=0.001); error occured: f() values at end points not of opposite sign. The error seems to me like there is no root found within the interval. I was not able to solve this problem. Thanks! Leaf - Original Message - From: Thomas Lumley, [EMAIL PROTECTED] Sent: 2006-07-21, 09:35:11 To: Valentin Dimitrov, [EMAIL PROTECTED] Subject: Re: [R] Weibull distribution On Fri, 21 Jul 2006, Valentin Dimitrov wrote: Dear Leaf, I modified your code as follows: gamma.fun - function(mu,sd,start=100) { f.fn - function(alpha) {abs(sd^2-mu^2/(gamma(1+1/alpha))^2*(gamma(1+2/alpha)-(gamma(1+1/alpha))^2))} alpha - optim(start, f.fn) beta - mu/gamma(1+1/alpha$par) return(list=c(a=alpha$par,b=beta)); } Now it works properly. First, I added an abs(). You tried to solve an equation by means of the R-function optim(), which finds a minimum. That's why you can find the solution of f(x)=a through minimization of abs(f(x)-a). Second, I deleted the optim-method BFGS from the optim() function, because it is not appropriate in this case. optim() is not appropriate at all in this case -- its help page says to use optimize() for one-dimensional problems. 
In fact, in one dimension there isn't any need to resort to optimization when you really want root-finding, and uniroot() is more appropriate than optimize(). -thomas
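The uniroot() approach the thread converges on can be sketched as follows (the function name weibull.par is hypothetical). Since the Weibull mean is beta*gamma(1+1/alpha) and the variance is beta^2*(gamma(1+2/alpha) - gamma(1+1/alpha)^2), the ratio var/mean^2 depends only on the shape alpha, so alpha can be found by root-finding on the squared coefficient of variation and beta recovered afterwards:

```r
# Solve for Weibull shape (alpha) and scale (beta) matching target mean mu
# and standard deviation sd, using uniroot() on the shape-only equation.
weibull.par <- function(mu, sd, lower = 0.02, upper = 100) {
  cv2 <- (sd / mu)^2   # squared coefficient of variation
  # g(alpha) = var/mean^2 - cv2; monotone in alpha, so sign change brackets root
  g <- function(alpha) gamma(1 + 2/alpha) / gamma(1 + 1/alpha)^2 - 1 - cv2
  alpha <- uniroot(g, lower = lower, upper = upper, tol = 1e-8)$root
  beta  <- mu / gamma(1 + 1/alpha)
  c(shape = alpha, scale = beta)
}

p <- weibull.par(10, 2)
```

This also makes Valentin's point concrete: if uniroot() reports "values at end points not of opposite sign", no alpha in the bracket satisfies the equation, whereas optimize() would silently return the nearest (non-zero) minimum.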
[R] nested repeated measures in R
R help, How would I input data, verify assumptions, and run a nested repeated measures ANOVA using R? I have STATION nested in SITE nested in BLOCK, with measurements repeated for five YEARs. All are random variables and the design is only slightly unbalanced. I'm trying to characterize spatiotemporal variation in stream habitat variables. Thanks for your help! Eric Merten
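One common way to handle this design, sketched under assumptions (the data frame habdat and response resp are hypothetical, simulated here only so the code runs): fit a mixed model with nlme::lme, which handles mild imbalance better than a classical aov() with an Error() term and returns the variance components for each level of the spatial nesting.

```r
library(nlme)

set.seed(1)
# Hypothetical balanced layout standing in for the real stream data:
# 4 blocks x 3 sites x 3 stations, measured in each of 5 years.
habdat <- expand.grid(BLOCK = factor(1:4), SITE = factor(1:3),
                      STATION = factor(1:3), YEAR = factor(1:5))
habdat$resp <- as.numeric(habdat$BLOCK) +          # block-level variation
  0.5 * as.numeric(habdat$SITE) +                  # site-within-block variation
  rnorm(nrow(habdat))                              # residual noise

# Random intercepts for the nesting BLOCK / SITE-in-BLOCK / STATION-in-SITE;
# YEAR enters as the repeated-measures factor.
fit <- lme(resp ~ YEAR, random = ~ 1 | BLOCK/SITE/STATION, data = habdat)
VarCorr(fit)   # variance components at each level of nesting

# Classical ANOVA analogue (strictly balanced data only):
# summary(aov(resp ~ YEAR + Error(BLOCK/SITE/STATION), data = habdat))
```

Assumptions (normal residuals, homogeneous variances) can be checked with plot(fit) and qqnorm(resid(fit)); a correlation structure for the repeated YEAR measurements can be added via lme's correlation argument if needed.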
Re: [R] optim()
1. I'd be worried any time the answer depended very much on the starting values. It suggests that the objective function is not well behaved, and I must be very careful to make sure I get an appropriate answer and am not being misled by round-off, etc. 2. I would NOT use a constant penalty; I'd start with a small constant penalty, then increase it gradually until I had a solution that honestly satisfied the constraints. 3. Alternatively, you could use Lagrange multipliers by redefining your objective function and including the Lagrange multiplier(s) as other parameter(s) to be estimated. That sounds like a sensible idea, but I have no experience trying that. I would expect it might work fine provided the objective function and constraints were all differentiable and sufficiently smooth to admit only one optimum. 4. However, I believe you will gain more insight into the problem by trying to reduce the number of unknowns rather than increase them. For example, I noted in my earlier reply that your constraints are equivalent to f1 = r12*f2, with r12 = (1-c)*k1/(c*(1-k1)) = a constant determined from known constants in your previous problem statement. If it were my problem, I might focus on trying to use this constraint to determine one of (p1, p2, p3) as a function of the other two. For example, for what combinations in the (p1, p2) space is there a single, unique solution, when are there 0, and when are there 2, 3, or more? To accomplish that, I might use 'expand.grid' to generate many different combinations of (p1, p2) and then use 'optim' to minimize SSD = (f1-r12*f2)^2 over variations in p3. (Of course, if it were easier to solve for p1 or p2 in terms of the other two, I might do that.) For each combination of (p1, p2), I'd store the resulting values of SSD, p3, and f1. Then if any of the SSD values were numerically greater than 0, I'd worry about those cases. Otherwise, I'd look at contour and perspective plots of p3 and f1 vs. 
p1 and p2 to try to generate some insight into this problem -- and perhaps generate a simple way to solve for p3 and f1 from p1 and p2. Make sense? Hope this helps. Spencer Graves Iris Zhao wrote: Dear Spencer, Thank you very much for your helpful reply. I was trying to reproduce a table in one paper. After I modified my code according to your suggestion, I was able to get results that are very close to those in the paper. It seems the starting values of the parameters to be optimized are very crucial, so I obtain different optimal values for different starting vectors. How can I be sure the minimum value returned by optim() is the true optimal value? I am also curious why you chose the constant penalty to handle the constraint in the first place. Why not use the Lagrange multiplier method to eliminate the constraint? Thanks again. I am grateful for your help. Best regards, Iris On 7/18/06, Spencer Graves [EMAIL PROTECTED] wrote: I have had good luck translating constrained into unconstrained problems and then optimizing the unconstrained problem. Have you tried something like the following? Define z = c(z1, z2, z3), where p1 = 1/(1+exp(-z1)), etc. This translates the constraints on the p's to G(z) = P*(f1(z)-r12*f2(z))^2 - f1(z), where f1(z) = f1(p1(z1), p2(z2), p3(z3)), and similarly for f2(z), and where P = a penalty term and r12 = (1-c)*k1/(c*(1-k1)). Can f2(z) ever go outside (0, 1)? If yes, I would modify G(z) by adding a term like min(0, f2(z), 1-f2(z))^2. If I haven't made a math error, your problem should translate into this form. I first solve this problem for z with P small, like 1. Then after I've got a solution for that, I increase P to 2, then 10, then 100, etc., until the penalty is so great that the desired equality has been effectively achieved. With 'P' fixed, 'optim' should handle this kind of problem handily. To learn how, I suggest you work through the examples in the ?optim help page. I'd ignore the gradient, at least initially. 
A silly math error in computing the gradient can delay a solution unnecessarily. If you need to solve thousands of problems like this for different values of k1 and 'c', I might later program the gradient. However, I would not do that initially. Also, if you are not already familiar with Venables and Ripley (2002) Modern Applied Statistics with S, 4th ed. (Springer -- or an earlier edition), I would encourage you to spend some quality time with this book. It can help you with 'optim', with contour plots, etc. Hope this helps, Spencer Graves Iris Zhao wrote: Dear all, I am working on an optimization problem and have some trouble running optim(). I have two functions (f1, f2) and 4 unknown parameters (p1, p2, p3, p4). Both f1 and f2 are functions
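Spencer's penalty scheme can be illustrated with a toy problem, since Iris's actual f1 and f2 are not shown in the thread: the stand-in functions below are hypothetical, chosen only so the mechanics are visible. Parameters are kept in (0, 1) via the logistic transform p = 1/(1+exp(-z)), the equality f1 = r12*f2 is enforced by a penalty P, and P is increased stepwise with warm starts.

```r
# Toy illustration of the increasing-penalty scheme from the thread.
# f1 and f2 are hypothetical stand-ins for the poster's functions.
logistic <- function(z) 1 / (1 + exp(-z))
r12 <- 2                                # constraint ratio, f1 = r12 * f2
f1  <- function(p) p[1] * p[2]          # toy objective to maximize
f2  <- function(p) p[3]                 # toy constraint partner

# Penalized negative objective G(z) = P*(f1 - r12*f2)^2 - f1, as in the post.
G <- function(z, P) {
  p <- logistic(z)
  P * (f1(p) - r12 * f2(p))^2 - f1(p)
}

z <- c(0, 0, 0)                         # start at p = (0.5, 0.5, 0.5)
for (P in c(1, 10, 100, 1e4)) {         # tighten the penalty gradually,
  z <- optim(z, G, P = P)$par           # warm-starting from the previous fit
}
p <- logistic(z)                        # constraint violation shrinks as P grows
```

By the final P, (f1(p) - r12*f2(p)) is driven close to 0 while f1 is still pushed toward its constrained maximum, which is exactly the behavior point 2 above asks for; a single large constant P from the start tends to make the surface badly conditioned for Nelder-Mead.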