Re: [R] Processing a large number of files
Douglas Bates wrote:

> I maintain the Devore5 package, which contains the data sets from the 5th edition of Jay Devore's text Probability and Statistics for Engineering and the Sciences. The 6th edition has now been published and it includes several new data sets in exercises and examples. In addition, some exercises and examples from the 5th edition are renumbered in the 6th edition. I face the daunting task of adding and documenting the new data sets and updating the numbering.
>
> I had thought of going back to the text files but discovered that it was easier to work from another form. A CD-ROM with the book provides the data sets in several different formats, including SPSS saved data sets. I was pleasantly surprised that I could write an R script that read the data from the .sav file, converted it to an R data frame, converted the SPSS name such as ex01-11.sav to an allowable R name (ex01.11), and saved the resulting data set in a new directory.
>
> In the past I would have written Python or Perl scripts to do all the manipulations of iterating over files, but with the current facilities in R for listing file names, etc., I can do the whole thing in R. My script, which worked on the first try, is
>
>     library(foreign)
>     SPSS = "/cdrom/Manual Install/Datasets/SPSS/"  # change as appropriate
>     Rdata = "/tmp/Devore6/data/"                   # change as appropriate
>     chapters = c("CH01", "CH04", "CH06", "CH07", "CH08", "CH09", "CH10",
>                  "CH11", "CH12", "CH13", "CH15", "Ch14", "Ch16")
>     for (ch in chapters) {
>         path = paste(SPSS, ch, sep = '')
>         # pattern is a regular expression, so anchor and escape the extension
>         files = list.files(path = path, pattern = '\\.sav$')
>         for (ff in files) {
>             # ex01-11.sav -> ex01.11
>             dsn = gsub('-', '.', gsub('\\.sav$', '', ff))
>             assign(dsn, data.frame(read.spss(paste(path, ff, sep = '/'))))
>             save(list = dsn, file = paste(Rdata, dsn, ".rda", sep = ''))
>         }
>     }
>
> In fact this script processed the 326 files so quickly that I thought I must have made a mistake and somehow missed most of the files. I had to look in the output directory to convince myself that it had indeed run properly.
> I would encourage others to consider using list.files, gsub, etc. within R for such scripting applications.

Doug, indeed, it's great. The main part of the current automated scripts for compiling R binary packages for Windows is done in R, including processing of files (e.g. checking which of the 2xx CRAN packages have been updated) and generation of Windows *.bat files for the final processing and upload steps. In principle, the whole thing could be done in a single R script (but that would be more difficult to debug, hence it is not implemented that way).

Uwe

Uwe Ligges

__ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
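The name-conversion step of the script can be tried on its own, without the CD-ROM. This is only a minimal sketch: the two file names are made up in the style of ex01-11.sav, and the check with make.names() is an addition for illustration, not part of the original script.

```r
## Hypothetical file names in the style used on the CD-ROM
files <- c("ex01-11.sav", "ex04-02.sav")

## Drop the extension, then turn '-' into '.' so the result is a legal R name
dsn <- gsub("-", ".", sub("\\.sav$", "", files))

## make.names() returns its input unchanged when it is already a valid name
valid <- dsn == make.names(dsn)

print(dsn)
```

Anchoring the pattern with `$` and escaping the dot keeps a name such as a hypothetical `mysav-01.txt` from being touched by accident.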
Re: [R] using getBioC()
There was a post in the Bioconductor mailing list regarding the same error message:

https://stat.ethz.ch/pipermail/bioconductor/2003-February/000854.html

One possible solution mentioned there was to remove the line in getBioC() that checks the capabilities. Since it seems likely that you do have access to http from your machine, even if the R version doesn't know that, you could try saving the script to a local file and editing it to change the line that reads:

    http <- as.logical(capabilities(what="http/ftp"))

to:

    ##http <- as.logical(capabilities(what="http/ftp"))
    http <- TRUE

then source your local copy of the file, and finally try re-running the function.

I don't understand why the capability would be detected as FALSE. I built my version on Linux, but never made any explicit selection of the capability. I do see some other items listed by the capabilities() function that I did configure. (It wouldn't be some oddity of the Mac OS, would it?...)

Anyway, I hope this helps.

Cheers,
Bill Barnard

On Tue, 2003-07-22 at 16:43, Carol Foster wrote:

> Hello,
> I am trying to install R/Bioconductor on a G4 Mac running OS X. I have successfully installed R so that a command window opens, but installation of the downloaded Bioconductor package is giving me trouble. After copying and pasting the Bioconductor installation script into the window and typing getBioC(), I get the following error message.
>
>     Error in getBioC(): R not currently configured to allow HTTP connections, which is required for getBioC to work properly.
>
> Any suggestions would be greatly appreciated.
> Sincerely,
> Carol Foster
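Before editing the script, it may be worth looking at what R itself reports. A minimal sketch, using only the base function capabilities(); the variable name has.http and the messages are made up for illustration.

```r
## capabilities() returns a named logical vector; "http/ftp" is one entry.
## isTRUE() guards against a missing or NA entry.
has.http <- isTRUE(as.logical(capabilities("http/ftp")))

if (has.http) {
    message("This R build reports http/ftp support.")
} else {
    message("This R build reports no http/ftp support.")
}
```

If this reports FALSE on a machine that clearly has network access, the workaround of hard-coding the value in a local copy of the script is the one described above.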
Re: [R] libblas.so.3
On Tue, 22 Jul 2003, Francisco J Molina wrote:

> I am using Linux Red Hat 9. When I try to run R I get
>
>     usr/lib/R/bin/R.bin: error while loading shared libraries: libblas.so.3: cannot open shared object file: No such file or directory
>
> (because I removed it). I thought of compiling R from source, but I have read:
>
>     R currently uses only level 1 blas and the most significant atlas optimizations are for level 3 (and level 2 to some extent). The problem with using ATLAS is that its installation process does not build the shared libraries by default and the whole build process is rather complicated.
>
> This is from a three-year-old message, but I guess that the state of the art now is similar.

It's not. R now makes heavy use of level 3 BLAS. On RH9 ATLAS should build out of the box (with static libraries), and be well worth using.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] lattice: how to format axis labels?
Dear r-help,

I draw graphics with the xyplot() function. Labels on the y axis appear as follows: 1.5, 1, 0.5, 0. I'd like them to be 1.5, 1.0, 0.5, 0.0, i.e. with a fixed number of digits after the decimal point (one in this case). Is there any way to do this without explicitly specifying the labels?

And some questions about fonts. Unfortunately I cannot find in the documentation how to make the axis labels bold. What's the difference between fonts 1, 2, 3 and 4? I have tried them all (trellis.par.set("par.xlab.text", list(font=4));), but haven't seen any difference.

I use R 1.7.1 on Windows NT.

Thank you!

--
Best regards
Wladimir Eremeev mailto:[EMAIL PROTECTED]
Research Scientist, Space Monitoring Ecoinformation Systems Sector, Institute of Ecology, Russian Academy of Sciences
Leninsky Prospect 33, Moscow, Russia, 119071
Phone: (095) 135-9972; Fax: (095) 954-5534
[R] Read trajectory file into R
dear helpers, I wonder if there is a way to read a molecular dynamic trajectory file ( binary file) produced by CHARMM into R. Something like that in matlab. Actually this will save tremendous effort in post processing. best regards karim __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Curious warning in R for OS X w/Xwindows
Bear F. Braumoeller [EMAIL PROTECTED] writes: I'm starting R with xterm -sb -rightbar -sl 1000 -bg black -fg blue -title R -e /usr/local/bin/R -- but it also happens if I just start a vanilla terminal and type R. As to the other questions, Sys.getenv(TERM) TERM xterm Sys.getenv(PAGER) PAGER /usr/bin/less options(pager) $pager [1] /usr/local/lib/R/bin/pager And less itself works OK in an xterm? The above looks perfectly normal to me. The whole procedure is external to R: R writes a file, then fires up the pager on it, so it is difficult to imagine that something in R itself should cause the problem. You said that man works; does that use less as its pager too? man whatever | less might be illuminating. -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Boosting,bagging and bumping. Questions about R tools and predictions.
Take a look at the randomForest package on CRAN:

    randomForest: Breiman's random forest for classification and regression
    Classification and regression based on a forest of trees using random inputs.
    Version: 3.9-6
    Depends: R (>= 1.7.0)
    Author: Fortran original by Leo Breiman and Adele Cutler, R port by Andy Liaw and Matthew Wiener.
    Maintainer: Andy Liaw [EMAIL PROTECTED]

which has a predict function.

HTH

Gav

monkeychump wrote:

> I'm interested in further understanding the differences in using many classification trees to improve classification rates. I'm also interested in finding out what I can do in R and which methods will allow prediction. Can anybody point me to a citation or discussion?
>
> Specifically, I want to classify remotely sensed imagery where training data is extracted on class membership by the user. That training data (usually spectral bands and categorical data, e.g. soil type) is classified (using rpart for instance) and then the resulting tree is applied to the entire image. This results in a classified image that can then be checked for accuracy.
>
> Classification trees are increasingly used by the remote sensing folks, but it seems like finding optimal trees is an active area of research in computational statistics. I've seen great claims made by baggers and boosters (and just what is bumping?) of increasing classification accuracy, but aside from TreeNet by Salford Systems I'm not aware of tools that can grow forests of trees that can then be used to make predictions. Can anybody help?

--
Gavin Simpson [T] +44 (0)20 7679 5522
ENSIS Research Fellow [F] +44 (0)20 7679 7565
ENSIS Ltd. ECRC [E] [EMAIL PROTECTED]
UCL Department of Geography [W] http://www.ucl.ac.uk/~ucfagls/cv/
26 Bedford Way [W] http://www.ucl.ac.uk/~ucfagls/
London. WC1H 0AP.
Re: [R] animal models and lme
I am not convinced that the responses so far have addressed the problem. The model is

    y = mu + U + e

where e is a vector of independent errors with variance ve, and U is a vector of random effects with covariance matrix va*A, where A is a known matrix (which we can assume is a correlation matrix). If we know the ratio (va/ve), this reduces to a GLS problem, but not otherwise. Usually we have to estimate both ve and va.

I. White
ICAPB, University of Edinburgh
Ashworth Laboratories, West Mains Road
Edinburgh EH9 3JT
Fax: 0131 650 6564 Tel: 0131 650 5490
E-mail: [EMAIL PROTECTED]
Re: [R] Read trajectory file into R
Prof Brian Ripley wrote: On Wed, 23 Jul 2003, Karim Elsawy wrote: I wonder if there is a way to read a molecular dynamic trajectory file ( binary file) produced by CHARMM into R. Something like that in matlab. Actually this will save tremendous effort in post processing. If you know the file format, yes. That's a main aim of connections and function readBin(). Function read.S (in package foreign) is an example. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 Thanks a lot for your help, actually I do not know the exact file format at the moment all what I know is : The DCD files (the trajectory files) are single precision binary FORTRAN files, so are transportable between computer architectures. They are not, unfortunately, transportable between big-endian (most workstations) and little endian (Intel) architectures is this enough best regards karim __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] category data (text categories)
May I ask for a pointer to documentation of how to use category data (text categories), which I would like to graph as present/absent at stations along a transect. Thanks for any pointers, Alan Davis -- [EMAIL PROTECTED] 1-670-322-6580 Alan E. Davis, PMB 30, Box 10006, Saipan, MP 96950-8906, CNMI I have steadily endeavored to keep my mind free, so as to give up any hypothesis, however much beloved -- and I cannot resist forming one on every subject -- as soon as facts are shown to be opposed to it. -- Charles Darwin (1809-1882) The right to search for truth implies also a duty; one must not conceal any part of what one has recognized to be true. -- Albert Einstein As we enjoy great advantages from the inventions of others we should be glad of an opportunity to serve others by any invention of ours, and this we should do freely and generously. -- Benjamin Franklin __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] multinomial logit discrete choice model
Hi, I'm struggling trying to specify a multinomial logit discrete choice model in R. Any help and/or code examples appreciated. I am specifically interested in specifying a model where no universal choice set exists and each choice set has a variable number of alternatives (one of which is chosen) see data below. Many Thanks, David PS I have asked earlier but without reply. Is it because: a. it's a stupid question b. It's obvious c. No one knows the answer Data: Chosen AttrQ AttrW | Choices set 1 0 80 | 120 34 | 0 72 | 0 53 | set 2 0 35 | 1 25 18 | 04 9 | 1 30 12 | set 3 024 | __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re[2]: [R] lattice: how to format axis labels?
Dear james, jhcc check out 'sprintf' for formating in a specific way. This will not solve the problem. I will have to specify the argument like labels=(...). I would like to avoid it. I wonder if there a key or option to make automatically appearing labels be formatted in the mentioned way. I haven't found it in the documentation. = jhcc I draw graphics with xyplot() function. jhcc Labels on the y axis are appearing as follows: 1.5, 1, 0.5, 0 jhcc I'd like to have them to be 1.5, 1.0, 0.5, 0.0, i.e. with fixed jhcc number of digits after the dot (one in this case). jhcc Is there any way to do this without implicit specifying labels? -- Best regards Wladimir Eremeev mailto:[EMAIL PROTECTED] == Research ScientistLeninsky Prospect 33, Space Monitoring Ecoinformation Systems Sector, Moscow, Russia, 119071, Institute of Ecology, Phone: (095) 135-9972; Russian Academy of Sciences Fax: (095) 954-5534 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: Re[2]: [R] lattice: how to format axis labels?
High-level control of axes in xyplot is implemented by the scales argument to xyplot. You can include components 'at' and 'labels' in a list given as the scales argument. See ?xyplot. Wladimir Eremeev [EMAIL PROTECTED] writes: jhcc check out 'sprintf' for formating in a specific way. This will not solve the problem. I will have to specify the argument like labels=(...). I would like to avoid it. I wonder if there a key or option to make automatically appearing labels be formatted in the mentioned way. I haven't found it in the documentation. = jhcc I draw graphics with xyplot() function. jhcc Labels on the y axis are appearing as follows: 1.5, 1, 0.5, 0 jhcc I'd like to have them to be 1.5, 1.0, 0.5, 0.0, i.e. with fixed jhcc number of digits after the dot (one in this case). jhcc Is there any way to do this without implicit specifying labels? I don't think so. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
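The suggestion to pass 'at' and 'labels' through the scales argument might look like the following. This is a minimal sketch with made-up data; the names ticks and labs are hypothetical, and sprintf() is used only to build the label strings with one fixed decimal place.

```r
library(lattice)

## Made-up data for illustration
x <- seq(0, 3, by = 0.25)
y <- 1.5 * exp(-x)

## Tick positions and matching labels with one digit after the decimal point
ticks <- seq(0, 1.5, by = 0.5)
labs  <- sprintf("%.1f", ticks)

## 'scales' takes a list per axis; here only the y axis is customised
p <- xyplot(y ~ x, scales = list(y = list(at = ticks, labels = labs)))

## print(p) would draw the plot on the active graphics device
```

The labels still have to be given explicitly, as the reply says, but deriving them from the tick positions keeps the two in step.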
Re: [R] animal models and lme
If you can solve the problem for fixed rho = (va/ve) using gls, then you can call gls for many values of rho, plot the log(likelihood) profile vs. rho, construct confidence intervals, etc. You may even be able to write a function to return (-2)*log(likelihood) for a fixed rho and then use optim to minimize that deviance. [I would suspect that the log(likelihood) might look more parabolic in terms of log(rho) than in terms of rho itself. In addition, optim might work better with the minimum at log.rho = (-Inf) than with a lower bound for rho at 0.]

Hope this helps.
spencer graves

Douglas Bates wrote:

> [EMAIL PROTECTED] writes:
>
> > Not convinced that responses so far have addressed the problem. The model is y = mu + U + e where e is a vector of independent errors with variance ve, and U is a vector of random effects with covariance matrix va*A, where A is a known matrix (which we can assume is a correlation matrix). If we know the ratio (va/ve), this reduces to a GLS problem, but not otherwise. Usually we have to estimate both ve and va.
>
> Sorry to say that I don't think lme will handle that problem gracefully.
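The profiling idea can be sketched in base R without gls. This is only an illustration under stated assumptions: the data are simulated, the matrix A is an arbitrary made-up correlation matrix, the function name dev.profile is hypothetical, and the criterion is plain maximum likelihood (not REML). For fixed rho, Var(y) = ve * (rho*A + I), so mu and ve have closed-form estimates and only log(rho) is left to optimise.

```r
set.seed(1)
n <- 60

## Assumed "known" correlation matrix A (made up for this sketch)
A <- 0.6^abs(outer(1:n, 1:n, "-"))

## Simulate y = mu + U + e with va = 2, ve = 1, mu = 5
U <- drop(t(chol(A)) %*% rnorm(n, sd = sqrt(2)))
y <- 5 + U + rnorm(n, sd = 1)

## (-2) * profile log-likelihood as a function of log(rho)
dev.profile <- function(log.rho) {
    V <- exp(log.rho) * A + diag(n)                  # Var(y) / ve
    R <- chol(V)                                     # V = t(R) %*% R
    z <- backsolve(R, y, transpose = TRUE)           # whitened response
    o <- backsolve(R, rep(1, n), transpose = TRUE)   # whitened intercept column
    mu.hat <- sum(o * z) / sum(o * o)                # GLS estimate of mu
    ve.hat <- sum((z - mu.hat * o)^2) / n            # ML estimate of ve
    n * log(2 * pi * ve.hat) + 2 * sum(log(diag(R))) + n
}

## One-dimensional search on the log(rho) scale
fit <- optimize(dev.profile, interval = c(-10, 10))
rho.hat <- exp(fit$minimum)
```

Evaluating dev.profile() on a grid of log(rho) values gives the profile needed for a likelihood-based confidence interval.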
Re: [R] S3 and S4 classes
On Wed, 23 Jul 2003 14:53:56 +0200, Laurent Faisnel [EMAIL PROTECTED] wrote:

> Could anyone point out to me what's S3-like in the following sample and why it is not fully S4-compatible?
>
>     # a function that objects of this class have
>     perform <- function(.Object) UseMethod("perform", .Object);

I think this is unnecessary, and somewhat S3-like. A more S4-looking way to do the same (?) thing is

    setGeneric("perform", function(.Object) standardGeneric("perform"))

but I think this will be generated automatically when you define your methods.

Duncan Murdoch
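Put together, an S4 version of the generic might look like this. It is a minimal sketch: the class name Task, its slot, and the method body are made up for illustration, since the original class definition is not shown in the thread.

```r
library(methods)

## Hypothetical class standing in for the poster's own class
setClass("Task", representation(name = "character"))

## Explicit S4 generic, as suggested above
setGeneric("perform", function(.Object) standardGeneric("perform"))

## A method for the hypothetical class
setMethod("perform", "Task", function(.Object) {
    paste("performing", .Object@name)
})

res <- perform(new("Task", name = "demo"))
```

Dispatch here goes through standardGeneric() rather than UseMethod(), which is the S4 counterpart of the line questioned in the original post.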
[R] Condition indexes and variance inflation factors
Has anyone programmed condition indexes in R? I know that there is a function for variance inflation factors available in the car package; however, Belsley (1991) Conditioning Diagnostics (Wiley) notes that there are several weaknesses of VIFs: e.g. 1) High VIFs are sufficient but not necessary conditions for collinearity 2) VIFs don't diagnose the number of collinearities and 3) No one has determined how high a VIF has to be for the collinearity to be damaging. He then develops and suggests using condition indexes instead, so I was wondering if anyone had programmed them. Thanks Peter Peter L. Flom, PhD Assistant Director, Statistics and Data Analysis Core Center for Drug Use and HIV Research National Development and Research Institutes 71 W. 23rd St www.peterflom.com New York, NY 10010 (212) 845-4485 (voice) (917) 438-0894 (fax) __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Read trajectory file into R
On Wed, 23 Jul 2003, Karim Elsawy wrote:

> Thanks a lot for your help. Actually I do not know the exact file format at the moment; all I know is:
>
>     The DCD files (the trajectory files) are single precision binary FORTRAN files, so are transportable between computer architectures. They are not, unfortunately, transportable between big-endian (most workstations) and little endian (Intel) architectures
>
> Is this enough?

Well, that's enough to get the numbers into R. You will then have to work out what they mean.

    readBin(connection, numeric(), size=4, n=whatever)

will read `whatever' Fortran single precision numbers from `connection'. If you are doing this on the machine where the file was generated then you don't need to worry about endianness. On a different machine (e.g. moving from a Sparc to a PC) you may need to add endian="swap". Look at ?readBin for more information.

-thomas
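The effect of the size and endian arguments can be checked with a small round trip. This is a sketch on a made-up temporary file, not a DCD reader: it ignores any header or record markers a real Fortran unformatted file may contain, and the three values are arbitrary numbers that are exactly representable in single precision.

```r
## Made-up values, exactly representable as 4-byte floats
x <- c(1.5, -2.25, 3)
tmp <- tempfile(fileext = ".bin")

## Write them as big-endian single precision
con <- file(tmp, "wb")
writeBin(x, con, size = 4, endian = "big")
close(con)

## Read back with the matching byte order
con <- file(tmp, "rb")
good <- readBin(con, numeric(), n = length(x), size = 4, endian = "big")
close(con)

## Read back with the wrong byte order: the values come out as nonsense
con <- file(tmp, "rb")
bad <- readBin(con, numeric(), n = length(x), size = 4, endian = "little")
close(con)

file.size(tmp)   # 3 values * 4 bytes each
```

If numbers read from a real file look like `bad` (tiny, huge, or NaN), trying the other byte order is a reasonable first step.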
Re: [R] Condition indexes and variance inflation factors
Peter Flom wrote: Has anyone programmed condition indexes in R? I know that there is a function for variance inflation factors available in the car package; however, Belsley (1991) Conditioning Diagnostics (Wiley) notes that there are several weaknesses of VIFs: e.g. 1) High VIFs are sufficient but not necessary conditions for collinearity 2) VIFs don't diagnose the number of collinearities and 3) No one has determined how high a VIF has to be for the collinearity to be damaging. He then develops and suggests using condition indexes instead, so I was wondering if anyone had programmed them. Thanks Peter I think Juergen Gross has something like that in his new book Gross, J. (2003): Linear Regression, Springer (in press - OK, not very helpful here). You might want to contact him privately (in CC). Uwe Ligges Peter L. Flom, PhD Assistant Director, Statistics and Data Analysis Core Center for Drug Use and HIV Research National Development and Research Institutes 71 W. 23rd St www.peterflom.com New York, NY 10010 (212) 845-4485 (voice) (917) 438-0894 (fax) __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
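For reference, condition indexes can be computed in a few lines of base R. This is a minimal sketch, assuming the usual Belsley-style recipe: scale each column of the design matrix (intercept included) to unit length, take the singular values, and divide the largest by each of them. The function name cond.index and the data are made up for illustration, and the variance-decomposition proportions that accompany the indexes in the full diagnostic are not computed here.

```r
## Condition indexes of a design matrix X (columns scaled to unit length)
cond.index <- function(X) {
    X <- as.matrix(X)
    Xs <- sweep(X, 2, sqrt(colSums(X^2)), "/")  # unit-length columns
    d <- svd(Xs)$d                              # singular values, decreasing
    max(d) / d
}

## Made-up example: x2 is nearly collinear with x1
set.seed(42)
x1 <- rnorm(50)
x2 <- x1 + rnorm(50, sd = 0.01)
X <- cbind(Intercept = 1, x1, x2)

ci <- cond.index(X)
print(round(ci, 1))
```

A large index in the last position signals a near dependency among the columns; one index per near dependency is what distinguishes this diagnostic from VIFs.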
Re: [R] Strange behaviour when running R from within Emacs on Windows
On Wed, 23 Jul 2003, Søren Højsgaard wrote:

> Dear R-experts,
> I run R in a shell under Emacs on Win2k using ESS. I get the following strange error:
>
>     shell("copy c:\\file.txt c:\\newfile.txt")
>     warning: extra args ignored after 'copy'
>     Forkert syntaks for kommandoen. [The syntax of the command is incorrect.]
>     Warning message: cmd execution failed with error code 1 in: shell("copy c:\\file.txt c:\\newfile.txt")
>
> The same problem emerges regardless of whether I use DOS or bash as the shell! However, if I run the shell() call in the GUI, things work fine, and so do they in Rterm. Can anyone help me?

I suspect Emacs/ESS has set the SHELL variable: you can ask shell() to use a specific shell via its second argument or by setting R_SHELL. bash won't work, as `copy' is a DOS internal command.

Why don't you just use file.copy()?

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
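The file.copy() route avoids the shell altogether, so it does not depend on which shell Emacs has configured. A minimal sketch using temporary files in place of the c:\ paths from the question:

```r
## Temporary paths stand in for c:\\file.txt and c:\\newfile.txt
src <- tempfile(fileext = ".txt")
dst <- tempfile(fileext = ".txt")
writeLines("some text", src)

## file.copy() returns TRUE on success; it will not overwrite an existing
## destination unless overwrite = TRUE is given
ok <- file.copy(src, dst)

copied <- readLines(dst)
```

Checking the logical return value is the portable replacement for inspecting the shell's error code.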
Re: [R] Read trajectory file into R
On Wed, 23 Jul 2003, Karim Elsawy wrote: Prof Brian Ripley wrote: On Wed, 23 Jul 2003, Karim Elsawy wrote: I wonder if there is a way to read a molecular dynamic trajectory file ( binary file) produced by CHARMM into R. Something like that in matlab. Actually this will save tremendous effort in post processing. If you know the file format, yes. That's a main aim of connections and function readBin(). Function read.S (in package foreign) is an example. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 Thanks a lot for your help, actually I do not know the exact file format at the moment all what I know is : The DCD files (the trajectory files) are single precision binary FORTRAN files, so are transportable between computer architectures. They are not, unfortunately, transportable between big-endian (most workstations) and little endian (Intel) architectures is this enough Possibly. readBin can read those (at least on Unix-like OSes), and if you can look at them some other way you can probably sort out the structure of the values. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] Passing references to data objects into R functions
Hi. I have the following question about reading from large data objects from within R functions; I have tried to simplify my problem as much as possible in what follows.

Imagine I have various large data objects sitting in my global environment (call them data1, data2, ...). I want to write a function extract that extracts some of the rows of a particular data object, does some further manipulations on the extract and then returns the result. The function takes the data object's name and an index vector -- for example the following call would return the first 3 rows of object data1.

    ans = extract("data1", 1:3)

I could write a simple function like this:

    extract1 = function(object.name, index) {
        temp = get(object.name, envir = .GlobalEnv)
        temp = temp[index, , drop=FALSE]
        # do some further manipulations here
        return(temp)
    }

The problem is that the function makes a copy temp of the object in the function frame, which (in my application) is very memory inefficient as the data objects are very large. It is especially inefficient when the length of the index vector is much smaller than the number of rows in the data object. What I would really like to do is to read from the underlying data object directly (in other programming languages this would be achieved by passing a pointer to the object instead), without making a copy. Given the rules of variable name scoping in R, I could avoid making a copy with the following function:

    extract2 = function(object.name, index) {
        eval(parse(text = paste("temp = ", object.name, "[index, , drop=FALSE]", sep="")))
        # do some further manipulations here
        return(temp)
    }

But this seems very messy. Is there a better way?

Thanks for your help
David Khabie-Zeitoune
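One simpler alternative is to pass the object itself rather than its name. This is a sketch under an assumption, namely R's usual copy-on-modify behaviour: binding an existing object to an argument does not by itself duplicate the data, and only the subset created by the indexing is newly allocated. The function name extract3 and the small matrix are made up for illustration.

```r
## Hypothetical variant of extract(): takes the object, not its name
extract3 <- function(object, index) {
    ## Only the selected rows are allocated here; 'object' is never modified,
    ## so no copy of the full data is expected to be made
    temp <- object[index, , drop = FALSE]
    # do some further manipulations here
    temp
}

## Small stand-in for a large data object
data1 <- matrix(seq_len(20), nrow = 5)
ans <- extract3(data1, 1:3)
```

On that assumption, the same reasoning applies to the get() call in extract1: the expensive step is the subsetting, which scales with the length of the index rather than with the size of the full object.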
[R] trouble with maps
Has anyone else seen this behavior from the maps package?

    map('state', fill=TRUE)

results in a lively mix of overlapping polygons inside a map of the US, but they have no obvious relationship to state boundaries. (See attached jpeg.)

I reinstalled the maps and mapdata packages from ftp://ftp.mcs.vuw.ac.nz/pub/statistics/map/ to see if that would help, but it doesn't. Since our platform is an SGI (running R-1.8.0, development version, June 25), I made sure to use mapget.c.notlinux, and that didn't help either.

Any thoughts? Is there something I'm missing?

Debby

[Apologies if r-help receives this twice; I'm learning to use a new mailer.]
Re: [R] Curious warning in R for OS X w/Xwindows
On Wednesday, July 23, 2003, at 05:03 AM, Peter Dalgaard BSA wrote:

> Bear F. Braumoeller [EMAIL PROTECTED] writes:
>
> > As to the other questions,
> >
> >     Sys.getenv("TERM")
> >     TERM xterm
> >     Sys.getenv("PAGER")
> >     PAGER /usr/bin/less
> >     options("pager")
> >     $pager [1] /usr/local/lib/R/bin/pager
>
> And less itself works OK in an xterm? The above looks perfectly normal to me. The whole procedure is external to R: R writes a file, then fires up the pager on it, so it is difficult to imagine that something in R itself should cause the problem. You said that man works; does that use less as its pager too? man whatever | less might be illuminating.

I thought of that after I wrote, and I ran it through its paces -- less works like a charm. I also piped the output specifically through /usr/bin/less in case (for some odd reason) the terminal was defaulting to a different copy of less than R was. Still works just fine.

The only thing that I can see that might (??) be causing trouble is in /usr/local/lib/R/bin/pager, which reads

    #!/bin/sh
    ## For the curious: "pager $1" doesn't work in batch, because "more" will
    ## eat the rest of stdin. The no-argument version is intended for use at
    ## the end of a pipeline.
    ##
    ## PAGER is determined at configure time and recorded in `etc/Renviron'.
    if test -n "${1}"; then
      exec ${PAGER} < ${1}
    else
      exec ${PAGER}
    fi
    ### Local Variables: ***
    ### mode: sh ***
    ### sh-indentation: 2 ***
    ### End: ***

-- but I don't know enough about how R is calling the pager to know what this is doing.

Bear F. Braumoeller
Assistant Professor
Department of Government
Harvard University
http://www.people.fas.harvard.edu/~bfbraum
[R] .ps files in R
I have recently printed in R to a postscript file. I'm working on a SSH without an X terminal. It was fairly automatic: plot(x,y) dev.off() And then the default creates a file called Rplots.ps which I can ftp to my laptop and open in Ghostscript. I can see the file, and nothing looks odd. However, when I import it into LaTeX, it refuses to configure right side up. (It stays 90 degrees.) I've tried saving it as .eps with different options in ghostscript. I've also tried many different rotating commands in LaTeX (angle in \includegraphics, \rotate, \sideways,...) But, the picture seems to be unaffected by any of these commands. Does anyone know a trick to getting R postscript files into LaTeX? Thanks, Jo Johanna Hardin Department of Mathematics Computer Science 610 N. College Way Pomona College Claremont, CA 91711 (909) 607-8717 [EMAIL PROTECTED] [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] .ps files in R
You can specify the postscript() option horizontal=F. See help on horizontal in ?postscript. HTH, Jerome On July 23, 2003 11:42 am, Johanna Hardin wrote: I have recently printed in R to a postscript file. I'm working on a SSH without an X terminal. It was fairly automatic: plot(x,y) dev.off() And then the default creates a file called Rplots.ps which I can ftp to my laptop and open in Ghostscript. I can see the file, and nothing looks odd. However, when I import it into LaTeX, it refuses to configure right side up. (It stays 90 degrees.) I've tried saving it as .eps with different options in ghostscript. I've also tried many different rotating commands in LaTeX (angle in \includegraphics, \rotate, \sideways,...) But, the picture seems to be unaffected by any of these commands. Does anyone know a trick to getting R postscript files into LaTeX? Thanks, Jo Johanna Hardin Department of Mathematics Computer Science 610 N. College Way Pomona College Claremont, CA 91711 (909) 607-8717 [EMAIL PROTECTED] [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] .ps files in R
On Wed, 23 Jul 2003, Johanna Hardin wrote:

> in ghostscript. I've also tried many different rotating commands in LaTeX (angle in \includegraphics, \rotate, \sideways,...) But, the picture seems to be unaffected by any of these commands.

I find that the rotation won't show up in the DVI file, but will once you convert the DVI file into a PS file. But that's another story.

> Does anyone know a trick to getting R postscript files into LaTeX?

When I need to generate a PS file in R I always do something like:

    postscript("foo.eps", height = 6.9, width = 6.6, horizontal = FALSE, onefile = FALSE, print.it = FALSE)
    plot(1:10)
    dev.off()

then in my LaTeX file do something like:

    \begin{figure}[h!]
      \centering
      \begin{center}
        \includegraphics[width = .8\textwidth]{foo.eps}
      \end{center}
      \caption{My Caption}
      \label{fig:foo}
    \end{figure}

--
Cheers,
Kevin

--
On two occasions, I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able to rightly apprehend the kind of confusion of ideas that could provoke such a question. -- Charles Babbage (1791-1871)
From Computer Stupidities: http://rinkworks.com/stupid/
--
Ko-Kang Kevin Wang
Master of Science (MSc) Student
SLC Tutor and Lab Demonstrator
Department of Statistics
University of Auckland
New Zealand
Homepage: http://www.stat.auckland.ac.nz/~kwan022
Ph: 373-7599 x88475 (City) x88480 (Tamaki)
Re: [R] trouble with maps
On Wed, 23 Jul 2003, Deborah Swayne wrote:
> Has anyone else seen this behavior from the maps package? map('state', fill=TRUE) results in a lively mix of overlapping polygons inside a map of the US, but they have no obvious relationship to state boundaries. (See attached jpeg.)

Yes, this replicates. I think the trouble is in the part of map() where it makes a polygon of the lines:

    if (fill) {
        gonsize <- line$size
        color <- rep(color, length = length(gonsize))
        keep <- !is.na(color)
        coord[c("x", "y")] <- makepoly(coord, gonsize, keep)
        color <- color[keep]
    }

and I think makepoly needs gon, not gonsize, as an argument, to stitch the boundary lines together in the correct order. The interesting effect seems to come from some lines not being reversed. Unfortunately, I don't have archival copies of earlier map() functions to check this. It could also be in mapgetl():

    if (fill) coord <- mapgetl(database, line$number, xlim, ylim)

although this is less likely, because the same function is used to retrieve the same data when fill=FALSE too. Something has got lost in building the polygons, it seems! As far as I can establish, fill=TRUE did work in earlier versions.

    p <- map('state', region=c('penn'), resolution=0)
    plot(p, type="l")

gives the boundaries in both cases,

    pf <- map('state', region=c('penn'), fill=TRUE, resolution=0)

gives black/white interesting polygons, and

    plot(pf, type="l")

draws the boundary lines with wrong links to the next line segments.

Roger

--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Breiviksveien 40, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: [EMAIL PROTECTED]
[R] odd behavior with the maps package
Has anyone else seen this behavior from the maps package? map('state', fill=TRUE) results in a lively mix of overlapping polygons inside a map of the US, but they have no obvious relationship to state boundaries. (See attached jpeg.) I reinstalled the maps and mapdata packages from ftp://ftp.mcs.vuw.ac.nz/pub/statistics/map/ to see if that would help, but it doesn't. Since our platform is an SGI (running R-1.8.0, development version, June 25), I made sure to use mapget.c.notlinux, and that didn't help either. Any thoughts? Is there something I'm missing?

Debby
Re: [R] .ps files in R
You can include it in your LaTeX file with

    \includegraphics[angle=90]{foo.eps}
Re: [R] S3 and S4 classes
Duncan Murdoch wrote:
> On Wed, 23 Jul 2003 14:53:56 +0200, Laurent Faisnel [EMAIL PROTECTED] wrote:
>> Could anyone point me out what's S3-like in the following sample and why it is not fully S4-compatible?
>>
>>     # a function that objects of this class have
>>     perform <- function(.Object) UseMethod("perform", .Object);
>
> I think this is unnecessary, and somewhat S3-like. A more S4-looking way to do the same (?) thing is
>
>     setGeneric("perform", function(.Object) standardGeneric("perform"))
>
> but I think this will be generated automatically when you define your methods.

As Duncan says, this is the S3-style portion of the example. It's not wrong, but there are advantages to NOT going this route.

The UseMethod() call says that this is a function with S3-style methods. Is that true? It might well be: you could have a function perform.default, for example, that was the default method to use. The disadvantage of hanging on to S3 methods is that they're hidden; unlike S4 methods, you can't easily find out what methods are defined (by calling showMethods()).

If you don't have any existing definition of perform(), you will need to call setGeneric() as Duncan showed. The implication is that perform() doesn't have a default method: unless the argument inherits from one of the classes in a setMethod() call, the result is an error. (If there is a non-generic version of perform, that becomes the default method, as it would in your example.)

If you DID have a perform.default, you might want to make that explicitly the S4 default method

    setMethod("perform", "ANY", perform.default)

after the setGeneric call. Similarly, you could make other S3 methods into S4 methods. Then all the methods are visible.

Also, a point of good style, unrelated to methods. It's not generally a good idea to have function arguments starting with ".". Names of this form are intended for behind-the-scenes manipulations. By sticking to names that start with a letter, you avoid the chance of conflicting with some such manipulation. So, Object rather than .Object. (The reason initialize() uses .Object is exactly BECAUSE it expects user-defined arguments, in the ..., to start with a letter, and so chooses .Object to minimize the chance of conflicting.)

Regards,
John Chambers

--
John M. Chambers [EMAIL PROTECTED]
Bell Labs, Lucent Technologies office: (908)582-2681
700 Mountain Avenue, Room 2C-282 fax: (908)582-3340
Murray Hill, NJ 07974 web: http://www.cs.bell-labs.com/~jmc
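The advice in this message can be put together in a short sketch. The class Task and the method bodies are hypothetical, invented only for illustration; the calls themselves (setClass, setGeneric, setMethod, showMethods) are the standard ones from the methods package.

```r
library(methods)

# Hypothetical class used only for illustration.
setClass("Task", representation(name = "character"))

# Explicit S4 generic; the argument is called `object`, not `.Object`.
setGeneric("perform", function(object) standardGeneric("perform"))

# An existing S3-style default, registered explicitly as the S4 default method.
perform.default <- function(object) "nothing to perform"
setMethod("perform", "ANY", perform.default)

# A method for the hypothetical class.
setMethod("perform", "Task", function(object) paste("performing", object@name))

task_result    <- perform(new("Task", name = "backup"))  # dispatches on "Task"
default_result <- perform(42)                            # falls back to "ANY"

# Unlike hidden S3 methods, every S4 method is listed here.
showMethods("perform")
```

Registering the default under "ANY" is what keeps a call such as perform(42) from ending in an error, which is the behaviour described above for a generic without a default method.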
Re: [R] Condition indexes and variance inflation factors
Dear Peter and Uwe,

I don't have a copy of Belsley's 1991 book here, but I do have Belsley, Kuh, and Welsch, Regression Diagnostics (Wiley, 1980). If my memory is right, the approach is the same: Belsley's collinearity diagnostics are based on a singular-value decomposition of the scaled but uncentred model matrix. A straightforward, if inelegant, rendition is

    belsley <- function(model){
        X <- model.matrix(model)
        # scale each column to unit length, without centring
        X <- scale(X, center=FALSE)/sqrt(nrow(X) - 1)
        svd.X <- svd(X)
        result <- list(singular.values = svd.X$d,
                       condition.indices = max(svd.X$d)/svd.X$d)
        phi <- sweep(svd.X$v^2, 2, svd.X$d^2, "/")
        Pi <- t(sweep(phi, 1, rowSums(phi), "/"))
        colnames(Pi) <- names(coef(model))
        rownames(Pi) <- 1:nrow(Pi)
        result$pi <- Pi
        class(result) <- "belsley"
        result
    }

    print.belsley <- function(x, digits = 3, ...){
        cat("\nSingular values: ", x$singular.values)
        cat("\nCondition indices: ", x$condition.indices)
        cat("\n\nVariance-decomposition proportions\n")
        print(round(x$pi, digits))
        invisible(x)
    }

This gives the singular values, condition indices, and variance-decomposition proportions. (I'm pretty sure that you can get the same thing more elegantly from the qr decomposition, but I don't know how off the top of my head. Someone else on the list doubtless can supply the details.)

For example, for the illustration on p. 161 of BKW,

    X
       V1  V2  V3     V4     V5
    1 -74  80  18    -56   -112
    2  14 -69  21     52    104
    3  66 -72  -5    764   1528
    4 -12  66 -30   4096   8192
    5   3   8  -7 -13276 -26552
    6   4 -12   4   8421  16842

    mod <- lm(y ~ X - 1)  # nb., y was just randomly generated
    belsley(mod)

    Singular values:  1.414214 1.361734 1.066707 0.08840437 3.614479e-17
    Condition indices:  1 1.038538 1.325775 15.9971 3.912635e+16

    Variance-decomposition proportions
        XV1   XV2   XV3 XV4 XV5
    1 0.000 0.000 0.000   0   0
    2 0.005 0.005 0.000   0   0
    3 0.001 0.001 0.047   0   0
    4 0.994 0.994 0.953   0   0
    5 0.000 0.000 0.000   1   1

which is in good agreement with the values given in the text.

Now some comments:

(1) I've never liked this approach for a model with a constant, where it makes more sense to me to centre the data. I realize that opinions differ here, but it seems to me that failing to centre the data conflates collinearity with numerical instability.

(2) I also disagree with the comment that condition indices are easier to interpret than variance-inflation factors. In either case, since collinearity is a continuous phenomenon, cutoffs for large values are necessarily arbitrary.

(3) If you're interested in figuring out which variables are involved in each collinear relationship, then (for centred and scaled data) you can equivalently (and to me, more intuitively) work with the principal-components analysis of the predictors.

(4) I have doubts about the whole enterprise. Collinearity is one source of imprecision; others are small sample size, homogeneous predictors, and large error variance. Aren't the coefficient standard errors the bottom line? If these are sufficiently small, why worry?

I hope that this helps.
John

At 05:35 PM 7/23/2003 +0200, Uwe Ligges wrote:
> Peter Flom wrote:
>> Has anyone programmed condition indexes in R? I know that there is a function for variance inflation factors available in the car package; however, Belsley (1991) Conditioning Diagnostics (Wiley) notes that there are several weaknesses of VIFs: e.g. 1) High VIFs are sufficient but not necessary conditions for collinearity 2) VIFs don't diagnose the number of collinearities and 3) No one has determined how high a VIF has to be for the collinearity to be damaging. He then develops and suggests using condition indexes instead, so I was wondering if anyone had programmed them.
>>
>> Thanks
>> Peter
>
> I think Juergen Gross has something like that in his new book Gross, J. (2003): Linear Regression, Springer (in press - OK, not very helpful here). You might want to contact him privately (in CC).
>
> Uwe Ligges

-
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: [EMAIL PROTECTED]
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
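The core of the diagnostic can be tried on simulated data without fitting a model. The following is a sketch under stated assumptions: the design matrix is made up (x2 is deliberately almost a copy of x1), and the columns are scaled to unit length without centring, which is meant to match the scaling described in the message above.

```r
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.01)   # nearly collinear with x1 by construction
x3 <- rnorm(n)
X  <- cbind(Intercept = 1, x1, x2, x3)

# Scale each column to unit length, without centring.
Xs <- sweep(X, 2, sqrt(colSums(X^2)), "/")

sv <- svd(Xs)
cond_idx <- max(sv$d) / sv$d     # condition indices; the first is 1 by definition

# Variance-decomposition proportions: rows = singular values, columns = coefficients.
phi <- sweep(sv$v^2, 2, sv$d^2, "/")
Pi  <- t(sweep(phi, 1, rowSums(phi), "/"))
colnames(Pi) <- colnames(X)

print(round(cond_idx, 1))
print(round(Pi, 3))
```

In this construction the largest condition index should be far above the others, and the corresponding row of Pi should load mainly on x1 and x2, the pair that was made collinear.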
[R] Dismal R performance of Athlon moble CPU?
I have been using a laptop computer with a Pentium III 1.13 GHz. I heard that AMD's Athlon has excellent floating point capacity. So I bought an Athlon 2200+ laptop yesterday. I expected that the new Athlon 2200+ would be twice as fast as the P III 1.13 GHz. I ran an R simulation program and the new computer is only 30% faster, in fact slightly slower than a Celeron 1.50 GHz laptop. I am very disappointed by this. What is your experience with Athlon? Should I stick to Intel in the future? Thanks.

By the way, the OS is Windows XP Home Edition.

Jason

=
Jason G. Liao, Ph.D.
Division of Biometrics
University of Medicine and Dentistry of New Jersey
335 George Street, Suite 2200
New Brunswick, NJ 08903-2688
phone (732) 235-8611, fax (732) 235-9777
http://www.geocities.com/jg_liao
Re: [R] Dismal R performance of Athlon moble CPU?
On Wed, 23 Jul 2003, Jason Liao wrote:
> I have been using a laptop computer of Pentium III 1.13 Ghz. I heard that AMD's Athlon has excellent floating point capacity. So I bought a Athlon 2200+ laptop yesterday. I expected that new Athlon 2200+ will be twice as fast as the P III 1.13 GB. I ran a R simulation program and the new computer is only 30% faster, in fact slightly slower than a Celeron 1.50 GB laptop. I am very disappointed by this. What is your experience with Athlon? Should I stick to Intel in the future? Thanks.

So I expect you think a P4M 1.4GHz (on which I am writing this) should be a lot faster than a PIII 1GHz? It is often slower. Don't compare laptop chips with desktop ones, nor different chip families (an Athlon 2200 is not 2.2GHz, BTW). PIIIs seem the fastest per GHz, but they don't do many GHz. I am rather pleased with my dual Athlon 2600, but then P4's don't allow multiprocessors and the machine with dual Athlons was cheaper than a comparable one with a single 2.4GHz P4.

You have tuned an ATLAS implementation to your CPU, I take it? If not, that's the first step to optimal R performance.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
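One way to see whether the linear algebra library matters on a given machine is to time a dense matrix operation before and after switching BLAS. This is only a rough sketch: the matrix size is arbitrary, and the timings depend entirely on the hardware and on the BLAS that R is linked against, so no expected numbers are given.

```r
set.seed(1)
n <- 300                                  # arbitrary size; increase for a longer run
A <- matrix(rnorm(n * n), n, n)

# crossprod() computes t(A) %*% A and is handed to the BLAS R is linked against.
timing <- system.time(B <- crossprod(A))
print(timing["elapsed"])

# A pure R loop of comparable size, for contrast: this part does not benefit from BLAS.
loop_timing <- system.time({
  s <- 0
  for (i in seq_len(n * n)) s <- s + A[i]^2
})
print(loop_timing["elapsed"])
```

If a simulation spends most of its time in interpreted loops like the second one, a tuned BLAS will change little, which is one possible reason for a smaller speed-up than the clock rates suggest.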
RE: [R] Dismal R performance of Athlon moble CPU?
Thanks to Prof. Ripley and Andy for your technical explanations. It seems that real CPU speed has not advanced as fast as the GHz figures or other performance indicators suggest. Yes, my program is totally CPU intensive.

> We do have a dual P4 Xeon 2.4 GHz with 8GB RAM, and jobs run more than twice as fast as my PIII 933MHz laptop.

R cannot really use dual CPUs for one R session, if I understand correctly.

Jason

--- Liaw, Andy [EMAIL PROTECTED] wrote:
> Overall performance depends on a few other things besides CPU clock speed (e.g., RAM speed and size, cache size, disk speed, etc.) Unless your code is spending the great majority of the time in the CPU, you should not expect the speed-up to be equal to the ratio of clock speeds. (Also, as Prof. Ripley pointed out, a P4 does less than a PIII at the same clock speed, and the number AMD attach to Athlon is not clock speed.) We do have a dual P4 Xeon 2.4 GHz with 8GB RAM, and jobs run more than twice as fast as my PIII 933MHz laptop.
>
> Andy
[R] error bars in color
Hi, is it possible to generate differently colored error bars in one plot? Thx in advance, Heinrich
[R] Shapefiles package upload
I uploaded version 0.3 of the shapefiles package to CRAN earlier today. Version 0.2 had a bug that omitted the decimal precision in numeric fields so some programs (such as ArcGIS) would not parse the fields correctly. I also added an argument to write.dbf to swap . with _ since ArcGIS does not permit underscores in field names. Let me know if you run into any other problems. Thanks. Benjamin Stabler Transportation Planning Analysis Unit Oregon Department of Transportation 555 13th Street NE, Suite 2 Salem, OR 97301 Ph: 503-986-4104 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] RE: Shapefiles package upload
Sorry, I meant to say ArcGIS does not permit periods in field names. -Original Message- From: STABLER Benjamin Sent: Wednesday, July 23, 2003 3:29 PM To: [EMAIL PROTECTED] Subject: Shapefiles package upload I uploaded version 0.3 of the shapefiles package to CRAN earlier today. Version 0.2 had a bug that omitted the decimal precision in numeric fields so some programs (such as ArcGIS) would not parse the fields correctly. I also added an argument to write.dbf to swap . with _ since ArcGIS does not permit underscores in field names. Let me know if you run into any other problems. Thanks. Benjamin Stabler Transportation Planning Analysis Unit Oregon Department of Transportation 555 13th Street NE, Suite 2 Salem, OR 97301 Ph: 503-986-4104 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
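The renaming described here can be reproduced by hand in base R. The field names below are made up, and nothing is assumed about the name of the new write.dbf argument, which the announcement does not give.

```r
# Hypothetical data frame column names containing periods.
field_names <- c("pop.2000", "area.sq.km", "name")

# fixed = TRUE treats "." as a literal period rather than a regular expression wildcard.
safe_names <- gsub(".", "_", field_names, fixed = TRUE)
print(safe_names)
```

Without fixed = TRUE (or an escaped pattern such as "\\."), every character would be replaced, because "." matches any character in a regular expression.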
Re: [R] error bars in color
Yes. See ?plot, ?segments, ?lines, and in particular see the help on the col option in ?par.

HTH,
Jerome

On July 23, 2003 03:13 pm, Heinrich Kestler wrote:
> Hi, is it possible to generate differently colored error bars in one plot? Thx in advance, Heinrich
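A small sketch of what the pointers above lead to, using made-up estimates and standard errors. It uses arrows() rather than segments() so that the bars get end caps; both accept a vector for col.

```r
# Made-up data: four group estimates with standard errors.
x    <- 1:4
est  <- c(2.1, 3.4, 2.8, 4.0)
se   <- c(0.3, 0.5, 0.2, 0.4)
cols <- c("red", "blue", "darkgreen", "orange")   # one colour per error bar

pdf(NULL)   # null PDF device, so the sketch also runs without a display
plot(x, est, pch = 19, col = cols,
     ylim = range(est - se, est + se),
     xlab = "group", ylab = "estimate")

# col is vectorised: bar i is drawn in cols[i].
# angle = 90 with code = 3 draws a flat cap at both ends of each bar.
arrows(x, est - se, x, est + se,
       angle = 90, code = 3, length = 0.05, col = cols)
dev.off()
```

For an on-screen plot, drop the pdf(NULL) and dev.off() lines.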
Re: [R] Dismal R performance of Athlon moble CPU?
Jason Liao [EMAIL PROTECTED] writes:
> R can not really use dual CPU for one R session if I understand correctly

It certainly can, using message passing libraries or sockets. While it isn't technically one session, it's awfully similar to that, for the user.

best,
-tony

--
A.J. Rossini / [EMAIL PROTECTED] / [EMAIL PROTECTED]
http://software.biostat.washington.edu/ UNTIL IT MOVES IN JULY.
Biomedical and Health Informatics, University of Washington
Biostatistics, HVTN/SCHARP, Fred Hutchinson Cancer Research Center.
FHCRC: 206-667-7025 (fax=4812)|Voicemail is pretty sketchy/use Email
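For present-day readers, the socket approach mentioned here can be sketched with the parallel package. This is an assumption about tooling, not what was available when the thread was written: parallel ships with R from version 2.14 on. The task function one_rep is a made-up stand-in for one replicate of a CPU-bound simulation.

```r
library(parallel)   # assumption: R >= 2.14

# Hypothetical CPU-bound task: one simulation replicate, seeded for reproducibility.
one_rep <- function(i) {
  set.seed(i)
  mean(rnorm(1e4))
}

cl <- makeCluster(2)                    # two worker R processes, connected by sockets
par_res <- parSapply(cl, 1:8, one_rep)  # replicates are split across the workers
stopCluster(cl)

ser_res <- sapply(1:8, one_rep)         # the same work in the master session
print(all.equal(par_res, ser_res))
```

As the message says, this is not one session: each worker is a separate R process, and the master only collects the results.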
RE: [R] Dismal R performance of Athlon moble CPU?
From: Jason Liao [mailto:[EMAIL PROTECTED]
> Thanks to Prof. Ripley and Andy for your technical explanations. It seems that real CPU speed has not advanced as fast as the GHz figures or other performance indicators suggest. Yes, my program is totally CPU intensive.
>
>> We do have a dual P4 Xeon 2.4 GHz with 8GB RAM, and jobs run more than twice as fast as my PIII 933MHz laptop.
>
> R can not really use dual CPU for one R session if I understand correctly

No, but that machine is being shared by several people. Even if only one person uses the box, it helps to have one CPU dedicated to R, and another taking care of other things. Having 12k rpm SCSI disks and fast RAM helped, too.

Andy
[R] unz( x.zip, y.csv ) != pipe( unzip -p x.zip y.csv )
Not sure this is a bug in R. Maybe it's a bug in my understanding of unz().

The character 'b2' (hexadecimal) is in position 535 of line 1 of 'naughty.csv'. This character appears as superscript '2' and came to me in an EXCEL file that I converted to text in a comma separated (*.csv) format.

The first line gets truncated by readLines after 534 characters using unz():

    nchar( readLines( unz( "bad.zip", "naughty.csv" )))
    [1] 534  11   9  22

    nchar( readLines( pipe( "unzip -p bad.zip naughty.csv" ) ))
    [1] 809  11   9  22

Attempting to read the same file using scan( unz( ... ) ) concatenates the rest of the file (including comma separators) to the word that included 'b2', while scan( pipe( "unzip ..." ) ) reads all elements.

    options(width = 50)  # prevent my mailer from line wrapping
    nchar( scan( unz( "bad.zip", "naughty.csv" ), what="a", sep=",", nlines=1 ) )
    Read 45 items
     [1]   5   9  12   8  11   4   2   1   1   8   8
    [12]   8   9   5  10   8   6  12  10   8  16  16
    [23]  12  14  12  20  10   8   6  12  10   8  16
    [34]  16  12  14  12  20  20  18  20  18  13  13
    [45] 329

    nchar( scan( pipe( "unzip -p bad.zip naughty.csv" ), what="a", sep=",", nlines=1 ) )
    Read 62 items
     [1]  5  9 12  8 11  4  2  1  1  8  8  8  9  5 10
    [16]  8  6 12 10  8 16 16 12 14 12 20 10  8  6 12
    [31] 10  8 16 16 12 14 12 20 20 18 20 18 13 13 10
    [46] 13 14 12 12 10 16 14 12 10 16 14 22 20 22 20
    [61] 15 15

    version   ## LINUX R-1.7.1 gave similar results
             _
    platform sparc-sun-solaris2.8
    arch     sparc
    os       solaris2.8
    system   sparc, solaris2.8
    status
    major    1
    minor    7.0
    year     2003
    month    04
    day      16
    language R

Chuck

Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED] UC San Diego
http://hacuna.ucsd.edu/members/ccb.html La Jolla, San Diego 92093-0717
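When a single byte is suspected of upsetting a text reader, reading the file as raw bytes sidesteps any character handling and confirms where the byte sits. The sketch below builds its own tiny file, a made-up stand-in for naughty.csv, so it does not reproduce the unz() behaviour reported above; it only shows how to locate a 0xb2 byte.

```r
# Build a small stand-in file: "a,m", then the byte 0xb2, then ",c" and a newline.
tmp <- tempfile(fileext = ".csv")
writeBin(c(charToRaw("a,m"), as.raw(0xb2), charToRaw(",c\n")), tmp)

# Read the file back as raw bytes; no encoding or line splitting is involved.
bytes <- readBin(tmp, what = "raw", n = file.info(tmp)$size)

# Positions (1-based) of every 0xb2 byte in the file.
pos <- which(bytes == as.raw(0xb2))
print(pos)
```

On the real data, the same readBin() call could be applied to the output of unzip extracted to a temporary file, to check that the byte really is at position 535.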
Re: [R] trouble with maps
[EMAIL PROTECTED] wrote: On Wed, 23 Jul 2003, Deborah Swayne wrote: Has anyone else seen this behavior from the maps package? map('state', fill=TRUE) results in a lively mix of overlapping polygons inside a map of the US, but they have no obvious relationship to state boundaries. (See attached jpeg.) Ah, this was a 'known problem' :-) first noticed in January this year, by Ott Toomet [EMAIL PROTECTED]. I suspect it is a feature of the port to R which never worked properly. The solution is to reverse the sign of every line number making up the polygons in the .gon file (or alternately, reverse the order of the line numbers there). A 'fixed' maps package (Unix only) is available at: ftp://ftp.mcs.vuw.ac.nz/pub/statistics/map/maps_1.1-2.tar.gz Ray Brownrigg __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
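The fix described here (flipping the sign of the line numbers in the .gon file) suggests how signs are used when a polygon is assembled from shared boundary lines. The sketch below is a toy illustration on made-up data, not the code of the maps package: the convention that a negative line number means "traverse this line in reverse" is an assumption drawn from this thread, and stitch() is a hypothetical helper.

```r
# Hypothetical data: the three boundary lines of a triangle A(0,0), B(1,0), C(0,1),
# each stored once, in whatever direction it happened to be digitised.
lines_xy <- list(
  cbind(x = c(0, 1), y = c(0, 0)),   # line 1: A -> B
  cbind(x = c(0, 1), y = c(1, 0)),   # line 2: C -> B
  cbind(x = c(0, 0), y = c(1, 0))    # line 3: C -> A
)

# A polygon is a vector of signed line numbers; a negative number is taken to
# mean that the line's points are used in reverse order.
stitch <- function(gon, lines_xy) {
  pieces <- lapply(gon, function(k) {
    xy <- lines_xy[[abs(k)]]
    if (k < 0) xy[nrow(xy):1, , drop = FALSE] else xy
  })
  do.call(rbind, pieces)
}

good <- stitch(c(1, -2, 3), lines_xy)  # A -> B, B -> C, C -> A: a closed ring
bad  <- stitch(c(1,  2, 3), lines_xy)  # sign dropped: the path jumps from B to C
print(good)
print(bad)
```

In the second call the end of line 1 no longer meets the start of line 2, so a filled polygon drawn from these points would cut across the shape, which is consistent with the overlapping polygons reported in the thread.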
[R] pls regression - optimal number of LVs
Dear R-helpers,

I have performed a PLS regression with the mvr function from the pls.pcr package and I have 2 questions:

1- Do you know if mvr automatically centers the data? It seems to me that it does so...

2- Why in the situation below does the output say that the optimal number of latent variables is 4? In my humble opinion, it is 2, because the RMS increases and the R2 decreases when 3 LVs are considered:

    summary(maturityCondor.raw.mvr)

    Data:   X dimension: 8 1050
            Y dimension: 8 1
    Method: SIMPLS
    Number of latent variables considered: 1-7

    TRAINING:
    RMS table:
               [,1]
    1 LV's 1.23e+01
    2 LV's 6.79e+00
    3 LV's 5.00e+00
    4 LV's 2.17e+00
    5 LV's 1.93e+00
    6 LV's 7.79e-01
    7 LV's 1.01e-09

    Cumulative fraction of variance explained:
               X     Y
    1 LV's 0.848 0.499
    2 LV's 0.930 0.846
    3 LV's 0.979 0.917
    4 LV's 0.992 0.984
    5 LV's 0.999 0.988
    6 LV's 1.000 0.998
    7 LV's 1.000 1.000

    VALIDATION
    Optimal number of latent variables: 4

    RMS table (10-fold crossvalidation):
            [,1]
    1 LV's 16.21
    2 LV's 12.15
    3 LV's 13.81
    4 LV's  6.68
    5 LV's  6.38
    6 LV's  5.91
    7 LV's 13.38

    Coefficient of multiple determination (R2):
           [,1]
    1 LV's 0.20
    2 LV's 0.51
    3 LV's 0.41
    4 LV's 0.88
    5 LV's 0.87
    6 LV's 0.90
    7 LV's 0.77

Thanks for your help,
Arnaud

*
Arnaud DOWKIW
Department of Primary Industries
J. Bjelke-Petersen Research Station
KINGAROY, QLD 4610
Australia
T : + 61 7 41 600 700
T : + 61 7 41 600 728 (direct)
F : + 61 7 41 600 760
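To make the second question concrete, the cross-validated RMS values from the output above can be fed to two simple selection rules. Neither is claimed to be the rule mvr uses, which the thread does not state; the sketch only shows that the reported optimum of 4 coincides with neither the first local minimum nor the global minimum.

```r
# Cross-validated RMS values copied from the summary above (1 to 7 LVs).
cv_rms <- c(16.21, 12.15, 13.81, 6.68, 6.38, 5.91, 13.38)

# Rule A: stop at the first local minimum (the reasoning in the question).
first_local_min <- which(diff(cv_rms) > 0)[1]

# Rule B: take the global minimum of the cross-validated RMS.
global_min <- which.min(cv_rms)

print(c(first_local_min = first_local_min, global_min = global_min))
```

One possible reading, offered only as a guess, is that the package prefers a smaller model whose error is not much worse than the minimum; with 8 observations the cross-validated values are in any case very noisy.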
RE: [R] Dismal R performance of Athlon moble CPU?
I haven't gotten around to assembling the toolset required to build R on Windows, since most of what I do is smallish interactive problems. However, another possibility would be to load CygWin/XFree86 on your laptop (which I've done), then download Atlas 3.5.7 from SourceForge (which I've done), then build Atlas with CygWin (which I've done) and then build a second version of R under CygWin using Atlas, and use the CygWin/Atlas R for the heavy number-crunching jobs. This last I haven't done, so I can't say whether there are any gotchas, but everything else I've done with CygWin/XFree86 has worked.

My laptop is a Compaq Presario with a 1.67 GHz Athlon XP. Atlas screams on it; the Atlas folks were grinning when I sent them the log. Atlas has an assembly language kernel for Athlons (and P4s as well IIRC).

Oh, yeah ... If you do try my scheme, make sure you don't have spaces in the paths ... Atlas still isn't immune to that sort of thing under CygWin.

--
M. Edward (Ed) Borasky
mailto:[EMAIL PROTECTED]
http://www.borasky-research.net

Suppose that tonight, while you sleep, a miracle happens - you wake up tomorrow with what you have longed for! How will you discover that a miracle happened? How will your loved ones? What will be different? What will you notice? What do you need to explode into tomorrow with grace, power, love, passion and confidence? -- L. Michael Hall, PhD
[R] Intervention/Impact analysis in time series
Hi R users:

Does anyone know of an R library for dealing with intervention/impact analysis in time series (e.g. the Box-Tiao et al. theory)?

Thank you for your help

--
Kenneth Roy Cabrera Torres
Celular +57 (315) 405 9339
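One simple form of intervention analysis, a permanent level shift with ARMA errors, can be fitted with arima() by passing a step indicator through xreg. The sketch below uses simulated data with a known shift of 5; the series, the intervention time and the AR(1) error model are all assumptions made for illustration, and more general transfer-function responses are not covered.

```r
set.seed(123)
n <- 200

# Step indicator: 0 before the (hypothetical) intervention at t = 101, 1 afterwards.
step <- as.numeric(seq_len(n) > 100)

# Simulated series: level 10, a permanent shift of 5, and AR(1) noise.
noise <- arima.sim(model = list(ar = 0.5), n = n)
y <- 10 + 5 * step + noise

# Regression on the step indicator with AR(1) errors.
fit <- arima(y, order = c(1, 0, 0), xreg = cbind(step = step))
shift_hat <- coef(fit)["step"]
print(fit)
```

The coefficient labelled step estimates the size of the level shift, and its standard error in the printed fit accounts for the autocorrelation in the errors.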