[R] how to plot a fitted smooth line on histograms?
for example, x - read.table(); here x is a vector containing my data to be analyzed. hist(x,plot=TRUE,breaks=200) then I got a histogram graph. However, How can't get a smooth curve based on the histogram cells to show out the outline? Thanks very much. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] comparison operator, decimals, and signif()
On 22-May-05 Nick Drew wrote: Hi, I recently spent quite a bit of time trouble shooting a function that I had written only to discover that the problem I was having was with the comparison operator. I assumed that the following would return TRUE: testMean - 82.8 + 0.1 testMean [1] 82.9 testMean == 82.9 [1] FALSE Apparently this has to do with deciml places. Look: newTest - 82.0 newTest [1] 82 newTest == 82 [1] TRUE newTest == 82.0 [1] TRUE What does signif() do to my object called testMean so that the comparison now evaluates to TRUE? signif(testMean, 3) == 82.9 [1] TRUE While not an exact explanation, the following strongly hints at the underlying answer: print(82.9,digits=20) [1] 82.9 print(82.8,digits=20) [1] 82.8 print(0.1,digits=20) [1] 0.1 print(82.8+0.1,digits=20) [1] 82.89 (82.8+0.1)-82.9 [1] -1.421085e-14 So decimal arithmetic in a binary computer is not as exact as decimal arithmetic done with pencil and paper. The underlying reason (in part) for the above apparent anomalies is that when 82.8 is added to 0.1, the smaller of these numbers (in its binary floating point representation) is shifted along until corresponding digits line up (the shift being accounted for in the exponent). In performing the shift, binary digits are dropped from the bottom end. Since 0.1 in binary in fact has an infinite number of significant decimal places, inaccuracy inevitably creeps in! A close neighbour of the above calculations, such that there are no discrepancies in the binary representations, illustrates the above explanation: print((82.0 + 7/8),digits=20) [1] 82.875 print((82.0 + 7/8)+1/16,digits=20) [1] 82.9375 print(82.9375,digits=20) [1] 82.9375 ((82.0 + 7/8)+1/16)-82.9375 [1] 0 No discrepancies here! The reason is that the fractional parts of these numbers all have exact finite (and short) binary representations. Compare: ((82.0 + 7/8)+1/16)==82.9375 [1] TRUE ((82.0 + 8/10)+1/10)==82.9 [1] FALSE Hoping this helps, Best wishes, Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 22-May-05 Time: 09:58:00 -- XFMail -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Numerical PDE Solver
Thanks. I was thinking of something like Crank-Nicolson ? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard Sent: 21 May 2005 11:53 To: [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Numerical PDE Solver Tolga Uzuner [EMAIL PROTECTED] writes: Is there a package in R which implements numerically solves pde ? Thanks, Tolga Not that I know of. What kind? Parabolic PDEs can be fairly easily converted to ODEs by the method of lines. -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Suimmary of answers : Possible (ab)use of lexical scoping in R ?
Emmanuel Charpentier wrote: Dear List, I asked how to create a set of functions (and maybe variables) shared by another set of functions but hidden from the main environment. Duncan Murdoch and Brian Ripley advised to use the package creation system. Brian ripley (and someone else, offlist) also pointed me to the local() function, which creates new environments with specified contents, and which I was unaware of (btw, when this function has been introduced ? It is mentioned neither in MASS 4th edition index, nor in 'S Programming' index). I think everybody remembers that local() has been introduced in R-0.65.0 (those who do not remember can simply look into the file OONEWS). I like the two books as well, but I don't think they can be complete in the sense of describing *all* R functions. I'd like to recommend the package way. Working with envrionments and local() does not seem to be the R way - and it is very confusing to read code that refers to a couple of different environments ... Uwe Ligges After re-reading the available docs (which I may have misunderstood...), I come to the following conclusions : - The package creation is the most elegant and portable form. Unless I am mistaken, it entails however some administrative overhead (creation of a directory structure, R CMD installation*, etc ...). - Using local() is a (semi-) kludge, easy to use in one-file disposable works. It might be more error-prone than package creation. - It has been pointed to me that manipulating environment (via local() or otherwise), in a very Abelson--Sussmann-like way, allowed to create OO-oriented code, somewhat different from S3 and S4 class mechanisms. Since repeated experiences have proved to my satisfaction that I am piss-poor at top-down design, I will probably use the environment manipulation for initial head-scratching phase, switching to package creation at the formalization phase. A big thank you to all respondents, whose answers have been *very* useful. Emmanuel Charpentier __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Calling R from R and specifying wait until script is finished
On 5/22/05, Duncan Murdoch [EMAIL PROTECTED] wrote: Lapointe, Pierre wrote: Hello, Let's say I have 50 R scripts to run. What would be the most efficient way to run them? I thought I could do multiple Rterms in a DOS batch file: Ex: Rterm 1.R 1.txt Rterm 2.R 2.txt ... Rterm 50.R 50.txt However, I'm afraid they will all open at the same time. That could be a problem if you had used Rgui (depending on which shell you are using), but Rterm should run until completion before the next line of the batch file will start. Using Windows XP cmd, one can force waiting for Rgui to complete like this (note the /wait): start/wait \Program Files\R\rw2010\bin\Rgui.exe __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to plot a fitted smooth line on histograms?
You may check the chapter 8 Probability distributions in An Introduction to R. Wuming On 5/22/05, Hu Chen [EMAIL PROTECTED] wrote: for example, x - read.table(); here x is a vector containing my data to be analyzed. hist(x,plot=TRUE,breaks=200) then I got a histogram graph. However, How can't get a smooth curve based on the histogram cells to show out the outline? Thanks very much. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
R.batch (Was: Re: [R] Calling R from R and specifying wait until script is finished)
Hi. I have a package R.batch in R.classes [http://www.maths.lth.se/help/R/R.classes/] to simplify running multiple batch jobs, which might interest you. The idea is as follows. You setup a directory structure defining a 'JobBatch'; path-to/jobs/ src/ input/ output/ erroneous/ failed/ finished/ interrupted/ running/ todo/ job01/ job02/ job03/ A 'Job' is simply a directory (above job01/, job02/, job03/). Put code shared by all Job:s in src/. Put code unique to each Job in its job directory, e.g. job01/setupParameters.R. All *.R files in src/ and then in job directory are source():ed before a Job is started. When a Job is run, onRun(job) is called. Thus, you have to define onRun() in src/ with the option to override it in each job directory. As soon as the Job is being processed it is moved to running/. When a Job is successful and completed, onFinally(job) is called and it is moved to finished/. If a Job is interrupted, say by Ctrl+C or by sending SIG-INT for elsewhere, onInterrupt() is called and the job is moved to interrupted/. Similarly, if an error occurs, say, by calling stop(), onError(job) is called and it is moved to failed/. (If an error occurs while source():ing the *.R files before starting, the job is moved to erroneous/). If you call sourceHotCode(job) once in a while in your onRun(job) code, code in src/hot/ or job01/hot/ will be source():ed *and *removed. This allows you to fix problems (redefine objective functions etc) *while running*, say a 10-hour job. When a Job runs, its working directory is set to itself, e.g. running/job01/. Thus, written result files and created images etc will naturally be save in each job directory. You can also write to getOutputPath(), which is the output/. Log files are written to getLogPath(), which defaults to output/. Each Job can access common data from the input/ path by getInputPath(). Note that in Unix input/ can be a soft link to another directory. To provide the same functionality under Windows, Windows shortcut files, say, input.lnk, are recognized and followed if getInputPath() is used. [This actually holds for other directories too; if multiple batches share same source code, you can link src/]. Given the above structure, you run all Jobs one by one, by library(R.batch) batch - JobBatch(path-to/jobs/) run(batch) # wait until all jobs are processed # Try Ctrl+C, rerun by run(batch). print(batch) # Gives a summary of the status of all jobs Logging and everything else is taken care of automatically. The code is written such it should be possible for several R sessions to operate on the same batch set simultaneously. Lock files are used to control for this. I used this last summer to run batch jobs from 30+ computers sharing the same file system. Want to try it? Try this install.packages(R.classes, contriburl=http://www.maths.lth.se/help/R;) library(R.batch) example(JobBatch) and a batch of Mandelbrot sets (from Martin Maechler's rhelp example) will be generated together with images. Warning: The package works, but the API is not fixed, meaning it may change in future releases. However, the general idea should remain. Currently I feel that the names of some methods and directories are a little bit confusing. Feedback on this is appreciated. Future: Recently, I have been working on adding dependency control between jobs so certain jobs are processed before others. This is not included in the current version. Some kind of mechanism to restarting interrupted jobs where they where interrupted would also be very nice, but this is very tricky and will propably require modification of the R engine, which is beyond my skills. Cheers Henrik Bengtsson Lapointe, Pierre wrote: Hello, Let's say I have 50 R scripts to run. What would be the most efficient way to run them? I thought I could do multiple Rterms in a DOS batch file: Ex: Rterm 1.R 1.txt Rterm 2.R 2.txt ... Rterm 50.R 50.txt However, I'm afraid they will all open at the same time. I know I could pause the batch file with something like: PING 1.1.1.1 -n 1 -w 6 NUL (to delay 60 seconds) But that would require that I know how long each of my scripts take. Is there an easier way? Something like calling R from R and specifying that the script has to be finished before continuing. Thanks Pierre Lapointe *** AVIS DE NON-RESPONSABILITE:\ Ce document transmis par courri...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!
Re: R.batch (Was: Re: [R] Calling R from R and specifying wait until script is finished)
On 5/22/05, Henrik Bengtsson [EMAIL PROTECTED] wrote: -- very interesting package description deleted -- Future: Recently, I have been working on adding dependency control between jobs so certain jobs are processed before others. This is not included in the current version. Some kind of mechanism to restarting interrupted jobs where they where interrupted would also be very nice, but this is very tricky and will propably require modification of the R engine, which is beyond my skills. For the latter, probably the best approach would be to employ a job/grid/queueing engine which does this (i.e. grid-enabling R with a grid approach which has checkpointing). I've been looking at this at work in a different context. best, -tony Commit early,commit often, and commit in a repository from which we can easily roll-back your mistakes (AJR, 4Jan05). A.J. Rossini [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Hmisc/Design and problem with cgroup
On 5/17/05 21:44 Frank E Harrell Jr sent the following: Aric Gregson wrote: Hello, I am trying to use the following to output a table to latex: cohortbyagesummary - by(data.frame(age,ethnicity), cohort, summary) w - latex.default(cohortbyagesummary, caption=Five Number Age Summaries by Cohort, label=agesummarybycohort, cgroup=c('hello','goodbye','hello'), colheads=c(Age,Ethnicity), extracolheads=c('hello','goodbye'), greek=TRUE, ctable=TRUE) I am not able to get the major column headings of cgroup to work. I receive the error: Object cline not found See if a modified version at http://biostat.mc.vanderbilt.edu/cgi-bin/cvsweb.cgi/Hmisc/R/latex.s fixes your problem. Dr. Harrell, Thanks for your help. I have downloaded latex.s, but am unable to figure out where it should go. How do I ensure that it is called when I load Hmisc? Do I patch it against R/Hmisc? I have a local directory structure in ~/Library/R/library/Hmisc/: -rw-r--r-- 1 user user 594 12 May 19:20 CITATION -rw-r--r-- 1 user user 989 12 May 19:20 DESCRIPTION drwxr-xr-x 3 user user 102 12 May 19:20 Meta/package.rds drwxr-xr-x 3 user user 102 12 May 19:20 R/Hmisc drwxr-xr-x 3 user user 102 12 May 19:20 libs/Hmisc.so Thanks for your patience. aric __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] constraints
On Sunday 22 May 2005 03:18, Ingmar Visser wrote: Is there a package in R that handles general linear (in-)equality + box constrained optimization? R help.search(constrained optimisation) constrOptim(stats) Linearly constrained optimisation Best wishes, Arne If it is not there, could anyone advise me which way to go? And/or point me to packages that solve these problems partially? best, ingmar -- Arne Henningsen Department of Agricultural Economics University of Kiel Olshausenstr. 40 D-24098 Kiel (Germany) Tel: +49-431-880 4445 Fax: +49-431-880 1397 [EMAIL PROTECTED] http://www.uni-kiel.de/agrarpol/ahenningsen/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Maps, Eastern Europe
From: Rau, Roland [EMAIL PROTECTED] I would like to employ a European map in a presentation. My idea was to use: library(mapdata) map(worldHires, c(Austria, Switzerland, Germany)) where I include all countries from my analysis as a vector of character strings like in the example above. Unfortunately, I was unable to specify the Baltic States (Latvia, Lithuania, Estonia) or the Czech Republic. Initially, I thought I misspelled the country names. To avoid this obvious error, I checked within 'mapdata_2.0-14.tar.gz' the file 'worldHires.names'. To my surprise, the spelling was not the source of the error - it was rather the case that those countries were not included. Did some people experience similar problems? And if yes: how did you proceed? Unfortunately the world databases for the maps package are very old (early 1990's) and have not been updated since, except for minor typographical errors in country names. Note that Czechoslovakia is there (as is USSR!). Also, it currently satisfies my needs, which is why I don't have the resources to update it. I suggest you start with the maptools package, but as I understand it, you need to also find some appropriate shapefile data. [Other respondents may improve upon this suggestion.] Regards, Ray Brownrigg maps package maintainer __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] R version 1.9.1
Chris, Try duplicating R.app (select the R.app in Applications and do Command-D). Double click both versions and you have 2 completely separated versions of R. You can drag/place the multiple icons in the dock. Some issues will show up, e.g. it will use the history from the version that was closed last, both copies will share preferences, etc. I would be interested why you use this setup (I can think of a few reasons), but would like to understand yours. If you can send these to me directly or to the R-SIG-Mac, that would be appreciated. Rob On May 21, 2005, at 3:39 PM, Christopher Wills wrote: Can anybody help? I have been running multiple programs simultaneously in Raqua windows, on Mac OS 10.3 on a G5 dual processor Mac, using R version 1.9.1. I rebooted my computer recently, and thought that I would upgrade to the latest R at the same time. Alas, I find that the newest R does not (so far as I can tell) allow multiple Raqua windows. I would like to go back to 1.9.1, but I have erased the original compressed binary file and it has vanished from the official web sites. Does anyone have a binary of this version that they can send me? (I have Fetch, if it should prove too large.) Alternatively, is there a way to run multiple Raqua windows in the newest R? Many thanks! Chris Wills Christopher Wills Professor of Biological Sciences University of California, San Diego La Jolla CA 92093-0116 Phone: 858-534-4113 Fax: 858-534-7108 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting- guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Hmisc/Design and problem with cgroup
Uwe Ligges wrote: Aric Gregson wrote: On 5/17/05 21:44 Frank E Harrell Jr sent the following: Aric Gregson wrote: Hello, I am trying to use the following to output a table to latex: cohortbyagesummary - by(data.frame(age,ethnicity), cohort, summary) w - latex.default(cohortbyagesummary,caption=Five Number Age Summaries by Cohort, label=agesummarybycohort,cgroup=c('hello','goodbye','hello'), colheads=c(Age,Ethnicity), extracolheads=c('hello','goodbye'),greek=TRUE, ctable=TRUE) I am not able to get the major column headings of cgroup to work. I receive the error: Object cline not found See if a modified version at http://biostat.mc.vanderbilt.edu/cgi-bin/cvsweb.cgi/Hmisc/R/latex.s fixes your problem. Dr. Harrell, Thanks for your help. I have downloaded latex.s, but am unable to figure out where it should go. How do I ensure that it is called when I load Hmisc? Do I patch it against R/Hmisc? I have a local directory structure in ~/Library/R/library/Hmisc/: -rw-r--r-- 1 user user 594 12 May 19:20 CITATION -rw-r--r-- 1 user user 989 12 May 19:20 DESCRIPTION drwxr-xr-x 3 user user 102 12 May 19:20 Meta/package.rds drwxr-xr-x 3 user user 102 12 May 19:20 R/Hmisc drwxr-xr-x 3 user user 102 12 May 19:20 libs/Hmisc.so Thanks for your patience. Your directory structure is from an installed binary package. You need to replace the file in the *source* package and *install* the source package after that. Uwe Ligges A better approach is just to source( ) in the new version after library(Hmisc) is issued. You can put latex.s anywhere, just put that path in the source( ). A new version of Hmisc will include the fix. Frank aric __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] skewness and kurtosis in e1071 correct?
I wonder whether the functions for skewness and kurtosis in the e1071 package are based on correct formulas. The functions in the package e1071 are: # skewness - function (x, na.rm = FALSE) { if (na.rm) x - x[!is.na(x)] sum((x - mean(x))^3)/(length(x) * sd(x)^3) } # and # kurtosis - function (x, na.rm = FALSE) { if (na.rm) x - x[!is.na(x)] sum((x - mean(x))^4)/(length(x) * var(x)^2) - 3 } # However, sd() and var() are the estimated population parameters. To calculate the sample statistics of skewness and kurtosis, shouldn't one use the sample statistics of the standard deviation (and variance), as well? For example: # # Function to calculate the sample statistic of skewness: skew_s=function(x) { x = x[!is.na(x)] n = length(x) if (n 3) { cat('valid cases = ',n,'\nskewness is not defined for less than 3 valid cases!\n') } else { z = sqrt(n/(n-1))*scale(x) mean(z^3) } } # and # # Function to calculate the sample statistic of kurtosis: kurt_s=function(x) { x = x[!is.na(x)] n = length(x) if (n 4) { cat('valid cases = ',n,'\nkurtosis is not defined for less than 4 valid cases!\n') } else { z = sqrt(n/(n-1))*scale(x) mean(z^4)-3 } } # Whereas, to calculate the (unbiased) estimated population parameter of skewness and kurtosis, the correction should also include the number of cases in the following way: # # Function to calculate the unbiased populataion estimate of skewness: skew=function(x) { x = x[!is.na(x)] n = length(x) if (n 3) { cat('valid cases = ',n,'\nskewness is not defined for less than 3 valid cases!\n') } else { z = scale(x) sum(z^3)*n/((n-1)*(n-2)) } } # and # # Function to calculate the unbiased population estimate of kurtosis: kurt=function(x) { x = x[!is.na(x)] n = length(x) if (n 4) { cat('valid cases = ',n,'\nkurtosis is not defined for less than 4 valid cases!\n') } else { z = scale(x) sum(z^4)*n*(n+1)/((n-1)*(n-2)*(n-3))-3*(n-1)^2/((n-2)*(n-3)) } } # Thus, it seems that the formulas used in the e1071 package are neither formulas for the sample statistics nor for the (unbiased) estimates of the population parameters. Is there another definition of kurtosis and skewness that I am not aware of? Dirk -- * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Edmund-Siemers-Allee 1 D-20146 Hamburg Germany phone: +49-040-42838.7498 (office) +49-040-42838.4591 (Billon) fax: +49-040-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html