Re: [R] Opening SAS file using read.sas7bdat() function in sas7bdat library.
Thanks for the helpful comments from others.

The KNOWNHOST variable lists the types of file that are known to work with the read.sas7bdat function. It's likely that most files written on Windows platforms will work, even if not listed in KNOWNHOST. If you're feeling experimental, you might just comment out the lines that test against the KNOWNHOST list.

Unfortunately, it appears that the file format depends on the system where it was originally written. The hypothesis is that sas7bdat files were originally no more than a memory dump of a C structure, or similar. Because C structures may be laid out differently by different compilers (i.e., on different platforms), this may have led to the difficulty apparent here.

Regards,
Matt

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Any good R server-with connection examples
> I want to connect R with HTML/PHP pages to take input from the user, do
> some statistical processing on it, and show the results on an HTML page
> again. I searched the net and found the Rserve package, but the examples
> are mainly for the Java language, not for PHP. I am wondering how to
> connect it to PHP-Apache-MySQL. Is there any good tutorial/video which
> will tell me how to do that? At least tell me a logical way to use it.

Check out http://rapache.net/

rApache connects R and the Apache 2 web server, such that R can act as a server-side scripting language, like PHP. This may be the easiest way, using R, to take user input from the web browser. The site has some decent documentation and links to examples.

--Matt
Re: [R] Nested brew call yields Error in .brew.cat(26, 28) : unused argument(s) (26, 28)
On Wed, 2012-03-28 at 11:40 +0100, Chris Beeley wrote:
> I am writing several webpages using the brew package and R2HTML. I would
> like to work off one script so I am using nested brew calls. The
> documentation for brew states that:
>
> NOTE: brew calls can be nested and rely on placing a function named
> '.brew.cat' in the environment in which it is passed. Each time brew is
> called, a check for the existence of this function is made. If it exists,
> then it is replaced with a new copy that is lexically scoped to the
> current brew frame. Once the brew call is done, the function is replaced
> with the previous function. The function is finally removed from the
> environment once all brew calls return.
>
> I'm afraid I can't quite figure out what it is I'm supposed to do here.
> I've tried loading the brew library within the script which I pass to
> brew, and I've tried defining brew cat like this:
>
> .brew.cat=function(){}
>
> This generates the following error message:
>
> Error in .brew.cat(26, 28) : unused argument(s) (26, 28)
>
> I think perhaps it is more likely that I need to insert into the script
> the actual content of .brew.cat, but I can't seem to get R to tell me what
> it is and Googling throws up a lot of stuff about beer and not much else
> (drew a blank also from RSiteSearch("Nested brew"))
>
> Any help gratefully received.
>
> Chris Beeley
> Institute of Mental Health, UK

The NOTE quoted above describes what brew is doing behind the scenes. It's not necessary to modify or set the .brew.cat function. A nested (or recursive) brew call occurs when brew() is called from a document currently being processed by brew(). To illustrate further, suppose there are two brew documents, example-1.brew and example-2.brew, where example-1.brew contains the following text (delimited by '''):

'''
This text is in example-1.brew.
<%= brew::brew("example-2.brew") %>
'''

and example-2.brew contains

'''
This text is in example-2.brew.
<%= date() -%>
'''

Then from the R prompt we have:

R> brew::brew("example-1.brew")
This text is in example-1.brew.
This text is in example-2.brew.
Thu Mar 29 20:24:52 2012

-- 
Matthew S. Shotwell
Assistant Professor, Department of Biostatistics
School of Medicine, Vanderbilt University
1161 21st Ave. S2323 MCN Office CC2102L
Nashville, TN 37232-2158
[R] Help with Matrix code optimization
The chol and solve methods for dpoMatrix (Matrix package) are much faster than the default methods. But the time required to coerce a regular matrix to dpoMatrix swamps the advantage. Hence, I have the following problem, where use of dpoMatrix is worse than a regular matrix.

    library(Matrix)
    x <- diag(10)

    system.time(
      for(r in seq(0.1, 0.9, length.out=1000)) {
        m <- r^abs(row(x)-col(x))
        chol(m)
        solve(m)
      })

    system.time(
      for(r in seq(0.1, 0.9, length.out=1000)) {
        M <- as(r^abs(row(x)-col(x)), 'dpoMatrix')
        chol(M)
        solve(M)
      })

Any ideas?
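One possibility, offered only as a sketch and not as an answer from the thread: stay with a regular matrix and reuse a single Cholesky factor for both the inverse and any linear solves, using base R's chol(), chol2inv(), backsolve() and forwardsolve(). The names m_inv, y and b are made up for illustration, and whether this beats the dpoMatrix methods for a given problem size would need to be timed.

```r
# Matrix with entries r^|i - j|, as in the post (positive definite for |r| < 1)
x <- diag(10)
r <- 0.5
m <- r^abs(row(x) - col(x))

# Factor once: U is upper triangular with t(U) %*% U equal to m
U <- chol(m)

# Inverse computed from the existing factor instead of a fresh solve(m)
m_inv <- chol2inv(U)

# Solve m %*% b = y with two triangular solves that reuse the same factor
y <- seq_len(nrow(m))
b <- backsolve(U, forwardsolve(t(U), y))
```

The design choice here is simply to avoid factoring the same matrix twice; no coercion to an S4 class is involved.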
Re: [R] function restrictedparts
That's because the number of partitions of 281 items of order 10 is quite large:

R> library('partitions')
R> R(10,281)
[1] 1218681472

Without thinking about this too hard, the result of restrictedparts(281,10) should require around

R> 1218681472 * 10 * 4 / 10^9
[1] 48.74726

gigabytes of storage space (because the result is a 1218681472 x 10 array of 4 byte integers). Because the number of partitions grows 'explosively' with the number of items, this is a serious obstacle for statistical partitioning and clustering methods. For more discouragement, see the 'Bell number'.

You can enumerate these restricted partitions one by one; see

R> ?partitions::nextpart

Matt

On Wed, 2012-01-25 at 15:11 +, yan jiao wrote:
> I am using function restrictedparts, but got error:
>
> restrictedparts(281,10)
> Error in integer(len) : vector size specified is too large
> Calls: restrictedparts -> integer
> In addition: Warning message:
> In restrictedparts(281, 10) : NAs introduced by coercion
> Error in integer(len) : vector size specified is too large
> Calls: restrictedparts -> integer
>
> is there a similar function can deal with long vector?
> I'm using R version 2.14.1 (2011-12-22),x86_64, linux-gnu
>
> many thanks
> yan
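To get a feel for how quickly these counts grow before asking for the full array, the count can be computed without enumerating anything. The sketch below uses base R only; count_restricted() and storage_gb() are hypothetical helpers, not functions from the 'partitions' package, and count_restricted() counts partitions into at most m parts, which may differ from the convention behind R(10,281) above.

```r
# Number of partitions of n into at most m parts, via the recurrence
#   p(n, m) = p(n, m - 1) + p(n - m, m),   p(0, m) = 1
# where p(n - m, m) counts the partitions with exactly m parts.
count_restricted <- function(n, m) {
  p <- matrix(0, nrow = n + 1, ncol = m + 1)  # p[i + 1, k + 1] holds p(i, k)
  p[1, ] <- 1                                 # the empty partition of 0
  for (k in seq_len(m)) {
    for (i in seq_len(n)) {
      exactly_k <- if (i >= k) p[i - k + 1, k + 1] else 0
      p[i + 1, k + 1] <- p[i + 1, k] + exactly_k
    }
  }
  p[n + 1, m + 1]
}

# Storage for a count x m array of 4-byte integers, in units of 10^9 bytes
storage_gb <- function(count, m, bytes_per_int = 4) {
  count * m * bytes_per_int / 10^9
}

count_restricted(5, 2)          # 5, 4+1 and 3+2
storage_gb(1218681472, 10)      # the estimate worked out in the message above
```

The counts are held as doubles, so they do not hit the 32-bit integer limit that restrictedparts() ran into.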
Re: [R] Bayesian data analysis recommendations
On Thu, 2012-01-19 at 19:23 -0500, C W wrote:
> Thanks, Rich, I will look at the book.
>
> I agree, there are many nice packages, but what if the package changes in
> a few years? I would have no idea what is going on! I've heard from
> predecessors in the industry who emphasize the learning, not just plug and
> chug. I really want to learn the material and understand it; above all, it
> is interesting.
>
> I am looking more towards Bayesian statistics or Bayesian inference. I am
> in statistics graduate school; though it is not my field, the biology
> application could help in the understanding, I suppose?

This list (r-help) may not be the best place to look for advice on this. But here is some anyway :)

For a well-rounded introduction, I recommend Robert's 'The Bayesian Choice'. This is a great foundation for Bayesians who intend to defend their positions on statistical inference. For a more practical approach, Gelman, Carlin, Stern, and Rubin's book 'Bayesian Data Analysis' has been very popular (THE most popular, according to some).

Regarding the software tools for Bayesian data analysis, the most mature _and_ active _and_ best integrated with the R project is Martyn Plummer's JAGS (see also the R package rjags, by the same author). Another tool that I'm planning to check out is PyMC: http://code.google.com/p/pymc/

Best,
Matt

> On Thu, Jan 19, 2012 at 7:07 PM, Rich Shepard rshep...@appl-ecosys.com wrote:
>> On Thu, 19 Jan 2012, C W wrote:
>>
>>> I am trying to learn Bayesian inference and Bayesian data analysis, I
>>> am new in the field. Would any experts on the list recommend any good
>>> sites or materials for beginners?
>>>
>>> My approach is to learn and understand the theory first, then program
>>> on my own using R, though I see there are already packages.
>>
>> I'm far from an expert, but why not avoid re-inventing the wheel while
>> you learn? Buy and read Jim Albert's Bayesian Computation with R.
>>
>> If you're a population ecologist (or willing to extend presented examples
>> and ideas to communities and ecosystems), Ben Bolker's Ecological Models
>> and Data in R explains when Bayesian and frequentist approaches each have
>> advantages over the other.
>>
>> Rich
Re: [R] R logo in eps format
See this earlier post for SVG logos:
http://tolstoy.newcastle.edu.au/R/e12/devel/10/10/0112.html

Using ImageMagick, do something like

convert logo.svg logo.eps

On Thu, 2011-12-01 at 10:56 +0700, Ben Madin wrote:
> G'day all,
>
> Sorry if this message has been posted before, but searching for R is
> always difficult...
>
> I was hoping for a copy of the logo in eps format? Can I do this from R,
> or is one available for download?
>
> cheers
> Ben
Re: [R] contact person for UseR 2012, please?
The contact person is:

Stephania McNeal-Goddard
email: stephania.mcneal-godd...@vanderbilt.edu
phone: (615)322-2768
Vanderbilt University School of Medicine
Department of Biostatistics
S-2323 Medical Center North
Nashville, TN 37232-2158

On Tue, 2011-10-18 at 12:41 -0400, David Winsemius wrote:
> On Oct 18, 2011, at 12:25 PM, Erin Hodgess wrote:
>
>> Dear R People:
>>
>> Do you know who the contact person is for UseR 2012, please?
>>
>> I'm trying to get together some numbers for funding (sorry for the
>
> Funny, it was the first hit on a Google search with term useR2012
>
> http://biostat.mc.vanderbilt.edu/wiki/Main/UseR-2012
>
> David Winsemius, MD
> West Hartford, CT

-- 
Matthew S. Shotwell
Assistant Professor, Department of Biostatistics
School of Medicine, Vanderbilt University
1161 21st Ave. S2323 MCN Office CC2102L
Nashville, TN 37232-2158
Re: [R] [Related Topic] need help on read.spss
Would it be worthwhile to update the read.spss implementation using the more recent discoveries from the PSPP group? I don't mean to copy their code, but to use the ideas in their code. Is anyone working on this? I wouldn't want the effort to be duplicated.

On Thu, 2011-10-13 at 16:22 +0200, Uwe Ligges wrote:
> On 11.10.2011 12:07, Smart Guy wrote:
>> Hi,
>> I have one doubt about one of the parameters of 'read.spss()' from the
>> 'foreign' package. Here is the syntax:
>>
>> read.spss(file, use.value.labels = TRUE, to.data.frame = FALSE,
>>           max.value.labels = Inf, trim.factor.names = FALSE,
>>           trim_values = TRUE, reencode = NA,
>>           use.missings = to.data.frame)
>>
>> In the above syntax, when I pass 'to.data.frame = FALSE' it gives me the
>> missing values from the SPSS file (that I try to read using
>> read.spss()). But when I pass 'to.data.frame = TRUE' it does not give me
>> the missing values, and I need to get the missing values.
>>
>> According to the read.spss() documentation:
>>
>>   to.data.frame : return a data frame?
>>
>> I am curious to know, if we pass 'to.data.frame = TRUE', is it going to
>> cause some issue or affect something? I didn't understand the read.spss()
>> documentation correctly. Please explain.
>>
>> Thanks in Advance
>
> An R data.frame cannot represent different kinds of missing values, since
> R just has NA. Therefore, there are two ways to import data:
>
> to.data.frame=FALSE will read all the information, but into a format you
> will likely have to postprocess to make it conveniently usable.
>
> to.data.frame=TRUE will import into a data.frame, but that cannot
> represent all the nuances known from the SPSS representation.
>
> Uwe Ligges
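The two import modes Uwe describes can be compared side by side. This is a minimal sketch that assumes the example file electric.sav shipped with the 'foreign' package (the one used in ?read.spss) is available; the object names are made up, and no claim is made about which variables in that file carry user-defined missing values.

```r
library(foreign)

# Example SPSS file distributed with the 'foreign' package
sav <- system.file("files", "electric.sav", package = "foreign")

# Default: returns a list; use.missings defaults to to.data.frame (FALSE),
# so user-defined missing values are left as recorded in the file
as_list <- read.spss(sav)

# Data frame: use.missings now defaults to TRUE, so user-defined
# missing values are converted to NA
as_df <- read.spss(sav, to.data.frame = TRUE)

# Data frame that keeps the user-defined missing codes as ordinary values
as_df_codes <- read.spss(sav, to.data.frame = TRUE, use.missings = FALSE)

str(as_df, list.len = 5)
```

Setting use.missings explicitly, rather than relying on its default of to.data.frame, makes the intended handling of missing codes visible in the call.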
Re: [R] Rweb and setting up R on a server
Erin,

I haven't used Rweb recently. The URL is http://www.math.montana.edu/Rweb/ . If you have a server, you could set up the server version of RStudio: http://rstudio.org/download/server . It worked well when I tried it.

Best,
Matt

On Tue, 2011-09-06 at 17:07 -0500, Erin Hodgess wrote:
> Dear R People:
>
> At one time, Rweb existed, which had R on a server. I looked for it, but
> can't find it. Has anyone used that recently, or is there a new
> equivalent, please?
>
> Thanks,
> Erin
Re: [R] readBin fails to read large files
On Thu, 2011-09-01 at 17:36 +0100, Prof Brian Ripley wrote:
> readBin is intended to read a few items at a time, not 10^9. You are
> probably getting 32-bit integer overflow inside your OS, since the number
> of bytes you are trying to read in one go exceeds 2GB.
>
> Don't do that: read say a million at a time.
>
> And BTW, if these really are unsigned ints you will get wraparound.

To elaborate, ?readBin states that the 'signed' argument is only used for integers of size 1 and 2 bytes. These are ultimately converted to signed 4 byte integers, because that's how R stores integers. To be exact, if your file contains integers larger than 2^31-1 = 2147483647, integer overflow would occur. In actuality, R returns NA for those values. I'm bringing this up because R normally issues a warning:

R> 2147483647L + 1L
[1] NA
Warning message:
In 2147483647L + 1L : NAs produced by integer overflow

But a similar warning isn't issued by readBin when NA results from signed integer overflow:

# The raw vector below represents 2147483647L and 2147483647L + 1L
# in little endian, unsigned, 4 byte integers
R> dat <- as.raw(c(0xff,0xff,0xff,0x7f,0x00,0x00,0x00,0x80))
R> writeBin(dat, 'test.bin')
R> readBin('test.bin', n=2, integer(), signed=FALSE)
[1] 2147483647         NA

> On Thu, 1 Sep 2011, Benton, Paul wrote:
>
>> Posting for a friend
>>
>> Begin forwarded message:
>>
>> From: Geier, Florian florian.geie...@imperial.ac.uk
>> Subject: Fwd: readBin fails to read large files
>> Date: September 1, 2011 4:10:53 PM GMT+01:00
>>
>> Dear all,
>>
>> I am trying to read a large file (~2GB) of unsigned ints into R. Using
>> the command:
>>
>> raw<-readBin(file,n=10^8, integer(),endian="little",signed=FALSE)
>>
>> It works fine for n=10^8, but fails for n=10^9 (or even at n=6*10^8).
>> My machine$sizeof.long is 8 bit. I am running R 2.13.1 on a
>> x86_64-apple-darwin9.8.0/x86_64 (64-bit) architecture.
>>
>> Thanks for your help
>>
>> Florian
>>
>> --
>> AXA doctoral fellow
>> Bundy lab - Biomolecular Medicine
>> Imperial College London
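Putting the two pieces of advice together (read in modest chunks, and mind the signed 4 byte storage), one possible workaround is to read signed integers and convert them to doubles, which can hold every unsigned 32-bit value exactly. This is only a sketch: read_uint32() is a hypothetical helper, the chunk size is arbitrary, and it relies on the assumption that the single bit pattern 0x80000000 is what comes back as NA while other values above 2^31-1 come back as negative integers.

```r
# Read unsigned 4-byte integers from a binary file, chunk by chunk,
# returning them as doubles (exact for all values below 2^53)
read_uint32 <- function(path, chunk = 1e6, endian = "little") {
  con <- file(path, open = "rb")
  on.exit(close(con))
  pieces <- list()
  repeat {
    x <- readBin(con, what = "integer", n = chunk, size = 4L, endian = endian)
    if (length(x) == 0L) break          # end of file
    y <- as.numeric(x)
    y[is.na(x)] <- 2^31                 # 0x80000000 reads as NA_integer_
    neg <- !is.na(x) & x < 0
    y[neg] <- y[neg] + 2^32             # undo two's-complement wraparound
    pieces[[length(pieces) + 1L]] <- y
  }
  unlist(pieces)
}

# Small demonstration file: 2^31 - 1, 2^31 and 2^32 - 1, little endian
tmp <- tempfile(fileext = ".bin")
writeBin(as.raw(c(0xff, 0xff, 0xff, 0x7f,
                  0x00, 0x00, 0x00, 0x80,
                  0xff, 0xff, 0xff, 0xff)), tmp)
vals <- read_uint32(tmp, chunk = 2)
```

Doubles take 8 bytes each, so a 2GB file of 4 byte integers needs roughly twice that in memory; processing each chunk as it is read, instead of collecting all of them, avoids that cost.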
Re: [R] Bootstrap
In order to apply the bootstrap, you must resample uniformly at random, with replacement, from the independent units of measurement in your data. Assuming that these are the rows of 'data', consider the following:

    est <- function(y, x, obeta = c(1,1), verbose=FALSE) {
      n <- length(x)
      X <- cbind(rep(1, n), x)
      nbeta <- c(0,0)
      iter <- 0
      while(crossprod(obeta-nbeta) > 10^(-12)) {
        nbeta <- obeta
        eta <- X%*%nbeta
        mu <- eta
        mu1 <- 1/eta
        W <- diag(as.vector(mu1))
        Z <- X%*%nbeta+(y-mu)
        XWX <- t(X)%*%W%*%X
        XWZ <- t(X)%*%W%*%Z
        Cov <- solve(XWX)
        obeta <- Cov%*%XWZ
        iter <- iter+1
        if(verbose) cat("Iteration # and beta1= ", iter, nbeta, "\n")
      }
      return(nbeta[1,1])
    }

    boot <- function(data, reps) {
      n <- nrow(data)
      Nt <- vector('numeric', length=reps)
      for(Ncount in 1:reps) {
        #resample the rows of data
        bdata <- data[sample(1:n,n,replace=TRUE),]
        #recompute and store estimate
        Nt[Ncount] <- est(bdata[,1], bdata[,2])
      }
      return(Nt)
    }

    stem(boot(data,1000),width=60)

      The decimal point is at the |

      -3 | 4
      -2 |
      -1 | 2
      -0 | 88866555444333222111
       0 | 0022+400
       1 | 0001+203
       2 | 2224+23
       3 | 112223344455
       4 | 113344555789
       5 | 02334446677899
       6 | 1112334455778
       7 | 11235568
       8 | 001799
       9 | 0259
      10 | 1446
      11 | 19
      12 | 48
      13 | 8
      14 | 024
      15 |
      16 |
      17 | 0788
      18 |
      19 | 1

On Wed, 2011-07-20 at 18:09 -0400, Val wrote:
> Hi all,
>
> I am facing difficulty on how to use bootstrap sampling, and below is my
> example function. Read the data, use some functions and iterate to find
> the solution (i.e., until convergence is reached). I want to use the
> bootstrap approach to repeat this whole process several times (200 or 300
> times) and see the distribution of the parameter of interest.
>
> Below is a small example that resembles my problem. However, I found out
> all samples are the same. So I would appreciate your help on this case.
>
> #**
> rm(list=ls())
> xx <- read.table(textConnection("y x
> 11 5.16
> 11 4.04
> 14 3.85
> 19 5.68
> 4 1.26
> 23 7.89
> 15 4.25
> 17 3.94
> 7 2.35
> 17 4.74
> 14 5.49
> 11 4.12
> 17 5.92"), header=TRUE)
> data <- as.matrix(xx)
> closeAllConnections()
>
> Nt <- NULL
> for (Ncount in 1:100) {
>   y <- data[,1]
>   x <- data[,2]
>   n <- length(x)
>   X <- cbind(rep(1,n),x) #covariate/design matrix
>   obeta <- c(1,1)        #previous/starting values of beta
>   nbeta <- c(0,0)        #new beta
>   iter=0
>   while(crossprod(obeta-nbeta) > 10^(-12)) {
>     nbeta <- obeta
>     eta <- X%*%nbeta
>     mu <- eta
>     mu1 <- 1/eta
>     W <- diag(as.vector(mu1))
>     Z <- X%*%nbeta+(y-mu)
>     XWX <- t(X)%*%W%*%X
>     XWZ <- t(X)%*%W%*%Z
>     Cov <- solve(XWX)
>     obeta <- Cov%*%XWZ
>     iter <- iter+1
>     cat("Iteration # and beta1= ",iter, nbeta, "\n")
>   }
>   Nt[Ncount] <- nbeta[1,1]
> }
> Nt
> summary(Nt)
> #**e*
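The same resampling pattern, stripped down to its essentials, is sketched below. To keep it short, the statistic is the intercept from an ordinary least squares fit with lm(), which is a stand-in and not the iterative estimator from the thread; boot_stat() and ols_intercept() are made-up names, and the data are the 13 rows from the original post.

```r
set.seed(1)  # for a repeatable illustration

dat <- data.frame(
  y = c(11, 11, 14, 19, 4, 23, 15, 17, 7, 17, 14, 11, 17),
  x = c(5.16, 4.04, 3.85, 5.68, 1.26, 7.89, 4.25, 3.94, 2.35,
        4.74, 5.49, 4.12, 5.92)
)

# Generic nonparametric bootstrap over the rows of a data frame:
# each replicate draws n rows with replacement and recomputes the statistic
boot_stat <- function(data, stat, reps) {
  n <- nrow(data)
  vapply(seq_len(reps), function(i) {
    rows <- sample.int(n, n, replace = TRUE)
    stat(data[rows, , drop = FALSE])
  }, numeric(1))
}

# Stand-in statistic: OLS intercept (swap in any function of the data)
ols_intercept <- function(d) unname(coef(lm(y ~ x, data = d))[1])

b <- boot_stat(dat, ols_intercept, reps = 200)
quantile(b, c(0.025, 0.975))  # simple percentile interval
```

Passing the statistic as a function keeps the resampling loop separate from the estimator, so the loop does not need to change when the estimator does.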
Re: [R] How to capture console output in a numeric format
Ravi,

Consider using an environment (i.e. a 'reference' object) to store the results, avoiding string manipulation, and the potential for loss of precision:

    fr <- function(x, env) { ## Rosenbrock Banana function
      x1 <- x[1]
      x2 <- x[2]
      f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
      if(exists('fout', env))
        fout <- rbind(get('fout', env), c(x1, x2, f))
      else
        fout <- c(x1=x1, x2=x2, f=f)
      assign('fout', fout, env)
      f
    }

    out <- new.env()
    ans <- optim(c(-1.2, 1), fr, env=out)
    out$fout

Best,
Matt

On Fri, 2011-06-24 at 15:10 +, Ravi Varadhan wrote:
> Thank you very much, Jim. That works!
>
> I did know that I could process the character strings using regex, but
> was also wondering if there was a direct way to get this.
>
> Suppose, in the current example, I would like to obtain a 3-column matrix
> that contains the parameters and the function value:
>
> fr <- function(x) { ## Rosenbrock Banana function
>   on.exit(print(cbind(x1, x2, f)))
>   x1 <- x[1]
>   x2 <- x[2]
>   f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
>   f
> }
>
> fvals <- capture.output(ans <- optim(c(-1.2,1), fr))
>
> Now, I need to tweak your solution to get the 3-column matrix. It would
> be nice if there was a more direct way to get the numerical output,
> perhaps a numeric option in capture.output().
>
> Best,
> Ravi.
>
> ---
> Ravi Varadhan, Ph.D.
> Assistant Professor,
> Division of Geriatric Medicine and Gerontology
> School of Medicine
> Johns Hopkins University
> Ph. (410) 502-2619
> email: rvarad...@jhmi.edu
>
> -----Original Message-----
> From: jim holtman [mailto:jholt...@gmail.com]
> Sent: Friday, June 24, 2011 10:48 AM
> To: Ravi Varadhan
> Cc: r-help@r-project.org
> Subject: Re: [R] How to capture console output in a numeric format
>
> try this:
>
>   > fr <- function(x) { ## Rosenbrock Banana function
>   +   on.exit(print(f))
>   +   x1 <- x[1]
>   +   x2 <- x[2]
>   +   f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
>   +   f
>   + }
>   > fvals <- capture.output(ans <- optim(c(-1.2,1), fr))
>   > # convert to numeric
>   > fvals <- as.numeric(sub("^.* ", "", fvals))
>   > fvals
>    [1] 24.20 7.095296 15.08 4.541696
>    [5] 6.029216 4.456256 8.879936 7.777856
>    [9] 4.728125 5.167901 4.21 4.437670
>   [13] 4.178989 4.326023 4.070813 4.221489
>   [17] 4.039810 4.896359 4.009379 4.077130
>   [21] 4.020798 3.993600 4.024586 4.117625
>   [25] 3.993115 3.976081 3.971089 4.023905
>   [29] 3.980807 3.952577 3.932179 3.935345
>
> On Fri, Jun 24, 2011 at 10:39 AM, Ravi Varadhan rvarad...@jhmi.edu wrote:
>> Hi,
>>
>> I would like to know how to capture the console output from running an
>> algorithm for further analysis. I can capture this using
>> capture.output() but that yields a character vector. I would like to
>> extract the actual numeric values. Here is an example of what I am
>> trying to do.
>>
>> fr <- function(x) { ## Rosenbrock Banana function
>>   on.exit(print(f))
>>   x1 <- x[1]
>>   x2 <- x[2]
>>   f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
>>   f
>> }
>>
>> fvals <- capture.output(ans <- optim(c(-1.2,1), fr))
>>
>> Now, `fvals' contains character elements, but I would like to obtain the
>> actual numerical values. How can I do this?
>>
>> Thanks very much for any suggestions.
>>
>> Best,
>> Ravi.

-- 
Matthew S. Shotwell
Assistant Professor, Department of Biostatistics
School of Medicine, Vanderbilt University
1161 21st Ave. S2323 MCN Office CC2102L
Nashville, TN 37232-2158
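For completeness, the capture.output() route that Ravi asked about can also yield a 3-column numeric matrix, if each call prints one whitespace-separated record and the captured lines are parsed with read.table(text = ...). This is a sketch with made-up object names (lines_out, trace); it prints 17 significant digits to limit the loss of precision that the text round trip otherwise causes, which the environment approach avoids entirely.

```r
fr <- function(x) { ## Rosenbrock Banana function
  # On exit, print one record per call: x1, x2 and the function value
  on.exit(cat(format(c(x[1], x[2], f), digits = 17), "\n"))
  f <- 100 * (x[2] - x[1] * x[1])^2 + (1 - x[1])^2
  f
}

# Each function evaluation contributes one line of captured output
lines_out <- capture.output(ans <- optim(c(-1.2, 1), fr))

# Parse the captured text back into numbers
trace <- as.matrix(read.table(text = lines_out,
                              col.names = c("x1", "x2", "f")))
head(trace, 3)
```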
Re: [R] How to capture console output in a numeric format
On Fri, 2011-06-24 at 12:09 -0400, David Winsemius wrote:
> On Jun 24, 2011, at 11:27 AM, Matt Shotwell wrote:
>
>> Ravi,
>>
>> Consider using an environment (i.e. a 'reference' object) to store the
>> results, avoiding string manipulation, and the potential for loss of
>> precision:
>>
>> fr <- function(x, env) { ## Rosenbrock Banana function
>>   x1 <- x[1]
>>   x2 <- x[2]
>>   f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
>>   if(exists('fout', env))
>>     fout <- rbind(get('fout', env), c(x1, x2, f))
>
> So _that's_ what a reference object is?

Well, environments have 'pass-by-reference' behavior. That is, when they are passed to a function, modifications to the environment persist outside the function call. This is distinct from the Reference class (?methods::ReferenceClasses). But there are similar concepts. The methods of a reference class can modify the class fields in a 'by-reference' fashion. However, the fields need not be passed to a method.

> This seems to give the same results in this example. Am I committing any
> sins by sneaking around the get()?
>
> if(exists('fout', env))
>   fout <- rbind(env[['fout']], c(x1, x2, f)) # seems more direct

'env$fout' works here too.

> Thinking I also might be able to avoid the later assign(), I tried these
> without success.
>
> fr <- function(x, env) { ## Rosenbrock Banana function
>   x1 <- x[1]
>   x2 <- x[2]
>   f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
>   if(exists('fout', env))
>     env[['fout']] <- rbind(env[['fout']], c(x1, x2, f))
>   else
>     fout <- c(x1=x1, x2=x2, f=f)
>   f
> }

This would work with 'env$fout <- c(x1=x1, x2=x2, f=f)' following the 'else'. As written, the 'else' branch assigns to a local variable, so 'fout' is never created in the environment. Hence, David's version might look like this:

    fr <- function(x, env) { ## Rosenbrock Banana function
      x1 <- x[1]
      x2 <- x[2]
      f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
      if(exists('fout', env))
        env$fout <- rbind(env$fout, c(x1, x2, f))
      else
        env$fout <- c(x1=x1, x2=x2, f=f)
      f
    }

    out <- new.env()
    ans <- optim(c(-1.2, 1), fr, env=out)
    out$fout

-Matt

> out <- new.env()
> ans <- optim(c(-1.2, 1), fr, env=out)
> out$fout
> # NULL
>
> Is there no '[[<-' for environments? (Also tried '<<-' but I know that is
> sinful/ )
>
> --
> David.
>
>>   else
>>     fout <- c(x1=x1, x2=x2, f=f)
>>   assign('fout', fout, env)
>>   f
>> }
>>
>> out <- new.env()
>> ans <- optim(c(-1.2, 1), fr, env=out)
>> out$fout
>>
>> Best,
>> Matt
>>
>> [...]
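The 'pass-by-reference' behavior discussed in this exchange can be seen in isolation by comparing a list with an environment. The sketch below uses made-up helper names (add_to_list, add_to_env) and base R only.

```r
# A list is copied when modified inside a function,
# so the caller's list is left unchanged
add_to_list <- function(l) {
  l$count <- l$count + 1
  invisible(NULL)
}

# An environment is not copied: the modification is visible to the caller
add_to_env <- function(e) {
  e$count <- e$count + 1
  invisible(NULL)
}

l <- list(count = 0)
e <- new.env()
e$count <- 0

add_to_list(l)
add_to_env(e)

l$count   # unchanged by add_to_list()
e$count   # incremented by add_to_env()

# Both `$<-` and `[[<-` are available for environments
e[["count"]] <- e[["count"]] + 1
```

This is why assigning to env$fout (or env[['fout']]) in both branches of the 'if' is enough, with no call to assign() needed.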
Re: [R] Elbow criterion
On Mon, 2011-06-20 at 13:38 +0200, Dominik P.H. Kalisch wrote:
> Hi,
>
> I would like to cluster a dataset with the ward algorithm.

I'm assuming that this refers to the agglomerative partitioning method [1]. That is, the number of clusters is selected according to the data partition that is sequentially optimal with respect to an `objective function'.

In order to apply the elbow criterion, it should be possible to optimize over subsets of all possible data partitions where the number of clusters is fixed. Although the Ward method yields a sequence of data partitions with decreasing cluster sizes, there is no guarantee that _any_ of these partitions are optimal (except sequentially, of course). To apply the elbow method post hoc seems dubious, but maybe no more so than the Ward method itself.

There are clustering methods that optimize the data partition (w.r.t. a likelihood/posterior) with a fixed number of clusters, for instance, those based on finite mixture models. The elbow principle and method seem more valid in this context. See the R package 'mclust', and the CRAN task view for cluster analysis: http://cran.r-project.org/web/views/Cluster.html

> That works fine. But I can't find a method to plot the structure chart
> to estimate the elbow criterion for the number of clusters.
> Can someone tell me how I can do it?
>
> Thanks for your help.
> Dominik

[1] Ward, J. H. (1963), “Hierarchical Grouping to Optimize an Objective Function,” Journal of the American Statistical Association, 58, 236–244.

-- 
Matthew S. Shotwell
Assistant Professor, Department of Biostatistics
School of Medicine, Vanderbilt University
1161 21st Ave. S2323 MCN Office CC2102L
Nashville, TN 37232-2158
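For the plot Dominik asked about, one common choice of 'objective function' is the total within-cluster sum of squares, computed for each cut of the Ward tree and plotted against the number of clusters. The sketch below is illustrative only: it assumes the built-in USArrests data as a stand-in for the poster's dataset, uses hclust() with method = "ward.D2" (available in current versions of R), and the names hc, wss and max_k are made up. The caveats above about reading an elbow off a sequentially built tree still apply.

```r
x <- scale(USArrests)                      # stand-in data, standardized
hc <- hclust(dist(x), method = "ward.D2")  # Ward agglomerative clustering

max_k <- 8
wss <- sapply(seq_len(max_k), function(k) {
  cl <- cutree(hc, k = k)                  # cluster labels for k clusters
  # Sum over clusters of squared distances to the cluster centroid
  sum(sapply(unique(cl), function(g) {
    xg <- x[cl == g, , drop = FALSE]
    sum(sweep(xg, 2, colMeans(xg))^2)
  }))
})

plot(seq_len(max_k), wss, type = "b",
     xlab = "Number of clusters",
     ylab = "Total within-cluster sum of squares")
```

Because the cuts of a hierarchical tree are nested, this curve can only decrease as the number of clusters grows; the 'elbow' is the point where the decrease levels off.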
Re: [R] Can we prepare a questionaire in R
As Mike had written, there are frameworks for web development with R. RApache http://www.rapache.net is one. Also, see the R package Rook: http://cran.r-project.org/web/packages/Rook/index.html .

On Wed, 2011-06-08 at 17:26 +0530, amrita gs wrote:
> How can we create HTML forms in R

Wouldn't you rather create HTML forms in HTML? See the links above to use R for server-side scripting, for example, to receive form data from a web browser.
Re: [R] Question about curve function
On Tue, 2011-06-07 at 16:17 +0200, Uwe Ligges wrote:
> On 07.06.2011 11:57, peter dalgaard wrote:
>> On Jun 6, 2011, at 11:22 , Prof Brian Ripley wrote:
>>> As a further example of the trickiness, the function method of plot() relies on curve(x, ...) being a request to plot the function x(x) against x. I've added a comment to that effect to the help page.
>>
>> Ouch. This springs to mind:
>>
>> fortune(106)
>>
>> If the answer is parse() you should usually rethink the question.
>>    -- Thomas Lumley
>>       R-help (February 2005)
>>
>> but curve() predates that insight by half a decade or more. It could probably do with a redesign, if anyone is up to it.
>>
>> By the way, it really does work if the 2nd arg is an expression object (as opposed to an expression evaluating to an expression object):
>>
>> do.call(curve, list(expression(x)))
>>
>> or
>>
>> cl <- quote(curve(x))
>> cl[[2]] <- expression(x)
>> eval(cl)
>>
>> (The trouble with nonstandard evaluation is that it doesn't follow standard evaluation rules...)
>
> If this is not already a fortune, I will add it.

And one more for Uwe's principle: when discontent, circumvent! :)

> Which is why I usually circumvent curve(). It is typically faster to just evaluate a function at positions x and plot it rather than thinking minutes about how curve() expects its arguments.
>
> Uwe
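The circumvention Uwe describes can be sketched in a few lines. The grid, the range, and the variable names below are arbitrary choices for illustration, not anything prescribed in the thread.

```r
# Evaluate the function on a grid and plot it directly, bypassing curve().
f  <- function(x) x                   # the identity that trips up curve(x)
xs <- seq(0, 1, length.out = 101)     # grid of evaluation points
ys <- f(xs)
plot(xs, ys, type = "l", xlab = "x", ylab = "f(x)")
```

Nothing here depends on nonstandard evaluation, so f can be any vectorized function.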
Re: [R] Question about curve function
I think there is trouble because expr in curve(expr) may be the name of a function, and it's ambiguous whether 'x' should be interpreted as a mathematical expression involving x, or the name of a function. Here are some examples that work:

  curve(I(x))
  curve(1*x)

On Sun, 2011-06-05 at 12:07 -0500, Abhilash Balakrishnan wrote:
> Dear Sirs,
>
> I am a new user of the R package. When I try to use the curve function it confuses me.
>
> curve(x^2)
>
> Works fine.
>
> curve(x)
>
> Makes a complaint I don't understand. Why is x^2 valid and x is not? I check the documentation of curve, and it says the first argument must be an expression containing x.
>
> expression(x)
>
> Is an expression containing x.
>
> curve(expression(x))
>
> Makes a different complaint and mentions different lengths of x and y (but I use no y here).
>
> I understand that plotting the function y(x) = x is rather silly, but I want to know what I am doing wrong, for the sake of my understanding of how R works.
>
> Thank you for support.
> Abhilash B.
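A small sketch of the three cases, assuming a recent R in which curve() invisibly returns the points it plotted. The object names r_fun, r_expr and r_bare are made up for the example, and no object called x is assumed to exist in the workspace.

```r
# A bare name is taken to be a function name, so a real function works:
r_fun  <- curve(identity, from = 0, to = 1, n = 11)

# A call that mentions x is taken to be an expression in x:
r_expr <- curve(1 * x, from = 0, to = 1, n = 11)

# A bare x is also a name, so curve() goes looking for a function called x
# and fails; try() keeps the error from stopping the script.
r_bare <- try(curve(x, from = 0, to = 1), silent = TRUE)
inherits(r_bare, "try-error")
```

Both working calls draw the same line, which is why wrapping x in any trivial call (I(x), 1*x, (x)) is enough to sidestep the ambiguity.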
Re: [R] creating a vector from a file
On Tue, 2011-05-31 at 15:36 +0200, heimat los wrote:
> Hello all,
> I am new to R and my question should be trivial. I need to create a word cloud from a txt file containing the words and their occurrence number. For that purpose I am using the snippets package [1]. As can be seen at the bottom of the link, first I have to create a vector (is it right that words is a vector?) like below.
>
> words <- c(apple=10, pie=14, orange=5, fruit=4)
>
> My problem is to do the same thing but create the vector from a file which would contain words and their occurrence number. I would be very happy if you could give me some hints.

How is the file formatted? Can you provide a small example?

> Moreover, to understand the format of the file to be inserted I write the vector words to a file.
>
> write(words, file="words.txt")
>
> However, the file words.txt contains only the values but not the names (apple, pie etc.).
>
> $ cat words.txt
> 10 14 5 4
>
> It seems that I have to understand more about the data types in R.
>
> Thanks.
> PH
>
> [1] http://www.rforge.net/doc/packages/snippets/cloud.html
Re: [R] creating a vector from a file
On Tue, 2011-05-31 at 16:19 +0200, heimat los wrote:
> On Tue, May 31, 2011 at 4:12 PM, Matt Shotwell m...@biostatmatt.com wrote:
>> On Tue, 2011-05-31 at 15:36 +0200, heimat los wrote:
>>> Hello all,
>>> I am new to R and my question should be trivial. I need to create a word cloud from a txt file containing the words and their occurrence number. For that purpose I am using the snippets package [1]. As can be seen at the bottom of the link, first I have to create a vector (is it right that words is a vector?) like below.
>>>
>>> words <- c(apple=10, pie=14, orange=5, fruit=4)
>>>
>>> My problem is to do the same thing but create the vector from a file which would contain words and their occurrence number. I would be very happy if you could give me some hints.
>>
>> How is the file formatted? Can you provide a small example?
>
> The file format is
>
> video tape=8
> object recognition=45
> object detection=23
> vhs tape=2
>
> But I can change it if needed with bash scripting.

A CSV might be more universal, but this will do.

> Regards

OK. Save the above as 'words.txt', then from the R prompt:

  words.df <- read.table("words.txt", sep="=")
  words.vec <- words.df$V2
  names(words.vec) <- words.df$V1

Then use words.vec with the snippets::cloud function. I wasn't able to install the snippets package and test the cloud function, because I am still using R 2.13.0-alpha.

read.table returns what R calls a 'data frame'; basically a collection of records over some number of fields. It's like a matrix but different, since fields may take values of different types. In the example above, the data frame returned by read.table has two fields named 'V1' and 'V2', respectively. The R expression 'words.df$V2' references the 'V2' field of words.df, which is a vector. The last expression sets names for words.vec, by referencing the 'V1' field of words.df.

>>> Moreover, to understand the format of the file to be inserted I write the vector words to a file.
>>>
>>> write(words, file="words.txt")
>>>
>>> However, the file words.txt contains only the values but not the names (apple, pie etc.).
>>>
>>> $ cat words.txt
>>> 10 14 5 4
>>>
>>> It seems that I have to understand more about the data types in R.
>>>
>>> Thanks.
>>> PH
>>>
>>> [1] http://www.rforge.net/doc/packages/snippets/cloud.html
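The steps above can be tried end to end with the sketch below. It is self-contained only for illustration: the file is recreated in a temporary location, the column names word and count are my own choice (the reply uses the defaults V1 and V2), and the extra read.table arguments are assumptions about what real phrases might contain. The second half shows one way to write the names back out, since write() drops them.

```r
# Recreate the poster's file in a temporary location (illustration only).
path <- tempfile(fileext = ".txt")
writeLines(c("video tape=8", "object recognition=45",
             "object detection=23", "vhs tape=2"), path)

# Read "word=count" records. Quoting and comments are disabled so that
# apostrophes or '#' inside a phrase are not misinterpreted.
words.df <- read.table(path, sep = "=", quote = "", comment.char = "",
                       col.names = c("word", "count"),
                       stringsAsFactors = FALSE)

# Named vector, the same shape as c(apple=10, pie=14, ...).
words.vec <- words.df$count
names(words.vec) <- words.df$word
words.vec

# write() drops the names; writing names and values side by side keeps them.
out <- tempfile(fileext = ".txt")
write.table(data.frame(names(words.vec), words.vec), out, sep = "=",
            quote = FALSE, row.names = FALSE, col.names = FALSE)
readLines(out)
```

The round trip reproduces the input file, which is a quick check that nothing was lost on the way through the data frame.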
Re: [R] blank space escape sequence in R?
You can embed hex escapes in strings (except \x00). The value(s) that you embed will depend on the character encoding used on your platform. If this is UTF-8, or some other ASCII-compatible encoding, \x20 will work:

  > "foo\x20bar"
  [1] "foo bar"

For other locales, you might try charToRaw(" ") to see the binary (hex) representation for the space character on your platform, and substitute this sequence instead.

On Mon, 2011-04-25 at 15:01 +0200, Mark Heckmann wrote:
> Is there a blank space escape sequence in R, i.e. something like \sp etc. to produce a blank space?
>
> TIA
> Mark
> –––
> Mark Heckmann
> Blog: www.markheckmann.de
> R-Blog: http://ryouready.wordpress.com
Re: [R] blank space escape sequence in R?
I may have misread your original email. Whether you use a hex escape or a space character, the resulting string in memory is identical:

  > identical("a\x20b", "a b")
  [1] TRUE

But, if you were to read a file containing the six characters a\x20b (say with readLines), then the six characters would be read into memory, and printed like this: "a\\x20b". That is, not with a space character substituted for \x20. So, now I'm not sure this is a solution.

On Mon, 2011-04-25 at 12:24 -0500, Matt Shotwell wrote:
> You can embed hex escapes in strings (except \x00). The value(s) that you embed will depend on the character encoding used on your platform. If this is UTF-8, or some other ASCII-compatible encoding, \x20 will work:
>
> > "foo\x20bar"
> [1] "foo bar"
>
> For other locales, you might try charToRaw(" ") to see the binary (hex) representation for the space character on your platform, and substitute this sequence instead.
>
> On Mon, 2011-04-25 at 15:01 +0200, Mark Heckmann wrote:
>> Is there a blank space escape sequence in R, i.e. something like \sp etc. to produce a blank space?
>>
>> TIA
>> Mark
>> –––
>> Mark Heckmann
>> Blog: www.markheckmann.de
>> R-Blog: http://ryouready.wordpress.com
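The distinction between an escape in source code and the same characters in a file can be checked with a short sketch. The temporary file and the variable names are there only for the demonstration.

```r
# The escape is resolved by the parser, so both literals are the same string.
identical("a\x20b", "a b")

# The byte behind a space in the current encoding (20 in ASCII-compatible ones).
charToRaw(" ")

# A file holding the six literal characters  a \ x 2 0 b  is read verbatim.
path <- tempfile()
writeLines("a\\x20b", path)   # "\\" in source writes one backslash to the file
line <- readLines(path)
nchar(line)                   # six characters, no space substituted
```

In other words, escapes are a feature of R's parser for string literals, not something applied to text that arrives from a file.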
Re: [R] Converting 16-bit to 8-bit encoding?
On 04/21/2011 10:36 AM, Brian Buma wrote:
> Hello all-
>
> I have a question related to encoding. I'm using a separate program which takes either 16 bit or 8 bit (flat binary files) as inputs (they are raster satellite imagery and the associated quality files), but can't handle both at the same time. Problem is the quality and the image come in different formats (quality: 8 bit, image: 16 bit). I need to switch the encoding on the

I think some more detail about these files is necessary. What do these 16/8 bit quantities represent? Are these files just a sequence of such quantities, or is there meta information (i.e. image dimension)?

> quality files to 16 bit, without altering anything else (they are img files right now). I imagine this is a fairly simple process, but I haven't been

Does 'img files' indicate that these files are formatted according to a standard? Finally, are you using some R code to manipulate these files? Have an example, including data?

> able to find a package or anything which can tell me how to do it; perhaps I'm searching the wrong terms, but I did look. Are there any methods to do this quickly? Ideally, the solution would involve reading in a list of files and replacing the original with the new, 16 bit version, as I have over 300 files to convert. I hope that's clear.
>
> Thanks in advance!

--
Matthew S Shotwell
Assistant Professor
School of Medicine
Department of Biostatistics
Vanderbilt University
Re: [R] Converting 16-bit to 8-bit encoding?
OK. I'm going to copy this back to R-help too. With R, we can convert a file of 8-bit integers to 16-bit integers like so:

  # Create a test file of 8-bit integers:
  con <- file("test.8", "wb")
  writeBin(sample(-1L:4L, 1024, TRUE), con, size=1)
  close(con)

  # Convert test.8 to test.16
  icon <- file("test.8", "rb")
  ocon <- file("test.16", "wb")
  while(length(dat <- readBin(icon, "integer", 1024, size=1)) > 0)
      writeBin(dat, ocon, size=2)
  close(icon)
  close(ocon)

This assumes (without considering a more formal description of the format) that the file and your computing platform agree on how multi-byte signed integers are represented. Hope that will get you going.

On 04/21/2011 11:02 AM, Brian Buma wrote:
> Apologies. The 8-bit file (the one that needs to be converted) is just a series of integers, -1 to 4, which is no doubt why they are encoded in 8 bit. They don't need to be changed numerically, just put in a 16-bit encoding. No meta info, headerless. All the data is MODIS satellite imagery.
>
> I have been using the raster program to visualize things, and processing (when I get that far) will be done in that program mainly. I've used that program on a different project, and it seemed to work well. The actual program that can't handle two different inputs is Timesat, a phenology program (not R). I was thinking that R could probably do this conversion quick and easy (fairly), but haven't figured out how to yet.
>
> As an example, I have an NDVI file (flat binary, 16-bit encoding), so a string of numbers, 4450, 4650, etc. The associated quality file is another string, 1,1,2,1,0, etc. It's encoded as an 8-bit file. Conceptually, all it needs (I think) is to be read in and resaved in the less memory-efficient 16-bit format.
>
> Thanks! Sorry if the explanation isn't clear.
>
> --
> Brian Buma
> PhD Candidate
> Ecology and Evolutionary Biology / CIRES
> University of Colorado, Boulder
> brian.b...@colorado.edu

--
Matthew S Shotwell
Assistant Professor
School of Medicine
Department of Biostatistics
Vanderbilt University
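Since the original request was for some 300 files, the loop above could be wrapped in a function and applied to a list of paths. The following is a sketch under stated assumptions, not a tested MODIS workflow: the function name convert_8_to_16 and its arguments are hypothetical, the byte order of the output defaults to that of the machine running R (Timesat may expect something else), and the demonstration uses a temporary file rather than real imagery.

```r
# Hypothetical helper: copy a headerless file of signed 8-bit integers into
# a file of signed 16-bit integers, reading in chunks to bound memory use.
convert_8_to_16 <- function(infile, outfile, chunk = 4096L,
                            endian = .Platform$endian) {
  icon <- file(infile, "rb")
  on.exit(close(icon), add = TRUE)
  ocon <- file(outfile, "wb")
  on.exit(close(ocon), add = TRUE)
  repeat {
    dat <- readBin(icon, "integer", n = chunk, size = 1, signed = TRUE)
    if (length(dat) == 0L) break          # end of file reached
    writeBin(dat, ocon, size = 2, endian = endian)
  }
  invisible(outfile)
}

# Demonstration on a temporary stand-in for a quality file.
src  <- tempfile(fileext = ".img")
vals <- sample(-1L:4L, 10000, replace = TRUE)
writeBin(vals, src, size = 1)

dst <- paste0(src, ".16")
convert_8_to_16(src, dst)

# Read the result back as 16-bit integers to confirm nothing changed.
back <- readBin(dst, "integer", n = length(vals) + 1L, size = 2)
identical(back, vals)
```

For a directory of files, something like for (f in list.files(dir, pattern = "\\.img$", full.names = TRUE)) with a file.rename() of the converted copy over the original would do the batch step; dir here is a placeholder, and keeping backups until the output has been checked in Timesat seems prudent.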
Re: [R] print.raw - but convert ASCII?
On Tue, 2011-04-19 at 03:14 -0400, Duncan Murdoch wrote:
> On 11-04-18 9:51 PM, Matt Shotwell wrote:
>> Does anyone know if there is a simple way to print raw vectors, such that ASCII characters are printed for bytes in the ASCII range, and their hex representation otherwise? rawToChar doesn't work when we have something like c(0x00, 0x00, 0x44, 0x00).
>
> Do you really need hex? rawToChar(x, multiple=TRUE) comes close, but displays using octal or symbolic escapes, e.g.

No, but I've almost learned to count efficiently in hex. :)

>  [1] ""     "\001" "\002" "\003" "\004" "\005" "\006" "\a"   "\b"   "\t"   "\n"
> [12] "\v"   "\f"   "\r"   "\016" "\017" "\020" "\021" "\022" "\023" "\024" "\025"
> [23] "\026" "\027" "\030" "\031" "\032" "\033" "\034" "\035" "\036" "\037" " "
> [34] "!"    "\""   "#"    "$"    "%"    "&"    "'"    "("    ")"    "*"    "+"
>
> If you really do want hex, then you'll need something like
>
> ifelse( x < 32 | x >= 127, as.character(x), rawToChar(x, multiple=TRUE))

That does it. Thanks.

-Matt

> Duncan Murdoch
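Duncan's suggestion can be packaged as a small function. This is one possible sketch: the name raw_to_mixed is made up, and the printable range is taken to be 0x20 through 0x7e, matching the cutoffs in his ifelse().

```r
# Hypothetical helper: printable ASCII bytes become characters, all other
# bytes keep their two-digit hex representation.
raw_to_mixed <- function(x) {
  stopifnot(is.raw(x))
  code <- as.integer(x)
  printable <- code >= 32L & code < 127L
  out <- as.character(x)                 # hex for every byte, e.g. "00"
  out[printable] <- rawToChar(x[printable], multiple = TRUE)
  out
}

raw_to_mixed(as.raw(c(0x00, 0x00, 0x44, 0x00)))
```

For the example vector from the original question this gives "00" "00" "D" "00"; noquote() around the result would drop the quotation marks when printing.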
[R] print.raw - but convert ASCII?
Does anyone know if there is a simple way to print raw vectors, such that ASCII characters are printed for bytes in the ASCII range, and their hex representation otherwise? rawToChar doesn't work when we have something like c(0x00, 0x00, 0x44, 0x00).

-Matt
Re: [R] integer and floating-point storage
Hi Mike,

There are some facilities for storing and manipulating small (2 bit) integers. See here: http://cran.r-project.org/web/packages/ff/index.html

-Matt

On 04/14/2011 01:20 PM, Mike Miller wrote:
> I note that current implementations of R use 32-bit integers for integer vectors, but I am working with large arrays that contain integers from 0 to 3, so they could be stored as unsigned 8-bit integers. Can R do this? (FYI -- This is for storing minor-allele counts for genetic studies. There are 0, 1 or 2 minor alleles and 3 would represent missing.)
>
> It is theoretically possible to store such data with four integers per byte. This is what PLINK (GPL license) does in its binary (.bed) pedigree format:
>
> http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml#ped
>
> That might be too much to hope for. ;-)
>
> I think that the R system uses double-precision floating point numbers by default. When I impute minor-allele counts, I get posterior expected values ranging from 0 to 2 (called dosages). The imputation isn't very precise, so it would be fine to store such data using one or two bytes. (The values are used as regressors and small changes would have minimal impact on results.) I could use unsigned 8-bit integers (0 to 255), probably using only 0 to 254 so that 1 and 2 could be represented with perfect precision as 127/127 and 254/127 (but I would do regression on the integer values). Or I could use 16 bits, doubling memory load and improving precision. It would be convenient if R could work with half-precision floating-point numbers (binary16):
>
> http://en.wikipedia.org/wiki/Half_precision_floating-point_format
>
> Can R do that?
>
> If not, is anyone interested in working on developing some of these features in R? We have GPL code from PLINK and Octave that might help a lot.
>
> http://www.gnu.org/software/octave/doc/interpreter/Integer-Data-Types.html
>
> Best,
> Mike
>
> --
> Michael B. Miller, Ph.D.
> Bioinformatics Specialist
> Minnesota Center for Twin and Family Research
> Department of Psychology
> University of Minnesota

--
Matthew S Shotwell
Assistant Professor
School of Medicine
Department of Biostatistics
Vanderbilt University
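Short of the 2-bit storage in ff, base R's raw type already gives one byte per element, which covers the unsigned 8-bit case in the question. The sketch below is an aside rather than what the reply recommends; the variable names and the simulated genotypes are invented for illustration.

```r
# Minor-allele counts coded 0, 1, 2, with 3 for missing (as in the question).
set.seed(1)
geno_int <- sample(0:3, 1e5, replace = TRUE)   # ordinary 32-bit integers
geno_raw <- as.raw(geno_int)                   # one byte per element

# Compare memory footprints of the two representations.
object.size(geno_int)
object.size(geno_raw)

# Convert back for arithmetic, mapping the missing code 3 to NA.
counts <- as.integer(geno_raw)
counts[counts == 3L] <- NA_integer_
mean(counts, na.rm = TRUE)
```

The catch is that raw vectors support almost no arithmetic, so they serve as compact storage and have to be converted, ideally a block at a time, before being used as regressors.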
[R] understanding dump.frames; typo;
When a function I have stop()s, I'd like it to return its evaluation frame, but not halt execution of the script. In experimenting with this, I became confused with dump.frames. From ?dump.frames:

  If ‘dump.frames’ is installed as the error handler, execution will continue even in non-interactive sessions. See the examples for how to dump and then quit.

Suppose I save the following script to dump-test.R:

  options(error=dump.frames)
  cat("interactive:", interactive(), "\n")
  f <- function() {
      stop("dump-test-error")
      cat("execution continues within f\n")
  }
  f()
  cat("execution continues outside of f\n")
  if(exists("last.dump"))
      cat("last.dump is available\n")

From an interactive R prompt, execution is halted at 'stop':

  R> source('dump-test.R')
  interactive: TRUE
  Error in f() : dump-test-error

Using Rscript, execution continues depending on whether you source() the file with the -e flag, or pass the file as an argument.

  matt@pal ~$ Rscript dump-test.R
  interactive: FALSE
  Error in f() : dump-test-error
  execution continues outside of f
  last.dump is available

  matt@pal ~$ Rscript -e "source('dump-test.R')"
  interactive: FALSE
  Error in f() : dump-test-error
  Calls: source -> eval.with.vis -> eval.with.vis -> f

It seems that interactiveness (as tested by interactive()) doesn't come into play, yet execution does *not* always continue. What am I missing? Alternative solutions are also welcome.

-Matt

P.S. There is a typo in the help file: "The dumped object contain the call stack..." should read "The dumped object contains the call stack".

  > sessionInfo()
  R version 2.13.0 alpha (2011-03-18 r54865)
  Platform: x86_64-unknown-linux-gnu (64-bit)

  locale:
   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
   [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
   [9] LC_ADDRESS=C               LC_TELEPHONE=C
  [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

  attached base packages:
  [1] stats     graphics  grDevices utils     datasets  methods   base

  loaded via a namespace (and not attached):
  [1] tools_2.13.0

--
Matthew S Shotwell
Assistant Professor
School of Medicine
Department of Biostatistics
Vanderbilt University
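On the request for alternatives: one commonly used pattern, which is not the dump.frames mechanism discussed above, combines withCallingHandlers() and tryCatch() so that the frames are recorded before the stack unwinds and the script then carries on. The helper name capture_frames and the variable local_value are invented for this sketch.

```r
# Hypothetical helper: evaluate expr, and on error keep the call stack and
# the evaluation frames instead of halting the script.
capture_frames <- function(expr) {
  calls  <- NULL
  frames <- NULL
  value <- tryCatch(
    withCallingHandlers(expr, error = function(e) {
      # Runs before the stack unwinds, so the failing frames still exist.
      calls  <<- sys.calls()
      frames <<- as.list(sys.frames())
    }),
    error = function(e) e              # turn the error into a return value
  )
  list(value = value, calls = calls, frames = frames)
}

f <- function() {
  local_value <- 42
  stop("dump-test-error")
  cat("execution continues within f\n")
}

res <- capture_frames(f())
cat("execution continues outside of f\n")

# Locate f's frame among the captured ones and inspect its local variable.
is_f <- vapply(res$frames, function(fr)
  exists("local_value", envir = fr, inherits = FALSE), logical(1))
get("local_value", envir = res$frames[[which(is_f)[1]]])
```

Unlike options(error = ...), this does not depend on how the script was launched, because the error never reaches the top level; the cost is that each call of interest has to be wrapped explicitly.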
Re: [R] Examples of web-based Sweave use?
That's an interesting idea. I had written a long email describing a proof-of-concept, but decided to post it to the website below instead.

http://biostatmatt.com/archives/1184

Matt

On 04/04/2011 07:31 AM, carslaw wrote:
> I appreciate that this is OT, but I'd be grateful for pointers to examples of where Sweave has been used for web-based applications. In particular, examples of where reports/analyses are produced automatically through submission of data to a web server. I am mostly interested in situations where pdf reports have been produced rather than, say, a plot/table etc shown on a web page.
>
> I've had limited success finding examples on this.
>
> Many thanks.
>
> David Carslaw
>
> Environmental Research Group
> MRC-HPA Centre for Environment and Health
> King's College London
> Franklin Wilkins Building
> Stamford Street
> London SE1 9NH
>
> david.cars...@kcl.ac.uk

--
Matthew S Shotwell
Assistant Professor
School of Medicine
Department of Biostatistics
Vanderbilt University
Re: [R] library(foreign) read.spss warning
There is some information about this subtype in the PSPP source code, and for other subtypes not yet implemented by read.spss. The PSPP source code indicates that this subtype consists of "Value labels for long strings", which isn't very illuminating to me (probably because I don't use PSPP, or SPSS, though I increasingly have need to import SPSS data files). Copied below are the relevant bits.

-Matt

From (the PSPP source file) src/data/sys-file-reader.c:

  enum
    {
      /* subtypes 0-2 unknown */
      EXT_INTEGER       = 3,      /* Machine integer info. */
      EXT_FLOAT         = 4,      /* Machine floating-point info. */
      EXT_VAR_SETS      = 5,      /* Variable sets. */
      EXT_DATE          = 6,      /* DATE. */
      EXT_MRSETS        = 7,      /* Multiple response sets. */
      EXT_DATA_ENTRY    = 8,      /* SPSS Data Entry. */
      /* subtypes 9-10 unknown */
      EXT_DISPLAY       = 11,     /* Variable display parameters. */
      /* subtype 12 unknown */
      EXT_LONG_NAMES    = 13,     /* Long variable names. */
      EXT_LONG_STRINGS  = 14,     /* Long strings. */
      /* subtype 15 unknown */
      EXT_NCASES        = 16,     /* Extended number of cases. */
      EXT_FILE_ATTRS    = 17,     /* Data file attributes. */
      EXT_VAR_ATTRS     = 18,     /* Variable attributes. */
      EXT_MRSETS2       = 19,     /* Multiple response sets (extended). */
      EXT_ENCODING      = 20,     /* Character encoding. */
      EXT_LONG_LABELS   = 21      /* Value labels for long strings. */
    };

and

  static const struct extension_record_type types[] =
    {
      /* Implemented record types. */
      { EXT_INTEGER,      4, 8 },
      { EXT_FLOAT,        8, 3 },
      { EXT_MRSETS,       1, 0 },
      { EXT_DISPLAY,      4, 0 },
      { EXT_LONG_NAMES,   1, 0 },
      { EXT_LONG_STRINGS, 1, 0 },
      { EXT_NCASES,       8, 2 },
      { EXT_FILE_ATTRS,   1, 0 },
      { EXT_VAR_ATTRS,    1, 0 },
      { EXT_MRSETS2,      1, 0 },
      { EXT_ENCODING,     1, 0 },
      { EXT_LONG_LABELS,  1, 0 },

      /* Ignored record types. */
      { EXT_VAR_SETS,     0, 0 },
      { EXT_DATE,         0, 0 },
      { EXT_DATA_ENTRY,   0, 0 },
    };

On Fri, 2011-03-25 at 18:39 -0500, Robert Baer wrote:
> I got the following:
>
> library(foreign)
> swal = read.spss("swallowing.sav", to.data.frame =TRUE)
> Warning message:
> In read.spss("swallowing.sav", to.data.frame = TRUE) :
>   swallowing.sav: Unrecognized record type 7, subtype 21 encountered in system file
>
> The bulk of the data seems to read in a usable form, but I'm curious about what might be getting lost because I don't know how to translate type 7, subtype 21. I did not generate the SPSS data so I'm not certain of the version, but I'm assuming version 18 or 19.
>
> I did a quick "Find" on the PSPP manual for "Type 7" and "subtype 21" and came up dry. Any insights or clues how I might learn more?
>
> Thanks,
> Rob
>
> R.Version()
> $platform
> [1] "i386-pc-mingw32"
> $arch
> [1] "i386"
> $os
> [1] "mingw32"
> $system
> [1] "i386, mingw32"
> $status
> [1] ""
> $major
> [1] "2"
> $minor
> [1] "12.2"
> $year
> [1] "2011"
> $month
> [1] "02"
> $day
> [1] "25"
> $`svn rev`
> [1] "54585"
> $language
> [1] "R"
> $version.string
> [1] "R version 2.12.2 (2011-02-25)"
>
> --
> Robert W. Baer, Ph.D.
> Professor of Physiology
> Kirksville College of Osteopathic Medicine
> A. T. Still University of Health Sciences
> Kirksville, MO 63501
> 660-626-232 FAX 660-626-2965
Re: [R] Venn Diagram corresponding to size in R
Try here: https://stat.ethz.ch/pipermail/r-help/2003-February/029393.html

On Tue, 2011-03-08 at 20:25 -0500, Shira Rockowitz wrote:
> I was wondering if anyone could help me figure out how to make a Venn diagram in R where the circles are scaled to the size of each dataset. I have looked at the information for venn (in gplots) and vennDiagram (in limma) and I cannot seem to figure out what parameter to change. I have looked this up online and do not seem to be seeing anyone else who has posted this question or the answer to it before. I see graphs though that are purported to be made in R that are scaled like this, so I think it must be possible, although I do not know if they were made with a custom function. If I have just not been searching for this question correctly, and it has already been asked, please direct me to the earlier question.
>
> I would like to thank you all in advance for your help!
> ~Shira
Re: [R] assignment by value or reference
On 03/08/2011 07:20 AM, Xiaobo Gu wrote:
> On Wed, Sep 15, 2010 at 5:05 PM, Uwe Ligges lig...@statistik.tu-dortmund.de wrote:
>> See the R Language Definition manual. Since R knows about lazy evaluation, it is sometimes neither by reference nor by value. If you want to think binary, then "by value" fits better than "by reference".
>
> Hi,
>
> Can we think it's eventually by value?

Not always (see in-line below).

> For simple functions such as:
>
> is(df[[1]], "logical")
>
> used to test whether the first column of data frame df is of type logical, will a new vector be created and used inside the is function?

No, df[[1]] isn't copied in this case. However, if you subset an atomic vector (subset+assignment is different!), there is copying. For example:

  > df <- data.frame(x=c(FALSE,TRUE))
  > tracemem(df[[1]])
  [1] "<0x217afa8>"
  > is(df[[1]],"logical")
  [1] TRUE
  > is(df[[1]][], "logical")
  tracemem[0x217afa8 -> 0xf9d198]: ...cut...
  [1] TRUE
  > is(df[[1]][1], "logical")
  [1] TRUE

Note that tracemem doesn't catch the copying that occurs during evaluation of the last expression. As a strategy, R avoids copying when it's clearly not necessary from the perspective of the R interpreter. There are some notable cases where copying is obviously not necessary from the user perspective (e.g. contiguous subsetting), but avoiding a copy in these cases might be difficult to implement in R's parser/evaluator framework. Here's another simple exception:

  > x <- 1
  > tracemem(x)
  [1] "<0x18984b8>"
  > x <- x + 1
  tracemem[0x18984b8 -> 0x207e568]: ...cut...

> Another example,
>
> dbWriteTable(con, tablename, df)
>
> will write the content of data frame df into a database table, will a new data frame object be created and used inside the dbWriteTable function?

No, but if dbWriteTable modifies its local variable that was assigned df, then df may be copied.

> Thanks.
>
>> Uwe Ligges
>>
>> On 05.09.2010 17:19, Xiaobo Gu wrote:
>>> Hi Team,
>>>
>>> Can you please tell me the rules of assignment in R, by value or by reference.
>>>
>>> From my about 3 months of experience of part time job of R, it seems most times it is by value, especially in function parameter and return values assignment; and it is by reference when referencing container sub-objects of

This is a function call convention (i.e. passing by value), as distinguished from an assignment convention (I'm not certain they're equivalent in R). In general R functions pass by value. There are exceptions here also, notably R environments. For example:

  > f <- function(e) assign("a", 1, e)
  > e <- new.env()
  > f(e)
  > objects(e)
  [1] "a"

Under strict pass-by-value convention, e would remain unchanged. In general, assignments are by value. However, R environments are an exception; assignment is by reference:

  > r <- e
  > objects(r)
  [1] "a"
  > assign("b", 2, r)
  > objects(r)
  [1] "a" "b"
  > objects(e)
  [1] "a" "b"

In this sense, the calling/assignment convention is a property of the objects being passed/assigned. I think that is consistent with Uwe's comment above.

Best,
Matt

R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

>>> container objects, such as elements of List objects and row/column objects of DataFrame objects; but it is by value when referencing the smallest unit of element of a container object, such as cell of data frame objects.
>>>
>>> Xiaobo.Gu
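The two behaviours contrasted above can be put side by side in a short sketch that runs as-is. The names v, w, modify, e and r are arbitrary, and nothing here relies on tracemem, so the addresses printed in the reply are not reproduced.

```r
# Ordinary vectors behave as values: changing a copy leaves the original alone.
v <- c(1, 2, 3)
w <- v
w[1] <- 100
v[1]          # still 1

# The same holds for function arguments.
modify <- function(z) { z[1] <- -1; z }
modified <- modify(v)
v[1]          # still 1

# Environments behave as references: r and e are the same object.
e <- new.env()
r <- e
assign("a", 1, envir = r)
exists("a", envir = e, inherits = FALSE)
identical(r, e)
```

Whether and when R physically copies the vector is an implementation detail; what the sketch shows is only the behaviour visible to the user.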
Re: [R] Rapache ( was Developing a web crawler )
On Sun, 2011-03-06 at 08:06 -0500, Mike Marchywka wrote:

Date: Thu, 3 Mar 2011 13:04:11 -0600 From: matt.shotw...@vanderbilt.edu To: r-help@r-project.org Subject: Re: [R] Developing a web crawler / R webkit or something similar? [off topic]

On 03/03/2011 08:07 AM, Mike Marchywka wrote:

Date: Thu, 3 Mar 2011 01:22:44 -0800 From: antuj...@gmail.com To: r-help@r-project.org Subject: [R] Developing a web crawler

Hi, I wish to develop a web crawler in R. I have been using the functionalities available under the RCurl package. I am able to extract the html content of the site but I don't know how to go

In general this can be a big effort but there may be things in text processing packages you could adapt to execute html and javascript. However, I guess what I'd be looking for is something like a webkit package or other open source browser with or without an R interface. This actually may be an ideal solution for a lot of things as you get all the content handlers of at least some browser. Now that you mention it, I wonder if there are browser plugins to handle R content (I'd have to give this some thought, put a script up as a web page with mime type text/R and have it execute it in R.)

There are server-side solutions for this sort of thing. See http://rapache.net/ . Also, there was a string of messages on R-devel some years ago addressing the mime type issue; beginning here: http://tolstoy.newcastle.edu.au/R/devel/05/11/3054.html . Though I don't know whether there was a resolution. Some suggestions were text/x-R, text/x-Rd, application/x-RData.

The rapache demo looks like something I could use right away but I haven't looked into the handlers yet. I have installed rapache now on my debian system (still have config issues but I did get apache2 to restart LOL). Before I plow into this too far, how would this compare/compete with something like a PHP library for Rserve? That is the approach I had been pursuing. Thanks.
Hi Mike, If you've built and configured RApache, then the difficult plowing is over :). RApache operates at the top (HTTP) layer of the OSI stack, whereas Rserve works at the lower transport/network layer. Hence, the scope of Rserve applications is far more general. Extending Rserve to operate at the HTTP layer (via PHP) will mean more work. RApache offers high level functionality, for example, to replace PHP with R in web pages. No interface code is necessary. Here's a simple What's The Time? webpage using RApache and yarr [1] to handle the code: setContentType(text/html\n\n) html headtitleWhat's The Time?/title/head bodypre/= cat(format(Sys.time(), usetz=TRUE)) /pre/body /html Here's a live version: [2]. Interfacing PHP with Rserve in this context would be useful if installation of R and/or RApache on the web host were prohibited. A PHP/Rserve framework might also be useful in other contexts, for example, to extend PHP applications (e.g. WordPress, MediaWiki). Best, Matt [1] http://biostatmatt.com/archives/1000 [2] http://biostatmatt.com/yarr/time.yarr -Matt __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Developing a web crawler / R webkit or something similar? [off topic]
On 03/03/2011 08:07 AM, Mike Marchywka wrote:

Date: Thu, 3 Mar 2011 01:22:44 -0800 From: antuj...@gmail.com To: r-help@r-project.org Subject: [R] Developing a web crawler

Hi, I wish to develop a web crawler in R. I have been using the functionalities available under the RCurl package. I am able to extract the html content of the site but I don't know how to go

In general this can be a big effort but there may be things in text processing packages you could adapt to execute html and javascript. However, I guess what I'd be looking for is something like a webkit package or other open source browser with or without an R interface. This actually may be an ideal solution for a lot of things as you get all the content handlers of at least some browser. Now that you mention it, I wonder if there are browser plugins to handle R content (I'd have to give this some thought, put a script up as a web page with mime type text/R and have it execute it in R.)

There are server-side solutions for this sort of thing. See http://rapache.net/ . Also, there was a string of messages on R-devel some years ago addressing the mime type issue; beginning here: http://tolstoy.newcastle.edu.au/R/devel/05/11/3054.html . Though I don't know whether there was a resolution. Some suggestions were text/x-R, text/x-Rd, application/x-RData.

-Matt

about analyzing the html formatted document. I wish to know the frequency of a word in the document. I am only acquainted with analyzing data sets. So how should I go about analyzing data that is not available in table format. Few chunks of code that I wrote:

    w <- getURL("http://www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B003DZ1Y8Q/ref=dp_reviewsanchor#FullQuotes")
    write.table(w, "test.txt")
    t <- readLines(w)

readLines also didn't prove to be of any help. Any help would be highly appreciated. Thanks in advance.
-- View this message in context: http://r.789695.n4.nabble.com/Developing-a-web-crawler-tp3332993p3332993.html Sent from the R help mailing list archive at Nabble.com.

-- Matthew S Shotwell Assistant Professor School of Medicine Department of Biostatistics Vanderbilt University
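For the word-frequency part of the original question, a rough base-R sketch is given below. It is not from the thread: `word_freq` is a hypothetical helper, the sample page is invented, and the tag stripping uses crude regular expressions, so a real HTML parser would be more robust. The input is assumed to be the page source as a character vector, such as the string a download function returns.

```r
# Hypothetical helper (not from the thread): count words in HTML source.
word_freq <- function(html) {
  text <- paste(html, collapse = " ")
  # Drop script and style blocks first, then any remaining tags.
  text <- gsub("<script[^>]*>.*?</script>", " ", text,
               ignore.case = TRUE, perl = TRUE)
  text <- gsub("<style[^>]*>.*?</style>", " ", text,
               ignore.case = TRUE, perl = TRUE)
  text <- gsub("<[^>]+>", " ", text)
  # Split on anything that is not a letter or an apostrophe.
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  words <- words[nzchar(words)]
  sort(table(words), decreasing = TRUE)
}

# Invented sample input standing in for a downloaded page.
page <- c("<html><head><title>Kindle review</title></head>",
          "<body><p>The Kindle is light. The screen is sharp.</p></body></html>")
freq <- word_freq(page)
print(freq)
```

The result is an ordinary table, so the count for a single word can be looked up by name, for example `freq["kindle"]`.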
Re: [R] Robust variance estimation with rq (failure of the bootstrap?)
Jim,

Thanks for pointing me to this article. The authors argue that the bootstrap intervals for a robust estimator may not be as robust as the estimator. In this context, robustness is measured by the breakdown point, which is supposed to measure robustness to outliers. Even so, the authors found that the upper bound of a quantile bootstrap interval for the sample median was nearly as robust as the sample median. That brings some comfort in using quantile bootstrap intervals in quantile regression. Does the sandwich estimator assume that errors are independent? And a related question: Does the rq function allow the user to specify clusters/grouping among the observations?

Best, Matt

On Tue, 2011-03-01 at 05:35 -0600, James Shaw wrote:

Matt: Thanks for your prompt reply. The disparity between the bootstrap and sandwich variance estimates derived when modeling the highly skewed outcome suggests that either (A) the empirical robust variance estimator is underestimating the variance or (B) the bootstrap is breaking down. The bootstrap variance estimate of a robust location estimate is not necessarily robust; see Statistics & Probability Letters 50 (2000) 49-53. Since submitting my earlier post, I have noticed that the robust kernel variance estimate is similar to the bootstrap estimate. Under what conditions would one expect Koenker and Machado's sandwich variance estimator, which uses a local estimate of the sparsity, to fail? -- Jim

On Mon, Feb 28, 2011 at 8:59 PM, Matt Shotwell m...@biostatmatt.com wrote:

Jim, If repeated measurements on patients are correlated, then resampling all measurements independently induces an incorrect sampling distribution (= incorrect variance) on a statistic of these data. One solution, as you mention, is the block or cluster bootstrap, which preserves the correlation among repeated observations in resamples. I don't immediately see why the cluster bootstrap is unsuitable.
Beyond this, I would be concerned about *any* variance estimates that are blind to correlated observations. The bootstrap variance estimate may be larger than the asymptotic variance estimate, but that alone isn't evidence to favor one over the other. Also, I can't justify (to myself) why skew would hamper the quality of bootstrap variance estimates. I wonder how it affects the sandwich variance estimate... Best, Matt On Mon, 2011-02-28 at 17:50 -0600, James Shaw wrote: I am fitting quantile regression models using data collected from a sample of 124 patients. When modeling cross-sectional associations, I have noticed that nonparametric bootstrap estimates of the variances of parameter estimates are much greater in magnitude than the empirical Huber estimates derived using summary.rq's nid option. The outcome variable is severely skewed, and I am afraid that this may be affecting the consistency of the bootstrap variance estimates. I have read that the m out of n bootstrap can be used to overcome this problem. However, this procedure requires both the original sample (n) and the subsample (m) sizes to be large. The version implemented in rq.boot does not appear to provide any improvement over the naive bootstrap. Ultimately, I am interested in using median regression to model changes in the outcome variable over time. Summary.rq's robust variance estimator is not applicable to repeated-measures data. I question whether the block (cluster) bootstrap variance estimator, which can accommodate intraclass correlation, would perform well. Can anyone suggest alternatives for variance estimation in this situation? Regards, Jim James W. Shaw, Ph.D., Pharm.D., M.P.H. 
Assistant Professor Department of Pharmacy Administration College of Pharmacy University of Illinois at Chicago 833 South Wood Street, M/C 871, Room 266 Chicago, IL 60612 Tel.: 312-355-5666 Fax: 312-996-0868 Mobile Tel.: 215-852-3045
Re: [R] Robust variance estimation with rq (failure of the bootstrap?)
Jim, If repeated measurements on patients are correlated, then resampling all measurements independently induces an incorrect sampling distribution (= incorrect variance) on a statistic of these data. One solution, as you mention, is the block or cluster bootstrap, which preserves the correlation among repeated observations in resamples. I don't immediately see why the cluster bootstrap is unsuitable. Beyond this, I would be concerned about *any* variance estimates that are blind to correlated observations. The bootstrap variance estimate may be larger than the asymptotic variance estimate, but that alone isn't evidence to favor one over the other. Also, I can't justify (to myself) why skew would hamper the quality of bootstrap variance estimates. I wonder how it affects the sandwich variance estimate... Best, Matt On Mon, 2011-02-28 at 17:50 -0600, James Shaw wrote: I am fitting quantile regression models using data collected from a sample of 124 patients. When modeling cross-sectional associations, I have noticed that nonparametric bootstrap estimates of the variances of parameter estimates are much greater in magnitude than the empirical Huber estimates derived using summary.rq's nid option. The outcome variable is severely skewed, and I am afraid that this may be affecting the consistency of the bootstrap variance estimates. I have read that the m out of n bootstrap can be used to overcome this problem. However, this procedure requires both the original sample (n) and the subsample (m) sizes to be large. The version implemented in rq.boot does not appear to provide any improvement over the naive bootstrap. Ultimately, I am interested in using median regression to model changes in the outcome variable over time. Summary.rq's robust variance estimator is not applicable to repeated-measures data. I question whether the block (cluster) bootstrap variance estimator, which can accommodate intraclass correlation, would perform well. 
Can anyone suggest alternatives for variance estimation in this situation? Regards, Jim James W. Shaw, Ph.D., Pharm.D., M.P.H. Assistant Professor Department of Pharmacy Administration College of Pharmacy University of Illinois at Chicago 833 South Wood Street, M/C 871, Room 266 Chicago, IL 60612 Tel.: 312-355-5666 Fax: 312-996-0868 Mobile Tel.: 215-852-3045
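The cluster bootstrap mentioned above resamples whole patients rather than individual measurements, so each patient's repeated observations stay together. Below is a minimal base-R sketch under stated assumptions: `cluster_boot_var` is a hypothetical helper, the data are simulated, and the statistic is a plain sample median rather than a fitted median regression. With quantreg, the statistic function would presumably return a coefficient from an rq fit instead.

```r
# Hypothetical helper: variance of a statistic under a cluster bootstrap.
# 'id' names the column identifying clusters (here, patients).
cluster_boot_var <- function(data, id, statistic, B = 500) {
  ids <- as.character(unique(data[[id]]))
  rows_by_id <- split(seq_len(nrow(data)), data[[id]])
  est <- replicate(B, {
    draw <- sample(ids, length(ids), replace = TRUE)    # resample patients
    idx <- unlist(rows_by_id[draw], use.names = FALSE)  # keep all their rows
    statistic(data[idx, , drop = FALSE])
  })
  var(est)
}

# Simulated stand-in data: 124 patients, 3 skewed, correlated measurements each.
set.seed(42)
n_pat <- 124
n_rep <- 3
pat_effect <- rexp(n_pat)      # shared patient effect induces correlation
dat <- data.frame(id = rep(seq_len(n_pat), each = n_rep),
                  y  = rep(pat_effect, each = n_rep) + rexp(n_pat * n_rep))

v_cluster <- cluster_boot_var(dat, "id", function(d) median(d$y))
print(v_cluster)
```

Resampling rows independently instead of patients would ignore the within-patient correlation, which is the concern raised in the reply.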
Re: [R] Visualizing Points on a Sphere
That's interesting. You might also like: http://en.wikipedia.org/wiki/Von_Mises%E2%80%93Fisher_distribution

I'm not sure how to plot the wireframe sphere, but you can visualize the points by transforming to Cartesian coordinates like so:

    u <- runif(1000,0,1)
    v <- runif(1000,0,1)
    # theta is the azimuth, phi is the polar angle
    theta <- 2 * pi * u
    phi <- acos(2 * v - 1)
    x <- sin(phi) * cos(theta)
    y <- sin(phi) * sin(theta)
    z <- cos(phi)
    library(lattice)
    cloud(z ~ x + y)

-Matt

On Fri, 2011-02-25 at 14:21 +0100, Lorenzo Isella wrote:

Dear All, I need to plot some points on the surface of a sphere, but I am not sure about how to proceed to achieve this in R (or if it is suitable for this at all). In any case, I am not looking for really fancy visualizations; for instance you can consider the images between formulae 5 and 6 at http://bit.ly/hOgK9h Any suggestion is appreciated. Cheers Lorenzo
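As a sanity check on the spherical-to-Cartesian transformation, the sketch below wraps the sampling in a function and verifies that every point has unit norm. The name `runif_sphere` is hypothetical, and the convention assumed here is that theta is the azimuth and phi the polar angle, with cos(phi) uniform on [-1, 1] so that the points are uniform on the sphere.

```r
# Hypothetical helper: n points uniformly distributed on the unit sphere.
runif_sphere <- function(n) {
  theta <- 2 * pi * runif(n)       # azimuth, uniform on [0, 2*pi)
  phi   <- acos(2 * runif(n) - 1)  # polar angle; cos(phi) uniform on [-1, 1]
  cbind(x = sin(phi) * cos(theta),
        y = sin(phi) * sin(theta),
        z = cos(phi))
}

set.seed(1)
pts <- runif_sphere(1000)

# Every row should have squared norm 1, up to rounding error.
print(range(rowSums(pts^2)))
# Under uniformity the z coordinate is uniform on [-1, 1], so its mean is near 0.
print(mean(pts[, "z"]))
```

The matrix can be passed to a 3D scatter plot after conversion with `as.data.frame(pts)`.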
Re: [R] Problem with writing a file in UTF-8
Thomas,

I wasn't able to reproduce your finding. The last two characters in my 'out.txt' file were just as expected. But, I'm in a UTF-8 locale. Your locale affects the encoding of characters on your platform. If you're not in a UTF-8 locale, then characters are converted from your native encoding to UTF-8 (when you specify encoding="UTF-8"). In the process of conversion, it's possible to lose information. You can test whether there is a loss (or a change rather) when R writes these characters like so:

    # what does űŁ look like in binary (hex)?
    raw_before <- charToRaw("űŁ")
    # write 'out.txt' as before
    out <- file(description="out.txt", open="w", encoding="UTF-8")
    write(x="űŁ", file=out)
    close(con=out)
    # read in the two characters
    out <- file(description="out.txt", open="r", encoding="UTF-8")
    raw_after <- charToRaw(readChar(con=out, nchars=2))
    close(con=out)
    # compare the raw representations
    identical(raw_before, raw_after)

This test passes on my machine. But, there's also the question of whether these characters made it onto the R-help list unaltered. Also, please include the result of sessionInfo() in your subsequent messages.

Best, Matt

    > sessionInfo()
    R version 2.11.1 (2010-05-31)
    i686-pc-linux-gnu

    locale:
     [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
     [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
     [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
     [7] LC_PAPER=en_US.utf8       LC_NAME=C
     [9] LC_ADDRESS=C              LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

On Thu, 2011-02-17 at 13:54 -0800, tpklein wrote:

Hello, I am working with a data frame containing character strings with many special symbols from various European languages. When writing such character strings to a file using the UTF-8 encoding, some of them are converted in a strange way.
See the following example, run in R 2.12.1 on Windows 7:

    out <- file( description="out.txt", open="w", encoding="UTF-8")
    write( x="äöüßæűŁ", file=out )
    close( con=out )

The last two symbols in the character string are converted to uL while all other characters are not changed (which is what I want). How to explain this? Does it have something to do with my locale? And is there a way to work around this problem? -- Any help would be greatly appreciated. Thomas
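One possible way to sidestep the locale-dependent conversion, offered only as a sketch and not tested on the Windows setup described above, is to convert the string to UTF-8 inside R and write the bytes unchanged. The string is built from `\u` escapes so the script itself does not depend on the source file's encoding; `tempfile()` stands in for 'out.txt'.

```r
# The example string, written with Unicode escapes.
s <- "\u00e4\u00f6\u00fc\u00df\u00e6\u0171\u0141"
raw_before <- charToRaw(enc2utf8(s))

# Write the UTF-8 bytes as-is: no 'encoding' argument on the connection,
# and useBytes = TRUE so that R does not re-encode on output.
tmp <- tempfile(fileext = ".txt")
con <- file(tmp, open = "wb")
writeLines(enc2utf8(s), con, useBytes = TRUE)
close(con)

# Read back, declaring the bytes to be UTF-8, and compare byte for byte.
back <- readLines(tmp, encoding = "UTF-8")
raw_after <- charToRaw(back)
print(identical(raw_before, raw_after))
```

The comparison is on raw bytes, so it does not rely on how the console displays the characters.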
[R] non-ascii characters in R output
All,

I'd like to automatically output text from R to HTML. In doing this I've run into trouble with non-ascii characters, as my browser (and presumably others) does not render such characters correctly. For example, the 'fancy' single quotes associated with summary.lm are multi-byte characters on my platform. This particular problem is solved by options(useFancyQuotes=FALSE). But now I'm concerned about other non-ascii characters. As an overkill maybe, my current solution involves capture.output and iconv(..., to="ASCII//TRANSLIT"). Are there other sources of non-ascii characters? Is there a better or general solution?

Best, Matt

    > sessionInfo()
    R version 2.12.1 (2010-12-16)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    loaded via a namespace (and not attached):
    [1] tools_2.12.1

-- Matthew S Shotwell Assistant Professor School of Medicine Department of Biostatistics Vanderbilt University
Re: [R] non-ascii characters in R output
OK, looks like my web browser does render non-ascii characters output by R when it's given the encoding explicitly. This works for me: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>. So that's another solution, but not a general one.

-Matt

On Fri, 2011-02-18 at 12:47 -0600, Matt Shotwell wrote:

All, I'd like to automatically output text from R to HTML. In doing this I've run into trouble with non-ascii characters, as my browser (and presumably others) does not render such characters correctly. For example, the 'fancy' single quotes associated with summary.lm are multi-byte characters on my platform. This particular problem is solved by options(useFancyQuotes=FALSE). But now I'm concerned about other non-ascii characters. As an overkill maybe, my current solution involves capture.output and iconv(..., to="ASCII//TRANSLIT"). Are there other sources of non-ascii characters? Is there a better or general solution?

Best, Matt

    > sessionInfo()
    R version 2.12.1 (2010-12-16)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    loaded via a namespace (and not attached):
    [1] tools_2.12.1
Re: [R] non-ascii characters in R output
On Fri, 2011-02-18 at 19:50 -0500, Duncan Murdoch wrote:

On 18/02/2011 5:58 PM, Matt Shotwell wrote:

OK, looks like my web browser does render non-ascii characters output by R when it's given the encoding explicitly. This works for me: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>. So that's another solution, but not a general one.

I don't understand your final comment. What is not general about declaring how the file is encoded?

I meant that declaring UTF-8 is not generally applicable, because R doesn't always output UTF-8 (right?). For example, locales that use exotic encodings might output characters that are not interpretable where UTF-8 is assumed. The general solution, I suppose, is to automatically generate the <meta /> line with the encoding used by R.

Matt

Duncan Murdoch

-Matt

On Fri, 2011-02-18 at 12:47 -0600, Matt Shotwell wrote:

All, I'd like to automatically output text from R to HTML. In doing this I've run into trouble with non-ascii characters, as my browser (and presumably others) does not render such characters correctly. For example, the 'fancy' single quotes associated with summary.lm are multi-byte characters on my platform. This particular problem is solved by options(useFancyQuotes=FALSE). But now I'm concerned about other non-ascii characters. As an overkill maybe, my current solution involves capture.output and iconv(..., to="ASCII//TRANSLIT"). Are there other sources of non-ascii characters? Is there a better or general solution?
Best, Matt

    > sessionInfo()
    R version 2.12.1 (2010-12-16)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    loaded via a namespace (and not attached):
    [1] tools_2.12.1
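Another option that does not depend on the declared page encoding, and which was not proposed in the thread, is to replace every non-ASCII character with an HTML numeric character reference. The sketch below is only an illustration: `to_html_entities` is a hypothetical helper, and it assumes its input can be converted to valid UTF-8.

```r
# Hypothetical helper: turn non-ASCII characters into numeric HTML entities.
to_html_entities <- function(x) {
  vapply(enc2utf8(x), function(s) {
    if (is.na(s)) return(NA_character_)
    cp <- utf8ToInt(s)                        # Unicode code points
    chars <- intToUtf8(cp, multiple = TRUE)   # one string per character
    paste(ifelse(cp > 127L, sprintf("&#%d;", cp), chars), collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

# Fancy single quotes (U+2018, U+2019) around a word, as R may print them.
quoted <- "\u2018lm\u2019"
encoded <- to_html_entities(quoted)
print(encoded)
```

The output is pure ASCII, so a browser should render it the same way whatever charset the page declares.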
Re: [R] Revolution Analytics reading SAS datasets
On Thu, 2011-02-10 at 10:44 -0800, David Smith wrote: The SAS import/export feature of Revolution R Enterprise 4.2 isn't open-source, so we can't release it in open-source Revolution R Community, or to CRAN as we do with the ParallelR packages (foreach, doMC, etc.). Judging by the language of Dr. Nie's comments on the page linked below, it seems unlikely this feature is the result of a licensing agreement with SAS. Is that correct? Matt It is, though, available for download free of charge to members of the academic community (as is all of Revolution Analytics' software) from http://www.revolutionanalytics.com/downloads/ # David Smith On Wed, Feb 9, 2011 at 5:46 PM, Daniel Nordlund djnordl...@frontier.com wrote: Has anyone heard whether Revolution Analytics is going to release this capability to the R community? http://www.businesswire.com/news/home/20110201005852/en/Revolution-Analytics-Unlocks-SAS-Data Dan Daniel Nordlund Bothell, WA USA -- David M Smith da...@revolutionanalytics.com VP of Marketing, Revolution Analytics http://blog.revolutionanalytics.com Tel: +1 (650) 646-9523 (Palo Alto, CA, USA) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] clustering with finite mixture model
There are quite a few packages that work with finite mixtures, as evidenced by the descriptions here: http://cran.r-project.org/web/packages/index.html These might be useful: http://cran.r-project.org/web/packages/flexmix/index.html http://cran.r-project.org/web/packages/mclust/index.html

-Matt

On 02/02/2011 04:28 AM, karuna m wrote:

Dear R-help, I am doing clustering via finite mixture model. Please suggest some packages in R to find clusters via finite mixture model with continuous variables. And also I wish to verify the distributional properties of the mixture distributions by fitting the model with lognormal, gamma, exponentials, etc. Thanks in advance, warm regards, Ms. Karunambigai M PhD Scholar Dept. of Biostatistics NIMHANS Bangalore India

-- Matthew S Shotwell Assistant Professor School of Medicine Department of Biostatistics Vanderbilt University
Re: [R] User input in R program
Martyn Plummer's 'coda' package has some nice interactive menus. The package appears to be written entirely in R. You could start with the codamenu() function in the package source: http://cran.r-project.org/web/packages/coda/index.html

-Matt

On Fri, 2011-01-21 at 14:26 +0200, christiaan pauw wrote:

Hi Everybody, Does anyone know of documentation about different ways of obtaining user input in R? I have used readline() but I wondered if there are sophisticated packages that do things like validate answers or generate selection lists. best regards Christaan
Re: [R] Encoding problem - I fails to read Hebrew text from online
Tal,

It looks like the data you received has HTML special hex characters. That is, '&#x5E9;' is just an ASCII HTML representation of a hex character code. It's not encoded in a special manner. The trick is to replace each HTML-encoded hex character with its binary representation, that is, to decode the character. I don't know of any R function that does this, but there are web services, for example: http://www.hashemian.com/tools/html-url-encode-decode.php I decoded your file using this service and posted it on my website. You can see the difference by running:

    readLines("http://biostatmatt.com/temp/Hebrew-original", warn=FALSE)
    readLines("http://biostatmatt.com/temp/Hebrew-decoded", warn=FALSE)

The second should display the Hebrew characters correctly (it does in my terminal). The next thing to think about is how to automate this in R without using the web service... We may need to write an HTMLDecode function if there isn't one already. By the way, what's the Hebrew text in English?

Best, Matt

On Thu, 2010-12-09 at 12:21 -0500, Tal Galili wrote:

I am bumping this question in the hopes that someone might be able to advise. This Hebrew and R business is not as smooth as I had hoped... Thanks, Tal

Older message:

On Tue, Dec 7, 2010 at 2:30 PM, Tal Galili tal.gal...@gmail.com wrote:

Hello all,

    # I am trying to read the text in this URL:
    u <- "http://google.com/complete/search?output=toolbar&q=%d7%a9%d7%9c%d7%95%d7%9d"
    # By using this command:
    readLines(u)

And no matter what variation I tried, I keep getting this output:

    [1] "<?xml version=\"1.0\"?><toplevel><CompleteSuggestion><suggestion data=\"&#x5E9;&#x5DC;&#x5D5;&#x5DD;\"/> (etc...)
Instead of this output:

    <?xml version="1.0"?><toplevel><CompleteSuggestion><suggestion data="שלום"/><num_queries int="1680"/></CompleteSuggestion><CompleteSuggestion><suggestion data="שלום חנוך"/><num_queries int="232000"/></CompleteSuggestion><CompleteSuggestion><suggestion data="שלום עליכם"/> (etc)

I tried:

    readLines(u, encoding="latin1")
    readLines(u, encoding="UTF-8")

And also changing Sys.setlocale:

    Sys.setlocale("LC_ALL", "Hebrew") # must be done for Hebrew to work.
    Sys.setlocale("LC_ALL", "English") # must be done for Hebrew to work.

Are there any more options I could try to get this text properly encoded? Thanks! Tal

Contact Details: Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

-- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina
Re: [R] Encoding problem - I fails to read Hebrew text from online
Tal,

OK, let me clarify my understanding. The original and decoded file are text, encoded by UTF-8. In the original file, there are HTML `entities' that represent UTF-8 Hebrew characters. In the decoded file, the entities are converted to UTF-8 characters. The question is how to convert these entities within R. It's not the same as converting between character encodings, otherwise iconv() might offer a solution. I'll have a look around to find a solution, and I hope others will too. My first idea is to check RCurl, XML, and the related utils::URLdecode. If there really is no existing solution, I think it might be worthwhile to look at how PHP and Python do it (and maybe borrow some code :) ).

-Matt

On Thu, 2010-12-09 at 14:27 -0500, Tal Galili wrote:

Hi Matt, Thanks for having a look at this. I just spent some time looking around and couldn't find any R function to decode decimal HTML code. Do you (or someone else on the list) know how to program this sort of thing? (Is there a formula for the translation?)

p.s.: For it to work on my end I added the encoding parameter:

    readLines("http://biostatmatt.com/temp/Hebrew-decoded", warn=FALSE, encoding="UTF-8")

p.p.s.: The Hebrew word I used means peace.

Cheers, Tal

Contact Details: Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Thu, Dec 9, 2010 at 8:38 PM, Matt Shotwell shotw...@musc.edu wrote:

Tal, It looks like the data you received has HTML special hex characters. That is, '&#x5E9;' is just an ASCII HTML representation of a hex character code. It's not encoded in a special manner. The trick is to replace each HTML-encoded hex character with its binary representation, that is, to decode the character. I don't know of any R function that does this, but there are web services, for example: http://www.hashemian.com/tools/html-url-encode-decode.php I decoded your file using this service and posted it on my website.
You can see the difference by running: readLines(http://biostatmatt.com/temp/Hebrew-original;, warn=FALSE) readLines(http://biostatmatt.com/temp/Hebrew-decoded;, warn=FALSE) The second should display the Hebrew characters correctly (it does in my terminal). The next thing to think about is how to automate this in R without using the web service... We may need to write an HTMLDecode function if there isn't one already. By the way, what's the Hebrew text in English? Best, Matt On Thu, 2010-12-09 at 12:21 -0500, Tal Galili wrote: I am bumping this question in the hopes that someone might be able to advise. This Hebrew and R business is not as smooth as I had hoped... Thanks, Tal Older massage: On Tue, Dec 7, 2010 at 2:30 PM, Tal Galili tal.gal...@gmail.com wrote: Hello all, # I am trying to read the text in this URL: u - http://google.com/complete/search?output=toolbarq=%d7%a9% d7%9c%d7%95%d7%9d # By using this command: readLines(u) And no matter what variation I tried, I keep getting this output: [1] ?xml version=\1.0 \?toplevelCompleteSuggestionsuggestion data=\#x5E9;#x5DC;#x5D5;#x5DD;\/ (etc...) Instead of this output: ?xml version=1.0?toplevelCompleteSuggestionsuggestion data=שלום /num_queries int=1680//CompleteSuggestionCompleteSuggestionsuggestion data=שלום חנוך/num_queries int=232000//CompleteSuggestion CompleteSuggestionsuggestion data=שלום עליכם/ (etc) I tried: readLines(u, encoding= latin1) readLines(u, encoding= UTF-8) And also changing Sys.setlocale: Sys.setlocale(LC_ALL, Hebrew) # must be done for Hebrew to work. Sys.setlocale(LC_ALL, English) # must be done for Hebrew to work. Are there any more options I could try to get this text properly encoded? Thanks! Tal Contact
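The decoding step discussed in this thread can be sketched in base R. This is only a minimal sketch, assuming the input contains numeric character references alone (hexadecimal such as &#x5E9; or decimal such as &#1513;); the function name decode_numeric_entities is hypothetical, and named entities such as &amp; are deliberately not handled.

```r
# Hypothetical helper: replace numeric HTML character references
# (e.g. "&#x5E9;" or "&#1513;") with the characters they stand for.
decode_numeric_entities <- function(x) {
  pattern <- "&#([xX][0-9A-Fa-f]+|[0-9]+);"
  vapply(x, function(s) {
    m <- gregexpr(pattern, s, perl = TRUE)
    ents <- regmatches(s, m)[[1]]
    if (length(ents) == 0) return(s)          # nothing to decode
    codes <- sub("^&#(.*);$", "\\1", ents)    # "x5E9" or "1513"
    is_hex <- grepl("^[xX]", codes)
    cp <- integer(length(codes))              # Unicode code points
    cp[is_hex]  <- strtoi(substring(codes[is_hex], 2), 16L)
    cp[!is_hex] <- strtoi(codes[!is_hex], 10L)
    # substitute each matched entity with its UTF-8 character
    regmatches(s, m) <- list(vapply(cp, intToUtf8, character(1)))
    s
  }, character(1), USE.NAMES = FALSE)
}

decoded <- decode_numeric_entities("&#x5E9;&#x5DC;&#x5D5;&#x5DD;")
utf8ToInt(decoded)  # code points of the four decoded Hebrew letters
```

Whether the decoded text then prints correctly still depends on the terminal and locale, as noted in the thread.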
Re: [R] statistical test for comparison of two classifications (nominal)
Martin, Pardon the delayed reply. Bootstrap methods have been around for some time (late seventies?), but their popularity seems to have exploded in correspondence with computing technology. You should be able to find more information in most modern books on statistical inference, but here is a brief: The bootstrap is a method often used to establish an empirical null distribution for a test statistic when traditional (analytical) methods fail. The bootstrap works by imposing a null hypothesis on the observed data, followed by re-sampling with replacement. The test statistic is computed at each re-sample and used to build up an empirical null distribution. The idea is to impose the null hypothesis while preserving variability in the observed data, and thus the test statistic. For example, suppose we observe some continuous scalar data and hypothesize that the sample was observed from a population with mean zero. We can impose this hypothesis by subtracting the sample mean from each observation. Re-samples from these transformed data are treated as having been observed under the null hypothesis. In the case of classification and partitioning, the difficulty is formulating a meaningful null hypothesis about the collection of classifications, and imposing the null hypothesis in a bootstrap sampling scheme. -Matt On Wed, 2010-11-17 at 10:01 -0500, Martin Tomko wrote: Thanks Mat, I have in the meantime identified the Rand index, but not the others. I will also have a look at profdpm, that did not pop-up in my searches. Indeed, the interpretation is going to be critical... Could you please elaborate on what you mean by the bootstrap process? Thanks a lot for your helps, Martin On 11/17/2010 3:50 PM, Matt Shotwell wrote: There are several statistics used to compare nominal classifications, or _partitions_ of a data set. A partition isn't quite the same in this context because partitioned data are not restricted to a fixed number of classes. 
However, the statistics used to compare partitions should also work for these 'restricted' partitions. See the Rand index, Fowlkes and Mallows index, Wallace indices, and the Jaccard index. The profdpm package implements a function (?profdpm::pci) that computes these indices for two factors representing partitions of the same data. The difficult part is drawing statistical inference about these indices. It's difficult to formulate a null hypothesis, and even more difficult to determine a null distribution for a partition comparison index. A bootstrap test might work, but you will probably have to implement this yourself. -Matt

On Wed, 2010-11-17 at 08:33 -0500, Martin Tomko wrote:

Dear all, I am having a hard time figuring out a suitable test for the match between two nominal classifications of the same set of data. I have used hierarchical clustering with multiple methods (ward, k-means,...) to classify my data into a set number of classes, and I would like to compare the resulting automated classification with the actual, objective benchmark one. So in principle I have a data frame with n columns of nominal classifications, and I want to do a mutual comparison and test for significance in difference in classification between pairs of columns. I just need to identify a suitable test, but I fail. I am currently exploring the possibility of using Cohen's Kappa, but I am open to other suggestions. In particular, the fact that kappa seems to be mostly used on fallible, human annotators seems to bring in limitations that do not apply to my automatic classification. Any help will be appreciated, especially if also followed by a pointer to an R package that implements it. Thanks Martin

-- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina
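The mean-zero example from the bootstrap description above can be sketched as follows. This is a minimal sketch on simulated data, not a method for partition indices; the function name boot_mean_test, the number of re-samples, and the add-one p-value convention are all assumptions made for illustration.

```r
# Hypothetical helper: bootstrap test of H0 "population mean is zero".
boot_mean_test <- function(x, n_boot = 2000) {
  t_obs <- mean(x)                 # observed test statistic
  x_null <- x - t_obs              # impose the null: centred data have mean zero
  # re-sample with replacement and recompute the statistic each time
  t_null <- replicate(n_boot, mean(sample(x_null, replace = TRUE)))
  # two-sided p-value from the empirical null distribution
  p_value <- (1 + sum(abs(t_null) >= abs(t_obs))) / (n_boot + 1)
  list(statistic = t_obs, p.value = p_value)
}

set.seed(1)
x <- rnorm(50, mean = 0.5)         # simulated data, true mean is not zero
boot_mean_test(x)
```

For partition comparison indices the same recipe applies in principle, but, as the thread says, the hard part is deciding how to impose a meaningful null hypothesis before re-sampling.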
Re: [R] Fatal Error R
Please see below.

On Wed, 2010-11-17 at 04:41 -0500, Ted Harding wrote:

On 17-Nov-10 00:02:39, José Fernando Zea Castro wrote:

Hello. First, I'm thankful for your wonderful project. However, I have serious worries about the reliability of R. I found the following bug, which I consider important because in my job we always work with data names like the ones below. Please see below:

    > b=data.frame(matrix(1:9,ncol=3))
    > names(b)=c("q99","r88","s77")
    > b
      q99 r88 s77
    1   1   4   7
    2   2   5   8
    3   3   6   9
    > b$q9
    [1] 1 2 3

Please note that the variable q9 does not exist in the data frame, but you can see that R shows q9 (as q99).

Thanks in advance. Cordially, José Fernando Zea Castro Statistician Universidad Nacional Colombiana

What you see here is a case of partial matching: You ask for 'b$q9', and R sees that 'q9' matches the beginning of 'q99' and nothing else. Therefore it responds with the value of 'b$q99', since there is no ambiguity. You would have got the same result if you had asked for b$q since there is no component name in b which matches 'q' except 'q99'.

If there had been two components which matched 'q9', say both b$q99 and b$q98, then you would have got a NULL result, since there is not a unique match. However, if you also have b$q9 and b$q99 in b, then R would find that b$q9 was an *exact* (not partial) match, and would return that one.

Normally, this should not cause problems. However, if you have written code which must take special action if a name is not present in a list, then there could be problems. For example, if b might (depending on what has happened) contain b$q9 only, or b$q99 only, or *both* b$q9 and b$q99, and you want to execute special actions if a name is not present in b, then in the case where b contained only b$q99 and you asked for b$q9, you would get the wrong result because of partial matching.

This is one of those cases, in my opinion, where R's documentation drops you into a flat landscape, in the middle of nowhere, in a thick mist.

This does happen sometimes, but partial matching in indexing operations is documented in the R Language Definition manual section 3.4.1, and well documented in the help page (?Extract or ?`$` or ?`[`).

What is needed is to be able to set an option such that R will *only* respond with exact matches, e.g. something like options(partial.match=FALSE). I have spent about 20 minutes trying to locate the possible existence of such an option, or a similar way of suppressing partial matching. No success!

Indexing a list using [[ and a string enforces exact matching (by default). Continuing with the example above:

    > b[["q99"]]
    [1] 1 2 3
    > b[["q"]]
    NULL

The closest I could get was the set of options, settable using options(... = ...):

'warnPartialMatchArgs': logical. If true, warns if partial matching is used in argument matching.

'warnPartialMatchAttr': logical. If true, warns if partial matching is used in extracting attributes via 'attr'.

'warnPartialMatchDollar': logical. If true, warns if partial matching is used for extraction by '$'.

which concerns only the issue of warnings in such cases, and has nothing to do with suppressing partial matching. Maybe others know better!

Best wishes, Ted.

E-Mail: (Ted Harding) ted.hard...@wlandres.net Fax-to-email: +44 (0)870 094 0861 Date: 17-Nov-10 Time: 09:41:03 -- XFMail --

-- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina
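To make the difference between partial and exact matching concrete, here is a small sketch that rebuilds the data frame from this thread and indexes it with [[ in both modes. The helper get_exact is hypothetical (it is not part of R); it simply shows one way to fail loudly when a name is absent. Setting options(warnPartialMatchDollar = TRUE), mentioned above, is a complementary way to be told when `$` matched partially.

```r
b <- data.frame(matrix(1:9, ncol = 3))
names(b) <- c("q99", "r88", "s77")

b[["q9"]]                  # exact matching is the default for [[ : NULL
b[["q9", exact = FALSE]]   # partial matching requested explicitly: 1 2 3

# Hypothetical helper: return a component only if the name matches exactly,
# and stop with an error otherwise instead of silently returning NULL.
get_exact <- function(x, name) {
  if (!name %in% names(x)) stop("no component named '", name, "'")
  x[[name]]
}

get_exact(b, "q99")        # 1 2 3
```

Code that must take special action when a name is missing can test `name %in% names(x)` directly rather than relying on the result of `$`.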
Re: [R] statistical test for comparison of two classifications (nominal)
There are several statistics used to compare nominal classifications, or _partitions_ of a data set. A partition isn't quite the same in this context because partitioned data are not restricted to a fixed number of classes. However, the statistics used to compare partitions should also work for these 'restricted' partitions. See the Rand index, Fowlkes and Mallows index, Wallace indices, and the Jaccard index. The profdpm package implements a function (?profdpm::pci) that computes these indices for two factors representing partitions of the same data. The difficult part is drawing statistical inference about these indices. It's difficult to formulate a null hypothesis, and even more difficult to determine a null distribution for a partition comparison index. A bootstrap test might work, but you will probably have to implement this yourself. -Matt

On Wed, 2010-11-17 at 08:33 -0500, Martin Tomko wrote:

Dear all, I am having a hard time figuring out a suitable test for the match between two nominal classifications of the same set of data. I have used hierarchical clustering with multiple methods (ward, k-means,...) to classify my data into a set number of classes, and I would like to compare the resulting automated classification with the actual, objective benchmark one. So in principle I have a data frame with n columns of nominal classifications, and I want to do a mutual comparison and test for significance in difference in classification between pairs of columns. I just need to identify a suitable test, but I fail. I am currently exploring the possibility of using Cohen's Kappa, but I am open to other suggestions. In particular, the fact that kappa seems to be mostly used on fallible, human annotators seems to bring in limitations that do not apply to my automatic classification. Any help will be appreciated, especially if also followed by a pointer to an R package that implements it. Thanks Martin

-- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina
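For readers who want to see what the pair-counting indices named above measure, here is a base-R sketch computed from the contingency table of two labelings. It is not the profdpm::pci implementation; the function name partition_indices and the toy labels are assumptions for illustration only.

```r
# Hypothetical helper: pair-counting agreement indices for two partitions
# of the same observations, given as label vectors or factors.
partition_indices <- function(p1, p2) {
  stopifnot(length(p1) == length(p2))
  tab <- table(p1, p2)
  n11 <- sum(choose(tab, 2))            # pairs together in both partitions
  n1  <- sum(choose(rowSums(tab), 2))   # pairs together in p1
  n2  <- sum(choose(colSums(tab), 2))   # pairs together in p2
  tot <- choose(length(p1), 2)          # all pairs of observations
  c(rand            = (tot + 2 * n11 - n1 - n2) / tot,
    jaccard         = n11 / (n1 + n2 - n11),
    fowlkes_mallows = n11 / sqrt(n1 * n2),
    wallace_12      = n11 / n1,
    wallace_21      = n11 / n2)
}

benchmark <- c("a", "a", "b", "b", "c", "c")
automatic <- c(2, 2, 3, 3, 1, 1)        # same grouping, different labels
partition_indices(benchmark, automatic) # every index equals 1 here
```

The indices depend only on which observations are grouped together, not on the label values, which is why relabeled but identical partitions score 1. Some ratios are undefined (NaN) when a partition consists of singletons only.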
Re: [R] log-transformed linear regression
Servet,

These data do look linear in log space. Fortunately, the model log(y) = a + b * log(x) does have intercept zero in linear space. To see this, consider (with log denoting the base-10 logarithm):

    log(y) = a + b * log(x)
    y = 10^(a + b * log(x))
    y = 10^a * 10^(b * log(x))
    y = 10^a * 10^(log(x^b))
    y = 10^a * x^b

Hence, y = 0 when x = 0 (for b > 0). The code below estimates a and b. Of course, y = 10^a * x^b is not a line, so we can't directly compare slopes. However, in the region of your data, the estimated mean is _nearly_ linear. In fact, you could consider looking at a linear approximation, say at the median of your x values. The median of your x values is 0.958. For simplicity, let's just say it's 1.0. The linear approximation (first order Taylor expansion) of y = 10^a * x^b at x = 1 is

    y = 10^a + 10^a * b * (x - 1)
    y = 10^a * (1 - b) + 10^a * b * x

So, the slope of the linear approximation is 10^a * b, and the intercept is 10^a * (1 - b). Taking a and b from the analysis below, the approximate intercept is -0.00442, and slope 0.22650. You could argue that these values are consistent with the literature, but that the log linear model is more appropriate for these data. You could even construct a bootstrap confidence interval for the approximate slope.

-Matt

On Wed, 2010-11-10 at 19:27 -0500, servet cizmeli wrote:

Dear List, I would like to take another chance and see if someone has anything to say about my last post... bump servet

On 11/10/2010 01:11 PM, servet cizmeli wrote:

Hello, I have a basic question. Sorry if it is so evident. I have the following data file: http://ekumen.homelinux.net/mydata.txt

I need to model Y~X-1 (simple linear regression through the origin) with these data:

    load(file="mydata.txt")
    X=k[,1]
    Y=k[,2]
    aa=lm(Y~X-1)
    dev.new()
    plot(X,Y,log="xy")
    abline(aa,untf=T)
    abline(b=0.0235, a=0,col="red",untf=T)
    abline(b=0.031, a=0,col="green",untf=T)

Other people did the same kind of analysis with their data and found the regression coefficients of 0.0235 (red line) and 0.031 (green line). Regression with my own data, though, yields a slope of 0.0458 (black line) which is too high. Clearly my regression is too much influenced by the single point with high values (X>100). I would not like to discard this point, though, because I know that the measurement is correct. I just would like to give it less weight...

When I log-transform X and Y data, I obtain:

    dev.new()
    plot(log10(X),log10(Y))
    abline(v=0,h=0,col="cyan")
    bb=lm(log10(Y)~log10(X))
    abline(bb,col="blue")
    bb

I am happy with this regression. Now the slope is in the log-log domain. I have to convert it back so that I can obtain a number comparable with the literature (0.0235 and 0.031). How to do it? I can't force the second regression through the origin as the log-transformed data does not go through the origin anymore. At first it seemed like an easy problem but I am at a loss :o((

Thanks a lot for your kind help, servet

-- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina
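The tangent-line calculation in the reply above can be sketched in R. Because the poster's data file is not reproduced here, the sketch uses simulated data; the data-generating values, the variable names, and the helper tangent_at are all assumptions for illustration, so the numbers it produces are not those quoted in the thread.

```r
set.seed(42)
# Simulated stand-in for the real data: y = c * x^b with multiplicative noise
x <- exp(runif(200, log(0.05), log(100)))
y <- 0.03 * x^0.9 * 10^rnorm(200, sd = 0.05)

# Fit log10(y) = a + b * log10(x)
fit <- lm(log10(y) ~ log10(x))
a <- coef(fit)[[1]]
b <- coef(fit)[[2]]

# Hypothetical helper: first-order Taylor expansion of y = 10^a * x^b at x0
tangent_at <- function(x0, a, b) {
  slope <- 10^a * b * x0^(b - 1)
  intercept <- 10^a * x0^b * (1 - b)
  c(intercept = intercept, slope = slope)
}

tangent_at(1, a, b)   # at x0 = 1: intercept 10^a * (1 - b), slope 10^a * b
```

At x0 = 1 the general expressions reduce to the intercept and slope given in the derivation; choosing the median of x instead only changes the expansion point.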
Re: [R] data acquisition with R?
R implements (almost) all IO through its 'connections'. Unfortunately, there is no API (public or private) for adding connections, and therefore no packages that implement connections. You will find more discussion of connections and hardware (serial, USB) interface in the R-devel list archives. There are two source code patches that implement two types of connections that work on POSIX compliant OSs, including GNU Linux, BSD, and Mac OS X. The first is a 'serial' connection, a high level connection to a serial port http://biostatmatt.com/archives/112. The second is a 'tty' connection, a more low level connection to the POSIX termios interface http://biostatmatt.com/archives/564. Both of these solutions require that you apply the patch and recompile R. I can help with this, if you like. AFAIK, these are the only attempts at interfacing R with POSIX TTYs directly. -Matt On Fri, 2010-11-05 at 09:48 -0400, B.-MarkusS wrote: Hello, I spent quite some time now searching for any hint that R can also be used to address the interfaces of a computer (i.e. RS232 or USB) to acquire data from measurement devices (like with the - I think it is the - devices or serial toolbox of Matlab). Is there any package available or a project going on that you know of? I would so much like to have never to work with Matlab again. The only thing I am really missing in R so far is the possibility to connect to my measurement devices (for instance a precision balance) and record data directly with R. Please let me know whether I am just missing something or if you have some information about something like that. Thank you very much! Mango -- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice plots for images
Have you tried using the 'mai' argument to par()? Something like:

    par(mfrow=c(3,3), mai=c(0,0,0,0))

I've used this in conjunction with image() to plot raster data in a tight grid. http://biostatmatt.com/archives/727

-Matt

On Wed, 2010-11-03 at 11:13 -0400, Neba Funwi-Gabga wrote:

Hello UseRs, I need help on how to plot several raster images (such as those obtained from a kernel-smoothed intensity function) in a layout such as that obtained from the lattice package. I would like to obtain something like the output of levelplot or xyplot in lattice. I currently use par(mfrow=c(3,3)) to set the workspace, but the resulting plots leave a lot of blank space between individual plots. If I can get it to the lattice format, I think it will save me some white space. Any help is greatly appreciated. Neba.

-- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina
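A minimal sketch of the tight-grid idea, assuming the rasters are available as a list of numeric matrices. The function name plot_image_grid and the random matrices are placeholders; only par(), image(), and on.exit() are standard R.

```r
# Hypothetical helper: draw a list of matrices as images in a grid
# with no margins between panels.
plot_image_grid <- function(mats, nrow, ncol) {
  op <- par(mfrow = c(nrow, ncol), mai = c(0, 0, 0, 0))
  on.exit(par(op))                      # restore the previous settings
  for (m in mats) image(m, axes = FALSE)
  invisible(length(mats))               # number of panels drawn
}

set.seed(1)
mats <- replicate(9, matrix(runif(100), 10, 10), simplify = FALSE)
if (interactive()) plot_image_grid(mats, 3, 3)
```

Setting 'mai' to a small positive value instead of zero leaves a thin gap between panels, which can make the individual images easier to tell apart.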
Re: [R] ForestPlot or similar
Here is a small function for forest plots in R, with an example: http://biostatmatt.com/wiki/r-credplot

-Matt

On Sat, 2010-10-30 at 11:40 -0400, Mestat wrote:

Here is one example: I have three vectors (mean, lower interval, upper interval)

    mean <- c(2,4,6,8)
    l <- c(1,2,3,4)
    u <- c(4,8,12,16)

How would I plot that if I want to use the FORESTPLOT function? I don't need to use the TABLETEXT option. I am working on something like this:

    tabletext <- c(NA,NA,NA,NA,NA)
    mean <- c(NA,2,4,6,8)
    l <- c(NA,1,2,3,4)
    u <- c(NA,4,8,12,16)
    forestplot(tabletext,mean,l,u,zero=0)

But I am having a problem with the length of the dimension...

Thanks in advance, Marcio

-- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina
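If a dedicated forest plot function is not needed, the three vectors from the question can be drawn with base graphics alone. The sketch below is not the function behind the link above and not the forestplot() call from the question; plot_intervals is a hypothetical name, and the estimate vector is called est here to avoid masking the base function mean().

```r
# Hypothetical helper: point estimates with horizontal interval bars,
# one row per estimate, first estimate at the top.
plot_intervals <- function(est, lower, upper, labels = seq_along(est)) {
  stopifnot(length(est) == length(lower), length(est) == length(upper))
  y <- rev(seq_along(est))
  plot(est, y, xlim = range(lower, upper), pch = 15,
       yaxt = "n", xlab = "Estimate", ylab = "")
  segments(lower, y, upper, y)              # interval bars
  axis(2, at = y, labels = labels, las = 1) # row labels
  invisible(data.frame(label = labels, est = est,
                       lower = lower, upper = upper))
}

est <- c(2, 4, 6, 8)
l <- c(1, 2, 3, 4)
u <- c(4, 8, 12, 16)
if (interactive()) plot_intervals(est, l, u)
```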
Re: [R] Regular expression to find value between brackets
Here's a shorter (but more cryptic) one:

    > gsub("^([^\\(]+)(\\((.+)\\))?", "\\2", tests)
    [1] ""        "(%)"     "(%)"     "(mg/ml)"
    > gsub("^([^\\(]+)(\\((.+)\\))?", "\\3", tests)
    [1] ""      "%"     "%"     "mg/ml"

-Matt

On Wed, 2010-10-13 at 14:34 -0400, Henrique Dallazuanna wrote:

Try this:

    replace(gsub(".*\\((.*)\\)$", "\\1", tests), !grepl("\\(.*\\)", tests), "")

On Wed, Oct 13, 2010 at 3:16 PM, Bart Joosen bartjoo...@hotmail.com wrote:

Hi, this should be an easy one, but I can't figure it out. I have a vector of tests, with their units between brackets (if they have units), e.g.

    tests <- c("pH", "Assay (%)", "Impurity A(%)", "content (mg/ml)")

Now I would like to have a function where I use a test as input, and which returns the units, like:

    f <- function (x) sub("\\)", "", sub("\\(", "", sub("[[:alnum:]]+", "", x)))

This should give "", "%", "%", "mg/ml", but it doesn't do the job quite well. After searching in the manual, and on the help lists, I can't find the answer. Anyone? Bart

-- View this message in context: http://r.789695.n4.nabble.com/Regular-expression-to-find-value-between-brackets-tp2994166p2994166.html Sent from the R help mailing list archive at Nabble.com.

-- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina
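An alternative to substitution is to extract the bracketed group directly with regexec() and regmatches(). This is a sketch under the assumption that the unit, when present, is the last bracketed item in the string; extract_units is a hypothetical name.

```r
tests <- c("pH", "Assay (%)", "Impurity A(%)", "content (mg/ml)")

# Hypothetical helper: return the text inside a trailing "(...)",
# or "" when the string has no bracketed unit.
extract_units <- function(x) {
  m <- regmatches(x, regexec("\\(([^()]*)\\)[[:space:]]*$", x))
  # each element is c(full match, captured group), or empty if no match
  vapply(m, function(g) if (length(g) == 2) g[2] else "", character(1))
}

extract_units(tests)
# [1] ""      "%"     "%"     "mg/ml"
```

Extraction leaves nothing behind when there is no match, so no separate grepl() step is needed to blank out strings without units.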
Re: [R] puzzle with integrate over infinite range
You could try pnorm also:

    shiftedGaussR <- function(x0 = 500) {
        sd <- 100/sqrt(2)
        int <- pnorm(0, x0, sd, lower.tail=FALSE, log.p=TRUE)
        exp(int + log(sd) + 0.5 * log(2*pi))
    }

    > shiftedGaussR(500)
    [1] 177.2454
    > shiftedGauss(500)
    [1] 177.2454

-Matt

On Tue, 2010-09-21 at 09:38 -0400, Ravi Varadhan wrote:

There is nothing mysterious. You need to increase the accuracy of quadrature by decreasing the error tolerance:

    # I scaled your function to a proper Gaussian density
    shiftedGauss <- function(x0=500){
        integrate(function(x) 1/sqrt(2*pi * 100^2) * exp(-(x-x0)^2/(2*100^2)), 0, Inf, rel.tol=1.e-07)$value
    }
    shift <- seq(500, 800, by=10)
    plot(shift, sapply(shift, shiftedGauss))

Hope this helps, Ravi.

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of baptiste auguie Sent: Tuesday, September 21, 2010 8:38 AM To: r-help Subject: [R] puzzle with integrate over infinite range

Dear list, I'm calculating the integral of a Gaussian function from 0 to infinity. I understand from ?integrate that it's usually better to specify Inf explicitly as a limit rather than an arbitrary large number, as in this case integrate() performs a trick to do the integration better. However, I do not understand the following: if I shift the Gauss function by some amount, the integral should not be affected,

    shiftedGauss <- function(x0=500){
        integrate(function(x) exp(-(x-x0)^2/100^2), 0, Inf)$value
    }
    shift <- seq(500, 800, by=10)
    plot(shift, sapply(shift, shiftedGauss))

Suddenly, just after 700, the value of the integral drops to nearly 0 when it should be constant all the way. Any clue as to what's going on here? I guess it's suddenly missing the important part of the range where the integrand is non-zero, but how could this be overcome?

Regards, baptiste

    sessionInfo()
    R version 2.11.1 (2010-05-31) x86_64-apple-darwin9.8.0
    locale: [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
    attached base packages: [1] stats graphics grDevices utils datasets methods base
    other attached packages: [1] inline_0.3.5 RcppArmadillo_0.2.6 Rcpp_0.8.6 statmod_1.4.6
    loaded via a namespace (and not attached): [1] tools_2.11.1

-- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina
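Besides tightening the tolerance or using pnorm(), another way to keep the quadrature from missing the peak is to integrate over a finite window centred on it and to split the range at the peak. The sketch below compares that with the closed form for the unnormalized integrand exp(-(x - x0)^2 / w^2). The names gauss_exact and gauss_windowed and the window half-width of k = 10 widths are assumptions; the approach relies on x0 > 0 and on the integrand being negligible outside the window.

```r
# Closed form via pnorm: the integrand is a Gaussian with sd = w / sqrt(2)
gauss_exact <- function(x0, w = 100) {
  s <- w / sqrt(2)
  s * sqrt(2 * pi) * pnorm(0, mean = x0, sd = s, lower.tail = FALSE)
}

# Hypothetical alternative: numerical integration over a finite window
# around the peak, split at the peak so each piece sees it at an endpoint.
gauss_windowed <- function(x0, w = 100, k = 10) {
  f <- function(x) exp(-(x - x0)^2 / w^2)
  lo <- max(0, x0 - k * w)
  hi <- x0 + k * w
  integrate(f, lo, x0)$value + integrate(f, x0, hi)$value
}

shift <- seq(500, 800, by = 10)
# largest disagreement between the two approaches over the shifts
max(abs(sapply(shift, gauss_windowed) - sapply(shift, gauss_exact)))
```

The windowed version needs to know where the peak is, which is the price paid for not relying on the adaptive rule to find it.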
Re: [R] Is there a bisection method in R?
I was just reading about the merge sort algorithm last night (BTW, here is a fun link http://www.youtube.com/watch?v=t8g-iYGHpEA). There are some interesting similarities in this context. Here's a recursive method for bisection:

    bisectMatt <- function(fn, lo, hi, tol = 1e-7, ...) {
        flo <- fn(lo, ...)
        fhi <- fn(hi, ...)
        if(flo * fhi > 0)
            stop("root is not bracketed by lo and hi")
        mid <- (lo + hi) / 2
        fmid <- fn(mid, ...)
        if(abs(fmid) <= tol || abs(hi-lo) <= tol)
            return(mid)
        if(fmid * fhi > 0)
            return(bisectMatt(fn, lo, mid, tol, ...))
        return(bisectMatt(fn, mid, hi, tol, ...))
    }

    # Adapted from Ravi's original
    bisectRavi <- function(fn, lo, hi, tol = 1e-7, ...) {
        flo <- fn(lo, ...)
        fhi <- fn(hi, ...)
        if (flo * fhi > 0)
            stop("root is not bracketed by lo and hi")
        chg <- hi - lo
        while (abs(chg) > tol) {
            mid <- (lo + hi) / 2
            fmid <- fn(mid, ...)
            if (abs(fmid) <= tol) break
            if (flo * fmid < 0) hi <- mid
            if (fhi * fmid < 0) lo <- mid
            chg <- hi - lo
        }
        return(mid)
    }

    testFn <- function(x, a) exp(-x) - a*x

    > system.time(bM <- bisectMatt(testFn, 0, 2, a=1))
       user  system elapsed
      0.000   0.000   0.001
    > system.time(bR <- bisectRavi(testFn, 0, 2, a=1))
       user  system elapsed
      0.000   0.000   0.001
    > bM
    [1] 0.5671433
    > bR
    [1] 0.5671433

Of course, Ravi's version is better for production (and most likely faster, though not significantly so in this example) because recursion is more expensive than looping.

-Matt

On Fri, 2010-09-17 at 17:44 -0400, Ravi Varadhan wrote:

Here is something simple (does not have any checks for bad input), yet should be adequate:

    bisect <- function(fn, lower, upper, tol=1.e-07, ...) {
        f.lo <- fn(lower, ...)
        f.hi <- fn(upper, ...)
        feval <- 2
        if (f.lo * f.hi > 0) stop("Root is not bracketed in the specified interval \n")
        chg <- upper - lower
        while (abs(chg) > tol) {
            x.new <- (lower + upper) / 2
            f.new <- fn(x.new, ...)
            if (abs(f.new) <= tol) break
            if (f.lo * f.new < 0) upper <- x.new
            if (f.hi * f.new < 0) lower <- x.new
            chg <- upper - lower
            feval <- feval + 1
        }
        list(x = x.new, value = f.new, fevals=feval)
    }

    # An example
    fn1 <- function(x, a) {
        exp(-x) - a*x
    }
    bisect(fn1, 0, 2, a=1)
    bisect(fn1, 0, 2, a=2)

Ravi.

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peter Dalgaard Sent: Friday, September 17, 2010 4:16 PM To: Gregory Gentlemen Cc: r-help@r-project.org Subject: Re: [R] Is there a bisection method in R?

On 09/17/2010 09:28 PM, Gregory Gentlemen wrote:

If uniroot is not a bisection method, then what function in R does use bisection?

Why do you assume that there is one? uniroot contains a better algorithm for finding bracketed roots. It shouldn't be too hard to roll your own if you need one for pedagogical purposes.

-- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina
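Since uniroot() is mentioned in this thread as the preferred tool, a side-by-side check can be useful. The sketch below is self-contained: bisect_iter is a hypothetical, simplified bisection written for this comparison (not a copy of either function in the thread), and test_fn repeats the thread's example function under a new name.

```r
test_fn <- function(x, a) exp(-x) - a * x

# Hypothetical, simplified bisection: halve the bracket until it is
# shorter than tol, keeping the sign change inside [lo, hi].
bisect_iter <- function(fn, lo, hi, tol = 1e-7, ...) {
  flo <- fn(lo, ...)
  if (flo * fn(hi, ...) > 0) stop("root is not bracketed by lo and hi")
  n_eval <- 2
  while (hi - lo > tol) {
    mid <- (lo + hi) / 2
    fmid <- fn(mid, ...)
    n_eval <- n_eval + 1
    if (flo * fmid <= 0) {
      hi <- mid                 # sign change is in the lower half
    } else {
      lo <- mid                 # sign change is in the upper half
      flo <- fmid
    }
  }
  list(root = (lo + hi) / 2, n_eval = n_eval)
}

b <- bisect_iter(test_fn, 0, 2, a = 1)
u <- uniroot(test_fn, c(0, 2), a = 1, tol = 1e-9)
c(bisection = b$root, uniroot = u$root)   # both close to 0.5671433
```

Comparing b$n_eval with u$iter gives a rough feel for how many fewer steps uniroot() needs on a smooth function like this one.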
Re: [R] R Founding
On Thu, 2010-09-16 at 17:30 -0400, Tal Galili wrote: Hello dear Jaroslaw, I strongly agree with you that the R foundation should have an easier method of enabling people to give donations. At the same time, I feel there is a (friendly) disagreement between us on how such money should be used. Your massage has inspired me to write a post on the topic, titles: Was this --^ a Freudian slip? In any case, it seems consistent with your notion of compensation for open-source developers. :) Interesting post Tal. -Matt Open source and money – why R developers shouldn’t be paidhttp://www.r-statistics.com/2010/09/open-source-and-money-why-r-developers-shouldnt-be-paid/ I hope you, and other community members, would find interest in it. Best, Tal Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- On Thu, Sep 16, 2010 at 12:49 PM, jaropis jaro...@zg.home.pl wrote: A few days ago Tal Galili posted a message about some controversies concerning the future of R. Having read the discussions, especially those following Ross Ihaka's post, I have come to the conclusion, that, as usual, the problem is money. I doubt there would be discussions about dropping R in its present form if the R-Foundation were properly funded and could hire computer scientists, programmers and statisticians. If a commercial company is able to provide big-database and multicore solutions, then so would a properly founded R-Foundation. In my opinion the main reason for the lack of funding is that the Foundation does not want to accept it from users and waits for the likes of Google to bring them a sack of money. I have already posted about this, but this seems to be the time and place to repeat it: it is very difficult to donate anything to the R-Foundation. 
First you have to find the appropriate link at the r-project page, then you have to fill out a form and send or fax it to the Foundation. I am not comfortable sending my details over snail-mail or fax. I would GLADLY donate 30-50$ each year just to see R develop, but there needs to be a way for me to do it in a civilized manner. If the userbase of R is over 2 million there will surely be 100,000 users who, like myself, will happily fork out 40$ a year - would that help? you can do the calculation yourselves. Set up a donation page in which I will be able to pay by credit card or PayPal and you will start getting donations from individual users. Advertise this at the startup message of the program: say something like support us at www.suppoRtR.com and the money will start coming. I am sure there would be enough to employ some foundation members full-time, pay external CSs and even protect the system in court from those who make money off of somebody else's work and do not give back to the community (you know who I am talking about). R and the Foundation have helped a lot of us to do our research and make real money. Now give us a chance to help you! Regards Jaroslaw Piskorski __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] -- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Which language is faster for numerical computation?
For the compiled languages, it depends heavily on the compiler. This sort of comparison is rendered moot by the huge variety of compiler- and hardware-specific optimizations. My suggestion is to use C, or possibly C++ in conjunction with Rcpp, as these are most compatible with R. Also, C and C++ are consistently rated highly (often in the top 3) in popularity and use. Fortran is not. This would make a difference if you want to collaborate or ask for help.

-Matt

On Thu, 2010-09-09 at 06:26 -0400, Christofer Bogaso wrote:
> Dear all, R offers integration mechanisms with different programming languages like C, C++, Fortran, .NET etc. Therefore I am curious: for heavy numerical computation, which language is the fastest? Is there any study? I especially want to know because, if there is some study saying that C is the fastest language for numerical computation, then I would change some of my R code into C. Thanks for your time.
Re: [R] Reproducible research
I have a little package I've been using to write template blog posts (in HTML) with embedded R code. It's quite small but very flexible and extensible, and aims to do something similar to Sweave and brew. In fact, the package is heavily influenced by the brew package, though implemented quite differently. It depends on the evaluate package, available on CRAN. The tentatively titled 'markup' package is attached. After it's installed, see ?markup and the few examples in the inst/ directory, or just example(markup).

-Matt

On Thu, 2010-09-09 at 01:47 -0400, David Scott wrote:
> I am investigating some approaches to reproducible research. I need in the end to produce .html or .doc or .docx. I have used hwriter in the past but have had some problems with verbatim output from R. Tables are also not particularly convenient. I am interested in R2HTML and R2wd in particular, and possibly odfWeave. Does anyone have sample documents using any of these approaches which they could let me have?
>
> David Scott
> Department of Statistics, The University of Auckland, PB 92019, Auckland 1142, NEW ZEALAND
> Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
> Email: d.sc...@auckland.ac.nz, Fax: +64 9 373 7018
> Director of Consulting, Department of Statistics
Re: [R] Reproducible research
Well, the attachment was a dud. Try this: http://biostatmatt.com/R/markup_0.0.tar.gz

-Matt
Re: [R] Uncompressing data from read.socket
Have a look at gzcon, for decompressing data as they arrive. From the help file:

    ‘gzcon’ provides a modified connection that wraps an existing connection, and decompresses reads or compresses writes through that connection. Standard ‘gzip’ headers are assumed.

Nothing in the gzcon help file explicitly prohibits socketConnections. Also, see memDecompress for in-memory decompression of the entire object.

-Matt

On Wed, 2010-09-08 at 00:50 -0400, raje...@cse.iitm.ac.in wrote:
> Hi, Is it possible to uncompress gzipped data coming over a socket?
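A minimal sketch of the gzcon suggestion above. A rawConnection stands in for the socket here, purely so the example is self-contained; with a real peer one would presumably wrap a binary-mode socketConnection in the same way, but that is an assumption and is not tested here. The temporary file exists only to manufacture genuine gzip bytes.

```r
# Manufacture gzip-compressed bytes (a stand-in for what a peer would send).
tmp <- tempfile(fileext = ".gz")
out <- gzfile(tmp, "w")
writeLines(c("alpha", "beta", "gamma"), out)
close(out)
gz_bytes <- readBin(tmp, what = "raw", n = file.info(tmp)$size)
unlink(tmp)

# Wrap a connection that delivers the compressed bytes;
# reads through the wrapper are decompressed as they arrive.
stream <- gzcon(rawConnection(gz_bytes))
received <- readLines(stream)
close(stream)

# In-memory alternative for a complete object: memCompress/memDecompress round trip.
packed <- memCompress(charToRaw("hello socket"), type = "gzip")
unpacked <- rawToChar(memDecompress(packed, type = "gzip"))
```

Note that memCompress(type = "gzip") is used for the round trip only; whether memDecompress accepts bytes carrying a full gzip file header may depend on the R version, so the streaming case goes through gzcon.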
Re: [R] remove accents in strings
If you know the encoding of the string, or if its encoding is the current locale encoding, then you can use the iconv function to convert the string to ASCII. Something like:

    iconv(accented.string, to="ASCII//TRANSLIT")

While 7-bit ASCII does not permit accented characters, extended (8-bit) ASCII does. Hence, I'm not sure this will work. But it's worth a try.

-Matt

On Tue, 2010-09-07 at 13:04 -0400, lamack lamack wrote:
> Dear all, is there an R function to remove all accents in strings? best regards. JL
Re: [R] remove accents in strings
Weird, my (Ubuntu; don't tell Dirk) iconv doesn't add the backticks or single quotes.

    tst <- c("à", "è", "ì", "ò", "ù", "À", "È", "Ì", "Ò", "Ù", "á",
             "é", "í", "ó", "ú", "ý", "Á", "É", "Í", "Ó", "Ú", "Ý")
    iconv(tst, to="ASCII//TRANSLIT")
     [1] "a" "e" "i" "o" "u" "A" "E" "I" "O" "U" "a" "e" "i" "o" "u" "y" "A" "E" "I"
    [20] "O" "U" "Y"

By the way, I'll take this moment to remind anyone interested that R still has trouble with embedded zeros in character strings. I may be abusing terminology, but I think that makes R 8-bit dirty.

-Matt

On Tue, 2010-09-07 at 14:01 -0400, David Winsemius wrote:
> On Sep 7, 2010, at 1:35 PM, Matt Shotwell wrote:
>> If you know the encoding of the string, or if its encoding is the current locale encoding, then you can use the iconv function to convert the string to ASCII. Something like: iconv(accented.string, to="ASCII//TRANSLIT") While 7-bit ASCII does not permit accented characters, extended (8-bit) ASCII does. Hence, I'm not sure this will work. But it's worth a try.
>
>     tst <- c("à", "è", "ì", "ò", "ù", "À", "È", "Ì", "Ò", "Ù", "á", "é", "í", "ó", "ú", "ý", "Á", "É", "Í", "Ó", "Ú", "Ý")
>     iconv(tst, to="ASCII//TRANSLIT")
>      [1] "`a" "`e" "`i" "`o" "`u" "`A" "`E" "`I" "`O" "`U" "'a" "'e" "'i" "'o" "'u" "'y"
>     [17] "'A" "'E" "'I" "'O" "'U" "'Y"
>     gsub("`|\\'", "", iconv(tst, to="ASCII//TRANSLIT"))
>      [1] "a" "e" "i" "o" "u" "A" "E" "I" "O" "U" "a" "e" "i" "o" "u" "y" "A" "E" "I" "O"
>     [21] "U" "Y"
>
> Notice that the acute accent gets converted to a single quote and therefore needs to be double-backslashed to be recognized in an R regex pattern. On a Mac with: locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
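Since the two posts above show that the transliteration differs between platforms, here is a hedged sketch that combines both steps: transliterate with iconv, then strip whatever accent marks a given iconv build leaves behind. The set of marks in the pattern is an assumption based on the output quoted above, not an exhaustive list, and no particular output is promised.

```r
# Accented characters written as Unicode escapes so the script itself is pure ASCII.
accented <- c("\u00e0", "\u00e9", "\u00ee", "\u00f5", "\u00fc", "\u00dd")

# Transliterate to ASCII; the exact result depends on the platform's iconv.
plain <- iconv(accented, from = "UTF-8", to = "ASCII//TRANSLIT")

# Remove stray marks (backtick, quote, caret, tilde, double quote) that some
# iconv implementations emit in place of the accent.
plain <- gsub("[`'^~\"]", "", plain)
print(plain)
```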
Re: [R] Why is vector assignment in R recreates the entire vector ?
Tal,

For your first example, x is not duplicated in memory. If you compile R with --enable-memory-profiling, you have access to the tracemem() function, which will report whether x is duplicate()d:

    x <- rep(1,100)
    tracemem(x)
    [1] "<0x8f71c38>"
    x[10] <- NA

This does not result in duplication of x, nor does assignment of x to y:

    y <- x

At this point, y internally references x. It's not until we modify y that x is duplicated, and y gets its own copy of the data:

    y[10] <- NA
    tracemem[0x8f71c38 -> 0x91fff70]:

Likewise, no duplication occurs using `[<-`:

    x <- rep(1,100)
    tracemem(x)
    [1] "<0x8e44900>"
    x <- `[<-`(x, list=10, values=NA)

But R is not yet smart enough to avoid a duplication here:

    x <- rep(1,100)
    tracemem(x)
    [1] "<0x915d580>"
    x <- replace(x, list=10, values=NA)
    tracemem[0x915d580 -> 0x915e090]: replace

Beyond these simple tests, it's difficult to know when R copies memory. I mentioned in another post recently that subsetting a vector will copy memory, but this is not reported by tracemem(). For example:

    tracemem(x)
    [1] "<0x915ed50>"
    y <- x[1:100]
    tracemem(y)
    [1] "<0x915f3f0>"
    identical(x,y)
    [1] TRUE

Fortunately, memory is fairly cheap, and memory operations are pretty fast in modern operating systems, like GNU Linux. I mostly find that the rate-limiting steps in my code are computational routines, like exp().

-Matt

On Wed, 2010-09-01 at 11:09 -0400, Tal Galili wrote:
> Hello all, A friend recently brought to my attention that vector assignment actually recreates the entire vector on which the assignment is performed. So for example, the code:
>
>     x[10] <- NA # The original call (short version)
>
> Is really doing this:
>
>     x <- replace(x, list=10, values=NA) # The original call (long version)
>     # assigning a whole new vector to x
>
> Which is actually doing this:
>
>     x <- `[<-`(x, list=10, values=NA) # The actual call
>
> Assuming this can be explained reasonably to the layman, my question is, why is it done this way? Why won't it just change the relevant pointer in memory? On small vectors it makes no difference. But on big vectors this might be (so I suspect) costly (in terms of time). I'm curious for your responses on the subject.
>
> Best, Tal
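The copy-on-modify behaviour described above can be checked without any special build, by confirming that modifying y leaves x untouched. The tracemem() part is guarded, since it is only available when R was built with memory profiling; the variable names are illustrative.

```r
x <- rep(1, 100)
y <- x          # no copy yet: y shares x's data
y[10] <- NA     # modifying y forces a private copy; x is untouched

# tracemem() only works if R was built with memory profiling enabled.
if (capabilities("profmem")) {
  z <- rep(1, 100)
  tracemem(z)   # start reporting duplications of z
  w <- z        # nothing reported: the data are still shared
  w[10] <- NA   # a 'tracemem[... -> ...]' line should be printed here
  untracemem(z)
}
```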
Re: [R] documentation to upgrade R-package from 32 to 64bit
Try this: http://www.stats.ox.ac.uk/~ripley/Win64/W64porting.html

-Matt

On Wed, 2010-09-01 at 07:40 -0400, Hayes, Daniel wrote:
> Dear all, I am working with an R package named GAMLSS (www.gamlss.com). It is currently only functional under the 32-bit version of R (for Windows). The author of the package has agreed to help me create a 64-bit compatible version. I've been looking through the available R documentation but cannot find any relevant information on the process. Any help finding such documentation, or any information on the general changes that need to be implemented for a 32-bit add-on package to work with a 64-bit version of R, would be much appreciated. Thank you in advance for your help, Daniel Hayes
Re: [R] [semi-OT] Using fortune() in an email signature
Or using R and GNU tools:

    m...@max:~$ R -e "fortunes::fortune()" | gawk '/^[^>]/ {print}'

    It's not a question of trying variations, rather of following instructions.
       -- Brian D. Ripley (about using 'Writing R Extensions')
          R-help (January 2006)

-Matt

On Wed, 2010-09-01 at 16:49 -0400, Stuart Luppescu wrote:
> Hello, As you can see from my signature in this message, I use the R fortune function to generate a fortune, which is then fed to the signature program, which constructs a named pipe containing the fortune-bearing sig, which is then included in mail messages. The problem is that it's got extraneous junk in it and I can't figure out how to get rid of it. This is the command that generates the fortune:
>
>     /usr/bin/R --no-save --no-restore -q < /home/sl70/print-fortune.R
>
> (where print-fortune.R is just
>
>     library(fortunes)
>     fortune()
>
> ) This produces this:
>
>     > library(fortunes)
>     > fortune()
>
>     Michael Watson: Hopefully this one isn't in the manual or I am about to get shot :-S
>     Peter Dalgaard: *Kapow*...
>        -- Michael Watson and Peter Dalgaard (question on axis())
>           R-help (February 2006)
>
>     >
>
> I would like to remove the first two lines and the last line, so I changed the command to this:
>
>     /usr/bin/R --no-save --no-restore < /home/sl70/print-fortune.R | tail \
>       -n +23 | head -n -2 2> /dev/null
>
> That gives the desired result when I run it at the command line, but when I feed it to the signature program, I get this message:
>
>     Program /usr/local/bin/r-fortune doesn't seem to exist
>
> This is the signature program code that produces this error:
>
>     /* check for existence of program by forking and then trying to exec() it in the child */
>     pid = fork();
>     switch (pid) {
>     case -1: /* oh well */
>         perror("Couldn't fork() a child process");
>         exit(EXIT_FAILURE);
>     case 0: /* in child */
>         /* close stdout */
>         close(1);
>         execlp(producer, producer, (char *) 0);
>         exit(EXIT_FAILURE);
>     default:
>         waitpid(pid, &exit_status, 0);
>         if (exit_status != EXIT_SUCCESS) {
>             fprintf(stderr, "Program %s doesn't seem to exist \n", producer);
>             exit(EXIT_FAILURE);
>         }
>     }
>
> Unfortunately, I don't understand this at all. Can anyone give me a clue as to what's happening? Thanks.
Re: [R] Sweave.sty
Here is one: http://svn.r-project.org/R/trunk/share/texmf/tex/latex/Sweave.sty

-Matt

On Tue, 2010-08-24 at 15:40 -0400, r.ookie wrote:
> Does anyone know where I can download the latest version of Sweave.sty? I have looked all over the site http://www.stat.umn.edu/~charlie/Sweave/ with no luck.
Re: [R] on abort error, always show call stack?
On Sun, 2010-08-22 at 11:41 -0400, ivo welch wrote:
> Dear R Wizards---is it possible to get R to show its current call stack (sys.calls()) upon an error abort? I don't use ESS for execution, and it is often not obvious how to locate how I triggered an error in an R internal function. Seeing the call stack would make this easier. (right now, I sprinkle cat statements everywhere, just to locate the line where the error appears.) Of course, I would really love to see the line in my program that triggered this, but I have asked this before, and I understand this is too difficult to get into the R language.

The traceback() function will print out the call stack after an error. However, you may find the debug() family of functions more useful for debugging. Also see the browser() function.

-Matt

> regards, /iaw
> Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
Re: [R] on abort error, always show call stack?
How about this:

    test <- function(x) log(x)
    tryCatch({
        # Code that will error
        test("a")
    }, finally = {
        sink(stderr())
        traceback()
        sink()
    })

If you are running non-interactively, invoke R with the --interactive flag to force it. Saving the code above to test.R, you can see the effect with

    $ R --interactive < test.R 1> test.out 2> test.err

This seems reasonable, but maybe others will say if I'm missing something more automagic.

-Matt

On Sun, 2010-08-22 at 11:58 -0400, ivo welch wrote:
> yes, thank you. is it possible to have it invoked to STDERR automatically on a program abort? /iaw
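As an alternative to the tryCatch/traceback recipe above, the following sketch records the stack with a calling handler and sys.calls() and reports it on stderr via message(). This is a different technique from the one in the post, and with_stack is a hypothetical helper name, not part of R.

```r
# Hypothetical helper: evaluate `expr`; on error, record the call stack
# at the point of failure and report it on stderr.
with_stack <- function(expr) {
  stack <- NULL
  result <- tryCatch(
    withCallingHandlers(
      expr,
      # Runs while the failing frames are still on the stack.
      error = function(e) stack <<- sys.calls()
    ),
    error = function(e) e   # then unwind and return the condition object
  )
  if (inherits(result, "error")) {
    message("Error: ", conditionMessage(result))
    for (cl in stack) message("  ", deparse(cl)[1])
  }
  invisible(list(result = result, stack = stack))
}

test <- function(x) log(x)
out <- with_stack(test("a"))
```

The recorded stack includes the helper's own frames (tryCatch, withCallingHandlers) as well as the user's calls, so in practice one might want to trim the first few entries.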
Re: [R] Does R always insist on sending plot output to a file?
Donald,

I was able to 'trick' R into writing plot data to a GNU Linux fifo. I had forgotten that the fifo will block until there is a process at both ends (a writer and a reader).

At one terminal, create a fifo and set a program to catch output:

    $ mkfifo Rfifo
    $ cat Rfifo

At a second terminal:

    $ R
    > postscript(file="Rfifo")
    > plot(0)
    > dev.off()

-Matt
Re: [R] Pass By Value Questions
On Thu, 2010-08-19 at 14:27 -0400, Duncan Murdoch wrote:
> On 19/08/2010 12:57 PM, li...@jdadesign.net wrote:
>> I understand R is a Pass-By-Value language. I have a few practical questions, however. I'm dealing with a large dataset (~1GB) and so my understanding of the nuances of memory usage in R is becoming important. In an example such as:
>>
>>     d <- read.csv("file.csv"); n <- apply(d, 1, sum);
>>
>> must d be copied to another location in memory in order to be used by apply? In general, is copying only done when a variable is updated within a function?
>
> Generally R only copies when the variable is modified, but its rules for detecting this are sometimes overly conservative, so you may get some unnecessary copying. For example, d[1,1] <- 3 will probably not make a full copy of d when the internal version of `[<-` is used, but if you have an R-level version, it probably will. I forget whether the dataframe method is internal or R level. In the apply(d, 1, sum) example, it would probably make a copy of each row to pass to sum, but never a copy of the whole dataframe/array.
>
>> Would the following example be any different in terms of memory usage?
>>
>>     d <- read.csv("file.csv"); n <- apply(d[,2:10], 1, sum);
>>
>> or can R reference the original d object since no changes to the object are being made?
>
> This would make a new object containing d[,2:10], and would pass that to apply.

Since d is a data.frame, subsetting the columns would create a new data.frame, as Duncan says. However, the columns of the new data.frame would internally _reference_ the appropriate columns of d, until either were modified. This does not apply to row subsetting. That is, d[2:10,] would create a new data.frame and copy the relevant data. Nor does it apply to _any_ subsetting of matrices.

>> I'm familiar with FF and BigMemory, but are there any packages/tricks which allow for passing such objects by reference without having to code in C?

It's difficult to determine exactly when data is copied internally by R. The tracemem function may be used to track when entire objects are duplicated. However, tracemem would not detect the duplication that occurs, for example, when subsetting the rows of d. Otherwise, we can monitor memory usage with gc(), and experiment with code on a trial-and-error basis. I have had limited success in avoiding duplication by utilizing R environments. See for example http://biostatmatt.com/archives/663 . However, this may be more trouble than it's worth.

-Matt

> Duncan Murdoch
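To illustrate the remark about environments: an environment is passed by reference, so a function can modify data stored in it without returning a copy to the caller. This is only a sketch of the general idea, not the code from the linked post; set_first and set_first_copy are hypothetical helpers, and R may still duplicate the vector internally when it is modified.

```r
# Hypothetical helper: overwrite the first element of a vector held in an environment.
set_first <- function(env, value) {
  env$data[1] <- value   # changes the caller's environment in place
  invisible(NULL)
}

store <- new.env()
store$data <- c(10, 20, 30)
set_first(store, 0)      # store$data is now modified, nothing was returned

# For contrast: an ordinary vector argument is left unchanged in the caller.
set_first_copy <- function(v, value) {
  v[1] <- value
  v
}
plain <- c(10, 20, 30)
modified <- set_first_copy(plain, 0)
```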
Re: [R] Delete rpart/mvpart cross-validation output
Or, if using GNU Linux or other UNIX-like system:

    sink("/dev/null")
    # Issue commands
    sink()

-Matt

On Wed, 2010-08-18 at 09:14 -0400, Gabor Grothendieck wrote:
> On Fri, Aug 13, 2010 at 1:52 PM, Marie-Hélène Ouellette mariehele...@gmail.com wrote:
>> Dear all, I was wondering if there is a simple way to avoid printing the multiple cross-validation automatic output to the console of recursive partitioning functions like rpart or mvpart. For example...
>>
>>     data(spider)
>>     mvpart(data.matrix(spider[,1:12])~herbs+reft+moss+sand+twigs+water,spider,xv="1se",xvmult=100)
>>
>>     *X-Val rep : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
>>     Minimum tree sizes
>>     tabmins
>>     4 6 7 8
>>     2 18 78 2 *
>>
>> ... losing what's in bold?
>
> Try this hack:
>
>     cat <- function(...) if (..1 != "X-Val rep : 1") base::cat(...)
>     environment(mvpart) <- .GlobalEnv
>     mvpart(data.matrix(spider[,1:12])~herbs+reft+moss+sand+twigs+water,spider,xv="1se",xvmult=100)
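A platform-neutral variant of the sink() idea, sketched with a stand-in function so that it runs without mvpart installed. noisy_fit is hypothetical; it merely imitates a function that prints progress with cat() and returns a value.

```r
# Hypothetical stand-in for a chatty fitting function such as mvpart().
noisy_fit <- function() {
  cat("X-Val rep : 1 2 3\n")
  42
}

# Option 1: swallow everything printed to stdout while keeping the result.
invisible(capture.output(result <- noisy_fit()))

# Option 2: divert stdout to a throwaway file, which also works where /dev/null does not exist.
sink(tempfile())
result2 <- noisy_fit()
sink()
```

Neither option silences text sent to stderr (for example via message()); that would need sink(..., type = "message") or suppressMessages().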
Re: [R] Does R always insist on sending plot output to a file?
Donald,

At least for the PDF device (I know you asked about png, but I believe they are similar), the answer is no. Ultimately, this device calls the standard C function fopen, and writes its data to the resulting file stream. If you're using GNU Linux, you might trick R into writing to a fifo (a named pipe, see 'man fifo'), or some other in-memory device, and read from it with another program. My initial experiments with this, however, were not successful.

A better solution here would be to have the various graphics devices write to an R connection, as do most other R functions that input and output data. In this way, we could write graphics data to a RAW connection (rawConnection()), which is essentially a memory buffer. There are two obvious barriers to this:

1. C level I/O routines (e.g. fprintf) are heavily integrated into the graphics device code. Hence, accommodating R connections would require significant changes.
2. The graphics devices are mostly implemented in C, and there is (at present) no interface to R connections at the C level.

-Matt

On Wed, 2010-08-18 at 21:49 -0400, Donald Paul Winston wrote:
> I need to write the output of an R plot to a Java OutputStream. It looks like R insists on sending its output to a file. Is there any way to get bytes directly from the output of a plot so I can write it with Java? Writing it to a file is too slow. Is there a parameter in the graphics device function png(..) that directs output to a variable in memory? x <- plot(.) would make sense.
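Given that the devices write through fopen, one pragmatic workaround (not an in-memory device, and not something proposed in the post) is to plot to a temporary file and read the bytes back as a raw vector, which can then be handed to whatever consumes the stream. plot_bytes is a hypothetical helper, and pdf() is used only because it is available in every R build.

```r
# Hypothetical helper: run `plot_fun` against a temporary PDF device
# and return the resulting file contents as a raw vector.
plot_bytes <- function(plot_fun) {
  path <- tempfile(fileext = ".pdf")
  on.exit(unlink(path))          # remove the temporary file afterwards
  pdf(path)
  plot_fun()
  dev.off()                      # flushes and closes the file
  readBin(path, what = "raw", n = file.info(path)$size)
}

bytes <- plot_bytes(function() plot(0))
```

This still touches the file system, so it does not address the speed concern directly; it only shows how to end up with the plot in a variable.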
Re: [R] when to use textConnection ??
Also, many R functions are designed to operate on R connections, to input and output text. Alternatively, we may wish to provide the input text as an R character vector, or output text to a character vector. The textConnection makes a character vector look like a connection, so R routines that operate on connections may also operate on character vectors.

The textConnection also provides a mechanism for re-encoding text data, although this may be more directly accomplished via the iconv function. However, both methods are currently limited to encodings that do not allow embedded null characters.

-Matt

On Mon, 2010-08-16 at 13:06 -0400, Joshua Wiley wrote:
> Hi,
>
> One useful case is when data is sent in an email. For instance:
>
>     T1 T2 T3
>     -0.24 -0.26 -0.67
>     -1.58 0.04 0.14
>     -1.21 1.55 -0.45
>     0.31 0.48 -1.39
>
> One could read it in via
>
>     con <- textConnection("
>     T1 T2 T3
>     -0.24 -0.26 -0.67
>     -1.58 0.04 0.14
>     -1.21 1.55 -0.45
>     0.31 0.48 -1.39")
>     read.table(con, header = TRUE)
>
> Often a text file can be read in directly with read.table() and the appropriate delimiter (e.g., sep = "\t" for tab, "," for comma, etc.). Do you have a particular problem you are trying to solve or an application of textConnection() you are interested in?
>
> Cheers,
> Josh
>
> On Mon, Aug 16, 2010 at 9:37 AM, skan juanp...@gmail.com wrote:
>> Hello.
>> I don't understand when to use textConnection and when not. Some examples do it, some not. I've even seen something like
>>
>>     con <- textConnection(rev(rev(readLines('data.txt'))[-(1:2)]))
>>     data <- read.table(con)
>>     close(con)
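To make the two directions described in this thread concrete, here is a small sketch: a character vector is read as if it were a file, and the output of write.table() is collected into a character vector instead of a file. The variable names are arbitrary.

```r
# Input direction: a character vector stands in for a file.
txt <- c("T1 T2 T3",
         "-0.24 -0.26 -0.67",
         "-1.58 0.04 0.14")
con <- textConnection(txt)
dat <- read.table(con, header = TRUE)
close(con)

# Output direction: text written to the connection ends up in memory.
out <- textConnection(NULL, open = "w")
write.table(dat, out, quote = FALSE, row.names = FALSE)
lines_out <- textConnectionValue(out)   # one element per line written
close(out)
```

For data pasted from an email, read.table(text = ...) in later R versions does the same job in one call; the connection form is shown because it is what the thread discusses.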
Re: [R] band pass filter
nuncio,

If you already have a filter kernel, you can use the filter function. Of course, convolution filters can be applied directly using the discrete Fourier transform via the fft function. For an example of filtering (lowpass) with R, see http://biostatmatt.com/archives/78 , and the associated R script linked there. Also see the link to a free downloadable book by Steven Smith, which discusses the DFT and building filter kernels.

-Matt

On Sat, 2010-08-14 at 23:52 -0400, nuncio m wrote:
> Hello list,
> Is there any way to bandpass filter in R?
> thanks
> nuncio
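A minimal sketch of the frequency-domain approach mentioned above, assuming an idealized "brick-wall" band pass (every DFT coefficient outside the band is set to zero). The sampling rate, test signal, and band limits are invented for illustration; a practical filter would usually use a smoother kernel to limit ringing.

```r
fs <- 128                          # assumed sampling rate (Hz)
n  <- 512                          # number of samples
t  <- (0:(n - 1)) / fs

# Test signal: components at 2, 10 and 40 Hz
x <- sin(2 * pi * 2 * t) + sin(2 * pi * 10 * t) + sin(2 * pi * 40 * t)

X    <- fft(x)
freq <- (0:(n - 1)) * fs / n       # frequency of each DFT bin
freq <- pmin(freq, fs - freq)      # fold the upper half onto positive frequencies

keep <- freq >= 5 & freq <= 20     # pass band: 5 to 20 Hz
y <- Re(fft(X * keep, inverse = TRUE)) / n   # R's inverse fft is unnormalized
```

Because the three components fall exactly on DFT bins here, y recovers the 10 Hz component almost exactly; with real data the edges of the band will leak.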
Re: [R] reading a text file, one line at a time
Walt,

Something like:

    con <- file("your-large-file.txt", "rt")
    readLines(con, 1) # Read one line

Each subsequent call to readLines(con, 1) returns the next line, because the open connection keeps its position in the file.

-Matt

On Sun, 2010-08-15 at 10:58 -0400, Data Analytics Corp. wrote:
> Hi,
>
> I have an upcoming project that will involve a large text file. I want to
>
> 1. read the file into R one line at a time
> 2. do some string manipulations on the line
> 3. write the line to another text file.
>
> I can handle the last two parts. Scan and read.table seem to read the whole file in at once. Since this is a very large file (several hundred thousand lines), this is not practical. Hence the idea of reading one line at a time. The question is, can R read one line at a time? If so, how? Any suggestions are appreciated.
>
> Thanks,
> Walt
>
> Walter R. Paczkowski, Ph.D.
> Data Analytics Corp.
> 44 Hamilton Lane
> Plainsboro, NJ 08536
> (V) 609-936-8999
> (F) 609-936-3733
> w...@dataanalyticscorp.com
> www.dataanalyticscorp.com
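The three steps in the question (read a line, manipulate it, write it out) could be sketched as below. The function name process_lines and the toupper transformation are placeholders for this example only; the temporary files stand in for the large input file.

```r
# Read one line at a time from in_path, transform it, append it to out_path.
process_lines <- function(in_path, out_path, f = toupper) {
  inp <- file(in_path, "rt")
  out <- file(out_path, "wt")
  on.exit({ close(inp); close(out) })
  n <- 0L
  # readLines() returns a zero-length vector once the file is exhausted
  while (length(line <- readLines(inp, n = 1L)) > 0L) {
    writeLines(f(line), out)
    n <- n + 1L
  }
  n                                  # number of lines processed
}

src <- tempfile()
dst <- tempfile()
writeLines(c("alpha", "beta", "gamma"), src)
n_done <- process_lines(src, dst)
```

Reading in modest chunks (say n = 10000) instead of single lines is usually much faster and still keeps memory use bounded.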
Re: [R] ASCI characters
How about:

    rawToChar(as.raw(82))
    [1] "R"

-Matt

On Sun, 2010-08-15 at 19:50 -0400, Orvalho Augusto wrote:
> Hello guys!
>
> Is there any function that permits me to get an ASCII character from its code? E.g. ascifunction(34) would give me ' or ascifunction(92) gives \
>
> Thanks
> Caveman
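For completeness, a short sketch of the conversion in both directions. The wrapper names code_to_char and char_to_code are invented here; rawToChar, as.raw, charToRaw, and intToUtf8 are base R functions.

```r
code_to_char <- function(code) rawToChar(as.raw(code))    # codes -> string
char_to_code <- function(ch) as.integer(charToRaw(ch))    # string -> codes

one   <- code_to_char(82)            # "R"
word  <- code_to_char(c(72, 105))    # "Hi": several codes give one string
codes <- char_to_code("R")           # 82

# intToUtf8() also handles code points beyond the single-byte range
quote_and_backslash <- intToUtf8(c(34, 92), multiple = TRUE)
```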
Re: [R] Numerical Methods Course
TGS,

Given that you have to pay an outrageous $155.86 for that book, it seems reasonable to look for a free environment for numerical computing (like R!). If your instructor says that such a variety of programming languages would work, you could probably make a good argument to use R. But why not just ask your instructor? If your instructor insists on MATLAB, you could also consider using GNU Octave, a free MATLAB clone.

-Matt

On Tue, 2010-08-10 at 10:55 -0400, TGS wrote:
> I want to take this numerical methods course where the text is http://www.amazon.com/Numerical-Methods-J-Douglas-Faires/dp/0534407617 . The instructor recommends MATLAB, but states Fortran, C, Mathematica, or Maple will also do the job. Will R do the job as well? If not, where do you think it will be lacking in the context of this book/course?
Re: [R] Good Book To Work Through This Summer
There are some book-length documents (downloadable for free) at the contributed documentation section of the R project website here: http://cran.r-project.org/other-docs.html

In particular, the book “Practical Regression and Anova using R” by Julian Faraway looks to have the content you want, though I haven't read it myself. There are other high quality authors in the list also.

-Matt

On Mon, 2010-08-09 at 03:20 -0400, Ondrej Vozar wrote:
> Hello,
>
> I think that a good introduction for application-oriented people is the book by Peter Dalgaard, Introductory Statistics with R
> http://www.springer.com/statistics/computanional+statistics/book/978-0-387-79053-4
> This book is good for mastering the basics of R.
>
> Another book I like is the one by John Fox, An R and S-PLUS Companion to Applied Regression
> http://socserv.socsci.mcmaster.ca/jfox/Books/Companion/index.html
>
> But there are dozens of books on this topic.
>
> Best regards,
> Ondrej Vozar.
>
> On 9 August 2010 06:38, TGS cran.questi...@gmail.com wrote:
>> Dear R users,
>> I'm hoping to get a few suggestions about which books are good to follow along and learn R. I'm hoping to spend the summer going through a good R book as it is applied in linear regression. Thanks!
Re: [R] Changing downloaded source code into a package
See comments below.

On Mon, 2010-08-09 at 10:22 -0400, JH wrote:
> I am wanting to change some lines of code in the R package named nlme http://cran.r-project.org/web/packages/nlme/index.html To do this I have downloaded the Package source named nlme_3.1-96.tar.gz, opened up the file and changed the text documents within the folder named R, specifically the cor.Struct.txt file.

I couldn't find this file. Do you mean corStruct.R, or maybe corStruct.c?

> I now want to know how can I use this modified nlme_3.1-96.tar.gz file in R 2.10. How do I convert this source code into a package?

The source code, along with the documentation, data files, etc. _is_ the package. When the package contains source code from a compiled language (C or Fortran), as nlme does, this code must be compiled for your platform before the package is installed. The CRAN maintainers kindly pre-compile this code for Windows and Mac OS X users. If you make modifications to C or Fortran code in a package, you must re-compile the code yourself, or use a service such as R-Forge.

The R manual 'Writing R Extensions' is the standard reference for packages. See also the 'R Installation and Administration' manual. See the information here http://www.murdoch-sutherland.com/Rtools/ for compiling package code in Windows.

Lastly, before you follow the instructions at the URL above, I urge you to consider GNU Linux as a platform for programming. I've found the tools available in standard GNU Linux distributions (such as that available at http://www.debian.org) much simpler to install and work with.

> I have looked on the internet and tried using cmd.exe then the code
>
>     Rmcd.exe INSTALL -1 ~/nlme_3.1-96.tar.gz
>
> I end up getting the message The system can't find the specified path, when I have the file in the directory that Rmcd.exe is in.

--
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
Re: [R] grep with search terms defined by a variable
Daniel,

If you want to search for each term at the beginning of a string, using the regular expression construct '^', you might use the following:

    search.terms <- c("Emil", "Meryl")
    names <- c("Emil Jannings", "Charles Chaplin",
               "Katherine Hepburn", "Meryl Streep")
    for(term in search.terms) {
        print(grep(paste("^", term, sep=""), names))
    }
    [1] 1
    [1] 4

-Matt

On Tue, 2010-08-03 at 00:05 -0400, Daniel Malter wrote:
> Hi, I have a good grasp of grep() and gsub() for finding and extracting character strings. However, I cannot figure out how to use a search term that is stored in a variable when the search string is more complex.
>
>     #Say I have a string, and want to know whether the last name Jannings is in the string. This is done by
>     names=c("Emil Jannings")
>     grep("Emil",names)
>
>     #Yet, I need to store the search terms in a variable, which works for the very simple example
>     search.term="Emil"
>     grep(search.term,names)
>
>     #but I cannot get it to work for the more difficult example in which I want to do something like
>     grep(^search.term,names)
>     grep(^search.term,names)
>     grep(^search.term,names)
>     #Implying that the search term must be the first part of the string that is being searched
>
>     #Ultimately, I need to loop over several strings stored in search.term, for example,
>     names=c("Emil Jannings","Charles Chaplin","Katherine Hepburn","Meryl Streep")
>     search.term=c("Emil","Meryl")
>     for(i in 1:length(names)){
>         print(grep(^search.term[i],names))
>     }
>
> So the questions I have are two.
>
> 1. How do I concatenate terms that I would normally quote (like ^) with variables that contain search terms and that normally would not be quoted?
> 2. How do I run this over indices of the variable that contains the search terms?
>
> I greatly appreciate any help,
> Daniel
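The same idea without an explicit loop, as a sketch: build all the anchored patterns at once and apply grep() to each. The vector is called actors here (rather than names) only to avoid masking the base function names(); startsWith() is a later addition to base R (3.3.0 and up) and sidesteps regular expressions entirely.

```r
search.terms <- c("Emil", "Meryl")
actors <- c("Emil Jannings", "Charles Chaplin",
            "Katherine Hepburn", "Meryl Streep")

# One anchored pattern per search term: "^Emil", "^Meryl"
patterns <- paste0("^", search.terms)

# Indices of matching elements, one list entry per search term
hits <- setNames(lapply(patterns, grep, x = actors), search.terms)

# No regular expression needed when only a prefix test is wanted
meryl_idx <- which(startsWith(actors, "Meryl"))
```

If a search term may contain regular expression metacharacters (such as '.' or '('), the pasted pattern would need escaping; the startsWith() form does not have that problem.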
Re: [R] Meaning of following function
Ron,

In arithmetic, '-' and '+' are binary _and_ unary operators. That is, both -1 and 1-1 are valid arithmetic expressions; the former negates its argument, and the latter subtracts the second from the first. Since much of R is designed to do arithmetic, R honors the unary _and_ binary versions of '-' and '+'. The implementation of `-`() performs negation when the second argument is missing, and subtraction when both arguments are present. AFAIR, the only other unary (but never binary) operator in R is '!', or the 'NOT' operator (maybe also the one-sided formula operator '~').

In contrast, the 'times' or 'multiply' operator '*' is generally a binary operator in arithmetic. Hence, the function `*`() requires two arguments.

-Matt

On Sun, 2010-08-01 at 10:56 -0400, Ron Michael wrote:
> Hi friends, I am aware of the function `-`() which acts as minus in ordinary computations. For example:
>
>     `-`(3, 1)
>     [1] 2
>
> However what is the meaning of
>
>     `-`(3)
>     [1] -3
>
> I was expecting R to generate some error as it does for `*`(3). What is the logic for that calculation?
>
> Thanks,
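The behaviour described above can be checked directly; this is only a small demonstration, with tryCatch() used so that the error from the one-argument call to `*` is captured rather than stopping the script.

```r
binary_minus <- `-`(3, 1)     # subtraction: two arguments
unary_minus  <- `-`(3)        # negation: second argument missing
unary_plus   <- `+`(3)        # unary plus returns its argument

# `*` has no unary form, so a single argument is an error
times_one_arg <- tryCatch(`*`(3), error = function(e) conditionMessage(e))

not_true <- `!`(TRUE)         # '!' is unary only
```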
Re: [R] how to code it??
If I take your meaning correctly, you want something like this.

    x <- c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1)
    easy <- function(x) {
        state <- 0
        for (i in 1:length(x)) {
            if (x[i] == 0)
                x[i] <- state
            state <- 0
            if (x[i] == 1)
                state <- -1
        }
        x
    }
    easy(x)
     [1]  0  0  0  0  0  0  0  1 -1  0  1  1 -1  0  1 -1  0  0  1

-Matt

On Wed, 2010-07-28 at 14:10 -0400, Raghu wrote:
> Hi
>
> I have say a large vector of 3500 digits. Initially the digits are 0s and 1s. I need to check for a rule to change some of the 0s to -1s in this vector. But once I change a 0 to -1 then I need to start applying the rule to change the next 0 only after I see the next 1 in the vector.
>
> Say for example x = (0,0,0,0,0,0,0,1,0,0,1,1,0,0,1,0,0,0,1)
>
> I need to traverse from the 9th element to the last (because the first occurrence of 1 is at 8). Let us assume that according to our rule we change the 13th element (only 0s can be changed) to -1. Now we need to go to the next occurrence of 1 (which is 15) and begin the rule application from the 16th till the end of the vector and once replaced a 0 to a -1 then start again from the next 1.
>
> How do we code this? I 'feel' recursion is the best possible solution but I am not a programmer and will await experts' views. If this is not a typical R-forum question then my advance apologies.
>
> Many thx
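The loop above turns a 0 into -1 exactly when the element just before it is a 1. Under that reading (which is an interpretation of the reply, not of the original poster's unspecified rule), the same result can be sketched without a loop by comparing the vector with a shifted copy of itself. The function name mark_after_one is made up for this example.

```r
mark_after_one <- function(x) {
  n <- length(x)
  if (n < 2) return(x)
  # TRUE where an element is 0 and its predecessor is 1; the first
  # element has no predecessor, so it is never changed
  hit <- c(FALSE, x[-1] == 0 & x[-n] == 1)
  x[hit] <- -1
  x
}

x <- c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1)
res <- mark_after_one(x)
```

A rule that depends on more than the immediately preceding element would still need the explicit state variable of the loop version.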
Re: [R] what is a vignette?
Alex,

Vignettes are optional supplemental documentation. That is, they are in addition to the required boilerplate documentation for R functions and datasets. Vignettes are written in the spirit of sharing knowledge, and assisting new users in learning the purpose and use of a package. Maybe the best place to start is simply to read one, or a few. The `zoo` package has a few, for example here: http://cran.r-project.org/web/packages/zoo/index.html

The technical details of vignettes, and how to write one, are contained in the `Writing R Extensions` manual: http://cran.r-project.org/manuals.html

-Matt

On Mon, 2010-07-26 at 07:55 -0400, Alaios wrote:
> I am trying to find a simple R guide that explains what a vignette is but so far I didn't make any progress. I tried to search inside R's built in help.start() but it only returns results on how to see vignettes.
>
> So could you please tell me what a vignette is and, if you can, also give some simple guide that I can always use to read about these things?
>
> Best Regards
> Alex
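From within R, the vignettes of installed packages can be listed and opened with the vignette() function. The sketch below uses the grid package because it ships with R; the commented-out line assumes the zoo package mentioned above is installed and that it has a vignette named "zoo".

```r
# List the vignettes that an installed package provides
v <- vignette(package = "grid")
titles <- v$results[, c("Item", "Title"), drop = FALSE]

# Open a specific vignette (assumes zoo is installed; not run here)
# vignette("zoo", package = "zoo")

# List vignettes from all installed packages in a browser (not run here)
# browseVignettes()
```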
Re: [R] sink function
I had addressed a problem similar to this only a few days ago. Please see the following URL: http://tolstoy.newcastle.edu.au/R/e11/help/10/07/1677.html

On Fri, 2010-07-23 at 08:45 -0400, nuncio m wrote:
> I have the following code to write the output from the auto.arima function. The issue is not in finding the model but in diverting its output, fit, to a file order_fit.txt. The code runs but nothing is written to order_fit.txt. Where am I going wrong?
>
>     library(forecast)
>     for (i in 1:2) {
>         filen = paste("file",i,".txt",sep="")
>         data <- read.table(filen)
>         dat1 <- data[,1]
>         xt <- ts(dat1,start=c(1978,11),end=c(2006,12),frequency=12)
>         #dat1[dat1 == -99.989998] <- NA
>         if (min(dat1) != max(dat1)){
>             fit <- auto.arima(xt,D=1)
>             sink(file="order_fit.txt")
>             fit
>             sink()
>             residfit <- residuals(fit)
>             filenou1 = paste("fileree",i,"_out",".txt",sep="")
>             residfit
>             write.table(residfit,filenou1,sep="\t",col.names=FALSE,row.names=FALSE,quote=FALSE)
>         }else{
>             fiit <- "ARIMA(-6,-6,-6)(-6,-6,-6)[12]"
>             sink(file="order_fit.txt")
>             fiit
>             sink()
>             filenou1 = paste("fileree",i,"_out",".txt",sep="")
>             residfit=rep(-99.99,338)
>             residfit
>             write.table(residfit,filenou1,sep="\t",col.names=FALSE,row.names=FALSE,quote=FALSE)
>             rm(data,dat1,residfit,xt)
>         }
>     }
Re: [R] Help with Sink Function
Your code between calls to sink() does not generate any output. Hence, nothing will be diverted to the file. To illustrate this point, consider

    for(i in 1:10) i

This produces no output. However,

    for(i in 1:10) print(i)

produces output as expected. Automatic printing happens only for expressions evaluated at the top level; inside a loop (or a function) you must call print() explicitly.

-Matt

On Fri, 2010-07-16 at 13:34 -0400, Addi Wei wrote:
> Sorry about that. Still new to this... The code below should be reproducible. All R2 should just be 1, and I should write 1 to R2outputKKNN.txt 10 times... nothing is happening. Appreciate the efforts to help!
>
>     for (i in 1:10) {
>         adata = 1:5
>         bdata = 6:10
>         lm <- lm(adata~bdata)
>         slm <- summary(lm)
>         str(slm)
>         if (i == 1) {
>             previousR2 <- slm$r.squared
>             sink(file="R2outputKKNN.txt", append=TRUE)
>             previousR2
>             sink()
>         } else if(i!=1) {
>             currentR2 <- slm$r.squared
>             if (previousR2 currentR2) {
>                 currentR2 <- previousR2
>             }
>             if (previousR2 currentR2) {
>                 sink(file="R2outputKKNN.txt", append=TRUE)
>                 currentR2
>                 sink()
>             }
>         }
>     }
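A runnable version of the point made above, written against a temporary file. The second half shows one alternative that avoids sink() altogether by having cat() append to a file; which of the two is preferable depends on whether printed (formatted) output or raw values are wanted.

```r
# With sink(): print() must be called explicitly inside the loop
out_file <- tempfile(fileext = ".txt")
sink(out_file)
for (i in 1:10) print(i)
sink()

# Without sink(): write the values directly, appending on each pass
out_file2 <- tempfile(fileext = ".txt")
for (i in 1:10) cat(i, "\n", sep = "", file = out_file2, append = TRUE)

printed <- readLines(out_file)    # lines as print() formats them
written <- readLines(out_file2)   # bare values
```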
Re: [R] calling a c function from R
Fahim,

Please see the Writing R Extensions manual http://cran.r-project.org/doc/manuals/R-exts.pdf

There are simple instructions in this document under the heading "System and foreign language interfaces".

-Matt

On Wed, 2010-07-14 at 01:21 -0400, Fahim Md wrote:
> Hi,
>
> I am trying to call a C function, that I wrote to parse a flat file, into R. The argument that will go into this function is an input file that I need to parse and write the desired output in an output file. I used some hit and trial approach but I keep on getting the file not found or segmentation fault error. I know that the error is in passing the argument but I could not solve it.
>
> After reading some of the tutorials, I understood how to do this if the arguments are integers or floats. I am stuck when I am trying to send the files. I am attaching a stub of each file. Help appreciated. Thanks
>
> My function call would be:
>
>     source("parse.R")
>     parseGBest('./gbest/inFile.seq', './gbest/outFile.out');
>
> I wrote a wrapper function (parse.R) as follows:
>
>     dyn.load("parse.so");
>     parseGBest = function(inFile, outFile) {
>         .C("parse", inFile, outFile);
>     }
>
> How to receive the filenames in function( , ) above, and how to call .C?
>
> parse.c file is as below. How to receive the argument in the function and how to make it compatible with my argv[ ]?
>
>     void parse( int argc, char *argv[] )
>     // This is working as standalone C program. How to receive
>     // the above files so that it become compatible with my argv[ ]
>     {
>         FILE *fr, *of;
>         char line[81];
>         if ( *argc == 3 )*/
>         {
>             if ( ( fr = fopen( argv[0], "r" )) == NULL )
>             {
>                 puts( "Can't open input file.\n" );
>                 exit( 0 );
>             }
>             if ( ( of = fopen( argv[1], "w" )) == NULL )
>             {
>                 puts( "Output file not given.\n" );
>             }
>         }
>         else
>         {
>             printf("wrong usage: Try Agay!!! correct usage is:= functionName inputfileToParse outFileToWriteInto\n");
>         }
>         while(fgets(line, 81, fr) != NULL)
>         -- --- --
>     }
>
> Thanks again
> Fahim
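As a sketch of what the manual section describes: with .C(), each R character vector arrives in C as a char ** (so the first file name is inFile[0]), every argument is a pointer, and the C function returns void. The example below is not the poster's parser; copy_lines, build_copy_lines, and the file names are invented for illustration, and the C routine simply copies lines from one file to another. It assumes a working C compiler, since it calls R CMD SHLIB.

```r
c_src <- '
#include <stdio.h>
#include <R.h>

/* .C() passes an R character vector as char **; [0] is its first element. */
void copy_lines(char **inFile, char **outFile, int *nlines)
{
    char line[1024];
    FILE *fr, *of;

    fr = fopen(inFile[0], "r");
    if (fr == NULL)
        Rf_error("cannot open input file");
    of = fopen(outFile[0], "w");
    if (of == NULL) {
        fclose(fr);
        Rf_error("cannot open output file");
    }
    *nlines = 0;
    /* counts fgets chunks; equals the line count for lines under 1023 chars */
    while (fgets(line, sizeof line, fr) != NULL) {
        fputs(line, of);      /* a real parser would transform the line here */
        (*nlines)++;
    }
    fclose(fr);
    fclose(of);
}
'

# Compile the C source with R CMD SHLIB and load the shared library.
build_copy_lines <- function(dir = tempdir()) {
  old <- setwd(dir)
  on.exit(setwd(old))
  writeLines(c_src, "copy_lines.c")
  lib <- paste0("copy_lines", .Platform$dynlib.ext)
  status <- system2(file.path(R.home("bin"), "R"),
                    c("CMD", "SHLIB", "-o", lib, "copy_lines.c"))
  if (status != 0) stop("R CMD SHLIB failed; is a C compiler installed?")
  dyn.load(file.path(getwd(), lib))
  invisible(lib)
}

# R wrapper: path.expand() matters because fopen() does not understand "~".
copy_lines <- function(inFile, outFile) {
  res <- .C("copy_lines",
            as.character(path.expand(inFile)),
            as.character(path.expand(outFile)),
            nlines = integer(1))
  res$nlines
}
```

After build_copy_lines() has run once, copy_lines("in.txt", "out.txt") returns the number of chunks copied. Relative paths are resolved against R's working directory (see getwd()), which is a common source of "file not found" errors.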
Re: [R] Fast string comparison
On Tue, 2010-07-13 at 01:42 -0400, Hadley Wickham wrote:
>     strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = ""))
>     system.time(strings[-1] == strings[-1e5])
>     # user system elapsed
>     # 0.016 0.000 0.017
>
> So it takes ~1/100 of a second to do ~100,000 string comparisons. You need to provide a reproducible example that illustrates why you think string comparisons are slow.

Here's a vectorized alternative to '==' for strings, with minimal argument checking or result conversion. I haven't looked at the corresponding R source code, it may be similar:

    library(inline)
    code <- "
    SEXP ans; int i, len, *cans;
    if(!isString(s1) || !isString(s2))
        error(\"invalid arguments\");
    len = length(s1)>length(s2)?length(s2):length(s1);
    PROTECT(ans = allocVector(INTSXP, len));
    cans = INTEGER(ans);
    for(i = 0; i < len; i++)
        cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),\
                         CHAR(STRING_ELT(s2,i)));
    UNPROTECT(1);
    return ans;
    "
    sig <- signature(s1="character", s2="character")
    strcmp <- cfunction(sig, code)

    system.time(strings[-1] == strings[-1e5])
       user  system elapsed
      0.036   0.000   0.035
    system.time(strcmp(strings[-1], strings[-1e5]))
       user  system elapsed
      0.032   0.000   0.034

That's pretty fast, though I seem to be working with a slower system than Hadley. It's hard to see how this could be improved, except maybe by caching results of string comparisons.

-Matt

> Hadley
>
> On Tue, Jul 13, 2010 at 6:52 AM, Ralf B ralf.bie...@gmail.com wrote:
>> I am asking this question because String comparison in R seems to be awfully slow (based on profiling results) and I wonder if perhaps '==' alone is not the best one can do. I did not ask for anything particular and I don't think I need to provide a self-contained source example for the question. So, to re-phrase my question, are there more (runtime) effective ways to find out if two strings (about 100-150 characters long) are equal?
>>
>> Ralf
>>
>> On Sun, Jul 11, 2010 at 2:37 PM, Sharpie ch...@sharpsteen.net wrote:
>>> Ralf B wrote:
>>>> What is the fastest way to compare two strings in R?
>>>>
>>>> Ralf
>>>
>>> Which way is not fast enough? In other words, are you asking this question because profiling showed one of R's string comparison operations is causing a massive bottleneck in your code? If so, which one and how are you using it?
>>>
>>> -Charlie
>>>
>>> Charlie Sharpsteen
>>> Undergraduate -- Environmental Resources Engineering
>>> Humboldt State University

--
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com
Re: [R] Fast string comparison
Good idea Romain, there is quite a bit of type testing in the function versions of STRING_ELT and CHAR, not to mention the function call overhead. Since the types are checked explicitly, I believe this function is safe. All together now...

    system.time(strings[-1] == strings[-1e5])
       user  system elapsed
      0.032   0.000   0.035
    system.time(strcmp(strings[-1], strings[-1e5]))
       user  system elapsed
      0.032   0.000   0.034
    system.time(strcmp2(strings[-1], strings[-1e5]))
       user  system elapsed
      0.024   0.000   0.026
    system.time(lhs==rhs)
       user  system elapsed
      0.012   0.000   0.013
    system.time(strcmp(lhs, rhs))
       user  system elapsed
      0.012   0.000   0.011
    system.time(strcmp2(lhs, rhs))
       user  system elapsed
      0.004   0.000   0.004

It looks like you can squeeze out more speed using the macro versions of STRING_ELT and CHAR.

On Tue, 2010-07-13 at 09:48 -0400, Romain Francois wrote:
> Hi Matt,
>
> I think there are some confusing factors in your results.
>
> system.time(strcmp(strings[-1], strings[-1e5])) would also include the time required to perform both subscripting operations (strings[-1] and strings[-1e5]), which actually takes some time.
>
> Also, you do have a bit of overhead due to the use of STRING_ELT and the write barrier.
>
> I've included below a version that uses R internals so that you get the fast (but you have to understand the risks, etc ...) version of STRING_ELT using the plugin system of inline.
>
>     library(inline)
>     code <- "
>     SEXP ans; int i, len, *cans;
>     if(!isString(s1) || !isString(s2))
>         error(\"invalid arguments\");
>     len = length(s1)>length(s2)?length(s2):length(s1);
>     PROTECT(ans = allocVector(INTSXP, len));
>     cans = INTEGER(ans);
>     for(i = 0; i < len; i++)
>         cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),\
>                          CHAR(STRING_ELT(s2,i)));
>     UNPROTECT(1);
>     return ans;
>     "
>     sig <- signature(s1="character", s2="character")
>     strcmp <- cfunction(sig, code)
>
>     strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = ""))
>     lhs <- strings[-1]
>     rhs <- strings[-1e5]
>     system.time( lhs == rhs )
>     system.time(strcmp( lhs, rhs) )
>
>     library(inline)
>     settings <- getPlugin( "default" )
>     settings$includes <- paste( "#define USE_RINTERNALS", settings$includes, collapse = "\n" )
>     code2 <- "
>     SEXP ans; int i, len, *cans;
>     if(!isString(s1) || !isString(s2))
>         error(\"invalid arguments\");
>     len = length(s1)>length(s2)?length(s2):length(s1);
>     PROTECT(ans = allocVector(INTSXP, len));
>     cans = INTEGER(ans);
>     for(i = 0; i < len; i++)
>         cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),\
>                          CHAR(STRING_ELT(s2,i)));
>     UNPROTECT(1);
>     return ans;
>     "
>     sig <- signature(s1="character", s2="character" )
>     strcmp2 <- cxxfunction(sig, code2, settings = settings)
>     system.time(strcmp2( lhs, rhs) )
>
> I get:
>
>     $ Rscript strings.R
>     Loading required package: methods
>        user  system elapsed
>       0.002   0.000   0.002
>        user  system elapsed
>       0.004   0.000   0.005
>        user  system elapsed
>       0.003   0.000   0.003
>
> Romain
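Romain's point about separating the subscripting from the comparison can be reproduced in plain R, without compiling anything. The sketch below uses a smaller vector than the thread (1e4 strings rather than 1e5) purely to keep it quick; the timings themselves will differ from machine to machine, so none are shown.

```r
set.seed(42)
strings <- replicate(1e4, paste(sample(letters, 100, replace = TRUE),
                                collapse = ""))

# Timing that includes the cost of building both subscripted vectors
t_with_subset <- system.time(eq1 <- strings[-1] == strings[-length(strings)])

# Subscript once, outside the timed expression, then time only '=='
lhs <- strings[-1]
rhs <- strings[-length(strings)]
t_compare_only <- system.time(eq2 <- lhs == rhs)

# For a single pair of strings, identical() returns one TRUE/FALSE
same_first <- identical(lhs[1], rhs[1])
```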
Re: [R] Compress string memCompress/Decompress
On Fri, 2010-07-09 at 20:02 -0400, Erik Wright wrote:
> Hi Matt,
>
> This works great, thanks! At first I got an error message saying BLOB is not implemented in RSQLite. When I updated to the latest version it worked.

SQLite began to support BLOBs from version 3.0.

> Is there any reason the string needs to be stored as type BLOB? It seems to work the same when I swap BLOB with TEXT in the CREATE TABLE command.

SQLite has a dynamic-type system. That is, data types are associated with values rather than with their container (column). This means that most columns in a table can store more than just the type (or 'affinity') they are declared with. I think that's what happens when you use TEXT rather than BLOB. If you use something like x'A9' to insert data into a column with TEXT affinity, I believe it is stored as a BLOB regardless.

-Matt

> Thanks again!,
> Erik
Re: [R] Compress string memCompress/Decompress
Erik,

Can you store the data as a blob? For example:

    # create string, compress with gzip, convert to SQLite blob string
    string <- "gzip this string, store as blob in SQLite database"
    string.gz <- memCompress(string, type="gzip")
    string.sqlite <- paste("x'", paste(string.gz, collapse=""), "'", sep="")

    # create database and table with a BLOB column
    library(RSQLite)
    # Loading required package: DBI
    con <- dbConnect(dbDriver("SQLite"), "compress.sqlite")
    dbGetQuery(con, "CREATE TABLE Compress (id INTEGER, data BLOB);")
    # NULL

    # insert the string as a blob
    query <- paste("INSERT INTO Compress (id, data) VALUES (1, ",
                   string.sqlite, ");", sep="")
    dbGetQuery(con, query)
    # NULL

    # recover the blob, decompress, and convert back to a string
    result <- dbGetQuery(con, "SELECT data FROM Compress;")
    string.gz <- result[[1]][[1]]
    string <- memDecompress(string.gz, type="gzip")
    rawToChar(string)
    # [1] "gzip this string, store as blob in SQLite database"

-Matt

On Fri, 2010-07-09 at 12:51 -0400, Erik Wright wrote:
> Hello,
>
> I would like to compress a long string (character vector), store the compressed string in the text field of a SQLite database (using RSQLite), and then load the text back into memory and decompress it back into the original string. My character vector can be compressed considerably using standard gzip/bzip2 compression. In theory it should be much faster for me to compress/decompress a long string than to write the whole string to the hard drive and then read it back (not to mention the saved hard drive space).
>
> I have tried accomplishing this task using memCompress() and memDecompress() without success. It seems memCompress can only convert a character vector to raw type, which cannot be treated as a string. Does anyone have ideas on how I can go about doing this, especially using the standard base packages?
>
> Thanks!,
> Erik
>
> sessionInfo()
> R version 2.11.0 (2010-04-22)
> x86_64-apple-darwin9.8.0
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> loaded via a namespace (and not attached):
> [1] tools_2.11.0

--
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com
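The compression round trip in this thread can also be checked with base R alone, by carrying the compressed bytes as hex text (the same text that goes inside an x'...' literal). This is only a minimal sketch, not part of the original reply; hex_to_raw is a hypothetical helper introduced here for illustration.

```r
# Round trip: character -> gzip-compressed raw -> hex text -> raw -> character
string <- "gzip this string, store as blob in SQLite database"
string.gz <- memCompress(charToRaw(string), type = "gzip")

# as.character() on a raw vector yields one two-digit hex code per byte
hex <- paste(as.character(string.gz), collapse = "")
blob.literal <- paste0("x'", hex, "'")   # the form used in the INSERT statement

# hypothetical helper: parse hex text back into a raw vector, two digits per byte
hex_to_raw <- function(h) {
  starts <- seq(1, nchar(h), by = 2)
  as.raw(strtoi(substring(h, starts, starts + 1), base = 16L))
}

# decompress the recovered bytes and convert back to a character string
restored <- rawToChar(memDecompress(hex_to_raw(hex), type = "gzip"))
```

Hex text doubles the size of the compressed bytes, so storing the raw bytes as a BLOB is the more compact choice when the database driver supports it.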
Re: [R] Calling Gnuplot from R
I recently wrote a small R function to draw simple ASCII scatterplots:

http://biostatmatt.com/archives/491

Bill Harris commented that the plots reminded him of Gnuplot's dumb terminal. I think it would be really neat to have an R graphics driver for Gnuplot in order to generate more complete ASCII graphics in R. Maybe there are other good reasons also? I believe Octave makes good use of Gnuplot...

-Matt

On Thu, 2010-07-08 at 11:28 -0400, Erik Iverson wrote:
> If you use Emacs, you can use org-mode with org-babel to facilitate this... I'll refrain from asking why :).
>
> See: http://orgmode.org/worg/org-contrib/babel/index.php
>
> Christopher Desjardins wrote:
>> Hi,
>> I am wondering if there is a way to call Gnuplot from R and/or if anyone can recommend a package on CRAN capable of doing this?
>> Thanks,
>> Chris
>>
>> PS - Please cc me on the response.

--
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com
Re: [R] Plot with whispers
It looks like read.table is reading the first line as a data value, which is the default for read.table. Try using read.table with the argument header=TRUE. Also, consider using a box and whiskers plot for these data (?boxplot, ?lattice::bwplot).

-Matt

On Mon, 2010-07-05 at 12:08 -0400, Ian Bentley wrote:
> Hello!
>
> I need to make a plot with whiskers that does the following.
>
> Reads in 50 files, each file containing 200 data points. A file looks like this:
>
> base100.log
> Send Receive
> 10.5 100.3
> 15.0 102.4
> ...
>
> There are 100 lines, each with two data points.
>
> I need to read in the 50 files, and plot three lines.
> The first line is the mean of the send column with whiskers indicating standard deviation (each file represents one data point).
> The second line is the mean of the receive column, as above.
> The final plot is the mean of the two summed, with whiskers as above.
>
> There will be 50 data points on the final graph, one for each file.
>
> I've done this sort of a thing before, but I really can't figure out how to handle the different columns. If I use read.table:
>
> x1 <- read.table("updateToSink1010.log")
>
> then x1 becomes a matrix, with two columns and 101 rows -- including Send, Receive.
>
> Anyways, I'd appreciate a push in some direction - hopefully the right one :).

--
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com
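As an illustration of the header=TRUE advice, the sketch below reads a small stand-in for one log file and computes the per-file means and standard deviations that the question asks for. The inline data and the summarize_log helper are assumptions made here for illustration; they are not from the thread.

```r
# Hypothetical stand-in for one log file: a header line, then two numeric columns
log.text <- "Send Receive
10.5 100.3
15.0 102.4
12.5 101.1"

# header = TRUE turns the first line into column names instead of a data row
x1 <- read.table(text = log.text, header = TRUE)

# One plotted point per file: mean and standard deviation of each column
# and of their row-wise sum
summarize_log <- function(d) {
  total <- d$Send + d$Receive
  c(send.mean    = mean(d$Send),    send.sd    = sd(d$Send),
    receive.mean = mean(d$Receive), receive.sd = sd(d$Receive),
    total.mean   = mean(total),     total.sd   = sd(total))
}

s <- summarize_log(x1)
```

With the real files, applying summarize_log to each data frame (for example through sapply over a list of read.table results) would give one column of summaries per file. The whiskers could then be drawn at mean plus and minus one standard deviation with arrows(), using angle = 90 and code = 3.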
Re: [R] Function to compute the multinomial beta function?
How about this?

    mbeta <- function(...) {
        exp(sum(lgamma(c(...))) - lgamma(sum(c(...))))
    }

    gamma(5)*gamma(6)*gamma(7)/gamma(18)
    # [1] 5.829838e-09
    mbeta(5,6,7)
    # [1] 5.829838e-09

On Mon, 2010-07-05 at 17:10 -0400, Gregory Gentlemen wrote:
> Dear R-users,
>
> Is there an R function to compute the multinomial beta function? That is, the normalizing constant that arises in a Dirichlet distribution. For example, with three parameters the beta function is
>
> Beta(n1,n2,n3) = Gamma(n1)*Gamma(n2)*Gamma(n3)/Gamma(n1+n2+n3)
>
> Thanks in advance for any assistance.
>
> Regards,
> Greg

--
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com
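The reply works through lgamma() rather than gamma(), which matters once the parameters get large. The sketch below separates the log-scale computation into its own function; the name lmbeta is an assumption introduced here, not something from the thread.

```r
# Multinomial beta on the log scale: sum of log-gammas minus log-gamma of the sum
lmbeta <- function(...) {
  a <- c(...)
  sum(lgamma(a)) - lgamma(sum(a))
}

# Exponentiate only at the end
mbeta <- function(...) exp(lmbeta(...))

direct  <- gamma(5) * gamma(6) * gamma(7) / gamma(18)  # direct formula, small arguments
via.log <- mbeta(5, 6, 7)                               # same value via lgamma()
two.arg <- mbeta(2, 3)                                  # reduces to beta(2, 3)
big     <- lmbeta(200, 300, 400)                        # finite; gamma(200) is not
```

For Dirichlet densities with large parameters it is usually safer to stay on the log scale throughout and use lmbeta() directly.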
Re: [R] left end or right end
Suku,

It looks like you might want to consult with a [bio]statistician, but I'm interested in what these distances represent. Can you give some additional context for your problem? How were these distances collected? Is it a collection of pairs of intervals, like this:

        P                 Q
    1)  (1.5, 1.8)        (1.2, 2.0)
    2)  (1.4, 1.9)        (1.4, 2.3)
    ...
    1)  (start1, end1)    (start2, end2)

? If so, is there a more specific test you're interested in? For instance, whether the interval P overlaps with the start/stop position of interval Q, or whether start1 == start2, or end1 == end2, or both? I can think of a bootstrap test for hypotheses like this, and this is relatively easy in R.

-Matt

On Thu, 2010-07-01 at 07:53 -0400, ravikumar sukumar wrote:
> Dear all,
>
> I am a biologist. I have two sets of distance P(start1, end1) and Q(start2, end2). The distance will be like this.
>
> P
> Q
>
> I want to know whether P falls closely to the right end or left end of Q. P and Q are of different lengths for each data point. There are more than 1 pairs of P and Q. Is there any test or function in R to bring a statistically significant conclusion.
>
> Thanks for all,
> Suku

--
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com
Re: [R] left end or right end
Suku,

Just to clarify, in your table and each of your images, it appears that the start position of P (start1) is _after_ or at the start position of Q (start2), and the end position of P (end1) is _before_ or at the end position of Q (end2). If these positions represent increasing integers, then start1 >= start2 and end1 <= end2. I will assume this for the discussion below.

You mentioned wanting to know whether the midpoint of P tended to be greater or lesser than the midpoint of Q. That seems like a good idea, since the midpoints _must_ be similar when the lengths of P and Q are similar. Hence, if P and Q are samples from a population, then you may be interested in the population mean difference in midpoints. We can denote this mean M:

    M = E(mid(P) - mid(Q))

In order to do a classical statistical test, we _need_ a hypothesis about M, and a rule for rejecting the hypothesis. That's why we use the term 'hypothesis'. An appropriate hypothesis here might be:

    H0: M = 0

or, in words, the mean difference in the P and Q midpoints is zero. A simple rejection rule for this hypothesis is: reject H0 when the observed mean difference in P and Q midpoints is greater than some quantity C, or less than -C. The trick then is to find C that satisfies some type 1 error probability, usually 0.05. It's here that I might recommend a bootstrap procedure.

If, in the end, you reject the hypothesis H0, you can use the sign of the estimated mean difference in your biological inferences. ...And I'm still interested to hear what those are. :-)

Of course, these are just my ideas; you really ought to visit a biostatistician for professional advice.

-Matt

On Thu, 2010-07-01 at 10:24 -0400, ravikumar sukumar wrote:
> There are three possibilities:
>
> Case1: Left end
> P--
> Q--
>
> Case2: Right end
> P--
> Q--
>
> Case3: At mid position
> P-
> A--
>
> My question is how far my data falls on all the three cases. Is it biased towards case1 or case2 or case3? I have to consider the length of Q in the data.
> Example: start2-start1 = 2 and end2-end1 = 3 does not make much difference if length of Q is 15.
> I do not hypothesize, I want to know how my data goes on.
>
> Thanks and regards
>
> On Thu, Jul 1, 2010 at 4:05 PM, Jonathan Christensen dzhona...@gmail.com wrote:
>> Hi,
>>
>> You need to define what you want more exactly--what are the possible conclusions (hypotheses) you want to reach? Based on what you've said, I can think of several different approaches you might want, but I'm not sure which one of them you're actually after. For example:
>>
>> Hypothesis A: The distance between the left endpoints of P and Q is less than (or equal to) the distance between the right endpoints.
>> Hypothesis B: The distance between the right endpoints is smaller.
>>
>> This is a simple binomial test, as David Winsemius suggested. In your most recent email, though, it sounds like you want to take into account how much smaller one distance is than the other. This is more complicated.
>>
>> Another option occurred to me: maybe you don't care which end P is close to, you just want to know whether it's close to one of the ends, or somewhere in the middle.
>>
>> Without knowing what exactly you are trying to test, it's very hard for us to help you.
>>
>> Jonathan
>>
>> On Thu, Jul 1, 2010 at 7:45 AM, ravikumar sukumar ravikumarsuku...@gmail.com wrote:
>>> Sorry for posting to the R list.
>>>
>>> P         Q
>>> 12, 28    10, 42
>>> 2, 5      1, 55
>>> 32, 50    22, 63
>>> ...
>>>
>>> There are 1 points of P and Q. The number of points of P and Q are equal (i.e. 1). The interval P always overlaps with Q, i.e. start1 > start2 and end1 < end2.
>>> Merely calculating whether points have this condition (start1 > start2 and end1 < end2) will not be significant, and the length of P, that is length(end1-start1), and of Q, i.e. length(end2-start2), differs.
>>>
>>> Example
>>> Case A:
>>> Case B: start2 - start1 = 100, end2 - end1 = 2
>>>
>>> In the above two cases, P is falling on the right end of Q in case B. But it depends on the length(end2-start2). If the length(end2-start2) = 15000 in case of B, then it is almost on the middle point.
>>>
>>> Is there any test or function in R to bring a statistically significant conclusion that the midpoint of P, or P itself, is falling on the left end or right end of Q?
>>>
>>> Sorry once again for posting in this list.
>>>
>>> Regards
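The bootstrap procedure mentioned in the reply is not spelled out in the thread. One common way to obtain the cutoff C is sketched below, under stated assumptions: the interval data are simulated here purely for illustration, the null distribution is approximated by resampling the centered midpoint differences, and all variable names are hypothetical.

```r
set.seed(1)  # for reproducibility of this illustration only

# Hypothetical data: n intervals Q, each containing an interval P
n      <- 200
start2 <- runif(n, 0, 100)
end2   <- start2 + runif(n, 20, 60)
len2   <- end2 - start2
start1 <- start2 + runif(n, 0, 0.3) * len2
end1   <- end2   - runif(n, 0, 0.3) * len2

# Per-pair difference in midpoints; its mean estimates M = E(mid(P) - mid(Q))
d   <- (start1 + end1) / 2 - (start2 + end2) / 2
obs <- mean(d)

# Approximate the distribution of the mean under H0: M = 0 by resampling
# the differences after centering them at zero
d0   <- d - obs
B    <- 2000
boot <- replicate(B, mean(sample(d0, replace = TRUE)))

# C is the 95th percentile of |bootstrap mean|, giving a two-sided 0.05 test
C      <- unname(quantile(abs(boot), 0.95))
reject <- abs(obs) > C
```

If the length of Q should be taken into account, as the original poster asks, one option would be to replace d by d / len2, so that each offset is expressed as a fraction of the length of Q.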
Re: [R] how to display the clock time in the loop
Try to flush output after printing:

    cat(paste(Sys.time()), "\n"); flush(stdout())

On Thu, 2010-07-01 at 16:17 -0400, Jack Luo wrote:
> Hi,
>
> I am doing some computation which is pretty time consuming, and I want R to display CPU time after each iteration using the command Sys.time(). However, I found that the code only began to display the CPU time after quite a while, once several iterations had finished. Is there a way to ask R to display the time right after each iteration is finished?
>
> Thanks,
>
> -Jun

--
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com
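Placed inside a loop, the suggestion might look like the following sketch. The Sys.sleep() call is only a stand-in for the real computation and is an assumption made here for illustration.

```r
# Print a timestamp after each iteration and flush so it appears immediately
for (i in 1:3) {
  Sys.sleep(0.2)                 # stand-in for the time-consuming computation
  cat("iteration", i, "finished at", format(Sys.time()), "\n")
  flush(stdout())                # push any buffered output to the console now
}
```

In the Windows and macOS GUIs, flush.console() serves the same purpose. Note also that Sys.time() reports wall-clock time; proc.time() is the function that reports CPU time.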
Re: [R] is there a way to do dense rank in R
    x <- c(5,7,7,9)
    rank(unique(x))[match(x, unique(x))]
    # [1] 1 2 2 3

On Thu, 2010-07-01 at 21:30 -0400, Suresh Singh wrote:
> I have not been able to find a way to do dense rank in R.
> Here is an example of what I need.
>
> rank() gives the following:
> 5 rank 1
> 7 rank 2
> 7 rank 2
> 9 *rank 4*
>
> but I want:
> 5 rank 1
> 7 rank 2
> 7 rank 2
> 9 *rank 3*
>
> thanks
> SS

--
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com
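For comparison, two other base R idioms give the same dense ranks. These are alternatives offered here as a sketch, not part of the original reply; the second input vector is an extra illustration with unsorted values.

```r
x <- c(5, 7, 7, 9)

# Position of each value among the sorted distinct values
dense1 <- match(x, sort(unique(x)))

# Integer codes of a factor, whose levels are the sorted distinct values
dense2 <- as.integer(factor(x))

# The same idea applied to unsorted input
x2 <- c(9, 5, 7, 7)
dense3 <- match(x2, sort(unique(x2)))
```

All three approaches rely on ranking the distinct values only, so ties never leave gaps in the ranks.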
Re: [R] integration of two normal density
Isn't it equally trivial to demonstrate that the product of two pdfs _may_ be a normalized pdf? For example, take the uniform (0,1) pdf:

    f(x) = 1 for x in (0, 1), and 0 otherwise

Hence,

    g(x) = f(x)*f(x) = 1 for x in (0, 1), and 0 otherwise

_is_ a normalized pdf. But this is a little silly. Rather than memorize answers to questions like "is the product of pdfs also a pdf?", we ought to be confident in the properties of pdfs (i.e. not the answers, but the means to arrive at answers).

On Mon, 2010-06-28 at 11:42 -0400, Bert Gunter wrote:
> Inline below.
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of bill.venab...@csiro.au
>> Sent: Friday, June 25, 2010 10:53 PM
>> To: carrieands...@gmail.com; R-help@r-project.org
>> Subject: Re: [R] integration of two normal density
>>
>> Your intuition is wrong and R is right.
>>
>> Why should the product of two probability density functions be a normalized pdf also?
>
> -- as is trivially seen with two uniforms on [0,2], with pdf = 1/2, product = 1/4 on [0,2].
>
> -- Bert
>
>>> -----Original Message-----
>>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Carrie Li
>>> Sent: Saturday, 26 June 2010 1:28 PM
>>> To: r-help
>>> Subject: [R] integration of two normal density
>>>
>>> Hello everyone,
>>>
>>> I have a question about integration of two density functions.
>>> Intuitively, I think the value after integration should be 1, but it is not. Am I missing something here?
>>>
>>> t <- function(y){dnorm(y, mean=3)*dnorm(y/2, mean=1.5)}
>>> integrate(t, -Inf, Inf)
>>> 0.3568248 with absolute error < 4.9e-06
>>>
>>> Also, is there any R function or package that could do multivariate integration?
>>>
>>> Thanks for any suggestions!
>>>
>>> Carrie

--
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com
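The numbers in this thread can be checked directly. The sketch below reproduces the value from the original question, compares it with a closed form, and integrates the two uniform examples. The closed form relies on a standard identity (the integral of a product of two normal densities is a normal density evaluated at the difference of the means, with the variances added), which is supplied here and was not stated in the thread.

```r
# The integrand from the question: a product of two normal densities in y
f <- function(y) dnorm(y, mean = 3) * dnorm(y / 2, mean = 1.5)
numeric.value <- integrate(f, -Inf, Inf)$value

# dnorm(y/2, mean = 1.5) equals 2 * dnorm(y, mean = 3, sd = 2), so the integral
# is 2 times a normal density with variance 1^2 + 2^2, evaluated at 3 - 3 = 0
closed.form <- 2 * dnorm(3 - 3, mean = 0, sd = sqrt(1^2 + 2^2))

# The two uniform examples: squared density on (0,1) and on [0,2]
u01 <- integrate(function(x) dunif(x, 0, 1)^2, 0, 1)$value  # integrates to 1
u02 <- integrate(function(x) dunif(x, 0, 2)^2, 0, 2)$value  # integrates to 1/2
```

Both uniform cases follow from the definition of a pdf: the product integrates to 1 only when it happens to, as on (0,1), and not in general.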
Re: [R] advice on package devel with external libs
Some ideas:

1. Wrap the library as an R package, as you said, and check for the library at configure time (i.e. with autoconf or a custom script). But if you do, it would be great to provide an R-level API so that we can all use it. This is the strategy of the 'cairo', 'RGtk', 'rgl', and 'gsl' packages. Also, maybe try to collaborate with the developers of the 'rjson' package to improve it.

2. If the library is appropriately licensed, and truly 'lightweight', simply add its sources to your package. R core does this with zlib and a few other libraries. However, this puts the burden on you to maintain code written by others. There are several JSON parsers with very liberal licenses (www.json.org), and some are tiny.

-Matt

On Mon, 2010-06-28 at 16:10 -0400, Murat Tasan wrote:
> hi all - i'm working on an R package that makes use of my own shared library written in C. but i also am making use of another C-written library. (my package is for facilitating biological namespace translations via online (i.e. up-to-date) biological databases.)
>
> problem is, the library i'm using is not a standard library (i.e. i doubt it will be installed on most users' machines). i also don't think too many users will be particularly adept in installing a shared library. for users with a sysadmin, it can be done easily enough, but on local installations i fear most will be incapable of properly installing/locating the library so my code can link to it during compile time.
>
> (in case anyone was wondering, the library in question is a lightweight JSON parser... yes i know there are existing R packages for this, but they are *very* slow for large JSON object coding/encoding.)
>
> how have folks dealt with this in the past with R packages? i've thought about wrapping the other library itself as a separate R package which basically does nothing on installation other than compile and put the libraries in a predictable location... but this seems rather silly (and may violate the JSON parser package's license).
>
> thanks for any input on this,
>
> -murat

--
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com