Re: [R] Opening SAS file using read.sas7bdat() function in sas7bdat library.

2012-10-30 Thread Matt Shotwell
Thanks for the helpful comments from others.

The KNOWNHOST variable lists the types of file that are known to work
with the read.sas7bdat function. It's likely that most files written on
Windows platforms will work, even if not listed in KNOWNHOST. If you're
feeling experimental, you might just comment out the lines that test
against the KNOWNHOST list.

Unfortunately, it appears that the file format depends on the system
where it was originally written. The hypothesis is that sas7bdat files
were originally no more than a memory dump of a C structure, or similar.
Because C structures may be laid out differently by different compilers
(i.e., on different platforms), this may have led to the difficulty
apparent here.
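
To make the platform dependence concrete, here is a minimal sketch in base R. It is not taken from the sas7bdat package, and the byte values are made up for illustration. It only shows that the same four bytes decode to different integers depending on the byte order assumed, which is one way a raw memory dump can differ between systems.

```r
# Four bytes as they might sit in a file header (hypothetical values).
bytes <- as.raw(c(0x01, 0x00, 0x00, 0x00))

# Interpreted as a 4-byte integer written on a little-endian machine: 1
as_little <- readBin(bytes, what = integer(), size = 4, endian = "little")

# The same bytes interpreted as if written on a big-endian machine: 16777216
as_big <- readBin(bytes, what = integer(), size = 4, endian = "big")

c(little = as_little, big = as_big)
```

Structure padding and field widths can differ between compilers as well, so byte order is probably only part of the story.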

Regards,
Matt

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Any good R server-with connection examples

2012-10-23 Thread Matt Shotwell
 I want to connect R with HTML/PHP pages to take input from the user, do
 some statistical processing on it, and show the results on the HTML page
 again.
 I searched the net and found the Rserve package, but the examples are
 mainly for Java, not for PHP.
 I am wondering how to connect it to PHP-Apache-MySQL.
 Is there a good tutorial or video that explains how to do that?
 At least tell me the logical way to use it.

Check out http://rapache.net/

rApache connects R and the Apache 2 web server, such that R can act as a
server-side scripting language, like PHP. This may be the easiest way,
using R, to take user input from the web browser.

The site has some decent documentation and links to examples.

--Matt



Re: [R] Nested brew call yields Error in .brew.cat(26, 28) : unused argument(s) (26, 28)

2012-03-29 Thread Matt Shotwell
On Wed, 2012-03-28 at 11:40 +0100, Chris Beeley wrote:
 I am writing several webpages using the brew package and R2HTML. I would 
 like to work off one script so I am using nested brew calls. The 
 documentation for brew states that:
 
 NOTE: brew calls can be nested and rely on placing a function named 
 ’.brew.cat’ in the environment in which it is passed. Each time brew is 
 called, a check for the existence of this function is made. If it 
 exists, then it is replaced with a new copy that is lexically scoped to 
 the current brew frame. Once the brew call is done, the function is 
 replaced with the previous function. The function is finally removed from 
 the environment once all brew calls return.
 
 I'm afraid I can't quite figure out what it is I'm supposed to do here. 
 I've tried loading the brew library within the script which I pass to 
 brew, and I've tried defining brew cat like this:

The paragraph above describes what brew is doing behind the scenes. It's
not necessary to modify or set the .brew.cat function.

A nested (or recursive) brew call occurs when brew() is called from a
document currently being processed by brew().

To illustrate further, suppose there are two brew documents,
example-1.brew and example-2.brew, where example-1.brew contains the
following text (delimited by '''):

'''
This text is in example-1.brew.
<%= brew::brew("example-2.brew") %>
'''

and example-2.brew contains

'''
This text is in example-2.brew.
<%= date() -%>
'''

Then from the R prompt we have:

R> brew::brew("example-1.brew")
This text is in example-1.brew.
This text is in example-2.brew.
Thu Mar 29 20:24:52 2012

 .brew.cat=function(){}
 
 This generates the following error message:
 
 Error in .brew.cat(26, 28) : unused argument(s) (26, 28)
 
 I think perhaps it is more likely that I need to insert into the script 
 the actual content of .brew.cat, but I can't seem to get R to tell me 
 what it is and Googling throws up a lot of stuff about beer and not much 
 else (drew a blank also from RSiteSearch("Nested brew"))
 
 Any help gratefully received.
 
 Chris Beeley
 Institute of Mental Health, UK
 

-- 
Matthew S. Shotwell
Assistant Professor, Department of Biostatistics
School of Medicine, Vanderbilt University
1161 21st Ave. S2323 MCN Office CC2102L
Nashville, TN 37232-2158



[R] Help with Matrix code optimization

2012-02-23 Thread Matt Shotwell
The chol and solve methods for dpoMatrix (Matrix package) are much
faster than the default methods. But, the time required to coerce a
regular matrix to dpoMatrix swamps the advantage.

Hence, I have the following problem, where use of dpoMatrix is worse
than a regular matrix.

library(Matrix)

x <- diag(10)

system.time(
  for(r in seq(0.1, 0.9, length.out=1000)) {
    m <- r^abs(row(x)-col(x));
    chol(m); solve(m);
  })

system.time(
  for(r in seq(0.1, 0.9, length.out=1000)) {
    M <- as(r^abs(row(x)-col(x)), 'dpoMatrix')
    chol(M); solve(M);
  })

Any ideas?



Re: [R] function restrictedparts

2012-01-25 Thread Matt Shotwell
That's because the number of partitions of 281 items of order 10 is
quite large:

R> library('partitions')
R> R(10,281)
[1] 1218681472

Without thinking about this too hard, the result of
restrictedparts(281,10) should require around

R> 1218681472 * 10 * 4 / 10^9
[1] 48.74726

gigabytes of storage space (because the result is a 1218681472 x 10
array of 4 byte integers).

Because the number of partitions grows 'explosively' with the number of
items, this is a serious obstacle for statistical partitioning and
clustering methods. For more discouragement, see the 'Bell number'.

You can enumerate these restricted partitions one by one; see

R> ?partitions::nextpart
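
To put a number on the 'explosive' growth, here is a small base R sketch that computes Bell numbers (the number of ways to partition a set of n items into any number of groups) with the Bell triangle. The function name bell_numbers is my own and not part of the partitions package.

```r
# Bell numbers B_0, ..., B_n_max via the Bell triangle.
# Stored as doubles, so values are exact only up to 2^53.
bell_numbers <- function(n_max) {
  bell <- numeric(n_max + 1)
  row <- 1          # first row of the triangle
  bell[1] <- 1      # B_0
  for (n in seq_len(n_max)) {
    new_row <- numeric(length(row) + 1)
    new_row[1] <- row[length(row)]        # start with last entry of previous row
    for (k in seq_along(row)) {
      new_row[k + 1] <- new_row[k] + row[k]
    }
    row <- new_row
    bell[n + 1] <- row[1]                 # B_n is the first entry of this row
  }
  bell
}

b <- bell_numbers(20)
format(b, big.mark = ",", scientific = FALSE)
```

B_10 is already 115975, and the sequence keeps growing faster than exponentially, so storing every partition quickly becomes hopeless.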

Matt

On Wed, 2012-01-25 at 15:11 +, yan jiao wrote:
 I am using function restrictedparts, but got error:
 
 
 restrictedparts(281,10)
 Error in integer(len) : vector size specified is too large
 Calls: restrictedparts -> integer
 In addition: Warning message:
 In restrictedparts(281, 10) : NAs introduced by coercion
 Error in integer(len) : vector size specified is too large
 Calls: restrictedparts -> integer
 
 
 is there a similar function can deal with long vector?
 
 I'm using R version 2.14.1 (2011-12-22),x86_64, linux-gnu
 
 many thanks
 
 yan
 


Re: [R] Bayesian data analysis recommendations

2012-01-20 Thread Matt Shotwell
On Thu, 2012-01-19 at 19:23 -0500, C W wrote:
 Thanks, Rich, I will look at the book.
 
 I agree, there are many nice packages, but what if the package changes in a
 few years?  I would have no idea what is going on!  I've heard
 from predecessor in the industry who emphasize the learning, not just plug
 and chug.
 
 I really want to learn the material and understand it, above all, it is
 interesting.
 
 I am looking more towards Bayesian statistics or Bayesian inference.  I am
 in statistics graduate school, though not my field, the biology application
 could help in the understand I suppose?

This list (r-help) may not be the best place to look for advice on this.
But here is some anyway :)

For a well-rounded introduction, I recommend Robert's 'The Bayesian
Choice'. This is a great foundation for Bayesians who intend to defend
their positions on statistical inference. For a more practical approach,
Gelman, Carlin, Stern, and Rubin's book 'Bayesian Data Analysis' has
been very popular (THE most popular, according to some). Regarding the
software tools for Bayesian data analysis, the most mature _and_ active
_and_ best integrated with the R project is Martyn Plummer's JAGS (See
also the R package rjags, by the same author). Another tool that I'm
planning to check out is PyMC: http://code.google.com/p/pymc/

Best,
Matt

 On Thu, Jan 19, 2012 at 7:07 PM, Rich Shepard rshep...@appl-ecosys.com
 wrote:
  On Thu, 19 Jan 2012, C W wrote:
 
  I am trying to learn Bayesian inference and Bayesian data analysis, I am
  new in the field.  Would any experts on the list recommend any good sites
  or materials for beginners?
 
  My approach is to learn and understand the theory first, then program
  on my own using R, though I see there are already packages.
 
 
   I'm far from an expert, but why not avoid re-inventing the wheel while
 you
  learn? Buy and read Jim Albert's Bayesian Computation with R.
 
   If you're a population ecologist (or willing to extend presented examples
  and ideas to communities and ecosystems), Ben Bolker's Ecological Models
  and Data in R explains when Bayesian and frequentist approaches each have
  advantages over the other.
 
  Rich
 


Re: [R] R logo in eps formt

2011-12-01 Thread Matt Shotwell
See this earlier post for SVG logos:

http://tolstoy.newcastle.edu.au/R/e12/devel/10/10/0112.html

Using ImageMagick, do something like

convert logo.svg logo.eps


On Thu, 2011-12-01 at 10:56 +0700, Ben Madin wrote:
 G'day all,
 
 Sorry if this message has been posted before, but searching for R is always 
 difficult...
 
 I was hoping for a copy of the logo in eps format? Can I do this from R, or 
 is one available for download?
 
 cheers
 
 Ben


Re: [R] contact person for UseR 2012, please?

2011-10-18 Thread Matt Shotwell
The contact person is:

Stephania McNeal-Goddard
email: stephania.mcneal-godd...@vanderbilt.edu
phone: (615)322-2768

Vanderbilt University School of Medicine
Department of Biostatistics
S-2323 Medical Center North
Nashville, TN 37232-2158

On Tue, 2011-10-18 at 12:41 -0400, David Winsemius wrote:
 On Oct 18, 2011, at 12:25 PM, Erin Hodgess wrote:
 
  Dear R People:
 
  Do you know who the contact person is for UseR 2012, please?
 
  I'm trying to get together some numbers for funding (sorry for the
 
 Funny, it was the first hit on a Google search with term useR2012
 
 http://biostat.mc.vanderbilt.edu/wiki/Main/UseR-2012
 
 
 
 David Winsemius, MD
 West Hartford, CT
 


Re: [R] [Related Topic] need help on read.spss

2011-10-13 Thread Matt Shotwell
Would it be worthwhile to update the read.spss implementation using the
more recent discoveries from the PSPP group? I don't mean to copy their
code; but to use the ideas in their code. Is anyone working on this? I
wouldn't want the effort to be duplicated.

On Thu, 2011-10-13 at 16:22 +0200, Uwe Ligges wrote:
 
 On 11.10.2011 12:07, Smart Guy wrote:
  Hi,
 I have one doubt about one of the parameter of 'read.spss()' from
  'foreign' package.
  Here is the syntax :-
 
  read.spss ( file,
   use.value.labels = TRUE,
   to.data.frame = FALSE,
   max.value.labels = Inf,
   trim.factor.names = FALSE,
   trim_values = TRUE,
   reencode = NA,
   use.missings = to.data.frame )
 
 
  In above syntax when I pass *'to.data.frame= FALSE*' it gives me missing
  values from SPSS file (that I try to read using read.spss() ). But when I
  pass '*to.data.frame = TRUE*' then its not giving me missing values. And
  need to get missing values.
 
  According to read.spss() documentation
 
  *to.data.frame :  return a data frame?*
 
  I am curious to know, if we pass *'to.data.frame = TRUE*' , is it going to
  cause some issue or effect something? I didn't understand the read.spss()
  documentation correctly.
  Please explain.
 
  Thanks in Advance
 
 
 An R data.frame cannot represent different kinds of missing values, 
 since R just has NA. Therefore, there are two ways to import data:
 
 to.data.frame=FALSE  will read all the information, but into a format 
 you will likely have to postprocess to make it conveniently usable.
 
 to.data.frame=TRUE   will import into a data.frame, but that cannot 
 represent all the nuances known from the SPSS representation.
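
 The loss of information described above can be sketched in a few lines of base R. This is not what read.spss() does internally. The codes 98 and 99 are hypothetical user-missing values of the kind an SPSS file might define, and the variable names are made up.

```r
# Hypothetical survey item: 98 = "don't know", 99 = "refused" (user-missing codes).
responses     <- c(3, 98, 5, 99, 1)
missing_codes <- c(98, 99)

# Data-frame style result: every user-missing code collapses to a single NA.
collapsed <- ifelse(responses %in% missing_codes, NA, responses)

# Keeping the reason for missingness requires a separate column.
reason <- ifelse(responses %in% missing_codes, responses, NA)

data.frame(value = collapsed, missing_reason = reason)
```

 Once the two codes have become NA in the value column, the distinction between them can only be recovered from the extra column.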
 
 Uwe Ligges
 


Re: [R] Rweb and setting up R on a server

2011-09-08 Thread Matt Shotwell
Erin, 

I haven't used Rweb recently. The URL is
http://www.math.montana.edu/Rweb/ . If you have a server, you could set
up the server version of RStudio: http://rstudio.org/download/server .
It worked well when I tried it. 

Best,
Matt

On Tue, 2011-09-06 at 17:07 -0500, Erin Hodgess wrote: 
 Dear R People:
 
 At one time, Rweb existed, which had R on a server.
 
 I looked for it, but can't find it.
 
 Has anyone used that recently, or is there a new equivalent, please?
 
 Thanks,
 Erin
 




Re: [R] readBin fails to read large files

2011-09-01 Thread Matt Shotwell
On Thu, 2011-09-01 at 17:36 +0100, Prof Brian Ripley wrote:
 readBin is intended to read a few items at a time, not 10^9.  You are 
 probably getting 32-bit integer overflow inside your OS, since the 
 number of bytes you are trying to read in one go exceeds 2GB.
 
 Don't do that: read say a million at time.
 
 And BTW, if these really are unsigned ints you will get wraparound.

To elaborate, ?readBin states that the 'signed' argument is only used for
integers of size 1 and 2 bytes. These are ultimately converted to signed
4 byte integers, because that's how R stores integers. To be exact, if
your file contains integers larger than 2^31-1 = 2147483647, signed
integer overflow would occur. In actuality, R returns NA for those values.

I'm bringing this up because R normally issues a warning:

R> 2147483647L + 1L
[1] NA
Warning message:
In 2147483647L + 1L : NAs produced by integer overflow

But, a similar warning isn't issued by readBin when NA results from
signed integer overflow:

#The raw vector below represents 2147483647L and 2147483647L + 1L
#in little endian, unsigned, 4 byte integers 
R> dat <- as.raw(c(0xff,0xff,0xff,0x7f,0x00,0x00,0x00,0x80))
R> writeBin(dat, 'test.bin')
R> readBin('test.bin', n=2, integer(), signed=FALSE)
[1] 2147483647 NA
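
One possible workaround, sketched below under my own assumptions, is to read each 4 byte value as two unsigned 2 byte halves (where 'signed=FALSE' is honoured) and recombine them as doubles. The helper name read_uint32 is hypothetical and not part of base R.

```r
# Read n unsigned 4-byte integers as doubles.
# 'con' may be a connection or a raw vector, as accepted by readBin().
read_uint32 <- function(con, n, endian = "little") {
  # Each 4-byte value is read as two unsigned 2-byte halves (0..65535).
  halves <- readBin(con, what = integer(), n = 2 * n, size = 2,
                    signed = FALSE, endian = endian)
  # Drop a trailing unpaired half, if the input was truncated.
  halves <- matrix(halves[seq_len(2 * (length(halves) %/% 2))], nrow = 2)
  if (endian == "little") {
    low <- halves[1, ]; high <- halves[2, ]
  } else {
    low <- halves[2, ]; high <- halves[1, ]
  }
  # 65536 is a double, so the sum is a double and does not overflow.
  low + 65536 * high
}

dat <- as.raw(c(0xff, 0xff, 0xff, 0x7f, 0x00, 0x00, 0x00, 0x80))
vals <- read_uint32(dat, n = 2)
print(vals, digits = 10)
```

For a large file, the same function could be called repeatedly on an open binary connection, say a million values at a time, in line with the advice quoted above.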

 On Thu, 1 Sep 2011, Benton, Paul wrote:
 
  Posting for a friend
 
  Begin forwarded message:
 
  From: Geier, Florian 
  florian.geie...@imperial.ac.uk
  Subject: Fwd: readBin fails to read large files
  Date: September 1, 2011 4:10:53 PM GMT+01:00
  To:
 
 
 
  Begin forwarded message:
 
  Date: 1 September 2011 16:01:45 GMT+01:00
  Subject: readBin fails to read large files
 
  Dear all,
 
  I am trying to read a large file (~2GB) of unsigned ints into R. Using the 
  command:
 
  raw <- readBin(file, n=10^8, integer(), endian="little", signed=FALSE)
 
  It works fine for n=10^8, but fails for n=10^9 (or even at n=6*10^8). My 
  machine$sizeof.long is 8 bit.
  I am running R 2.13.1 on a x86_64-apple-darwin9.8.0/x86_64 (64-bit) 
  architecture.
 
  Thanks for your help
 
  Florian
 
  --
  AXA doctoral fellow
  Bundy lab - Biomolecular Medicine
  Imperial College London
 
 
 
 
 


Re: [R] Bootstrap

2011-07-21 Thread Matt Shotwell
In order to apply the bootstrap, you must resample, uniformly at random
from the independent units of measurement in your data. Assuming that
these represent the rows of 'data', consider the following:

est <- function(y, x, obeta = c(1,1), verbose=FALSE) {
    n <- length(x)
    X <- cbind(rep(1, n), x)
    nbeta <- c(0,0)
    iter <- 0
    while(crossprod(obeta-nbeta) > 10^(-12)) {
        nbeta <- obeta
        eta   <- X%*%nbeta
        mu    <- eta
        mu1   <- 1/eta
        W     <- diag(as.vector(mu1))
        Z     <- X%*%nbeta+(y-mu)
        XWX   <- t(X)%*%W%*%X
        XWZ   <- t(X)%*%W%*%Z
        Cov   <- solve(XWX)
        obeta <- Cov%*%XWZ
        iter  <- iter+1
        if(verbose)
            cat("Iteration # and beta1= ", iter, nbeta, "\n")
    }
    return(nbeta[1,1])
}

boot <- function(data, reps) {
    n <- nrow(data)
    Nt <- vector('numeric', length=reps)
    for(Ncount in 1:reps) {
        #resample the rows of data
        bdata <- data[sample(1:n,n,replace=TRUE),]
        #recompute and store estimate
        Nt[Ncount] <- est(bdata[,1], bdata[,2])
    }
    return(Nt) 
}

stem(boot(data,1000),width=60)

  The decimal point is at the |

  -3 | 4
  -2 | 
  -1 | 2
  -0 | 88866555444333222111
   0 | 0022+400
   1 | 0001+203
   2 | 2224+23
   3 | 112223344455
   4 | 113344555789
   5 | 02334446677899
   6 | 1112334455778
   7 | 11235568
   8 | 001799
   9 | 0259
  10 | 1446
  11 | 19
  12 | 48
  13 | 8
  14 | 024
  15 | 
  16 | 
  17 | 0788
  18 | 
  19 | 1
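
The same resampling idea can be written generically, so that any statistic can be plugged in. The sketch below uses the data from the original post, with an ordinary least-squares intercept standing in for est(). The names boot_rows and intercept are my own, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

dat <- data.frame(
  y = c(11, 11, 14, 19, 4, 23, 15, 17, 7, 17, 14, 11, 17),
  x = c(5.16, 4.04, 3.85, 5.68, 1.26, 7.89, 4.25, 3.94, 2.35, 4.74,
        5.49, 4.12, 5.92)
)

# Statistic of interest: the least-squares intercept (a stand-in for est()).
intercept <- function(d) unname(coef(lm(y ~ x, data = d))[1])

# Resample rows with replacement and recompute the statistic each time.
boot_rows <- function(d, statistic, reps) {
  n <- nrow(d)
  vapply(seq_len(reps), function(i) {
    idx <- sample.int(n, n, replace = TRUE)
    statistic(d[idx, , drop = FALSE])
  }, numeric(1))
}

boot_est <- boot_rows(dat, intercept, reps = 500)

# A simple percentile interval from the bootstrap distribution
quantile(boot_est, c(0.025, 0.975))
```

The key point is the same as above: the resampling has to happen inside the loop, otherwise every replicate sees identical data and returns an identical estimate.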

On Wed, 2011-07-20 at 18:09 -0400, Val wrote:
 Hi all,
 
 I am facing difficulty on  how to use bootstrap sampling and
 below is my example of function.
 
 Read a data , use some functions and  use iteration to find the solution(
 ie, convergence is reached).  I want to use bootstrap approach to do it
 several times (200 or 300 times) this whole process  and see the
 distribution of parameter of interest.
 
 Below is a small example that resembles my problem. However,  I  found out
 all samples are the same. So I would appreciate your help on this case.
 
 #**
 rm(list=ls())
 xx <- read.table(textConnection("y x
 11 5.16
 11 4.04
 14 3.85
 19 5.68
 4 1.26
 23  7.89
 15 4.25
 17 3.94
 7 2.35
 17 4.74
 14 5.49
 11 4.12
 17 5.92"), header=TRUE)
 data <- as.matrix(xx)
 closeAllConnections()
 
 Nt <- NULL
 for (Ncount in 1:100)
  {
 y <- data[,1]
 x <- data[,2]
 n <- length(x)
 
 X <- cbind(rep(1,n),x) #covariate/design matrix
 obeta <- c(1,1) #previous/starting values of beta
 
 nbeta <- c(0,0) #new beta
 iter=0
 
   while(crossprod(obeta-nbeta) > 10^(-12))
 {
 nbeta <- obeta
 eta   <- X%*%nbeta
 mu    <- eta
 mu1   <- 1/eta
 W     <- diag(as.vector(mu1))
 Z     <- X%*%nbeta+(y-mu)
 XWX   <- t(X)%*%W%*%X
 XWZ   <- t(X)%*%W%*%Z
 Cov   <- solve(XWX)
 obeta <- Cov%*%XWZ
 iter  <- iter+1
 
 cat("Iteration # and beta1= ", iter, nbeta, "\n")
 }
 
   Nt[Ncount] <- nbeta[1,1]
 }
 Nt
 summary(Nt)
 #**e*
 


Re: [R] How to capture console output in a numeric format

2011-06-24 Thread Matt Shotwell
Ravi, 

Consider using an environment (i.e. a 'reference' object) to store the
results, avoiding string manipulation, and the potential for loss of
precision:

fr <- function(x, env) {   ## Rosenbrock Banana function
    x1 <- x[1]
    x2 <- x[2]
    f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
    if(exists('fout', env))
        fout <- rbind(get('fout', env), c(x1, x2, f))
    else
        fout <- c(x1=x1, x2=x2, f=f)
    assign('fout', fout, env)
    f   
}

out <- new.env()
ans <- optim(c(-1.2, 1), fr, env=out)
out$fout

Best,
Matt

 
On Fri, 2011-06-24 at 15:10 +, Ravi Varadhan wrote:
 Thank you very much, Jim.  That works!  
 
 I did know that I could process the character strings using regex, but was 
 also wondering if there was a direct way to get this.  
 
 Suppose, in the current example I would like to obtain a 3-column matrix that 
 contains the parameters and the function value:
 
 fr <- function(x) {   ## Rosenbrock Banana function
     on.exit(print(cbind(x1, x2, f)))
     x1 <- x[1]
     x2 <- x[2]
     f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
     f 
 }
 
 fvals <- capture.output(ans <- optim(c(-1.2,1), fr))
 
 Now, I need to tweak your solution to get the 3-column matrix.  It would be 
 nice, if there was a more direct way to get the numerical output, perhaps a 
 numeric option in capture.output().
 
 Best,
 Ravi.
 
 ---
 Ravi Varadhan, Ph.D.
 Assistant Professor,
 Division of Geriatric Medicine and Gerontology School of Medicine Johns 
 Hopkins University
 
 Ph. (410) 502-2619
 email: rvarad...@jhmi.edu
 
 -Original Message-
 From: jim holtman [mailto:jholt...@gmail.com] 
 Sent: Friday, June 24, 2011 10:48 AM
 To: Ravi Varadhan
 Cc: r-help@r-project.org
 Subject: Re: [R] How to capture console output in a numeric format
 
 try this:
 
 > fr <- function(x) {   ## Rosenbrock Banana function
 +    on.exit(print(f))
 +    x1 <- x[1]
 +    x2 <- x[2]
 +    f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
 +    f
 + }
 
 > fvals <- capture.output(ans <- optim(c(-1.2,1), fr))
 > # convert to numeric
 > fvals <- as.numeric(sub("^.* ", "", fvals))
 
  fvals
   [1] 24.20  7.095296 15.08  4.541696
   [5]  6.029216  4.456256  8.879936  7.777856
   [9]  4.728125  5.167901  4.21  4.437670
  [13]  4.178989  4.326023  4.070813  4.221489
  [17]  4.039810  4.896359  4.009379  4.077130
  [21]  4.020798  3.993600  4.024586  4.117625
  [25]  3.993115  3.976081  3.971089  4.023905
  [29]  3.980807  3.952577  3.932179  3.935345
 
 
 On Fri, Jun 24, 2011 at 10:39 AM, Ravi Varadhan rvarad...@jhmi.edu wrote:
  Hi,
 
  I would like to know how to capture the console output from running an 
  algorithm for further analysis.  I can capture this using capture.output() 
  but that yields a character vector.  I would like to extract the actual 
  numeric values.  Here is an example of what I am trying to do.
 
  fr <- function(x) {   ## Rosenbrock Banana function
     on.exit(print(f))
     x1 <- x[1]
     x2 <- x[2]
     f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
     f
  }
 
  fvals <- capture.output(ans <- optim(c(-1.2,1), fr))
 
  Now, `fvals' contains character elements, but I would like to obtain the 
  actual numerical values.  How can I do this?
 
  Thanks very much for any suggestions.
 
  Best,
  Ravi.
 
  ---
  Ravi Varadhan, Ph.D.
  Assistant Professor,
  Division of Geriatric Medicine and Gerontology School of Medicine Johns 
  Hopkins University
 
  Ph. (410) 502-2619
  email: rvarad...@jhmi.edu
 
 


Re: [R] How to capture console output in a numeric format

2011-06-24 Thread Matt Shotwell
On Fri, 2011-06-24 at 12:09 -0400, David Winsemius wrote:
 On Jun 24, 2011, at 11:27 AM, Matt Shotwell wrote:
 
  Ravi,
 
  Consider using an environment (i.e. a 'reference' object) to store the
  results, avoiding string manipulation, and the potential for loss of
  precision:
 
  fr <- function(x, env) {   ## Rosenbrock Banana function
     x1 <- x[1]
     x2 <- x[2]
     f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
     if(exists('fout', env))
         fout <- rbind(get('fout', env), c(x1, x2, f))
 
 So _that's_ what a reference object is?

Well, environments have 'pass-by-reference' behavior. That is, when they
are passed to a function, modifications to the environment persist
outside the function call.

This is distinct from the Reference class (?methods::ReferenceClass).
But there are similar concepts. The methods of a reference class can
modify the class fields in a 'by-reference' fashion. However, the fields
need not be passed to a method.
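
A small sketch of the difference, using names of my own choosing (add_one_env, add_one_list, Counter): the environment is modified in place, the list is copied, and a reference class method updates its field without the field being passed in.

```r
library(methods)  # for setRefClass(); needed under older Rscript versions

add_one_env <- function(env) {
  env$count <- env$count + 1   # modifies the caller's environment
  invisible(NULL)
}

add_one_list <- function(lst) {
  lst$count <- lst$count + 1   # modifies a local copy only
  invisible(NULL)
}

e <- new.env()
e$count <- 0
l <- list(count = 0)

add_one_env(e)
add_one_list(l)

e$count   # 1: the change persisted
l$count   # 0: the original list is untouched

# Reference class: the method modifies the field 'by reference' via <<-
Counter <- setRefClass("Counter",
  fields  = list(count = "numeric"),
  methods = list(
    add_one = function() {
      count <<- count + 1
      invisible(.self)
    }
  ))

cnt <- Counter$new(count = 0)
cnt$add_one()
cnt$count  # 1
```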

 This seems to give the same results in this example. Am I committing  
 any sins by sneaking around the get()?
 
  if(exists('fout', env))
     fout <- rbind(env[['fout']], c(x1, x2, f))  # seems more direct
 

'env$fout' works here too.

 Thinking I also might be able to avoid the later assign(), I tried  
 these without success.
 
 fr <- function(x, env) {   ## Rosenbrock Banana function
     x1 <- x[1]
     x2 <- x[2]
     f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
     if(exists('fout', env))
         env[['fout']] <- rbind(env[['fout']], c(x1, x2, f))
     else
         fout <- c(x1=x1, x2=x2, f=f)
 
     f
 }

This would work with 'env$fout <- c(x1=x1, x2=x2, f=f)' following the
'else'. Hence, David's version might look like this:

fr <- function(x, env) {   ## Rosenbrock Banana function 
    x1 <- x[1]
    x2 <- x[2]
    f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2
    if(exists('fout', env))
        env$fout <- rbind(env$fout, c(x1, x2, f))
    else
        env$fout <- c(x1=x1, x2=x2, f=f)
    f
}

out <- new.env()
ans <- optim(c(-1.2, 1), fr, env=out)
out$fout

-Matt

 out <- new.env()
 ans <- optim(c(-1.2, 1), fr, env=out)
 out$fout
 # NULL
 
   Is there no '[[<-' for environments? (Also tried '<<-' but I know  
 that is sinful.)
 
 -- 
 David.
     else
         fout <- c(x1=x1, x2=x2, f=f)
     assign('fout', fout, env)
     f
  }
 
  out <- new.env()
  ans <- optim(c(-1.2, 1), fr, env=out)
  out$fout
 

Re: [R] Elbow criterion

2011-06-20 Thread Matt Shotwell
On Mon, 2011-06-20 at 13:38 +0200, Dominik P.H. Kalisch wrote:
 Hi,
 
 I would like to cluster a dataset with the ward algorithm.

I'm assuming that this refers to the agglomerative partitioning method
[1]. That is, the number of clusters is selected according to the data
partition that is sequentially optimal with respect to an `objective
function'. In order to apply the elbow criterion, it should be possible
to optimize over subsets of all possible data partitions where the
number of clusters is fixed.

Although the Ward method yields a sequence of data partitions with
decreasing cluster sizes, there is no guarantee that _any_ of these
partitions are optimal (except sequentially, of course). To apply the
elbow method post hoc seems dubious, but maybe no more so than the Ward
method itself.

There are clustering methods that optimize the data partition (w.r.t a
likelihood/posterior) with a fixed number of clusters, for instance,
those based on finite mixture models. The elbow principle and method
seem more valid in this context. See the R package 'mclust', and the
CRAN task view for cluster analysis:

http://cran.r-project.org/web/views/Cluster.html
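
With the caveats above in mind, a post hoc elbow plot for a Ward tree can be sketched in base R as follows. The USArrests data, the helper name within_ss, and the range 1:10 are my own choices for illustration. Note that "ward.D2" is the method name in R >= 3.1.0; older versions only offered "ward".

```r
# Built-in data, standardised so that no variable dominates the distances.
x <- scale(USArrests)

# Agglomerative clustering with Ward's criterion.
tree <- hclust(dist(x), method = "ward.D2")

# Total within-cluster sum of squares for a given cluster assignment.
within_ss <- function(x, cluster) {
  sum(vapply(split(as.data.frame(x), cluster),
             function(g) sum(scale(g, scale = FALSE)^2),
             numeric(1)))
}

# Cut the tree at k = 1, ..., 10 clusters and record the criterion.
k_values <- 1:10
wss <- vapply(k_values,
              function(k) within_ss(x, cutree(tree, k = k)),
              numeric(1))

plot(k_values, wss, type = "b",
     xlab = "Number of clusters",
     ylab = "Within-cluster sum of squares")
```

Because the partitions from cutree() are nested, the curve can only decrease as k grows; the 'elbow' is wherever the decrease visibly levels off, which remains a judgment call.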

 That works fine. But I can't find a method to plot the structure chart 
 to estimate the elbow criterion for the number of clusters.
 Can someone tell me how I can do it?
 
 Thanks for your help.
 Dominik
 

[1] Ward, J. H. (1963), “Hierarchical Grouping to Optimize an Objective
Function,” Journal of the American Statistical Association, 58, 236–244.

-- 
Matthew S. Shotwell
Assistant Professor, Department of Biostatistics
School of Medicine, Vanderbilt University
1161 21st Ave. S2323 MCN Office CC2102L
Nashville, TN 37232-2158



Re: [R] Can we prepare a questionaire in R

2011-06-08 Thread Matt Shotwell
As Mike had written, there are frameworks for web-development with R.
RApache http://www.rapache.net is one. Also, see the R package Rook:
http://cran.r-project.org/web/packages/Rook/index.html .

On Wed, 2011-06-08 at 17:26 +0530, amrita gs wrote:
 How can we create HTML forms in R

Wouldn't you rather create HTML forms in HTML? See the links above to
use R for server-side scripting, for example, to receive form data from
a web browser.

 


Re: [R] Question about curve function

2011-06-07 Thread Matt Shotwell
On Tue, 2011-06-07 at 16:17 +0200, Uwe Ligges wrote:
 
 On 07.06.2011 11:57, peter dalgaard wrote:
 
  On Jun 6, 2011, at 11:22 , Prof Brian Ripley wrote:
 
  As a further example of the trickiness, the function method of plot() 
  relies on curve(x, ...) being a request to plot the function x(x) against 
  x.  I've added a comment to that effect to the help page.
 
  Ouch. This springs to mind:
 
  fortune(106)
 
  If the answer is parse() you should usually rethink the question.
  -- Thomas Lumley
 R-help (February 2005)
 
 
  but curve() predates that insight by half a decade or more. It could 
  probably do with a redesign, if anyone is up to it.
 
  By the way, it really does work if the 2nd arg is an expression object (as 
  opposed to an expression evaluating to an expression object):
 
  do.call(curve,list(expression(x)))
 
  or
 
  cl <- quote(curve(x))
  cl[[2]] <- expression(x)
  eval(cl)
 
  (The trouble with nonstandard evaluation is that it doesn't follow standard 
  evaluation rules...)
 
 If this is not already a fortune, I will add it.

And one more for Uwe's principle: when discontent, circumvent!  :)

 Which is why I usually circumvent curve(). It is typically faster to 
 just evaluate a function at positions x and plot it rather than thinking 
 minutes about how curve() expects its arguments.
 
 Uwe
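
A minimal sketch of the "evaluate and plot" route described above, using the identity function from the original question. The grid of 101 points on [0, 1] is an arbitrary choice for illustration.

```r
# Evaluate the function on a grid and plot the points directly
f <- function(x) x                  # the identity function from the question
x <- seq(0, 1, length.out = 101)    # grid chosen for illustration
y <- f(x)
plot(x, y, type = "l")
```

Nothing here depends on how curve() interprets its first argument, which is the point.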
 


Re: [R] Question about curve function

2011-06-05 Thread Matt Shotwell
I think there is trouble because expr in curve(expr) may be the name of
a function, and it's ambiguous whether 'x' should be interpreted as a
mathematical expression involving x, or the name of a function. Here are
some examples that work:

curve(I(x))
curve(1*x)
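
To make the ambiguity concrete, here is a small sketch. It relies on curve() invisibly returning the plotted points as a list with components x and y, as documented in current versions of R; the intervals are arbitrary.

```r
# A bare name is taken to be a function and is called with x as its argument
a <- curve(sin, 0, pi)        # treated like curve(sin(x), 0, pi)
b <- curve(sin(x), 0, pi)
all.equal(a$y, b$y)

# Wrapping x in a call makes it an expression in x rather than a name
d <- curve(I(x), 0, 1)
e <- curve(1 * x, 0, 1)
all.equal(as.numeric(d$y), e$y)
```

So curve(x) is read as a request to plot a function named x, which is why it fails when no such function exists.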

On Sun, 2011-06-05 at 12:07 -0500, Abhilash Balakrishnan wrote:
 Dear Sirs,
 
 I am a new user of the R package.  When I try to use the curve function it
 confuses me.
 
  curve(x^2)
 Works fine.
 
  curve(x)
 Makes a complaint I don't understand.  Why is x^2 valid and x is not?
 
 I check the documentation of curve, and it says the first argument must be
 an expression containing x.
 
  expression(x)
 Is an expression containing x.
 
  curve(expression(x))
 Makes a different complaint and mentions different lengths of x and y (but I
 use no y here).
 
 I understand that plotting the function y(x) = x is rather silly, but I want
 to know what I am doing wrong, for the sake of my understanding of how R
 works.
 
 Thank you for support.
 Abhilash B.
 


Re: [R] creating a vector from a file

2011-05-31 Thread Matt Shotwell
On Tue, 2011-05-31 at 15:36 +0200, heimat los wrote:
 Hello all,
 I am new to R and my question should be trivial. I need to create a word
 cloud from a txt file containing the words and their occurrence number. For
 that purposes I am using the snippets package [1].
 As it can be seen at the bottom of the link, first I have to create a vector
 (is that right that words is a vector?) like below.
 
  words <- c(apple=10, pie=14, orange=5, fruit=4)
 
 My problem is to do the same thing but create the vector from a file which
 would contain words and their occurrence number. I would be very happy if you
 could give me some hints.

How is the file formatted? Can you provide a small example?

 Moreover, to understand the format of the file to be inserted I write the
 vector words to a file.
 
  write(words, file="words.txt")
 
 However, the file words.txt contains only the values but not the
 names(apple, pie etc.).
 
 $ cat words.txt
 10 14 5 4
 
 It seems that I have to understand more about the data types in R.
 
 Thanks.
 PH
 
 http://www.rforge.net/doc/packages/snippets/cloud.html
 


Re: [R] creating a vector from a file

2011-05-31 Thread Matt Shotwell
On Tue, 2011-05-31 at 16:19 +0200, heimat los wrote:
 The file format is
 
 video tape=8
 object recognition=45
 object detection=23
 vhs tape=2
 
 But I can change it if needed with bash scripting.

A CSV might be more universal, but this will do.

 Regards
 

OK. Save the above as 'words.txt', then from the R prompt:

words.df <- read.table("words.txt", sep="=")
words.vec <- words.df$V2
names(words.vec) <- words.df$V1

Then use words.vec with the snippets::cloud function. I wasn't able to
install the snippets package and test the cloud function, because I am
still using R 2.13.0-alpha.

read.table returns what R calls a 'data frame'; basically a collection
of records over some number of fields. It's like a matrix but different,
since fields may take values of different types. In the example above,
the data frame returned by read.table has two fields named 'V1' and
'V2', respectively. The R expression 'words.df$V2' references the 'V2'
field of words.df, which is a vector. The last expression sets names for
words.vec, by referencing the 'V1' field of words.df. 
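
The same steps as a self-contained sketch that can be pasted at the prompt. The temporary file stands in for 'words.txt', and stringsAsFactors = FALSE and setNames() are my additions here to keep the names as plain character strings; neither is required by the cloud function as far as I know.

```r
# Write the example lines to a temporary file so the sketch is self-contained
words.file <- tempfile(fileext = ".txt")
writeLines(c("video tape=8", "object recognition=45",
             "object detection=23", "vhs tape=2"), words.file)

# Split each line at '='; keep the words as character strings, not factors
words.df <- read.table(words.file, sep = "=", stringsAsFactors = FALSE)

# Named vector: values from field V2, names from field V1
words.vec <- setNames(words.df$V2, words.df$V1)
words.vec
```

The result has the same shape as the hand-written c(apple=10, pie=14, ...) vector from the original question.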

  


Re: [R] blank space escape sequence in R?

2011-04-25 Thread Matt Shotwell
You can embed hex escapes in strings (except \x00). The value(s) that
you embed will depend on the character encoding used on your platform. If
this is UTF-8, or some other ASCII compatible encoding, \x20 will work:

> "foo\x20bar"
[1] "foo bar"


For other locales, you might try charToRaw( ) to see the binary (hex)
representation for the space character on your platform, and substitute
this sequence instead.
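
A short sketch of that check. The variable name space.bytes is just for illustration; on an ASCII compatible platform the single byte should be 20 (hex).

```r
# Bytes that encode a space character on this platform
space.bytes <- charToRaw(" ")
space.bytes

# Round trip: the same bytes converted back give a space again
identical(rawToChar(space.bytes), " ")
```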

On Mon, 2011-04-25 at 15:01 +0200, Mark Heckmann wrote:
 Is there a blank space escape sequence in R, i.e. something like \sp etc. to 
 produce a blank space?
 
 TIA
 Mark
 –––
 Mark Heckmann
 Blog: www.markheckmann.de
 R-Blog: http://ryouready.wordpress.com
 


Re: [R] blank space escape sequence in R?

2011-04-25 Thread Matt Shotwell
I may have misread your original email. Whether you use a hex escape or
a space character, the resulting string in memory is identical:

> identical("a\x20b", "a b")
[1] TRUE

But, if you were to read a file containing the six characters a\x20b
(say with readLines), then the six characters would be read into
memory, and printed like this:

"a\\x20b"

That is, not with a space character substituted for \x20. So, now I'm
not sure this is a solution.
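
To illustrate the file case, a small sketch using a temporary file. The gsub() step at the end is one possible workaround of my own, not something from the original question, and it assumes the only escape of interest is the literal text \x20.

```r
# A file holding the six literal characters  a \ x 2 0 b
tmp <- tempfile()
writeLines("a\\x20b", tmp)

txt <- readLines(tmp)
nchar(txt)                       # six characters; no space was substituted

# Possible workaround: replace the literal text \x20 after reading
fixed.txt <- gsub("\\x20", " ", txt, fixed = TRUE)
identical(fixed.txt, "a b")
```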



Re: [R] Converting 16-bit to 8-bit encoding?

2011-04-21 Thread Matt Shotwell

On 04/21/2011 10:36 AM, Brian Buma wrote:

Hello all-

I have a question related to encoding.  I'm using a separate program which
takes either 16 bit or 8 bit (flat binary files) as inputs (they are raster
satellite imagery and the associated quality files), but can't handle both
at the same time.  Problem is the quality and the image come in different
formats (quality- 8bit, image- 16bit).  I need to switch the encoding on the


I think some more detail about these files is necessary. What do these 
16/8 bit quantities represent? Are these files just a sequence of such 
quantities, or is there meta information (i.e. image dimension)?



quality files to 16 bit, without altering anything else (they are img files
right now).  I imagine this is a fairly simple process, but I haven't been


Does 'img files' indicate that these files are formatted according to a 
standard? Finally, are you using some R code to manipulate these files? 
Do you have an example, including data?



able to find a package or anything which can tell me how to do it- perhaps
I'm searching the wrong terms, but I did look.  Are there any methods to do
this quickly?  Ideally, the solution would involve reading in a list of
files and replacing the original with the new, 16 bit version, as I have
over 300 files to convert.  I hope that's clear.  Thanks in advance!




--
Matthew S Shotwell   Assistant Professor   School of Medicine
 Department of Biostatistics   Vanderbilt University



Re: [R] Converting 16-bit to 8-bit encoding?

2011-04-21 Thread Matt Shotwell

OK. I'm going to copy this back to R-help too.

With R, we can convert a file of 8-bit integers to 16-bit integers like so:

# Create a test file of 8-bit integers:
con <- file("test.8", "wb")
writeBin(sample(-1L:4L, 1024, TRUE), con, size=1)
close(con)

# Convert test.8 to test.16
icon <- file("test.8", "rb")
ocon <- file("test.16", "wb")
while(length(dat <- readBin(icon, "integer", 1024, size=1)) > 0)
    writeBin(dat, ocon, size=2)
close(icon)
close(ocon)

This assumes (without considering a more formal description of the 
format) that the file and your computing platform agree on how 
multi-byte signed integers are represented.
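
Since there are a few hundred files, the loop above could be wrapped in a function. This is only a sketch: convert8to16 is a hypothetical name, the endian argument is an assumption that should be set to whatever the consuming program expects, and the directory and file pattern in the commented loop are made up.

```r
# Convert one headerless file of signed 8-bit integers to signed 16-bit integers.
# 'endian' is an assumption: set it to match what the consuming program expects.
convert8to16 <- function(infile, outfile, endian = .Platform$endian,
                         chunk = 1024L) {
    icon <- file(infile, "rb")
    on.exit(close(icon), add = TRUE)
    ocon <- file(outfile, "wb")
    on.exit(close(ocon), add = TRUE)
    while (length(dat <- readBin(icon, "integer", chunk, size = 1)) > 0)
        writeBin(dat, ocon, size = 2, endian = endian)
    invisible(outfile)
}

# Small self-check on temporary files
in8   <- tempfile(fileext = ".8")
out16 <- tempfile(fileext = ".16")
vals  <- sample(-1L:4L, 5000, TRUE)
writeBin(vals, in8, size = 1)
convert8to16(in8, out16)
back <- readBin(out16, "integer", length(vals), size = 2)
identical(back, vals)

# For many files, something like this (directory and pattern are hypothetical):
# for (f in list.files("quality", pattern = "\\.img$", full.names = TRUE))
#     convert8to16(f, paste0(f, ".16"))
```

Writing to a new file and renaming afterwards is safer than overwriting the original in place.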


Hope that will get you going.

On 04/21/2011 11:02 AM, Brian Buma wrote:

Apologies.  The 8-bit file (the one that needs to be converted) is just
a series of integers, -1 to 4, which is no doubt why they are encoded in
8 bit.  They don't need to be changed numerically, just put in a 16-bit
encoding.  No meta info, headerless.  All the data is MODIS satellite
imagery.

I have been using the raster program to visualize things, and
processing (when I get that far) will be done in that program mainly.
I've used that program on a different project, and it seemed to work
well.  The actual program that can't handle two different inputs is
Timesat, a phenology-program (not R).  I was thinking that R could
probably do this conversion quick and easy (fairly), but haven't figured
out how to yet.

As an example, I have an NDVI file (flat binary, 16bit encoding)- so a
string of numbers, 4450, 4650, etc...  The associated quality file is
another string, 1,1,2,1,0, etc.  It's encoded as an 8bit file.
Conceptually, all it needs (I think) is to be read in and resaved in the
less memory-efficient 16-bit format.

Thanks!  Sorry if the explanation isn't clear.



--


Brian Buma
PhD Candidate
Ecology and Evolutionary Biology / CIRES
University of Colorado, Boulder

brian.b...@colorado.edu mailto:brian.b...@colorado.edu




--
Matthew S Shotwell   Assistant Professor   School of Medicine
 Department of Biostatistics   Vanderbilt University



Re: [R] print.raw - but convert ASCII?

2011-04-19 Thread Matt Shotwell
On Tue, 2011-04-19 at 03:14 -0400, Duncan Murdoch wrote:
 On 11-04-18 9:51 PM, Matt Shotwell wrote:
  Does anyone know if there is a simple way to print raw vectors, such
  that ASCII characters are printed for bytes in the ASCII range, and
  their hex representation otherwise? rawToChar doesn't work when we have
  something like c(0x00, 0x00, 0x44, 0x00).
 
 Do you really need hex?  rawToChar(x, multiple=TRUE) comes close, but 
 displays using octal or symbolic escapes, e.g.

No, but I've almost learned to count efficiently in hex. :)

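   [1] ""     "\001" "\002" "\003" "\004" "\005" "\006" "\a"   "\b"   "\t"   "\n"
  [12] "\v"   "\f"   "\r"   "\016" "\017" "\020" "\021" "\022" "\023" "\024" "\025"
  [23] "\026" "\027" "\030" "\031" "\032" "\033" "\034" "\035" "\036" "\037" " "
  [34] "!"    "\""   "#"    "$"    "%"    "&"    "'"    "("    ")"    "*"    "+"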
 
 If you really do want hex, then you'll need something like
 
 ifelse(x < 32 | x >= 127, as.character(x), rawToChar(x, multiple=TRUE))

That does it. Thanks. -Matt

 Duncan Murdoch
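
A variation on the suggestion above, written as a small function. The name raw2display is hypothetical. It only converts the printable subset with rawToChar, so nul bytes never reach it, and leaves every other byte as two-digit hex.

```r
# Printable ASCII bytes become characters; all other bytes stay as two-digit hex
raw2display <- function(x) {
    stopifnot(is.raw(x))
    code <- as.integer(x)
    out <- as.character(x)                  # hex text, e.g. "00", "44"
    printable <- code >= 32L & code < 127L
    out[printable] <- rawToChar(x[printable], multiple = TRUE)
    out
}

raw2display(as.raw(c(0x00, 0x00, 0x44, 0x00)))
```

One caveat: a printable digit and a hex value can look alike in the result, so this is for eyeballing rather than for parsing back.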



[R] print.raw - but convert ASCII?

2011-04-18 Thread Matt Shotwell
Does anyone know if there is a simple way to print raw vectors, such
that ASCII characters are printed for bytes in the ASCII range, and
their hex representation otherwise? rawToChar doesn't work when we have
something like c(0x00, 0x00, 0x44, 0x00).

-Matt



Re: [R] integer and floating-point storage

2011-04-14 Thread Matt Shotwell

Hi Mike,

There are some facilities for storing and manipulating small (2 bit) 
integers. See here:


http://cran.r-project.org/web/packages/ff/index.html

-Matt
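
Separately from that package, base R's raw type stores one byte per element, and four values in 0..3 can be packed into each byte with integer arithmetic. The following is only a sketch in base R, not the ff interface; pack2bit and unpack2bit are hypothetical helper names, and the input length is assumed to be a multiple of four.

```r
# Pack four 2-bit values (0..3) into each byte of a raw vector
pack2bit <- function(v) {
    stopifnot(all(v %in% 0:3), length(v) %% 4 == 0)
    m <- matrix(as.integer(v), nrow = 4)
    as.raw(m[1, ] + 4L * m[2, ] + 16L * m[3, ] + 64L * m[4, ])
}

# Recover the original values, in the original order
unpack2bit <- function(r) {
    b <- as.integer(r)
    as.vector(rbind(b %% 4L, (b %/% 4L) %% 4L, (b %/% 16L) %% 4L, b %/% 64L))
}

v <- c(0L, 1L, 2L, 3L, 3L, 2L, 1L, 0L)
p <- pack2bit(v)
length(p)                        # one byte per four values
identical(unpack2bit(p), v)
```

The packed vector has to be unpacked before arithmetic or regression, so this trades computation for memory.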

On 04/14/2011 01:20 PM, Mike Miller wrote:

I note that current implementations of R use 32-bit integers for
integer vectors, but I am working with large arrays that contain
integers from 0 to 3, so they could be stored as unsigned 8-bit
integers. Can R do this? (FYI -- This is for storing minor-allele counts
for genetic studies. There are 0, 1 or 2 minor alleles and 3 would
represent missing.)

It is theoretically possible to store such data with four integers per
byte. This is what PLINK (GPL license) does in its binary (.bed)
pedigree format:

http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml#ped

That might be too much to hope for. ;-)

I think that the R system uses double-precision floating point numbers
by default. When I impute minor-allele counts, I get posterior expected
values ranging from 0 to 2 (called dosages). The imputation isn't very
precise, so it would be fine to store such data using one or two bytes.
(The values are used as regressors and small changes would have minimal
impact on results.) I could use unsigned 8-bit integers (0 to 255),
probably using only 0 to 254 so that 1 and 2 could be represented with
perfect precision as 127/127 and 254/127 (but I would do regression on
the integer values). Or I could use 16 bits, doubling memory load and
improving precision. It would be convenient if R could work with
half-precision floating-point numbers (binary16):

http://en.wikipedia.org/wiki/Half_precision_floating-point_format

Can R do that?

If not, is anyone interested in working on developing some of these
features in R? We have GPL code from PLINK and Octave that might help a
lot.

http://www.gnu.org/software/octave/doc/interpreter/Integer-Data-Types.html

Best,

Mike

--
Michael B. Miller, Ph.D.
Bioinformatics Specialist
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota





--
Matthew S Shotwell   Assistant Professor   School of Medicine
 Department of Biostatistics   Vanderbilt University



[R] understanding dump.frames; typo;

2011-04-12 Thread Matt Shotwell
When a function I have stop()s, I'd like it to return its evaluation 
frame, but not halt execution of the script. In experimenting with this, 
I became confused with dump.frames. From ?dump.frames:


 If ‘dump.frames’ is installed as the error handler, execution will
 continue even in non-interactive sessions.  See the examples for
 how to dump and then quit.

Suppose I save the following script to dump-test.R:

options(error=dump.frames)
cat("interactive:", interactive(), "\n")
f <- function() {
    stop("dump-test-error")
    cat("execution continues within f\n")
}
f()
cat("execution continues outside of f\n")
if(exists("last.dump"))
    cat("last.dump is available\n")

From an interactive R prompt, execution is halted at 'stop':

R> source('dump-test.R')
interactive: TRUE
Error in f() : dump-test-error

Using Rscript, execution continues depending on whether you source() the 
file with the -e flag, or pass the file as an argument.


matt@pal ~$ Rscript dump-test.R
interactive: FALSE
Error in f() : dump-test-error
execution continues outside of f
last.dump is available

matt@pal ~$ Rscript -e "source('dump-test.R')"
interactive: FALSE
Error in f() : dump-test-error
Calls: source -> eval.with.vis -> eval.with.vis -> f

It seems that interactiveness (as tested by interactive()) doesn't come 
into play, yet execution does *not* always continue. What am I missing? 
Alternative solutions are also welcome.
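
One alternative that does not depend on the error option at all, offered as a sketch rather than an answer to the question above: call dump.frames() from a calling handler, and let an outer tryCatch() absorb the error so the script continues however it is run. The variable name msg is just for illustration.

```r
f <- function() {
    stop("dump-test-error")
    cat("execution continues within f\n")   # never reached
}

# The calling handler runs while f's frame is still on the stack, so
# dump.frames() can record it; tryCatch() then absorbs the error.
msg <- tryCatch(
    withCallingHandlers(f(), error = function(e) dump.frames()),
    error = function(e) conditionMessage(e)
)

cat("execution continues outside of f\n")
if (exists("last.dump"))
    cat("last.dump is available\n")
```

With the default arguments, dump.frames() assigns last.dump in the global environment, so it can be inspected later with debugger(last.dump).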


-Matt

P.S. There is a typo in the help file: "The dumped object contain the 
call stack..." should read "The dumped object contains the call stack".


> sessionInfo()
R version 2.13.0 alpha (2011-03-18 r54865)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.13.0

--
Matthew S Shotwell   Assistant Professor   School of Medicine
 Department of Biostatistics   Vanderbilt University



Re: [R] Examples of web-based Sweave use?

2011-04-06 Thread Matt Shotwell
That's an interesting idea. I had written a long email describing a 
proof-of-concept, but decided to post it to the website below instead.


http://biostatmatt.com/archives/1184

Matt

On 04/04/2011 07:31 AM, carslaw wrote:

I appreciate that this is OT, but I'd be grateful for pointers to examples of
where
Sweave has been used for web-based applications.  In particular, examples of
where reports/analyses are produced automatically through submission of data
to a web-sever.  I am mostly interested in situations where pdf reports have
been produced rather than, say, a plot/table etc shown on a web page.

I've had limited success finding examples on this.

Many thanks.

David Carslaw


Environmental Research Group
MRC-HPA Centre for Environment and Health
King's College London
Franklin Wilkins Building
Stamford Street
London SE1 9NH

david.cars...@kcl.ac.uk






--
Matthew S Shotwell   Assistant Professor   School of Medicine
 Department of Biostatistics   Vanderbilt University



Re: [R] library(foreign) read.spss warning

2011-03-26 Thread Matt Shotwell
There is some information about this subtype, and about other subtypes
not yet implemented by read.spss, in the PSPP source code. The PSPP source
code indicates that this subtype consists of "Value labels for long
strings", which isn't very illuminating to me (probably because I don't
use PSPP, or SPSS, though I increasingly have need to import SPSS data
files). Copied below are the relevant bits.

-Matt

From (the PSPP source file) src/data/sys-file-reader.c:

enum
  {
    /* subtypes 0-2 unknown */
    EXT_INTEGER       = 3,      /* Machine integer info. */
    EXT_FLOAT         = 4,      /* Machine floating-point info. */
    EXT_VAR_SETS      = 5,      /* Variable sets. */
    EXT_DATE          = 6,      /* DATE. */
    EXT_MRSETS        = 7,      /* Multiple response sets. */
    EXT_DATA_ENTRY    = 8,      /* SPSS Data Entry. */
    /* subtypes 9-10 unknown */
    EXT_DISPLAY       = 11,     /* Variable display parameters. */
    /* subtype 12 unknown */
    EXT_LONG_NAMES    = 13,     /* Long variable names. */
    EXT_LONG_STRINGS  = 14,     /* Long strings. */
    /* subtype 15 unknown */
    EXT_NCASES        = 16,     /* Extended number of cases. */
    EXT_FILE_ATTRS    = 17,     /* Data file attributes. */
    EXT_VAR_ATTRS     = 18,     /* Variable attributes. */
    EXT_MRSETS2       = 19,     /* Multiple response sets (extended). */
    EXT_ENCODING      = 20,     /* Character encoding. */
    EXT_LONG_LABELS   = 21      /* Value labels for long strings. */
  };

and

  static const struct extension_record_type types[] =
    {
      /* Implemented record types. */
      { EXT_INTEGER,      4, 8 },
      { EXT_FLOAT,        8, 3 },
      { EXT_MRSETS,       1, 0 },
      { EXT_DISPLAY,      4, 0 },
      { EXT_LONG_NAMES,   1, 0 },
      { EXT_LONG_STRINGS, 1, 0 },
      { EXT_NCASES,       8, 2 },
      { EXT_FILE_ATTRS,   1, 0 },
      { EXT_VAR_ATTRS,    1, 0 },
      { EXT_MRSETS2,      1, 0 },
      { EXT_ENCODING,     1, 0 },
      { EXT_LONG_LABELS,  1, 0 },

      /* Ignored record types. */
      { EXT_VAR_SETS,     0, 0 },
      { EXT_DATE,         0, 0 },
      { EXT_DATA_ENTRY,   0, 0 },
    };


On Fri, 2011-03-25 at 18:39 -0500, Robert Baer wrote:
 I got the following:
  library(foreign)
  swal = read.spss("swallowing.sav", to.data.frame =TRUE)
 Warning message:
 In read.spss("swallowing.sav", to.data.frame = TRUE) :
   swallowing.sav: Unrecognized record type 7, subtype 21 encountered in 
 system file
  
 
 The bulk of the data seems to read in  a usable form, but I'm curious about 
 what might be getting lost because I don't know how to translate type 7, 
 subtype 21.  I did not generate the SPSS data so I'm not certain of the 
 version, but I'm assuming version 18 or 19.  I did a quick Find on the PSPP 
 manual for Type 7 and subtype 21 and came up dry.
 
 Any insights or clues how I might learn more?  
 
 Thanks,
 Rob
 
 
  R.Version()
 $platform
 [1] i386-pc-mingw32
 
 $arch
 [1] i386
 
 $os
 [1] mingw32
 
 $system
 [1] i386, mingw32
 
 $status
 [1] 
 
 $major
 [1] 2
 
 $minor
 [1] 12.2
 
 $year
 [1] 2011
 
 $month
 [1] 02
 
 $day
 [1] 25
 
 $`svn rev`
 [1] 54585
 
 $language
 [1] R
 
 $version.string
 [1] R version 2.12.2 (2011-02-25)
 
 
 
 --
 Robert W. Baer, Ph.D.
 Professor of Physiology
 Kirksville College of Osteopathic Medicine
 A. T. Still University of Health Sciences
 Kirksville, MO 63501
 660-626-232
 FAX 660-626-2965
 


Re: [R] Venn Diagram corresponding to size in R

2011-03-09 Thread Matt Shotwell
Try here:
https://stat.ethz.ch/pipermail/r-help/2003-February/029393.html


On Tue, 2011-03-08 at 20:25 -0500, Shira Rockowitz wrote:
 I was wondering if anyone could help me figure out how to make a Venn
 diagram in R where the circles are scaled to the size of each dataset.  I
 have looked at the information for venn (in gplots) and vennDiagram (in
 limma) and I cannot seem to figure out what parameter to change.  I have
 looked this up online and do not seem to be seeing anyone else who has
 posted this question or the answer to it before.  I see graphs though that
 are purported to be made in R that are scaled like this, so I think it must
 be possible, although I do not know if they were made with a custom
 function.  If I have just not been searching for this question correctly,
 and it has already been asked, please direct me to the earlier question.  I
 would like to thank you all in advance for your help!
 ~Shira
 
   [[alternative HTML version deleted]]
 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] assignment by value or reference

2011-03-08 Thread Matt Shotwell
On 03/08/2011 07:20 AM, Xiaobo Gu wrote:
 On Wed, Sep 15, 2010 at 5:05 PM, Uwe Ligges
 lig...@statistik.tu-dortmund.de  wrote:
 See the R Language Definition manual. Since R knows about lazy
evaluation,
 it is sometimes neither by reference nor by value.
 If you want to think binary, then by value fits better than by
 reference.
 Hi,
 Can we think it's eventually by value?

Not always (see in-line below).


 For simple functions such as:
 is(df[[1]], "logical")
 used to test whether the first column of data frame df is of type
 logical, will a new vector be created and used inside the is function?

No, df[[1]] isn't copied in this case. However, if you subset an atomic
vector (subset+assignment is different!), there is copying. For example:

> df <- data.frame(x=c(FALSE,TRUE))
> tracemem(df[[1]])
[1] "<0x217afa8>"
> is(df[[1]], "logical")
[1] TRUE
> is(df[[1]][],  "logical")
tracemem[0x217afa8 -> 0xf9d198]: ...cut...
[1] TRUE
> is(df[[1]][1], "logical")
[1] TRUE

Note that tracemem doesn't catch the copying that occurs during 
evaluation of the last expression. As a strategy, R avoids copying when 
it's clearly not necessary from the perspective of the R interpreter. 
There are some notable cases where copying is obviously not necessary 
from the user perspective (e.g. contiguous subsetting), but avoiding a 
copy in these cases might be difficult to implement in R's 
parser/evaluator framework. Here's another simple exception:

> x <- 1
> tracemem(x)
[1] "<0x18984b8>"
> x <- x + 1
tracemem[0x18984b8 -> 0x207e568]: ...cut...


 Another example,

 dbWriteTable(con, "tablename", df) will write the content of data
 frame df into a database table, will a new data frame object be created
 and used inside the dbWriteTable function?

No, but if dbWriteTable modifies its local variable that was assigned
df, then df may be copied.
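
A minimal sketch of that last point, using a made-up helper (modify_local is hypothetical and is not part of DBI or any package): when a function modifies the local variable bound to its argument, the change lands on a copy and the caller's object stays as it was.

```r
# Hypothetical helper, for illustration only: it changes its local
# variable 'd', so R works on a copy rather than on the caller's object.
modify_local <- function(d) {
  d$x[1] <- NA   # the modification happens on the function's own copy
  d
}

df <- data.frame(x = c(FALSE, TRUE))
changed <- modify_local(df)

df$x       # the caller's data frame is unchanged
changed$x  # only the returned copy carries the change
```

A function that only reads its argument gives R no reason to copy it.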


 Thanks.



 Uwe Ligges



 On 05.09.2010 17:19, Xiaobo Gu wrote:

 Hi Team,

   Can you please tell me the rules of assignment in R, by
value or
 by reference.

  From my about 3 months of experience of part time job of R, it
seems most
 times it is by value, especially in function parameter and return
values
 assignment; and it is by reference when referencing container
sub-objects of

This is a function call convention (i.e. passing by value), as 
distinguished from an assignment convention (I'm not certain they're
equivalent in R). In general R functions pass by value. There are
exceptions here also, notably R environments. For 
example:

> f <- function(e) assign("a", 1, e)
> e <- new.env()
> f(e)
> objects(e)
[1] "a"

Under strict pass-by-value convention, e would remain unchanged. In 
general, assignments are by value. However, R environments are an 
exception; assignment is by reference:

> r <- e
> objects(r)
[1] "a"
> assign("b", 2, r)
> objects(r)
[1] "a" "b"
> objects(e)
[1] "a" "b"

In this sense, the calling/assignment convention is a property of the 
objects being passed/assigned. I think that is consistent with Uwe's 
comment above.
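
The same behaviour can be put to work deliberately. The following is only a sketch (the names counter, increment and alias are invented for the example): an environment is used as a small mutable counter, and both the function call and the plain assignment act on the one shared object.

```r
counter <- new.env()
assign("n", 0, envir = counter)

# Takes the environment as an argument and changes it in place.
increment <- function(env) {
  assign("n", get("n", envir = env) + 1, envir = env)
  invisible(NULL)
}

increment(counter)
increment(counter)
get("n", envir = counter)   # 2: both calls changed the same environment

alias <- counter            # a second name for the same environment, not a copy
assign("n", 10, envir = alias)
get("n", envir = counter)   # 10: the change is visible through 'counter'
identical(alias, counter)   # TRUE
```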

Best,
Matt
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

 container objects, such as elements of List objects and row/column
objects
 of DataFrame objects; but it is by value when referencing the
smallest unit
 of element of a container object, such as cell of data frame
objects.





 Xiaobo.Gu




 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rapache ( was Developing a web crawler )

2011-03-06 Thread Matt Shotwell
On Sun, 2011-03-06 at 08:06 -0500, Mike Marchywka wrote: 
 
 
 
 
 
 
  Date: Thu, 3 Mar 2011 13:04:11 -0600
  From: matt.shotw...@vanderbilt.edu
  To: r-help@r-project.org
  Subject: Re: [R] Developing a web crawler / R webkit or something 
  similar? [off topic]
 
  On 03/03/2011 08:07 AM, Mike Marchywka wrote:
  
  
  
  
  
  
  
   Date: Thu, 3 Mar 2011 01:22:44 -0800
   From: antuj...@gmail.com
   To: r-help@r-project.org
   Subject: [R] Developing a web crawler
  
   Hi,
  
   I wish to develop a web crawler in R. I have been using the 
   functionalities
   available under the RCurl package.
   I am able to extract the html content of the site but i don't know how 
   to go
  
   In general this can be a big effort but there may be things in
   text processing packages you could adapt to execute html and javascript.
   However, I guess what I'd be looking for is something like a webkit
   package or other open source browser with or without an R interface.
   This actually may be an ideal solution for a lot of things as you get
   all the content handlers of at least some browser.
  
  
   Now that you mention it, I wonder if there are browser plugins to handle
   R content ( I'd have to give this some thought, put a script up as
   a web page with mime type test/R and have it execute it in R. )
 
  There are server-side solutions for this sort of thing. See
  http://rapache.net/ . Also, there was a string of messages on R-devel
  some years ago addressing the mime type issue; beginning here:
  http://tolstoy.newcastle.edu.au/R/devel/05/11/3054.html . Though I don't
  know whether there was a resolution. Some suggestions were text/x-R,
  text/x-Rd, application/x-RData.
 
 The rapache demo looks like something I could use right away
 but I haven't looked into the handlers yet. I have installed rapache now
 on my debian system ( still have config issues but I did get apache2 to 
 restart LOL)
 Before I plow into this too far, how would this compare/compete with something
 like a PHP library for Rserve? That is the approach I had been pursuing.
 
 Thanks. 

Hi Mike, 

If you've built and configured RApache, then the difficult plowing is
over :). RApache operates at the top (HTTP) layer of the OSI stack,
whereas Rserve works at the lower transport/network layer. Hence, the
scope of Rserve applications is far more general. Extending Rserve to
operate at the HTTP layer (via PHP) will mean more work.

RApache offers high level functionality, for example, to replace PHP
with R in web pages. No interface code is necessary. Here's a simple
What's The Time? webpage using RApache and yarr [1] to handle the
code:

 setContentType("text/html\n\n") 
<html>
<head><title>What's The Time?</title></head>
<body><pre>/= cat(format(Sys.time(), usetz=TRUE)) </pre></body>
</html>

Here's a live version: [2]. Interfacing PHP with Rserve in this context
would be useful if installation of R and/or RApache on the web host were
prohibited. A PHP/Rserve framework might also be useful in other
contexts, for example, to extend PHP applications (e.g. WordPress,
MediaWiki).

Best,
Matt

[1] http://biostatmatt.com/archives/1000
[2] http://biostatmatt.com/yarr/time.yarr

 
  -Matt
 
  
 
 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Developing a web crawler / R webkit or something similar? [off topic]

2011-03-03 Thread Matt Shotwell

On 03/03/2011 08:07 AM, Mike Marchywka wrote:









Date: Thu, 3 Mar 2011 01:22:44 -0800
From: antuj...@gmail.com
To: r-help@r-project.org
Subject: [R] Developing a web crawler

Hi,

I wish to develop a web crawler in R. I have been using the functionalities
available under the RCurl package.
I am able to extract the html content of the site but i don't know how to go


In general this can be a big effort but there may be things in
text processing packages you could adapt to execute html and javascript.
However, I guess what I'd be looking for is something like a webkit
package or other open source browser with or without an R interface.
This actually may be an ideal solution for a lot of things as you get
all the content handlers of at least some browser.


Now that you mention it, I wonder if there are browser plugins to handle
R content ( I'd have to give this some thought, put a script up as
a web page with mime type test/R and have it execute it in R. )


There are server-side solutions for this sort of thing. See 
http://rapache.net/ . Also, there was a string of messages on R-devel 
some years ago addressing the mime type issue; beginning here: 
http://tolstoy.newcastle.edu.au/R/devel/05/11/3054.html . Though I don't 
know whether there was a resolution. Some suggestions were text/x-R, 
text/x-Rd, application/x-RData.


-Matt






about analyzing the html formatted document.
I wish to know the frequency of a word in the document. I am only acquainted
with analyzing data sets.
So how should i go about analyzing data that is not available in table
format.

Few chunks of code that i wrote:
w <- getURL("http://www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B003DZ1Y8Q/ref=dp_reviewsanchor#FullQuotes")
write.table(w, "test.txt")
t <- readLines(w)

readLines also didn't prove to be of any help.

Any help would be highly appreciated. Thanks in advance.
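
For the word-frequency part of the question, a rough base-R sketch follows. It assumes the page source is already held in a single character string (here a small made-up stand-in for what getURL() returns), and it strips tags with a crude regular expression; a real HTML parser, such as the one in the XML package, would likely be more robust on actual pages.

```r
# Stand-in for the page source; in practice this would be the string
# returned by RCurl::getURL().
html <- "<html><body><h1>Kindle review</h1><p>The screen is great. The battery is great too.</p></body></html>"

text  <- gsub("<[^>]+>", " ", html)                   # drop anything that looks like a tag
words <- strsplit(tolower(text), "[^a-z]+")[[1]]      # split on runs of non-letters
words <- words[nzchar(words)]                         # remove empty strings left by the split
freq  <- sort(table(words), decreasing = TRUE)        # word counts, most frequent first

freq[["great"]]   # how often "great" occurs
```

table() returns the count for every distinct word, so a single word can be looked up by name as in the last line.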


--
View this message in context: 
http://r.789695.n4.nabble.com/Developing-a-web-crawler-tp3332993p3332993.html
Sent from the R help mailing list archive at Nabble.com.




--
Matthew S Shotwell   Assistant Professor   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Robust variance estimation with rq (failure of the bootstrap?)

2011-03-01 Thread Matt Shotwell
Jim,

Thanks for pointing me to this article. The authors argue that the 
bootstrap intervals for a robust estimator may not be as robust as the 
estimator. In this context, robustness is measured by the breakdown 
point, which is supposed to measure robustness to outliers. Even so, the
authors found that the upper bound of a quantile bootstrap interval for 
the sample median was nearly as robust as the sample median. That brings
some comfort in using quantile bootstrap intervals in quantile
regression.

Does the sandwich estimator assume that errors are independent? And a 
related question: Does the rq function allow the user to specify 
clusters/grouping among the observations?

Best,
Matt

On Tue, 2011-03-01 at 05:35 -0600, James Shaw wrote:
 Matt:
 
 Thanks for your prompt reply.
 
 The disparity between the bootstrap and sandwich variance estimates
 derived when modeling the highly skewed outcome suggests that either
 (A) the empirical robust variance estimator is underestimating the
 variance or (B) the bootstrap is breaking down.  The bootstrap
 variance estimate of a robust location estimate is not necessarily
 robust, see Statistics & Probability Letters 50 (2000) 49-53.  Since
 submitting my earlier post, I have noticed that the robust kernel
 variance estimate is similar to the bootstrap estimate.  Under what
 conditions would one expect Koenker and Machado's sandwich variance
 estimator, which uses a local estimate of the sparsity, to fail?
 
 --
 Jim
 
 
 
 On Mon, Feb 28, 2011 at 8:59 PM, Matt Shotwell m...@biostatmatt.com wrote:
  Jim,
 
  If repeated measurements on patients are correlated, then resampling all
  measurements independently induces an incorrect sampling distribution
  (= incorrect variance) on a statistic of these data. One solution, as
  you mention, is the block or cluster bootstrap, which preserves the
  correlation among repeated observations in resamples. I don't
  immediately see why the cluster bootstrap is unsuitable.
 
  Beyond this, I would be concerned about *any* variance estimates that
  are blind to correlated observations.
 
  The bootstrap variance estimate may be larger than the asymptotic
  variance estimate, but that alone isn't evidence to favor one over the
  other.
 
  Also, I can't justify (to myself) why skew would hamper the quality of
  bootstrap variance estimates. I wonder how it affects the sandwich
  variance estimate...
 
  Best,
  Matt
 
  On Mon, 2011-02-28 at 17:50 -0600, James Shaw wrote:
  I am fitting quantile regression models using data collected from a
  sample of 124 patients.  When modeling cross-sectional associations, I
  have noticed that nonparametric bootstrap estimates of the variances
  of parameter estimates are much greater in magnitude than the
  empirical Huber estimates derived using summary.rq's "nid" option.
  The outcome variable is severely skewed, and I am afraid that this may
  be affecting the consistency of the bootstrap variance estimates.  I
  have read that the m out of n bootstrap can be used to overcome this
  problem.  However, this procedure requires both the original sample
  (n) and the subsample (m) sizes to be large.  The version implemented
  in rq.boot does not appear to provide any improvement over the naive
  bootstrap.  Ultimately, I am interested in using median regression to
  model changes in the outcome variable over time.  Summary.rq's robust
  variance estimator is not applicable to repeated-measures data.  I
  question whether the block (cluster) bootstrap variance estimator,
  which can accommodate intraclass correlation, would perform well.  Can
  anyone suggest alternatives for variance estimation in this situation?
  Regards,
 
  Jim
 
 
  James W. Shaw, Ph.D., Pharm.D., M.P.H.
  Assistant Professor
  Department of Pharmacy Administration
  College of Pharmacy
  University of Illinois at Chicago
  833 South Wood Street, M/C 871, Room 266
  Chicago, IL 60612
  Tel.: 312-355-5666
  Fax: 312-996-0868
  Mobile Tel.: 215-852-3045
 
 
 
 
 
 
 
 -- 
 James W. Shaw, Ph.D., Pharm.D., M.P.H.
 Assistant Professor
 Department of Pharmacy Administration
 College of Pharmacy
 University of Illinois at Chicago
 833 South Wood Street, M/C 871, Room 266
 Chicago, IL 60612
 Tel.: 312-355-5666
 Fax: 312-996-0868
 Mobile Tel.: 215-852-3045
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo

Re: [R] Robust variance estimation with rq (failure of the bootstrap?)

2011-02-28 Thread Matt Shotwell
Jim, 

If repeated measurements on patients are correlated, then resampling all
measurements independently induces an incorrect sampling distribution
(= incorrect variance) on a statistic of these data. One solution, as
you mention, is the block or cluster bootstrap, which preserves the
correlation among repeated observations in resamples. I don't
immediately see why the cluster bootstrap is unsuitable.
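
To make the idea concrete, here is a small base-R sketch of a cluster bootstrap. It is an illustration under stated assumptions, not the quantreg implementation: the data are simulated, the statistic is the plain sample median rather than an rq fit, and the function names cluster_boot_se and naive_boot_se are invented for the example.

```r
set.seed(1)

# Simulated repeated measures: 30 patients, 4 visits each, with a shared
# patient effect (within-patient correlation) and a skewed outcome.
n_patients <- 30
n_visits   <- 4
id <- rep(seq_len(n_patients), each = n_visits)
patient_effect <- rnorm(n_patients)[id]
y <- exp(patient_effect + rnorm(length(id), sd = 0.5))

# Cluster bootstrap: resample whole patients, keeping each patient's
# measurements together, then recompute the statistic.
cluster_boot_se <- function(y, id, stat = median, B = 500) {
  groups <- split(y, id)
  reps <- replicate(B, {
    picked <- sample(length(groups), replace = TRUE)
    stat(unlist(groups[picked], use.names = FALSE))
  })
  sd(reps)
}

# Naive bootstrap: resample individual measurements, ignoring patients.
naive_boot_se <- function(y, stat = median, B = 500) {
  sd(replicate(B, stat(sample(y, replace = TRUE))))
}

se_cluster <- cluster_boot_se(y, id)
se_naive   <- naive_boot_se(y)
c(cluster = se_cluster, naive = se_naive)
```

For a regression setting the same resampling of patient identifiers would apply, with the model refitted to each resampled data set in place of the median.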

Beyond this, I would be concerned about *any* variance estimates that
are blind to correlated observations.

The bootstrap variance estimate may be larger than the asymptotic
variance estimate, but that alone isn't evidence to favor one over the
other.

Also, I can't justify (to myself) why skew would hamper the quality of
bootstrap variance estimates. I wonder how it affects the sandwich
variance estimate...

Best,
Matt

On Mon, 2011-02-28 at 17:50 -0600, James Shaw wrote:
 I am fitting quantile regression models using data collected from a
 sample of 124 patients.  When modeling cross-sectional associations, I
 have noticed that nonparametric bootstrap estimates of the variances
 of parameter estimates are much greater in magnitude than the
 empirical Huber estimates derived using summary.rq's "nid" option.
 The outcome variable is severely skewed, and I am afraid that this may
 be affecting the consistency of the bootstrap variance estimates.  I
 have read that the m out of n bootstrap can be used to overcome this
 problem.  However, this procedure requires both the original sample
 (n) and the subsample (m) sizes to be large.  The version implemented
 in rq.boot does not appear to provide any improvement over the naive
 bootstrap.  Ultimately, I am interested in using median regression to
 model changes in the outcome variable over time.  Summary.rq's robust
 variance estimator is not applicable to repeated-measures data.  I
 question whether the block (cluster) bootstrap variance estimator,
 which can accommodate intraclass correlation, would perform well.  Can
 anyone suggest alternatives for variance estimation in this situation?
 Regards,
 
 Jim
 
 
 James W. Shaw, Ph.D., Pharm.D., M.P.H.
 Assistant Professor
 Department of Pharmacy Administration
 College of Pharmacy
 University of Illinois at Chicago
 833 South Wood Street, M/C 871, Room 266
 Chicago, IL 60612
 Tel.: 312-355-5666
 Fax: 312-996-0868
 Mobile Tel.: 215-852-3045
 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Visualizing Points on a Sphere

2011-02-25 Thread Matt Shotwell
That's interesting. You might also like:
http://en.wikipedia.org/wiki/Von_Mises%E2%80%93Fisher_distribution

I'm not sure how to plot the wireframe sphere, but you can visualize the
points by transforming to Cartesian coordinates like so:

u <- runif(1000,0,1)
v <- runif(1000,0,1)
theta <- 2 * pi * u        # azimuthal angle in [0, 2*pi)
phi   <- acos(2 * v - 1)   # polar angle in [0, pi]
x <- sin(phi) * cos(theta)
y <- sin(phi) * sin(theta)
z <- cos(phi)
library(lattice)
cloud(z ~ x + y)

-Matt

On Fri, 2011-02-25 at 14:21 +0100, Lorenzo Isella wrote:
 Dear All,
 I need to plot some points on the surface of a sphere, but I am not sure 
 about how to proceed to achieve this in R (or if it is suitable for this 
 at all).
 In any case, I am not looking for really fancy visualizations; for 
 instance you can consider the images between formulae 5 and 6 at
 
 http://bit.ly/hOgK9h
 
 Any suggestion is appreciated.
 Cheers
 
 Lorenzo
 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with writing a file in UTF-8

2011-02-21 Thread Matt Shotwell
Thomas, 

I wasn't able to reproduce your finding. The last two characters in my
'out.txt' file were just as expected. But, I'm in a UTF-8 locale. Your
locale affects the encoding of characters on your platform. If you're
not in a UTF-8 locale, then characters are converted from your native
encoding to UTF-8 (when you specify encoding="UTF-8"). In the process of
conversion, it's possible to lose information. You can test whether
there is a loss (or a change rather) when R writes these characters like
so:

# what does űŁ look like in binary (hex)?
raw_before <- charToRaw("űŁ")

# write 'out.txt' as before
out <- file(description="out.txt", open="w", encoding="UTF-8")
write(x="űŁ", file=out)
close(con=out)

# read in the two characters
out <- file(description="out.txt", open="r", encoding="UTF-8")
raw_after <- charToRaw(readChar(con=out, nchars=2))
close(con=out)

# compare the raw representations
identical(raw_before, raw_after)

This test passes on my machine. But, there's also the question of
whether these characters made it onto R-help list unaltered. Also,
please include the result of sessionInfo() in your subsequent messages.
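
One possible way to take the locale out of the picture, offered as a sketch rather than a confirmed fix for the Windows case: build the string from \u escapes so it is stored as UTF-8, write its bytes unchanged through a binary connection, and inspect what landed in the file. The temporary file path is just for the example.

```r
x <- "\u0171\u0141"                     # the two characters from the example
path <- tempfile(fileext = ".txt")

con <- file(path, open = "wb")          # binary mode: the connection does not re-encode
writeLines(enc2utf8(x), con, useBytes = TRUE)
close(con)

# Read the file back as raw bytes to see exactly what was written.
bytes <- readBin(path, what = "raw", n = file.info(path)$size)
bytes
```

In UTF-8 the two characters are the byte pairs c5 b1 and c5 81, so those four bytes followed by a newline indicate that nothing was lost on the way to disk.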

Best,
Matt

> sessionInfo()
R version 2.11.1 (2010-05-31) 
i686-pc-linux-gnu 

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
 [7] LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

On Thu, 2011-02-17 at 13:54 -0800, tpklein wrote:

 Hello,
 
 I am working with a data frame containing character strings with many special
 symbols from various European languages.  When writing such character
 strings to a file using the UTF-8 encoding, some of them are converted in a
 strange way.  See the following example, run in R 2.12.1 on Windows 7:
 
 out <- file( description="out.txt", open="w", encoding="UTF-8")
 write( x="äöüßæűŁ", file=out )
 close( con=out )
 
 The last two symbols in the character string are converted to "uL" while all
 other characters are not changed (which is what I want).  How to explain
 this?  Does it have something to do with my locale?  And is there a way to
 work around this problem? -- Any help would be greatly appreciated.
 
 Thomas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] non-ascii characters in R output

2011-02-18 Thread Matt Shotwell

All,

I'd like to automatically output text from R to HTML. In doing this I've 
run into trouble with non-ascii characters, as my browser (and 
presumably others) does not render such characters correctly. For 
example, the 'fancy' single quotes associated with summary.lm are 
multi-byte characters on my platform. This particular problem is solved 
by options(useFancyQuotes=FALSE). But now I'm concerned about other 
non-ascii characters. As an overkill maybe, my current solution involves 
capture.output and iconv(..., to="ASCII//TRANSLIT"). Are there other 
sources of non-ascii characters? Is there a better or general solution?
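
One alternative to transliteration, sketched here as an assumption rather than as an established recipe, is to replace every non-ascii character with an HTML numeric character reference, which a browser can render whatever encoding it assumes for the page. The function name to_html_entities is made up for the example, and the sketch does not escape the characters &, < and >.

```r
# Replace each non-ASCII character with a decimal HTML character
# reference (e.g. U+2018 becomes &#8216;); ASCII characters pass through.
to_html_entities <- function(x) {
  vapply(enc2utf8(x), function(s) {
    cp    <- utf8ToInt(s)                      # Unicode code points of the string
    chars <- intToUtf8(cp, multiple = TRUE)    # one element per character
    paste0(ifelse(cp < 128L, chars, paste0("&#", cp, ";")), collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

to_html_entities("\u2018fancy\u2019 quotes")
# could be applied to captured output, e.g.
# to_html_entities(capture.output(print(summary(fit))))
```

Unlike transliteration, this keeps the original characters recoverable, at the cost of output that is harder to read as plain text.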


Best,
Matt

> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.12.1

--
Matthew S Shotwell   Assistant Professor   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-ascii characters in R output

2011-02-18 Thread Matt Shotwell
OK, looks like my web browser does render non-ascii characters output by
R when it's given the encoding explicitly. This works for me: <meta
http-equiv="Content-Type" content="text/html; charset=UTF-8"/>. So
that's another solution, but not a general one.

-Matt

On Fri, 2011-02-18 at 12:47 -0600, Matt Shotwell wrote:
 All,
 
 I'd like to automatically output text from R to HTML. In doing this I've 
 run into trouble with non-ascii characters, as my browser (and 
 presumably others) does not render such characters correctly. For 
 example, the 'fancy' single quotes associated with summary.lm are 
 multi-byte characters on my platform. This particular problem is solved 
 by options(useFancyQuotes=FALSE). But now I'm concerned about other 
 non-ascii characters. As an overkill maybe, my current solution involves 
 capture.output and iconv(..., to="ASCII//TRANSLIT"). Are there other 
 sources of non-ascii characters? Is there a better or general solution?
 
 Best,
 Matt
 
 > sessionInfo()
 R version 2.12.1 (2010-12-16)
 Platform: x86_64-pc-linux-gnu (64-bit)
 
 locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
 
 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base
 
 loaded via a namespace (and not attached):
 [1] tools_2.12.1


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-ascii characters in R output

2011-02-18 Thread Matt Shotwell


On Fri, 2011-02-18 at 19:50 -0500, Duncan Murdoch wrote:
 On 18/02/2011 5:58 PM, Matt Shotwell wrote:
  OK, looks like my web browser does render non-ascii characters output by
  R when it's given the encoding explicitly. This works for me: <meta
  http-equiv="Content-Type" content="text/html; charset=UTF-8"/>. So
  that's another solution, but not a general one.
 
 I don't understand your final comment.  What is not general about 
 declaring how the file is encoded?

I meant that declaring UTF-8 is not generally applicable, because R
doesn't always output UTF-8 (right?). For example, locales that use
exotic encodings might output characters that are not interpretable
where UTF-8 is assumed.

The general solution, I suppose, is to automatically generate the
<meta /> line with the encoding used by R.
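
A sketch of how that might look, assuming utils::localeToCharset() gives a usable guess at the current encoding. The function name meta_charset_tag and the UTF-8 fallback are choices made for the example, and the charset names R reports may not always match the names a browser expects.

```r
# Build the <meta> line from the charset R believes the locale uses.
meta_charset_tag <- function(charset = utils::localeToCharset()[1]) {
  if (is.na(charset)) charset <- "UTF-8"   # assumed fallback when the charset is unknown
  sprintf('<meta http-equiv="Content-Type" content="text/html; charset=%s"/>',
          charset)
}

meta_charset_tag()               # tag for the current locale
meta_charset_tag("ISO-8859-1")   # tag for an explicitly named charset
```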

Matt

 
 Duncan Murdoch
 
 
  -Matt
 
  On Fri, 2011-02-18 at 12:47 -0600, Matt Shotwell wrote:
  All,
 
  I'd like to automatically output text from R to HTML. In doing this I've
  run into trouble with non-ascii characters, as my browser (and
  presumably others) does not render such characters correctly. For
  example, the 'fancy' single quotes associated with summary.lm are
  multi-byte characters on my platform. This particular problem is solved
  by options(useFancyQuotes=FALSE). But now I'm concerned about other
  non-ascii characters. As an overkill maybe, my current solution involves
  capture.output and iconv(..., to="ASCII//TRANSLIT"). Are there other
  sources of non-ascii characters? Is there a better or general solution?
 
  Best,
  Matt
 
  > sessionInfo()
  R version 2.12.1 (2010-12-16)
  Platform: x86_64-pc-linux-gnu (64-bit)
 
  locale:
   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
   [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
   [9] LC_ADDRESS=C               LC_TELEPHONE=C
  [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
 
  attached base packages:
  [1] stats     graphics  grDevices utils     datasets  methods   base
 
  loaded via a namespace (and not attached):
  [1] tools_2.12.1
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Revolution Analytics reading SAS datasets

2011-02-10 Thread Matt Shotwell
On Thu, 2011-02-10 at 10:44 -0800, David Smith wrote:
 The SAS import/export feature of Revolution R Enterprise 4.2 isn't
 open-source, so we can't release it in open-source Revolution R
 Community, or to CRAN as we do with the ParallelR packages (foreach,
 doMC, etc.).

Judging by the language of Dr. Nie's comments on the page linked below,
it seems unlikely this feature is the result of a licensing agreement
with SAS. Is that correct?

Matt

 It is, though, available for download free of charge to members of the
 academic community (as is all of Revolution Analytics' software) from
 http://www.revolutionanalytics.com/downloads/
 
 # David Smith
 
 On Wed, Feb 9, 2011 at 5:46 PM, Daniel Nordlund djnordl...@frontier.com 
 wrote:
  Has anyone heard whether Revolution Analytics is going to release this 
  capability to the R community?
 
  http://www.businesswire.com/news/home/20110201005852/en/Revolution-Analytics-Unlocks-SAS-Data
 
  Dan
 
  Daniel Nordlund
  Bothell, WA USA
 
 --
 David M Smith da...@revolutionanalytics.com
 VP of Marketing, Revolution Analytics  http://blog.revolutionanalytics.com
 Tel: +1 (650) 646-9523 (Palo Alto, CA, USA)
 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] clustering with finite mixture model

2011-02-02 Thread Matt Shotwell
There are quite a few packages that work with finite mixtures, as 
evidenced by the descriptions here:


http://cran.r-project.org/web/packages/index.html

These might be useful:

http://cran.r-project.org/web/packages/flexmix/index.html
http://cran.r-project.org/web/packages/mclust/index.html

-Matt

On 02/02/2011 04:28 AM, karuna m wrote:

Dear R-help,
I am doing clustering via finite mixture model. Please suggest some packages in
R to find clusters via finite mixture model with continuous variables. And
also I wish to verify the distributional properties of the mixture distributions
by fitting the model with lognormal, gamma, exponential, etc.
Thanks in advance,
  warm regards,Ms.Karunambigai M
PhD Scholar
Dept. of Biostatistics
NIMHANS
Bangalore
India





--
Matthew S Shotwell   Assistant Professor   School of Medicine
 Department of Biostatistics   Vanderbilt University



Re: [R] User input in R program

2011-01-21 Thread Matt Shotwell
Martyn Plummer's 'coda' package has some nice interactive menus. The
package appears to be written entirely in R. You could start with the
codamenu() function in the package source:

http://cran.r-project.org/web/packages/coda/index.html
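
As a rough sketch of answer validation in base R (this is not code from
coda): the helper ask_integer() below is hypothetical, as are its argument
names. It re-prompts until the reply is a whole number in the requested
range. Only readline() comes from base R; the 'read' argument exists so the
function can be tried without a live session. For simple selection lists in
an interactive session, utils::menu() may also be worth a look.

```r
# Hypothetical helper: ask for a whole number between lower and upper,
# re-prompting up to max_tries times before giving up.
ask_integer <- function(prompt, lower, upper, read = readline, max_tries = 3) {
  for (i in seq_len(max_tries)) {
    raw <- trimws(read(prompt))
    # Accept only an optional minus sign followed by digits.
    if (grepl("^-?[0-9]+$", raw)) {
      answer <- as.integer(raw)
      if (answer >= lower && answer <= upper) return(answer)
    }
    message("Please enter a whole number between ", lower, " and ", upper, ".")
  }
  stop("no valid answer after ", max_tries, " attempts")
}

# Interactive use would look like:
#   n <- ask_integer("How many chains? ", 1, 10)
```

Passing the reader as an argument is only a design convenience: it keeps
the validation logic separate from the console.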

-Matt

On Fri, 2011-01-21 at 14:26 +0200, christiaan pauw wrote:
 HI Everybody
 
 Does anyone know of documentation about different ways of obtaining user
 input in R? I have used readline(), but I wondered if there are more
 sophisticated packages that do things like validate answers or generate
 selection lists.
 
 Best regards
 Christaan
 
   [[alternative HTML version deleted]]
 


Re: [R] Encoding problem - I fails to read Hebrew text from online

2010-12-09 Thread Matt Shotwell
Tal, 

It looks like the data you received contains HTML numeric character
references. That is, '&#x5E9;' is just an ASCII representation, in HTML,
of the character with hexadecimal code 5E9. The text is not in a special
encoding.

The trick is to replace each HTML character reference with the character
it represents, that is, to decode it. I don't know of any R function that
does this, but there are web services, for example:
http://www.hashemian.com/tools/html-url-encode-decode.php

I decoded your file using this service and posted it on my website. You
can see the difference by running:

readLines("http://biostatmatt.com/temp/Hebrew-original", warn=FALSE)

readLines("http://biostatmatt.com/temp/Hebrew-decoded", warn=FALSE)

The second should display the Hebrew characters correctly (it does in my
terminal). The next thing to think about is how to automate this in R
without using the web service... We may need to write an HTMLDecode
function if there isn't one already.
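
A minimal sketch of what such a function might look like in base R. The
name html_decode_numeric() is hypothetical, and the sketch handles only
numeric references such as '&#x5E9;' (hex) or '&#1513;' (decimal), not
named entities like '&amp;'. It relies on strtoi() to parse the code point
and intToUtf8() to produce the character.

```r
# Hypothetical helper (not part of base R): decode numeric HTML character
# references into the UTF-8 characters they stand for.
html_decode_numeric <- function(x) {
  pattern <- "&#([xX][0-9A-Fa-f]+|[0-9]+);"
  decode_one <- function(s) {
    m <- gregexpr(pattern, s)
    refs <- regmatches(s, m)[[1]]
    if (length(refs) == 0L) return(s)       # nothing to decode
    body <- gsub("^&#|;$", "", refs)        # strip the "&#" and ";" wrapper
    is_hex <- grepl("^[xX]", body)
    code <- ifelse(is_hex,
                   strtoi(sub("^[xX]", "", body), 16L),
                   strtoi(body, 10L))
    # Replace each reference with the character for its code point.
    regmatches(s, m) <- list(vapply(code, intToUtf8, character(1)))
    s
  }
  vapply(x, decode_one, character(1), USE.NAMES = FALSE)
}

decoded <- html_decode_numeric("data=\"&#x5E9;&#x5DC;&#x5D5;&#x5DD;\"")
```

Whether the decoded text then displays correctly still depends on the
terminal and locale.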

By the way, what's the Hebrew text in English?

Best,
Matt


On Thu, 2010-12-09 at 12:21 -0500, Tal Galili wrote:
 I am bumping this question in the hopes that someone might be able to
 advise.
 This Hebrew and R business is not as smooth as I had hoped...
 
 Thanks,
 Tal
 
 Older message:
 
 On Tue, Dec 7, 2010 at 2:30 PM, Tal Galili tal.gal...@gmail.com wrote:
 
  Hello all,
 
  # I am trying to read the text in this URL:
  u <-
  "http://google.com/complete/search?output=toolbar&q=%d7%a9%d7%9c%d7%95%d7%9d"
  # By using this command:
  readLines(u)
 
  And no matter what variation I tried, I keep getting this output:
  [1] "<?xml version=\"1.0\"?><toplevel><CompleteSuggestion><suggestion
  data=\"&#x5E9;&#x5DC;&#x5D5;&#x5DD;\"/>   (etc...)
 
 
 
  Instead of this output:
  <?xml version="1.0"?><toplevel><CompleteSuggestion><suggestion data="שלום"
  /><num_queries
  int="1680"/></CompleteSuggestion><CompleteSuggestion><suggestion
  data="שלום חנוך"/><num_queries int="232000"/></CompleteSuggestion>
  <CompleteSuggestion><suggestion data="שלום עליכם"/>
  (etc)
 
 
 
  I tried:
readLines(u, encoding="latin1")
readLines(u, encoding="UTF-8")
  And also changing Sys.setlocale:
Sys.setlocale("LC_ALL", "Hebrew") # must be done for Hebrew to work.
Sys.setlocale("LC_ALL", "English") # must be done for Hebrew to work.
 
  Are there any more options I could try to get this text properly encoded?
 
  Thanks!
  Tal
 
 
 
  Contact
  Details:---
  Contact me: tal.gal...@gmail.com |  972-52-7275845
  Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
  www.r-statistics.com (English)
 
  --
 
 
 
 
   [[alternative HTML version deleted]]
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Encoding problem - I fails to read Hebrew text from online

2010-12-09 Thread Matt Shotwell
Tal, 

OK, let me clarify my understanding. The original and decoded files are
both text, encoded in UTF-8. In the original file, there are HTML
`entities' that represent Hebrew characters. In the decoded file, the
entities have been converted to the UTF-8 characters themselves. The
question is how to convert these entities within R. This is not the same
as converting between character encodings; otherwise iconv() might offer
a solution.

I'll have a look around to find a solution, and I hope others will too.
My first idea is to check RCurl, XML, and the related utils::URLdecode.
If there really is no existing solution, I think it might be worthwhile
to look at how PHP and Python do it (and maybe borrow some code :) ).

-Matt


On Thu, 2010-12-09 at 14:27 -0500, Tal Galili wrote:
 Hi Matt,
 Thanks for having a look at this.
 I just spent some time looking around and couldn't find any R function
 to decode  decimal HTML code.
 
 
 Do you (or someone else on the list) know how to program this sort of
 thing? (Is there a formula for the translation?)
 
 
 
 
 p.s:
 For it to work on my end I added the encoding parameter:
 readLines("http://biostatmatt.com/temp/Hebrew-decoded", warn=FALSE,
 encoding="UTF-8")
 
 
 p.p.s: The Hebrew word I used means peace 
 
 
 Cheers,
 Tal
 
 
 Contact
 Details:---
 Contact me: tal.gal...@gmail.com |  972-52-7275845
 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew)
 | www.r-statistics.com (English)
 --
 
 
 
 

Re: [R] statistical test for comparison of two classifications (nominal)

2010-11-22 Thread Matt Shotwell
Martin,

Pardon the delayed reply.

Bootstrap methods have been around for some time (since the late
seventies), but their popularity seems to have exploded along with
computing technology. You should be able to find more information in most
modern books on statistical inference, but here is a brief summary:

The bootstrap is a method often used to establish an empirical null
distribution for a test statistic when traditional (analytical) methods
fail. The bootstrap works by imposing a null hypothesis on the observed
data, followed by re-sampling with replacement. The test statistic is
computed at each re-sample and used to build up an empirical null
distribution. The idea is to impose the null hypothesis while preserving
variability in the observed data, and thus the test statistic.

For example, suppose we observe some continuous scalar data and
hypothesize that the sample was observed from a population with mean
zero. We can impose this hypothesis by subtracting the sample mean from
each observation. Re-samples from these transformed data are treated as
having been observed under the null hypothesis.
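
The mean-zero example can be sketched in a few lines of R. The function
name boot_mean_test() and the simulated data are assumptions made purely
for illustration; the two-sided p-value used here (the proportion of
re-sampled means at least as extreme as the observed mean) is one common
choice, not the only one.

```r
# Bootstrap test of H0: population mean = 0 (a sketch on simulated data).
boot_mean_test <- function(x, n_boot = 2000) {
  observed <- mean(x)
  x_null <- x - observed   # impose H0: the shifted data have mean zero
  # Re-sample with replacement from the shifted data and recompute the
  # statistic each time to build the empirical null distribution.
  boot_means <- replicate(n_boot, mean(sample(x_null, replace = TRUE)))
  p_value <- mean(abs(boot_means) >= abs(observed))
  list(statistic = observed, p.value = p_value)
}

set.seed(1)
x <- rnorm(30, mean = 1)   # made-up sample whose true mean is not zero
res <- boot_mean_test(x)
res$p.value
```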

In the case of classification and partitioning, the difficulty is
formulating a meaningful null hypothesis about the collection of
classifications, and imposing the null hypothesis in a bootstrap
sampling scheme.

-Matt

On Wed, 2010-11-17 at 10:01 -0500, Martin Tomko wrote: 
 Thanks Mat,
 I have in the meantime identified the Rand index, but not the others. I 
 will also have a look at profdpm, that did not pop-up in my searches.
 Indeed, the interpretation is going to be critical... Could you please 
 elaborate on what you mean by the bootstrap process?
 
 Thanks a lot for your help,
 Martin
 
 On 11/17/2010 3:50 PM, Matt Shotwell wrote:
  There are several statistics used to compare nominal classifications, or
  _partitions_ of a data set. A partition isn't quite the same in this
  context because partitioned data are not restricted to a fixed number of
  classes. However, the statistics used to compare partitions should also
  work for these 'restricted' partitions. See the Rand index, Fowlkes and
  Mallows index, Wallace indices, and the Jaccard index. The profdpm
  package implements a function (?profdpm::pci) that computes these
  indices for two factors representing partitions of the same data.
 
  The difficult part is drawing statistical inference about these indices.
  It's difficult to formulate a null hypothesis, and even more difficult
  to determine a null distribution for a partition comparison index. A
  bootstrap test might work, but you will probably have to implement this
  yourself.
 
  -Matt
 
  On Wed, 2010-11-17 at 08:33 -0500, Martin Tomko wrote:
 
  Dear all,
  I am having a hard time figuring out a suitable test for the match
  between two nominal classifications of the same set of data.
  I have used hierarchical clustering with multiple methods (ward,
  k-means,...) to classify my data into a set number of classes, and I
  would like to compare the resulting automated classification with the
  actual - objective benchmark one.
  So in principle I have a data frame with n columns of nominal
  classifications, and I want to do a mutual comparison and test for
  significance in difference in classification between pairs of columns.
 
  I just need to identify a suitable test, but I fail. I am currently
  exploring the possibility of using Cohen's Kappa, but I am open to other
  suggestions. In particular, the fact that kappa seems to be mostly used
  on fallible human annotators seems to bring in limitations that do not
  apply to my automatic classification.
  Any help will be appreciated, especially if also followed by a pointer
  to an R package that implements it.
 
  Thanks
  Martin
 
   
 
 
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Fatal Error R

2010-11-17 Thread Matt Shotwell
Please see below.

On Wed, 2010-11-17 at 04:41 -0500, Ted Harding wrote:
 On 17-Nov-10 00:02:39, José Fernando Zea Castro wrote:
  Hello.
  First, I'm thankful about your wonderful project.
  
  However, I have serious worries about the reliability of R.
  I found the following bug, which I consider important because in
  my job we constantly work with data names like the ones below.
  Please see below:
  
  > b=data.frame(matrix(1:9,ncol=3))
  > names(b)=c("q99","r88","s77")
  
  > b
    q99 r88 s77
  1   1   4   7
  2   2   5   8
  3   3   6   9
  > b$q9
  [1] 1 2 3
  
  
  Please note that the variable q9 does not exist in the data frame,
  but you can see that R shows q9 (as q99).
  
  Thanks in advance
  
  Cordially
  José Fernando Zea Castro
  Statistician Universidad Nacional Colombiana
 
 What you see here is a case of partial matching: You ask for
 'b$q9', and R sees that 'q9' matches the beginning of 'q99'
 and nothing else. Therefore it responds with the value of 'b$q99',
 since there is no ambiguity.
 
 You would have got the same result if you had asked for
 
   b$q
 
 since there is no component name in b which matches 'q' except 'q99'.
 
 If there had been two components which matched 'q9', say both
 b$q99 and b$q98, then you would have got a NULL result, since
 there is not a unique match.
 
 However, if you also have b$q9 and b$q99 in b, then R would find that
 b$q9 was an *exact* (not partial) match, and would return that one.
 
 Normally, this should not cause problems. However, if you have
 written code which must take special action if a name is not
 present in a list, then there could be problems.
 
 For example, if b might (depending on what has happened) contain
 b$q9 only, or b$q99 only, or *both* b$q9 and b$q99, and you want
 to execute special actions if a name is not present in b, then
 in the case where b contained only b$q99 and you asked for b$q9,
 you would get the wrong result because of partial matching.
 
 This is one of those cases, in my opinion, where R's documentation
 drops you into a flat landscape, in the middle of nowhere, in a
 thick mist.

This does happen sometimes, but partial matching in indexing operations
is documented in the R Language Definition manual section 3.4.1, and
well documented in the help page (?Extract or ?`$` or ?`[`).

  What is needed is to be able to set an option such
 that R will *only* respond with exact matches, e.g. something
 like options(partial.match=FALSE). I have spent about 20 minutes
 trying to locate the possible existence of such an option, or a
 similar way of suppressing partial matching. No success!

Indexing a list using [[ and a string enforces exact matching (by
default). Continuing with the example above:

> b[["q99"]]
[1] 1 2 3

> b[["q"]]
NULL
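
For code that must take special action when a name is absent, here is a
short sketch of the exact-matching alternatives, using the same example
data frame. Nothing here goes beyond base R; the point is only to put the
behaviours side by side.

```r
b <- data.frame(matrix(1:9, ncol = 3))
names(b) <- c("q99", "r88", "s77")

b$q9                      # '$' partially matches and returns column q99
b[["q9"]]                 # '[[' matches exactly by default: NULL
b[["q9", exact = FALSE]]  # partial matching only when asked for
"q9" %in% names(b)        # explicit test for presence: FALSE

# Turning on the warning mentioned in Ted's message makes accidental
# partial matches through '$' visible without changing what is returned.
op <- options(warnPartialMatchDollar = TRUE)
options(op)               # restore the previous setting
```

Testing with %in% names(b) before extracting is probably the most robust
choice, since it does not depend on any matching rule.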

 The closest I could get was the set of options, settable using
 options(... = ...):
 
  'warnPartialMatchArgs': logical.  If true, warns if partial
   matching is used in argument matching.
 
  'warnPartialMatchAttr': logical.  If true, warns if partial
   matching is used in extracting attributes via 'attr'.
 
  'warnPartialMatchDollar': logical.  If true, warns if partial
   matching is used for extraction by '$'.
 
 which concerns only the issue of warnings in such cases, and has
 nothing to do with suppressing partial matching.
 
 Maybe others know better!
 
 Best wishes,
 Ted.
 
 
 E-Mail: (Ted Harding) ted.hard...@wlandres.net
 Fax-to-email: +44 (0)870 094 0861
 Date: 17-Nov-10   Time: 09:41:03
 -- XFMail --
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] statistical test for comparison of two classifications (nominal)

2010-11-17 Thread Matt Shotwell
There are several statistics used to compare nominal classifications, or
_partitions_ of a data set. A partition isn't quite the same in this
context because partitioned data are not restricted to a fixed number of
classes. However, the statistics used to compare partitions should also
work for these 'restricted' partitions. See the Rand index, Fowlkes and
Mallows index, Wallace indices, and the Jaccard index. The profdpm
package implements a function (?profdpm::pci) that computes these
indices for two factors representing partitions of the same data.
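
To make the simplest of these indices concrete, here is a base R sketch of
the (unadjusted) Rand index: the proportion of pairs of observations on
which the two partitions agree, either by placing the pair together or by
separating it. The function rand_index() is written for illustration only
and is not the profdpm implementation.

```r
# Rand index for two labelings of the same observations (illustrative).
rand_index <- function(a, b) {
  stopifnot(length(a) == length(b))
  a <- as.character(a)
  b <- as.character(b)
  same_a <- outer(a, a, "==")   # TRUE where a pair shares a class in a
  same_b <- outer(b, b, "==")   # TRUE where a pair shares a class in b
  pairs <- upper.tri(same_a)    # count each unordered pair once
  mean(same_a[pairs] == same_b[pairs])
}

a <- c("x", "x", "y", "y")
b <- c("p", "p", "q", "r")
rand_index(a, b)   # 5 of the 6 pairs agree
```

The index depends only on which observations are grouped together, so
relabeling the classes does not change it. The outer() calls need memory
on the order of n^2, so this sketch suits only moderate sample sizes.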

The difficult part is drawing statistical inference about these indices.
It's difficult to formulate a null hypothesis, and even more difficult
to determine a null distribution for a partition comparison index. A
bootstrap test might work, but you will probably have to implement this
yourself.

-Matt

On Wed, 2010-11-17 at 08:33 -0500, Martin Tomko wrote:
 Dear all,
 I am having a hard time figuring out a suitable test for the match
 between two nominal classifications of the same set of data.
 I have used hierarchical clustering with multiple methods (ward,
 k-means,...) to classify my data into a set number of classes, and I
 would like to compare the resulting automated classification with the
 actual - objective benchmark one.
 So in principle I have a data frame with n columns of nominal
 classifications, and I want to do a mutual comparison and test for
 significance in difference in classification between pairs of columns.
 
 I just need to identify a suitable test, but I fail. I am currently
 exploring the possibility of using Cohen's Kappa, but I am open to other
 suggestions. In particular, the fact that kappa seems to be mostly used
 on fallible human annotators seems to bring in limitations that do not
 apply to my automatic classification.
 Any help will be appreciated, especially if also followed by a pointer
 to an R package that implements it.
 
 Thanks
 Martin
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] log-transformed linear regression

2010-11-11 Thread Matt Shotwell
Servet,

These data do look linear in log space. Fortunately, the model

log(y) = a + b * log(x)

does have intercept zero in linear space (taking log to mean log10, as in
your fit). To see this, consider

log(y) = a + b * log(x)
 y = 10^(a + b * log(x))
 y = 10^a * 10^(b * log(x))
 y = 10^a * 10^(log(x^b))
 y = 10^a * x^b

Hence, y = 0 when x = 0 (provided b > 0). The code below estimates a and b. 

Of course,

y = 10^a * x^b 

is not a line, so we can't directly compare slopes. However, in the
region of your data, the estimated mean is _nearly_ linear. In fact, you
could consider looking at a linear approximation, say at the median of
your x values. The median of your x values is 0.958. For simplicity,
let's just say it's 1.0. The linear approximation (first order Taylor
expansion) of 

y = 10^a * x^b

at x = 1 is

y = 10^a + 10^a * b * (x - 1)
y = 10^a * (1 - b) + 10^a * b * x

So, the slope of the linear approximation is 10^a * b, and the intercept
is 10^a * (1 - b). Taking a and b from the analysis below, the
approximate intercept is -0.00442, and slope 0.22650. You could argue
that these values are consistent with the literature, but that the log
linear model is more appropriate for these data. You could even
construct a bootstrap confidence interval for the approximate slope.
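
A sketch of these steps on simulated data follows. The data file from the
original post is not reproduced here, so x and y below are made up, and
the numbers will not match those quoted above. The variable names are
arbitrary, and the case-resampling bootstrap is just one reasonable way to
get an interval.

```r
set.seed(42)
# Simulated stand-in data: a power law with multiplicative noise.
x <- exp(runif(50, log(0.1), log(100)))
y <- 0.2 * x^1.05 * 10^rnorm(50, sd = 0.05)

# Fit log10(y) = a + b * log10(x).
fit <- lm(log10(y) ~ log10(x))
a <- unname(coef(fit)[1])
b <- unname(coef(fit)[2])

# First-order Taylor expansion of y = 10^a * x^b about x0.
# At x0 = 1 this gives slope 10^a * b and intercept 10^a * (1 - b).
x0 <- 1
slope <- 10^a * b * x0^(b - 1)
intercept <- 10^a * x0^b * (1 - b)

# Percentile bootstrap interval for the approximate slope at x0 = 1,
# re-sampling (x, y) pairs with replacement.
boot_slope <- replicate(1000, {
  i <- sample(length(x), replace = TRUE)
  cf <- coef(lm(log10(y[i]) ~ log10(x[i])))
  unname(10^cf[1] * cf[2])
})
ci <- quantile(boot_slope, c(0.025, 0.975))
```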

-Matt

On Wed, 2010-11-10 at 19:27 -0500, servet cizmeli wrote:
 Dear List,
 
 I would like to take another chance and see if someone has
 anything to say about my last post...
 
 bump
 
 servet
 
 
 On 11/10/2010 01:11 PM, servet cizmeli wrote:
  Hello,
 
  I have a basic question. Sorry if it is so evident
 
  I have the following data file :
  http://ekumen.homelinux.net/mydata.txt
 
  I need to model Y~X-1 (simple linear regression through the origin) with
  these data :
 
  load(file="mydata.txt")
  X=k[,1]
  Y=k[,2]
 
  aa=lm(Y~X-1)
  dev.new()
  plot(X,Y,log="xy")
  abline(aa,untf=T)
  abline(b=0.0235, a=0,col="red",untf=T)
  abline(b=0.031, a=0,col="green",untf=T)
 
  Other people did the same kind of analysis with their data and found the
  regression coefficients of 0.0235 (red line) and 0.031 (green line).
 
  Regression with my own data, though, yields a slope of 0.0458 (black
  line), which is too high. Clearly my regression is too strongly influenced
  by the single point with high values (X>100). I would not like to discard
  this point, though, because I know that the measurement is correct. I
  just would like to give it less weight...
 
  When I log-transform X and Y data, I obtain :
 
  dev.new()
  plot(log10(X),log10(Y))
  abline(v=0,h=0,col="cyan")
  bb=lm(log10(Y)~log10(X))
  abline(bb,col="blue")
  bb
 
  I am happy with this regression. Now the slope is in the log-log domain.
  I have to convert it back so that I can obtain a number comparable with
  the literature (0.0235 and 0.031). How do I do that? I can't force the
  second regression through the origin, as the log-transformed data do
  not go through the origin anymore.
 
  At first it seemed like an easy problem, but I am at a loss :o((
  Thanks a lot for your kind help
  servet
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] data acquisition with R?

2010-11-05 Thread Matt Shotwell
R implements (almost) all IO through its 'connections'. Unfortunately,
there is no API (public or private) for adding connections, and
therefore no packages that implement connections. You will find more
discussion of connections and hardware (serial, USB) interface in the
R-devel list archives.

There are two source code patches that implement two types of
connections that work on POSIX compliant OSs, including GNU/Linux, BSD,
and Mac OS X. The first is a 'serial' connection, a high level
connection to a serial port (http://biostatmatt.com/archives/112). The
second is a 'tty' connection, a lower level connection to the POSIX
termios interface (http://biostatmatt.com/archives/564). Both of these
solutions require that you apply the patch and recompile R. I can help
with this, if you like.

AFAIK, these are the only attempts at interfacing R with POSIX TTYs
directly.

-Matt



On Fri, 2010-11-05 at 09:48 -0400, B.-MarkusS wrote:
 Hello,
 
 I spent quite some time now searching for any hint that R can also be 
 used to address the interfaces of a computer (i.e. RS232 or USB) to 
 acquire data from measurement devices (like with the - I think it is the 
 - devices or serial toolbox of Matlab).
 
 Is there any package available or a project going on that you know of? I 
 would so much like to have never to work with Matlab again. The only 
 thing I am really missing in R so far is the possibility to connect to 
 my measurement devices (for instance a precision balance) and record 
 data directly with R.
 
 Please let me know whether I am just missing something or if you have 
 some information about something like that.
 
 Thank you very much!
 Mango

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Lattice plots for images

2010-11-03 Thread Matt Shotwell
Have you tried using the 'mai' argument to par()? Something like:
par(mfrow=c(3,3), mai=c(0,0,0,0))

I've used this in conjunction with image() to plot raster data in a
tight grid. http://biostatmatt.com/archives/727
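
A self-contained sketch of the idea, using made-up surfaces in place of
kernel-smoothed intensities. The helper plot_grid() is hypothetical and
exists only to keep the par() bookkeeping in one place.

```r
# plot_grid() is a hypothetical helper: draw each matrix in `mats` as one
# panel of an nrow x ncol grid with all figure margins set to zero.
plot_grid <- function(mats, nrow, ncol) {
  op <- par(mfrow = c(nrow, ncol), mai = c(0, 0, 0, 0))
  on.exit(par(op))            # restore the previous settings on exit
  for (m in mats) image(m, axes = FALSE)
  invisible(length(mats))
}

# Nine made-up 20 x 20 surfaces standing in for smoothed intensities.
mats <- lapply(1:9, function(i) {
  outer(1:20, 1:20, function(x, y) sin(x * i / 10) * cos(y / 5))
})
plot_grid(mats, 3, 3)
```

With mai set to zero there is no room for axes or titles, so any labels
would need to be drawn inside the panels, for example with text().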

-Matt

On Wed, 2010-11-03 at 11:13 -0400, Neba Funwi-Gabga wrote:
 Hello UseRs,
 I need help on how to plot several raster images (such as those obtained
 from a kernel-smoothed intensity function) in a layout
 such as that obtained from the lattice package. I would like to obtain
 something such as obtained from using the levelplot or xyplot
 in lattice. I currently use:
 
 par(mfrow=c(3,3))
 
 to set the workspace, but the resulting plots leave a lot of blank space
 between individual plots. If I can get it to the lattice format,
 I think it will save me some white space.
 
 Any help is greatly appreciated.
 
 Neba.
 
   [[alternative HTML version deleted]]
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] ForestPlot or similar

2010-10-30 Thread Matt Shotwell
Here is a small function for forest plots in R, with an example:

http://biostatmatt.com/wiki/r-credplot
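
For reference, a bare-bones forest-style plot can also be put together
from base graphics alone. The sketch below is not the function at the link
above; forest_sketch() and its arguments are made up for illustration, and
it uses the three vectors from the question.

```r
# Hypothetical helper: point estimates with horizontal interval bars,
# one row per estimate, first estimate at the top.
forest_sketch <- function(est, lower, upper, labels = seq_along(est)) {
  n <- length(est)
  stopifnot(length(lower) == n, length(upper) == n)
  ypos <- rev(seq_len(n))
  plot(est, ypos, xlim = range(lower, upper), ylim = c(0.5, n + 0.5),
       pch = 15, yaxt = "n", xlab = "Estimate", ylab = "")
  segments(lower, ypos, upper, ypos)        # interval bars
  axis(2, at = ypos, labels = labels, las = 1)
  invisible(ypos)
}

forest_sketch(est = c(2, 4, 6, 8), lower = c(1, 2, 3, 4),
              upper = c(4, 8, 12, 16))
```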

-Matt

On Sat, 2010-10-30 at 11:40 -0400, Mestat wrote:
 Here is one example:
 I have three vectors (mean,lower interval, upper interval)
 mean<-c(2,4,6,8)
 l<-c(1,2,3,4)
 u<-c(4,8,12,16)
 How would I plot that if I want to use the FORESTPLOT function? I don't need
 to use the TABLETEXT option.
 I am working on something like this:
 tabletext<-c(NA,NA,NA,NA,NA)
 mean<-c(NA,2,4,6,8)
 l<-c(NA,1,2,3,4)
 u<-c(NA,4,8,12,16)
 forestplot(tabletext,mean,l,u,zero=0)
 But I am having a problem with the length of the dimension...
 Thanks in advance,
 Marcio
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Regular expression to find value between brackets

2010-10-13 Thread Matt Shotwell
Here's a shorter (but more cryptic) one:

> gsub("^([^\\(]+)(\\((.+)\\))?", "\\2", tests)
[1] ""        "(%)"     "(%)"     "(mg/ml)"

> gsub("^([^\\(]+)(\\((.+)\\))?", "\\3", tests)
[1] ""      "%"     "%"     "mg/ml"

-Matt


On Wed, 2010-10-13 at 14:34 -0400, Henrique Dallazuanna wrote:
 Try this:
 
 replace(gsub(".*\\((.*)\\)$", "\\1", tests), !grepl("\\(.*\\)", tests), "")
 
 
 On Wed, Oct 13, 2010 at 3:16 PM, Bart Joosen bartjoo...@hotmail.com wrote:
 
 
  Hi,
 
  this should be an easy one, but I can't figure it out.
  I have a vector of tests, with their units between brackets (if they have
  units).
  eg tests <- c("pH", "Assay (%)", "Impurity A(%)", "content (mg/ml)")
 
  Now I would like to have a function where I use a test as input, and which
  returns the units
  like:
  f <- function (x) sub("\\)", "", sub("\\(", "",sub("[[:alnum:]]+","",x)))
  this should give "", "%", "%", "mg/ml", but it doesn't do the job quite
  well.
 
  After searching in the manual, and on the help lists, I cant find the
  answer.
 
  anyone?
 
  Bart
  --
  View this message in context:
  http://r.789695.n4.nabble.com/Regular-expression-to-find-value-between-brackets-tp2994166p2994166.html
  Sent from the R help mailing list archive at Nabble.com.
 
 
 
 
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] puzzle with integrate over infinite range

2010-09-21 Thread Matt Shotwell
You could try pnorm also:

shiftedGaussR <- function(x0 = 500) {
    sd  <- 100/sqrt(2)
    int <- pnorm(0, x0, sd, lower.tail=FALSE, log.p=TRUE)
    exp(int + log(sd) + 0.5 * log(2*pi))
}

> shiftedGaussR(500)
[1] 177.2454
> shiftedGauss(500)
[1] 177.2454

-Matt


On Tue, 2010-09-21 at 09:38 -0400, Ravi Varadhan wrote:
 There is nothing mysterious.  You need to increase the accuracy of
 quadrature by decreasing the error tolerance:
 
 # I scaled your function to a proper Gaussian density
 shiftedGauss <- function(x0=500){
  integrate(function(x) 1/sqrt(2*pi * 100^2) * exp(-(x-x0)^2/(2*100^2)), 0,
 Inf, rel.tol=1.e-07)$value }
 
 shift <- seq(500, 800, by=10)
 plot(shift, sapply(shift, shiftedGauss))
 
 
 Hope this helps,
 Ravi.
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of baptiste auguie
 Sent: Tuesday, September 21, 2010 8:38 AM
 To: r-help
 Subject: [R] puzzle with integrate over infinite range
 
 Dear list,
 
 I'm calculating the integral of a Gaussian function from 0 to
 infinity. I understand from ?integrate that it's usually better to
 specify Inf explicitly as a limit rather than an arbitrary large
 number, as in this case integrate() performs a trick to do the
 integration better.
 
 However, I do not understand the following, if I shift the Gauss
 function by some amount the integral should not be affected,
 
 shiftedGauss <- function(x0=500){
  integrate(function(x) exp(-(x-x0)^2/100^2), 0, Inf)$value
 }
 
 shift <- seq(500, 800, by=10)
 plot(shift, sapply(shift, shiftedGauss))
 
 Suddenly, just after 700, the value of the integral drops to nearly 0
 when it should be constant all the way. Any clue as to what's going on
 here? I guess it's suddenly missing the important part of the range
 where the integrand is non-zero, but how could this be overcome?
 
 Regards,
 
 baptiste
 
 
 sessionInfo()
 R version 2.11.1 (2010-05-31)
 x86_64-apple-darwin9.8.0
 
 locale:
 [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base
 
 other attached packages:
 [1] inline_0.3.5RcppArmadillo_0.2.6 Rcpp_0.8.6
 statmod_1.4.6
 
 loaded via a namespace (and not attached):
 [1] tools_2.11.1
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Is there a bisection method in R?

2010-09-20 Thread Matt Shotwell
I was just reading about the merge sort algorithm last night (BTW, here
is a fun link http://www.youtube.com/watch?v=t8g-iYGHpEA). There are
some interesting similarities in this context. Here's a recursive method
for bisection:

bisectMatt <- function(fn, lo, hi, tol = 1e-7, ...) {

    flo <- fn(lo, ...)
    fhi <- fn(hi, ...)

    if(flo * fhi > 0)
        stop("root is not bracketed by lo and hi")

    mid <- (lo + hi) / 2
    fmid <- fn(mid, ...)
    if(abs(fmid) <= tol || abs(hi-lo) <= tol)
        return(mid)


    if(fmid * fhi > 0)
        return(bisectMatt(fn, lo, mid, tol, ...))

    return(bisectMatt(fn, mid, hi, tol, ...))
}

# Adapted from Ravi's original
bisectRavi <- function(fn, lo, hi, tol = 1e-7, ...) {

    flo <- fn(lo, ...)
    fhi <- fn(hi, ...)

    if (flo * fhi > 0)
        stop("root is not bracketed by lo and hi")

    chg <- hi - lo
    while (abs(chg) > tol) {
        mid <- (lo + hi) / 2
        fmid <- fn(mid, ...)
        if (abs(fmid) <= tol) break
        if (flo * fmid < 0) hi <- mid
        if (fhi * fmid < 0) lo <- mid
        chg <- hi - lo
    }

    return(mid)
}

testFn <- function(x, a) exp(-x) - a*x

> system.time(bM <- bisectMatt(testFn, 0, 2, a=1))
   user  system elapsed 
  0.000   0.000   0.001 

> system.time(bR <- bisectRavi(testFn, 0, 2, a=1))
   user  system elapsed 
  0.000   0.000   0.001 

> bM
[1] 0.5671433

> bR
[1] 0.5671433

Of course, Ravi's version is better for production (and most likely
faster, though not significantly so in this example) because recursion
is more expensive than looping.
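
As a sanity check, a bisection result can be compared against uniroot(),
whose help page attributes its algorithm to Brent. A minimal sketch only:
bisectLoop is a compact, self-contained restatement of the looping idea,
written for this illustration, not a replacement for either version above.

```r
testFn <- function(x, a) exp(-x) - a * x

# Hypothetical compact loop version, for illustration only
bisectLoop <- function(fn, lo, hi, tol = 1e-7, ...) {
    flo <- fn(lo, ...)
    if (flo * fn(hi, ...) > 0)
        stop("root is not bracketed by lo and hi")
    while (abs(hi - lo) > tol) {
        mid  <- (lo + hi) / 2
        fmid <- fn(mid, ...)
        if (fmid == 0) return(mid)
        # keep the half-interval whose endpoints still differ in sign
        if (flo * fmid < 0) {
            hi <- mid
        } else {
            lo  <- mid
            flo <- fmid
        }
    }
    (lo + hi) / 2
}

rb <- bisectLoop(testFn, 0, 2, a = 1)
ru <- uniroot(testFn, c(0, 2), a = 1, tol = 1e-9)$root
print(c(bisection = rb, uniroot = ru))
```

Both should agree with the 0.5671433 reported above to within the
tolerances used.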

-Matt

On Fri, 2010-09-17 at 17:44 -0400, Ravi Varadhan wrote:
 Here is something simple (does not have any checks for bad input), yet
 should be adequate:
 
 bisect <- function(fn, lower, upper, tol=1.e-07, ...) {
 f.lo <- fn(lower, ...)
 f.hi <- fn(upper, ...)
 feval <- 2
 
 if (f.lo * f.hi > 0) stop("Root is not bracketed in the specified interval\n")
 chg <- upper - lower
 
 while (abs(chg) > tol) {
   x.new <- (lower + upper) / 2
   f.new <- fn(x.new, ...)
   if (abs(f.new) <= tol) break
   if (f.lo * f.new < 0) upper <- x.new
   if (f.hi * f.new < 0) lower <- x.new
   chg <- upper - lower
   feval <- feval + 1
 }
 list(x = x.new, value = f.new, fevals=feval)
 }
 
 # An example
 fn1 <- function(x, a) {
 exp(-x) - a*x
 }
 
 bisect(fn1, 0, 2, a=1)
  
 bisect(fn1, 0, 2, a=2)
 
 
 Ravi.
 
 -----Original Message-----
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of Peter Dalgaard
 Sent: Friday, September 17, 2010 4:16 PM
 To: Gregory Gentlemen
 Cc: r-help@r-project.org
 Subject: Re: [R] Is there a bisection method in R?
 
 On 09/17/2010 09:28 PM, Gregory Gentlemen wrote:
  If uniroot is not a bisection method, then what function in R does use
 bisection?
  
 
 Why do you assume that there is one? uniroot contains a better algorithm
 for finding bracketed roots.
 
 It shouldn't be too hard to roll your own if you need one for
 pedagogical purposes.
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] R Founding

2010-09-16 Thread Matt Shotwell
On Thu, 2010-09-16 at 17:30 -0400, Tal Galili wrote:
 Hello dear Jaroslaw,
 I strongly agree with you that the R foundation should have an easier method
 of enabling people to give donations.
 At the same time, I feel there is a (friendly) disagreement between us on
 how such money should be used.
 
 Your massage has inspired me to write a post on the topic, titles:
   
Was this --^ a Freudian slip? In any case, it seems consistent with your
notion of compensation for open-source developers. :) Interesting post
Tal.

-Matt

 Open source and money – why R developers shouldn’t be paid
 http://www.r-statistics.com/2010/09/open-source-and-money-why-r-developers-shouldnt-be-paid/
 
 
 I hope you, and other community members, would find interest in it.
 
 Best,
 Tal
 
 
 
 
 
 
 Contact
 Details:---
 Contact me: tal.gal...@gmail.com |  972-52-7275845
 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
 www.r-statistics.com (English)
 --
 
 
 
 
 On Thu, Sep 16, 2010 at 12:49 PM, jaropis jaro...@zg.home.pl wrote:
 
  A few days ago Tal Galili posted a message about some controversies
  concerning the future of R. Having read the discussions, especially those
  following Ross Ihaka's post, I have come to the conclusion that, as usual,
  the problem is money. I doubt there would be discussions about dropping R
  in its present form if the R-Foundation were properly funded and could
  hire computer scientists, programmers and statisticians. If a commercial
  company is able to provide big-database and multicore solutions, then so
  would a properly funded R-Foundation.
 
  In my opinion the main reason for the lack of funding is that the
  Foundation does not want to accept it from users and waits for the likes
  of Google to bring them a sack of money. I have already posted about
  this, but this seems to be the time and place to repeat it: it is very
  difficult to donate anything to the R-Foundation. First you have to find
  the appropriate link at the r-project page, then you have to fill out a
  form and send or fax it to the Foundation. I am not comfortable sending
  my details over snail-mail or fax.
 
  I would GLADLY donate 30-50$ each year just to see R develop, but there
  needs to be a way for me to do it in a civilized manner. If the userbase of
  R is over 2 million there will surely be 100,000 users who, like myself,
  will happily fork out 40$ a year - would that help? you can do the
  calculation yourselves. Set up a donation page in which I will be able to
  pay by credit card or PayPal and you will start getting donations from
  individual users. Advertise this at the startup message of the program: say
  something like support us at www.suppoRtR.com and the money will start
  coming. I am sure there would be enough to employ some foundation members
  full-time, pay external CSs and even protect the system in court from those
  who make money off of somebody else's work and do not give back to the
  community (you know who I am talking about).
 
  R and the Foundation have helped a lot of us to do our research and make
  real money. Now give us a chance to help you!
 
 
  Regards
  Jaroslaw Piskorski
 
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Which language is faster for numerical computation?

2010-09-09 Thread Matt Shotwell
For the compiled languages, it depends heavily on the compiler. This
sort of comparison is rendered moot by the huge variety of compiler and
hardware specific optimizations. My suggestion is to use C, or possibly
C++ in conjunction with Rcpp, as these are most compatible with R. Also,
C and C++ are consistently rated highly (often in the top 3) in
popularity and use. Fortran is not. This would make a difference if you
want to collaborate or ask for help.

-Matt

On Thu, 2010-09-09 at 06:26 -0400, Christofer Bogaso wrote:
 Dear all, R offers integration mechanism with different programming
 languages like C, C++, Fortran, .NET etc. Therefore I am curious on,
 for heavy numerical computation which language is the fastest? Is
 there any study? I specially want to know because, if there is some
 study saying that C is the fastest language for numerical computation
 then I would change some of my R code into C.
 
 Thanks for your time.
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Reproducible research

2010-09-09 Thread Matt Shotwell
I have a little package I've been using to write template blog posts (in
HTML) with embedded R code. It's quite small but very flexible and
extensible, and aims to do something similar to Sweave and brew. In
fact, the package is heavily influenced by the brew package, though
implemented quite differently. It depends on the evaluate package,
available on CRAN. The tentatively titled 'markup' package is
attached. After it's installed, see ?markup and the few examples in the
inst/ directory, or just example(markup).

-Matt

On Thu, 2010-09-09 at 01:47 -0400, David Scott wrote:
 I am investigating some approaches to reproducible research. I need in 
 the end to produce .html or .doc or .docx. I have used hwriter in the 
 past but have had some problems with verbatim output from  R. Tables are 
 also not particularly convenient.
 
 I am interested in R2HTML and R2wd in particular, and possibly odfWeave.
 
 Does anyone have sample documents using any of these approaches which 
 they could let me have?
 
 David Scott
 
 _
 
 David Scott   Department of Statistics
   The University of Auckland, PB 92019
   Auckland 1142,NEW ZEALAND
 Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
 Email:d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018
 
 Director of Consulting, Department of Statistics
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina


Re: [R] Reproducible research

2010-09-09 Thread Matt Shotwell
Well, the attachment was a dud. Try this:

http://biostatmatt.com/R/markup_0.0.tar.gz

-Matt

On Thu, 2010-09-09 at 10:54 -0400, Matt Shotwell wrote:
 I have a little package I've been using to write template blog posts (in
 HTML) with embedded R code. It's quite small but very flexible and
 extensible, and aims to do something similar to Sweave and brew. In
 fact, the package is heavily influenced by the brew package, though
 implemented quite differently. It depends on the evaluate package,
 available in the CRAN. The tentatively titled 'markup' package is
 attached. After it's installed, see ?markup and the few examples in the
 inst/ directory, or just example(markup).
 
 -Matt
 
 On Thu, 2010-09-09 at 01:47 -0400, David Scott wrote:
  I am investigating some approaches to reproducible research. I need in 
  the end to produce .html or .doc or .docx. I have used hwriter in the 
  past but have had some problems with verbatim output from  R. Tables are 
  also not particularly convenient.
  
  I am interested in R2HTML and R2wd in particular, and possibly odfWeave.
  
  Does anyone have sample documents using any of these approaches which 
  they could let me have?
  
  David Scott
  
  _
  
  David Scott Department of Statistics
  The University of Auckland, PB 92019
  Auckland 1142,NEW ZEALAND
  Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
  Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018
  
  Director of Consulting, Department of Statistics
  
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Uncompressing data from read.socket

2010-09-08 Thread Matt Shotwell
Have a look at gzcon, for decompressing data as they arrive. From the
help file:

 ‘gzcon’ provides a modified connection that wraps an existing
 connection, and decompresses reads or compresses writes through
 that connection.  Standard ‘gzip’ headers are assumed.

Nothing in the gzcon help file explicitly prohibits socketConnections.
Also, see memDecompress for in-memory decompression of the entire
object.
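
A minimal sketch of both routes, using a temporary gzip file in place of
the socket so that it runs anywhere (the sample text and the file are made
up for illustration). With a real socket, the assumption is that gzcon()
would wrap a binary-mode socketConnection() in the same way; I have not
tested that here.

```r
payload <- c("first line", "second line")

# Write a gzip-compressed file to stand in for the remote sender
tf <- tempfile(fileext = ".gz")
gz <- gzfile(tf, "w")
writeLines(payload, gz)
close(gz)

# gzcon() wraps an existing binary connection and decompresses on read
con <- gzcon(file(tf, "rb"))
streamed <- readLines(con)
close(con)
unlink(tf)

# memDecompress() handles a whole compressed object already in memory
z <- memCompress(charToRaw("sent in one piece"), type = "gzip")
whole <- rawToChar(memDecompress(z, type = "gzip"))

print(streamed)
print(whole)
```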

-Matt

On Wed, 2010-09-08 at 00:50 -0400, raje...@cse.iitm.ac.in wrote:
 Hi,
 
 Is it possible to uncompress gzipped data coming over a socket? 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] remove accents in strings

2010-09-07 Thread Matt Shotwell
If you know the encoding of the string, or if its encoding is the
current locale encoding, then you can use the iconv function to convert
the string to ASCII. Something like:

iconv(accented.string, to="ASCII//TRANSLIT")

While 7-bit ASCII does not permit accented characters, extended (8-bit)
ASCII does. Hence, I'm not sure this will work. But it's worth a try.

-Matt

On Tue, 2010-09-07 at 13:04 -0400, lamack lamack wrote:
 Dear all, there is a R function to remove all accents in strings?
 
 best regards.
 
 JL 
 
 
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] remove accents in strings

2010-09-07 Thread Matt Shotwell
Weird, my (Ubuntu, s don't tell Dirk) iconv doesn't add the
backticks or single quotes.

> tst <- c("à", "è", "ì", "ò", "ù" , "À", "È", "Ì", "Ò", "Ù", "á",
+ "é", "í", "ó", "ú", "ý" , "Á", "É", "Í", "Ó", "Ú", "Ý")
> iconv(tst, to="ASCII//TRANSLIT")
 [1] "a" "e" "i" "o" "u" "A" "E" "I" "O" "U" "a" "e" "i" "o" "u" "y" "A" "E" "I"
[20] "O" "U" "Y"

By the way, I'll take this moment to remind anyone interested that R
still has trouble with embedded zeros in character strings. I may be
abusing terminology, but I think that makes R 8-bit dirty.

-Matt

On Tue, 2010-09-07 at 14:01 -0400, David Winsemius wrote:
 On Sep 7, 2010, at 1:35 PM, Matt Shotwell wrote:
 
  If you know the encoding of the string, or if its encoding is the
  current locale encoding, then you can use the iconv function to  
  convert
  the string to ASCII. Something like:
 
  iconv(accented.string, to="ASCII//TRANSLIT")
 
  While 7-bit ASCII does not permit accented characters, extended (8-bit)
  ASCII does. Hence, I'm not sure this will work. But it's worth a try.
 
 > tst <- c("à", "è", "ì", "ò", "ù" , "À", "È", "Ì", "Ò", "Ù", "á",
 "é", "í", "ó", "ú", "ý" , "Á", "É", "Í", "Ó", "Ú", "Ý")
 > iconv(tst, to="ASCII//TRANSLIT")
  [1] "`a" "`e" "`i" "`o" "`u" "`A" "`E" "`I" "`O" "`U" "'a" "'e" "'i" "'o" "'u" "'y"
 [17] "'A" "'E" "'I" "'O" "'U" "'Y"
 > gsub("`|\\'", "", iconv(tst, to="ASCII//TRANSLIT"))
  [1] "a" "e" "i" "o" "u" "A" "E" "I" "O" "U" "a" "e" "i" "o" "u" "y" "A" "E" "I" "O"
 [21] "U" "Y"
 
 Notice that the accent acute gets converted to a single quote and  
 therefore needs to be dbl-\-ed to get recognized in an R regex pattern.
 
 On a Mac with: locale:
 [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Why is vector assignment in R recreates the entire vector ?

2010-09-01 Thread Matt Shotwell
Tal, 

For your first example, x is not duplicated in memory. If you compile R
with --enable-memory-profiling, you have access to the tracemem()
function, which will report whether x is duplicate()d:

> x <- rep(1,100)
> tracemem(x)
[1] "<0x8f71c38>"
> x[10] <- NA

This does not result in duplication of x, nor does assignment of x to y:

> y <- x

At this point, y internally references x. It's not until we modify y,
that x is duplicated, and y gets its own copy of the data:

> y[10] <- NA
tracemem[0x8f71c38 -> 0x91fff70]:

Likewise, no duplication occurs using `[<-`:

> x <- rep(1,100)
> tracemem(x)
[1] "<0x8e44900>"
> x <- `[<-`(x, list=10, values=NA)

But, R is not yet smart enough to avoid a duplication here:

> x <- rep(1,100)
> tracemem(x)
[1] "<0x915d580>"
> x <- replace(x, list=10, values=NA)
tracemem[0x915d580 -> 0x915e090]: replace 

Beyond these simple tests, it's difficult to know when R copies memory.
I mentioned in another post recently that subsetting a vector will copy
memory, but this is not reported by tracemem(). For example:

> tracemem(x)
[1] "<0x915ed50>"
> y <- x[1:100]
> tracemem(y)
[1] "<0x915f3f0>"
> identical(x,y)
[1] TRUE

Fortunately, memory is fairly cheap, and memory operations are pretty
fast in modern operating systems, like GNU Linux. I mostly find that the
rate limiting steps in my code are computational routines, like exp().
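
The copy-on-modify behaviour can also be checked without reading
addresses. A minimal sketch; the capabilities("profmem") guard is my
addition, so that the code still runs on a build without memory profiling:

```r
x <- rep(1, 100)
traced <- capabilities("profmem")   # TRUE if R was built with memory profiling
if (traced) tracemem(x)

y <- x          # no copy yet: y refers to the same data as x
y[10] <- NA     # first modification of y triggers the duplication

if (traced) untracemem(x)

# x is untouched, so y must have received its own copy
print(anyNA(x))      # FALSE
print(is.na(y[10]))  # TRUE
```

On a build with memory profiling, the tracemem[...] line is printed at the
y[10] assignment and not at y <- x.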

-Matt


On Wed, 2010-09-01 at 11:09 -0400, Tal Galili wrote:
 Hello all,
 
 A friend recently brought to my attention that vector assignment actually
 recreates the entire vector on which the assignment is performed.
 
 So for example, the code:
 x[10] <- NA # The original call (short version)
 
 Is really doing this:
 x <- replace(x, list=10, values=NA) # The original call (long version)
 # assigning a whole new vector to x
 
 Which is actually doing this:
 x <- `[<-`(x, list=10, values=NA) # The actual call
 
 
 Assuming this can be explained reasonably to the lay man, my question is,
 why is it done this way ?
 Why won't it just change the relevant pointer in memory?
 
 On small vectors it makes no difference.
 But on big vectors this might be (so I suspect) costly (in terms of time).
 
 
 I'm curious for your responses on the subject.
 
 Best,
 Tal
 
 
 
 Contact
 Details:---
 Contact me: tal.gal...@gmail.com |  972-52-7275845
 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
 www.r-statistics.com (English)
 --
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] documentation to upgrade R-package from 32 to 64bit

2010-09-01 Thread Matt Shotwell
Try this:
http://www.stats.ox.ac.uk/~ripley/Win64/W64porting.html

-Matt

On Wed, 2010-09-01 at 07:40 -0400, Hayes, Daniel wrote:
 Dear all,
 
 I am working with an R package named GAMLSS (www.gamlss.com). It is
 currently only functional under the 32-bit version of R (for Windows).
 The author of the package has agreed to help me create a 64-bit compatible
 version.
 I've been looking through the available R documentation but cannot find any
 relevant information on the process.
 Any help finding such documentation, or any information on the general
 changes needed for a 32-bit add-on package to work with a 64-bit version
 of R, would be much appreciated.
 
 Thank you in advance for your help,
 Daniel Hayes
 
 
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] [semi-OT] Using fortune() in an email signature

2010-09-01 Thread Matt Shotwell
Or using R & GNU tools:

m...@max:~$ R -e "fortunes::fortune()" | gawk '/^[^>]/ {print}'

It's not a question of trying variations, rather of following
instructions.
   -- Brian D. Ripley (about using 'Writing R Extensions')
  R-help (January 2006)

-Matt

On Wed, 2010-09-01 at 16:49 -0400, Stuart Luppescu wrote:
 Hello, As you can see from my signature in this message, I use the R
 fortune function to generate a fortune, which is then fed to the
 signature program, which constructs a named pipe containing the
 fortune-bearing sig, which is then included in mail messages. The
 problem is that it's got extraneous junk in it and I can't figure out
 how to get rid of it. This is the command that generates the fortune:
 
 /usr/bin/R --no-save --no-restore -q < /home/sl70/print-fortune.R
 (where print-fortune.R is just
 library(fortunes)
 fortune()
 )
 
 This produces this:
 > library(fortunes)
 > fortune()
 
 Michael Watson: Hopefully this one isn't in the manual or I am about to get
 shot :-S
 Peter Dalgaard: *Kapow*...
    -- Michael Watson and Peter Dalgaard (question on axis())
       R-help (February 2006)
 
 >
 
 I would like to remove the first two lines and the last line, so I
 changed the command to this:
 /usr/bin/R --no-save --no-restore < /home/sl70/print-fortune.R | tail \
 -n +23 | head -n -2 2> /dev/null
 
 That give the desired result when I run it at the command line, but when
 I feed it to the signature program, I get this message:
 
 Program /usr/local/bin/r-fortune doesn't seem to exist
 
 This is the signature program code that produces this error:
 
  /* check for existence of program by forking and then trying to
     exec() it in the child */
 pid = fork();
 switch (pid) {
 case -1:        /* oh well */
     perror("Couldn't fork() a child process");
     exit(EXIT_FAILURE);
 case 0:         /* in child */
     /* close stdout */
     close(1);
     execlp(producer, producer, (char *) 0);
     exit(EXIT_FAILURE);
 default:
     waitpid(pid, &exit_status, 0);
     if (exit_status != EXIT_SUCCESS) {
         fprintf(stderr, "Program %s doesn't seem to exist\n",
                 producer);
         exit(EXIT_FAILURE);
     }
 
 Unfortunately, I don't understand this at all. Can anyone give me a clue
 as to what's happening?
 
 Thanks.

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Sweave.sty

2010-08-24 Thread Matt Shotwell
Here is one:

http://svn.r-project.org/R/trunk/share/texmf/tex/latex/Sweave.sty

-Matt

On Tue, 2010-08-24 at 15:40 -0400, r.ookie wrote:
 Does anyone know where I can download the latest version of Sweave.sty? I 
 have looked all over the site http://www.stat.umn.edu/~charlie/Sweave/ with 
 no luck.

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] on abort error, always show call stack?

2010-08-22 Thread Matt Shotwell
On Sun, 2010-08-22 at 11:41 -0400, ivo welch wrote:
 Dear R Wizards---is it possible to get R to show its current call
 stack (sys.calls()) upon an error abort?  I don't use ESS for
 execution, and it is often not obvious how to locate how I triggered
 an error in an R internal function.  Seeing the call stack would make
 this easier.  (right now, I sprinkle cat statements everywhere, just
 to locate the line where the error appears.)  Of course, I would
 really love to see the line in my program that triggered this, but I
 have asked this before, and I understand this is too difficult to get
 into the R language.

The traceback() function will print out the call stack after an error.
However, you may find the debug() family of functions more useful for
debugging. Also see the browser() function.

-Matt

 
 regards,
 
 /iaw
 
 
 Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] on abort error, always show call stack?

2010-08-22 Thread Matt Shotwell
How about this:

test <- function(x) log(x)
tryCatch({
    #Code that will error
    test(a)
}, finally = {
    sink(stderr())
    traceback()
    sink()
})


If you are running non-interactively, invoke R with the --interactive
flag to force it. Saving the code above to test.R, you can see the
effect with

$ R --interactive < test.R 1> test.out 2> test.err

This seems reasonable, but maybe others will say if I'm missing
something more automagic.
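
One possibly more automagic variant, as a sketch only: a calling handler
runs before the stack is unwound, so sys.calls() inside it still sees the
frames that led to the error. The names with_stack, f and g are made up
for illustration.

```r
with_stack <- function(expr) {
    stack <- NULL
    result <- tryCatch(
        withCallingHandlers(
            expr,
            # runs at the point of the error, before unwinding
            error = function(e) stack <<- sys.calls()
        ),
        error = function(e) e    # then recover, returning the condition
    )
    list(result = result, stack = stack)
}

g <- function(x) stop("something went wrong")
f <- function(x) g(x)

out <- with_stack(f(1))

# Print the captured calls to stderr, innermost last
for (cl in as.list(out$stack))
    message(paste(deparse(cl), collapse = " "))
```

The captured stack also includes the tryCatch() and handler frames, so
expect some noise around the calls of interest.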

-Matt


On Sun, 2010-08-22 at 11:58 -0400, ivo welch wrote:
 yes, thank you.  is it possible to have it invoked to STDERR
 automatically on a program abort?
 
 /iaw
 
 On Sun, Aug 22, 2010 at 11:50 AM, Matt Shotwell shotw...@musc.edu wrote:
  On Sun, 2010-08-22 at 11:41 -0400, ivo welch wrote:
  Dear R Wizards---is it possible to get R to show its current call
  stack (sys.calls()) upon an error abort?  I don't use ESS for
  execution, and it is often not obvious how to locate how I triggered
  an error in an R internal function.  Seeing the call stack would make
  this easier.  (right now, I sprinkle cat statements everywhere, just
  to locate the line where the error appears.)  Of course, I would
  really love to see the line in my program that triggered this, but I
  have asked this before, and I understand this is too difficult to get
  into the R language.
 
  The traceback() function will print out the call stack after an error.
  However, you may find the debug() family of functions more useful for
  debugging. Also see the browser() function.
 
  -Matt
 
 
  regards,
 
  /iaw
 
  
  Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
 
 
  --
  Matthew S. Shotwell
  Graduate Student
  Division of Biostatistics and Epidemiology
  Medical University of South Carolina
 
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Does R always insist on sending plot output to a file?

2010-08-19 Thread Matt Shotwell
Donald, 

I was able to 'trick' R into writing plot data to a GNU Linux fifo. I
had forgotten that the fifo will block until there is a process at
either end (a writer and a reader):

At one terminal, create a fifo and set a program to catch output

$ mkfifo Rfifo
$ cat Rfifo

At a second terminal

$ R
> postscript(file="Rfifo")
> plot(0)
> dev.off()
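
If a fifo is not available, a more portable fallback, sketched here under
the assumption that a temporary file is acceptable, is to let the device
write to tempfile() and pull the bytes back into a raw vector with
readBin(). The pdf() device is used only because it is always available.

```r
tf <- tempfile(fileext = ".pdf")

pdf(file = tf)
plot(0)
dev.off()

# Read the whole file into memory as a raw vector
bytes <- readBin(tf, what = "raw", n = file.info(tf)$size)
unlink(tf)

print(length(bytes))
print(rawToChar(bytes[1:4]))   # PDF files start with "%PDF"
```

This still touches the file system, so it does not address the speed
concern, only the need to get the bytes into a variable.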

-Matt


On Wed, 2010-08-18 at 23:21 -0400, Matt Shotwell wrote:
 Donald,
 
 At least for the PDF device (I know you asked about png, but I believe
 they are similar), the answer is no. Ultimately, this device calls the
 standard C function fopen, and writes its data to the resulting file
 stream.
 
 If you're using GNU Linux, you might trick R into writing to a fifo (a
 named pipe, see 'man fifo'), or some other in-memory device, and read
 from it with another program. My initial experiments with this, however,
 were not successful.
 
 A better solution here, would be to have the various graphics devices
 write to an R connection, as do most other R functions that input and
 output data. In this way, we could write graphics data to a RAW
 connection (rawConnection()), which is essentially a memory buffer. 
 
 There are two obvious barriers to this:
 1. C level I/O routines (e.g. fprintf) are heavily integrated into the
 graphics device code. Hence, accommodating R connections would require
 significant changes.
 2. The graphics devices are mostly implemented in C, and there is (at
 present) no interface to R connections at the C level.
 
 -Matt
 
 On Wed, 2010-08-18 at 21:49 -0400, Donald Paul Winston wrote:
  I need to write the output of a R plot to a Java OutputStream. It looks like
  R insists on sending its output to a file. Is there any way to get bytes
  directly from the output of a plot so I can write it with Java? Writing it
  to a file is too slow.
  
  Is there a parameter in the graphics device function png(..) that directs
  output to a variable in memory? 
  
  x <- plot(.)  would make sense.
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Pass By Value Questions

2010-08-19 Thread Matt Shotwell
On Thu, 2010-08-19 at 14:27 -0400, Duncan Murdoch wrote:
 On 19/08/2010 12:57 PM, li...@jdadesign.net wrote:
  I understand R is a Pass-By-Value language. I have a few practical
  questions, however.
 
  I'm dealing with a large dataset (~1GB) and so my understanding of the
  nuances of memory usage in R is becoming important.
 
  In an example such as:
   d <- read.csv("file.csv");
   n <- apply(d, 1, sum);
  must d be copied to another location in memory in order to be used by
  apply? In general, is copying only done when a variable is updated within
  a function?

 
 Generally R only copies when the variable is modified, but its rules for 
 detecting this are sometimes overly conservative, so you may get some 
 unnecessary copying.  For example,
 
 d[1,1] <- 3
 
 will probably not make a full copy of d when the internal version of 
 `[<-` is used, but if you have an R-level version, it probably will.  I 
 forget whether the dataframe method is internal or R level. 
 
 In the apply(d, 1, sum) example, it would probably make a copy of each 
 row to pass to sum, but never a copy of the whole dataframe/array.
  Would the following example be any different in terms of memory usage?
   d <- read.csv("file.csv");
   n <- apply(d[,2:10], 1, sum);
  or can R reference the original d object since no changes to the object
  are being made?

 
 This would make a new object containing d[,2:10], and would pass that to 
 apply.

Since d is a data.frame, subsetting the columns would create a new
data.frame, as Duncan says. However, the columns of the new data.frame
would internally _reference_ the appropriate columns of d, until either
were modified. This does not apply to row subsetting. That is, d[2:10,]
would create a new data.frame and copy the relevant data. Nor does it
apply to _any_ subsetting of matrices.

  I'm familiar with FF and BigMemory, but are there any packages/tricks
  which allow for passing such objects by reference without having to code
  in C?


It's difficult to determine exactly when data is copied internally by R.
The tracemem function may be used to track when entire objects are
duplicated. However, tracemem would not detect the duplication that
occurs, for example, when subsetting the rows of d. Otherwise, we can
monitor memory usage with gc(), and experiment with code on a trial and
error basis.

I have had limited success in avoiding duplication by utilizing R
environments. See for example http://biostatmatt.com/archives/663 .
However, this may be more trouble than it's worth.
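
As a rough illustration of the environment approach (a sketch only; the helper name add_double and the toy data are assumptions, not taken from the linked post): an environment is passed by reference, so a function can update the data it holds without returning a modified copy.

```r
# An environment is passed by reference, so a function can modify the
# data it holds without returning (and re-assigning) a modified copy.
store <- new.env()
store$d <- data.frame(a = 1:3)

# Hypothetical helper: adds a column to the data frame held in 'env'.
add_double <- function(env) {
  env$d$b <- env$d$a * 2
  invisible(NULL)
}

add_double(store)
store$d
```

This avoids copying at the function-call boundary, but R may still duplicate parts of the data frame internally when the column is added, so it is worth checking memory use with gc() on realistic data.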

-Matt

 Duncan Murdoch
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Delete rpart/mvpart cross-validation output

2010-08-18 Thread Matt Shotwell
Or, if using GNU Linux or other UNIX-like system:

sink("/dev/null")
# Issue commands
sink()
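
A slightly fuller sketch of the same idea, which should also work on Windows. The function noisy() is a hypothetical stand-in for a call such as mvpart() that prints progress with cat(); it is not part of any package.

```r
# Pick the platform's null device: "NUL" on Windows, "/dev/null" elsewhere.
null_dev <- if (.Platform$OS.type == "windows") "NUL" else "/dev/null"

# Hypothetical stand-in for a function that prints progress with cat().
noisy <- function() {
  cat("X-Val rep : 1 2 3\n")
  42
}

sink(null_dev)
result <- noisy()   # the cat() output is discarded
sink()              # restore normal output

result
```

Note that sink() diverts ordinary output only; messages and warnings still reach the console unless they are diverted separately.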

-Matt

On Wed, 2010-08-18 at 09:14 -0400, Gabor Grothendieck wrote:
 On Fri, Aug 13, 2010 at 1:52 PM, Marie-Hélène Ouellette
 mariehele...@gmail.com wrote:
  Dear all,
 
  I was wondering if there is a simple way to avoid printing the multiple
  cross-validation automatic output to the console of recursive partitioning
  functions like rpart or mvpart. For example...
 
  data(spider)
 
  mvpart(data.matrix(spider[,1:12])~herbs+reft+moss+sand+twigs+water,spider,xv="1se",xvmult=100)
  *X-Val rep : 1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18
  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37
  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56
  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75
  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94
  95  96  97  98  99  100
  Minimum tree sizes
  tabmins
   4  6  7  8
   2 18 78  2 *
 
  ... losing what's in bold ?
 
 
 Try this hack:
 
 cat <- function(...) if (..1 != "X-Val rep : 1") base::cat(...)
 environment(mvpart) <- .GlobalEnv
 
 mvpart(data.matrix(spider[,1:12])~herbs+reft+moss+sand+twigs+water,spider,xv="1se",xvmult=100)
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Does R always insist on sending plot output to a file?

2010-08-18 Thread Matt Shotwell
Donald,

At least for the PDF device (I know you asked about png, but I believe
they are similar), the answer is no. Ultimately, this device calls the
standard C function fopen, and writes its data to the resulting file
stream.

If you're using GNU Linux, you might trick R into writing to a fifo (a
named pipe, see 'man fifo'), or some other in-memory device, and read
from it with another program. My initial experiments with this, however,
were not successful.

A better solution here, would be to have the various graphics devices
write to an R connection, as do most other R functions that input and
output data. In this way, we could write graphics data to a RAW
connection (rawConnection()), which is essentially a memory buffer. 

There are two obvious barriers to this:
1. C level I/O routines (e.g. fprintf) are heavily integrated into the
graphics device code. Hence, accommodating R connections would require
significant changes.
2. The graphics devices are mostly implemented in C, and there is (at
present) no interface to R connections at the C level.

-Matt

On Wed, 2010-08-18 at 21:49 -0400, Donald Paul Winston wrote:
 I need to write the output of a R plot to a Java OutputStream. It looks like
 R insists on sending its output to a file. Is there any way to get bytes
 directly from the output of a plot so I can write it with Java? Writing it
 to a file is too slow.
 
 Is there a parameter in the graphics device function png(..) that directs
 output to a variable in memory? 
 
 x <- plot(.)  would make sense.

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] when to use textConnection ??

2010-08-16 Thread Matt Shotwell
Also, many R functions are designed to operate on R connections, to
input and output text. Alternatively, we may wish to provide the input
text as an R character vector, or output text to a character vector. The
textConnection makes a character vector look like a connection, so R
routines that operate on connections may also operate on character
vectors.

The textConnection also provides a mechanism for re-encoding text data,
although this may be more directly accomplished via the iconv function.
However, both methods are currently limited to encodings that do not
allow embedded null characters.
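
A small sketch of both directions, using made-up data: reading from a character vector as if it were a file, and collecting the text that a connection-based routine writes.

```r
# Input: treat a character vector as if it were a file.
txt <- c("T1 T2 T3",
         "-0.24 -0.26 -0.67",
         "-1.58 0.04 0.14")
con <- textConnection(txt)
dat <- read.table(con, header = TRUE)
close(con)

# Output: collect text written by a connection-based routine.
out <- textConnection(NULL, "w")
write.csv(dat, out, row.names = FALSE)
csv_lines <- textConnectionValue(out)
close(out)

csv_lines
```

Here csv_lines is an ordinary character vector with one element per line written, so it can be processed further without touching the file system.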

-Matt

On Mon, 2010-08-16 at 13:06 -0400, Joshua Wiley wrote:
 Hi,
 
 One useful case is when data is sent in an email.  For instance:
 
 T1   T2 T3
 -0.24 -0.26 -0.67
 -1.58  0.04  0.14
 -1.21  1.55 -0.45
 0.31  0.48 -1.39
 
 One could read it in via
 
 con <- textConnection("
 T1   T2 T3
 -0.24 -0.26 -0.67
 -1.58  0.04  0.14
 -1.21  1.55 -0.45
 0.31  0.48 -1.39")
 
 read.table(con, header = TRUE)
 
 Often a text file can be read in directly with read.table() and the
 appropriate delimiter (e.g., sep = "\t" for tab, "," for comma, etc.).
  Do you have a particular problem you are trying to solve or an
 application of textConnection() you are interested in?
 
 Cheers,
 
 Josh
 
 On Mon, Aug 16, 2010 at 9:37 AM, skan juanp...@gmail.com wrote:
 
  Hello.
 
  I don't understand when to use textConnection and when not.
  Some examples do it, some not.
  I've even seen something like
 
  con <- textConnection(rev(rev(readLines('data.txt'))[-(1:2)]))
  data <- read.table(con)
  close(con)
 
  --
  View this message in context: 
  http://r.789695.n4.nabble.com/when-to-use-textConnection-tp2327132p2327132.html
  Sent from the R help mailing list archive at Nabble.com.
 
 
 
 
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] band pass filter

2010-08-15 Thread Matt Shotwell
nuncio,

If you already have a filter kernel, you can use the filter function. Of
course, convolution filters can be applied directly using the discrete
Fourier transform via the fft function. For an example of filtering
(lowpass) with R, see http://biostatmatt.com/archives/78 , and the
associated R script linked there. Also see the link to a free
downloadable book by Steven Smith, which discusses the DFT and building
filter kernels.
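
To make the DFT route concrete, here is a minimal band-pass sketch on a synthetic signal. The signal, the pass band of 30 to 50 cycles, and all variable names are illustrative assumptions; this is a plain "zero the unwanted bins" filter, not the method from the linked script.

```r
# Synthetic signal: a slow component (5 cycles) plus a fast one (40 cycles)
# over n samples. Both frequencies are illustrative choices.
n  <- 256
tt <- 0:(n - 1)
x  <- sin(2 * pi * 5 * tt / n) + sin(2 * pi * 40 * tt / n)

# Signed frequency index of each FFT bin: 0, 1, ..., n/2, -(n/2 - 1), ..., -1.
k <- c(0:(n / 2), -((n / 2 - 1):1))

# Keep only bins whose absolute frequency lies in the pass band [30, 50].
keep <- abs(k) >= 30 & abs(k) <= 50

# Zero the rejected bins and invert; R's inverse FFT is unnormalized,
# so divide by n.
y <- Re(fft(fft(x) * keep, inverse = TRUE)) / n

# y should now be close to the 40-cycle component alone.
max(abs(y - sin(2 * pi * 40 * tt / n)))
```

An abrupt cutoff like this works cleanly here because both components fall exactly on FFT bins. On real data it tends to produce ringing, which is where a properly designed filter kernel becomes preferable.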

-Matt

On Sat, 2010-08-14 at 23:52 -0400, nuncio m wrote:
 Hello list,
   Is there any way to bandpass filter in R
 thanks
 nuncio
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] reading a text file, one line at a time

2010-08-15 Thread Matt Shotwell
Walt,

Something like:

con <- file("your-large-file.txt", "rt")
readLines(con, 1) # Read one line
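
Extended into a full read, modify, write loop, this might look like the sketch below. The temporary files and the toupper() step are placeholders for the real input file and string manipulation.

```r
# Set up a small input file so the sketch is self-contained.
infile  <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".txt")
writeLines(c("one", "two", "three"), infile)

con_in  <- file(infile, "rt")
con_out <- file(outfile, "wt")

# readLines() returns character(0) at end of file, which ends the loop.
while (length(line <- readLines(con_in, n = 1)) > 0) {
  line <- toupper(line)        # stand-in for the real string manipulation
  writeLines(line, con_out)
}

close(con_in)
close(con_out)

readLines(outfile)
```

Because the connection stays open between calls, each readLines() call picks up where the previous one stopped. Reading in larger chunks (say n = 1000) is usually faster than one line at a time.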

-Matt



On Sun, 2010-08-15 at 10:58 -0400, Data Analytics Corp. wrote:
 Hi,
 
 I have an upcoming project that will involve a large text file.  I want to
 
1. read the file into R one line at a time
2. do some string manipulations on the line
3. write the line to another text file.
 
 I can handle the last two parts.  Scan and read.table seem to read the 
 whole file in at once.  Since this is a very large file (several hundred 
 thousand lines), this is not practical.  Hence the idea of reading one 
 line at at time.  The question is, can R read one line at a time?  If 
 so, how?  Any suggestions are appreciated.
 
 Thanks,
 
 Walt
 
 
 
 Walter R. Paczkowski, Ph.D.
 Data Analytics Corp.
 44 Hamilton Lane
 Plainsboro, NJ 08536
 
 (V) 609-936-8999
 (F) 609-936-3733
 w...@dataanalyticscorp.com
 www.dataanalyticscorp.com
 
 _
 
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] ASCI characters

2010-08-15 Thread Matt Shotwell
How about:

> rawToChar(as.raw(82))
[1] "R"
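
The same call applied to several codes, plus the reverse direction, as a short sketch (the variable names are illustrative):

```r
# Character from code, applied to several codes at once.
codes <- c(34, 92, 82)
chars <- vapply(codes, function(k) rawToChar(as.raw(k)), character(1))

# Code from character: charToRaw() gives the byte, as.integer() the number.
code_R <- as.integer(charToRaw("R"))

chars
code_R
```

Codes 34 and 92 are the double quote and the backslash, which R prints in escaped form.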

-Matt

On Sun, 2010-08-15 at 19:50 -0400, Orvalho Augusto wrote:
 Hello guys!
 
 Is there any function that permits me to get an ASCII character from its
 code? E.g. ascifunction(34) would give me "
 or ascifunction(92) gives \
 
 Thanks
 Caveman
 
   [[alternative HTML version deleted]]
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Numerical Methods Course

2010-08-10 Thread Matt Shotwell
TGS,

Given that you have to pay an outrageous $155.86 for that book, it seems
reasonable to look for a free environment for numerical computing (like
R!). If your instructor says that such a variety of programming
languages would work, you could probably make a good argument to use R.
But why not just ask your instructor?

If your instructor insists on MATLAB, you could also consider using GNU
Octave, a free MATLAB clone.

-Matt

On Tue, 2010-08-10 at 10:55 -0400, TGS wrote:
 I want to take this numerical methods course where the text is 
 http://www.amazon.com/Numerical-Methods-J-Douglas-Faires/dp/0534407617 . The 
 instructor recommends MATLAB, but states Fortran, C, Mathematica, or Maple 
 will also do the job.
 
 Will R do the job as well?
 
 If not, where do you think it will be lacking in the context of this 
 book/course.

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Good Book To Work Through This Summer

2010-08-09 Thread Matt Shotwell
There are some book-length documents (downloadable for free) at the
contributed documentation section of the R project website here:

http://cran.r-project.org/other-docs.html

In particular, the book “Practical Regression and Anova using R” by
Julian Faraway looks to have the content you want, though I haven't read
it myself. There are other high quality authors in the list also.

-Matt


On Mon, 2010-08-09 at 03:20 -0400, Ondrej Vozar wrote:
 Hello,
 I think that good introduction for application oriented people is book of
 Peter Dalgaard, Introductory Statistics with R
 http://www.springer.com/statistics/computanional+statistics/book/978-0-387-79053-4
 This book is good for mastering basics of R.
 
 Another book I like is John Fox's An R and S-PLUS Companion to Applied
 Regression
 http://socserv.socsci.mcmaster.ca/jfox/Books/Companion/index.html
 
 But there are dozens of books on this topic.
 Best regards,
 Ondrej Vozar.
 
 On 9 August 2010 06:38, TGS cran.questi...@gmail.com wrote:
 
  Dear R users,
 
  I'm hoping to get a few suggestions about which books are good to follow
  along and learn R.
 
  I'm hoping to spend the summer going through a good R book as it is applied
  in linear regression.
 
  Thanks!
 
 
   [[alternative HTML version deleted]]
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Changing downloaded source code into a package

2010-08-09 Thread Matt Shotwell
See comments below.

On Mon, 2010-08-09 at 10:22 -0400, JH wrote:
 I am wanting to change some lines of code in the R package named nlme
 http://cran.r-project.org/web/packages/nlme/index.html
 To do this I have downloaded the Package source named nlme_3.1-96.tar.gz,
 opened up the file and changed the text documents within the folder named R,
 specifically the cor.Struct.txt file.

I couldn't find this file. Do you mean corStruct.R, or maybe
corStruct.c?

 I now want to know how can I use this modified nlme_3.1-96.tar.gz file in R
 2.10. How do I convert this source code into a package?

The source code, along with the documentation, data files, etc. _is_ the
package. When the package contains source code from a compiled language
(C or Fortran), as nlme does, this code must be compiled for your
platform before the package is installed. The CRAN maintainers kindly
pre-compile this code for Windows and Mac OS X users. If you make
modifications to C or Fortran code in a package, you must re-compile the
code yourself, or use a service such as R-Forge.

The R manual `Writing R Extensions` is the standard reference for
packages. See also the `R Installation and Administration` manual. See
the information here

http://www.murdoch-sutherland.com/Rtools/

for compiling package code in Windows. Lastly, before you follow the
instructions at the URL above, I urge you to consider GNU Linux as a
platform for programming. I've found the tools available in standard GNU
Linux distributions (such as that available at http://www.debian.org)
much simpler to install and work with.

 
 
 I have looked on the internet and tried using cmd.exe  then the code 
 Rmcd.exe INSTALL -1 ~/nlme_3.1-96.tar.gz 
 I end up getting the message The system can't find the specified path,
 when I have the file in the directory that Rmcd.exe is in.
 
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] grep with search terms defined by a variable

2010-08-02 Thread Matt Shotwell
Daniel,

If you want to search for each term at the beginning of a string, using
the regular expression construct '^', you might use the following

> search.terms <- c("Emil", "Meryl")
> names <- c("Emil Jannings",
+            "Charles Chaplin",
+            "Katherine Hepburn",
+            "Meryl Streep")
> for(term in search.terms) {
+     print(grep(paste("^",term,sep=""),names))
+ }
[1] 1
[1] 4
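
The loop can also be written without an explicit for, as in this sketch. The vector is called actors here only to avoid masking the base function names(); otherwise the data are the same as in the transcript.

```r
search.terms <- c("Emil", "Meryl")
actors <- c("Emil Jannings", "Charles Chaplin",
            "Katherine Hepburn", "Meryl Streep")

# Build one anchored pattern per term, then apply grep() to each pattern.
patterns <- paste0("^", search.terms)
hits <- sapply(patterns, grep, x = actors)

hits
```

The result is named by pattern. If a term matches nowhere, or more than once, sapply() returns a list rather than a vector. Terms containing regular expression metacharacters would need escaping first.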

-Matt


On Tue, 2010-08-03 at 00:05 -0400, Daniel Malter wrote:
 Hi, I have a good grasp of grep() and gsub() for finding and extracting
 character strings. However, I cannot figure out how to use a search term
 that is stored in a variable when the search string is more complex.
 
 #Say I have a string, and want to know whether the last name Jannings is
 in the string. This is done by
 
 names=c("Emil Jannings")
 grep("Emil",names)
 
 #Yet, I need to store the search terms in a variable, which works for the
 very simple example
 
 search.term="Emil"
 grep(search.term,names)
 
 #but I cannot get it to work for the more difficult example in which I want
 to do something like
 
 grep(^search.term,names)
 grep(^search.term,names)
 grep(^search.term,names)
 
 #Implying that the search term must be the first part of the string that is
 being searched
 
 #Ultimately, I need to to loop over several strings stored in search.term,
 for example,
 
 names=c("Emil Jannings","Charles Chaplin","Katherine Hepburn","Meryl Streep")
 search.term=c("Emil","Meryl")
 
 for(i in 1:length(names)){
  print(grep(^search.term[i],names))
 }
 
 So the questions I have are two. 1. How do I concatenate terms that I would
 normally quote (like ^) with variables that contain search terms and that
 normally would not be quoted? 2. How do I run this over indices of the
 variable that contains the search terms?
 
 I greatly appreciate any help,
 Daniel
 
 
 
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] Meaning of following function

2010-08-01 Thread Matt Shotwell
Ron,

In arithmetic, '-' and '+' are binary _and_ unary operators. That is,
both -1 and 1-1 are valid arithmetic expressions, the former negates its
argument, and the latter subtracts the second from the first. Since much
of R is designed to do arithmetic, R honors the unary _and_ binary versions
of '-' and '+'. The implementation of `-`() performs negation when the
second argument is missing, and subtraction when both arguments are
present. AFAIR, the only other unary (but never binary) operator in R is
'!', or the 'NOT' operator (maybe also the one-sided formula operator
'~').

In contrast, the 'times' or 'multiply' operator '*' is generally a
binary operator in arithmetic. Hence, the function `*`() requires two
arguments.
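
The behaviour described above can be checked directly by calling the operators as functions. A brief sketch follows; the tryCatch() wrapper is only there so that the error from the one-argument multiply does not stop the script.

```r
sub_result <- `-`(3, 1)    # binary minus: 3 - 1
neg_result <- `-`(3)       # unary minus: negation
not_result <- `!`(TRUE)    # logical NOT, a unary operator

# `*` has no unary form, so a single argument raises an error.
mul_result <- tryCatch(`*`(3), error = function(e) "error")

c(sub_result, neg_result)
not_result
mul_result
```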

-Matt



On Sun, 2010-08-01 at 10:56 -0400, Ron Michael wrote:
 Hi friends, I am aware of the function "-"() which acts as minus in ordinary 
 computations. For example:
  
  > "-"(3, 1)
 [1] 2
 
 However what is the meaning of 
  > "-"(3)
 [1] -3
 
 I was expecting R to generate some error as it does for "*"(3). What is the 
 logic for that calculation?
  
 Thanks,
 
 
   [[alternative HTML version deleted]]
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] how to code it??

2010-07-28 Thread Matt Shotwell
If I take your meaning correctly, you want something like this.
> x <- c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 
+ 1)
> easy <- function(x) {
+     state <- 0
+     for (i in 1:length(x)) {
+         if (x[i] == 0) 
+             x[i] <- state
+         state <- 0
+         if (x[i] == 1) 
+             state <- -1
+     }
+     x
+ }
> easy(x)
 [1]  0  0  0  0  0  0  0  1 -1  0  1  1 -1  0  1 -1  0  0  1 
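
For a long vector, the same result can be had without a loop. The sketch below assumes the rule that easy() implements (a 0 becomes -1 exactly when the element before it is 1) and a non-empty input; the function name mark_after_one is made up for illustration.

```r
x <- c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1)

# Hypothetical vectorized equivalent of the loop: a 0 becomes -1 exactly
# when the element before it is 1.
mark_after_one <- function(x) {
  prev <- c(0, x[-length(x)])   # previous element; nothing precedes the first
  x[x == 0 & prev == 1] <- -1
  x
}

mark_after_one(x)
```

If the actual rule for choosing which 0 to change is more involved than "the first 0 after a 1", the loop version is easier to adapt.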

-Matt


On Wed, 2010-07-28 at 14:10 -0400, Raghu wrote:
 Hi
 
 I have say a large vector of 3500 digits. Initially the digits are 0s and
 1s. I need to check for a rule to change some of the 0s to -1s in this
 vector. But once I change a 0 to -1 then I need to start applying the rule
 to change the next 0 only after I see the next 1 in the vector.
 
 Say for example x = (0,0,0,0,0,0,0,1,0,0,1,1,0,0,1,0,0,0,1)
 I need to traverse from the 9th element to the last ( because the first
 occurrence of 1 is at 8) . Let us assume that according to our rule we
 change the 13th element (only 0s can be changed) to -1. Now we need to go to
 the next occurrence of 1 (which is 15) and begin the rule application from
 the 16th till the end of the vector and once replaced a 0 to a -1 then start
 again from the next 1. How do we code this? I 'feel' recursion is the best
 possible solution but I am not a programmer and will await experts' views.
 If this is not a typical R-forum question then my advance apologies.
 
 Many thx

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] what is a vignette?

2010-07-26 Thread Matt Shotwell
Alex, 

Vignettes are optional supplemental documentation. That is, they are in
addition to the required boilerplate documentation for R functions and
datasets. Vignettes are written in the spirit of sharing knowledge, and
assisting new users in learning the purpose and use of a package.

Maybe the best place to start is simply to read one, or a few. The `zoo`
package has a few, for example here:

http://cran.r-project.org/web/packages/zoo/index.html

The technical details of vignettes, and how to write one are contained
in the `Writing R Extensions` manual:

http://cran.r-project.org/manuals.html

-Matt

On Mon, 2010-07-26 at 07:55 -0400, Alaios wrote:
 I am trying to find a simple R guide that explain what a vignette is but so 
 far 
 I didnt make any progress. I tried to search inside R's built in help.start() 
 but it only returns results how to see vignettes.
 
 So could you please tell me what a vignette is and if you can also could you 
 give some simple guide that I can always use to read about these things?
 
 Best Regards
 Alex
 

-- 
Matthew S. Shotwell
Graduate Student 
Division of Biostatistics and Epidemiology
Medical University of South Carolina



Re: [R] sink function

2010-07-23 Thread Matt Shotwell
I had addressed a problem similar to this only a few days ago. Please
see the following URL:

http://tolstoy.newcastle.edu.au/R/e11/help/10/07/1677.html

On Fri, 2010-07-23 at 08:45 -0400, nuncio m wrote:
 I have the following code to write the output from auto.arima function.  The
 issue is not in finding the model but to divert its out put
 fit to a file order_fit.txt. code runs but nothing is written to
 order_fit.txt
 where am I going wrong
 
 library(forecast)
 for (i in 1:2) {
 filen = paste("file",i,".txt",sep="")
 data <- read.table(filen)
 dat1 <- data[,1]
 xt <- ts(dat1,start=c(1978,11),end=c(2006,12),frequency=12)
 #dat1[dat1 == -99.989998] <- NA
 if (min(dat1) != max(dat1)){
 fit <- auto.arima(xt,D=1)
 
 *sink(file="order_fit.txt")
 fit
 sink()*
 
 residfit <- residuals(fit)
 filenou1 = paste("fileree",i,"_out",".txt",sep="")
 residfit
 write.table(residfit,filenou1,sep="\t",col.names=FALSE,row.names=FALSE,quote=FALSE)
 
 }else{
 *fiit <- "ARIMA(-6,-6,-6)(-6,-6,-6)[12]"
 sink(file="order_fit.txt")
 fiit
 sink()*
 filenou1 = paste("fileree",i,"_out",".txt",sep="")
 residfit=rep(-99.99,338)
 residfit
 write.table(residfit,filenou1,sep="\t",col.names=FALSE,row.names=FALSE,quote=FALSE)
 rm(data,dat1,residfit,xt)
 }
 }
 

--



Re: [R] Help with Sink Function

2010-07-16 Thread Matt Shotwell
Your code between calls to sink() does not generate any output. Hence,
nothing will be diverted to the file. To illustrate this point,
consider 

for(i in 1:10) i

This produces no output. However, 

for(i in 1:10) print(i)

produces output as expected.
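
The effect on sink() can be seen by diverting both versions to a file and counting the lines that arrive. This is a small sketch using a temporary file; the variable names are illustrative.

```r
out <- tempfile(fileext = ".txt")

# Without print(): values inside a for loop are not auto-printed,
# so nothing reaches the file.
sink(out)
for (i in 1:3) i
sink()
n_silent <- length(readLines(out))

# With print(): each value is written to the diverted output.
sink(out)
for (i in 1:3) print(i)
sink()
n_printed <- length(readLines(out))

c(n_silent, n_printed)
```

The same applies inside if blocks and functions: an expression that would print at the top-level prompt needs an explicit print() or cat() there.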

-Matt

On Fri, 2010-07-16 at 13:34 -0400, Addi Wei wrote:
 Sorry about that. Still new to this...  The code below should be
 reproducible. All R2 should just be 1, and I should write 1 to
 R2outputKKNN.txt 10 times... nothing is happening.  Appreciate the efforts
 to help! 
 
 for (i in 1:10)
 {
   adata = 1:5
   bdata = 6:10
   lm <- lm(adata~bdata)
   slm <- summary(lm)
 str(slm)
 
if (i == 1) {   
   previousR2 <- slm$r.squared
sink(file="R2outputKKNN.txt", append=TRUE)
previousR2  
sink()  }   else if(i!=1)
{
currentR2 <- slm$r.squared
if (previousR2  currentR2)
{
currentR2 <- previousR2
}  
if (previousR2  currentR2) {
sink(file="R2outputKKNN.txt", append=TRUE)
currentR2
sink()  
}
}
 }
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com



Re: [R] calling a c function from R

2010-07-14 Thread Matt Shotwell
Fahim, 

Please see the "Writing R Extensions" manual
http://cran.r-project.org/doc/manuals/R-exts.pdf

There are simple instructions in this document under the heading "System
and foreign language interfaces".

-Matt


On Wed, 2010-07-14 at 01:21 -0400, Fahim Md wrote:
 Hi,
 I am trying to call a C function, that I wrote to parse a flat file,  into
 R. The argument that will go into this function is an input file that I need
 to parse and write the desired output in an output file.  I used some hit
 and trial approach but i keep on getting the file not found or
 segmentation fault error. I know that the error is in passing the argument
 but I could not solve it.
 
 After reading  some of the tutorials, I understood how to do this if the
 arguments are integers or floats. I am stuck when i am trying to send the
 files. I am attaching stub of each file.
 Help appreciated.
 Thanks
 
 ---
 My function call would be:
 source("parse.R")
 parseGBest('./gbest/inFile.seq',   './gbest/outFile.out');
 ---
 I wrote a wrapper function (parse.R) as follows:
 
 dyn.load("parse.so");
 parseGBest = function(inFile, outFile)
 {
 .C( "parse" , inFile , outFile);
 }
 
 How to write receive the filenames in function( , ) above. and how to call
 .C
 
 
 parse.c file is as below:  How to receive the argument in funcion and how to
 make it compatible with my argv[ ].
 
 
 void parse( int argc, char *argv[] )  // This is working as standalone C
                                       // program. How to receive the above
                                       // files so that it become compatible
                                       // with my argv[ ]
 {
 
 FILE *fr, *of;
 char line[81];
 
 
  if ( *argc == 3 )*/
 {
 if ( ( fr = fopen( argv[0], "r" )) == NULL )
 {
 puts( "Can't open input file.\n" );
 exit( 0 );
 }
 if ( ( of = fopen( argv[1], "w" )) == NULL )
 {
 puts( "Output file not given.\n" );
 }
   }
 else
 {printf("wrong usage: Try Agay!!! correct usage is:=  functionName inputfileToParse outFileToWriteInto\n");
 }
 while(fgets(line, 81, fr) != NULL)
 
 --
 ---
 --
 }
 
 
 
 Thanks again
 Fahim
 
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com



Re: [R] Fast string comparison

2010-07-13 Thread Matt Shotwell
On Tue, 2010-07-13 at 01:42 -0400, Hadley Wickham wrote:
 strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = ""))
 system.time(strings[-1] == strings[-1e5])
 #   user  system elapsed
 #  0.016   0.000   0.017
 
 So it takes ~1/100 of a second to do ~100,000 string comparisons. You
 need to provide a reproducible example that illustrates why you think
 string comparisons are slow.

Here's a vectorized alternative to '==' for strings, with minimal
argument checking or result conversion. I haven't looked at the
corresponding R source code, it may be similar:

library(inline)
code <- "
SEXP ans;
int i, len, *cans;
if(!isString(s1) || !isString(s2))
    error(\"invalid arguments\");
len = length(s1)>length(s2)?length(s2):length(s1);
PROTECT(ans = allocVector(INTSXP, len));
cans = INTEGER(ans);
for(i = 0; i < len; i++)
    cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),\
                     CHAR(STRING_ELT(s2,i)));
UNPROTECT(1);
return ans;
"
sig <- signature(s1="character", s2="character")
strcmp <- cfunction(sig, code)


> system.time(strings[-1] == strings[-1e5])
   user  system elapsed 
  0.036   0.000   0.035 
> system.time(strcmp(strings[-1], strings[-1e5]))
   user  system elapsed 
  0.032   0.000   0.034 

That's pretty fast, though I seem to be working with a slower system
than Hadley. It's hard to see how this could be improved, except maybe
by caching results of string comparisons. 

-Matt

 
 Hadley
 
 
 On Tue, Jul 13, 2010 at 6:52 AM, Ralf B ralf.bie...@gmail.com wrote:
  I am asking this question because String comparison in R seems to be
  awfully slow (based on profiling results) and I wonder if perhaps '=='
  alone is not the best one can do. I did not ask for anything
  particular and I don't think I need to provide a self-contained source
  example for the question. So, to re-phrase my question, are there more
  (runtime) effective ways to find out if two strings (about 100-150
  characters long) are equal?
 
  Ralf
 
 
 
 
 
 
  On Sun, Jul 11, 2010 at 2:37 PM, Sharpie ch...@sharpsteen.net wrote:
 
 
  Ralf B wrote:
 
  What is the fastest way to compare two strings in R?
 
  Ralf
 
 
  Which way is not fast enough?
 
  In other words, are you asking this question because profiling showed one 
  of
  R's string comparison operations is causing a massive bottleneck in your
  code? If so, which one and how are you using it?
 
  -Charlie
 
  -
  Charlie Sharpsteen
  Undergraduate-- Environmental Resources Engineering
  Humboldt State University
  --
  View this message in context: 
  http://r.789695.n4.nabble.com/Fast-string-comparison-tp2285156p2285409.html
  Sent from the R help mailing list archive at Nabble.com.
 
 
 
 
 
 
 
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com



Re: [R] Fast string comparison

2010-07-13 Thread Matt Shotwell
Good idea Romain, there is quite a bit of type testing in the function
versions of STRING_ELT and CHAR, not to mention the function call
overhead. Since the types are checked explicitly, I believe this
function is safe. All together now...

> system.time(strings[-1] == strings[-1e5])
   user  system elapsed 
  0.032   0.000   0.035 
> system.time(strcmp(strings[-1], strings[-1e5]))
   user  system elapsed 
  0.032   0.000   0.034 
> system.time(strcmp2(strings[-1], strings[-1e5]))
   user  system elapsed 
  0.024   0.000   0.026 
> system.time(lhs==rhs)
   user  system elapsed 
  0.012   0.000   0.013 
> system.time(strcmp(lhs, rhs))
   user  system elapsed 
  0.012   0.000   0.011 
> system.time(strcmp2(lhs, rhs))
   user  system elapsed 
  0.004   0.000   0.004

It looks like you can squeeze out more speed using the macro versions of
STRING_ELT and CHAR.
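
Romain's point about subscripting can be checked in plain R, without any compiled code. The following is only a rough sketch: the test set is ten times smaller than the one in the thread so that it runs quickly, and each timing is repeated 20 times because a single pass is close to the timer resolution. The names t_subscript, t_compare and eq are made up for illustration.

```r
# Random 100-letter strings, as in the thread but fewer of them.
set.seed(1)
strings <- replicate(1e4, paste(sample(letters, 100, replace = TRUE),
                                collapse = ""))

# Subscript once, outside the timed region.
lhs <- strings[-1]
rhs <- strings[-length(strings)]

# Cost of the two negative subscripts alone.
t_subscript <- system.time(for (k in 1:20) {
  strings[-1]
  strings[-length(strings)]
})

# Cost of the vectorized comparison alone.
t_compare <- system.time(for (k in 1:20) eq <- lhs == rhs)

# user, system and elapsed times side by side.
print(rbind(subscript = t_subscript, compare = t_compare)[, 1:3])
```

If the two rows are of similar size, a benchmark that subscripts inside system.time() is measuring the copying at least as much as the comparison.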

On Tue, 2010-07-13 at 09:48 -0400, Romain Francois wrote:
 Hi Matt,
 
 I think there are some confusing factors in your results.
 
 
 system.time(strcmp(strings[-1], strings[-1e5]))
 
 would also include the time required to perform both subscripting 
 (strings[-1] and strings[-1e5] ) which actually takes some time.
 
 
 Also, you do have a bit of overhead due to the use of STRING_ELT and the 
 write barrier.
 
 
 I've include below a version that uses R internals so that you get the 
 fast (but you have to understand the risks, etc ...) version of 
 STRING_ELT using the plugin system of inline.
 
 library(inline)
 code <- "
  SEXP ans;
  int i, len, *cans;
  if(!isString(s1) || !isString(s2))
  error(\"invalid arguments\");
  len = length(s1)>length(s2)?length(s2):length(s1);
  PROTECT(ans = allocVector(INTSXP, len));
  cans = INTEGER(ans);
  for(i = 0; i < len; i++)
  cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),\
   CHAR(STRING_ELT(s2,i)));
  UNPROTECT(1);
  return ans;
 "
 sig <- signature(s1="character", s2="character")
 strcmp <- cfunction(sig, code)
 
 strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = ""))
 
 
 lhs <- strings[-1]
 rhs <- strings[-1e5]
 system.time( lhs == rhs )
 system.time(strcmp( lhs, rhs) )
 
 library(inline)
 settings <- getPlugin( "default" )
 settings$includes <- paste( "#define USE_RINTERNALS", settings$includes, 
 collapse = "\n" )
 code2 <- "
  SEXP ans;
  int i, len, *cans;
  if(!isString(s1) || !isString(s2))
  error(\"invalid arguments\");
  len = length(s1)>length(s2)?length(s2):length(s1);
  PROTECT(ans = allocVector(INTSXP, len));
  cans = INTEGER(ans);
  for(i = 0; i < len; i++)
  cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),\
   CHAR(STRING_ELT(s2,i)));
  UNPROTECT(1);
  return ans;
 "
 sig <- signature(s1="character", s2="character" )
 strcmp2 <- cxxfunction(sig, code2, settings = settings)
 system.time(strcmp2( lhs, rhs) )
 
 
 
 I get:
 
 $ Rscript strings.R
 Le chargement a nécessité le package : methods
 utilisateur système  écoulé
0.002   0.000   0.002
 utilisateur système  écoulé
0.004   0.000   0.005
 utilisateur système  écoulé
0.003   0.000   0.003
 
 Romain
 
 
 Le 13/07/10 15:24, Matt Shotwell a écrit :
 
  On Tue, 2010-07-13 at 01:42 -0400, Hadley Wickham wrote:
  strings- replicate(1e5, paste(sample(letters, 100, rep = T), collapse =  
  ))
  system.time(strings[-1] == strings[-1e5])
  #   user  system elapsed
  #  0.016   0.000   0.017
 
  So it takes ~1/100 of a second to do ~100,000 string comparisons. You
  need to provide a reproducible example that illustrates why you think
  string comparisons are slow.
 
  Here's a vectorized alternative to '==' for strings, with minimal
  argument checking or result conversion. I haven't looked at the
  corresponding R source code, it may be similar:
 
  library(inline)
  code- 
   SEXP ans;
   int i, len, *cans;
   if(!isString(s1) || !isString(s2))
   error(\invalid arguments\);
   len = length(s1)length(s2)?length(s2):length(s1);
   PROTECT(ans = allocVector(INTSXP, len));
   cans = INTEGER(ans);
   for(i = 0; i  len; i++)
   cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),\
CHAR(STRING_ELT(s2,i)));
   UNPROTECT(1);
   return ans;
  
  sig- signature(s1=character, s2=character)
  strcmp- cfunction(sig, code)
 
 
  system.time(strings[-1] == strings[-1e5])
  user  system elapsed
 0.036   0.000   0.035
  system.time(strcmp(strings[-1], strings[-1e5]))
  user  system elapsed
 0.032   0.000   0.034
 
  That's pretty fast, though I seem to be working with a slower system
  than Hadley. It's hard to see how this could be improved, except maybe
  by caching results of string comparisons.
 
  -Matt
 
 
  Hadley
 
 
  On Tue, Jul 13, 2010 at 6:52 AM, Ralf Bralf.bie...@gmail.com  wrote:
  I am asking this question because String comparison in R seems to be
  awfully slow (based

Re: [R] Compress string memCompress/Decompress

2010-07-11 Thread Matt Shotwell
On Fri, 2010-07-09 at 20:02 -0400, Erik Wright wrote:
 Hi Matt,
 
 This works great, thanks!
 
 At first I got an error message saying BLOB is not implemented in RSQLite.  
 When I updated to the latest version it worked.

SQLite began to support BLOBs from version 3.0.

 
 Is there any reason the string needs to be stored as type BLOB?  It seems to 
 work the same when I swap BLOB with TEXT in the CREATE TABLE command.

SQLite has a dynamic-type system. That is, data types are associated
with values rather than with their container (column). This means that
most columns in a table can store more than just the type (or
'affinity') it is declared with. I think that's what happens when you
use TEXT rather than BLOB. If you use something like x'A9' to insert
data into a column with TEXT affinity, I believe it is stored as a BLOB
regardless.

-Matt

 Thanks again!,
 Erik
 
 
 
 On Jul 9, 2010, at 3:21 PM, Matt Shotwell wrote:
 
  Erik, 
  
  Can you store the data as a blob? For example:
  
  > #create string, compress with gzip, convert to SQLite blob string
  > string <- "gzip this string, store as blob in SQLite database"
  > string.gz <- memCompress(string, type="gzip")
  > string.sqlite <- paste("x'",paste(string.gz,collapse=""),"'",sep="")
  
  > #create database and table with a BLOB column
  > library(RSQLite)
  Loading required package: DBI
  > con <- dbConnect(dbDriver("SQLite"), "compress.sqlite")
  > dbGetQuery(con, "CREATE TABLE Compress (id INTEGER, data BLOB);")
  NULL
  
  > #insert the string as a blob
  > query <- paste("INSERT INTO Compress (id, data) VALUES (1, ", 
  + string.sqlite, ");", sep="")
  > dbGetQuery(con, query)
  NULL
  
  > #recover the blob, decompress, and convert back to a string
  > result <- dbGetQuery(con, "SELECT data FROM Compress;")
  > string.gz <- result[[1]][[1]]
  > string <- memDecompress(string.gz, type="gzip")
  > rawToChar(string)
  [1] "gzip this string, store as blob in SQLite database"
  
  
  -Matt
  
  
  
  On Fri, 2010-07-09 at 12:51 -0400, Erik Wright wrote:
  Hello,
  
  I would like to compress a long string (character vector), store the 
  compressed string in the text field of a SQLite database (using RSQLite), 
  and then load the text back into memory and decompress it back into the 
  the original string.  My character vector can be compressed considerably 
  using standard gzip/bzip2 compression.  In theory it should be much faster 
  for me to compress/decompress a long string than to write the whole string 
  to the hard drive and then read it back (not to mention the saved hard 
  drive space).
  
  I have tried accomplishing this task using memCompress() and 
  memDecompress() without success.  It seems memCompress can only convert a 
  character vector to raw type which cannot be treated as a string.  Does 
  anyone have ideas on how I can go about doing this, especially using the 
  standard base packages?
  
  Thanks!,
  Erik
  
  
  sessionInfo()
  R version 2.11.0 (2010-04-22) 
  x86_64-apple-darwin9.8.0 
  
  locale:
  [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
  
  attached base packages:
  [1] stats graphics  grDevices utils datasets  methods   base 
  
  loaded via a namespace (and not attached):
  [1] tools_2.11.0
  
  -- 
  Matthew S. Shotwell
  Graduate Student
  Division of Biostatistics and Epidemiology
  Medical University of South Carolina
  http://biostatmatt.com
  
 
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com



Re: [R] Compress string memCompress/Decompress

2010-07-09 Thread Matt Shotwell
Erik, 

Can you store the data as a blob? For example:

> #create string, compress with gzip, convert to SQLite blob string
> string <- "gzip this string, store as blob in SQLite database"
> string.gz <- memCompress(string, type="gzip")
> string.sqlite <- paste("x'",paste(string.gz,collapse=""),"'",sep="")

> #create database and table with a BLOB column
> library(RSQLite)
Loading required package: DBI
> con <- dbConnect(dbDriver("SQLite"), "compress.sqlite")
> dbGetQuery(con, "CREATE TABLE Compress (id INTEGER, data BLOB);")
NULL

> #insert the string as a blob
> query <- paste("INSERT INTO Compress (id, data) VALUES (1, ", 
+ string.sqlite, ");", sep="")
> dbGetQuery(con, query)
NULL

> #recover the blob, decompress, and convert back to a string
> result <- dbGetQuery(con, "SELECT data FROM Compress;")
> string.gz <- result[[1]][[1]]
> string <- memDecompress(string.gz, type="gzip")
> rawToChar(string)
[1] "gzip this string, store as blob in SQLite database"
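
The compress / hex-literal / decompress round trip can also be checked with base R alone, before a database is involved. A minimal sketch follows; the names hex, blob_literal, bytes and restored are made up for illustration, and the hex decoding step stands in for what SQLite does when it parses an x'...' literal.

```r
string <- "gzip this string, store as blob in SQLite database"

# Compress to a raw vector, then spell the bytes as an SQL hex literal.
string.gz <- memCompress(charToRaw(string), type = "gzip")
hex <- paste(as.character(string.gz), collapse = "")
blob_literal <- paste0("x'", hex, "'")

# Reverse direction: cut the hex digits into byte pairs and rebuild raw.
starts <- seq(1, nchar(hex), by = 2)
bytes <- as.raw(strtoi(substring(hex, starts, starts + 1), base = 16L))

# Decompress and convert back to a character string.
restored <- rawToChar(memDecompress(bytes, type = "gzip"))
stopifnot(identical(restored, string))
```

If this passes but the database round trip fails, the problem is in how the literal is inserted or fetched rather than in the compression.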


-Matt



On Fri, 2010-07-09 at 12:51 -0400, Erik Wright wrote:
 Hello,
 
 I would like to compress a long string (character vector), store the 
 compressed string in the text field of a SQLite database (using RSQLite), and 
 then load the text back into memory and decompress it back into the the 
 original string.  My character vector can be compressed considerably using 
 standard gzip/bzip2 compression.  In theory it should be much faster for me 
 to compress/decompress a long string than to write the whole string to the 
 hard drive and then read it back (not to mention the saved hard drive space).
 
 I have tried accomplishing this task using memCompress() and memDecompress() 
 without success.  It seems memCompress can only convert a character vector to 
 raw type which cannot be treated as a string.  Does anyone have ideas on how 
 I can go about doing this, especially using the standard base packages?
 
 Thanks!,
 Erik
 
 
  sessionInfo()
 R version 2.11.0 (2010-04-22) 
 x86_64-apple-darwin9.8.0 
 
 locale:
 [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base 
 
 loaded via a namespace (and not attached):
 [1] tools_2.11.0
 
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com



Re: [R] Calling Gnuplot from R

2010-07-08 Thread Matt Shotwell
I recently wrote a small R function to draw simple ASCII scatterplots.
http://biostatmatt.com/archives/491
Bill Harris commented that the plots reminded him of the dumb terminal
of Gnuplot. I think it would be really neat to have an R graphics driver
to Gnuplot in order to generate more complete ASCII graphics in R. Maybe
there are other good reasons also? I believe octave makes good use of
Gnuplot...

-Matt

On Thu, 2010-07-08 at 11:28 -0400, Erik Iverson wrote:
 If you use Emacs, you can use org-mode with org-babel to facilitate 
 this... I'll refrain from asking why :).
 
 See: http://orgmode.org/worg/org-contrib/babel/index.php
 
 Christopher Desjardins wrote:
  Hi,
  I am wondering if there is a way to call Gnuplot from R and/or if anyone can
  recommend a package on CRAN capable of doing this?
  Thanks,
  Chris
  PS - Please cc me on the response.
  
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com



Re: [R] Plot with whispers

2010-07-05 Thread Matt Shotwell
It looks like read.table is reading the first line as a data value,
which is the default for read.table. Try using read.table with the
argument header=TRUE. Also, consider using a box and whiskers plot for
these data (?boxplot, ?lattice::bwplot).
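
If mean and standard-deviation whiskers are wanted rather than a box plot, something along these lines might do. It is a sketch only: the two log files are simulated stand-ins written to a temporary directory, the file names and values are made up, and summarise_file is a hypothetical helper. Only the Send line is drawn; Receive and Total could be added the same way with lines() and arrows().

```r
# Write two small stand-in files so the example is self-contained.
dir <- tempdir()
set.seed(42)
files <- file.path(dir, sprintf("base%d.log", c(100, 200)))
for (f in files) {
  write.table(data.frame(Send = rnorm(100, 10, 2),
                         Receive = rnorm(100, 100, 5)),
              f, row.names = FALSE, quote = FALSE)
}

# header=TRUE makes "Send Receive" the column names instead of a data row.
summarise_file <- function(f) {
  x <- read.table(f, header = TRUE)
  x$Total <- x$Send + x$Receive
  c(mean = colMeans(x), sd = sapply(x, sd))
}

# One row per file; columns mean.Send, ..., sd.Total.
stats <- t(sapply(files, summarise_file))

# Mean of Send per file, with whiskers of one standard deviation.
lo <- stats[, "mean.Send"] - stats[, "sd.Send"]
hi <- stats[, "mean.Send"] + stats[, "sd.Send"]
idx <- seq_len(nrow(stats))
pdf(file.path(dir, "send_receive.pdf"))
plot(idx, stats[, "mean.Send"], type = "b", ylim = range(lo, hi),
     xlab = "file", ylab = "mean Send +/- sd")
arrows(idx, lo, idx, hi, angle = 90, code = 3, length = 0.05)
invisible(dev.off())
```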

-Matt

On Mon, 2010-07-05 at 12:08 -0400, Ian Bentley wrote:
 Hello!
 
 I need to make a plot with whispers that does the following.
 
 Reads in 50 files, each file containing 200 data points.  A file looks like
 this:
 base100.log
 Send Receive
 10.5   100.3
 15.0   102.4
 ...
 
 There are 100 lines, each with two data points.  I need to read in the 50
 files, and plot three lines
 
 The first line is the mean of the send column with whiskers indicating
 standard deviation  (Each file represents one data point)
 
 The second line is the mean of the receive column, as above.
 
 the final plot is the mean of the two summed, with whiskers as above.
 
 There will be 50 data points on the final graph, one for each file.
 
 I've done this sort of a thing before, but I really can't figure out how to
 handle the different Columns.
 
 If I use read.table:
 
 x1 <- read.table("updateToSink1010.log")
 
 then x1 becomes a matrix, with two columns and 101 rows.  -- including Send,
 Receive.
 
 Anyways, I'd appreciate a push in some direction - hopefully the right one
 :).
 
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com



Re: [R] Function to compute the multinomial beta function?

2010-07-05 Thread Matt Shotwell
How about this?

mbeta <- function(...) { 
    exp(sum(lgamma(c(...)))-lgamma(sum(c(...))))
}

> gamma(5)*gamma(6)*gamma(7)/gamma(18)
[1] 5.829838e-09
> mbeta(5,6,7)
[1] 5.829838e-09
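
For large parameters the value itself underflows, so it may be more useful to stay on the log scale. A small variant, where the name lmbeta is made up here and is not a function from any package:

```r
# Log of the multinomial beta function: sum(lgamma(a)) - lgamma(sum(a)).
lmbeta <- function(...) {
  a <- c(...)
  sum(lgamma(a)) - lgamma(sum(a))
}

small <- exp(lmbeta(5, 6, 7))        # same quantity as mbeta(5, 6, 7)
direct <- gamma(5) * gamma(6) * gamma(7) / gamma(18)
stopifnot(isTRUE(all.equal(small, direct)))

big <- lmbeta(500, 600, 700)         # finite on the log scale
stopifnot(is.finite(big))
```

Log-likelihoods for Dirichlet models usually want the log normalizing constant anyway, so the exp() can often be dropped altogether.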



On Mon, 2010-07-05 at 17:10 -0400, Gregory Gentlemen wrote:
 Dear R-users,
 
 Is there an R function to compute the multinomial beta function? That is, the 
 normalizing constant that arises in a Dirichlet distribution. For example, 
 with three parameters the beta function is Beta(n1,n2,n3) = 
 Gamma(n1)*Gamma(n2)*Gamma(n3)/Gamma(n1+n2+n3)
 
 Thanks in advance for any assistance.
 
 Regards,
 Greg
 
 
 
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com



Re: [R] left end or right end

2010-07-01 Thread Matt Shotwell
Suku, 

It looks like you might want to consult with a [bio]statistician, but
I'm interested in what these distances represent. Can you give some
additional context for your problem? How were these distances collected?
Is it a collection of pairs of intervals, like this:

    P                   Q
1)  (1.5, 1.8)          (1.2, 2.0)
2)  (1.4, 1.9)          (1.4, 2.3)
...
1)  (start1, end1)      (start2, end2)

?

If so, is there a more specific test you're interested in? For instance,
whether the interval P overlaps with the start/stop position of interval
Q, or whether start1 == start2, or end1 == end2, or both? I can think of
a bootstrap test for hypotheses like this, and this is relatively easy
in R.

-Matt

On Thu, 2010-07-01 at 07:53 -0400, ravikumar sukumar wrote:
 Dear all,
 I am a biologist. I have two sets of distance P(start1, end1) and Q(start2,
 end2).
 The distance will be like this.
 P 
 Q  
 
 I want to know whether P falls closely to the right end or left  end of Q.
  P and Q are of different lengths for each data point. There are more than
 1 pairs of P and Q.
 Is there any test or function in R to bring a statistically significant
 conclusion.
 
 Thanks for all,
 Suku
 
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com



Re: [R] left end or right end

2010-07-01 Thread Matt Shotwell
Suku, 

Just to clarify, in your table and each of your images, it appears that
the start position of P (start1) is _after_ or at the start position of
Q (start2), and the end position of P (end1) is _before_ or at the end
position of Q (end2). If these positions represent increasing integers,
then start1 >= start2 and end1 <= end2. I will assume this for the
discussion below.
  
You mentioned wanting to know whether the midpoint of P tended to be
greater or lesser than the midpoint of Q. That seems like a good idea,
since the midpoints _must_ be similar when the lengths of P and Q are
similar. Hence, if P and Q are samples from a population, then you may
be interested in the population mean difference in midpoints. We can
denote this mean M:

M = E(mid(P) - mid(Q))

In order to do a classical statistical test, we _need_ a hypothesis
about M, and a rule for rejecting the hypothesis. That's why we use the
term 'hypothesis'. An appropriate hypothesis here might be:

H0: M = 0

or, in words, the mean difference in the P and Q midpoints is zero. A
simple rejection rule for this hypothesis is:

reject H0 when the observed mean difference in P and Q midpoints is
greater than some quantity C, or less than -C.

The trick then is to find C that satisfies some type 1 error
probability, usually 0.05. It's here that I might recommend a bootstrap
procedure.

If, in the end, you reject the hypothesis H0, you can use the sign of
the estimated mean difference in your biological inferences. ...And I'm
still interested to hear what those are. :-) Of course, these are just
my ideas, you really ought to visit a biostatistician for professional
advice.
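
To make the bootstrap idea concrete, here is a rough sketch on simulated data. Everything in it is an assumption for illustration: the intervals are random stand-ins with P nested inside Q, 2000 resamples is an arbitrary choice, and the names d, m_hat, crit and p_value are made up. It imposes H0 by centering the observed differences at zero before resampling, which is one common way to do this, not the only one.

```r
set.seed(123)
n <- 200

# Simulated stand-in data: Q intervals, with P nested inside each Q.
start2 <- runif(n, 0, 100)
end2   <- start2 + runif(n, 20, 60)
start1 <- start2 + runif(n, 0, 8)
end1   <- end2 - runif(n, 0, 8)

# Observed midpoint differences and their mean (the estimate of M).
d <- (start1 + end1) / 2 - (start2 + end2) / 2
m_hat <- mean(d)

# Impose H0: M = 0 by centering, then resample with replacement.
d0 <- d - m_hat
boot_means <- replicate(2000, mean(sample(d0, replace = TRUE)))

# crit plays the role of C: reject H0 when |m_hat| exceeds it.
crit <- unname(quantile(abs(boot_means), 0.95))
p_value <- mean(abs(boot_means) >= abs(m_hat))

print(c(estimate = m_hat, crit = crit, p_value = p_value))
```

With real data, d would be computed from the observed start and end positions, and the length of Q could be brought in by dividing each difference by end2 - start2 before testing.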

-Matt



On Thu, 2010-07-01 at 10:24 -0400, ravikumar sukumar wrote:
 There are three possibilities:
 
 Case1: Left end
 
 P--
 Q--
 
 Case2: Right end
 
 P--
 Q--
 
 
 Case3: At mid position
 
 P-
 A--
 
 
 My question is how far my data falls on the all the three cases. Is it
 biased towards case1 or case2 or case3. I have to consider the length of Q
 in the data. Example: start2-start1 =2  and end2-end1 = 3 does not make much
 difference if length of Q is 15.
 
 I do not hypothesize, i want to know how my data goes on.
 
 Thanks and regards
 
 
 
 
 
 
 
 On Thu, Jul 1, 2010 at 4:05 PM, Jonathan Christensen 
 dzhona...@gmail.comwrote:
 
  Hi,
 
  You need to define what you want more exactly--what are the possible
  conclusions (hypotheses) you want to reach? Based on what you've said, I can
  think of several different approaches you might want, but I'm not sure which
  one of them you're actually after. For example:
 
  Hypothesis A: The distance between the left endpoints of P and Q is less
  than (or equal to) the distance between the right endpoints.
  Hypothesis B: The distance between the right endpoints is smaller.
 
  This is a simple binomial test, as David Winsemius suggested. In your most
  recent email, though, it sounds like you want to take into account how much
  smaller one distance is than the other. This is more complicated.
 
  Another option occurred to me: maybe you don't care which end P is close
  to, you just want to know whether it's close to one of the ends, or
  somewhere in the middle.
 
  Without knowing what exactly you are trying to test, it's very hard for us
  to help you.
 
  Jonathan
 
 
  On Thu, Jul 1, 2010 at 7:45 AM, ravikumar sukumar 
  ravikumarsuku...@gmail.com wrote:
 
  Sorry for posting to the R list.
 
  P  Q
  12, 28   10, 42
  2, 5   1, 55
  32, 50   22, 63
  . there are 1 points of P and Q.
  The number of points of P and Q are equal (i,e 1).
 
  The interval P always overlaps with Q, i.e. start1>start2 and end1<end2.
 
  Merely checking whether the points satisfy the condition start1>start2
  and end1<end2 will not be informative, because the length of P, that is
  (end1-start1), and the length of Q, i.e. (end2-start2), differ.
 
  Example
  Case A:
 
 
  Case B:
  start2 - start1 =100
  end2-end1 = 2
 
  In the above two cases, P is falling on the right end of Q in case B. But
  it
  depends on the length(end2-start2). If the length(end2-start2) =15000 in
  case of B, then it is almost on the middle point.
 
  Is there any test or function in R to bring a statistically
  significant conclusion that midpoint of P or P itself is falling on the
  left
  end or right end of Q.
 
  sorry once again for posting in this list.
 
  Regards
 
 
 
 
 
 
 

Re: [R] how to display the clock time in the loop

2010-07-01 Thread Matt Shotwell
Try to flush output after printing:

cat(paste(Sys.time()),"\n"); flush(stdout())
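
Inside a loop this might look like the sketch below. Sys.sleep() is only a stand-in for the real computation and the message text is made up. Note that Sys.time() reports wall-clock time; proc.time() is the function to use if CPU time is what is wanted.

```r
for (i in 1:3) {
  Sys.sleep(0.2)                 # stand-in for the real computation
  cat("iteration", i, "finished at", format(Sys.time(), "%H:%M:%S"), "\n")
  flush(stdout())                # push the line out right away
  flush.console()                # helps on GUI consoles that buffer output
}
```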

On Thu, 2010-07-01 at 16:17 -0400, Jack Luo wrote:
 Hi,
 
 I am doing some computation which is pretty time consuming, I want R to
 display CPU time after each iteration using the command Sys.time(). However,
 I found that the code only began to display the CPU time after quite a while
 and several iterations have finished. Is there a way to ask R to display
 time right after each iteration is finished?
 
 Thanks,
 
 -Jun
 
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com



Re: [R] is there a way to do dense rank in R

2010-07-01 Thread Matt Shotwell
> x <- c(5,7,7,9)
> rank(unique(x))[match(x, unique(x))]
[1] 1 2 2 3
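
An equivalent formulation, perhaps a little easier to read, matches each value against the sorted unique values. The function name dense_rank is made up here:

```r
# Dense rank: position of each value among the sorted distinct values.
dense_rank <- function(x) match(x, sort(unique(x)))

x <- c(5, 7, 7, 9)
r1 <- dense_rank(x)
r2 <- rank(unique(x))[match(x, unique(x))]

# Both approaches agree (the first returns integers, the second doubles).
stopifnot(all(r1 == r2))
```

Unlike the input in the example, the vector does not need to be sorted beforehand for either version.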

On Thu, 2010-07-01 at 21:30 -0400, Suresh Singh wrote:
 I have not been able to find a way to do dense rank in R
 
 Here is an example of what I need
 
 rank() gives the following
 
 5 rank 1
 7 rank 2
 7 rank 2
 9 *rank 4*
 
 but I want
 
 5 rank 1
 7 rank 2
 7 rank 2
 9 *rank 3*
 *
 *
 thanks
 SS
 
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com



Re: [R] integration of two normal density

2010-06-28 Thread Matt Shotwell
Isn't it equally trivial to demonstrate that the product of two pdfs
_may_ be a normalized pdf? For example, the uniform (0,1) pdf:

f(x) = 1 for x in (0, 1), and 0 otherwise

Hence, g(x) = f(x)*f(x) = 1 for x in (0, 1), and 0 otherwise _is_ a
normalized pdf. 

But this is a little silly. Rather than memorize answers to questions
like "is the product of pdfs also a pdf?", we ought to be confident in
the properties of pdfs (i.e. not the answers, but the means to arrive at
answers).
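
All three cases in this thread can be checked numerically with integrate(). The closed form used for the normal case is standard algebra and not from the thread: dnorm(y/2, mean=1.5) equals 2*dnorm(y, mean=3, sd=2), and the integral of the product of two normal densities with a common mean is a normal density with the summed variances evaluated at zero. Variable names are made up for illustration.

```r
# Uniform(0,1): the squared density integrates to 1, so it is a pdf.
u1 <- integrate(function(x) dunif(x)^2, 0, 1)$value

# Uniform(0,2): density 1/2, product 1/4, which integrates to 1/2.
u2 <- integrate(function(x) dunif(x, 0, 2)^2, 0, 2)$value

# The product from the original question, next to its closed form.
f <- function(y) dnorm(y, mean = 3) * dnorm(y / 2, mean = 1.5)
numeric_value <- integrate(f, -Inf, Inf)$value
closed_form <- 2 * dnorm(0, mean = 0, sd = sqrt(1 + 4))

print(c(u1 = u1, u2 = u2,
        numeric = numeric_value, closed_form = closed_form))
```

The last two numbers should agree with the 0.3568248 reported in the original question, which shows that R's answer was right and only the expectation of 1 was off.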


On Mon, 2010-06-28 at 11:42 -0400, Bert Gunter wrote:
 Inline Below
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
  
  -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of bill.venab...@csiro.au
 Sent: Friday, June 25, 2010 10:53 PM
 To: carrieands...@gmail.com; R-help@r-project.org
 Subject: Re: [R] integration of two normal density
 
 Your intuition is wrong and R is right.
 
 Why should the product of two probability density functions be a normalized
 pdf also? 
 
 -- as is trivially seen with two uniforms on [0,2], with pdf= 1/2, product =
 1/4 on [0,2] . 
 
 -- Bert
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of Carrie Li
 Sent: Saturday, 26 June 2010 1:28 PM
 To: r-help
 Subject: [R] integration of two normal density
 
 Hello everyone,
 
 I have a question about integration of two density function
 Intuitively, I think the value after integration should be 1, but they are
 not. Am I missing something here ?
 
 > t <- function(y){dnorm(y, mean=3)*dnorm(y/2, mean=1.5)}
 > integrate(t, -Inf, Inf)
 0.3568248 with absolute error < 4.9e-06
 
 
 Also, is there any R function or package could do multivariate integration ?
 
 Thanks for any suggestions!
 
 Carrie
 
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com



Re: [R] advice on package devel with external libs

2010-06-28 Thread Matt Shotwell
Some ideas,

1. Wrap the library as an R package, as you said, and check for the
library at configure time (i.e. with autoconf or custom script). But if
you do, it would be great to provide an R-level API so that we can all
use it. This is the strategy of the 'cairo', 'RGtk', 'rgl', and 'gsl'
packages. Also, maybe try and collaborate with the developers of the
'rjson' package to improve it.

2. If the library is appropriately licensed, and truly 'lightweight',
simply add its sources to your package. R core does this with zlib and a
few other libraries. However, this puts the burden on you to maintain
code written by others. There are several JSON parsers with very liberal
licenses (www.json.org), and some are tiny.

-Matt



On Mon, 2010-06-28 at 16:10 -0400, Murat Tasan wrote:
 hi all - i'm working on an R package that makes use of my own shared
 library written in C.
 but i also am making use of another C-written library.
 (my package is for facilitating biological namespace translations via
 online (i.e. up-to-date) biological databases.)
 
 problem is, the library i'm using is not a standard library (i.e. i
 doubt it will be installed on most users' machines).
 i also don't think too many users will be particularly adept in
 installing a shared library.
 for users with a sysadmin, it can be done easily enough, but on local
 installations i fear most will be incapable of properly installing/
 locating the library so my code can link to it during compile time.
 (in case anyone was wondering, the library in question is a
 lightweight JSON parser... yes i know there are existing R packages
 for this, but they are *very* slow for large JSON object coding/
 encoding.)
 
 how have folks dealt with this in the past with R packages?
 i've thought about wrapping the other library itself as a separate R
 package which basically does nothing on installation other than
 compile and put the libraries in a predictable location... but this seems
 rather silly (and may violate the JSON parser package's license).
 
 thanks for any input on this,
 
 -murat
 
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com


