[R] Controlling text and strip arrangement in xyplot

2007-06-19 Thread Juan Pablo Lewinger
I've searched the archives and read the xyplot help but can't figure 
out the answers to the two lattice questions below.

Consider:

library(lattice)
DF <- data.frame(x=rnorm(20), y=rnorm(20), g1=rep(letters[1:2], 10),
                 g2=rep(LETTERS[1:2], each=10),
                 g3=rep(rep(letters[3:4], each=5), 2))

xyplot(y ~ x | g1 + g2, groups=g3, data=DF)

1) Is there a way to get one strip per row and column of panels as 
below instead of the default?


    |  a  |  b  |
 ---+-----+-----+
  B |     |     |
 ---+-----+-----+
  A |     |     |
 ---+-----+-----+

2) How do I control the text of the strips so that, for instance, 
instead of "a" and "b" it reads "g1=alpha" and "g1=beta", where alpha 
and beta stand for the corresponding Greek symbols? (My difficulty 
here is not with the plotmath symbols but with controlling the text 
of the strips directly from the call to xyplot, not by renaming 
the levels of g1.)
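
One possible way to approach both questions is sketched here. It is not an answer from the thread. The helper name strip.g1 is made up for illustration, and the outer-strip layout relies on the separate latticeExtra package, which is assumed to be installed. The custom strip function relabels only the first conditioning variable and passes everything else through to strip.default.

```r
library(lattice)

set.seed(1)
DF <- data.frame(x = rnorm(20), y = rnorm(20),
                 g1 = rep(letters[1:2], 10),
                 g2 = rep(LETTERS[1:2], each = 10),
                 g3 = rep(rep(letters[3:4], each = 5), 2))

# Hypothetical helper: replace the strip labels of the first conditioning
# variable (g1) with plotmath expressions; strips for g2 keep their levels.
strip.g1 <- function(which.given, ..., factor.levels) {
  if (which.given == 1)
    factor.levels <- expression(g1 == alpha, g1 == beta)
  strip.default(which.given = which.given, ...,
                factor.levels = factor.levels)
}

# Question 2: strip text is set in the call, the levels of g1 are untouched
p <- xyplot(y ~ x | g1 + g2, groups = g3, data = DF, strip = strip.g1)

# Question 1: one strip per column (top) and per row (left), if
# latticeExtra is available; otherwise p keeps the default strip layout
if (requireNamespace("latticeExtra", quietly = TRUE)) {
  p <- latticeExtra::useOuterStrips(p, strip = strip.g1)
}

# print(p) draws the plot
```

The levels of g1 in DF stay "a" and "b"; only the displayed strip text changes.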

I'd appreciate any help!


Juan Pablo Lewinger
Department of Preventive Medicine
Keck School of Medicine
University of Southern California

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Speeding up resampling of rows from a large matrix

2007-05-25 Thread Juan Pablo Lewinger
I'm trying to:

Resample with replacement pairs of distinct rows from a 120 x 65,000 
matrix H of 0's and 1's. For each resampled pair sum the resulting 2 
x 65,000 matrix by column:

    0 1 0 1 ...
 +  0 0 1 1 ...
 --------------
 =  0 1 1 2 ...

For each column accumulate the number of 0's, 1's and 2's over the 
resamples to obtain a 3 x 65,000 matrix G.

For those interested in the background, H is a matrix of haplotypes, 
each pair of haplotypes forms a genotype, and each column corresponds 
to a SNP. I'm using resampling to compute the null distribution of 
the maximum over correlated SNPs of a simple statistic.


The code:
#---
nSNPs <- 1000
H <- matrix(sample(0:1, 120*nSNPs, replace=TRUE), nrow=120)
G <- matrix(0, nrow=3, ncol=nSNPs)
# Keep in mind that the real H is 120 x 65000

nResamples <- 3000
pair <- replicate(nResamples, sample(1:120, 2))

gen <- function(x){g <- sum(x); c(g==0, g==1, g==2)}

for (i in 1:nResamples){
  G <- G + apply(H[pair[,i],], 2, gen)
}
#---
The problem is that the loop takes about 80 mins to complete and I 
need to repeat the whole thing 10,000 times, which would then take 
over a year and a half!

Is there a way to speed this up so that the full 10,000 iterations 
take a reasonable amount of time (say a week)?
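
A sketch of one way to remove the per-column apply() call follows. It is not necessarily the approach suggested in the reply on the list, which is not reproduced in this archive. The idea, stated as an assumption, is that the counts depend only on how often each pair of rows was drawn, so a 120 x 120 matrix of pair counts (called W here, a name introduced for illustration) and one matrix product give all three rows of G. The sizes are reduced so the check against the loop runs quickly.

```r
set.seed(42)
nRows      <- 120
nSNPs      <- 200   # the real H has 65,000 columns
nResamples <- 300

H    <- matrix(sample(0:1, nRows * nSNPs, replace = TRUE), nrow = nRows)
pair <- replicate(nResamples, sample(nRows, 2))

# W[a, b] = number of resamples in which rows a and b were drawn together
W <- matrix(0, nRows, nRows)
for (i in seq_len(nResamples)) {
  W[pair[1, i], pair[2, i]] <- W[pair[1, i], pair[2, i]] + 1
}

# Per column: resamples where both rows carry a 1 (column sum equals 2)
n2 <- colSums(H * (W %*% H))
# Per column: total number of 1's over all resampled rows
ones <- colSums((rowSums(W) + colSums(W)) * H)
# Each "2" accounts for two 1's, each "1" for one
n1 <- ones - 2 * n2
n0 <- nResamples - n1 - n2
G  <- rbind(n0, n1, n2)

# Reference: the original accumulation, vectorised over columns only
G.ref <- matrix(0, nrow = 3, ncol = nSNPs)
for (i in seq_len(nResamples)) {
  s <- H[pair[1, i], ] + H[pair[2, i], ]
  G.ref <- G.ref + rbind(s == 0, s == 1, s == 2)
}
stopifnot(all(G == G.ref))
```

For the full problem the largest intermediate is W %*% H, a 120 x 65,000 numeric matrix, so memory use stays far below what stacking all resampled rows would need.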

My machine has an Intel Xeon 3.40GHz CPU with 1GB of RAM

> sessionInfo()
R version 2.5.0 (2007-04-23)
i386-pc-mingw32

I would greatly appreciate any help.

Juan Pablo Lewinger
Department of Preventive Medicine
Keck School of Medicine
University of Southern California



Re: [R] Speeding up resampling of rows from a large matrix

2007-05-25 Thread Juan Pablo Lewinger
That's beautiful. For the full 120 x 65,000 matrix your approach took 
85 seconds. A truly remarkable improvement over my 80 minutes!

Thank you!



[R] Efficiently reading random lines from a large file

2007-05-15 Thread Juan Pablo Lewinger
I need to read two different random lines at a time from a large 
ASCII file (120 x 296976) containing space delimited 0-1 entries.

The following code does the job and it's reasonably fast for my needs:

   lineNumber = sample(120, 2)
   line1 = scan(filename, what = integer(), skip=lineNumber[1]-1, nlines=1)
   line2 = scan(filename, what = integer(), skip=lineNumber[2]-1, nlines=1)

> system.time(for (i in 50){
+   lineNumber = sample(120, 2)
+   line1 = scan(filename, what = integer(), skip=lineNumber[1]-1, nlines=1)
+   line2 = scan(filename, what = integer(), skip=lineNumber[2]-1, nlines=1)
+ })

Read 296976 items
Read 296976 items
[1] 14.24  0.12 14.51    NA    NA

However, I'm wondering if there's an even faster way to do this. Is there?
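
One option, sketched here under the assumption that the whole file fits in memory (120 x 296976 integers is roughly 140 MB), is to scan the file once into a matrix and then draw rows by index, so that no further disk access is needed per draw. The temporary file and its reduced width are stand-ins for the real data.

```r
set.seed(1)

# Small stand-in for the real 120 x 296976 file of space-delimited 0/1 entries
filename <- tempfile(fileext = ".txt")
nLines <- 120
nCols  <- 50
M0 <- matrix(sample(0:1, nLines * nCols, replace = TRUE), nrow = nLines)
write.table(M0, filename, row.names = FALSE, col.names = FALSE)

# Read the file a single time; byrow = TRUE keeps one file line per matrix row
M <- matrix(scan(filename, what = integer(), quiet = TRUE),
            nrow = nLines, byrow = TRUE)

# Drawing two distinct random lines is now plain row indexing
lineNumber <- sample(nLines, 2)
line1 <- M[lineNumber[1], ]
line2 <- M[lineNumber[2], ]
```

If the file does not fit in memory, the skip/nlines approach above remains a reasonable fallback.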

> sessionInfo()
R version 2.4.1 (2006-12-18)
i386-pc-mingw32

Juan Pablo Lewinger
Department of Preventive Medicine
Keck School of Medicine
University of Southern California
1540 Alcazar Street, CHP-220
Los Angeles, CA 90089-9011, USA



Re: [R] CDF of a Multivariate Normal

2007-05-10 Thread Juan Pablo Lewinger
> In my simulations, I have to use the values of the cumulative distribution
> function of a multivariate normal with known mean vector and dispersion
> matrix. Please, can you tell me if there is a package in R to do that?

There are two that I know of:

mvtnorm
mnormt
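
As a brief illustration, assuming the mvtnorm package is installed, its pmvnorm function takes integration limits, a mean vector and a covariance matrix. The correlation value below is arbitrary and chosen only because the bivariate orthant probability then has a known closed form to compare against.

```r
rho   <- 0.5                                  # arbitrary example correlation
sigma <- matrix(c(1, rho, rho, 1), nrow = 2)  # dispersion (covariance) matrix

# Closed form for P(X <= 0, Y <= 0) with zero means and unit variances
exact <- 1/4 + asin(rho) / (2 * pi)

if (requireNamespace("mvtnorm", quietly = TRUE)) {
  # CDF at (0, 0): integrate from -Inf up to the point of interest
  p <- mvtnorm::pmvnorm(lower = c(-Inf, -Inf), upper = c(0, 0),
                        mean = c(0, 0), sigma = sigma)
  print(c(pmvnorm = as.numeric(p), exact = exact))
}
```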



Re: [R] Random sample from log-normal distribution

2006-11-18 Thread Juan Pablo Lewinger
> Dear all R users,
>
> Please forgive me if my question is too trivial.
> Suppose I have two variables, (x,y) which is
> log-normally distributed with expected value (mu1,
> mu2) and some variance-covariance matrix. Now I want
> to draw a random sample of size 1000 from this
> distribution. Is there any function available to do
> this?
>
> Thanks and regards,
> Megh


If what you really want is a bivariate lognormal, you can first generate a 
bivariate normal sample (X,Y) with the function rmvnorm in package mvtnorm.  
Then (exp(X), exp(Y)) will be bivariate lognormal.
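
A minimal sketch of this recipe follows. To keep it free of package dependencies it draws the bivariate normal with a Cholesky factor in base R, as a stand-in for rmvnorm. The values of mu and Sigma are made up, and they are parameters on the log scale: the mean of each lognormal component is exp(mu + variance/2), not mu itself.

```r
set.seed(1)
n     <- 1000
mu    <- c(0, 0.5)                                # hypothetical log-scale means
Sigma <- matrix(c(1, 0.3, 0.3, 0.5), nrow = 2)    # hypothetical log-scale covariance

# Bivariate normal: rows of Z %*% chol(Sigma) have covariance Sigma
Z  <- matrix(rnorm(n * 2), ncol = 2)
XY <- sweep(Z %*% chol(Sigma), 2, mu, "+")

# Exponentiate elementwise to obtain the bivariate lognormal sample
LN <- exp(XY)

# Theoretical means of the lognormal components, for comparison with colMeans(LN)
ln.mean <- exp(mu + diag(Sigma) / 2)
```

To hit a target expected value (mu1, mu2) on the original scale, the log-scale parameters would have to be solved for from that relation.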



Re: [R] Retrieving value computed in inner function call

2006-09-13 Thread Pablo Lewinger
Though not obvious at first, the posting you pointed me to is very helpful 
indeed. Thanks a lot, Gabor.

Juan Pablo

At 08:48 PM 9/12/2006 -0400, Gabor Grothendieck wrote:
Check out:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/83547.html

On 9/12/06, Juan Pablo Lewinger [EMAIL PROTECTED] wrote:
Dear R users,

Consider the following example function:

f = function(a,b) {
  g = function(x) a*x + b
  h = function(x) g(x)^2 + x^2
  opt = optimize(h, lower = -1, upper = 1)
  x.min = opt$minimum
  h.xmin = opt$objective
  g.xmin = g(x.min)
  return(c(x.min, h.xmin, g.xmin))
}

In my real problem the function that plays the role of g is costly
to compute. Now, to minimize h, optimize calls h with different
values of x. In particular, at the end of the optimization, h would
be called with argument x.min, the minimizer of h(x). Therefore,
buried somewhere, there has to be a call to g with argument x=x.min
which I would like to retrieve in order to avoid the extra call to
g in the line before the return. Can this be done without too much pain?

I'd very much appreciate any help.



Juan Pablo Lewinger
Department of Preventive Medicine
University of Southern California



[R] Retrieving value computed in inner function call

2006-09-12 Thread Juan Pablo Lewinger
Dear R users,

Consider the following example function:

f = function(a,b) {
  g = function(x) a*x + b
  h = function(x) g(x)^2 + x^2
  opt = optimize(h, lower = -1, upper = 1)
  x.min = opt$minimum
  h.xmin = opt$objective
  g.xmin = g(x.min)
  return(c(x.min, h.xmin, g.xmin))
}

In my real problem the function that plays the role of g is costly 
to compute. Now, to minimize h, optimize calls h with different 
values of x. In particular, at the end of the optimization, h would 
be called with argument x.min, the minimizer of h(x). Therefore, 
buried somewhere, there has to be a call to g with argument x=x.min 
which I would like to retrieve in order to avoid the extra call to 
g in the line before the return. Can this be done without too much pain?
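
One way this might be done, sketched here without knowing whether it matches the posting referred to in the reply, is to let h record every x it is called with together with the corresponding g(x) in the enclosing environment of f, and then look up the entry for x.min afterwards. The names xs, gs and hit are introduced for illustration. The fallback call to g covers the case where x.min was never evaluated.

```r
f <- function(a, b) {
  xs <- numeric(0)   # every x at which h was evaluated
  gs <- numeric(0)   # the matching values of g(x)

  g <- function(x) a * x + b
  h <- function(x) {
    gx <- g(x)
    # Record the evaluation in f's environment
    xs <<- c(xs, x)
    gs <<- c(gs, gx)
    gx^2 + x^2
  }

  opt   <- optimize(h, lower = -1, upper = 1)
  x.min <- opt$minimum

  # Reuse the stored g value if h was evaluated at x.min
  hit    <- match(x.min, xs)
  g.xmin <- if (is.na(hit)) g(x.min) else gs[hit]

  c(x.min = x.min, h.xmin = opt$objective, g.xmin = g.xmin)
}

res <- f(2, 1)
```

With a = 2 and b = 1, h(x) = 5x^2 + 4x + 1, so the minimizer should come out close to -0.4.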

I'd very much appreciate any help.



Juan Pablo Lewinger
Department of Preventive Medicine
University of Southern California



Re: [R] numeric variables converted to character when recoding missing values

2006-06-24 Thread Juan Pablo Lewinger
Thanks Bert, that works of course and is much more straightforward than what
I was trying. However, I'm still puzzled as to why x[x==999] <- NA works (i.e.
it replaces the 999s with NAs and keeps the numeric variables numeric) but
is.na(x[x==999]) <- TRUE doesn't (it replaces the 999s with NAs but changes
all variables where a replacement was made to character).

PS:  As far as I can tell section 2.5 of An Introduction to R -which I had
read- doesn't answer my original question.

Juan Pablo Lewinger
Department of Preventive Medicine 
Keck School of Medicine 
University of Southern California

-Original Message-
From: Berton Gunter [mailto:[EMAIL PROTECTED] 
Sent: Friday, June 23, 2006 3:15 PM
To: 'Juan Pablo Lewinger'; r-help@stat.math.ethz.ch
Subject: RE: [R] numeric variables converted to character when recoding
missingvalues

Please read section 2.5 of An Introduction to R. Numerical missing values
are assigned as NA:

x[x==999] <- NA

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Juan 
 Pablo Lewinger
 Sent: Friday, June 23, 2006 3:00 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] numeric variables converted to character when 
 recoding missingvalues
 
 Dear R helpers,
 
 I have a data frame where missing values for numeric 
 variables are coded as
 999. I want to recode those as NAs. The following only 
 partially succeeds
 because numeric variables are converted to character in the process:
 
 df <- data.frame(a=c(999,1,999,2), b=LETTERS[1:4])
 is.na(df[2,1]) <- TRUE
 df
 
     a b
 1 999 A
 2  NA B
 3 999 C
 4   2 D
 
 is.numeric(df$a)
 [1] TRUE
 
 
 is.na(df[!is.na(df) & df==999]) <- TRUE
 df
    a b
 1 NA A
 2  1 B
 3 NA C
 4  2 D
 
 is.character(df$a)
 [1] TRUE
 
 My question is how to do the recoding while avoiding this 
 undesirable side
 effect. I'm using R 2.2.1 (yes, I know 2.3.1 is available but 
 don't want to
 switch mid project). I'd appreciate any help.
 
 Further details:
 
 platform i386-pc-mingw32
 arch     i386
 os       mingw32
 system   i386, mingw32
 status
 major    2
 minor    2.1
 year     2005
 month    12
 day      20
 svn rev  36812
 language R
 
 
 
 Juan Pablo Lewinger
 Department of Preventive Medicine 
 Keck School of Medicine 
 University of Southern California
 




[R] numeric variables converted to character when recoding missing values

2006-06-23 Thread Juan Pablo Lewinger
Dear R helpers,

I have a data frame where missing values for numeric variables are coded as
999. I want to recode those as NAs. The following only partially succeeds
because numeric variables are converted to character in the process:

df <- data.frame(a=c(999,1,999,2), b=LETTERS[1:4])
is.na(df[2,1]) <- TRUE
df

    a b
1 999 A
2  NA B
3 999 C
4   2 D

is.numeric(df$a)
[1] TRUE


is.na(df[!is.na(df) & df==999]) <- TRUE
df
   a b
1 NA A
2  1 B
3 NA C
4  2 D

is.character(df$a)
[1] TRUE

My question is how to do the recoding while avoiding this undesirable side
effect. I'm using R 2.2.1 (yes, I know 2.3.1 is available but don't want to
switch mid project). I'd appreciate any help.
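
A sketch of one way to recode while keeping column types is to work column by column and touch only the numeric columns. A likely explanation for the side effect, offered as an assumption rather than something confirmed in this thread, is that extracting from a data frame with a logical matrix goes through as.matrix(df), which is a character matrix whenever a non-numeric column such as b is present. The values assigned back are then character.

```r
df <- data.frame(a = c(999, 1, 999, 2), b = LETTERS[1:4])

# Identify numeric columns so that character/factor columns are left alone
num <- vapply(df, is.numeric, logical(1))

# Within each numeric column, replace the code 999 by NA;
# %in% (rather than ==) keeps any NAs already present from causing trouble
df[num] <- lapply(df[num], function(col) replace(col, col %in% 999, NA))

is.numeric(df$a)
```

The missing-value code 999 is taken from the question; a different code would only change the value on the right of %in%.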

Further details:

platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    2.1
year     2005
month    12
day      20
svn rev  36812
language R



Juan Pablo Lewinger
Department of Preventive Medicine 
Keck School of Medicine 
University of Southern California



[R] cdf of multivariate normal

2006-02-06 Thread Juan Pablo Lewinger
I was wondering if anybody has written R code to compute the cdf of a
multivariate (or at least a bivariate) normal distribution with given
covariance structure.
