date:20101004

Re: [R] Converting a dataframe column from string to datetime

2010-10-04 Thread raje...@cse.iitm.ac.in

ok..thanks

- Original Message -
From: Don MacQueen macque...@llnl.gov
To: raje...@cse.iitm.ac.in, r-help r-help@r-project.org
Sent: Fri, 01 Oct 2010 20:51:06 +0530 (IST)
Subject: Re: [R] Converting a dataframe column from string to datetime

Youâre working too hard. Use this:

   tms - as.POSIXct(strptime(v, %a %b %d %H:%M:%OS %Y))

Take note of the fact that there are two types of datetime objects:  POSIXct 
and POSIXlt.

Your unlist() gave what seemed a strange result because you used on an âltâ 
object. Had you given it a âctâ object it would have made sense. To see, try

   lapply(v,function(x){as.POSIXct(strptime(x, %a %b %d %H:%M:%OS %Y))})

But using lapply() was more complicated than necessary.

-Don

On 9/30/10 10:59 PM, raje...@cse.iitm.ac.in raje...@cse.iitm.ac.in wrote:

Hi,

I have a dataframe column of the form

v-c(Fri Feb 05 20:00:01.43000 2010,Fri Feb 05 20:00:02.274000 2010,Fri 
Feb 05 20:00:02.274000 2010,Fri Feb 05 20:00:06.34000 2010)

I need to convert this to datetime form. I did the following..

lapply(v,function(x){strptime(x, %a %b %d %H:%M:%OS %Y)})

This gives me a list that looks like this...

[[1]]

[1] 2010-02-05 20:00:01.43

[[2]]

[1] 2010-02-05 20:00:02.274

[[3]]

[1] 2010-02-05 20:00:02.274

[[4]]

[1] 2010-02-05 20:00:06.34

However, when I do an unlist...I gets converted to something like this...

sec  minhourmday  monyearwdayyday   isdst  sec  minhour 
   mday  monyearwdayyday   isdst  sec 

  1.430   0.000  20.000   5.000   1.000 110.000   5.000  35.000   0.000   2.274 
  0.000  20.000   5.000   1.000 110.000   5.000  35.000   0.000   2.274 

minhourmday  monyearwdayyday   isdst  sec  min
hourmday  monyearwdayyday   isdst  

  0.000  20.000   5.000   1.000 110.000   5.000  35.000   0.000   6.340   0.000 
 20.000   5.000   1.000 110.000   5.000  

I want it to become a dataframe column except for a change in the datatype to 
datetime...how can I achieve this? 

 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://BLOCKEDstat.ethz.ch/mailman/listinfo/r-help

PLEASE do read the posting guide 
http://BLOCKEDwww.BLOCKEDR-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.

-- 

Don MacQueen

Environmental Protection Department

Lawrence Livermore National Laboratory

925 423-1062

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to iterate through different arguments?

2010-10-04 Thread lord12


any solutions?
-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-to-iterate-through-different-arguments-tp2953511p2953608.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Creating R objects in Java

2010-10-04 Thread lord12


It seems that you can call Java objects from R but that you cannot call R
objects from Java using the RJava interface. 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Creating-R-objects-in-Java-tp2904497p2953611.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Syntax for Rmpi cf multicore

2010-10-04 Thread Patrick Connolly

I'm aiming to compare the workings of Rmpi and multicore on a duel
processor quad core machine with 64 bit R-2.11.1 Kubuntu 10.4.


It's impossible for me to get a small reproducable code segment to
show what I mean, but if I show what works for mclapply, I'd hope it's
possible to be shown what would be the equivalent way with mpi.apply.

The function lr.gbm has variables trees, folds and minob which I use
with mclapply like this:

 out - mclapply(subsets, lr.gbm, mc.cores = 6, trees=trees, folds=folds,
  minob=minob)

subsets is simply a vector that looks like this:

[1] COB_2 CNJ_2 COB_3 CNJ_3 COB_4 CNJ_4

and that works more or less fine to produce a list of length 6 which
I've named with the elements of the subsets vector.

  names(out) - subsets


Now, when I use mpi.apply, I make the trees, folds and minob available
to all slaves thus:

  mpi.spawn.Rslaves(nslaves=6)
  mpi.bcast.Robj2slave(trees)
  mpi.bcast.Robj2slave(folds)
  mpi.bcast.Robj2slave(minob)

but, I can't work out what sort of 'array' is required for the first
argument to mpi.apply.

  out.mpi - mpi.apply(subsets.l, lr.gbm)

What should subsets.l look like?  I tried a list of length 6 with
elements the same as those in 'subsets', but evidently, something more
array-like is required, but I'm out of ideas.

TIA


-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~}   Great minds discuss ideas
 _( Y )_ Average minds discuss events 
(:_~*~_:)  Small minds discuss people  
 (_)-(_)  . Eleanor Roosevelt
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] i have aproblem --thank you

2010-10-04 Thread David Winsemius



On Oct 4, 2010, at 7:29 AM, 笑啸 wrote:


dear professor:


If this is directed to my posting, I am not a professor.

  thank you for your help,witn your help i develop the nomogram  
successfully.
after that i want to do the internal validation to the model.i ues  
the bootpred to do it,and then i encounter problem again,just like  
that.(´íÎóÓÚerror to :complete.cases(x, y, wt) :  
²»ÊÇËùÓÐµÄ²ÎÊý¶¼Ò»Ñù³¤(the length of the  
augment was different))
i hope you tell me where is the mistake,and maybe i have chosen the  
wrong function.


I think so, at least to the limited extent that I I am understanding  
your efforts. It appears you are trying to use boot to do simulation,  
but the goal is rather unclear. I think you would get further along  
the way if you imported your data into R your data and constructed a  
model within the rms system.


None of your functions in this posting are creating a new instance of  
dfr$y. They are all depending on the single constructed version of the  
data which may be peculiar or it may be exactly typical, but one  
cannot tell. Rather than boot() you might get further with  
replicate(), but I am not inclined to offer further code in support of  
what is an unclear and possibly futile direction.




thank you
  turly yours
..
load package 'rms'


ddist - datadist(dfr)
options(datadist='ddist')
n-100
set.seed(10)
T.Grade-factor(0:3,labels=c(G0, G1, G2,G3))
Sex-factor(sample(0:1, 100, replace=TRUE),labels=c(F,M))
Smoking-factor(sample(0:1, 100, replace=TRUE),labels=c(No,yes))
dfr$L-with(dfr,0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking) 
+0.92*as.numeric(Sex)-1.338)

dfr$y - with(dfr, ifelse(runif(n)  plogis(L), 1, 0) )
dfr - data.frame(T.Grade,Sex,Smoking, L, y)
ddist - datadist(dfr)
options(datadist='ddist')
f-lrm(y~T.Grade +Sex+Smoking, data=dfr)
nom-nomogram(f,fun=function(x)1/(1+exp(-x)),fun.at=c(.01,.05,seq(. 
1,.9,by=.2),.9,1),funlabel=Risk of Death)

plot(nom, xfrac=0.45)


load package bootstrap

.the problem


This is not a problem statement. It is merely evidence that you have  
some sort of unspecified problem.



theta.fit - function(dfr,y){lsfit(dfr,y)}
theta.predict - function(fit,dfr){cbind(1,dfr)%*%fit$coef}


(There is already a predict() function for regular regression problems  
and a Predict() function for use with rms fit objects, which if you  
explained clearly what you were attempting might be urseful.)



sq.err - function(y,yhat) { (y-yhat)^2}


It's probably possible to get the sum-squared error for a linear fit  
from function lm() by more direct means. I do not understand what the  
next line is supposed to be doing, so am unable to offer further  
suggestion. Refitting the same model 50 times to the same data seems  
an uninformative exercise.



results - bootpred(x,y,50,theta.fit,theta.predict,err.meas=sq.err)
´íÎóÓÚerror to :complete.cases(x, y, wt) :  
²»ÊÇËùÓÐµÄ²ÎÊý¶¼Ò»Ñù³¤(the length of the  
augment was different)




È«¹ú×îµÍ¼Û£¬ÌìÌìÔÚ¼Ò³åÕÕÆ¬£¬24Ð¡Ê±· 
¢»õÉÏÃÅ£¡

[[alternative HTML version deleted]]


--
David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Scatterplot matrix - Pearson linear correlation and Density Ellipse

2010-10-04 Thread ashz


Thanks a lot for the answer
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Scatterplot-matrix-Pearson-linear-correlation-and-Density-Ellipse-tp2763552p2953909.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Question about splines in R

2010-10-04 Thread David Winsemius

Have you looked at lrm in rms? Harrell uses restricted cubic splines,  
but my understanding (subject to correction by my statistical betters)  
is that these are the same as natural cubic splines. The penalty  
matrix is returned as part of the fit object after being constructed  
and the model fit. You have the option of specifying knot locations or  
allowing these to be estimated.


require(rms)
?lrm
?pentrace

Option 2,,, try one of the many methods for searching?

RSiteSearch(penalized logistic)

... returned 95 hits

--
David Winsemius, MD

On Oct 4, 2010, at 1:44 AM, Yan Li wrote:


Dear all,

I wanted to fit a penalized logistic model to obtain knots-based basis
matrix as well as basis coefficients. I was wondering if there is a
package in R that can generate a natural cubic spline and construct
the corresponding penalty matrix?

Thank you!

Yan


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Reading Strings in R

2010-10-04 Thread Michael Bedward

Try thist...

your.strings - scan(yourfilename.dat, what=)

Michael

On 4 October 2010 22:04, Lorenzo Isella lorenzo.ise...@gmail.com wrote:
 Dear All,
 I know this must be a one-liner, but I am experiencing problems.
 I need to read lists of long integers stored on a text files, i.e. which
 reads like

 7359700484475972
 7359700484475972
 0
 7359700484475972
 0
 0
 0
 97189954101318722
 0
 0
 7811896690636354
 7359700490636354
 0
 0
 0
 7811896684475972
 7811896684475972
 7811896690636354
 8447597298304052
 7359700484475972
 0
 7359700498304052
 7359700498304052
 7359700498304052
 0
 7359700498304052
 7359700484475972
 7359700484475972

 Now, for me every integer is just a symbol, i.e. the ID of some object and I
 simply need to find out how many times and where it occurs in this file.
 Bottom line: I would like to read this file directly as a string to ensure I
 will not have any trouble of integer overflow or rounding error.
 I tried readLines, but I cannot get rid of \n.
 Any help is appreciated.
 Cheers

 Lorenzo

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to make list() return a list of named elements

2010-10-04 Thread Hans Ekbrand

On Thu, Sep 30, 2010 at 09:10:16AM -0400, Gabor Grothendieck wrote:
 A data frame is a list in which every component (i.e. every column)
 must have the same length (i.e. the same number of rows).
 data.frame() does preserve names:
 
  data.frame(b, my.c)
  b my.c
 1 22.48
 2 12.29
 3 10.9   15
 4  8.51
 5  9.2   14

Thanks for your suggestion. However, the reason I used list() was that
the different vectors to return usually have different lengths.

Admittedly, I should have used another example that explicated this.

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
GnuPG key: 1024D/7050614E
Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E
Learn about secure email at http://www.gnupg.org


signature.asc
Description: Digital signature
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Force evaluation of variable when calling partialPlot

2010-10-04 Thread Liaw, Andy

The plot titles aren't pretty, but the following works for me:

R library(randomForest)
randomForest 4.5-37
Type rfNews() to see new features/changes/bug fixes.
R set.seed(1004)
R iris.rf - randomForest(iris[-5], iris[[5]], ntree=1001)
R par(mfrow=c(2,2))
R for (i in 1:4) partialPlot(iris.rf, iris, names(iris)[i])

Andy

From: Ben Bond-Lamberty
 
 Dear R Users,
 
 I'm using the randomForest package and would like to generate partial
 dependence plots, one after another, for a variety of variables:
 
 m - randomForest( s, ... )
 varnames - c( var1, var2, var3, var4 )   # var1..4 are all in
 data frame s
 for( v in varnames ) {
partialPlot( x=m, pred.data=s, x.var=v )
 }
 
 ...but this doesn't work, with partialPlot complaining that it can't
 find the variable v. I think I need to force the evaluation of the
 loop variable v so that partialPlot sees the correct variable names,
 but am stumped after trying eval and similar functions. Any
 suggestions on how to do this? Googling has not turned up anything
 very useful.
 
 Thanks,
 Ben
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
Notice:  This e-mail message, together with any attachme...{{dropped:11}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] 2010年10月4日 19:14:45 自动保存草稿

2010-10-04 Thread 笑啸

dear professor:
   thank you for your help,witn your help i develop the nomogram successfully.
after that i want to do the internal validation to the model.i ues the bootpred 
to do it,and then i encounter problem again,just like that.(´íÎóÓÚerror to 
:complete.cases(x, y, wt) : ²»ÊÇËùÓÐµÄ²ÎÊý¶¼Ò»Ñù³¤(the length of the augment 
was different))
i hope you tell me where is the mistake,and maybe i have chosen the wrong 
function.
thank you
   turly yours 
 ..
load package 'rms'
 
 ddist - datadist(dfr)
 options(datadist='ddist')
 n-100
 set.seed(10)
  T.Grade-factor(0:3,labels=c(G0, G1, G2,G3))
 Sex-factor(sample(0:1, 100, replace=TRUE),labels=c(F,M))
 Smoking-factor(sample(0:1, 100, replace=TRUE),labels=c(No,yes))
 dfr$L-with(dfr,0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking)+0.92*as.numeric(Sex)-1.338)
 dfr$y - with(dfr, ifelse(runif(n)  plogis(L), 1, 0) )
  dfr - data.frame(T.Grade,Sex,Smoking, L, y)
 ddist - datadist(dfr)
  options(datadist='ddist')
 f-lrm(y~T.Grade +Sex+Smoking, data=dfr)
 nom-nomogram(f,fun=function(x)1/(1+exp(-x)),fun.at=c(.01,.05,seq(.1,.9,by=.2),.9,1),funlabel=Risk
  of Death)
 plot(nom, xfrac=0.45)
 
load package bootstrap
 
.the problem
 theta.fit - function(dfr,y){lsfit(dfr,y)}
 theta.predict - function(fit,dfr){cbind(1,dfr)%*%fit$coef}
 sq.err - function(y,yhat) { (y-yhat)^2}
 results - bootpred(x,y,50,theta.fit,theta.predict,err.meas=sq.err)
´íÎóÓÚerror to :complete.cases(x, y, wt) : ²»ÊÇËùÓÐµÄ²ÎÊý¶¼Ò»Ñù³¤(the length of 
the augment was different)
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] i have aproblem --thank you

2010-10-04 Thread 笑啸

 dear professor:
   thank you for your help,witn your help i develop the nomogram successfully.
after that i want to do the internal validation to the model.i ues the bootpred 
to do it,and then i encounter problem again,just like that.(´íÎóÓÚerror to 
:complete.cases(x, y, wt) : ²»ÊÇËùÓÐµÄ²ÎÊý¶¼Ò»Ñù³¤(the length of the augment 
was different))
i hope you tell me where is the mistake,and maybe i have chosen the wrong 
function.
thank you
   turly yours 
 ..
load package 'rms'
 
 ddist - datadist(dfr)
 options(datadist='ddist')
 n-100
 set.seed(10)
  T.Grade-factor(0:3,labels=c(G0, G1, G2,G3))
 Sex-factor(sample(0:1, 100, replace=TRUE),labels=c(F,M))
 Smoking-factor(sample(0:1, 100, replace=TRUE),labels=c(No,yes))
 dfr$L-with(dfr,0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking)+0.92*as.numeric(Sex)-1.338)
 dfr$y - with(dfr, ifelse(runif(n)  plogis(L), 1, 0) )
  dfr - data.frame(T.Grade,Sex,Smoking, L, y)
 ddist - datadist(dfr)
  options(datadist='ddist')
 f-lrm(y~T.Grade +Sex+Smoking, data=dfr)
 nom-nomogram(f,fun=function(x)1/(1+exp(-x)),fun.at=c(.01,.05,seq(.1,.9,by=.2),.9,1),funlabel=Risk
  of Death)
 plot(nom, xfrac=0.45)
 
load package bootstrap
 
.the problem
 theta.fit - function(dfr,y){lsfit(dfr,y)}
 theta.predict - function(fit,dfr){cbind(1,dfr)%*%fit$coef}
 sq.err - function(y,yhat) { (y-yhat)^2}
 results - bootpred(x,y,50,theta.fit,theta.predict,err.meas=sq.err)
´íÎóÓÚerror to :complete.cases(x, y, wt) : ²»ÊÇËùÓÐµÄ²ÎÊý¶¼Ò»Ñù³¤(the length of 
the augment was different)



È«¹ú×îµÍ¼Û£¬ÌìÌìÔÚ¼Ò³åÕÕÆ¬£¬24Ð¡Ê±·¢»õÉÏÃÅ£¡
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] have aproblem --thank you

2010-10-04 Thread 笑啸

 dear professor:
   thank you for your help,witn your help i develop the nomogram successfully.
after that i want to do the internal validation to the model.i ues the bootpred 
to do it,and then i encounter problem again,just like that.(错误于error to 
:complete.cases(x, y, wt) : 不是所有的参数都一样长(the length of the augment was 
different))
i hope you tell me where is the mistake,and maybe i have chosen the wrong 
function.
thank you
   turly yours 
 ..
load package 'rms'
 
 ddist - datadist(dfr)
 options(datadist='ddist')
 n-100
 set.seed(10)
  T.Grade-factor(0:3,labels=c(G0, G1, G2,G3))
 Sex-factor(sample(0:1, 100, replace=TRUE),labels=c(F,M))
 Smoking-factor(sample(0:1, 100, replace=TRUE),labels=c(No,yes))
 dfr$L-with(dfr,0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking)+0.92*as.numeric(Sex)-1.338)
 dfr$y - with(dfr, ifelse(runif(n)  plogis(L), 1, 0) )
  dfr - data.frame(T.Grade,Sex,Smoking, L, y)
 ddist - datadist(dfr)
  options(datadist='ddist')
 f-lrm(y~T.Grade +Sex+Smoking, data=dfr)
 nom-nomogram(f,fun=function(x)1/(1+exp(-x)),fun.at=c(.01,.05,seq(.1,.9,by=.2),.9,1),funlabel=Risk
  of Death)
 plot(nom, xfrac=0.45)
 
load package bootstrap
 
.the problem
 theta.fit - function(dfr,y){lsfit(dfr,y)}
 theta.predict - function(fit,dfr){cbind(1,dfr)%*%fit$coef}
 sq.err - function(y,yhat) { (y-yhat)^2}
 results - bootpred(x,y,50,theta.fit,theta.predict,err.meas=sq.err)
错误于error to :complete.cases(x, y, wt) : 不是所有的参数都一样长(the length of the augment 
was different)



全国最低价，天天在家冲照片，24小时发货上门！


全国最低价，天天在家冲照片，24小时发货上门！__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Issue with match.call

2010-10-04 Thread raje...@cse.iitm.ac.in


Hi,

I have a function that I'm writing. The arguments in the function are as follows

RFF-function(qtype, qOpt,...){}
i.e., I have two args that are compulsary and the rest are optional. Now when 
my user passes the function call, I need to see what optional args are defined 
and process accordingly...what I have so far is..

RFF-function(qtype, qOpt,...){
mc - match.call(expand.dots=TRUE)
 }

I need to see what all args have been sent out of
vec-c(flag,sep,dec) and define if-else conditions based on whether they 
have been defined. How do I do this?  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] I have aproblem --thank you

2010-10-04 Thread 笑啸

dear teacher:
   thank you for your help,witn your help i develop the nomogram successfully.
after that i want to do the internal validation to the nomogram.i wish to use 
the bootstrap to achieve the prediction accuracy of the nomogram,and then i 
encounter problem again,just like that.(error to :complete.cases(x, y, wt) : 
(the length of the augment was different))
i hope you tell me where is the mistake,and maybe i have chosen the wrong 
function.
thank you
   turly yours 
 ..
load package 'rms'
 
 n-100
 set.seed(10)
 T.Grade-factor(0:3,labels=c(G0, G1, G2,G3))
 Sex-factor(0:1,labels=c(F,M))
 Smoking-factor(0:1,labels=c(No,yes))
 L-0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking)
 +0.92*as.numeric(Sex)-1.338
 L
[1] -0.755 -0.172  0.363  0.946
 y - ifelse(runif(n)  plogis(L), 1, 0)
dfr - data.frame(T.Grade,Sex,Smoking, L, y)
ddist - datadist(dfr)  # wrap the vectors into a dataframe.
options(datadist='ddist')
f-lrm(y~T.Grade +Sex+Smoking, data=dfr) # skip the as.numeric()'s
### Gives an error message due to singular X matrix.
 f-lrm(y~T.Grade +Sex+Smoking, data=dfr)
singular information matrix in lrm.fit (rank= 5 ).  Offending 
variable(s):
Sex=M
Error in lrm(y ~ T.Grade + Sex + Smoking, data = dfr) :
   Unable to fit model using “lrm.fit”

#Try instead:

n-100
set.seed(10)
T.Grade-factor(0:3,labels=c(G0, G1, G2,G3))
Sex-factor(sample(0:1, 100, replace=TRUE),labels=c(F,M))
Smoking-factor(sample(0:1, 100, replace=TRUE),labels=c(No,yes))

dfr$L - with(dfr, 0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking)
+0.92*as.numeric(Sex)-1.338)
dfr$y - with(dfr, ifelse(runif(n)  plogis(L), 1, 0) )
dfr - data.frame(T.Grade,Sex,Smoking, L, y)

ddist - datadist(dfr)
options(datadist='ddist')
f-lrm(y~T.Grade +Sex+Smoking, data=dfr)
nom - nomogram(f, fun=function(x)1/(1+exp(-x)),  # or fun=plogis
 fun.at=c(.001,.01,.05,seq(.1,.9,by=.1),.95,.99,.999),
 funlabel=Risk of Death)
plot(nom, xfrac=.45)

load package bootstrap(achieve the prediction accuracy of the nomogram)
 
.the problem
 theta.fit - function(dfr,y){lsfit(dfr,y)}
 theta.predict - function(fit,dfr){cbind(1,dfr)%*%fit$coef}
 sq.err - function(y,yhat) { (y-yhat)^2}
 results - bootpred(x,y,50,theta.fit,theta.predict,err.meas=sq.err)
error to :complete.cases(x, y, wt) : (the length of the augment was different.__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to make list() return a list of named elements

2010-10-04 Thread Hans Ekbrand

On Thu, Sep 30, 2010 at 09:34:26AM -0300, Henrique Dallazuanna wrote:
 You should try:
 
 eapply(.GlobalEnv, I)[c('b', 'my.c')]

Great!

b - c(22.4, 12.2, 10.9, 8.5, 9.2)
my.c - sample.int(round(2*mean(b)), 4)

my.return - function (vector.of.variable.names) {
  eapply(.GlobalEnv, I)[vector.of.variable.names]
}

str(my.return(c(b,my.c)))
List of 2
 $ b   :Class 'AsIs'  num [1:5] 22.4 12.2 10.9 8.5 9.2
 $ my.c:Class 'AsIs'  int [1:4] 18 22 12 3

much nicer than list(b=b, my.c=my.c), especially in real cases with
longer variable names and a lot of variables to return.

Thanks Henrique!

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
   use it to ensure that this mail is from me and has not been
   altered on the way to you.


signature.asc
Description: Digital signature
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] plotmath: how to use greek symbols in expression(integral(f(tau)*dtau, 0, t))?

2010-10-04 Thread Czerminski, Ryszard

I would like to use greek tau as a symbol of variable to integrate
over in plotmath

expression(integral(f(tau)*dtau, 0,t))

but nothing seems to work. I tried d{\tau}, d\tau, etc.,
without any success

Is it possible? How can I accomplish this?

Best regards,
Ryszard


--
Confidentiality Notice: This message is private and may ...{{dropped:11}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Inserting a plot into another

2010-10-04 Thread Filoche


Hi everyone.

First, thank you for your answers, it helped me alot.

I have 1 more question regarding subplot.  I'm using this function in a
graph that contains 6 plots (3 x 2). This is producing strange behavior.
http://img545.imageshack.us/img545/9233/fig6.png  See here for the resulting
figure. 

For instance here's my code :

pdf(file=C:/Users/Modelisation/Desktop/Fig6.pdf, width = 8.5, height = 11,
pointsize = 12);

par(mfcol = c(3,2), mai = c(0.7,0.8,0,0), omi = c(1, 0, 0.7, 0.1));

plot(FSL.data_Center$Distance, 4.6/FSL.data_Center$kd.BLUE., pch = 22, bg =
gray43, xlab = Distance (km), ylab = 1% penetration depth (m) , ylim =
c(0,20), xaxt = n, yaxt = n, xpd = NA);
axis(1, at = c(100,200,300,400,500),labels = c(100,200,300,400,500),
cex.axis=1, tck = 0.02);
axis(2, tck = 0.02);

...

#Subplots
subplot(plot(lowess(FSL.data_North$Distance, 4.6/FSL.data_North$kd.BLUE.),
type = 'l', xlab = , ylab = , cex.axis = 0.75, xlim = c(0,450), ylim =
c(0,20)), 330,17, size = c(.96, .6));
subplot(plot(lowess(FSL.data_Center$Distance, 4.6/FSL.data_Center$kd.RED.),
type = 'l', xlab = , ylab = , cex.axis = 0.75, xlim = c(0,450), ylim =
c(0,20)), 330,17, size = c(.96, .6));
subplot(plot(lowess(FSL.data_South$Distance, 4.6/FSL.data_South$kd.GREEN.),
type = 'l', xlab = , ylab = , cex.axis = 0.75, xlim = c(0,450), ylim =
c(0,20)), 330,17, size = c(.96, .6));

PLOT CODE FOR THE 5 OTHER FIGURES HERE

dev.off();

The last 2 subplots are plotted in at the bottom of the page. I would like
to have the last 2 curves on the plot A.

With regards,
Phil

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Inserting-a-plot-into-another-tp2720936p2954362.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plotmath: how to use greek symbols in expression(integral(f(tau)*dtau, 0, t))?

2010-10-04 Thread peter dalgaard


On Oct 4, 2010, at 15:25 , Czerminski, Ryszard wrote:

 I would like to use greek tau as a symbol of variable to integrate
 over in plotmath
 
 expression(integral(f(tau)*dtau, 0,t))
 
 but nothing seems to work. I tried d{\tau}, d\tau, etc.,
 without any success
 
 Is it possible? How can I accomplish this?

I'd try
...f(tau)~d*tau...



 
 Best regards,
 Ryszard
 
 
 --
 Confidentiality Notice: This message is private and may ...{{dropped:11}}
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to make list() return a list of named elements

2010-10-04 Thread Gabor Grothendieck

On Mon, Oct 4, 2010 at 7:51 AM, Berwin A Turlach
ber...@maths.uwa.edu.au wrote:
 On Mon, 4 Oct 2010 19:45:23 +0800
 Berwin A Turlach ber...@maths.uwa.edu.au wrote:

 Mmh,

 R my.return - function (vector.of.variable.names) {
   sapply(vector.of.variable.names, function(x) list(get(x)))
 }

 make that:

 R my.return - function (vector.of.variable.names) {
     sapply(vector.of.variable.names, function(x) get(x))
   }


Some small tweaks.   If you use simplify=FALSE then it will guarantee
that a list is returned:

   sapply(my.names, get, simplify = FALSE)

for example, compare the outputs of:

   sapply(c(letters, LETTERS), get)
   sapply(c(letters, LETTERS), get, simplify = FALSE)

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to make list() return a list of named elements

2010-10-04 Thread Christophe Pallier

See llist from Hmisc package:

library(Hmisc)
a=rnorm(10)
b=rnorm(5)
llist(a,b)



On Thu, Sep 30, 2010 at 1:49 PM, Hans Ekbrand
hans.ekbr...@sociology.gu.se wrote:
 If I combine elements into a list

 b - c(22.4, 12.2, 10.9, 8.5, 9.2)
 my.c - sample.int(round(2*mean(b)), 5)
 my.list - list(b, my.c)

 the names of the elements seems to get lost in the process:

 str(my.list)
 List of 2
  $ : num [1:5] 22.4 12.2 10.9 8.5 9.2
  $ : int [1:5] 11 8 6 9 20

 If I explicitly name the elements at list-creation, I get what I want:

 my.list - list(b=b, my.c=my.c)

 str(my.list)
 List of 2
  $ b   : num [1:5] 22.4 12.2 10.9 8.5 9.2
  $ my.c: int [1:5] 11 8 6 9 20


 Now, is there a way to get list() (or some other function) to
 automatically name the elements?

 I often use list() in return(), and I am getting tired of having to
 repeat myself.

 --
 Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
 A. Because it breaks the logical sequence of discussion
 Q. Why is top posting bad?

 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.9 (GNU/Linux)

 iEYEARECAAYFAkykeUwACgkQfCyHKnBQYU4J+ACgrdjMSoyr/Uzt9fpTsietde3n
 d8UAnRskbOM7mDhDexiS7T9LkhRs287P
 =MsDG
 -END PGP SIGNATURE-

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Christophe Pallier  christo...@pallier.org
tel: +33 (0)1 69 08 79 34
Unité de Neuroimagerie Cognitive INSERM-CEA,
Neurospin center, F91191 Gif-sur-Yvette, France
web site: http://www.unicog.org
personal web site: www.pallier.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plotmath: how to use greek symbols in expression(integral(f(tau)*dtau, 0, t))?

2010-10-04 Thread Thomas Stewart

Try

expression(paste(integral(),f(,tau,) d,tau,sep=)))

it works for me.

-tgs

On Mon, Oct 4, 2010 at 9:25 AM, Czerminski, Ryszard 
ryszard.czermin...@astrazeneca.com wrote:

 I would like to use greek tau as a symbol of variable to integrate
 over in plotmath

 expression(integral(f(tau)*dtau, 0,t))

 but nothing seems to work. I tried d{\tau}, d\tau, etc.,
 without any success

 Is it possible? How can I accomplish this?

 Best regards,
 Ryszard


 --
 Confidentiality Notice: This message is private and may ...{{dropped:11}}

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Creating R objects in Java

2010-10-04 Thread Michael Bedward

Have you looked at JRI ?

http://www.rforge.net/JRI/

Michael

On 4 October 2010 09:29, lord12 trexi...@yahoo.com wrote:

 It seems that you can call Java objects from R but that you cannot call R
 objects from Java using the RJava interface.
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Creating-R-objects-in-Java-tp2904497p2953611.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Fisher exact test?

2010-10-04 Thread James Nead

Hi All,

My apologies if this is a totally newbie question. 

I want to calculate the probability that a particular set of genes is picked 
repeatedly for 3 samplings. For example, if from a total of 'N' genes, I pick 
'A' number of genes in the first pick, 'B' number of genes in the second pick, 
and 'C' number of genes in the third pick, I would want to calculate p(n) where 
n = (A intersection B intersection C).

Would fisher exact test be the correct method to evaluate the probability? And 
if so how can I frame the 3x3 (?) contingency table? I can find numerous 
examples of 2x2 tables, but none for 3x3. Any guidance would be appreciated.

many thanks.


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help on reading multipe files in R

2010-10-04 Thread Rahul Menon

Hi,
How can I read multiple files(in a loop like manner) within a single code 
of R ? For example, I need to run the same code for different datasets (here 
list of companies) and since individual files are quite large, appending the 
files into one file is not a desirable option. Can this be done through a macro 
or sql kind of command?

Thanks and Regards,
Rahul S Menon
Research Associate
ISB, Hyderabad
Ph-040-2318 7288


DISCLAIMER : This e-mail (including any attachments) is ...{{dropped:11}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Interpreting the example given by Frank Harrell in the predict.lrm {Design} help

2010-10-04 Thread John Haart

Dear List and Frank,

I have calculated the log-odds for my models but maybe i am not getting 
something but i am not understanding how for a categorical factor this helps? 
On all the examples i have see it relates to continuous factors where moving 
from one number to another shows either a increase or decrease, not as in my 
case a change of catagory.

Furthermore, this gives the values for each factor independent of each other, 
how do i get the log-odds for the entire model? I appreciate i maybe trying to 
put things in boxes again, i am not i am happy to report the log odds  of 
moving from one response level to the next but would like it for all the 
factors together not independently.

John

Low HighDiff.   Effect  S.E.Lower   Upper
WO  Woody:Non_woody 1   2   
NA  0.280.16-0.04   0.6
Odds Ratio  1   
2   NA  1.32NA  0.961.82
PD  Abiotic:Biotic  2   
1   NA  -1.21   0.13-1.47   -0.96
Odds Ratio  2   
1   NA  0.3 NA  0.230.38
ALT All:Low 3   
1   NA  0.470.190.110.84
Odds Ratio  3   
1   NA  1.6 NA  1.112.31
ALT High:Low3   
2   NA  -0.07   0.14-0.35   0.21
Odds Ratio  3   
2   NA  0.93NA  0.7 1.24
ALT Mid:Low 3   
4   NA  0.390.150.1 0.67
Odds Ratio  3   
4   NA  1.48NA  1.111.96
REG Two_plus:One1   2   
NA  -0.59   0.13-0.84   -0.34
Odds Ratio  1   
2   NA  0.55NA  0.430.72
BIO Arctic:Subtropical/Tropical 4   
1   NA  -1.02   0.81-2.61   0.58
Odds Ratio  4   
1   NA  0.36NA  0.071.78
BIO Boreal:Subtropical/Tropical 4   2   
NA  -1.21   0.81-2.79   0.37
Odds Ratio  4   
2   NA  0.3 NA  0.061.44
BIO Mediterranean:Subtropical/Tropical  4   3   
NA  -1.89   0.48-2.83   -0.95
Odds Ratio  4   
3   NA  0.15NA  0.060.39
BIO Temperate:Subtropical/Tropical  4   5   
NA  -0.09   0.16-0.41   0.23
Odds Ratio  4   
5   NA  0.91NA  0.661.26
On 3 Oct 2010, at 15:29, Frank Harrell wrote:


You still seem to be hung up on making arbitrary classifications.  Instead,
look at tendencies using odds ratios or rank correlation measures.  My book
Regression Modeling Strategies covers this.

Frank

-
Frank Harrell
Department of Biostatistics, Vanderbilt University
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Frank-Harrell-in-the-predict-lrm-Design-help-tp2883311p2953220.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 'all subsets' fitting algorithm for Bayesian approach

2010-10-04 Thread Michael Hopkins


Bert

Thanks for your reply but I already have the Box  Steinberg work in my 
extensive library of 'homework'  :o)

I have some problem specific priors that I need to use for calculating model 
probabilities so that I can produce predictive distributions using Bayesian 
model averaging - hence I need to be able to extract the key summary stats (as 
an analogue of the likelihood) for each model from an exhaustive selection of 
possible model terms.  

If this is available in R already then I would love to hear about it.  I am 
aware of leaps() and regsubsets() but I am not sure that provides exactly what 
I need here, though it may be possible to adapt it somehow.

Michael


On 1 Oct 2010, at 18:32, Bert Gunter wrote:

 u... You are reinventing the wheel. In fact, several wheels: the
 statistical literature already has several different approaches worked
 out for this. For example, George Box and David Steinberg did one
 about 20 years ago, and it has been incorporated as one of the options
 in the JMP DOE model choice procedure.
 
 So do your homework and save yourself some effort. Maybe even all your effort.
 
 -- Bert
 
 On Fri, Oct 1, 2010 at 7:02 AM, Michael Hopkins
 hopk...@upstreamsystems.com wrote:
 
 Hi R experts
 
 I am just wondering if something is already available (or easily adaptable) 
 to do the following.
 
 I am planning to build linear models for all possible combinations of terms, 
 so for example if the terms are sent into a function as this string
 
  X1 + X2 + X3 + X4 + X1:X2
 
 I would want to build models for all possible combinations of these 5 terms, 
 e.g.
 
m1 - lm( y ~ X1 + X3 )
 
 and capture at least the residual sum of squares and total number of model 
 parameters from each model produced.  This will become part of a Bayesian 
 approach to infer actual model probabilities when specialist prior knowledge 
 is also introduced into the problem.
 
 At a high level this particular problem requires something like:
 
 1) the term 'string' to be broken down into it's elements which are 
 separated by + and, I suppose, stored in a list for easier manipulation
 
 2) a matrix with 2^5 rows and 5 columns to be formed with a 0 present if the 
 term is not included and 1 if it is.  Then a model will be fitted to 
 represent every row of this matrix and the key statistics stored in vectors 
 of length 2^5
 
 For N terms of course the number of models will be 2^N.
 
 Is there anything available already?  This is a very similar problem to all 
 subsets regression.
 
 My skill at manipulating strings in R is very limited; can anyone recommend 
 some links or available functions which would make the separations and 
 constructions required easy to achieve?
 
 Thanks in advance to all
 
 
 Michael Hopkins
 Algorithm and Statistical Modelling Expert
 
 
 -- 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 467-7374
 http://devo.gene.com/groups/devo/depts/ncb/home.shtml




Michael Hopkins
Algorithm and Statistical Modelling Expert
 
Upstream
23 Old Bond Street
London
W1S 4PZ

Mob +44 0782 578 7220
DL   +44 0207 290 1326
Fax  +44 0207 290 1321

hopk...@upstreamsystems.com
www.upstreamsystems.com
 
IMPORTANT NOTICE
The information in this e-mail and any attached files is...{{dropped:22}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Interpreting the example given by Frank Harrell in the predict.lrm {Design} help

2010-10-04 Thread Frank Harrell


I may be missing a point, but the proportional odds model easily gives you
odds ratios for Y=j (independent of j by PO assumption).  Other options
include examining a rank correlation between the linear predictor and Y, or
(if Y is numeric and spacings between categories are meaningful) you can get
predicted mean Y (see the Mean.lrm in the R rms package, a replacement for
the Design package).

Frank 

-
Frank Harrell
Department of Biostatistics, Vanderbilt University
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Frank-Harrell-in-the-predict-lrm-Design-help-tp2883311p2954274.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help on reading multipe files in R

2010-10-04 Thread Michael Bedward

Hello,

I think you'll have to provide more details on the file type / format
and your analytical objectives (as will Arnab Kumar Maity, Researcher,
Indian School of Business who just asked a remarkably similar question
:)

Michael


On 4 October 2010 16:43, Rahul Menon rahul_me...@isb.edu wrote:
 Hi,
    How can I read multiple files(in a loop like manner) within a single code 
 of R ? For example, I need to run the same code for different datasets (here 
 list of companies) and since individual files are quite large, appending the 
 files into one file is not a desirable option. Can this be done through a 
 macro or sql kind of command?

 Thanks and Regards,
 Rahul S Menon
 Research Associate
 ISB, Hyderabad
 Ph-040-2318 7288

 
 DISCLAIMER : This e-mail (including any attachments) is ...{{dropped:11}}

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Reading Strings in R

2010-10-04 Thread Lorenzo Isella


Dear All,
I know this must be a one-liner, but I am experiencing problems.
I need to read lists of long integers stored on a text files, i.e. which 
reads like


7359700484475972
7359700484475972
0
7359700484475972
0
0
0
97189954101318722
0
0
7811896690636354
7359700490636354
0
0
0
7811896684475972
7811896684475972
7811896690636354
8447597298304052
7359700484475972
0
7359700498304052
7359700498304052
7359700498304052
0
7359700498304052
7359700484475972
7359700484475972

Now, for me every integer is just a symbol, i.e. the ID of some object 
and I simply need to find out how many times and where it occurs in this 
file.
Bottom line: I would like to read this file directly as a string to 
ensure I will not have any trouble of integer overflow or rounding error.

I tried readLines, but I cannot get rid of \n.
Any help is appreciated.
Cheers

Lorenzo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] print-show-display a matrix

2010-10-04 Thread jim holtman

?View

On Mon, Oct 4, 2010 at 7:48 AM, Alaios ala...@yahoo.com wrote:
 Hello.
 I want to print the value a matrix has. The matrix's size is not too big
 (100*100 cells). I tried print(matrixname) but as it does not fit very well on
 my screen R splits it into several small matrixes that do overflow my screen.
 IS it possible somehow to display this matrix as one (even if this needs to go
 fullscreen) . There might be some function for that like plot cells or display
 matrixes.

 I would like to thank you in advance for your help

 Best Regards
 Alex



        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help on reading multiple files in R

2010-10-04 Thread Arnab Maity

Hi,
How can I read multiple files(in a loop like manner) within a single code 
of R ? For example, I need to run the same code for different datasets (here 
list of companies) and since individual files are quite large, appending the 
files into one file is not a desirable option. Can this be done through a macro 
or sql kind of command?

Thank you in advance.


Arnab Kumar Maity
Researcher
Indian School of Business,
Hyderabad - 500032
India
Ph-  +91 40 2318 7886



DISCLAIMER : This e-mail (including any attachments) is ...{{dropped:11}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to make list() return a list of named elements

2010-10-04 Thread Hans Ekbrand

On Mon, Oct 04, 2010 at 07:45:23PM +0800, Berwin A Turlach wrote:
 G'day Hans,
 
 On Mon, 4 Oct 2010 11:28:15 +0200
 Hans Ekbrand h...@sociologi.cjb.net wrote:
 
  On Thu, Sep 30, 2010 at 09:34:26AM -0300, Henrique Dallazuanna wrote:
   You should try:
   
   eapply(.GlobalEnv, I)[c('b', 'my.c')]
  
  Great!
  
  b - c(22.4, 12.2, 10.9, 8.5, 9.2)
  my.c - sample.int(round(2*mean(b)), 4)
  
  my.return - function (vector.of.variable.names) {
eapply(.GlobalEnv, I)[vector.of.variable.names]
  }
 
 Well, if you are willing to create a vector with the variable names,
 then simpler solutions should be possible, i.e. solutions that only
 operate on the objects of interest and not on all objects in the global
 environment (which could be a lot depending on your style).  

Actually, what made me want this list-like function was when coding
the return() of the interesting results from a calculation function to
what I imagine is the global environment (I have only a vague
concept of that though). So, in the global environment there are very
few objects, while there are more objects in the function where this
list-like function will be used.

Your solution does look way cleaner the falling back to hidden stuff
as .GlobalEnv, so I will definately use it.

In addition, the returned list has is not of a strange class as in
Henriques example.

Thanks,

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E


signature.asc
Description: Digital signature
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plyr: a*ply with functions that return matrices-- possible bug in aaply?

2010-10-04 Thread hadley wickham

  That is, I want to define something like the
 following using an a*ply method, but aaply gives a result in which the
 applied .margin(s) do not appear last in the
 result, contrary to the documentation for ?aaply.  I think this is a bug,
 either in the function or the documentation,
 but perhaps there's something I misunderstand for this case.

Maybe the documentation isn't clear but I think this is behaving as expected:

 * the margin you split on comes first in the output
 * followed by the dimensions created by the applied function.

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to make list() return a list of named elements

2010-10-04 Thread Berwin A Turlach

On Mon, 4 Oct 2010 19:45:23 +0800
Berwin A Turlach ber...@maths.uwa.edu.au wrote:

Mmh,

 R my.return - function (vector.of.variable.names) {
   sapply(vector.of.variable.names, function(x) list(get(x)))
 }

make that:

R my.return - function (vector.of.variable.names) {  
 sapply(vector.of.variable.names, function(x) get(x))
   }

Cheers,

Berwin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to make list() return a list of named elements

2010-10-04 Thread Berwin A Turlach

G'day Hans,

On Mon, 4 Oct 2010 11:28:15 +0200
Hans Ekbrand h...@sociologi.cjb.net wrote:

 On Thu, Sep 30, 2010 at 09:34:26AM -0300, Henrique Dallazuanna wrote:
  You should try:
  
  eapply(.GlobalEnv, I)[c('b', 'my.c')]
 
 Great!
 
 b - c(22.4, 12.2, 10.9, 8.5, 9.2)
 my.c - sample.int(round(2*mean(b)), 4)
 
 my.return - function (vector.of.variable.names) {
   eapply(.GlobalEnv, I)[vector.of.variable.names]
 }

Well, if you are willing to create a vector with the variable names,
then simpler solutions should be possible, i.e. solutions that only
operate on the objects of interest and not on all objects in the global
environment (which could be a lot depending on your style).  For
example:

R my.return - function (vector.of.variable.names) {
  sapply(vector.of.variable.names, function(x) list(get(x)))
}
R str(my.return(c(b,my.c)))
List of 2
 $ b   : num [1:5] 22.4 12.2 10.9 8.5 9.2
 $ my.c: int [1:4] 7 5 23 4

Cheers,

Berwin

== Full address 
Berwin A Turlach  Tel.: +61 (8) 6488 3338 (secr)
School of Maths and Stats (M019)+61 (8) 6488 3383 (self)
The University of Western Australia   FAX : +61 (8) 6488 1028
35 Stirling Highway   
Crawley WA 6009e-mail: ber...@maths.uwa.edu.au
Australiahttp://www.maths.uwa.edu.au/~berwin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Question about splines in R

2010-10-04 Thread Yan Li

Dear all,

I wanted to fit a penalized logistic model to obtain knots-based basis
matrix as well as basis coefficients. I was wondering if there is a
package in R that can generate a natural cubic spline and construct
the corresponding penalty matrix?

Thank you!

Yan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Multivariate longitudinal analysis

2010-10-04 Thread abderrahim youssef

Dear all,

I am looking for an R package that fits multivariate gaussian or
non-gaussian longitudinal outcomes. I am especially interested to
non-gaussian outcomes  since the outcomes I've got are discretes (some are
binomial and some are counts).

Many thanks in advance,

Abderrahim

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Force evaluation of variable when calling partialPlot

2010-10-04 Thread Ben Bond-Lamberty

Dear R Users,

I'm using the randomForest package and would like to generate partial
dependence plots, one after another, for a variety of variables:

m - randomForest( s, ... )
varnames - c( var1, var2, var3, var4 )   # var1..4 are all in
data frame s
for( v in varnames ) {
   partialPlot( x=m, pred.data=s, x.var=v )
}

...but this doesn't work, with partialPlot complaining that it can't
find the variable v. I think I need to force the evaluation of the
loop variable v so that partialPlot sees the correct variable names,
but am stumped after trying eval and similar functions. Any
suggestions on how to do this? Googling has not turned up anything
very useful.

Thanks,
Ben

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] print-show-display a matrix

2010-10-04 Thread Alaios

Hello. 
I want to print the value a matrix has. The matrix's size is not too big 
(100*100 cells). I tried print(matrixname) but as it does not fit very well on 
my screen R splits it into several small matrixes that do overflow my screen.
IS it possible somehow to display this matrix as one (even if this needs to go 
fullscreen) . There might be some function for that like plot cells or display 
matrixes.

I would like to thank you in advance for your help

Best Regards
Alex


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help on reading multiple files in R

2010-10-04 Thread jim holtman

allFilesList - lapply(list.file(), read.table, ..args..)

On Mon, Oct 4, 2010 at 2:10 AM, Arnab Maity arnab_ma...@isb.edu wrote:
 Hi,
    How can I read multiple files(in a loop like manner) within a single code 
 of R ? For example, I need to run the same code for different datasets (here 
 list of companies) and since individual files are quite large, appending the 
 files into one file is not a desirable option. Can this be done through a 
 macro or sql kind of command?

 Thank you in advance.


 Arnab Kumar Maity
 Researcher
 Indian School of Business,
 Hyderabad - 500032
 India
 Ph-  +91 40 2318 7886


 
 DISCLAIMER : This e-mail (including any attachments) is ...{{dropped:11}}

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to make list() return a list of named elements

2010-10-04 Thread Hans Ekbrand

On Mon, Oct 04, 2010 at 07:51:10PM +0800, Berwin A Turlach wrote:
 R my.return - function (vector.of.variable.names) {  
  sapply(vector.of.variable.names, function(x) get(x))
}

Even better :-)

-- 
Hans Ekbrand (http://sociologi.cjb.net) h...@sociologi.cjb.net
GPG Fingerprint: 1408 C8D5 1E7D 4C9C C27E 014F 7C2C 872A 7050 614E


signature.asc
Description: Digital signature
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] spatial interaction (gravity) model as Poisson regression

2010-10-04 Thread Carson Farmer

Dear list,

I posted essentially this same question to the r-sig-geo mailing list
last week with no response :( Unfortunately I am no closer to reaching
a solution, so I now post it here (with some clarifications) in the
hope that someone following this list might have an answer for me:

Has anyone had much experience with spatial interaction (or gravity)
models, specifically in the form of Poisson regression? I'm a bit
unsure of how to operationalize this using glm(), and would appreciate
any pointers from those with more experience. I have searched the
archives extensively, and there are several mentions of this question,
but I have yet to find anything concrete that I can wrap my head
around... apologies if I have missed something.

Basically, the conventional origin constrained model would look
something like this:
T_{ij} = exp(\delta_{i} + \log{A_{j}} - \beta D_{ij}) ~ \varepsilon_{ij}
where \delta_{i} is a constant parameter speciﬁc to the ith zone,
A_{j} is the attractiveness of the jth location, and D_{ij} is the
distance between i and j. Note that \varepsilon_{ij} is just the
multiplicative error term of the ﬂow from i to j, and \beta is the
distance decay parameter.

Similarly, the doubly constrained model follows the form:
T_{ij} = exp(\delta_{i} + \gamma_{j} - \beta D_{ij}) ~ \varepsilon_{ij}
where everything is defined as above, except exp(\gamma_{j}) is an
estimate of the attractiveness of location A_{j}.

Hopefully the above description makes things a bit clearer,
essentially my question is this:
What factors or in what form do I have to have my data in order to be
able to run such a model following the glm syntax?
I know this should be relatively straight-forward, I just can't seem
to get my head wrapped around it at all?
If it helps, I can provide some sample data to those who request it.

TIA, Carson

-- 
Carson J. Q. Farmer
ISSP Doctoral Fellow
National Centre for Geocomputation
National University of Ireland, Maynooth,
http://www.carsonfarmer.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] I have aproblem --thank you

2010-10-04 Thread David Winsemius



On Oct 4, 2010, at 9:57 AM, 笑啸 wrote:


dear teacher:
  thank you for your help,witn your help i develop the nomogram  
successfully.

after that i want to do the internal validation to the nomogram.


So you need some standard for accuracy, which would only come from the  
original data. Hence my suggestion that you bring the original data  
into R and rms. You cannot validate that model with the approaches you  
are attempting.


And it would not be correct to say that you are validating the  
nomogram. It is simply a graphical tool for implementing predictions  
from a statistical model. You should be thinking of this as exploring  
the



i wish to use the bootstrap to achieve the prediction accuracy of  
the nomogram,and then i encounter problem again,just like that. 
(error to :complete.cases(x, y, wt) : (the length of the augment was  
different))
i hope you tell me where is the mistake,and maybe i have chosen the  
wrong function.


More accurately, I think you have chosen the wrong strategy. I think  
you need to add some statistical consultation fees to your research  
budget. Or you may get away with buying Regression Modeling  
Strategies and using it to work through the many worked examples that  
accompany the rms package. The first option may be faster, but the  
second option is probably slower but should result in a higher level  
of statistical understanding for you. The proper choice for you  
depends on your deadline, available time and what monetary value you  
assign to that spare time.


--
David.


thank you
  turly yours
..
load package 'rms'


n-100
set.seed(10)
T.Grade-factor(0:3,labels=c(G0, G1, G2,G3))
Sex-factor(0:1,labels=c(F,M))
Smoking-factor(0:1,labels=c(No,yes))
L-0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking)
+0.92*as.numeric(Sex)-1.338

L

[1] -0.755 -0.172  0.363  0.946

y - ifelse(runif(n)  plogis(L), 1, 0)
dfr - data.frame(T.Grade,Sex,Smoking, L, y)

ddist - datadist(dfr)  # wrap the vectors into a dataframe.
options(datadist='ddist')
f-lrm(y~T.Grade +Sex+Smoking, data=dfr) # skip the as.numeric()'s
### Gives an error message due to singular X matrix.

f-lrm(y~T.Grade +Sex+Smoking, data=dfr)

singular information matrix in lrm.fit (rank= 5 ).  Offending
variable(s):
Sex=M
Error in lrm(y ~ T.Grade + Sex + Smoking, data = dfr) :
  Unable to fit model using “lrm.fit”

#Try instead:

n-100
set.seed(10)
T.Grade-factor(0:3,labels=c(G0, G1, G2,G3))
Sex-factor(sample(0:1, 100, replace=TRUE),labels=c(F,M))
Smoking-factor(sample(0:1, 100, replace=TRUE),labels=c(No,yes))

dfr$L - with(dfr, 0.559*as.numeric(T.Grade)-0.896*as.numeric(Smoking)
+0.92*as.numeric(Sex)-1.338)
dfr$y - with(dfr, ifelse(runif(n)  plogis(L), 1, 0) )
dfr - data.frame(T.Grade,Sex,Smoking, L, y)

ddist - datadist(dfr)
options(datadist='ddist')
f-lrm(y~T.Grade +Sex+Smoking, data=dfr)
nom - nomogram(f, fun=function(x)1/(1+exp(-x)),  # or fun=plogis
fun.at=c(.001,.01,.05,seq(.1,.9,by=.1),.95,.99,.999),
funlabel=Risk of Death)
plot(nom, xfrac=.45)

load package bootstrap(achieve the prediction accuracy of the  
nomogram)


.the problem

theta.fit - function(dfr,y){lsfit(dfr,y)}
theta.predict - function(fit,dfr){cbind(1,dfr)%*%fit$coef}
sq.err - function(y,yhat) { (y-yhat)^2}
results - bootpred(x,y,50,theta.fit,theta.predict,err.meas=sq.err)
error to :complete.cases(x, y, wt) : (the length of the augment was  
different.__

--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Issue with match.call

2010-10-04 Thread Joshua Wiley

Hi,

Something along these lines should get you there:

RFF - function(qtype, qOpt, ...) {
  mc - match.call(expand.dots=TRUE)
  vec - c(flag,sep,dec)
  matchedargs - match(vec, names(mc), FALSE)
}

'matchedargs' will be a vector of the positions in mc where it matched
'vec' (or 0 if it did not).  If you just want to extract the actual
arguments passed afterward, use the [[ extraction operator.

Cheers,

Josh

On Mon, Oct 4, 2010 at 6:45 AM, raje...@cse.iitm.ac.in
raje...@cse.iitm.ac.in wrote:

 Hi,

 I have a function that I'm writing. The arguments in the function are as 
 follows

 RFF-function(qtype, qOpt,...){}
 i.e., I have two args that are compulsary and the rest are optional. Now when 
 my user passes the function call, I need to see what optional args are 
 defined and process accordingly...what I have so far is..

 RFF-function(qtype, qOpt,...){
        mc - match.call(expand.dots=TRUE)
  }

 I need to see what all args have been sent out of
 vec-c(flag,sep,dec) and define if-else conditions based on whether 
 they have been defined. How do I do this?
        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fisher exact test?

2010-10-04 Thread Thomas Stewart

To find the probability directly, you'll need to clearly state how you are
sampling.  Are you sampling with replacement? (Draw then put back.)  Or are
you sampling without replacement? (Draw.  Draw from the remaining.  Draw
from the remaining.)

Also, of the N genes, do you know how many are of the particular set?
How many genes do you draw each try?

Does order matter to you?  In other words is (A=5, B=0, C=23) equivalent to
(A=0,B=5,C=23)?

-tgs

On Mon, Oct 4, 2010 at 10:13 AM, James Nead james_n...@yahoo.com wrote:

 Hi All,

 My apologies if this is a totally newbie question.

 I want to calculate the probability that a particular set of genes is
 picked
 repeatedly for 3 samplings. For example, if from a total of 'N' genes, I
 pick
 'A' number of genes in the first pick, 'B' number of genes in the second
 pick,
 and 'C' number of genes in the third pick, I would want to calculate p(n)
 where
 n = (A intersection B intersection C).

 Would fisher exact test be the correct method to evaluate the probability?
 And
 if so how can I frame the 3x3 (?) contingency table? I can find numerous
 examples of 2x2 tables, but none for 3x3. Any guidance would be
 appreciated.

 many thanks.



[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Splitting a DF into rows according to a column

2010-10-04 Thread Johannes Graumann

Hi,

I'm turning my wheels on this and keep coming around to the same wrong 
solution - please have a look and give a hand ...

The premise is: a DF like so

 loremIpsum - Lorem ipsum dolor sit amet, consectetur adipiscing elit. 
Quisque leo ipsum, ultricies scelerisque volutpat non, volutpat et nulla. 
Curabitur consequat ullamcorper tellus id imperdiet. Duis semper malesuada 
nulla, blandit lobortis diam fringilla at. Vestibulum nec tellus orci, eu 
sollicitudin quam. Phasellus sit amet enim diam. Phasellus mattis hendrerit 
varius. Curabitur ut tristique enim. Lorem ipsum dolor sit amet, consectetur 
adipiscing elit. Sed convallis, tortor id vehicula facilisis, nunc justo 
facilisis tellus, sed eleifend nisi lacus id purus. Maecenas tempus 
sollicitudin libero, molestie laoreet metus dapibus eu. Mauris justo ante, 
mattis et pulvinar a, varius pretium eros. Curabitur fringilla dui ac dui 
rutrum pretium. Donec sed magna adipiscing nisi accumsan congue sed ac est. 
Vivamus lorem urna, tristique quis accumsan quis, ullamcorper aliquet 
velit.
 tmpDF - data.frame(Column1=rep(unlist(strsplit(loremIpsum, 
)),length.out=510),Column2=runif(510,min=0,max=1e8))

is to be split into DFs with 50 entries in an ordered manner according to 
column2 (first DF ist o contain the rows with the 50 largest numbers, ...).

Here is what I have been doing:

 binSize - 50
 splitMembership - 
pmin(ceiling(order(tmpDF[[Column2]],decreasing=TRUE)/binSize),floor(nrow(tmpDF)/binSize))
 splitList - split(tmpDF,splitMembership)

Distribution seems to work ...
 sapply(splitList,nrow)

But this is NOT what I wanted ...
 sapply(splitList,function(x){max(x[[Column2]])})
This was supposed to give me bins that are Column2-sorted and bin one should 
have a higher max than 2 than 3 ...

Can anyone point out where (my now 3 reimplementations) fail?

Thanks, Stupid Joh

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] input matrix for leaps algorithm

2010-10-04 Thread CZ


Hello,

I was trying to use the leaps algorithm for my variable selection.  I tried
different matrices calculated from multiple linear regression, linear
discriminant analysis, and the basic correlation matrix as inputs to the
leaps algorithm.  But I always got the error message - the matrix is not
positive definite.  

I have more variables than the number of observations/samples.  I know this
may be a problem.  But the multiple linear regression in SAS seems to be
able to handle the case when there are more variables than the number of
observations.  I wonder if there is a way I can still do my variable
selection instead of reducing the number of variables.  Or there may be
something wrong with my function calls.  

Thank you. 



-- 
View this message in context: 
http://r.789695.n4.nabble.com/input-matrix-for-leaps-algorithm-tp2954453p2954453.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Scatterplot matrix - Pearson linear correlation and Density Ellipse

2010-10-04 Thread ashz


Hi,

strangely, when I run this script:

panel.cor = function(x, y, digits=2, prefix=r=, cex.cor)
{
usr = par(usr); on.exit(par(usr))
par(usr = c(0, 1, 0, 1))
r = abs(cor(x, y, use=pairwise.complete.obs, method = pearson))
txt = format(c(r, 0.123456789), digits=digits)[1]
txt = paste(prefix, txt, sep=)
if(missing(cex.cor)) cex.cor = 0.8/strwidth(txt)
text(0.5, 0.5, txt, cex = cex.cor)
}

 pairs(cap[8:11], lower.panel=panel.lm, upper.panel=panel.cor, cex=2)


I get linear regression and density ellipse, why?

Is it possible to add also the data point?

Thanks a lot

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Scatterplot-matrix-Pearson-linear-correlation-and-Density-Ellipse-tp2763552p2954416.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Plot for Binomial GLM

2010-10-04 Thread klsk89


Hi i would like to use some graphs or tables to explore the data and make
some sensible guesses of what  to expect to see in a glm model to assess if
toxin concentration and sex have a relationship with the kill rate of rats.
But i cant seem to work it out as i have two predictor
variables~help?Thanks.:)

Here's my data.

 rat.toxic-read.table(file=Rats.csv,header=T,row.names=NULL,sep=,)
 attach(rat.toxic)
 names(rat.toxic)
[1] Dose  Sex   Dead  Alive
 rat.toxic
   Dose Sex Dead Alive
110   F119
210   M020
320   F416
420   M416
530   F911
630   M812
740   F   13 7
840   M   13 7
950   F   18 2
10   50   M   17 3
11   60   F   20 0
12   60   M   16 4
13   10   F317
14   10   M119
15   20   F218
16   20   M218
17   30   F   1010
18   30   M812
19   40   F   14 6
20   40   M   12 8
21   50   F   16 4
22   50   M   13 7
23   60   F   18 2
24   60   M   16 4

glm2-glm(deadalive~Dose*Sex,family=binomial,data=rat.toxic)
 anova(glm2,test=Chi)
Analysis of Deviance Table

Model: binomial, link: logit

Response: deadalive

Terms added sequentially (first to last)


 Df Deviance Resid. Df Resid. Dev P(|Chi|)
NULL23225.455  
Dose  1  202.36622 23.0902e-16 ***
Sex   14.32821 18.7620.0375 *  
Dose:Sex  11.14920 17.6130.2838
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
 summary(glm2)

Call:
glm(formula = deadalive ~ Dose * Sex, family = binomial, data = rat.toxic)

Deviance Residuals: 
 Min1QMedian3Q   Max  
-1.82241  -0.85632   0.06675   0.61981   1.47874  

Coefficients:
Estimate Std. Error z value Pr(|z|)
(Intercept) -3.479390.46167  -7.537 4.83e-14 ***
Dose 0.105970.01286   8.243   2e-16 ***
SexM 0.155010.63974   0.2420.809
Dose:SexM   -0.018210.01707  -1.0670.286
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 225.455  on 23  degrees of freedom
Residual deviance:  17.613  on 20  degrees of freedom
AIC: 91.115

Number of Fisher Scoring iterations: 4






-- 
View this message in context: 
http://r.789695.n4.nabble.com/Plot-for-Binomial-GLM-tp2954406p2954406.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How many R packages are not free?

2010-10-04 Thread Paul Miller

Thanks to everyone who replied to my post. One person indicated that this 
question treads on dangerous terrain. I had that sense too but really wanted 
to know how many packages might not be free in the sense that they could be 
used at no cost by everyone. I was a little disappointed to learn that there 
are packages that are free for some people but not for others. But I understand 
that it's not me who has put time and effort into developing the base software 
or the various add-on packages. And so clearly it's up to others who have put 
in the time and effort to make those decisions.
 
Paul


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Issue with match.call

2010-10-04 Thread Hadley Wickham

 RFF-function(qtype, qOpt,...){}
 i.e., I have two args that are compulsary and the rest are optional. Now when 
 my user passes the function call, I need to see what optional args are 
 defined and process accordingly...what I have so far is..

 RFF-function(qtype, qOpt,...){
        mc - match.call(expand.dots=TRUE)
  }

 I need to see what all args have been sent out of
 vec-c(flag,sep,dec) and define if-else conditions based on whether 
 they have been defined. How do I do this?

I think you'd be much better off defining those as arguments and using
missing(), rather than messing around with match.call (unless there is
a specific reason you need the unevaluated expressions).

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Plot for Binomial GLM

2010-10-04 Thread Joshua Wiley

On Mon, Oct 4, 2010 at 7:21 AM, klsk89 karenkls...@yahoo.com wrote:

 Hi i would like to use some graphs or tables to explore the data and make
 some sensible guesses of what  to expect to see in a glm model to assess if
 toxin concentration and sex have a relationship with the kill rate of rats.
 But i cant seem to work it out as i have two predictor
 variables~help?Thanks.:)

What about xtabs?   For instance:

xtabs(deadalive ~ Dose + Sex, data = rat.toxic)

Regarding graphs, take a look at faceting in ggplot2 (or lattice).
You can get something close to the 3 way table but in graphical form
that way.  I am not sure if this is completely up and running yet, but
I know there has been work linking ggobi with R.  I have seen a few
demonstrations that looked quite promising, and it may work well for
you to visualize three variables at once (and interactively).  Here is
the link:  http://www.ggobi.org/rggobi/


 Here's my data.

 rat.toxic-read.table(file=Rats.csv,header=T,row.names=NULL,sep=,)
 attach(rat.toxic)
  ^  why attach it?
 names(rat.toxic)
 [1] Dose  Sex   Dead  Alive
 rat.toxic
   Dose Sex Dead Alive
 1    10   F    1    19
 2    10   M    0    20
 3    20   F    4    16
 4    20   M    4    16
 5    30   F    9    11
 6    30   M    8    12
 7    40   F   13     7
 8    40   M   13     7
 9    50   F   18     2
 10   50   M   17     3
 11   60   F   20     0
 12   60   M   16     4
 13   10   F    3    17
 14   10   M    1    19
 15   20   F    2    18
 16   20   M    2    18
 17   30   F   10    10
 18   30   M    8    12
 19   40   F   14     6
 20   40   M   12     8
 21   50   F   16     4
 22   50   M   13     7
 23   60   F   18     2
 24   60   M   16     4

Please tell me that after this, you converted the counts of dead and
alive into a single variable that had a 0 or 1 if dead and the
opposite as alive before you used it as the dependent variable in your
logistic regression.

 glm2-glm(deadalive~Dose*Sex,family=binomial,data=rat.toxic)
 anova(glm2,test=Chi)
 Analysis of Deviance Table

 Model: binomial, link: logit

 Response: deadalive

 Terms added sequentially (first to last)


         Df Deviance Resid. Df Resid. Dev P(|Chi|)
 NULL                        23    225.455
 Dose      1  202.366        22     23.090    2e-16 ***
 Sex       1    4.328        21     18.762    0.0375 *
 Dose:Sex  1    1.149        20     17.613    0.2838
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 summary(glm2)

 Call:
 glm(formula = deadalive ~ Dose * Sex, family = binomial, data = rat.toxic)

 Deviance Residuals:
     Min        1Q    Median        3Q       Max
 -1.82241  -0.85632   0.06675   0.61981   1.47874

 Coefficients:
            Estimate Std. Error z value Pr(|z|)
 (Intercept) -3.47939    0.46167  -7.537 4.83e-14 ***
 Dose         0.10597    0.01286   8.243   2e-16 ***
 SexM         0.15501    0.63974   0.242    0.809
 Dose:SexM   -0.01821    0.01707  -1.067    0.286
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 (Dispersion parameter for binomial family taken to be 1)

    Null deviance: 225.455  on 23  degrees of freedom
 Residual deviance:  17.613  on 20  degrees of freedom
 AIC: 91.115

 Number of Fisher Scoring iterations: 4






 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Plot-for-Binomial-GLM-tp2954406p2954406.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Splitting a DF into rows according to a column

2010-10-04 Thread peter dalgaard


On Oct 4, 2010, at 16:57 , Johannes Graumann wrote:

 Hi,
 
 I'm turning my wheels on this and keep coming around to the same wrong 
 solution - please have a look and give a hand ...
 
 The premise is: a DF like so
 
 loremIpsum - Lorem ipsum dolor sit amet, consectetur adipiscing elit. 
 Quisque leo ipsum, ultricies scelerisque volutpat non, volutpat et nulla. 
 Curabitur consequat ullamcorper tellus id imperdiet. Duis semper malesuada 
 nulla, blandit lobortis diam fringilla at. Vestibulum nec tellus orci, eu 
 sollicitudin quam. Phasellus sit amet enim diam. Phasellus mattis hendrerit 
 varius. Curabitur ut tristique enim. Lorem ipsum dolor sit amet, consectetur 
 adipiscing elit. Sed convallis, tortor id vehicula facilisis, nunc justo 
 facilisis tellus, sed eleifend nisi lacus id purus. Maecenas tempus 
 sollicitudin libero, molestie laoreet metus dapibus eu. Mauris justo ante, 
 mattis et pulvinar a, varius pretium eros. Curabitur fringilla dui ac dui 
 rutrum pretium. Donec sed magna adipiscing nisi accumsan congue sed ac est. 
 Vivamus lorem urna, tristique quis accumsan quis, ullamcorper aliquet 
 velit.
 tmpDF - data.frame(Column1=rep(unlist(strsplit(loremIpsum, 
 )),length.out=510),Column2=runif(510,min=0,max=1e8))
 
 is to be split into DFs with 50 entries in an ordered manner according to 
 column2 (first DF ist o contain the rows with the 50 largest numbers, ...).
 
 Here is what I have been doing:
 
 binSize - 50
 splitMembership - 
 pmin(ceiling(order(tmpDF[[Column2]],decreasing=TRUE)/binSize),floor(nrow(tmpDF)/binSize))
 splitList - split(tmpDF,splitMembership)
 
 Distribution seems to work ...
 sapply(splitList,nrow)
 
 But this is NOT what I wanted ...
 sapply(splitList,function(x){max(x[[Column2]])})
 This was supposed to give me bins that are Column2-sorted and bin one should 
 have a higher max than 2 than 3 ...
 
 Can anyone point out where (my now 3 reimplementations) fail?
 
 Thanks, Stupid Joh

Dear Stupid Joh, 

Have you considered something along the lines of

o - order(-x$Column2)
xx - x[o,]
split(xx, (seq_len(NROW(x))-1) %/% 50)

The above is a bit hard to follow, but it seems to work better with rank() 
instead of order():

 splitMembership - 
+ pmin(ceiling(rank(-tmpDF[[Column2]])/binSize),floor(nrow(tmpDF)/binSize))
 splitList - split(tmpDF,splitMembership) sapply(splitList,nrow)
 1  2  3  4  5  6  7  8  9 10 
50 50 50 50 50 50 50 50 50 60 
 sapply(splitList,function(x){max(x[[Column2]])})
   123456 
99877498 90567877 81965382 69112280 59814266 52130373 
   789   10 
41557660 32630212 21226996 11880032 


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plotmath: how to use greek symbols in expression(integral(f(tau)*dtau, 0, t))?

2010-10-04 Thread David Winsemius



On Oct 4, 2010, at 10:11 AM, Thomas Stewart wrote:


Try

expression(paste(integral(),f(,tau,) d,tau,sep=)))

it works for me.


It does work (albeit without properly including the requested limits  
of integrations), but if it is used as a teaching example, it will  
obscure the syntax of plotmath expressions. A simple version that gets  
the same result is:


 expression(integral()*f(tau)*d*tau))

Plotmath expressions are parsed from unquoted text with proper non- 
printing separators being ~ for space and * for juxtaposition. The  
plotmath paste() is not really the same as the base function paste(),  
despite identical names and its main value is as a method for allowing  
commas to also be used as separators. Frankly I'm a bit surprised you  
got away with adding the undocumented sep= argument. My reading of  
the help page for plotmath would have lead me to predict that the text  
'sep=' should appear in the output. Instead the sep= is ignored  
and the text string is appended to the end of the expression


Try for instance:

plot(1,1, main=expression(paste(a,b,c,d, sep=-) )  )

--
David.



-tgs

On Mon, Oct 4, 2010 at 9:25 AM, Czerminski, Ryszard 
ryszard.czermin...@astrazeneca.com wrote:


I would like to use greek tau as a symbol of variable to integrate
over in plotmath

expression(integral(f(tau)*dtau, 0,t))

but nothing seems to work. I tried d{\tau}, d\tau, etc.,
without any success

Is it possible? How can I accomplish this?

Best regards,
Ryszard


--
Confidentiality Notice: This message is private and may ... 
{{dropped:11}}


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Debye function

2010-10-04 Thread Christophe Dutang

Dear list,

Is there another package than gsl* where the Debye function (of first order)
is implemented? Google only finds the gsl package.

Thanks in advance

Christophe

* page 12 of reference manual of gsl.
http://cran.r-project.org/web/packages/gsl/gsl.pdf
-- 
Christophe DUTANG
Ph. D. student at ISFA, Lyon, France

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Error during scp transfer

2010-10-04 Thread wesley mathew

Dear all

I am implement one grid computing system using GridR package. I am
working on windows platform. I did all the configuration according to
the GridR tutorial.
But it has some errors when I execute grid.apply function.
Here is the output from stderr:
R: not found
java.io.IOException: Error during SCP transfer.
at com.trilead.ssh2.SCPClient.get(SCPClient.java:703)
at com.trilead.ssh2.SCPClient.get(SCPClient.java:596)
at MySSh.download(MySSh.java:104)
at MySSh.main(MySSh.java:151)
Caused by: java.io.IOException: Remote SCP error: scp:
grid/grid-ORNITORRINCO-1708-2010-10-04-15-28-47-9-script.Rout: No such file
or directory
at com.trilead.ssh2.SCPClient.receiveFiles(SCPClient.java:325)
at com.trilead.ssh2.SCPClient.get(SCPClient.java:699)
... 3 more
Grid job had error:
SSH Connection Error:Error during SCP transfer.

Could you please give some suggestion to solve this problem ?
Thanks in advance.

Kind Regards
W. Mathew

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Debye function

2010-10-04 Thread Spencer Graves



install.packages('sos') # if not already installed
library(sos)
Debye - ???Debye
#found 4 matches
Debye # Print method opens table in a web browser


  Conclusions:


1.  There are references to Debye in the CHNOSZ package, 
but it may not be what you want.



2.  If it's otherwise available in R, it's very well 
concealed.



  Spencer


On 10/4/2010 8:56 AM, Christophe Dutang wrote:

Dear list,

Is there another package than gsl* where the Debye function (of first order)
is implemented? Google only finds the gsl package.

Thanks in advance

Christophe

* page 12 of reference manual of gsl.
http://cran.r-project.org/web/packages/gsl/gsl.pdf



--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] read columns of quoted numbers as factors

2010-10-04 Thread james hirschorn

Suppose I have a data file (possibly with a huge number of columns), where the 
columns with factors are coded as 1, 2, 3, etc ... The default behavior 
of 
read.table is to convert these columns to integer vectors. 

Is there a way to get read.table to recognize that columns of quoted numbers 
represent factors (while unquoted numbers are interpreted as integers), without 
explicitly setting them with colClasses ?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Error message with scan() function

2010-10-04 Thread Tariq Perwez

Hi Everyone,

I am new R user and am trying to learn by reading the online manual An
Introduction to R from the R web site. I am trying to practice using the
scan() function as explained in the manual. For this I first created three
vectors (one a character vector and two numeric one) and saved file
input.dat in my working directory as:

 label - c(Bill, Tom, Pat, Will, Sue, Sam)
 x - 1:6
 y - 6:1
 save(label, x, y, file = input.dat)

Now, when I try the scan() function as explained in the manual as below, I
get the error message:

 inp - scan(input.dat, list(,0,0))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
:
  scan() expected 'a real', got 'X'

I have not been able to figure out what the problem is. I would appreciate
if someone could please guide me as to what I am doing wrong. Regards,

Tariq

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help with apply

2010-10-04 Thread Doran, Harold

Suppose I have the following data:

tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = 
sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE))

I can run the following double loop and yield what I want in the end (rr1) as:

library(statmod)
Q - 2
b - runif(3)
qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
rr1 - matrix(0, nrow = Q, ncol = nrow(tmp))
L - nrow(tmp)
for(j in 1:Q){
for(i in 1:L){
rr1[j,i] - 
exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * 
exp(exp(qq$nodes[j]-b)) *

((1/(s*sqrt(2*pi)))  * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * 
qq$weights[j]
}
}
rr1

But, I think this is so inefficient for large Q and nrow(tmp). The function I 
am looping over is:

fn - function(x, nodes, weights, s, b) {
exp(sum(log((exp(x*(nodes-b))) / (factorial(x) * 
exp(exp(nodes-b)) *
((1/(s*sqrt(2*pi)))  * 
exp(-((nodes-0)^2/(2*s^2/dnorm(nodes) * weights
}

I've tried using apply in a few ways, but I can't replicate rr1 from the double 
loop. I can go through each value of Q one step at a time and get matching 
values in the rr1 table, but this would still require a loop and that I think 
can be avoided.

apply(tmp, 1, fn, nodes = qq$nodes[1], weights = qq$weights[1], s = 1, b = b)

Does anyone see an efficient way to compute rr1 without a loop?

Harold

 sessionInfo()
R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252  
  LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C   LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] minqa_1.1.9 Rcpp_0.8.6  MiscPsycho_1.6  statmod_1.4.6   
lattice_0.17-26 gdata_2.8.0

loaded via a namespace (and not attached):
[1] grid_2.10.1  gtools_2.6.2 tools_2.10.1

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Debye function

2010-10-04 Thread Bert Gunter

Well, not really ...

Google on R package Debye . The 5th hit lists the gsl package with
the debye() function.

Moral: DO make use of standard professional search tools, also.

-- Bert

On Mon, Oct 4, 2010 at 9:13 AM, Spencer Graves
spencer.gra...@structuremonitoring.com wrote:

 install.packages('sos') # if not already installed
 library(sos)
 Debye - ???Debye
 #found 4 matches
 Debye # Print method opens table in a web browser


      Conclusions:


            1.  There are references to Debye in the CHNOSZ package, but it
 may not be what you want.


            2.  If it's otherwise available in R, it's very well concealed.


      Spencer


 On 10/4/2010 8:56 AM, Christophe Dutang wrote:

 Dear list,

 Is there another package than gsl* where the Debye function (of first
 order)
 is implemented? Google only finds the gsl package.

 Thanks in advance

 Christophe

 * page 12 of reference manual of gsl.
 http://cran.r-project.org/web/packages/gsl/gsl.pdf


 --
 Spencer Graves, PE, PhD
 President and Chief Operating Officer
 Structure Inspection and Monitoring, Inc.
 751 Emerson Ct.
 San José, CA 95126
 ph:  408-655-4567

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Debye function

2010-10-04 Thread Bert Gunter

OOPs. I Only read Spencer's reply. Sorry for repeating the poster's
original comment.

-- Bert

On Mon, Oct 4, 2010 at 10:10 AM, Bert Gunter bgun...@gene.com wrote:
 Well, not really ...

 Google on R package Debye . The 5th hit lists the gsl package with
 the debye() function.

 Moral: DO make use of standard professional search tools, also.

 -- Bert

 On Mon, Oct 4, 2010 at 9:13 AM, Spencer Graves
 spencer.gra...@structuremonitoring.com wrote:

 install.packages('sos') # if not already installed
 library(sos)
 Debye - ???Debye
 #found 4 matches
 Debye # Print method opens table in a web browser


      Conclusions:


            1.  There are references to Debye in the CHNOSZ package, but it
 may not be what you want.


            2.  If it's otherwise available in R, it's very well concealed.


      Spencer


 On 10/4/2010 8:56 AM, Christophe Dutang wrote:

 Dear list,

 Is there another package than gsl* where the Debye function (of first
 order)
 is implemented? Google only finds the gsl package.

 Thanks in advance

 Christophe

 * page 12 of reference manual of gsl.
 http://cran.r-project.org/web/packages/gsl/gsl.pdf


 --
 Spencer Graves, PE, PhD
 President and Chief Operating Officer
 Structure Inspection and Monitoring, Inc.
 751 Emerson Ct.
 San José, CA 95126
 ph:  408-655-4567

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Bert Gunter
 Genentech Nonclinical Biostatistics




-- 
Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://devo.gene.com/groups/devo/depts/ncb/home.shtml

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error message with scan() function

2010-10-04 Thread David Winsemius



On Oct 4, 2010, at 12:51 PM, Tariq Perwez wrote:


Hi Everyone,

I am new R user and am trying to learn by reading the online manual  
An
Introduction to R from the R web site. I am trying to practice  
using the
scan() function as explained in the manual. For this I first created  
three

vectors (one a character vector and two numeric one) and saved file
input.dat in my working directory as:


label - c(Bill, Tom, Pat, Will, Sue, Sam)
x - 1:6
y - 6:1
save(label, x, y, file = input.dat)


Now, when I try the scan() function as explained in the manual as  
below, I

get the error message:


Typically the output and input functions are paired and the natural  
pairing for the save function is with the load() function. The scan()  
function is for text files (and requires specification if the vector  
types if not all numeric, while save() stores R objects with a  
compressed format that preserves the object identities and attributes.  
Other pairings:


dput/dget
write/read

You probably ought to be also reading the R Import/Export Manual:

http://cran.r-project.org/doc/manuals/R-data.pdf





inp - scan(input.dat, list(,0,0))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines,  
na.strings,

:
 scan() expected 'a real', got 'X'


?scan

scan() expects a real when not given an argument for what. Did you  
look at that file with a text editor?


--
David.


I have not been able to figure out what the problem is. I would  
appreciate

if someone could please guide me as to what I am doing wrong. Regards,

Tariq


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Inserting a plot into another

2010-10-04 Thread Greg Snow

OK, I found the bug in subplot (it was never tested with multiple figures).  
The bug has been fixed, but probably won't make it to CRAN any time soon.  In 
the meantime there are a couple of work arounds.

1. create your own local copy of subplot and edit it so that the 1st 3 lines of 
the body are:

type - match.arg(type)
old.par - par( c(type, 'usr', names(pars) ) )
on.exit(par(old.par))

(this means moving the 'type -' line and modifying the 'old.par -' line ).

Then use this copy instead of the one in TeachingDemos.

2.  The problem comes because par('mfg') is getting updated by subplot when it 
shouldn't be, you can just set par(mfg=  ) yourself after each call to subplot.

3.  wait a couple of days for the R-forge version to be updated, then install 
the R-forge version of TeachingDemos 2.8.

Sorry for the problems and hope this helps,
 

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Filoche
 Sent: Monday, October 04, 2010 7:58 AM
 To: r-help@r-project.org
 Subject: Re: [R] Inserting a plot into another
 
 
 Hi everyone.
 
 First, thank you for your answers, it helped me alot.
 
 I have 1 more question regarding subplot.  I'm using this function in a
 graph that contains 6 plots (3 x 2). This is producing strange
 behavior.
 http://img545.imageshack.us/img545/9233/fig6.png  See here for the
 resulting
 figure.
 
 For instance here's my code :
 
 pdf(file=C:/Users/Modelisation/Desktop/Fig6.pdf, width = 8.5, height
 = 11,
 pointsize = 12);
 
 par(mfcol = c(3,2), mai = c(0.7,0.8,0,0), omi = c(1, 0, 0.7, 0.1));
 
 plot(FSL.data_Center$Distance, 4.6/FSL.data_Center$kd.BLUE., pch = 22,
 bg =
 gray43, xlab = Distance (km), ylab = 1% penetration depth (m) ,
 ylim =
 c(0,20), xaxt = n, yaxt = n, xpd = NA);
 axis(1, at = c(100,200,300,400,500),labels = c(100,200,300,400,500),
 cex.axis=1, tck = 0.02);
 axis(2, tck = 0.02);
 
 ...
 
 #Subplots
 subplot(plot(lowess(FSL.data_North$Distance,
 4.6/FSL.data_North$kd.BLUE.),
 type = 'l', xlab = , ylab = , cex.axis = 0.75, xlim = c(0,450),
 ylim =
 c(0,20)), 330,17, size = c(.96, .6));
 subplot(plot(lowess(FSL.data_Center$Distance,
 4.6/FSL.data_Center$kd.RED.),
 type = 'l', xlab = , ylab = , cex.axis = 0.75, xlim = c(0,450),
 ylim =
 c(0,20)), 330,17, size = c(.96, .6));
 subplot(plot(lowess(FSL.data_South$Distance,
 4.6/FSL.data_South$kd.GREEN.),
 type = 'l', xlab = , ylab = , cex.axis = 0.75, xlim = c(0,450),
 ylim =
 c(0,20)), 330,17, size = c(.96, .6));
 
 PLOT CODE FOR THE 5 OTHER FIGURES HERE
 
 dev.off();
 
 The last 2 subplots are plotted in at the bottom of the page. I would
 like
 to have the last 2 curves on the plot A.
 
 With regards,
 Phil
 
 --
 View this message in context: http://r.789695.n4.nabble.com/Inserting-
 a-plot-into-another-tp2720936p2954362.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help with apply

2010-10-04 Thread Gabriela Cendoya

You are missing s in your definitions so I can't reproduce your code.

 tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = 
 sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE))

 str(tmp)
'data.frame':   3 obs. of  3 variables:
 $ var1: int  9 3 9
 $ var2: int  4 6 2
 $ var3: int  2 9 3

 #I can run the following double loop and yield what I want in the end (rr1) 
 as:

 library(statmod)
 Q - 2
 b - runif(3)
 qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
 rr1 - matrix(0, nrow = Q, ncol = nrow(tmp))
 L - nrow(tmp)
 for(j in 1:Q){
+ for(i in 1:L){
+ rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) /
(factorial(tmp[i,]) *
+ exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi)))  *
exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j]
+}
+}
Error: objeto 's' no encontrado
 rr1
 [,1] [,2] [,3]
[1,]000
[2,]000


Gabriela


2010/10/4, Doran, Harold hdo...@air.org:
 Suppose I have the following data:

 tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 =
 sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace =
 TRUE))

 I can run the following double loop and yield what I want in the end (rr1)
 as:

 library(statmod)
 Q - 2
 b - runif(3)
 qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
 rr1 - matrix(0, nrow = Q, ncol = nrow(tmp))
 L - nrow(tmp)
 for(j in 1:Q){
 for(i in 1:L){
 rr1[j,i] -
 exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) *
 exp(exp(qq$nodes[j]-b)) *

 ((1/(s*sqrt(2*pi)))  * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j])
 * qq$weights[j]
 }
 }
 rr1

 But, I think this is so inefficient for large Q and nrow(tmp). The function
 I am looping over is:

 fn - function(x, nodes, weights, s, b) {
 exp(sum(log((exp(x*(nodes-b))) / (factorial(x) *
 exp(exp(nodes-b)) *
 ((1/(s*sqrt(2*pi)))  *
 exp(-((nodes-0)^2/(2*s^2/dnorm(nodes) * weights
 }

 I've tried using apply in a few ways, but I can't replicate rr1 from the
 double loop. I can go through each value of Q one step at a time and get
 matching values in the rr1 table, but this would still require a loop and
 that I think can be avoided.

 apply(tmp, 1, fn, nodes = qq$nodes[1], weights = qq$weights[1], s = 1, b =
 b)

 Does anyone see an efficient way to compute rr1 without a loop?

 Harold

 sessionInfo()
 R version 2.10.1 (2009-12-14)
 i386-pc-mingw32

 locale:
 [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
 States.1252LC_MONETARY=English_United States.1252
 [4] LC_NUMERIC=C   LC_TIME=English_United
 States.1252

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 other attached packages:
 [1] minqa_1.1.9 Rcpp_0.8.6  MiscPsycho_1.6  statmod_1.4.6
 lattice_0.17-26 gdata_2.8.0

 loaded via a namespace (and not attached):
 [1] grid_2.10.1  gtools_2.6.2 tools_2.10.1

   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
_
Lic. María Gabriela Cendoya
Magíster en Biometría
Profesor Adjunto
Facultad de Ciencias Agrarias
UNMdP - Argentina

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help with apply

2010-10-04 Thread Doran, Harold

Sorry about that; s - 1

-Original Message-
From: Gabriela Cendoya [mailto:gabrielacendoya.rl...@gmail.com] 
Sent: Monday, October 04, 2010 1:44 PM
To: Doran, Harold
Cc: R-help
Subject: Re: [R] Help with apply

You are missing s in your definitions so I can't reproduce your code.

 tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = 
 sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE))

 str(tmp)
'data.frame':   3 obs. of  3 variables:
 $ var1: int  9 3 9
 $ var2: int  4 6 2
 $ var3: int  2 9 3

 #I can run the following double loop and yield what I want in the end (rr1) 
 as:

 library(statmod)
 Q - 2
 b - runif(3)
 qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
 rr1 - matrix(0, nrow = Q, ncol = nrow(tmp))
 L - nrow(tmp)
 for(j in 1:Q){
+ for(i in 1:L){
+ rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) /
(factorial(tmp[i,]) *
+ exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi)))  *
exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j]
+}
+}
Error: objeto 's' no encontrado
 rr1
 [,1] [,2] [,3]
[1,]000
[2,]000


Gabriela


2010/10/4, Doran, Harold hdo...@air.org:
 Suppose I have the following data:

 tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 =
 sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace =
 TRUE))

 I can run the following double loop and yield what I want in the end (rr1)
 as:

 library(statmod)
 Q - 2
 b - runif(3)
 qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
 rr1 - matrix(0, nrow = Q, ncol = nrow(tmp))
 L - nrow(tmp)
 for(j in 1:Q){
 for(i in 1:L){
 rr1[j,i] -
 exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) *
 exp(exp(qq$nodes[j]-b)) *

 ((1/(s*sqrt(2*pi)))  * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j])
 * qq$weights[j]
 }
 }
 rr1

 But, I think this is so inefficient for large Q and nrow(tmp). The function
 I am looping over is:

 fn - function(x, nodes, weights, s, b) {
 exp(sum(log((exp(x*(nodes-b))) / (factorial(x) *
 exp(exp(nodes-b)) *
 ((1/(s*sqrt(2*pi)))  *
 exp(-((nodes-0)^2/(2*s^2/dnorm(nodes) * weights
 }

 I've tried using apply in a few ways, but I can't replicate rr1 from the
 double loop. I can go through each value of Q one step at a time and get
 matching values in the rr1 table, but this would still require a loop and
 that I think can be avoided.

 apply(tmp, 1, fn, nodes = qq$nodes[1], weights = qq$weights[1], s = 1, b =
 b)

 Does anyone see an efficient way to compute rr1 without a loop?

 Harold

 sessionInfo()
 R version 2.10.1 (2009-12-14)
 i386-pc-mingw32

 locale:
 [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
 States.1252LC_MONETARY=English_United States.1252
 [4] LC_NUMERIC=C   LC_TIME=English_United
 States.1252

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 other attached packages:
 [1] minqa_1.1.9 Rcpp_0.8.6  MiscPsycho_1.6  statmod_1.4.6
 lattice_0.17-26 gdata_2.8.0

 loaded via a namespace (and not attached):
 [1] grid_2.10.1  gtools_2.6.2 tools_2.10.1

   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
_
Lic. María Gabriela Cendoya
Magíster en Biometría
Profesor Adjunto
Facultad de Ciencias Agrarias
UNMdP - Argentina

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help with apply

2010-10-04 Thread Joshua Wiley

On Mon, Oct 4, 2010 at 10:44 AM, Gabriela Cendoya
gabrielacendoya.rl...@gmail.com wrote:
 You are missing s in your definitions so I can't reproduce your code.

When he uses apply(), he sets s = 1.


 tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = 
 sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = 
 TRUE))

 str(tmp)
 'data.frame':   3 obs. of  3 variables:
  $ var1: int  9 3 9
  $ var2: int  4 6 2
  $ var3: int  2 9 3

 #I can run the following double loop and yield what I want in the end (rr1) 
 as:

 library(statmod)
 Q - 2
 b - runif(3)
 qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
 rr1 - matrix(0, nrow = Q, ncol = nrow(tmp))
 L - nrow(tmp)
 for(j in 1:Q){
 +     for(i in 1:L){
 +         rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) /
 (factorial(tmp[i,]) *
 +         exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi)))  *
 exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j]
 +                                                }
 +                                }
 Error: objeto 's' no encontrado
 rr1
     [,1] [,2] [,3]
 [1,]    0    0    0
 [2,]    0    0    0


 Gabriela


 2010/10/4, Doran, Harold hdo...@air.org:
 Suppose I have the following data:

 tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 =
 sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace =
 TRUE))

 I can run the following double loop and yield what I want in the end (rr1)
 as:

 library(statmod)
 Q - 2
 b - runif(3)
 qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
 rr1 - matrix(0, nrow = Q, ncol = nrow(tmp))
 L - nrow(tmp)
                 for(j in 1:Q){
                                                 for(i in 1:L){
                                                                 rr1[j,i] -
 exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) *
 exp(exp(qq$nodes[j]-b)) *

 ((1/(s*sqrt(2*pi)))  * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j])
 * qq$weights[j]
                                                 }
                                 }
 rr1

 But, I think this is so inefficient for large Q and nrow(tmp). The function
 I am looping over is:

 fn - function(x, nodes, weights, s, b) {
                 exp(sum(log((exp(x*(nodes-b))) / (factorial(x) *
 exp(exp(nodes-b)) *
                 ((1/(s*sqrt(2*pi)))  *
 exp(-((nodes-0)^2/(2*s^2/dnorm(nodes) * weights
                 }

 I've tried using apply in a few ways, but I can't replicate rr1 from the
 double loop. I can go through each value of Q one step at a time and get
 matching values in the rr1 table, but this would still require a loop and
 that I think can be avoided.

 apply(tmp, 1, fn, nodes = qq$nodes[1], weights = qq$weights[1], s = 1, b =
 b)

 Does anyone see an efficient way to compute rr1 without a loop?

 Harold

 sessionInfo()
 R version 2.10.1 (2009-12-14)
 i386-pc-mingw32

 locale:
 [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
 States.1252    LC_MONETARY=English_United States.1252
 [4] LC_NUMERIC=C                           LC_TIME=English_United
 States.1252

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 other attached packages:
 [1] minqa_1.1.9     Rcpp_0.8.6      MiscPsycho_1.6  statmod_1.4.6
 lattice_0.17-26 gdata_2.8.0

 loaded via a namespace (and not attached):
 [1] grid_2.10.1  gtools_2.6.2 tools_2.10.1

       [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 _
 Lic. María Gabriela Cendoya
 Magíster en Biometría
 Profesor Adjunto
 Facultad de Ciencias Agrarias
 UNMdP - Argentina

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] vectorizing problem

2010-10-04 Thread Dylan Miracle

Hello,

I have a two column dataframe that
has entries that look like this:

2315100   NR_024005,NR_024004,AK093685
2315106   DQ786314

and I want to change this to look like this:

2315100   NR_024005
2315100   NR_024004
2315100   AK093685
2315106   DQ786314

I can do this with the following for loop but the dataframe (GPL)
has ~140,000 rows and this takes about 15 minutes:


extGPL - matrix(nrow=0,ncol=2)
for (i in 1:length(GPL[,2])){
   aa - unlist(strsplit(as.character(GPL[i,2]),\\,))
   bb - rep(as.numeric(as.character(GPL[i,1])), length(aa))
   cc - matrix(c(bb,aa),ncol = 2)
   extGPL - rbind(extGPL,cc)
}

Is there a way to vectorize this?

Thanks,

Dylan Miracle
University of Minnesota
GCD Department

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] vectorizing problem

2010-10-04 Thread Henrique Dallazuanna

Try this:

do.call(rbind.data.frame, mapply(cbind, DF$V1, strsplit(as.character(DF$V2),
,)))

On Mon, Oct 4, 2010 at 2:54 PM, Dylan Miracle dylan.mira...@gmail.comwrote:

 Hello,

 I have a two column dataframe that
 has entries that look like this:

 2315100   NR_024005,NR_024004,AK093685
 2315106   DQ786314

 and I want to change this to look like this:

 2315100   NR_024005
 2315100   NR_024004
 2315100   AK093685
 2315106   DQ786314

 I can do this with the following for loop but the dataframe (GPL)
 has ~140,000 rows and this takes about 15 minutes:


 extGPL - matrix(nrow=0,ncol=2)
 for (i in 1:length(GPL[,2])){
   aa - unlist(strsplit(as.character(GPL[i,2]),\\,))
   bb - rep(as.numeric(as.character(GPL[i,1])), length(aa))
   cc - matrix(c(bb,aa),ncol = 2)
   extGPL - rbind(extGPL,cc)
 }

 Is there a way to vectorize this?

 Thanks,

 Dylan Miracle
 University of Minnesota
 GCD Department

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] vectorizing problem

2010-10-04 Thread Dimitris Rizopoulos


try this:

GPL - data.frame(x = c(2315100, 2315106),
y = c(NR_024005,NR_024004,AK093685, DQ786314))

sp - strsplit(as.character(GPL$y), ,)
ni - sapply(sp, length)
data.frame(x = rep(GPL$x, ni), y = unlist(sp))


I hope it helps.

Best,
Dimitris


On 10/4/2010 7:54 PM, Dylan Miracle wrote:

Hello,

I have a two column dataframe that
has entries that look like this:

2315100   NR_024005,NR_024004,AK093685
2315106   DQ786314

and I want to change this to look like this:

2315100   NR_024005
2315100   NR_024004
2315100   AK093685
2315106   DQ786314

I can do this with the following for loop but the dataframe (GPL)
has ~140,000 rows and this takes about 15 minutes:


extGPL- matrix(nrow=0,ncol=2)
for (i in 1:length(GPL[,2])){
aa- unlist(strsplit(as.character(GPL[i,2]),\\,))
bb- rep(as.numeric(as.character(GPL[i,1])), length(aa))
cc- matrix(c(bb,aa),ncol = 2)
extGPL- rbind(extGPL,cc)
}

Is there a way to vectorize this?

Thanks,

Dylan Miracle
University of Minnesota
GCD Department

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] vectorizing problem

2010-10-04 Thread David Winsemius



On Oct 4, 2010, at 2:02 PM, Henrique Dallazuanna wrote:


Try this:

do.call(rbind.data.frame, mapply(cbind, DF$V1,  
strsplit(as.character(DF$V2),

,)))



Also try:

 txt
   V1   V2
1 2315100 NR_024005,NR_024004,AK093685
2 2315106 DQ786314
 str(txt)
'data.frame':   2 obs. of  2 variables:
 $ V1: int  2315100 2315106
 $ V2: chr  NR_024005,NR_024004,AK093685 DQ786314
# Note that my V2 was input as chr, Henrique's appears to have been a  
factor column


 data.frame(idxs = rep(txt$V1, length(strsplit(txt$V2,  
split=,) ) ) ,

 vals = unlist(strsplit(txt$V2, split=,) )
  )
 idxs  vals
1 2315100 NR_024005
2 2315106 NR_024004
3 2315100  AK093685
4 2315106  DQ786314

--
David.

On Mon, Oct 4, 2010 at 2:54 PM, Dylan Miracle  
dylan.mira...@gmail.comwrote:



Hello,

I have a two column dataframe that
has entries that look like this:

2315100   NR_024005,NR_024004,AK093685
2315106   DQ786314

and I want to change this to look like this:

2315100   NR_024005
2315100   NR_024004
2315100   AK093685
2315106   DQ786314

I can do this with the following for loop but the dataframe (GPL)
has ~140,000 rows and this takes about 15 minutes:


extGPL - matrix(nrow=0,ncol=2)
for (i in 1:length(GPL[,2])){
 aa - unlist(strsplit(as.character(GPL[i,2]),\\,))
 bb - rep(as.numeric(as.character(GPL[i,1])), length(aa))
 cc - matrix(c(bb,aa),ncol = 2)
 extGPL - rbind(extGPL,cc)
}

Is there a way to vectorize this?

Thanks,

Dylan Miracle
University of Minnesota
GCD Department

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] glmer or not - glmer model specification

2010-10-04 Thread Atle Torvik Kristiansen

Hello,

I'm having some trouble figuring out the correct model specification for
my data. The system consists of multiple populations of an organism,
which have been genetically sampled for several years. The problem is
this: A minority of individuals are found in more than one sample,
either they have survived into the next sampling at the same location,
or have migrated to another another location and survived into the next
sampling there. This pseudoreplication constitutes about 12% of all
observations.

I have information on the number of heterozygous locus per individual,
its location, year, population size  and number of immigrants at that
location, and want to investigate the effect of population size and
migration on heterozygosity through time. Heterozygosity, the response
variable is a proportion, and I would like to account for the
pseudoreplication. Their heterozygosity score will be the same for each
sampling, and is constant. I have been trying with a mixed models
approach, more specifically glmer (lme4 package) with a binomial
distribution family and the response variable as an odds, eg.

glmer(heterozygosity~1+population size+number of immigrants+year+(1|
individual id)+(0+year|individual id)+(1+year|location),family=binomial)

To my very basic understanding, this poses the following question: Is
there a an effect of population size, number of immigrants and year,
while taking into account a non-correlated random effect of time and
individual, and a correlated random effect of year and location?
   
Does anyone have any advice as to wether a mixed model is the best
approach, and if so, what model specification to use?

-- 
-

Vennlig hilsen/Kind regards,

Atle Torvik Kristiansen

M.Sc. student

Institutt for biologi (DU2-100) 
Realfagbygget
NTNU
7491 Trondheim
Norway

Email: atletor...@gmail.com
   atlet...@stud.ntnu.no

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help with apply

2010-10-04 Thread Phil Spector


Harold -
   The first way that comes to mind is

doit = function(i,j)exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) /
   (factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)) *
   ((1/(s*sqrt(2*pi)))  * exp(-((qq$nodes[j]-0)^2/(2*s^2/
   dnorm(qq$nodes[j]) * qq$weights[j]
t(outer(1:3,1:2,Vectorize(doit)))

but you said you wanted to use apply.  That leads to

doit1 = function(tmpi,nod,wt)
  exp(sum(log((exp(tmpi*(nod-b))) / (factorial(tmpi) *
  exp(exp(nod-b)) * ((1/(s*sqrt(2*pi)))  *
  exp(-((nod-0)^2/(2*s^2/dnorm(nod) * qq$weights[j]
apply(tmp,1,function(x)mapply(function(n,w)doit1(x,n,w),qq$nodes,qq$weights))

Both seem to agree with your rr1.

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu

On Mon, 4 Oct 2010, Doran, Harold wrote:


Sorry about that; s - 1

-Original Message-
From: Gabriela Cendoya [mailto:gabrielacendoya.rl...@gmail.com]
Sent: Monday, October 04, 2010 1:44 PM
To: Doran, Harold
Cc: R-help
Subject: Re: [R] Help with apply

You are missing s in your definitions so I can't reproduce your code.


tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = 
sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE))

str(tmp)

'data.frame':   3 obs. of  3 variables:
$ var1: int  9 3 9
$ var2: int  4 6 2
$ var3: int  2 9 3


#I can run the following double loop and yield what I want in the end (rr1) as:

library(statmod)
Q - 2
b - runif(3)
qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
rr1 - matrix(0, nrow = Q, ncol = nrow(tmp))
L - nrow(tmp)
for(j in 1:Q){

+ for(i in 1:L){
+ rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) /
(factorial(tmp[i,]) *
+ exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi)))  *
exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j]
+}
+}
Error: objeto 's' no encontrado

rr1

[,1] [,2] [,3]
[1,]000
[2,]000




Gabriela


2010/10/4, Doran, Harold hdo...@air.org:

Suppose I have the following data:

tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 =
sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace =
TRUE))

I can run the following double loop and yield what I want in the end (rr1)
as:

library(statmod)
Q - 2
b - runif(3)
qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
rr1 - matrix(0, nrow = Q, ncol = nrow(tmp))
L - nrow(tmp)
for(j in 1:Q){
for(i in 1:L){
rr1[j,i] -
exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) *
exp(exp(qq$nodes[j]-b)) *

((1/(s*sqrt(2*pi)))  * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j])
* qq$weights[j]
}
}
rr1

But, I think this is so inefficient for large Q and nrow(tmp). The function
I am looping over is:

fn - function(x, nodes, weights, s, b) {
exp(sum(log((exp(x*(nodes-b))) / (factorial(x) *
exp(exp(nodes-b)) *
((1/(s*sqrt(2*pi)))  *
exp(-((nodes-0)^2/(2*s^2/dnorm(nodes) * weights
}

I've tried using apply in a few ways, but I can't replicate rr1 from the
double loop. I can go through each value of Q one step at a time and get
matching values in the rr1 table, but this would still require a loop and
that I think can be avoided.

apply(tmp, 1, fn, nodes = qq$nodes[1], weights = qq$weights[1], s = 1, b =
b)

Does anyone see an efficient way to compute rr1 without a loop?

Harold


sessionInfo()

R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
States.1252LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C   LC_TIME=English_United
States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] minqa_1.1.9 Rcpp_0.8.6  MiscPsycho_1.6  statmod_1.4.6
lattice_0.17-26 gdata_2.8.0

loaded via a namespace (and not attached):
[1] grid_2.10.1  gtools_2.6.2 tools_2.10.1

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
_
Lic. María Gabriela Cendoya
Magíster en Biometría
Profesor Adjunto
Facultad de Ciencias Agrarias
UNMdP - Argentina

Re: [R] Help with apply

2010-10-04 Thread Doran, Harold

A, I see that you wrapped apply around mapply. I was toying with both; but 
didn't think to use mapply inside apply. As always, thank you, Phil

-Original Message-
From: Phil Spector [mailto:spec...@stat.berkeley.edu] 
Sent: Monday, October 04, 2010 2:20 PM
To: Doran, Harold
Cc: Gabriela Cendoya; R-help
Subject: Re: [R] Help with apply

Harold -
The first way that comes to mind is

doit = function(i,j)exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) /
(factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)) *
((1/(s*sqrt(2*pi)))  * exp(-((qq$nodes[j]-0)^2/(2*s^2/
dnorm(qq$nodes[j]) * qq$weights[j]
t(outer(1:3,1:2,Vectorize(doit)))

but you said you wanted to use apply.  That leads to

doit1 = function(tmpi,nod,wt)
   exp(sum(log((exp(tmpi*(nod-b))) / (factorial(tmpi) *
   exp(exp(nod-b)) * ((1/(s*sqrt(2*pi)))  *
   exp(-((nod-0)^2/(2*s^2/dnorm(nod) * qq$weights[j]
apply(tmp,1,function(x)mapply(function(n,w)doit1(x,n,w),qq$nodes,qq$weights))

Both seem to agree with your rr1.

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu

On Mon, 4 Oct 2010, Doran, Harold wrote:

 Sorry about that; s - 1

 -Original Message-
 From: Gabriela Cendoya [mailto:gabrielacendoya.rl...@gmail.com]
 Sent: Monday, October 04, 2010 1:44 PM
 To: Doran, Harold
 Cc: R-help
 Subject: Re: [R] Help with apply

 You are missing s in your definitions so I can't reproduce your code.

 tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = 
 sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = 
 TRUE))

 str(tmp)
 'data.frame':   3 obs. of  3 variables:
 $ var1: int  9 3 9
 $ var2: int  4 6 2
 $ var3: int  2 9 3

 #I can run the following double loop and yield what I want in the end (rr1) 
 as:

 library(statmod)
 Q - 2
 b - runif(3)
 qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
 rr1 - matrix(0, nrow = Q, ncol = nrow(tmp))
 L - nrow(tmp)
 for(j in 1:Q){
 + for(i in 1:L){
 + rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) /
 (factorial(tmp[i,]) *
 + exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi)))  *
 exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j]
 +}
 +}
 Error: objeto 's' no encontrado
 rr1
 [,1] [,2] [,3]
 [1,]000
 [2,]000


 Gabriela


 2010/10/4, Doran, Harold hdo...@air.org:
 Suppose I have the following data:

 tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 =
 sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace =
 TRUE))

 I can run the following double loop and yield what I want in the end (rr1)
 as:

 library(statmod)
 Q - 2
 b - runif(3)
 qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
 rr1 - matrix(0, nrow = Q, ncol = nrow(tmp))
 L - nrow(tmp)
 for(j in 1:Q){
 for(i in 1:L){
 rr1[j,i] -
 exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) *
 exp(exp(qq$nodes[j]-b)) *

 ((1/(s*sqrt(2*pi)))  * exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j])
 * qq$weights[j]
 }
 }
 rr1

 But, I think this is so inefficient for large Q and nrow(tmp). The function
 I am looping over is:

 fn - function(x, nodes, weights, s, b) {
 exp(sum(log((exp(x*(nodes-b))) / (factorial(x) *
 exp(exp(nodes-b)) *
 ((1/(s*sqrt(2*pi)))  *
 exp(-((nodes-0)^2/(2*s^2/dnorm(nodes) * weights
 }

 I've tried using apply in a few ways, but I can't replicate rr1 from the
 double loop. I can go through each value of Q one step at a time and get
 matching values in the rr1 table, but this would still require a loop and
 that I think can be avoided.

 apply(tmp, 1, fn, nodes = qq$nodes[1], weights = qq$weights[1], s = 1, b =
 b)

 Does anyone see an efficient way to compute rr1 without a loop?

 Harold

 sessionInfo()
 R version 2.10.1 (2009-12-14)
 i386-pc-mingw32

 locale:
 [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
 States.1252LC_MONETARY=English_United States.1252
 [4] LC_NUMERIC=C   LC_TIME=English_United
 States.1252

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 other attached packages:
 [1] minqa_1.1.9 Rcpp_0.8.6  MiscPsycho_1.6  statmod_1.4.6
 lattice_0.17-26 gdata_2.8.0

 loaded via a namespace (and not attached):
 [1] grid_2.10.1  gtools_2.6.2 tools_2.10.1

  [[alternative HTML version deleted]]

[R] Null values from DBI connection

2010-10-04 Thread Albert Vernon Smith

I am connecting to an Oracle database with RJDBC, and creating a data frame 
with dbGetQuery.  When I get values return, values which are NULL in the 
database are being returned as 0 from my query.  How can I have them returned 
as NA?

I am query like the following.

--
require(RJDBC)

drv - 
JDBC(oracle.jdbc.driver.OracleDriver,/usr/local/ojdbc/ojdbc14.jar,')
conn - dbConnect(drv, dbc:oracle:thin:@ADDRESS,user,pass)
results - dbGetQuery(select rownum, col1 from table where col1 is null)
--

 results

   ROWNUM COL1
1   10
2   20
3   30
4   40
5   50

I'd rather see:

 results

   ROWNUM COL1
1   1NA
2   2NA
3   3NA
4   4NA
5   5NA

--

-albert
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Roxygen not truncating files

2010-10-04 Thread Ken Williams

I'm trying to use Roxygen for the first time, but I'm running into trouble.
I call it like so:

  roxygenize('~/p4/r-packages/IREval', '~/p4/r-packages/IREval',
 unlink.target=T)

But it seems that it's only appending to files in
~/p4/r-packages/IREval/man/, not overwriting them.  So for example, for a
function that I haven't documented yet I get a long file with just the
following 4 lines over and over, as many times as I've called roxygenize():

% head -n12 man/adddots.pr.Rd
\name{adddots.pr}
\alias{adddots.pr}
\title{adddots.pr}
\usage{adddots.pr(pr)}
\name{adddots.pr}
\alias{adddots.pr}
\title{adddots.pr}
\usage{adddots.pr(pr)}
\name{adddots.pr}
\alias{adddots.pr}
\title{adddots.pr}
\usage{adddots.pr(pr)}
...


I also get the same behavior for the DESCRIPTION file.

Is this a known gotcha that someone's found a workaround for, maybe?


-- 
Ken Williams
Sr. Research Scientist
Thomson Reuters
Phone: 651-848-7712
ken.willi...@thomsonreuters.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] input matrix for leaps algorithm

2010-10-04 Thread Greg Snow

The regsubsets function in the leaps package can work with fewer points than 
variables.  Though the meaning will be questionable.  The lasso (lasso2 or lars 
packages) may be more informative for your situation.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of CZ
 Sent: Monday, October 04, 2010 8:43 AM
 To: r-help@r-project.org
 Subject: [R] input matrix for leaps algorithm
 
 
 Hello,
 
 I was trying to use the leaps algorithm for my variable selection.  I
 tried
 different matrices calculated from multiple linear regression, linear
 discriminant analysis, and the basic correlation matrix as inputs to
 the
 leaps algorithm.  But I always got the error message - the matrix is
 not
 positive definite.
 
 I have more variables than the number of observations/samples.  I know
 this
 may be a problem.  But the multiple linear regression in SAS seems to
 be
 able to handle the case when there are more variables than the number
 of
 observations.  I wonder if there is a way I can still do my variable
 selection instead of reducing the number of variables.  Or there may be
 something wrong with my function calls.
 
 Thank you.
 
 
 
 --
 View this message in context: http://r.789695.n4.nabble.com/input-
 matrix-for-leaps-algorithm-tp2954453p2954453.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Reading Strings in R

2010-10-04 Thread Greg Snow

Use scan whith what='' or read.table with colClasses='character'.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Lorenzo Isella
 Sent: Monday, October 04, 2010 5:05 AM
 To: r-help
 Subject: [R] Reading Strings in R
 
 Dear All,
 I know this must be a one-liner, but I am experiencing problems.
 I need to read lists of long integers stored on a text files, i.e.
 which
 reads like
 
 7359700484475972
 7359700484475972
 0
 7359700484475972
 0
 0
 0
 97189954101318722
 0
 0
 7811896690636354
 7359700490636354
 0
 0
 0
 7811896684475972
 7811896684475972
 7811896690636354
 8447597298304052
 7359700484475972
 0
 7359700498304052
 7359700498304052
 7359700498304052
 0
 7359700498304052
 7359700484475972
 7359700484475972
 
 Now, for me every integer is just a symbol, i.e. the ID of some object
 and I simply need to find out how many times and where it occurs in
 this
 file.
 Bottom line: I would like to read this file directly as a string to
 ensure I will not have any trouble of integer overflow or rounding
 error.
 I tried readLines, but I cannot get rid of \n.
 Any help is appreciated.
 Cheers
 
 Lorenzo
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Inserting a plot into another

2010-10-04 Thread Filoche


Hi Greg. 

Thank you for the time you took to look at my problem.

I'll try your solution until CRAN is updated.

With regards,
Phil
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Inserting-a-plot-into-another-tp2720936p2954769.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] vectorizing problem

2010-10-04 Thread Gabor Grothendieck

On Mon, Oct 4, 2010 at 1:54 PM, Dylan Miracle dylan.mira...@gmail.com wrote:
 Hello,

 I have a two column dataframe that
 has entries that look like this:

 2315100       NR_024005,NR_024004,AK093685
 2315106       DQ786314

 and I want to change this to look like this:

 2315100       NR_024005
 2315100       NR_024004
 2315100       AK093685
 2315106       DQ786314

 I can do this with the following for loop but the dataframe (GPL)
 has ~140,000 rows and this takes about 15 minutes:

Try this assuming that the columns of GPL are character.  You may need
to use as.character first if they are factor:

library(reshape2)
V2 - strsplit(GPL$V2, ,)
names(V2) - GPL$V1
melt(V2)

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] title and axis labels placement with layout

2010-10-04 Thread Daryl Morris


 Hello,

I'm writing a set of functions which we will use with a bunch of 
different datasets.  The functions create a set of graphics pages, with 
each page split into an upper panel with plots and a lower panel with 
tables.  The plots part might have one plot, or might have multiple.  
I'm using layout() to split the region into a top part for plots and a 
bottom part for 2 tables side-by-side.  For the tables below, I'm using 
textplot() from the gplots package.  I'm trying to get it so that the 
layout of the full page looks exactly the same regardless of the plot.


My working test case creates a boxplot with a few groups on one page, 
and 6 boxplots each with 3 groups on another page.  R automatically 
scales the cex parameter such that, for example, the points are 
smaller on the more complicated plots.


I can live with different point sizes.  But, I want the tables to have 
the same size below.  After much trial and error, I determined that I 
had to set cex OUTSIDE of textplot in order to get the sizes of the 
tables to be the same across pages.


But, what I can't get to work are 2 things:
(1) the ylabel placement (horizontally) for the top plot depends on the 
graphics in a way that I can't figure out.  (I've tried setting 
cex,lheight and many combinations of padj, line using both the plot 
function itself and title, mtext functions after the plot call).
(2) I can't figure out how to put a title on the tables such that they 
end up in repeatable position.  Again, I've tried all the various 
options I tried in (1) here again.


One solution to (1) would be if I could place the label using a 
measurement rather than a line number.  (analogous to using par(mai) 
instead of par(mar))

One solution to (2) would be that textplot include a main option.


thanks,
Daryl Morris
SCHARP, FHCRC, UW Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Generating weekdays only

2010-10-04 Thread Ron Michael

Dear all, can anyone please tell me how to generate a sequence of days 
continuously, however without considering weekends i.e. Saturday and Sunday? I 
am aware of following code:

 seq(as.Date(2010-01-01), as.Date(2010-02-01), by=1 day)
 [1] 2010-01-01 2010-01-02 2010-01-03 2010-01-04 2010-01-05 
2010-01-06 2010-01-07 2010-01-08
 [9] 2010-01-09 2010-01-10 2010-01-11 2010-01-12 2010-01-13 
2010-01-14 2010-01-15 2010-01-16
[17] 2010-01-17 2010-01-18 2010-01-19 2010-01-20 2010-01-21 
2010-01-22 2010-01-23 2010-01-24
[25] 2010-01-25 2010-01-26 2010-01-27 2010-01-28 2010-01-29 
2010-01-30 2010-01-31 2010-02-01

However this generates all days.
Thanks for your time.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Include externally generated pdf in output (without Sweave)

2010-10-04 Thread Greg Snow

Importing as pixel image then replotting will probably reduce the quality of 
the graph.  Another approach is to use the pdftk program (I have seen versions 
for unix and windows) to add/insert the original pdf page into your document.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Dieter Menne
 Sent: Sunday, October 03, 2010 7:39 AM
 To: r-help@r-project.org
 Subject: Re: [R] Include externally generated pdf in output (without
 Sweave)
 
 
 
 Tal Galili wrote:
 
  You could potentially read an image file (like, for example, tiff)
 using
  something like
  read.picture {SoPhy}
 
 
 Thanks to you and Baptiste Auguie. Looks like the best way would be to
 import the picture as a pixel graphics with read.picture. It might be
 possible to read ps with grImport (Murells/suggested by Baptiste), but
 I
 always found the postscript-detour messy.
 
 Dieter
 
 
 --
 View this message in context: http://r.789695.n4.nabble.com/Include-
 externally-generated-pdf-in-output-without-Sweave-
 tp2953057p2953182.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] title and axis labels placement with layout

2010-10-04 Thread Greg Snow

You might want to look at the addtable2plot function in the plotrix package as 
an alternative for your tables.

For placing titles/labels/etc. look at the grconvertX and grconvertY functions 
for ways to get coordinates based on the device converted to other coordinate 
systems.  You can use those with the text function (or others) to place a label 
at a fixed position on the device.

Another option is to include an empty plot area in the layout to place the 
titles/labels into.

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Daryl Morris
 Sent: Monday, October 04, 2010 12:34 PM
 To: r-help@r-project.org
 Subject: [R] title and axis labels placement with layout
 
   Hello,
 
 I'm writing a set of functions which we will use with a bunch of
 different datasets.  The functions create a set of graphics pages, with
 each page split into an upper panel with plots and a lower panel with
 tables.  The plots part might have one plot, or might have multiple.
 I'm using layout() to split the region into a top part for plots and a
 bottom part for 2 tables side-by-side.  For the tables below, I'm using
 textplot() from the gplots package.  I'm trying to get it so that the
 layout of the full page looks exactly the same regardless of the plot.
 
 My working test case creates a boxplot with a few groups on one page,
 and 6 boxplots each with 3 groups on another page.  R automatically
 scales the cex parameter such that, for example, the points are
 smaller on the more complicated plots.
 
 I can live with different point sizes.  But, I want the tables to have
 the same size below.  After much trial and error, I determined that I
 had to set cex OUTSIDE of textplot in order to get the sizes of the
 tables to be the same across pages.
 
 But, what I can't get to work are 2 things:
 (1) the ylabel placement (horizontally) for the top plot depends on the
 graphics in a way that I can't figure out.  (I've tried setting
 cex,lheight and many combinations of padj, line using both the plot
 function itself and title, mtext functions after the plot call).
 (2) I can't figure out how to put a title on the tables such that they
 end up in repeatable position.  Again, I've tried all the various
 options I tried in (1) here again.
 
 One solution to (1) would be if I could place the label using a
 measurement rather than a line number.  (analogous to using par(mai)
 instead of par(mar))
 One solution to (2) would be that textplot include a main option.
 
 
 thanks,
 Daryl Morris
 SCHARP, FHCRC, UW Biostatistics
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Generating weekdays only

2010-10-04 Thread David Winsemius



On Oct 4, 2010, at 2:07 PM, Ron Michael wrote:

Dear all, can anyone please tell me how to generate a sequence of  
days continuously, however without considering weekends i.e.  
Saturday and Sunday? I am aware of following code:



seq(as.Date(2010-01-01), as.Date(2010-02-01), by=1 day)
 [1] 2010-01-01 2010-01-02 2010-01-03 2010-01-04  
2010-01-05 2010-01-06 2010-01-07 2010-01-08
 [9] 2010-01-09 2010-01-10 2010-01-11 2010-01-12  
2010-01-13 2010-01-14 2010-01-15 2010-01-16
[17] 2010-01-17 2010-01-18 2010-01-19 2010-01-20  
2010-01-21 2010-01-22 2010-01-23 2010-01-24
[25] 2010-01-25 2010-01-26 2010-01-27 2010-01-28  
2010-01-29 2010-01-30 2010-01-31 2010-02-01




??weekdays
?weekdays  # generates a character vector that can be tested against  
non Sat/Sun values


(as.Date(2010-01-01)+0:31)[ !weekdays((as.Date(2010-01-01)+0:31))  
%in% c(Saturday , Sunday) ]



However this generates all days.
Thanks for your time.



David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Generating weekdays only

2010-10-04 Thread Greg Snow

Here is one way:

 mydays - seq(as.Date(2010-01-01), as.Date(2010-02-01), by=1 day)
 myweekdays - mydays[ ! weekdays(mydays) %in% c('Saturday','Sunday') ]
 myweekdays
 [1] 2010-01-01 2010-01-04 2010-01-05 2010-01-06 2010-01-07
 [6] 2010-01-08 2010-01-11 2010-01-12 2010-01-13 2010-01-14
[11] 2010-01-15 2010-01-18 2010-01-19 2010-01-20 2010-01-21
[16] 2010-01-22 2010-01-25 2010-01-26 2010-01-27 2010-01-28
[21] 2010-01-29 2010-02-01

This generates the sequence of days, then throws out the weekends (you could 
also list the weekdays instead of the weekend days and drop the !, but I am 
lazy).

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Ron Michael
 Sent: Monday, October 04, 2010 12:08 PM
 To: r-help@r-project.org
 Subject: [R] Generating weekdays only
 
 Dear all, can anyone please tell me how to generate a sequence of days
 continuously, however without considering weekends i.e. Saturday and
 Sunday? I am aware of following code:
 
  seq(as.Date(2010-01-01), as.Date(2010-02-01), by=1 day)
  [1] 2010-01-01 2010-01-02 2010-01-03 2010-01-04 2010-01-05
 2010-01-06 2010-01-07 2010-01-08
  [9] 2010-01-09 2010-01-10 2010-01-11 2010-01-12 2010-01-13
 2010-01-14 2010-01-15 2010-01-16
 [17] 2010-01-17 2010-01-18 2010-01-19 2010-01-20 2010-01-21
 2010-01-22 2010-01-23 2010-01-24
 [25] 2010-01-25 2010-01-26 2010-01-27 2010-01-28 2010-01-29
 2010-01-30 2010-01-31 2010-02-01
 
 However this generates all days.
 Thanks for your time.
 
 
 
   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help with apply

2010-10-04 Thread Joshua Wiley

Hi,

It seemed to me like the calculations were implemented in a bit of a
redundant way, so I tried to simplify it algebraically.  doit1 is
Phil's version of your original calculations, and doit2 is my
simplified way.

doit1 - function(i, j) {
  exp(sum(log((exp(tmp[i,] * (qq$nodes[j]-b)))/
  (factorial(tmp[i,])*exp(exp(qq$nodes[j]-b))*
  ((1/(s*sqrt(2*pi)))*exp(-((qq$nodes[j]-0)^2/(2*s^2/
  dnorm(qq$nodes[j])*qq$weights[j]
}

doit2 - function(i, j) {
  (prod(exp((tmp[i,]*(qq$nodes[j]-b))-exp(qq$nodes[j]-b))/
factorial(tmp[i,]))*qq$weights[j])/
  (s*sqrt(2*pi)*dnorm(qq$nodes[j])*exp((qq$nodes[j]-0)^2/(2*s^2)))
}

system.time(t(outer(1:300,1:2,Vectorize(doit1
system.time(t(outer(1:300,1:2,Vectorize(doit2


New vs. Old
Characters (for the calculations):
154 vs. 177
Function calls for calculations (i.e., excluding subseting/dnorm, etc.):
22 vs. 27

Timing difference:
Not appreciable on my system, but I spent so long wading through
exactly what was happening, I figured why not share it.  Moving tmp up
to 300 rows, I got:

 system.time(t(outer(1:300,1:2,Vectorize(doit1
   user  system elapsed
  9.044   0.008  10.135
 system.time(t(outer(1:300,1:2,Vectorize(doit2
   user  system elapsed
  8.413   0.008   8.893

At this rate, the difference between a few million rows might save you
enough time you can stretch your hands once, and those 20 characters
should really free up space on your hard disksigh.

Josh


On Mon, Oct 4, 2010 at 11:22 AM, Doran, Harold hdo...@air.org wrote:
 A, I see that you wrapped apply around mapply. I was toying with both; 
 but didn't think to use mapply inside apply. As always, thank you, Phil

 -Original Message-
 From: Phil Spector [mailto:spec...@stat.berkeley.edu]
 Sent: Monday, October 04, 2010 2:20 PM
 To: Doran, Harold
 Cc: Gabriela Cendoya; R-help
 Subject: Re: [R] Help with apply

 Harold -
    The first way that comes to mind is

 doit = function(i,j)exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) /
        (factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)) *
        ((1/(s*sqrt(2*pi)))  * exp(-((qq$nodes[j]-0)^2/(2*s^2/
        dnorm(qq$nodes[j]) * qq$weights[j]
 t(outer(1:3,1:2,Vectorize(doit)))

 but you said you wanted to use apply.  That leads to

 doit1 = function(tmpi,nod,wt)
       exp(sum(log((exp(tmpi*(nod-b))) / (factorial(tmpi) *
       exp(exp(nod-b)) * ((1/(s*sqrt(2*pi)))  *
       exp(-((nod-0)^2/(2*s^2/dnorm(nod) * qq$weights[j]
 apply(tmp,1,function(x)mapply(function(n,w)doit1(x,n,w),qq$nodes,qq$weights))

 Both seem to agree with your rr1.

                                        - Phil Spector
                                         Statistical Computing Facility
                                         Department of Statistics
                                         UC Berkeley
                                         spec...@stat.berkeley.edu

 On Mon, 4 Oct 2010, Doran, Harold wrote:

 Sorry about that; s - 1

 -Original Message-
 From: Gabriela Cendoya [mailto:gabrielacendoya.rl...@gmail.com]
 Sent: Monday, October 04, 2010 1:44 PM
 To: Doran, Harold
 Cc: R-help
 Subject: Re: [R] Help with apply

 You are missing s in your definitions so I can't reproduce your code.

 tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = 
 sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = 
 TRUE))

 str(tmp)
 'data.frame':   3 obs. of  3 variables:
 $ var1: int  9 3 9
 $ var2: int  4 6 2
 $ var3: int  2 9 3

 #I can run the following double loop and yield what I want in the end (rr1) 
 as:

 library(statmod)
 Q - 2
 b - runif(3)
 qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
 rr1 - matrix(0, nrow = Q, ncol = nrow(tmp))
 L - nrow(tmp)
 for(j in 1:Q){
 +     for(i in 1:L){
 +         rr1[j,i] - exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) /
 (factorial(tmp[i,]) *
 +         exp(exp(qq$nodes[j]-b)) * ((1/(s*sqrt(2*pi)))  *
 exp(-((qq$nodes[j]-0)^2/(2*s^2/dnorm(qq$nodes[j]) * qq$weights[j]
 +                                                }
 +                                }
 Error: objeto 's' no encontrado
 rr1
     [,1] [,2] [,3]
 [1,]    0    0    0
 [2,]    0    0    0


 Gabriela


 2010/10/4, Doran, Harold hdo...@air.org:
 Suppose I have the following data:

 tmp - data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 =
 sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace =
 TRUE))

 I can run the following double loop and yield what I want in the end (rr1)
 as:

 library(statmod)
 Q - 2
 b - runif(3)
 qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
 rr1 - matrix(0, nrow = Q, ncol = nrow(tmp))
 L - nrow(tmp)
                 for(j in 1:Q){
                                                 for(i in 1:L){
                                                                 rr1[j,i] -
 exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) *
 exp(exp(qq$nodes[j]-b)) *

 ((1/(s*sqrt(2*pi)))  *

Re: [R] Plot for Binomial GLM

2010-10-04 Thread Peter Ehlers


On 2010-10-04 8:21, klsk89 wrote:


Hi i would like to use some graphs or tables to explore the data and make
some sensible guesses of what  to expect to see in a glm model to assess if
toxin concentration and sex have a relationship with the kill rate of rats.
But i cant seem to work it out as i have two predictor
variables~help?Thanks.:)

Here's my data.


rat.toxic-read.table(file=Rats.csv,header=T,row.names=NULL,sep=,)
attach(rat.toxic)
names(rat.toxic)

[1] Dose  Sex   Dead  Alive

rat.toxic

Dose Sex Dead Alive
110   F119
210   M020
320   F416
420   M416
530   F911
630   M812
740   F   13 7
840   M   13 7
950   F   18 2
10   50   M   17 3
11   60   F   20 0
12   60   M   16 4
13   10   F317
14   10   M119
15   20   F218
16   20   M218
17   30   F   1010
18   30   M812
19   40   F   14 6
20   40   M   12 8
21   50   F   16 4
22   50   M   13 7
23   60   F   18 2
24   60   M   16 4

glm2-glm(deadalive~Dose*Sex,family=binomial,data=rat.toxic)

anova(glm2,test=Chi)

Analysis of Deviance Table

Model: binomial, link: logit

Response: deadalive

Terms added sequentially (first to last)


  Df Deviance Resid. Df Resid. Dev P(|Chi|)
NULL23225.455
Dose  1  202.36622 23.0902e-16 ***
Sex   14.32821 18.7620.0375 *
Dose:Sex  11.14920 17.6130.2838
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

summary(glm2)


Call:
glm(formula = deadalive ~ Dose * Sex, family = binomial, data = rat.toxic)

Deviance Residuals:
  Min1QMedian3Q   Max
-1.82241  -0.85632   0.06675   0.61981   1.47874

Coefficients:
 Estimate Std. Error z value Pr(|z|)
(Intercept) -3.479390.46167  -7.537 4.83e-14 ***
Dose 0.105970.01286   8.243  2e-16 ***
SexM 0.155010.63974   0.2420.809
Dose:SexM   -0.018210.01707  -1.0670.286
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

 Null deviance: 225.455  on 23  degrees of freedom
Residual deviance:  17.613  on 20  degrees of freedom
AIC: 91.115

Number of Fisher Scoring iterations: 4



This is pretty much the budworm example in MASS (the book).
I would produce a plot of Prob(dead) vs dose with separate
lines for M/F:

 dose - seq(10, 60, length=51)
 ypF - predict(glm2, data.frame(Dose=dose, Sex=F), type=response)
 ypM - predict(glm2, data.frame(Dose=dose, Sex=M), type=response)

 plot(c(10,60), c(0,1), type=n)
 lines(dose, ypF, col=2)
 lines(dose, ypM, col=4)

 text(locator(1), F, col=2)
 text(locator(1), M, col=4)

See recent posts by Ben Bolker for confidence bands.

  -Peter Ehlers

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Plot for Binomial GLM

2010-10-04 Thread Joshua Wiley

Hi,

Dennis was kind of enough to remind me that glm() can take a two
column matrix, which is probably what you did with deadalive.  He also
gave a rather elegant graphing solution using xyplot:

xyplot(Alive/20 ~ Dose, data = rat.toxic, groups = Sex, type = c('p', 'a'))

Josh

On Mon, Oct 4, 2010 at 8:23 AM, Joshua Wiley jwiley.ps...@gmail.com wrote:
 On Mon, Oct 4, 2010 at 7:21 AM, klsk89 karenkls...@yahoo.com wrote:

 Hi i would like to use some graphs or tables to explore the data and make
 some sensible guesses of what  to expect to see in a glm model to assess if
 toxin concentration and sex have a relationship with the kill rate of rats.
 But i cant seem to work it out as i have two predictor
 variables~help?Thanks.:)

 What about xtabs?   For instance:

 xtabs(deadalive ~ Dose + Sex, data = rat.toxic)

 Regarding graphs, take a look at faceting in ggplot2 (or lattice).
 You can get something close to the 3 way table but in graphical form
 that way.  I am not sure if this is completely up and running yet, but
 I know there has been work linking ggobi with R.  I have seen a few
 demonstrations that looked quite promising, and it may work well for
 you to visualize three variables at once (and interactively).  Here is
 the link:  http://www.ggobi.org/rggobi/


 Here's my data.

 rat.toxic-read.table(file=Rats.csv,header=T,row.names=NULL,sep=,)
 attach(rat.toxic)
      ^  why attach it?
 names(rat.toxic)
 [1] Dose  Sex   Dead  Alive
 rat.toxic
   Dose Sex Dead Alive
 1    10   F    1    19
 2    10   M    0    20
 3    20   F    4    16
 4    20   M    4    16
 5    30   F    9    11
 6    30   M    8    12
 7    40   F   13     7
 8    40   M   13     7
 9    50   F   18     2
 10   50   M   17     3
 11   60   F   20     0
 12   60   M   16     4
 13   10   F    3    17
 14   10   M    1    19
 15   20   F    2    18
 16   20   M    2    18
 17   30   F   10    10
 18   30   M    8    12
 19   40   F   14     6
 20   40   M   12     8
 21   50   F   16     4
 22   50   M   13     7
 23   60   F   18     2
 24   60   M   16     4

 Please tell me that after this, you converted the counts of dead and
 alive into a single variable that had a 0 or 1 if dead and the
 opposite as alive before you used it as the dependent variable in your
 logistic regression.

 glm2-glm(deadalive~Dose*Sex,family=binomial,data=rat.toxic)
 anova(glm2,test=Chi)
 Analysis of Deviance Table

 Model: binomial, link: logit

 Response: deadalive

 Terms added sequentially (first to last)


         Df Deviance Resid. Df Resid. Dev P(|Chi|)
 NULL                        23    225.455
 Dose      1  202.366        22     23.090    2e-16 ***
 Sex       1    4.328        21     18.762    0.0375 *
 Dose:Sex  1    1.149        20     17.613    0.2838
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 summary(glm2)

 Call:
 glm(formula = deadalive ~ Dose * Sex, family = binomial, data = rat.toxic)

 Deviance Residuals:
     Min        1Q    Median        3Q       Max
 -1.82241  -0.85632   0.06675   0.61981   1.47874

 Coefficients:
            Estimate Std. Error z value Pr(|z|)
 (Intercept) -3.47939    0.46167  -7.537 4.83e-14 ***
 Dose         0.10597    0.01286   8.243   2e-16 ***
 SexM         0.15501    0.63974   0.242    0.809
 Dose:SexM   -0.01821    0.01707  -1.067    0.286
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 (Dispersion parameter for binomial family taken to be 1)

    Null deviance: 225.455  on 23  degrees of freedom
 Residual deviance:  17.613  on 20  degrees of freedom
 AIC: 91.115

 Number of Fisher Scoring iterations: 4






 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Plot-for-Binomial-GLM-tp2954406p2954406.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Joshua Wiley
 Ph.D. Student, Health Psychology
 University of California, Los Angeles
 http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fisher exact test?

2010-10-04 Thread James Nead

Hi Thomas,

Thanks for the reply.

1. In the first pick, I draw 'A' genes from N, without replacement.
2. Similarly, in the second pick, I draw 'B' genes from N, without replacement 
(and 'C' genes from 'N' etc.)
3. Order does not matter - so the two cases you cited are equivalent.

I would like to find out the probability that 'n' of the genes are common to 
all 
three 'picks'.

many thanks!




From: Thomas Stewart tgstew...@gmail.com

Cc: r-help@r-project.org
Sent: Mon, October 4, 2010 10:46:37 AM
Subject: Re: [R] Fisher exact test?


To find the probability directly, you'll need to clearly state how you are 
sampling.  Are you sampling with replacement? (Draw then put back.)  Or are you 
sampling without replacement? (Draw.  Draw from the remaining.  Draw from the 
remaining.)

Also, of the N genes, do you know how many are of the particular set?
How many genes do you draw each try?

Does order matter to you?  In other words is (A=5, B=0, C=23) equivalent to 
(A=0,B=5,C=23)?

-tgs



Hi All,

My apologies if this is a totally newbie question.

I want to calculate the probability that a particular set of genes is picked
repeatedly for 3 samplings. For example, if from a total of 'N' genes, I pick
'A' number of genes in the first pick, 'B' number of genes in the second pick,
and 'C' number of genes in the third pick, I would want to calculate p(n) where
n = (A intersection B intersection C).

Would fisher exact test be the correct method to evaluate the probability? And
if so how can I frame the 3x3 (?) contingency table? I can find numerous
examples of 2x2 tables, but none for 3x3. Any guidance would be appreciated.

many thanks.



   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Script auto-detecting its own path

2010-10-04 Thread Hadley Wickham

 I'm not sure this will solve the issue because if I move the script, I would
 still have to go into the script and edit the /path/to/my/script.r, or do
 I misunderstand your workaround?
 I'm looking for something like:
 file.path.is.here(myscript.r)
 and which would return something like:
 [1] c:/user/Desktop/
 so that regardless of where the script is, as long as the accompanying
 scripts are in the same directory, they can be easily sourced with something
 like:
 dirX - file.path.is.here(MasterScript.r)
 source(paste(dirX, AuxillaryFile.r, sep=))

If you use relative paths like so:

# master.r
source(AuxillaryFile.r)

Then source(path/to/master.r, chdir = T) will work.  Mastering
working directories is a much better idea than coming up with your own
workarounds.

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] reducing distances between tickmarks

2010-10-04 Thread Dimitri Liakhovitski

Hello, everybody!
I have a code below that creates a data set and then a stacked bar
chart based on that data set.
No need to look at it - just notice please that my horizontal axis is
a date varible (x=my.data$date).
I have a question about the last 2 lines of this code:
grid(nx=NULL,ny=NULL,col = lightgray, lty = dotted,lwd = par(lwd))
axis(1, las = 2)

Could anyone tell me - what are the tick mark labels on my X axis? Why
are they not dates?
axTicks(side=1)

Is there any way to change them to actual dates as in my.data$date?
And also - is there a way to force the graph to display all tickmarks
on X axis - and not just 4 of them?

Thank you!
Dimitri

my.data-data.frame(date=c(20080301,20080401,20080501,20080601,20080701,20080801,20080901,20081001,20081101,20081201,20090101,20090201,20090301,20090401,20090501,20090601,20090701,20090801,20090901,20091001,20091101,20091201,20100101,20100201,20100301,20100402,20100503),
x=c(1.1, 1, 1.6, 1, 2, 1.5, 2.1, 1.3, 1.9, 1.1, 1, 1.6, 1, 2, 1.5,
2.1, 1.3, 1.9, 1.1, 1, 1.6, 1, 2, 1.5, 2.1, 1.3, 1.9),
y=c(-4,-3,-6,-5,-7,-5.2,-6,-4,-4.9,-4,-3,-6,-5,-7,-5.2,-6,-4,-4.9,-4,-3,-6,-5,-7,-5.2,-6,-4,-4.9),
z=c(-0.2,-0.3,-0.4,-0.1,-0.2,-0.05,-0.2,-0.15,-0.06,-0.2,-0.3,-0.4,-0.1,-0.2,-0.05,-0.2,-0.15,-0.06,-0.06,-0.2,-0.3,-0.4,-0.1,-0.2,-0.05,-0.2,-0.15),
a=c(10,13,15,15,16,17,15,16,14,10,13,15,15,16,17,15,16,14,10,13,15,15,16,17,15,16,14))
my.data$date-as.character(my.data$date)
my.data$date-as.Date(my.data$date,%Y%m%d)
(my.data)
str(my.data)

positives-which(colSums(my.data[2:ncol(my.data)])0) # which vars
have positive column sums?
negatives-which(colSums(my.data[2:ncol(my.data)])0) # which vars
have negative column sums?

y.max-1.1*max(rowSums(my.data[names(positives)])) # the max on the y
axis of the chart
y.min-1.1*min(rowSums(my.data[names(negatives)])) # the min on the y
axis of the chart
ylim - c(y.min, y.max)
order.positives-rev(rank(positives))
order.of.pos.vars-names(order.positives)
order.negatives-rev(rank(negatives))
order.of.neg.vars-names(order.negatives)
order-c(order.negatives,order.positives)
order.of.vars-names(order)   # the order of variables on the chart -
from the bottom up
### so, the bottom-most area should be for z, the second from the
bottom area- for y (above z)

all.colors-c('red','blue','green','orange','yellow','purple')
xx - c(my.data$date, rev(my.data$date))
bottom.y.coordinates-rowSums(my.data[names(negatives)])

plot(x=my.data$date, y=bottom.y.coordinates, ylim=ylim, col='white',
type='l', xaxt='n',
ylab='Title for Y', xlab='Date', main='Chart Title')

for(var in order.of.neg.vars){
top.line.coords-bottom.y.coordinates-my.data[[var]]
bottom.coords-c(bottom.y.coordinates,rev(top.line.coords))
polygon(xx,bottom.coords,col=all.colors[which(names(my.data) %in% var)])
bottom.y.coordinates-top.line.coords
}

for(var in order.of.pos.vars){
top.line.coords-bottom.y.coordinates+my.data[[var]]
bottom.coords-c(bottom.y.coordinates,rev(top.line.coords))
polygon(xx,bottom.coords,col=all.colors[which(names(my.data) %in% var)])
bottom.y.coordinates-top.line.coords
}

grid(nx=NULL,ny=NULL,col = lightgray, lty = dotted,lwd = par(lwd))
axis(1, las = 2)


-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Debye function

2010-10-04 Thread Christophe Dutang

Thank you for your time.

I'm looking for another package than the gsl package. But it does not seem to 
be implemented anywhere else.

Christophe

Le 4 oct. 2010 à 19:10, Bert Gunter a écrit :

 Well, not really ...
 
 Google on R package Debye . The 5th hit lists the gsl package with
 the debye() function.
 
 Moral: DO make use of standard professional search tools, also.
 
 -- Bert
 
 On Mon, Oct 4, 2010 at 9:13 AM, Spencer Graves
 spencer.gra...@structuremonitoring.com wrote:
 
 install.packages('sos') # if not already installed
 library(sos)
 Debye - ???Debye
 #found 4 matches
 Debye # Print method opens table in a web browser
 
 
  Conclusions:
 
 
1.  There are references to Debye in the CHNOSZ package, but it
 may not be what you want.
 
 
2.  If it's otherwise available in R, it's very well concealed.
 
 
  Spencer
 
 
 On 10/4/2010 8:56 AM, Christophe Dutang wrote:
 
 Dear list,
 
 Is there another package than gsl* where the Debye function (of first
 order)
 is implemented? Google only finds the gsl package.
 
 Thanks in advance
 
 Christophe
 
 * page 12 of reference manual of gsl.
 http://cran.r-project.org/web/packages/gsl/gsl.pdf
 
 
 --
 Spencer Graves, PE, PhD
 President and Chief Operating Officer
 Structure Inspection and Monitoring, Inc.
 751 Emerson Ct.
 San José, CA 95126
 ph:  408-655-4567
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 -- 
 Bert Gunter
 Genentech Nonclinical Biostatistics

--
Christophe Dutang
Ph.D. student at ISFA, Lyon, France
website: http://dutangc.free.fr

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] reducing distances between tickmarks

2010-10-04 Thread David Winsemius



On Oct 4, 2010, at 3:28 PM, Dimitri Liakhovitski wrote:


Hello, everybody!
I have a code below that creates a data set and then a stacked bar
chart based on that data set.
No need to look at it - just notice please that my horizontal axis is
a date varible (x=my.data$date).
I have a question about the last 2 lines of this code:
grid(nx=NULL,ny=NULL,col = lightgray, lty = dotted,lwd =  
par(lwd))

axis(1, las = 2)


But you only gave two arguments to axis... the side and las???


Could anyone tell me - what are the tick mark labels on my X axis? Why
are they not dates?


Because they are the number of days since the origin. Read the help  
page for Dates. If you want the text value for axis to be formatted,  
then consult the axis help page for the argument names and use the  
appropriate call to format:


?Dates
?axis  #   perhaps ... labels =format(my.data$date, %Y-%m-%d)

# but you will also need to decide where you want the tick marks

Perhaps:

 axis(1, labels =format(as.Date(axTicks(1), origin=1970-01-01),  
%Y-%m-%d), at=axTicks(1), col=red)




?format



axTicks(side=1)

Is there any way to change them to actual dates as in my.data$date?
And also - is there a way to force the graph to display all tickmarks
on X axis - and not just 4 of them?

Thank you!
Dimitri

my.data- 
data.frame(date=c(20080301,20080401,20080501,20080601,20080701,20080801,20080901,20081001,20081101,20081201,20090101,20090201,20090301,20090401,20090501,20090601,20090701,20090801,20090901,20091001,20091101,20091201,20100101,20100201,20100301,20100402,20100503),

x=c(1.1, 1, 1.6, 1, 2, 1.5, 2.1, 1.3, 1.9, 1.1, 1, 1.6, 1, 2, 1.5,
2.1, 1.3, 1.9, 1.1, 1, 1.6, 1, 2, 1.5, 2.1, 1.3, 1.9),
y 
= 
c 
(-4 
,-3 
,-6 
,-5 
,-7 
,-5.2 
,-6 
,-4 
,-4.9,-4,-3,-6,-5,-7,-5.2,-6,-4,-4.9,-4,-3,-6,-5,-7,-5.2,-6,-4,-4.9),
z 
= 
c 
(-0.2 
,-0.3 
,-0.4 
,-0.1 
,-0.2 
,-0.05 
,-0.2 
,-0.15 
,-0.06 
,-0.2 
,-0.3 
,-0.4 
,-0.1 
,-0.2 
,-0.05 
,-0.2,-0.15,-0.06,-0.06,-0.2,-0.3,-0.4,-0.1,-0.2,-0.05,-0.2,-0.15),
a 
= 
c 
(10,13,15,15,16,17,15,16,14,10,13,15,15,16,17,15,16,14,10,13,15,15,16,17,15,16,14 
))

my.data$date-as.character(my.data$date)
my.data$date-as.Date(my.data$date,%Y%m%d)
(my.data)
str(my.data)

positives-which(colSums(my.data[2:ncol(my.data)])0) # which vars
have positive column sums?
negatives-which(colSums(my.data[2:ncol(my.data)])0) # which vars
have negative column sums?

y.max-1.1*max(rowSums(my.data[names(positives)])) # the max on the y
axis of the chart
y.min-1.1*min(rowSums(my.data[names(negatives)])) # the min on the y
axis of the chart
ylim - c(y.min, y.max)
order.positives-rev(rank(positives))
order.of.pos.vars-names(order.positives)
order.negatives-rev(rank(negatives))
order.of.neg.vars-names(order.negatives)
order-c(order.negatives,order.positives)
order.of.vars-names(order)   # the order of variables on the chart -
from the bottom up
### so, the bottom-most area should be for z, the second from the
bottom area- for y (above z)

all.colors-c('red','blue','green','orange','yellow','purple')
xx - c(my.data$date, rev(my.data$date))
bottom.y.coordinates-rowSums(my.data[names(negatives)])

plot(x=my.data$date, y=bottom.y.coordinates, ylim=ylim, col='white',
type='l', xaxt='n',
ylab='Title for Y', xlab='Date', main='Chart Title')

for(var in order.of.neg.vars){
top.line.coords-bottom.y.coordinates-my.data[[var]]
bottom.coords-c(bottom.y.coordinates,rev(top.line.coords))
	polygon(xx,bottom.coords,col=all.colors[which(names(my.data) %in%  
var)])

bottom.y.coordinates-top.line.coords
}

for(var in order.of.pos.vars){
top.line.coords-bottom.y.coordinates+my.data[[var]]
bottom.coords-c(bottom.y.coordinates,rev(top.line.coords))
	polygon(xx,bottom.coords,col=all.colors[which(names(my.data) %in%  
var)])

bottom.y.coordinates-top.line.coords
}

grid(nx=NULL,ny=NULL,col = lightgray, lty = dotted,lwd =  
par(lwd))

axis(1, las = 2)


--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Reading data into

2010-10-04 Thread Federman, Douglas

I have data in the following form:

 

judge poster score poster score poster score

a1   89 2  79  392

b   3   45 4  65

 

and am trying to get it to the following:

 

Poster  Judge_A  Judge_B Judge_C

1  89

2  79

3  92   45

4 65

 

Any hints would be appreciated.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Ridge regression and mixed models

2010-10-04 Thread harez...@post.harvard.edu

Dear R users,
  An equivalence between linear mixed model formulation and penalized 
regression 
models (including the ridge regression and penalized regression splines) has 
proven to be very useful in many aspects. Examples include the use of the lme() 
function in the library(nlme) to fit smooth models including the estimation of 
a 
smoothing parameter using REML. My question concerns the use of the linear 
mixed 
model software to fit a ridge regression with the number of columns in the 
design matrix X (p) exceeding the number of observations (n). Has anybody in 
the 
R community implemented the LME-like approach with estimation of the variance 
components using REML to find the coefficient estimates (BLUEs) and predictors 
(BLUPs) in the ridge regression problem in the p  n  setting?

Sample code below summarizes my problem:

version$version.string
# [1] R version 2.11.1 (2010-05-31)

library(nlme)

# DATA generation:
dim - 200
n - 50
XX - matrix(rnorm(dim*n, 0, 0.1), ncol=dim, nrow=n)
beta - matrix(c(rep(1, 40), rep(2,20), rep(0,140)), ncol=1)
Y - XX %*% beta + rnorm(n)

# MODEL fit:
dummyId - factor(rep(1,n))
Z.block - list(dummyId=pdIdent(~-1+XX))
data.fr - data.frame(Y,XX)
fit - lme(Y~1,
data=data.fr, 
random=Z.block)

# ERROR:
Warning message:
In lme.formula(Y ~ 1, data = data.fr, random = Z.block) :
  Fewer observations than random effects in all level 1 groups
#

Thank you in advance,
Jarek  Harezlak


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Format of Output of Residuals

2010-10-04 Thread Michael Just

Thank you this was worked great. Thank you for the additional
information as well. Knowledge is power.

Cheers,
Michael

On Fri, Oct 1, 2010 at 5:34 AM, Dennis Murphy djmu...@gmail.com wrote:

 Hi:

 CW.lm - lm(weight ~ Diet, ChickWeight)
 resid.CW.lm - resid(CW.lm)
 as.data.frame(resid(CW.lm))[1:10, ]  # (nope)
  [1] -60.645455 -51.645455 -43.645455 -38.645455 -26.645455  -9.645455
  [7]   3.354545  22.354545  46.354545  68.354545
 # convert residuals to one column matrix, then convert to data frame:
 as.data.frame(matrix(resid(CW.lm), ncol = 1))
   V1
 1    -60.6454545
 2    -51.6454545
 3    -43.6454545
 4    -38.6454545
 5    -26.6454545
 ...

 HTH,
 Dennis

 On Thu, Sep 30, 2010 at 9:09 PM, Michael Just mgj...@gmail.com wrote:

 An excerpt from dataset ChickWeight:
     weight Time Chick Diet
 1   42    0 1    1
 2   51    2 1    1
 3   59    4 1    1

 I am interested in the residuals of the dataset.  Specifically in
 saving them to another format. I have been creating text files with
 sink.

 CW.lm - lm(weight ~ Diet, ChickWeight)
 resid.CW.lm - resid(CW.lm)

 But when I call:
 resid.CW.lm

 The data appears like this (excerpt) :

    1    2    3    4    5    6
  -60.6454545  -51.6454545  -43.6454545  -38.6454545  -26.6454545   -9.6454545

 How can I get the data to be formatted like below in an output / sink
 friendly way?

 1 -60.6454545
 2 -51.6454545
 3 -43.6454545
 4 -38.6454545
 5 -26.6454545
 6 -9.6454545

 Thank you kindly,
 Cheers,
 Mike

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] source package build/installation with subdirectory-lib

2010-10-04 Thread Murat Tasan

hi all - i have a source package i'm writing that i'd like to be able
to install with a special library that my R src files rely on.
to be more precise, i have a normal R package directory structure
(i.e. src/ R/ man/ DESCRIPTION NAMESPACE etc.).
i also have another directory here called depPkg, and it has it's own
configure file for the canonical './configure  make  make install'
installation.

so myPkg looks like so (only showing the relevant pieces):

+ myPkg
|--- DESCRIPTION
|--- NAMESPACE
|--- R/   (with R files in here)
|--- src/   (with my .c files in here)
|--- depPkg/   (with all of the package contents (e.g. src, docs,
configure script, etc)


if i switch into and build depPkg by hand it works just fine, and i
set a configure option to locate the .a library file it creates into
the depPkg/lib/ directory, like so:
bash$ ./configure --prefix=./lib --with-pic --disable-shared
this builds depPkglib.a and puts it in the lib directory under depPkg.

now when i compile *my* shared library in testing, it also works just
fine:
bash$ cd src
bash$ gcc -std=gnu99 -I/usr/share/R/include -I../depPkg/include -fpic -
g -O2 -c mycode.c -o mycode.o
bash$ gcc -std=gnu99 -shared -o mycode.so mycode.o -L/usr/lib64/R/lib -
lR -L../depPgk/lib -ldepPkg

when in R, i can now use dyn.load(mycode.so) and .Call(...) with my
C functions perfectly.

now i want to build the package so i can install it easily on my
colleagues' machines.
so i use the normal R build mechanism (R CMD build), but first create
a Makevars file that looks like so:


PKG_CPPFLAGS = -I../depPkg/include
PKG_LIBS = ../depPkg/lib

.PHONY: all mylibs

all: $(SHLIB)

$(SHLIB): mylibs

mylibs:
(cd ..; gunzip depPgk.tar.gz; tar -xf depPkg.tar; cd depPkg; ./
configure --prefix=./lib --with-pic --disable-shared; make; make
install)


finally, when i try R CMD INSTALL, i know it sees the Makevars file,
because the first build step shows the added include path (from the
PKG_CPPFLAGS variable), but it never seems to execute the commands for
the mylibs target, thus the compiling step of my code fails.

sorry for the long-winded description, but without it i think this
would have made no sense (and likely still doesn't).

i have also tried moving depPgk INTO the src directory and updating
the necessary ../ to ./ in the Makevars, but target mylibs never seems
to be built first.

anyone ever try something like this and have some pointers on how to
debug?

thanks much for any help,

-murat

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fisher exact test?

2010-10-04 Thread Peter Dalgaard

On 10/04/2010 09:09 PM, James Nead wrote:
 Hi Thomas,
 
 Thanks for the reply.
 
 1. In the first pick, I draw 'A' genes from N, without replacement.
 2. Similarly, in the second pick, I draw 'B' genes from N, without 
 replacement 
 (and 'C' genes from 'N' etc.)
 3. Order does not matter - so the two cases you cited are equivalent.
 
 I would like to find out the probability that 'n' of the genes are common to 
 all 
 three 'picks'.
 
 many thanks!

Hmm, a sort of cubic version of Fisher's test? Someone might well have
studied this extensively, but it wasn't me However, there are three
constraints on the 7 df 2x2x2 table, whereas the usual Fisher case puts
two constraints on 3 df, so it's not like one cell determines the rest.

With only 'A' and 'B' samples, it's a clear hypergeometric
(capture-recapture) situation: Just color the balls that you caught in
the first draw. If you draw a third time, coloring the recaptured balls,
you'd get a hypergeometric mixture of hypergeometric distributions,
which might be workable if the samples are not too large.

Alternatively, in the large-sample regime, you can probably work out the
mean and variance of 'n'...


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

1 2 >

1 - 100 of 139 matches

Mail list logo