Re: [R] how to combine presence only data sets to one presence/absence table

2011-04-20 Thread antu
What about the opposite of this? The thread has been very helpful for me, but
I also need the reverse conversion, i.e., to turn this:

spl_A   spl_B   spl_C 
spcs1   1   1   0 
spcs2   1   0   1 
spcs3   0   1   1

into this:

spl_A   spl_B   spl_C
spcs1   spcs1   spcs2
spcs2   spcs3   spcs3
spcs4   spcs5
spcs5
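
A minimal sketch of one way to make this reverse conversion, assuming the 0/1
table is a matrix m with the species as row names and the samples as columns:

m <- matrix(c(1,1,0, 1,0,1, 0,1,1), nrow = 3, byrow = TRUE,
            dimnames = list(c("spcs1","spcs2","spcs3"),
                            c("spl_A","spl_B","spl_C")))
## for each sample (column), keep the names of the species scored 1;
## apply() returns a matrix here because every sample holds the same number
## of species, and a list when the counts differ
apply(m, 2, function(col) rownames(m)[col == 1])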


Thank you

--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-combine-presence-only-data-sets-to-one-presence-absence-table-tp830140p3465121.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help needed!

2011-04-20 Thread Peter Ehlers

On 2011-04-20 14:33, Shuangyan Xiong wrote:

Hi everyone,

I have a question. I am reading the source code of the package "ssfcov". The
source code is as follows. I cannot find the source code of the function
"myss2d" anywhere in the package. Can anyone give me a hint on how to find it
in the package? Thanks a lot!


 >  ssfcov
function (time, x, subject, nbasis = 5, centered = FALSE, noDiag = TRUE)
{
    if (!centered) {
        fit <- smooth.spline(time, x)
        x <- x - fitted(fit)
    }
    gg <- NULL
    for (zz in unique(subject)) {
        if (sum(subject == zz) > 1) {
            tt <- time[subject == zz]
            xx <- x[subject == zz]
            g <- expand.grid(t1 = tt, t2 = tt)
            scov <- xx %*% t(xx)
            if (noDiag)
                scov <- scov + diag(rep(Inf, length(xx)))
            g$z <- matrix(scov, ncol = 1)
            gg <- rbind(gg, g[g$z < Inf, ])
        }
    }
    nobs <- nrow(gg)
    tt <- min(time) + (max(time) - min(time)) * (1:nbasis)/(nbasis + 1)
    g <- expand.grid(t1 = tt, t2 = tt)
    g$z <- 0
    gg <- rbind(gg, g)
    fit <- myss2d(z ~ t1 * t2, data = gg, id.basis = ((nobs + 1):(nobs + nbasis * nbasis)))
    class(fit) <- "rfcovObj"
    return(fit)
}



Maybe I'm just slow tonight, but where did you get that package?
I can't see it on CRAN or R-Forge.

You might also consider a more informative subject
line. "Help needed" is true for all questions (not
answers) on R-help. Just think, why is this list
called R-***help***?

Peter Ehlers




Best,
Shuangyan
  >



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] automatic font selection, please help

2011-04-20 Thread Jinsong Zhao

Hi there,

With the help of this list, I can set a specific CJK font for a character
string using the text() function. For example:


song <- CIDFont("SimSun", "GBK-EUC-H", "GBK", "")

postscriptFonts(song = song)

postscript("test.ps", height = 7, width =7, family = "Times", fonts = 
c("song"), horizontal = FALSE, onefile = FALSE, paper = "special")


plot(1:10, xlab = "")
mtext("\u4F60\u597D", family="song", font = 1, side = 1, line = 2)

text(1,1, "\u4F60\u597D", family="song", font = 1)
text(3,1, "\u4F60\u597D", family="song", font = 2)
text(5,1, "\u4F60\u597D", family="song", font = 3)
text(7,1, "\u4F60\u597D", family="song", font = 4)

dev.off()

Now I would like to change the xlab to "\u4F60\u597D (Hello)", so I use the
following code:


mtext("\u4F60\u597D (Hello)", family="song", font = 1, side=1, line=2)

Now the typeface of " (Hello)" is "song"; however, I would like it to be
"Times", which looks better than "song".


When using windows() as the graphics device, the \u4F60\u597D is displayed
correctly in "song" (the default font, though I don't know how to change it),
and " (Hello)" appears in "Times" with par(family = "serif").


The question is: how can R be set up for automatic font selection, i.e., to
use "Times" for Latin/ASCII characters and "song" for Chinese characters?


Any suggestions or comments will be really appreciated. Thanks in advance.

Regards,
Jinsong



> sessionInfo()
R version 2.11.1 (2010-05-31)
i386-pc-mingw32

locale:
[1] LC_COLLATE=Chinese_People's Republic of China.936
[2] LC_CTYPE=Chinese_People's Republic of China.936
[3] LC_MONETARY=Chinese_People's Republic of China.936
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese_People's Republic of China.936

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.11.1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Stymied by plyr

2011-04-20 Thread Stuart Luppescu
Hello, this is my first time trying to use plyr, and I'm getting nowhere. I
have teacher ratings (1 to 4) on 10 components, given by external and internal
observers, in schools within areas. I want to calculate the percentage of each
rating given on each component, by each type of observer, within each school,
within each area. The data look like this:

   unit area ext.obs rating comp
11    7   11       0      3    1
12    7   11       0      4    2
13    7   11       0      3    3
14    7   11       0      4    4
15    7   11       0      3    5
16    7   11       0      3    6
17    7   11       0      3    7
18    7   11       0      3    8
19    7   11       0      3    9
20    7   11       0      3   10

I thought this would be a perfect application for plyr. I tried this:

calc.pct <- function(x) {
  table(x)/sum(table(x))
}

pcts <- ddply(test.school, .(area, ext.obs, comp), calc.pct, x=rating)
Error in .fun(piece, ...) : unused argument(s) (piece)

Then I tried this:
 pcts <- ddply(test.school, .(area, ext.obs, comp), .(calc.pct(rating)))
Error in .fun(piece, ...) : attempt to apply non-function

I tried all kinds of other variations but with no success. Can someone
give me some pointers?
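
A minimal sketch of the usual fix (untested here, since the full data are not
shown): ddply() hands each piece to the function as a data frame, which is what
the "unused argument(s) (piece)" message is about, so the rating column has to
be pulled out inside the function rather than passed via x=.

library(plyr)
pcts <- ddply(test.school, .(area, ext.obs, comp),
              function(piece) as.data.frame(calc.pct(piece$rating)))
## as.data.frame() turns the table of proportions into two columns (the rating
## level and its proportion, Freq), which ddply() combines with the grouping columns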

Thanks. 
-- 
Stuart Luppescu -=- slu .at. ccsr.uchicago.edu 
University of Chicago -=- CCSR 
才文と智奈美の父 -=- Kernel 2.6.36-gentoo-r5 
Lars Strand: Will R run under Windows Pocket PC? Brian D. Ripley: We
don't know! There are no binary versions of R for that platform, but
perhaps you could find a suitable compiler and manage to build the
sources. Outside pure mathematics it is usually very hard to establish
that something cannot be done (and it can be very hard in pure
mathematics, too). -- Lars Strand and Brian D. Ripley R-help (November
2004)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] GLM output for deviance and loglikelihood

2011-04-20 Thread Juliet Hannah
As you mentioned, the deviance does not always reduce to:

D = -2(loglikelihood(model))

It does for ungrouped data, such as for binary logistic regression. So
let's stick with the original definition.
In this case, we need the log-likelihood for the saturated model.

x <- rnorm(10)
y <- rpois(10, lambda = exp(1 + 2*x))
test <- glm(y ~ x, family = poisson)

## saturated model: one free parameter per observation
sm <- glm(y ~ factor(1:10), family = poisson)

## deviance = 2 * (logLik(saturated model) - logLik(fitted model))
mydev <- as.numeric(2*(logLik(sm) - logLik(test)))
mydev
deviance(test)   # agrees with mydev


On Fri, Apr 15, 2011 at 7:00 AM, Jeffrey Pollock
 wrote:
> It has always been my understanding that deviance for GLMs is defined by
>
> D = -2(loglikelihood(model) - loglikelihood(saturated model))
>
> and this can be calculated by (or at least usually is)
>
> D = -2(loglikelihood(model))
>
> as is done in the code for 'polr' by Brian Ripley (in the package 'MASS'),
> where the -loglikelihood is minimised using optim:
>
> res <- optim(s0, fmin, gmin, method = "BFGS", hessian = Hess, ...)
> .
> .
> .
> deviance <- 2 * res$value
>
> If so, why is it that
>
>> x = rnorm(10)
>> y = rpois(10,lam=exp(1 + 2*x))
>> test = glm(formula = y ~ x, family = poisson)
>> deviance(test)
> [1] 5.483484
>> -2*logLik(test)
> [1] 36.86335
>
> I'm clearly not understanding something here, can anyone shed any light?
> Why is
>
> -2*logLik(test) =/= deviance(test) ???
>
> I think this is something that is poorly understood all over the
> internet (at least from my google searches anyway!)
>
> Thanks,
>
> Jeff

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with matching rows

2011-04-20 Thread gary engstrom
Dear Sir,

Please excuse my awkwardness, as I am new to R and computers, but I would
kindly appreciate some help.
{
a <- sample(1:10, 100, replace=T)
b <- sample(10:20, 100, replace=T)
c <- sample(20:30, 100, replace=T)
d <- sample(30:40, 100, replace=T)
e <- sample(40:50, 100, replace=T)
}
d1 <- a
d2 <- b
d3 <- c
d4 <- d
d5 <- e

data.frame(d1,d2,d3,d4,d5)
dd <- data.frame(d1,d2,d3,d4,d5)
dd
sd(d1)
summary(d1)
sd(d2)
summary(d2)
sd(d3)
summary(d3)
sd(d4)
summary(d4)
sd(d5)
summary(d5)
I am a beginner with R and am trying to learn statistical probability. I have
started Dr. Levine's and Dr. Kerns's books. So far, from the usual sources, I
haven't found the answers to the following questions and would greatly
appreciate any assistance that anyone might kindly share.
If I run this code, how do I look for duplicate rows, and how can I adjust the
SD of the sample function to make duplicate rows occur more often?
How do I export the dd data frame to Excel?
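
For reference, a minimal sketch of the usual tools for these questions (the
file name below is made up):

dd[duplicated(dd), ]       # rows that exactly repeat an earlier row
sum(duplicated(dd))        # how many such rows there are
## sample() draws uniformly from the given range, so there is no SD to tune;
## duplicate rows become more likely when the ranges are narrower,
## e.g. sample(1:3, 100, replace=TRUE)
write.csv(dd, "dd.csv", row.names = FALSE)   # a file Excel can open directly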
Deepest gratitude,
Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] override default arguments in nested function

2011-04-20 Thread Joshua Wiley
Hi All,

I think I already know the answer, but I am hoping I am missing something.
I am using the function omega from psych_1.0-96 in R version 2.13.0.  I would
like to override one of the default arguments of a function that is eventually
called (to fix convergence issues); the problem is that there is no argument at
the omega() level that makes it down to the function I want to override.  That is,

omega calls schmid calls oblimin calls GPFoblq

and I want to change an argument in GPFoblq.  Right now I am using
(the rather unsatisfactory):

trace(what = GPFoblq, tracer = expression(maxit <- 2000), at = 10,
print = FALSE)

to override the maximum number of iterations, which works but is a
hassle.  Anyone have other ideas/techniques?
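
Another session-level hack (a sketch only, assuming GPFoblq is the exported
GPArotation function and that the iteration limit is its maxit formal argument)
is to patch the default in the namespace instead of trace()ing it:

library(GPArotation)
GPFoblq.patched <- GPFoblq
formals(GPFoblq.patched)$maxit <- 2000                # raise the default iteration limit
assignInNamespace("GPFoblq", GPFoblq.patched, ns = "GPArotation")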

Thanks,

Josh


-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error running pvals.fnc in R version 2.13.0

2011-04-20 Thread Amy DiBattista
Dear R-help:

I've been trying to run pvals.fnc in the newest version of R (2.13.0). The
function lmer worked fine, but when I tried to use pvals.fnc on the lmer
object, I got the following error:

"Error in pvals.fnc(elogr.subj.dys.sum.3x3.p, nsim = 1) :
  trying to get slot "coefs" from an object (class "summaryDefault") that is
not an S4 object."

How can I fix this problem? Thanks in advance for your help.

-- 
Amy DiBattista
Ph.D. Candidate
Department of Psychology
Northeastern University
(617) 373-3077
dibattist...@husky.neu.edu

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] question on automated content analysis program

2011-04-20 Thread 엄기홍
 Hello.

I am trying to replicate the “Automated Content Analysis” program, which was
developed by Gary King.
As instructed in “readme.pdf”, I entered the command “undergrad.results <-
undergrad(sep = ",")” in R.

It gave me the following message:

“[1] "python C:/PROGRA~1/R/R-212~1.2/library/ReadMe/makerfile --control-file control.txt
--table-file tablefile.txt --threshold 0.01"
Error in undergrad() : Python module failed. Aborting undergrad.”

Would anyone tell me how to solve this problem? I have already installed
Python 2.7 and set the correct path in the system PATH variable.
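
A couple of quick diagnostics (a sketch only) to confirm that the R session
itself can find a python executable:

Sys.which("python")                              # non-empty path => R can see python
system('python -c "print 2+2"', intern = TRUE)   # should return "4" under Python 2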

Best,
Kihong

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] invoking python from R

2011-04-20 Thread Rolf Turner

On 21/04/11 10:43, 엄기홍 wrote:

Dear colleagues,

I was wondering whether anyone could give me some help with a Python issue in R.
I tried to invoke Python from R, but failed.
I have added the Python directory "C:\python27" to the system PATH.
In addition, I have confirmed that the Python interpreter itself works properly.

The code I used is a typical check, as follows:
x <- .Python("listdir", "/tmp", .module="posix")

The command gives me the following message:

x<- .Python("listdir", "/tmp", .module="posix")

Error: could not find function ".Python"
Any help would be greatly appreciated.


Clearly there is no such R function  as ``.Python''.  What makes
you think there is such a function?
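
If the goal is simply to run a bit of Python from R without an interface
package, a minimal sketch that only assumes python is on the PATH the R
session sees:

py <- tempfile()                                     # write a tiny script to a temp file
writeLines("import os; print(os.listdir(os.getcwd()))", py)
system(paste("python", shQuote(py)), intern = TRUE)  # run it and capture its output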

cheers,

Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grid.table + splom: how to nicely align panel entries

2011-04-20 Thread Marius Hofert
Dear Baptiste,

*fantastic*, thank you very much, *precisely* what I was looking for!

Cheers,

Marius

On 2011-04-21, at 01:31 , baptiste auguie wrote:

> On 21 April 2011 09:54, Marius Hofert  wrote:
>> Dear Baptiste,
>> 
>> great, many thanks!
>> One last thing: Do you know why the gpar(cex=0.1) argument is ignored?
>> 
> 
> Yes – the theme overrides it, you need to include it in the theme.list().
> 
> baptiste
> 
> 
>> Cheers,
>> 
>> Marius
>> 
>> library(lattice)
>> library(grid)
>> library(gridExtra)
>> 
>> ## function for correct digit alignment
>> align.digits <- function(l){
>>sp <- strsplit(as.character(l), "\\.")
>>chars <- sapply(sp, function(x) nchar(x)[1])
>>n <- max(chars)-chars
>>l0 <- sapply(n, function(x) paste(rep("0", x), collapse=""))
>>labels <- sapply(seq_along(sp), function(i){
>>point <- if(is.na(sp[[i]][2])) NULL else quote(.)
>>as.expression(bquote(phantom(.(l0[i])) * .(sp[[i]][1]) * .(point) * 
>> .(sp[[i]][2]) ))})
>> }
>> 
>> ## splom with customized lower.panel
>> ## x: data
>> ## arr: array of containing expressions which are plotted in a grid table in 
>> the
>> ##  lower panel (i,j)]
>> splom2 <- function(x, arr, nr){
>>## function for creating table
>>table.fun <- function(vec){ # vector containing lines for table for *one* 
>> panel
>>grid.table(matrix(vec, nrow=nr, byrow=TRUE),
>>   parse=TRUE, # parse labels as expressions
>>   gpar.coretext=gpar(cex=0.1), # text size
>>   theme=theme.list(
>>   gpar.corefill=gpar(fill=NA, col=NA), # make bg transparent
>>   core.just="left", padding.h=unit(0,"mm")) # justification 
>> of labels
>>   )
>>}
>>## splom
>>splom(x, varname.cex=1.2,
>>  superpanel=function(z, ...){
>>  panel.pairs(z, upper.panel=panel.splom, 
>> lower.panel=function(i,j){
>>  table.fun(arr[i,j,])
>>  }, ...)
>>  })
>> }
>> 
>> ## create data and array of expressions
>> d <- 4
>> x <- matrix(runif(d*1000), ncol=d) # data to be plotted with splom
>> nr <- 3 # number of rows for the panel entries
>> nc <- 3 # number of cols for the panel entries
>> arr <- array(list(rep(NA,nr*nc)), dim=c(d,d,nr*nc), 
>> dimnames=c("i","j","val")) # array containing the table entries per panel
>> f <- function(i,j) (i+j)*10 # dummy function
>> eq <- "phantom()==phantom()"
>> for(i in 1:d){
>>for(j in 1:d){
>>numbers <- align.digits(c(round(pi,4), round(pi, 6), f(i,j)))
>>arr[i,j,] <- c("alpha", eq, numbers[1],
>>   "italic(bbb)", eq, numbers[2],
>>   "gamma", eq, numbers[3])
>>}
>> }
>> 
>> ## plot
>> splom2(x, arr, nr=3)
>> 
>> 
>> On 2011-04-20, at 22:38 , baptiste auguie wrote:
>> 
>>> Try this,
>>> 
>>> align.digits = function(l)
>>> {
>>> 
>>> sp <- strsplit(as.character(l), "\\.")
>>> chars <- sapply(sp, function(x) nchar(x)[1])
>>> n = max(chars) - chars
>>> l0 = sapply(n, function(x) paste(rep("0", x), collapse=""))
>>> labels = sapply(seq_along(sp), function(i) {
>>>  point <- if(is.na(sp[[i]][2])) NULL else quote(.)
>>>  as.expression(bquote(phantom(.(l0[i])) * .(sp[[i]][1])*
>>> .(point)*.(sp[[i]][2]) ))})
>>> 
>>> return(labels)
>>> }
>>> 
>>> 
>>> library(gridExtra)
>>> 
>>> d <- align.digits(l = c(125.3, 1.2344, 12))
>>> grid.newpage()
>>> grid.table(d, parse=T, core.just="left", gpar.coretext=gpar(cex=0.5))
>>> 
>>> HTH,
>>> 
>>> baptiste
>>> 
>>> On 21 April 2011 03:07, Marius Hofert  wrote:
 Dear Baptiste,
 
 very nice, indeed!
 
 Two minor issues that remain, are:
 (1) I tried to omit the decimal dot for those numbers that do not have 
 digits
after the decimal dot. But somehow it does not work...
 (2) Do you know how one can decrease the text size for the text appearing 
 in the
lower panel? I tried to work with "cex=0.5"... but it was ignored all 
 the time.
 
 Cheers,
 
 Marius
 
 
 library(lattice)
 library(grid)
 library(gridExtra)
 
 ## function for correct digit alignment
 align.digits <- function(l){
sp <- strsplit(as.character(l), "\\.")
chars <- sapply(sp, function(x) nchar(x)[1])
n <- max(chars)-chars
l0 <- sapply(n, function(x) paste(rep("0", x), collapse=""))
sapply(seq_along(sp), function(i){
if(length(sp[[1]])==1){
as.expression(bquote(phantom(.(l0[i])) * .(sp[[i]][1])))
}else{
as.expression(bquote(phantom(.(l0[i])) * 
 .(sp[[i]][1])*.*.(sp[[i]][2])))
}
})
 }
 
 ## splom with customized lower.panel
 ## x: data
 ## arr: array of containing expressions which are plotted in a grid table 
 in the
 ##  lower panel (i,j)]
 splom2 <- function(x, arr, nr){
## function for creating table
table.fun <- 

Re: [R] random value generation in a row

2011-04-20 Thread David Winsemius


On Apr 20, 2011, at 7:30 PM, Rujealous wrote:


the plan is

i have to create a column with random numbers from 1 to 5, forming a vector
with 200 numbers.
these numbers should have the same proportion, like 40 "1", 40 "2", etc.


So you really want a random permutation of a fixed set of 200 numbers:

sample( rep(1:5,40), 200)

--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] random value generation in a row

2011-04-20 Thread Rujealous
The plan is:

I have to create a column of random numbers from 1 to 5, forming a vector with
200 numbers. These numbers should have the same proportions, i.e. 40 "1"s,
40 "2"s, etc.

thx,

pvm

--
View this message in context: 
http://r.789695.n4.nabble.com/random-value-generation-in-a-row-tp3464503p3464503.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sqldf INSERT INTO

2011-04-20 Thread Gabor Grothendieck
On Wed, Apr 20, 2011 at 12:39 PM, new2R  wrote:
>  Hi,
>
> I am new to R and am trying to migrate from SAS. I am trying to copy data from
> one table to another table with the same columns using sqldf, but it is not
> working and just shows "NULL".
>
> I wrote the statement as sqldf("INSERT INTO new select * from data") but it
> shows NULL.
>
> Please help me in this regard.
>

In your example new is a table in the sqlite database, not in R's
workspace, so you have to return it:

> library(sqldf)
> BOD
  Time demand
1    1    8.3
2    2   10.3
3    3   19.0
4    4   16.0
5    5   15.6
6    7   19.8
> New <- BOD[1, ]
> BOD1 <- BOD[2:3,]
> sqldf(c("insert into New select * from BOD1", "select * from New"))
  Time demand
1    1    8.3
2    2   10.3
3    3   19.0


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] taking rows from data.frames in list to form new data.frame?

2011-04-20 Thread jctoll
Dennis,

Thanks, the first example works perfectly.

> do.call(rbind, lapply(database, function(df) subset(df, Symbol == 'IBM')))

I haven't tried the second, but will look into plyr.
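
One small follow-up for the row-names part of the original question, a sketch
assuming each data.frame matches exactly one row:

res <- do.call(rbind, lapply(database, function(df) subset(df, Symbol == "IBM")))
rownames(res) <- names(database)   # carry the old data.frame names over as row names
res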

Thanks, again,

James



On Wed, Apr 20, 2011 at 6:36 PM, Dennis Murphy  wrote:
> Hi:
>
> Perhaps you're looking for subset()? I'm not sure I understand the
> problem completely, but is
>
> do.call(rbind, lapply(database, function(df) subset(df, Symbol == 'IBM')))
>
> or
>
> library(plyr)
> ldply(lapply(database, function(df) subset(df, Symbol == 'IBM')), rbind)
>
> in the vicinity of what you're looking for? [Obviously untested for
> the usual reasons...]
>
> HTH,
> Dennis
>
> On Wed, Apr 20, 2011 at 4:13 PM, jctoll  wrote:
>> Hi,
>>
>> I am having a problem figuring out how to extract a subset of rows.  I
>> have a list with 68 similar data.frames.  Each data.frame is 500 rows
>> by 5 columns.  I want to take one row from each data.frame based upon
>> the data in a particular column (i.e. it matches a symbol).  For
>> example:
>>
>>> str(database)
>> List of 68
>>  $ X2011.01.11:'data.frame':    500 obs. of  5 variables:
>>  ..$ Symbol    : chr [1:500] "MMM" "ACE" "AES" "AFL" ...
>>  ..$ Price     : num [1:500] 87.7 60.7 13.1 55.7 15.6 ...
>>  ..$ Shares.Out: num [1:500] 7.15e+08 3.39e+08 7.88e+08 4.71e+08 1.10e+08 ...
>>  ..$ Float     : num [1:500] 7.13e+08 3.38e+08 6.61e+08 4.60e+08 1.09e+08 ...
>>  ..$ Market.Cap: num [1:500] 6.27e+10 2.06e+10 1.04e+10 2.62e+10 1.72e+09 ...
>>  $ X2011.01.12:'data.frame':    500 obs. of  5 variables:
>>  ..$ Symbol    : chr [1:500] "MMM" "ACE" "AES" "AFL" ...
>>  ..$ Price     : num [1:500] 88.7 60.9 12.9 57.1 15.2 ...
>>  ..$ Shares.Out: num [1:500] 7.15e+08 3.39e+08 7.88e+08 4.71e+08 1.10e+08 ...
>>  ..$ Float     : num [1:500] 7.13e+08 3.38e+08 6.61e+08 4.60e+08 1.09e+08 ...
>>  ..$ Market.Cap: num [1:500] 6.34e+10 2.06e+10 1.02e+10 2.69e+10 1.68e+09 ...
>>  . . .
>>
>>> lapply(database, function(x) which(x == "IBM"))
>> $X2011.01.11
>> [1] 234
>>
>> $X2011.01.12
>> [1] 234
>>  . . .
>>
>>> lapply(database, function(x) x[which(x == "IBM"), ])
>> $X2011.01.11
>>    Symbol  Price Shares.Out    Float Market.Cap
>> 234    IBM 147.28   1.24e+09 1.24e+09 1.8297e+11
>>
>> $X2011.01.12
>>    Symbol Price Shares.Out    Float Market.Cap
>> 234    IBM 149.1   1.24e+09 1.24e+09 1.8524e+11
>>  . . .
>>
>> What I would like to do is create a new data.frame with 68 rows by 5
>> columns of data, perhaps using the old data.frame names as the new
>> row.names. I can get to the subset of data that I want, I just can't
>> get it from list form into one new data.frame.  Any suggestions?
>> Thank you.
>>
>> James
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] taking rows from data.frames in list to form new data.frame?

2011-04-20 Thread Dennis Murphy
Hi:

Perhaps you're looking for subset()? I'm not sure I understand the
problem completely, but is

do.call(rbind, lapply(database, function(df) subset(df, Symbol == 'IBM')))

or

library(plyr)
ldply(lapply(database, function(df) subset(df, Symbol == 'IBM')), rbind)

in the vicinity of what you're looking for? [Obviously untested for
the usual reasons...]

HTH,
Dennis

On Wed, Apr 20, 2011 at 4:13 PM, jctoll  wrote:
> Hi,
>
> I am having a problem figuring out how to extract a subset of rows.  I
> have a list with 68 similar data.frames.  Each data.frame is 500 rows
> by 5 columns.  I want to take one row from each data.frame based upon
> the data in a particular column (i.e. it matches a symbol).  For
> example:
>
>> str(database)
> List of 68
>  $ X2011.01.11:'data.frame':    500 obs. of  5 variables:
>  ..$ Symbol    : chr [1:500] "MMM" "ACE" "AES" "AFL" ...
>  ..$ Price     : num [1:500] 87.7 60.7 13.1 55.7 15.6 ...
>  ..$ Shares.Out: num [1:500] 7.15e+08 3.39e+08 7.88e+08 4.71e+08 1.10e+08 ...
>  ..$ Float     : num [1:500] 7.13e+08 3.38e+08 6.61e+08 4.60e+08 1.09e+08 ...
>  ..$ Market.Cap: num [1:500] 6.27e+10 2.06e+10 1.04e+10 2.62e+10 1.72e+09 ...
>  $ X2011.01.12:'data.frame':    500 obs. of  5 variables:
>  ..$ Symbol    : chr [1:500] "MMM" "ACE" "AES" "AFL" ...
>  ..$ Price     : num [1:500] 88.7 60.9 12.9 57.1 15.2 ...
>  ..$ Shares.Out: num [1:500] 7.15e+08 3.39e+08 7.88e+08 4.71e+08 1.10e+08 ...
>  ..$ Float     : num [1:500] 7.13e+08 3.38e+08 6.61e+08 4.60e+08 1.09e+08 ...
>  ..$ Market.Cap: num [1:500] 6.34e+10 2.06e+10 1.02e+10 2.69e+10 1.68e+09 ...
>  . . .
>
>> lapply(database, function(x) which(x == "IBM"))
> $X2011.01.11
> [1] 234
>
> $X2011.01.12
> [1] 234
>  . . .
>
>> lapply(database, function(x) x[which(x == "IBM"), ])
> $X2011.01.11
>    Symbol  Price Shares.Out    Float Market.Cap
> 234    IBM 147.28   1.24e+09 1.24e+09 1.8297e+11
>
> $X2011.01.12
>    Symbol Price Shares.Out    Float Market.Cap
> 234    IBM 149.1   1.24e+09 1.24e+09 1.8524e+11
>  . . .
>
> What I would like to do is create a new data.frame with 68 rows by 5
> columns of data, perhaps using the old data.frame names as the new
> row.names. I can get to the subset of data that I want, I just can't
> get it from list form into one new data.frame.  Any suggestions?
> Thank you.
>
> James
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grid.table + splom: how to nicely align panel entries

2011-04-20 Thread baptiste auguie
On 21 April 2011 09:54, Marius Hofert  wrote:
> Dear Baptiste,
>
> great, many thanks!
> One last thing: Do you know why the gpar(cex=0.1) argument is ignored?
>

Yes – the theme overrides it, you need to include it in the theme.list().

baptiste


> Cheers,
>
> Marius
>
> library(lattice)
> library(grid)
> library(gridExtra)
>
> ## function for correct digit alignment
> align.digits <- function(l){
>    sp <- strsplit(as.character(l), "\\.")
>    chars <- sapply(sp, function(x) nchar(x)[1])
>    n <- max(chars)-chars
>    l0 <- sapply(n, function(x) paste(rep("0", x), collapse=""))
>    labels <- sapply(seq_along(sp), function(i){
>        point <- if(is.na(sp[[i]][2])) NULL else quote(.)
>        as.expression(bquote(phantom(.(l0[i])) * .(sp[[i]][1]) * .(point) * 
> .(sp[[i]][2]) ))})
> }
>
> ## splom with customized lower.panel
> ## x: data
> ## arr: array of containing expressions which are plotted in a grid table in 
> the
> ##      lower panel (i,j)]
> splom2 <- function(x, arr, nr){
>    ## function for creating table
>    table.fun <- function(vec){ # vector containing lines for table for *one* 
> panel
>        grid.table(matrix(vec, nrow=nr, byrow=TRUE),
>                   parse=TRUE, # parse labels as expressions
>                   gpar.coretext=gpar(cex=0.1), # text size
>                   theme=theme.list(
>                   gpar.corefill=gpar(fill=NA, col=NA), # make bg transparent
>                   core.just="left", padding.h=unit(0,"mm")) # justification 
> of labels
>                   )
>    }
>    ## splom
>    splom(x, varname.cex=1.2,
>          superpanel=function(z, ...){
>              panel.pairs(z, upper.panel=panel.splom, 
> lower.panel=function(i,j){
>                  table.fun(arr[i,j,])
>              }, ...)
>          })
> }
>
> ## create data and array of expressions
> d <- 4
> x <- matrix(runif(d*1000), ncol=d) # data to be plotted with splom
> nr <- 3 # number of rows for the panel entries
> nc <- 3 # number of cols for the panel entries
> arr <- array(list(rep(NA,nr*nc)), dim=c(d,d,nr*nc), 
> dimnames=c("i","j","val")) # array containing the table entries per panel
> f <- function(i,j) (i+j)*10 # dummy function
> eq <- "phantom()==phantom()"
> for(i in 1:d){
>    for(j in 1:d){
>        numbers <- align.digits(c(round(pi,4), round(pi, 6), f(i,j)))
>        arr[i,j,] <- c("alpha", eq, numbers[1],
>                       "italic(bbb)", eq, numbers[2],
>                       "gamma", eq, numbers[3])
>    }
> }
>
> ## plot
> splom2(x, arr, nr=3)
>
>
> On 2011-04-20, at 22:38 , baptiste auguie wrote:
>
>> Try this,
>>
>> align.digits = function(l)
>> {
>>
>> sp <- strsplit(as.character(l), "\\.")
>> chars <- sapply(sp, function(x) nchar(x)[1])
>> n = max(chars) - chars
>> l0 = sapply(n, function(x) paste(rep("0", x), collapse=""))
>> labels = sapply(seq_along(sp), function(i) {
>>  point <- if(is.na(sp[[i]][2])) NULL else quote(.)
>>  as.expression(bquote(phantom(.(l0[i])) * .(sp[[i]][1])*
>> .(point)*.(sp[[i]][2]) ))})
>>
>> return(labels)
>> }
>>
>>
>> library(gridExtra)
>>
>> d <- align.digits(l = c(125.3, 1.2344, 12))
>> grid.newpage()
>> grid.table(d, parse=T, core.just="left", gpar.coretext=gpar(cex=0.5))
>>
>> HTH,
>>
>> baptiste
>>
>> On 21 April 2011 03:07, Marius Hofert  wrote:
>>> Dear Baptiste,
>>>
>>> very nice, indeed!
>>>
>>> Two minor issues that remain, are:
>>> (1) I tried to omit the decimal dot for those numbers that do not have 
>>> digits
>>>    after the decimal dot. But somehow it does not work...
>>> (2) Do you know how one can decrease the text size for the text appearing 
>>> in the
>>>    lower panel? I tried to work with "cex=0.5"... but it was ignored all 
>>> the time.
>>>
>>> Cheers,
>>>
>>> Marius
>>>
>>>
>>> library(lattice)
>>> library(grid)
>>> library(gridExtra)
>>>
>>> ## function for correct digit alignment
>>> align.digits <- function(l){
>>>    sp <- strsplit(as.character(l), "\\.")
>>>    chars <- sapply(sp, function(x) nchar(x)[1])
>>>    n <- max(chars)-chars
>>>    l0 <- sapply(n, function(x) paste(rep("0", x), collapse=""))
>>>    sapply(seq_along(sp), function(i){
>>>        if(length(sp[[1]])==1){
>>>            as.expression(bquote(phantom(.(l0[i])) * .(sp[[i]][1])))
>>>        }else{
>>>            as.expression(bquote(phantom(.(l0[i])) * 
>>> .(sp[[i]][1])*.*.(sp[[i]][2])))
>>>        }
>>>    })
>>> }
>>>
>>> ## splom with customized lower.panel
>>> ## x: data
>>> ## arr: array of containing expressions which are plotted in a grid table 
>>> in the
>>> ##      lower panel (i,j)]
>>> splom2 <- function(x, arr, nr){
>>>    ## function for creating table
>>>    table.fun <- function(vec){ # vector containing lines for table for 
>>> *one* panel
>>>        grid.table(matrix(vec, nrow=nr, byrow=TRUE),
>>>                   parse=TRUE, # parse labels as expressions
>>>                   theme=theme.list(
>>>                   gpar.corefill=gpar(fill=NA, col=NA), # make bg transparent
>>>                   core.just="le

[R] R with X_Trader XTAPI

2011-04-20 Thread dbonneau
Is there a way, or an existing package, to connect R to X_Trader using the
XTAPI? I know there is a way of connecting R to TWS (Interactive Brokers), but
I was wondering if anyone has connected R to X_Trader? Thank you.


--
View this message in context: 
http://r.789695.n4.nabble.com/R-with-X-Trader-XTAPI-tp3464484p3464484.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help needed!

2011-04-20 Thread Shuangyan Xiong

Hi everyone,

I have a question. I am reading the source code of the package "ssfcov". The
source code is as follows. I cannot find the source code of the function
"myss2d" anywhere in the package. Can anyone give me a hint on how to find it
in the package? Thanks a lot!



> ssfcov
function (time, x, subject, nbasis = 5, centered = FALSE, noDiag = TRUE)
{
    if (!centered) {
        fit <- smooth.spline(time, x)
        x <- x - fitted(fit)
    }
    gg <- NULL
    for (zz in unique(subject)) {
        if (sum(subject == zz) > 1) {
            tt <- time[subject == zz]
            xx <- x[subject == zz]
            g <- expand.grid(t1 = tt, t2 = tt)
            scov <- xx %*% t(xx)
            if (noDiag)
                scov <- scov + diag(rep(Inf, length(xx)))
            g$z <- matrix(scov, ncol = 1)
            gg <- rbind(gg, g[g$z < Inf, ])
        }
    }
    nobs <- nrow(gg)
    tt <- min(time) + (max(time) - min(time)) * (1:nbasis)/(nbasis + 1)
    g <- expand.grid(t1 = tt, t2 = tt)
    g$z <- 0
    gg <- rbind(gg, g)
    fit <- myss2d(z ~ t1 * t2, data = gg, id.basis = ((nobs + 1):(nobs + nbasis * nbasis)))
    class(fit) <- "rfcovObj"
    return(fit)
}
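
For reference, a couple of standard ways to look for a function that an
installed package defines but does not export (a sketch, assuming the package
really is installed under the name "ssfcov"):

library(ssfcov)                # load it so its namespace is searched
getAnywhere("myss2d")          # looks in attached packages and loaded namespaces
ls("package:ssfcov")           # exported objects only
ls(getNamespace("ssfcov"))     # everything in the namespace, if the package has one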



Best,
Shuangyan
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can R replicate this data manipulation in SAS?

2011-04-20 Thread Ista Zahn
Oops, I missed the HAART part. Fortunately that translates straightforwardly:

n.dat$HAART <- with(n.dat, ifelse((NRTI >= 3 & NNRTI==0 & PI==0) |
  (NRTI >= 2 & (NNRTI >= 1 | PI >= 1)) |
  (NRTI == 1 & NNRTI >= 1 & PI >= 1),
  1, 0))

Best,
Ista

On Wed, Apr 20, 2011 at 5:22 PM, Ista Zahn  wrote:
> I think this is kind of like asking "will your Land Rover make it up
> my driveway?", but I'll assume the question was asked in all
> seriousness.
>
> Here is one solution:
>
> ##  Read in test data;
> dat <- read.table(textConnection("id    drug      start       stop
> 1004    NRTI     07/24/95    01/05/99
> 1004    NRTI     11/20/95 12/10/95
> 1004    NRTI     01/10/96    01/05/99
> 1004    PI       05/09/96    11/16/97
> 1004    NRTI     06/01/96    02/01/97
> 1004    NRTI     07/01/96    03/01/97
>     PI       01/02/03    NA
>     NNRTI    04/05/06    07/08/09"), header=TRUE)
> closeAllConnections()
>
> dat$start <- as.Date(dat$start, format = "%m/%d/%y")
> dat$stop <- as.Date(dat$stop, format = "%m/%d/%y")
>
> ##  Reshape data into series with 1 date rather than separate starts and
> ## stops;
>
> library(reshape)
>
> m.dat <- melt(dat, id = c("id", "drug"))
> m.dat <- m.dat[order(m.dat$id, m.dat$value),]
> m.dat$variable <- ifelse(m.dat$variable == "start", 1, -1)
> names(m.dat) <-  c("id", "drug", "value", "date")
> m.dat
>
> ##  Get regimen information plus start and stop dates;
>
> n.dat <- cast(m.dat, id + date ~ drug, fun.aggregate=sum, margins="grand_col")
> for (i in names(n.dat)[-c(1:2)]) {
>     n.dat[i] <- cumsum(n.dat[i])
>   }
> n.dat <- ddply(n.dat, .(id), transform,
>      regimen = 1:length(id))
> n.dat
>
> ssd.dat <- ddply(n.dat, .(id), summarize,
>                id = id[-1],
>                regimen = regimen[-length(regimen)],
>                 start_date = date[-length(date)],
>                stop_date = date[-1])
> ssd.dat
>
> ##  Merge data to create regimens dataset;
> all.dat <- merge(n.dat[-2], ssd.dat)
> all.dat <- all.dat[order(all.dat$id, all.dat$regimen), c("id",
> "start_date", "stop_date", "regimen", "NRTI", "NNRTI", "PI",
> "X.all.")]
> all.dat
>
>
> Best,
> Ista
>
>
>
> On Wed, Apr 20, 2011 at 2:59 PM, Ted Harding  wrote:
>> [*** PLEASE NOTE: I am sending this message on behalf of
>>  Paul Miller:
>>  Paul Miller 
>>  (to whom this message has also been copied). He has been
>>  trying to send it, but it has never got through. Please
>>  do  not reply to me, but either to the list and/or to Paul
>>  at that address ***]
>> ==
>> Hello Everyone,
>>
>> I'm learning R and am trying to get a better sense of what it will and
>> will not
>> do. I'm hearing in some places that R may not be able to accomplish all
>> of the
>> data manipulation tasks that SAS can. In others, I'm hearing that R can do
>> pretty much any data manipulation that SAS can but the way in which it
>> does so
>> is likely to be quite different.
>>
>> Below is some SAS syntax that that codes Highly Active Antiretroviral
>> Therapy
>> (HAART) regimens in HIV patients by retaining the values of variables.
>> Interspersed between the bits of code are printouts of data sets that are
>> created in the process of coding. I'm hoping this will come through
>> clearly and
>> that people will be able to see exactly what is being done. Basically,
>> the code
>> keeps track of how many drugs people are on and what types of drugs they
>> are
>> taking during specific periods of time and decides whether that
>> constitutes
>> HAART or not.
>>
>> To me, this is a pretty tricky data manipulation in SAS. Is there any way
>> to
>> get the equivalent result in R?
>>
>> Thanks,
>>
>> Paul
>>
>>
>>  SAS syntax for coding HAART in HIV patients;
>>  Read in test data;
>>
>> data haart;
>> input id drug_class $ start_date :mmddyy. stop_date :mmddyy.;
>> format start_date stop_date mmddyy8.;
>> cards;
>> 1004 NRTI  07/24/95 01/05/99
>> 1004 NRTI  11/20/95 12/10/95
>> 1004 NRTI  01/10/96 01/05/99
>> 1004 PI    05/09/96 11/16/97
>> 1004 NRTI  06/01/96 02/01/97
>> 1004 NRTI  07/01/96 03/01/97
>>  PI    01/02/03 .
>>  NNRTI 04/05/06 07/08/09
>> ;
>> run;
>>
>> proc print data=haart;
>> run;
>>
>>               drug_      start_       stop_
>> Obs     id     class        date        date
>> 1     1004    NRTI     07/24/95    01/05/99
>> 2     1004    NRTI     11/20/95 12/10/95
>> 3     1004    NRTI     01/10/96    01/05/99
>> 4     1004    PI       05/09/96    11/16/97
>> 5     1004    NRTI     06/01/96    02/01/97
>> 6     1004    NRTI     07/01/96    03/01/97
>> 7         PI       01/02/03           .
>> 8         NNRTI    04/05/06    07/08/09
>>
>>  Reshape data into series with 1 date rather than separate starts and
>> stops;
>>
>> data changes (drop=start_date stop_date where=(not missing(date)));
>> set haart;
>> date = start_date;

[R] invoking python from R

2011-04-20 Thread 엄기홍
Dear colleagues,

I was wondering whether anyone could give me some help with a Python issue in R.
I tried to invoke Python from R, but failed.
I have added the Python directory "C:\python27" to the system PATH.
In addition, I have confirmed that the Python interpreter itself works properly.

The code I used is a typical check, as follows:
x <- .Python("listdir", "/tmp", .module="posix")

The command gives me the following message:
> x <- .Python("listdir", "/tmp", .module="posix")
Error: could not find function ".Python"
Any help would be greatly appreciated.

Best,

Kihong

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] multiple imputation

2011-04-20 Thread DOCMAA
I have missing values for a few subjects due to instrumentation not working.
My data set is N = 283 data points. For some subjects I have at most 60 data
points missing.

I tried to use Amelia II to impute the missing values, but I am getting
negative numbers, and I am sure this is wrong because it is biologically
implausible to have a negative value for what I am measuring.

Does anyone have any suggestions on how to proceed?
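
A sketch of two Amelia II options worth looking at; the data frame name
(mydata), the variable name (myvar) and the column index below are made up, so
adjust them to the real data (see ?amelia for the logs and bounds arguments):

library(Amelia)
a.out  <- amelia(mydata, m = 5, logs = "myvar")   # impute the positive variable on the log scale
a.out2 <- amelia(mydata, m = 5,                   # or constrain its imputations to [0, Inf);
                 bounds = matrix(c(3, 0, Inf), nrow = 1))  # bounds rows are (column, lower, upper)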

Thanks.

--
View this message in context: 
http://r.789695.n4.nabble.com/multiple-imputation-tp3464093p3464093.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] taking rows from data.frames in list to form new data.frame?

2011-04-20 Thread jctoll
Hi,

I am having a problem figuring out how to extract a subset of rows.  I
have a list with 68 similar data.frames.  Each data.frame is 500 rows
by 5 columns.  I want to take one row from each data.frame based upon
the data in a particular column (i.e. it matches a symbol).  For
example:

> str(database)
List of 68
 $ X2011.01.11:'data.frame':   500 obs. of  5 variables:
  ..$ Symbol    : chr [1:500] "MMM" "ACE" "AES" "AFL" ...
  ..$ Price     : num [1:500] 87.7 60.7 13.1 55.7 15.6 ...
  ..$ Shares.Out: num [1:500] 7.15e+08 3.39e+08 7.88e+08 4.71e+08 1.10e+08 ...
  ..$ Float     : num [1:500] 7.13e+08 3.38e+08 6.61e+08 4.60e+08 1.09e+08 ...
  ..$ Market.Cap: num [1:500] 6.27e+10 2.06e+10 1.04e+10 2.62e+10 1.72e+09 ...
 $ X2011.01.12:'data.frame':   500 obs. of  5 variables:
  ..$ Symbol    : chr [1:500] "MMM" "ACE" "AES" "AFL" ...
  ..$ Price     : num [1:500] 88.7 60.9 12.9 57.1 15.2 ...
  ..$ Shares.Out: num [1:500] 7.15e+08 3.39e+08 7.88e+08 4.71e+08 1.10e+08 ...
  ..$ Float     : num [1:500] 7.13e+08 3.38e+08 6.61e+08 4.60e+08 1.09e+08 ...
  ..$ Market.Cap: num [1:500] 6.34e+10 2.06e+10 1.02e+10 2.69e+10 1.68e+09 ...
 . . .

> lapply(database, function(x) which(x == "IBM"))
$X2011.01.11
[1] 234

$X2011.01.12
[1] 234
 . . .

> lapply(database, function(x) x[which(x == "IBM"), ])
$X2011.01.11
    Symbol  Price Shares.Out    Float Market.Cap
234    IBM 147.28   1.24e+09 1.24e+09 1.8297e+11

$X2011.01.12
    Symbol Price Shares.Out    Float Market.Cap
234    IBM 149.1   1.24e+09 1.24e+09 1.8524e+11
 . . .

What I would like to do is create a new data.frame with 68 rows by 5
columns of data, perhaps using the old data.frame names as the new
row.names. I can get to the subset of data that I want, I just can't
get it from list form into one new data.frame.  Any suggestions?
Thank you.

James

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] BMA, logistic regression, odds ratio, model reduction etc

2011-04-20 Thread Frank Harrell
I think it's OK.  You can also use the Hmisc package's varclus function.
Frank
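
For concreteness, a minimal sketch of that penalized-lrm route, using the
variable names from the reduced model described below (the penalty grid is
arbitrary):

library(rms)
f <- lrm(postopDWI_HI ~ stenosis + x2 + x3 + procedure + ClinicalScore,
         data = x6.df, x = TRUE, y = TRUE)   # x, y needed for penalization
p <- pentrace(f, seq(0, 20, by = 0.5))       # trace out candidate penalties
f.pen <- update(f, penalty = p$penalty)      # refit with the penalty pentrace selected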


細田弘吉 wrote:
> 
> Dear Prof. Harrel,
> 
> Thank you very much for your quick advice.
> I will try rms package.
> 
> Regarding model reduction, is my model 2 method (clustering and recoding 
> that are blinded to the outcome) permissible?
> 
> Sincerely,
> 
> --
> KH
> 
> (11/04/20 22:01), Frank Harrell wrote:
>> Deleting variables is a bad idea unless you make that a formal part of
>> the
>> BMA so that the attempt to delete variables is penalized for.  Instead of
>> BMA I recommend simple penalized maximum likelihood estimation (see the
>> lrm
>> function in the rms package) or pre-modeling data reduction that is
>> blinded
>> to the outcome variable.
>> Frank
>>
>>
>> 細田弘吉 wrote:
>>>
>>> Hi everybody,
>>> I apologize for long mail in advance.
>>>
>>> I have data of 104 patients, which consists of 15 explanatory variables
>>> and one binary outcome (poor/good). The outcome consists of 25 poor
>>> results and 79 good results. I tried to analyze the data with logistic
>>> regression. However, the 15 variables and 25 events means events per
>>> variable (EPV) is much less than 10 (rule of thumb). Therefore, I used R
>>> package, "BMA" to perform logistic regression with BMA to avoid this
>>> problem.
>>>
>>> model 1 (full model):
>>> x1, x2, x3, x4 are continuous variables and others are binary data.
>>>
 x16.bic.glm<- bic.glm(outcome ~ ., data=x16.df,
>>> glm.family="binomial", OR20, strict=FALSE)
 summary(x16.bic.glm)
>>> (The output below has been cut off at the right edge to save space)
>>>
>>>62  models were selected
>>>   Best  5  models (cumulative posterior probability =  0.3606 ):
>>>
>>>   p!=0EV SDmodel 1model2
>>> Intercept100-5.1348545  1.652424-4.4688  -5.15
>>> -5.1536
>>> age3.3   0.0001634  0.007258  .
>>> sex4.0
>>> .M   -0.0243145  0.220314  .
>>> side  10.8
>>>  .R   0.0811227  0.301233  .
>>> procedure 46.9  -0.5356894  0.685148  .  -1.163
>>> symptom3.8  -0.0099438  0.129690  .  .
>>> stenosis   3.4  -0.0003343  0.005254  .
>>> x13.7  -0.0061451  0.144084  .
>>> x2   100.0   3.1707661  0.892034 3.2221 3.11
>>> x351.3  -0.4577885  0.551466-0.9154 .
>>> HT 4.6
>>>.positive  0.0199299  0.161769  .  .
>>> DM 3.3
>>>.positive -0.0019986  0.105910  .  .
>>> IHD3.5
>>> .positive 0.0077626  0.122593  .  .
>>> smoking9.1
>>> .positive 0.0611779  0.258402  .  .
>>> hyperlipidemia16.0
>>>.positive  0.1784293  0.512058  .  .
>>> x4 8.2   0.0607398  0.267501  .  .
>>>
>>>
>>> nVar   2  2
>>>   1  3  3
>>> BIC   -376.9082
>>> -376.5588  -376.3094  -375.8468  -374.5582
>>> post prob0.104
>>> 0.087  0.077  0.061  0.032
>>>
>>> [Question 1]
>>> Is it O.K to calculate odds ratio and its 95% confidence interval from
>>> "EV" (posterior distribution mean) and“SD”(posterior distribution
>>> standard deviation)?
>>> For example, 95%CI of EV of x2 can be calculated as;
 exp(3.1707661)
>>> [1] 23.82573 ->  odds ratio
 exp(3.1707661+1.96*0.892034)
>>> [1] 136.8866
 exp(3.1707661-1.96*0.892034)
>>> [1] 4.146976
>>> -->  95%CI (4.1 to 136.9)
>>> Is this O.K.?
>>>
>>> [Question 2]
>>> Is it permissible to delete variables with small value of "p!=0" and
>>> "EV", such as age (3.3% and 0.0001634) to reduce the number of
>>> explanatory variables and reconstruct new model without those variables
>>> for new session of BMA?
>>>
>>> model 2 (reduced model):
>>> I used R package, "pvclust", to reduce the model. The result suggested
>>> x1, x2 and x4 belonged to the same cluster, so I picked up only x2.
>>> Based on the subject knowledge, I made a simple unweighted sum, by
>>> counting the number of clinical features. For 9 features (sex, side,
>>> HT2, hyperlipidemia, DM, IHD, smoking, symptom, age), the sum ranges
>>> from 0 to 9. This score was defined as ClinicalScore. Consequently, I
>>> made up new data set (x6.df), which consists of 5 variables (stenosis,
>>> x2, x3, procedure, and ClinicalScore) and one binary outcome
>>> (poor/good). Then, for alternative BMA session...
>>>
 BMAx6.glm<- bic.glm(postopDWI_HI ~ ., dat

Re: [R] question regarding qmvnorm

2011-04-20 Thread Rolf Turner

On 21/04/11 09:59, li li wrote:

Dear all,
 I wrote the following function previously. It worked fine with the old
mvtnorm package. Somehow, with the updated package, I get an error message when
trying to use the function. I need some help; it is sort of urgent. Can anyone
please take a look? The function is the following.
Thank you very much!
  Hannah


(1) In the first instance you should probably contact the maintainer of
the mvtnorm package:

require(utils)
maintainer("mvtnorm")

(2) The code you enclosed indicates that you don't understand anything
about functions.  You have m, rho, k, and alpha given as *arguments*
to your function, and yet you specify them explicitly within the body
of the function.  This makes no sense at all.

If you are going to do something it pays to *understand* what you are
doing rather than just slapping down some code at random and hoping
that it will work.

Delete the first four (non-blank!) lines of your code and then call

cc_f(10,0.1,2,0.5)

This does indeed throw an error --- from uniroot().  As I say, contact the
package maintainer.

cheers,

Rolf Turner


library(mvtnorm)

cc_f<- function(m, rho, k, alpha) {

m<- 10

rho<- 0.1

k<- 2

alpha<- 0.05


cc_z<- numeric(m)

var<- matrix(c(1,rho,rho,1), nrow=2, ncol=2, byrow=T)

for (i in 1:m){

if (i<= k) {cc_z[i]<- qmvnorm((k*(k-1))/(m*(m-1))*alpha, tail="upper",
sigma=var)$quantile} else

{cc_z[i]<- qmvnorm((k*(k-1))/((m-i+k)*(m-i+k-1))*alpha, tail
="upper", sigma=var)$quantile}

}

cc<- 1-pnorm(cc_z)

return(cc)

  }


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] succession time series graph

2011-04-20 Thread David Winsemius


On Apr 20, 2011, at 6:23 PM, David Bird wrote:


Dear gracious R community,

I would like to produce charts of phytoplankton biomass changes
through time. Each species has a line, and the biomass varies in
mirror form along the line for each species along the X time axis.
Here is an example of what I'd like to do:
http://www.er.uqam.ca/nobel/r30240/Succession.jpg


Got any data?  (the polygon plotting function is likely to be of use  
and you might want to look at how violin plots are handled.)
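
A toy sketch of that polygon() idea with made-up smooth curves, just to show
the mirrored-ribbon construction (no real data here):

time <- 1:50
biomass <- sapply(1:3, function(i) dnorm(time, mean = 10 + 15*i, sd = 5) * 5)
plot(range(time), c(0.5, 3.5), type = "n", xlab = "time", ylab = "", yaxt = "n")
axis(2, at = 1:3, labels = paste("sp", 1:3), las = 1)
for (i in 1:3)    # one ribbon per species, mirrored about the baseline y = i
    polygon(c(time, rev(time)), c(i + biomass[, i], rev(i - biomass[, i])),
            col = "grey80", border = "grey30")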


--
David

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] question regarding qmvnorm

2011-04-20 Thread Ravi Varadhan
If you had told us what the error message was, my job would have been easier.
But at least you provided the code, so it was not hard for me to see where the
problem was.  There is a problem with the strategy used by `qmvnorm' to locate
the initial interval in which the quantile is supposed to lie.  In particular,
the problem is with the approx_interval() function.

In your case, the interval computed by this function does not contain the root.
Hence, uniroot() fails.  The solution is to supply an interval that does contain
the root.  I am cc'ing Torsten so that he is aware of the problem.

The following works:

m <- 10
rho <- 0.1
k <- 2
alpha <- 0.05

cc_z <- rep(NA, length=m)
var <- matrix(c(1,rho,rho,1), nrow=2, ncol=2, byrow=T)
for (i in 1:m){
if (i <= k) {cc_z[i] <- qmvnorm((k*(k-1))/(m*(m-1))*alpha, tail="upper",
sigma=var, interval=c(0, 5))$quantile} else
{cc_z[i] <- qmvnorm((k*(k-1))/((m-i+k)*(m-i+k-1))*alpha, tail
="upper", sigma=var, interval=c(0,5))$quantile}
}

cc_z

> cc_z
 [1] 1.9438197 1.9438197 1.8910860 1.8303681 1.7590806 1.6730814 1.5652220
 [8] 1.4219558 1.2116459 0.8267921
>

Ravi.
  


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
li li [hannah@gmail.com]
Sent: Wednesday, April 20, 2011 5:59 PM
To: r-help
Subject: [R] question regarding qmvnorm

Dear all,
I wrote the following function previously. It worked fine with the old
mvtnorm package.
Somehow with the updated package, I got a error message when trying to use
the function.
I need some help. It is sort of urgent. Can anyone please take a look. The
function is the following.
Thank you very much!
 Hannah

library(mvtnorm)

cc_f <- function(m, rho, k, alpha) {

m <- 10

rho <- 0.1

k <- 2

alpha <- 0.05


cc_z <- numeric(m)

var <- matrix(c(1,rho,rho,1), nrow=2, ncol=2, byrow=T)

for (i in 1:m){

   if (i <= k) {cc_z[i] <- qmvnorm((k*(k-1))/(m*(m-1))*alpha, tail="upper",
sigma=var)$quantile} else

   {cc_z[i] <- qmvnorm((k*(k-1))/((m-i+k)*(m-i+k-1))*alpha, tail
="upper", sigma=var)$quantile}

   }

cc <- 1-pnorm(cc_z)

return(cc)

 }

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] succession time series graph

2011-04-20 Thread Sarah Goslee
Hi David,

On Wed, Apr 20, 2011 at 6:23 PM, David Bird  wrote:
> Dear gracious R community,
>
> I would like to produce charts of phytoplankton biomass changes
> through time. Each species has a line, and the biomass varies in
> mirror form along the line for each species along the X time axis.
> Here is an example of what I'd like to do:
> http://www.er.uqam.ca/nobel/r30240/Succession.jpg

It's called a "violin plot" and this example may help you get going:
http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=43

Otherwise we need more details, like how your data are organized
(ideally as a reproducible example).

Sarah


-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] succession time series graph

2011-04-20 Thread David Bird

Dear gracious R community,

I would like to produce charts of phytoplankton biomass changes
through time. Each species has a line, and the biomass varies in
mirror form along the line for each species along the X time axis.
Here is an example of what I'd like to do:
http://www.er.uqam.ca/nobel/r30240/Succession.jpg

Any suggestions?

Thanks
David Bird
UQAM, Montreal

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Integrate na.rm in own defined functions

2011-04-20 Thread Rolf Turner

On 20/04/11 22:25, vioravis wrote:

This should work!!

rmse <- function(x){
    dquared <- x^2
    sum1 <- sum(x^2, na.rm=TRUE)
    rmse <- sqrt((1/length(x))*sum1)
    rmse
}


Shouldn't the divisor be the number of non-missing values in x?
Rather than the length of x?  (Like, e.g. sum(!is.na(x)) ?)
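
That is, something along these lines:

rmse <- function(x){
    sqrt(sum(x^2, na.rm = TRUE) / sum(!is.na(x)))
}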

cheers,

Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rcmdr vs SPSS

2011-04-20 Thread Eik Vettorazzi
Hi,
maybe my eyes are not very good, but I can't see any difference between
the two sets of results, up to different rounding conventions...



On 20.04.2011 23:36, Tamas Barjak wrote:
> Hy all!
> 
> Excuse me for the inaccurate composition, but I do not speak well in
> English.
> 
> I noticed a mistake in Rcmdr (?) -- Models menu --- Add observation
> statistics to data --- Studentized residuals.
> My output :
> 
> (Rcmdr !!!)
> 
> rstudent.RegModel.1 (= *Studentized residuals*)
> 
> -1.5690952
> -0.0697492
> -0.6830684
> 1.0758056
> 0.2719739
> 0.3626101
> 0.8361803
> 1.0180479
> 0.8936783
> -0.4630021
> -3.2972946
> 
> AND!!!
> 
> SPSS (= *Studentized DELETED residuals*)
> 
> -1,56910
> -0,06975
> -0,68307
> 1,07581
> 0,27197
> 0,36261
> 0,83618
> 1,01805
> 0,89368
> -0,46300
> -3,29729
> 
> This wrong???
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rcmdr vs SPSS

2011-04-20 Thread Jeremy Miles
What's the mistake?  They look like the same numbers to me.  (Although
I didn't check them all).

Oh, hang on, are you saying that they're different kinds of residuals,
but they are the same?  This is because SPSS names its residuals
wrongly.

SPSS has standardized residuals, these are residuals divided by the
standard deviation of the residuals, calculated as the overall SD.

SPSS has studentized residuals.  Everyone else calls these
standardized residuals.

SPSS has studentized deleted residuals, which follow a student's
t-distribution.  Everyone else calls these studentized residuals
(because they follow that distribution).  John Fox's regression book
 explains all of
this nicely.
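
A small sketch (not in the original reply) tying those names to the R
functions involved; 'cars' is just a stand-in data set:

fit <- lm(dist ~ speed, data = cars)

r_standardized <- rstandard(fit)   # internally studentized; SPSS labels these "studentized"
r_studentized  <- rstudent(fit)    # externally studentized (deleted); SPSS labels these
                                   # "studentized deleted"; Rcmdr's rstudent column reports these
head(cbind(r_standardized, r_studentized))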

Jeremy




On 20 April 2011 14:36, Tamas Barjak  wrote:
> Hy all!
>
> Excuse me for the inaccurate composition, but I do not speak well in
> English.
>
> I noticed a mistake in Rcmdr (?) -- Models menu --- Add observation
> statistics to data --- Studentized residuals.
> My output :
>
> (Rcmdr !!!)
>
> rstudent.RegModel.1 (= *Studentized residuals*)
>
> -1.5690952
> -0.0697492
> -0.6830684
> 1.0758056
> 0.2719739
> 0.3626101
> 0.8361803
> 1.0180479
> 0.8936783
> -0.4630021
> -3.2972946
>
> AND!!!
>
> SPSS (= *Studentized DELETED residuals*)
>
> -1,56910
> -0,06975
> -0,68307
> 1,07581
> 0,27197
> 0,36261
> 0,83618
> 1,01805
> 0,89368
> -0,46300
> -3,29729
>
> This wrong???
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jeremy Miles
Psychology Research Methods Wiki: www.researchmethodsinpsychology.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] question regarding qmvnorm

2011-04-20 Thread li li
Dear all,
I wrote the following function previously. It worked fine with the old
mvtnorm package.
Somehow with the updated package, I got an error message when trying to use
the function.
I need some help. It is sort of urgent. Can anyone please take a look. The
function is the following.
Thank you very much!
 Hannah

library(mvtnorm)

cc_f <- function(m, rho, k, alpha) {

m <- 10

rho <- 0.1

k <- 2

alpha <- 0.05


cc_z <- numeric(m)

var <- matrix(c(1,rho,rho,1), nrow=2, ncol=2, byrow=T)

for (i in 1:m){

   if (i <= k) {cc_z[i] <- qmvnorm((k*(k-1))/(m*(m-1))*alpha, tail="upper",
sigma=var)$quantile} else

   {cc_z[i] <- qmvnorm((k*(k-1))/((m-i+k)*(m-i+k-1))*alpha, tail
="upper", sigma=var)$quantile}

   }

cc <- 1-pnorm(cc_z)

return(cc)

 }

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grid.table + splom: how to nicely align panel entries

2011-04-20 Thread Marius Hofert
Dear Baptiste,

great, many thanks!
One last thing: Do you know why the gpar(cex=0.1) argument is ignored?

Cheers,

Marius

library(lattice) 
library(grid)
library(gridExtra)

## function for correct digit alignment
align.digits <- function(l){
sp <- strsplit(as.character(l), "\\.")
chars <- sapply(sp, function(x) nchar(x)[1])
n <- max(chars)-chars
l0 <- sapply(n, function(x) paste(rep("0", x), collapse=""))
labels <- sapply(seq_along(sp), function(i){
point <- if(is.na(sp[[i]][2])) NULL else quote(.)
as.expression(bquote(phantom(.(l0[i])) * .(sp[[i]][1]) * .(point) * 
.(sp[[i]][2]) ))})
}

## splom with customized lower.panel
## x: data
## arr: array of containing expressions which are plotted in a grid table in 
the 
##  lower panel (i,j)]
splom2 <- function(x, arr, nr){
## function for creating table 
table.fun <- function(vec){ # vector containing lines for table for *one* 
panel
grid.table(matrix(vec, nrow=nr, byrow=TRUE),
   parse=TRUE, # parse labels as expressions
   gpar.coretext=gpar(cex=0.1), # text size
   theme=theme.list(
   gpar.corefill=gpar(fill=NA, col=NA), # make bg transparent
   core.just="left", padding.h=unit(0,"mm")) # justification of 
labels
   ) 
}
## splom
splom(x, varname.cex=1.2,
  superpanel=function(z, ...){
  panel.pairs(z, upper.panel=panel.splom, lower.panel=function(i,j){
  table.fun(arr[i,j,])
  }, ...)
  })
}

## create data and array of expressions
d <- 4
x <- matrix(runif(d*1000), ncol=d) # data to be plotted with splom
nr <- 3 # number of rows for the panel entries
nc <- 3 # number of cols for the panel entries
arr <- array(list(rep(NA,nr*nc)), dim=c(d,d,nr*nc), dimnames=c("i","j","val")) 
# array containing the table entries per panel
f <- function(i,j) (i+j)*10 # dummy function
eq <- "phantom()==phantom()"
for(i in 1:d){
for(j in 1:d){
numbers <- align.digits(c(round(pi,4), round(pi, 6), f(i,j)))
arr[i,j,] <- c("alpha", eq, numbers[1],
   "italic(bbb)", eq, numbers[2],
   "gamma", eq, numbers[3])
}
}

## plot
splom2(x, arr, nr=3)


On 2011-04-20, at 22:38 , baptiste auguie wrote:

> Try this,
> 
> align.digits = function(l)
> {
> 
> sp <- strsplit(as.character(l), "\\.")
> chars <- sapply(sp, function(x) nchar(x)[1])
> n = max(chars) - chars
> l0 = sapply(n, function(x) paste(rep("0", x), collapse=""))
> labels = sapply(seq_along(sp), function(i) {
>  point <- if(is.na(sp[[i]][2])) NULL else quote(.)
>  as.expression(bquote(phantom(.(l0[i])) * .(sp[[i]][1])*
> .(point)*.(sp[[i]][2]) ))})
> 
> return(labels)
> }
> 
> 
> library(gridExtra)
> 
> d <- align.digits(l = c(125.3, 1.2344, 12))
> grid.newpage()
> grid.table(d, parse=T, core.just="left", gpar.coretext=gpar(cex=0.5))
> 
> HTH,
> 
> baptiste
> 
> On 21 April 2011 03:07, Marius Hofert  wrote:
>> Dear Baptiste,
>> 
>> very nice, indeed!
>> 
>> Two minor issues that remain, are:
>> (1) I tried to omit the decimal dot for those numbers that do not have digits
>>after the decimal dot. But somehow it does not work...
>> (2) Do you know how one can decrease the text size for the text appearing in 
>> the
>>lower panel? I tried to work with "cex=0.5"... but it was ignored all the 
>> time.
>> 
>> Cheers,
>> 
>> Marius
>> 
>> 
>> library(lattice)
>> library(grid)
>> library(gridExtra)
>> 
>> ## function for correct digit alignment
>> align.digits <- function(l){
>>sp <- strsplit(as.character(l), "\\.")
>>chars <- sapply(sp, function(x) nchar(x)[1])
>>n <- max(chars)-chars
>>l0 <- sapply(n, function(x) paste(rep("0", x), collapse=""))
>>sapply(seq_along(sp), function(i){
>>if(length(sp[[1]])==1){
>>as.expression(bquote(phantom(.(l0[i])) * .(sp[[i]][1])))
>>}else{
>>as.expression(bquote(phantom(.(l0[i])) * 
>> .(sp[[i]][1])*.*.(sp[[i]][2])))
>>}
>>})
>> }
>> 
>> ## splom with customized lower.panel
>> ## x: data
>> ## arr: array of containing expressions which are plotted in a grid table in 
>> the
>> ##  lower panel (i,j)]
>> splom2 <- function(x, arr, nr){
>>## function for creating table
>>table.fun <- function(vec){ # vector containing lines for table for *one* 
>> panel
>>grid.table(matrix(vec, nrow=nr, byrow=TRUE),
>>   parse=TRUE, # parse labels as expressions
>>   theme=theme.list(
>>   gpar.corefill=gpar(fill=NA, col=NA), # make bg transparent
>>   core.just="left", padding.h=unit(0,"mm")) # justification 
>> of labels
>>   )
>>}
>>## splom
>>splom(x, varname.cex=1.2,
>>  superpanel=function(z, ...){
>>  panel.pairs(z, upper.panel=panel.splom, 
>> lower.panel=function(i,j){
>>  table.fun(arr[i,j,])
>>   

[R] Rcmdr vs SPSS

2011-04-20 Thread Tamas Barjak
Hy all!

Excuse me for the inaccurate composition, but I do not speak well in
English.

I noticed a mistake in Rcmdr (?) -- Models menu --- Add observation
statistics to data --- Studentized residuals.
My output :

(Rcmdr !!!)

rstudent.RegModel.1 (= *Studentized residuals*)

-1.5690952
-0.0697492
-0.6830684
1.0758056
0.2719739
0.3626101
0.8361803
1.0180479
0.8936783
-0.4630021
-3.2972946

AND!!!

SPSS (= *Studentized DELETED residuals*)

-1,56910
-0,06975
-0,68307
1,07581
0,27197
0,36261
0,83618
1,01805
0,89368
-0,46300
-3,29729

This wrong???

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can R replicate this data manipulation in SAS?

2011-04-20 Thread Ista Zahn
I think this is kind of like asking "will your Land Rover make it up
my driveway?", but I'll assume the question was asked in all
seriousness.

Here is one solution:

##  Read in test data;
dat <- read.table(textConnection("id   drug  start    stop
1004 NRTI  07/24/95 01/05/99
1004 NRTI  11/20/95 12/10/95
1004 NRTI  01/10/96 01/05/99
1004 PI    05/09/96 11/16/97
1004 NRTI  06/01/96 02/01/97
1004 NRTI  07/01/96 03/01/97
NA   PI    01/02/03 NA
NA   NNRTI 04/05/06 07/08/09"), header=TRUE)
closeAllConnections()

dat$start <- as.Date(dat$start, format = "%m/%d/%y")
dat$stop <- as.Date(dat$stop, format = "%m/%d/%y")

##  Reshape data into series with 1 date rather than separate starts and
## stops;

library(reshape)
library(plyr)   # ddply(), used below, comes from plyr

m.dat <- melt(dat, id = c("id", "drug"))
m.dat <- m.dat[order(m.dat$id, m.dat$value),]
m.dat$variable <- ifelse(m.dat$variable == "start", 1, -1)
names(m.dat) <-  c("id", "drug", "value", "date")
m.dat

##  Get regimen information plus start and stop dates;

n.dat <- cast(m.dat, id + date ~ drug, fun.aggregate=sum, margins="grand_col")
for (i in names(n.dat)[-c(1:2)]) {
 n.dat[i] <- cumsum(n.dat[i])
   }
n.dat <- ddply(n.dat, .(id), transform,
  regimen = 1:length(id))
n.dat

ssd.dat <- ddply(n.dat, .(id), summarize,
id = id[-1],
regimen = regimen[-length(regimen)],
 start_date = date[-length(date)],
stop_date = date[-1])
ssd.dat

##  Merge data to create regimens dataset;
all.dat <- merge(n.dat[-2], ssd.dat)
all.dat <- all.dat[order(all.dat$id, all.dat$regimen), c("id",
"start_date", "stop_date", "regimen", "NRTI", "NNRTI", "PI",
"X.all.")]
all.dat


Best,
Ista



On Wed, Apr 20, 2011 at 2:59 PM, Ted Harding  wrote:
> [*** PLEASE NOTE: I am sending this message on behalf of
>  Paul Miller:
>  Paul Miller 
>  (to whom this message has also been copied). He has been
>  trying to send it, but it has never got through. Please
>  do  not reply to me, but either to the list and/or to Paul
>  at that address ***]
> ==
> Hello Everyone,
>
> I'm learning R and am trying to get a better sense of what it will and
> will not
> do. I'm hearing in some places that R may not be able to accomplish all
> of the
> data manipulation tasks that SAS can. In others, I'm hearing that R can do
> pretty much any data manipulation that SAS can but the way in which it
> does so
> is likely to be quite different.
>
> Below is some SAS syntax that that codes Highly Active Antiretroviral
> Therapy
> (HAART) regimens in HIV patients by retaining the values of variables.
> Interspersed between the bits of code are printouts of data sets that are
> created in the process of coding. I'm hoping this will come through
> clearly and
> that people will be able to see exactly what is being done. Basically,
> the code
> keeps track of how many drugs people are on and what types of drugs they
> are
> taking during specific periods of time and decides whether that
> constitutes
> HAART or not.
>
> To me, this is a pretty tricky data manipulation in SAS. Is there any way
> to
> get the equivalent result in R?
>
> Thanks,
>
> Paul
>
>
>  SAS syntax for coding HAART in HIV patients;
>  Read in test data;
>
> data haart;
> input id drug_class $ start_date :mmddyy. stop_date :mmddyy.;
> format start_date stop_date mmddyy8.;
> cards;
> 1004 NRTI  07/24/95 01/05/99
> 1004 NRTI  11/20/95 12/10/95
> 1004 NRTI  01/10/96 01/05/99
> 1004 PI    05/09/96 11/16/97
> 1004 NRTI  06/01/96 02/01/97
> 1004 NRTI  07/01/96 03/01/97
>  PI    01/02/03 .
>  NNRTI 04/05/06 07/08/09
> ;
> run;
>
> proc print data=haart;
> run;
>
>               drug_      start_       stop_
> Obs     id     class        date        date
> 1     1004    NRTI     07/24/95    01/05/99
> 2     1004    NRTI     11/20/95 12/10/95
> 3     1004    NRTI     01/10/96    01/05/99
> 4     1004    PI       05/09/96    11/16/97
> 5     1004    NRTI     06/01/96    02/01/97
> 6     1004    NRTI     07/01/96    03/01/97
> 7         PI       01/02/03           .
> 8         NNRTI    04/05/06    07/08/09
>
>  Reshape data into series with 1 date rather than separate starts and
> stops;
>
> data changes (drop=start_date stop_date where=(not missing(date)));
> set haart;
> date = start_date;
> change =  1;
> output;
> date =  stop_date;
> change = -1;
> output;
> format date mmddyy10.;
> run;
>
> proc sort data=changes;
> by id date;
> run;
>
> proc print data=changes;
> run;
>
>               drug_
> Obs     id     class          date    change
>  1    1004    NRTI     07/24/1995       1
>  2    1004    NRTI     11/20/1995       1
>  3    1004    NRTI     12/10/1995      -1
>  4    1004    NRTI     01/10/1996       1
>  5    1004    PI       05/09/1996       1
>  6    1004    NRTI     06/01/1996       1
>  7    1004    NRTI     07/01/1996       1
>  8    10

Re: [R] Pattern match

2011-04-20 Thread Dennis Murphy
Hi:

This is a bit of a roundabout approach; I'm sure that folks with regex
expertise will trump this in a heartbeat. I modified the last piece of
the string a bit to accommodate the approach below. Depending on where
the strings have line breaks, you may have some odd '\n' characters
inserted.

# Step 1: read the input as a single character string
u <- "SpeciesCommon=(Human);SpeciesScientific=(Homo
sapiens);ReactiveCentres=(N,C,C,C,+H,O,C,C,C,C,O,H);BondInvolved=(C-H);EzCatDBID=(S00343);BondFormed=(O-H,O-H);Bond=(255B);Cofactors=(Cu(II),CU,501,A,Cu(II),CU,502,A);CatalyticSwissProt=(P25006);SpeciesScientific=(Achromobacter
cycloclastes);SpeciesCommon=(Bacteria);Reactive=(Ce+)"

# Step 2: Split input lines by the ';' delimiter and then use lapply()
to split variable names from values.
# This results in a nested list for ulist2.
ulist <- strsplit(u, ';')
ulist2 <- lapply(ulist, function(s) strsplit(s, '='))

# Step 3: Break out the results into a matrix whose first column is
the variable name
# and whose second column is the value (with parens included)
# This avoids dealing with nested lists
v <- matrix(unlist(ulist2), ncol = 2, byrow = TRUE)

# Step 4: Strip off the parens
w <- apply(v, 2, function(s) gsub('([\\(\\)])', '', s))
colnames(w) <- c('Name', 'Value')
w
  Name Value
 [1,] "SpeciesCommon"  "Human"
 [2,] "SpeciesScientific"  "Homo sapiens"
 [3,] "ReactiveCentres""N,C,C,C,+H,O,C,C,C,C,O,H"
 [4,] "BondInvolved"   "C-H"
 [5,] "EzCatDBID"  "S00343"
 [6,] "BondFormed" "O-H,O-H"
 [7,] "Bond"   "255B"
 [8,] "Cofactors"  "CuII,CU,501,A,CuII,CU,502,A"
 [9,] "CatalyticSwissProt" "P25006"
[10,] "SpeciesScientific"  "Achromobacter\ncycloclastes"
[11,] "SpeciesCommon"  "Bacteria"
[12,] "Reactive"   "Ce+"

# Step 5: Subset out the values of the SpeciesScientific variables
subset(as.data.frame(w), Name == 'SpeciesScientific', select = 'Value')
 Value
2 Homo sapiens
10 Achromobacter\ncycloclastes


One possible 'advantage' of this approach is that if you have a number
of string records of this type, you can create nested lists for each
string and then manipulate the lists to get what you need. Hopefully
you can use some of these ideas for other purposes as well.
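
A more direct regex route (not part of the reply above), applied to the same
string u defined earlier:

m <- regmatches(u, gregexpr("SpeciesScientific=\\(([^)]+)\\)", u))[[1]]
sub("SpeciesScientific=\\(([^)]+)\\)", "\\1", m)
## the two SpeciesScientific values; any embedded "\n" comes from the
## line breaks inside the example string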

Dennis



On Wed, Apr 20, 2011 at 10:17 AM, Neeti  wrote:
> Hi ALL,
>
> I have very simple question regarding pattern matching. Could anyone tell me
> how to I can use R to retrieve string pattern from text file.  for example
> my file contain following information
>
> SpeciesCommon=(Human);SpeciesScientific=(Homo
> sapiens);ReactiveCentres=(N,C,C,C,+
> H,O,C,C,C,C,O,H);BondInvolved=(C-H);EzCatDBID=(S00343);BondFormed=(O-H,O-H);Bond+
> 255B);Cofactors=(Cu(II),CU,501,A,Cu(II),CU,502,A);CatalyticSwissProt=(P25006);Sp+
> eciesScientific=(Achromobacter
> cycloclastes);SpeciesCommon=(Bacteria);ReactiveCe+
>
> and I want to extract “SpeciesScientific = (?)” information from this file.
> Problem is in 3rd line where SpeciesScientific word is divided with +.
>
> Could anyone help me please?
> Thank you
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Pattern-match-tp3463625p3463625.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Include C++ DLL, error in ...: C symbol name not in load table

2011-04-20 Thread Duncan Murdoch

On 20/04/2011 4:38 PM, Dirk Eddelbuettel wrote:


On 20 April 2011 at 16:24, Duncan Murdoch wrote:
| On 20/04/2011 4:06 PM, Sascha Vieweg wrote:
|>  Hello R experts
|>
|>  I am googling and reading around, however, I can't get it working
|>  (perhaps because I do not understand much C, however, I'll give it
|>  a try). I am trying to include C++ code into an R routine, where
|>  the C++ code looks:
|>
|>  #include
|>  using namespace std;
|>  void foo (double* x, double* y, double* out)
|>  {
|>   out[0] = x[0] + y[0];
|>  }
|>
|>  Back in R, the command
|>
|>  R CMD SHLIB --preclean -o xplusy
|>
|>  works fine resulting in two new files, xplusy.o and xplusy.so. The
|>  wrapper in R is:
|>
|>  dyn.load("xplusy.so")
|>  xplusy<- function(x, y){
|>  .C("foo", as.double(x), as.double(y), out=double(1))$out
|>  }
|>  xplusy(1, 2)
|>  dyn.unload("xplusy.so")
|>
|>  Now, dyn.load() works and xplusy also shows up in getLoadedDLLs().
|>  However, when invoking the function, xplusy(1, 2), R complains:
|>
|>  Error in .C("foo", as.double(x), as.double(y), out = double(1)): C
|>  symbol name "foo" not in load table
|>
|>  I found some hints concerning Fortran code producing this error
|>  message, but no help concerning C code.
|
| You have C++ code, not C code.  C++ normally mangles names of exports.
|
| To get this to work, you should surround your declarations with
|
| extern "C" {
| }
|
| Another possibility is to use the Rcpp package; it writes the interface
| code for you.

I believe Duncan refers to the 'inline' package, rather than 'Rcpp' (which
itself uses 'inline').


Oops, Dirk is right.

Duncan Murdoch
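
For reference, a minimal sketch of the 'inline' route mentioned above (my own
illustration, not code from this thread; details of the cfunction() call may
need adjusting):

library(inline)

## inline compiles the snippet and generates the wrapper, so no manual
## extern "C" declaration or dyn.load() call is needed.
xplusy <- cfunction(signature(x = "numeric", y = "numeric", out = "numeric"),
                    body = "out[0] = x[0] + y[0];",
                    language = "C", convention = ".C")

xplusy(1, 2, 0)$out   # should give 3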

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] get cells by the combination of their column and row names

2011-04-20 Thread David Winsemius


On Apr 20, 2011, at 3:52 PM, yoav baranan wrote:



Hi,
I have a (correlation) matrix and I want to select a subset of its  
cells depending on the combination of their column and row names.
This illustrates my problem:

mtrx <- matrix(c(1,2,3,4,5,6,7,8,9), nrow=3, ncol=3,
               dimnames = list(c('c132','c432', 'c233'), c('r132','r233', 'r432')))
> mtrx
     r132 r233 r432
c132    1    4    7
c432    2    5    8
c233    3    6    9
At this point I want to compute the mean of the cells in which the  
column and the row names share the same suffix (i.e., the cells  
mtrx[1,1], mtrx[2,3], mtrx[3,2]). By suffix I mean the last three  
digits of the row/col name.
I need a generic code (and not vec<-c(mtrx[1,1], mtrx[2,3],  
mtrx[3,2])) because I don't know in advance how many rows and  
columns I will have (currently 118).


> mtrx[ sub("r|c", "", rep(rownames(mtrx),times=3) ) ==
sub("r|c", "", rep(colnames(mtrx), each=3))]
[1] 1 6 8

> mean( mtrx[ sub("r|c", "", rep(rownames(mtrx),times=3) ) ==
  sub("r|c", "", rep(colnames(mtrx), each=3))] )
[1] 5

Obviously you need to substitute NROW() and NCOL() values for the rep
arguments, but that should be simple.
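
A sketch of that generic idea (not in the original reply), letting outer()
handle the dimensions:

suf_row <- sub("^[rc]", "", rownames(mtrx))          # strip the leading letter
suf_col <- sub("^[rc]", "", colnames(mtrx))
mean(mtrx[outer(suf_row, suf_col, "==")])            # 5 for the example matrix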




Thanks, Yoav
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grid.table + splom: how to nicely align panel entries

2011-04-20 Thread baptiste auguie
Try this,

align.digits = function(l)
{

sp <- strsplit(as.character(l), "\\.")
chars <- sapply(sp, function(x) nchar(x)[1])
n = max(chars) - chars
l0 = sapply(n, function(x) paste(rep("0", x), collapse=""))
labels = sapply(seq_along(sp), function(i) {
  point <- if(is.na(sp[[i]][2])) NULL else quote(.)
  as.expression(bquote(phantom(.(l0[i])) * .(sp[[i]][1])*
.(point)*.(sp[[i]][2]) ))})

return(labels)
}


library(gridExtra)

d <- align.digits(l = c(125.3, 1.2344, 12))
grid.newpage()
grid.table(d, parse=T, core.just="left", gpar.coretext=gpar(cex=0.5))

HTH,

baptiste

On 21 April 2011 03:07, Marius Hofert  wrote:
> Dear Baptiste,
>
> very nice, indeed!
>
> Two minor issues that remain, are:
> (1) I tried to omit the decimal dot for those numbers that do not have digits
>    after the decimal dot. But somehow it does not work...
> (2) Do you know how one can decrease the text size for the text appearing in 
> the
>    lower panel? I tried to work with "cex=0.5"... but it was ignored all the 
> time.
>
> Cheers,
>
> Marius
>
>
> library(lattice)
> library(grid)
> library(gridExtra)
>
> ## function for correct digit alignment
> align.digits <- function(l){
>    sp <- strsplit(as.character(l), "\\.")
>    chars <- sapply(sp, function(x) nchar(x)[1])
>    n <- max(chars)-chars
>    l0 <- sapply(n, function(x) paste(rep("0", x), collapse=""))
>    sapply(seq_along(sp), function(i){
>        if(length(sp[[1]])==1){
>            as.expression(bquote(phantom(.(l0[i])) * .(sp[[i]][1])))
>        }else{
>            as.expression(bquote(phantom(.(l0[i])) * 
> .(sp[[i]][1])*.*.(sp[[i]][2])))
>        }
>    })
> }
>
> ## splom with customized lower.panel
> ## x: data
> ## arr: array of containing expressions which are plotted in a grid table in 
> the
> ##      lower panel (i,j)]
> splom2 <- function(x, arr, nr){
>    ## function for creating table
>    table.fun <- function(vec){ # vector containing lines for table for *one* 
> panel
>        grid.table(matrix(vec, nrow=nr, byrow=TRUE),
>                   parse=TRUE, # parse labels as expressions
>                   theme=theme.list(
>                   gpar.corefill=gpar(fill=NA, col=NA), # make bg transparent
>                   core.just="left", padding.h=unit(0,"mm")) # justification 
> of labels
>                   )
>    }
>    ## splom
>    splom(x, varname.cex=1.2,
>          superpanel=function(z, ...){
>              panel.pairs(z, upper.panel=panel.splom, 
> lower.panel=function(i,j){
>                  table.fun(arr[i,j,])
>              }, ...)
>          })
> }
>
> ## create data and array of expressions
> d <- 4
> x <- matrix(runif(d*1000), ncol=d) # data to be plotted with splom
> nr <- 3 # number of rows for the panel entries
> nc <- 3 # number of cols for the panel entries
> arr <- array(list(rep(NA,nr*nc)), dim=c(d,d,nr*nc), 
> dimnames=c("i","j","val")) # array containing the table entries per panel
> f <- function(i,j) (i+j)*10 # dummy function
> eq <- "phantom()==phantom()"
> for(i in 1:d){
>    for(j in 1:d){
>        numbers <- align.digits(c(round(pi,4), round(pi, 6), f(i,j)))
>        arr[i,j,] <- c("alpha", eq, numbers[1],
>                       "italic(bbb)", eq, numbers[2],
>                       "gamma", eq, numbers[3])
>    }
> }
>
> ## plot
> splom2(x, arr, nr=3)
>
>
> On 2011-04-20, at 11:56 , baptiste auguie wrote:
>
>> On 20 April 2011 21:16, Marius Hofert  wrote:
>>> Dear expeRts,
>>>
>>> is there a way to get the entries in each panel correctly aligned according 
>>> to the
>>> equality signs?
>>>
>>> Here is the "wish-list":
>>> (1) the equality signs in each panel should be vertically aligned
>>
>> You can put the equal signs in their own column,
>>
>> library(gridExtra)
>> d = matrix(c("italic(a)", "phantom()==phantom()", round(pi,4),
>> "italic(b)", "phantom()==phantom()", round(pi,6)), ncol=3, byrow=T)
>> grid.table(d, parse=T,theme=theme.list(core.just="left"))
>>
>>> (2) the numbers should be aligned on the decimal point
>>
>> You could place some phantom()s to do this,
>>
>> align.digits = function(l)
>> {
>>
>> sp <- strsplit(as.character(l), "\\.")
>> chars <- sapply(sp, function(x) nchar(x)[1])
>> n = max(chars) - chars
>> l0 = sapply(n, function(x) paste(rep("0", x), collapse=""))
>> labels = sapply(seq_along(sp), function(i) {
>>  as.expression(bquote(phantom(.(l0[i])) * .(sp[[i]][1])*.*.(sp[[i]][2])))})
>>
>> return(labels)
>> }
>>
>> library(gridExtra)
>>
>> d <- align.digits(l = c(125.3, 1.2344))
>> grid.table(d, parse=T,core.just="left")
>>
>> HTH,
>>
>> baptiste
>>
>>> One could adjust the phantom()-arguments by hand to achieve (1), but is 
>>> there a
>>> simpler solution? For (2) I have no idea.
>>>
>>> Cheers,
>>>
>>> Marius
>>>
>>>
>>> library(lattice)
>>> library(grid)
>>> library(gridExtra)
>>>
>>> ## splom with customized lower.panel
>>> ## x: data
>>> ## arr: array of containing expressions which are plotted in a grid table 
>>> in the
>>> ##      lower panel (i,j)]
>>> splom2 <- function(x, 

Re: [R] Include C++ DLL, error in ...: C symbol name not in load table

2011-04-20 Thread Dirk Eddelbuettel

On 20 April 2011 at 16:24, Duncan Murdoch wrote:
| On 20/04/2011 4:06 PM, Sascha Vieweg wrote:
| > Hello R experts
| >
| > I am googling and reading around, however, I can't get it working
| > (perhaps because I do not understand much C, however, I'll give it
| > a try). I am trying to include C++ code into an R routine, where
| > the C++ code looks:
| >
| > #include
| > using namespace std;
| > void foo (double* x, double* y, double* out)
| > {
| > out[0] = x[0] + y[0];
| > }
| >
| > Back in R, the command
| >
| > R CMD SHLIB --preclean -o xplusy
| >
| > works fine resulting in two new files, xplusy.o and xplusy.so. The
| > wrapper in R is:
| >
| > dyn.load("xplusy.so")
| > xplusy<- function(x, y){
| > .C("foo", as.double(x), as.double(y), out=double(1))$out
| > }
| > xplusy(1, 2)
| > dyn.unload("xplusy.so")
| >
| > Now, dyn.load() works and xplusy also shows up in getLoadedDLLs().
| > However, when invoking the function, xplusy(1, 2), R complains:
| >
| > Error in .C("foo", as.double(x), as.double(y), out = double(1)): C
| > symbol name "foo" not in load table
| >
| > I found some hints concerning Fortran code producing this error
| > message, but no help concerning C code.
| 
| You have C++ code, not C code.  C++ normally mangles names of exports.
| 
| To get this to work, you should surround your declarations with
| 
| extern "C" {
| }
| 
| Another possibility is to use the Rcpp package; it writes the interface 
| code for you.

I believe Duncan refers to the 'inline' package, rather than 'Rcpp' (which
itself uses 'inline').

Dirk

-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] get cells by the combination of their column and row names

2011-04-20 Thread Phil Spector

Yoav -
   Here's one possibility:


wh = outer(rownames(mtrx),colnames(mtrx),

+ function(x,y)substr(x,nchar(x)-2,nchar(x)) == 
substr(y,nchar(y)-2,nchar(y)))

mtrx[wh]

[1] 1 6 8

If you knew that all of the row and column names were 4 characters long, it 
would simplify to


mtrx[outer(rownames(mtrx),colnames(mtrx),function(x,y)substr(x,2,4) == 
substr(y,2,4))]


Hope this helps.
- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu



On Wed, 20 Apr 2011, yoav baranan wrote:



Hi,
I have a (correlation) matrix and I want to select a subset of its cells 
depending on the combination of their column and row names.
This illustrates my problem:

mtrx <- matrix(c(1,2,3,4,5,6,7,8,9), nrow=3, ncol=3,
               dimnames = list(c('c132','c432', 'c233'), c('r132','r233', 'r432')))
> mtrx
     r132 r233 r432
c132    1    4    7
c432    2    5    8
c233    3    6    9
At this point I want to compute the mean of the cells in which the column and 
the row names share the same suffix (i.e., the cells mtrx[1,1], mtrx[2,3], 
mtrx[3,2]). By suffix I mean the last three digits of the row/col name.
I need a generic code (and not vec<-c(mtrx[1,1], mtrx[2,3], mtrx[3,2])) because 
I don't know in advance how many rows and columns I will have (currently 118).
Thanks, Yoav
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Include C++ DLL, error in ...: C symbol name not in load table

2011-04-20 Thread Duncan Murdoch

On 20/04/2011 4:06 PM, Sascha Vieweg wrote:

Hello R experts

I am googling and reading around, however, I can't get it working
(perhaps because I do not understand much C, however, I'll give it
a try). I am trying to include C++ code into an R routine, where
the C++ code looks:

#include
using namespace std;
void foo (double* x, double* y, double* out)
{
out[0] = x[0] + y[0];
}

Back in R, the command

R CMD SHLIB --preclean -o xplusy

works fine resulting in two new files, xplusy.o and xplusy.so. The
wrapper in R is:

dyn.load("xplusy.so")
xplusy<- function(x, y){
.C("foo", as.double(x), as.double(y), out=double(1))$out
}
xplusy(1, 2)
dyn.unload("xplusy.so")

Now, dyn.load() works and xplusy also shows up in getLoadedDLLs().
However, when invoking the function, xplusy(1, 2), R complains:

Error in .C("foo", as.double(x), as.double(y), out = double(1)): C
symbol name "foo" not in load table

I found some hints concerning Fortran code producing this error
message, but no help concerning C code.


You have C++ code, not C code.  C++ normally mangles names of exports.

To get this to work, you should surround your declarations with

extern "C" {
}

Another possibility is to use the Rcpp package; it writes the interface 
code for you.


Duncan Murdoch



Any help tips?

Thanks, *S*

$. uname -a
Darwin * 10.7.3 Darwin Kernel Version 10.7.3[...]

$: c++ --version
i686-apple-darwin10-g++-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666)
(dot 3)

platform   i386-apple-darwin9.8.0
arch   i386
os darwin9.8.0
system i386, darwin9.8.0
status
major  2
minor  12.2
year   2011
month  02
day25
svn rev54585
language   R
version.string R version 2.12.2 (2011-02-25)



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] get cells by the combination of their column and row names

2011-04-20 Thread yoav baranan

Hi, 
I have a (correlation) matrix and I want to select a subset of its cells 
depending on the combination of their column and row names.  
This illustrates my problem: mtrx <- matrix(c(1,2,3,4,5,6,7,8,9), nrow=3, 
ncol=3, dimnames = list(c('c132','c432', 'c233'),  c('r132','r233', 'r432')))> 
mtrx r132 r233 r432c132147c432258c233369
At this point I want to compute the mean of the cells in which the column and 
the row names share the same suffix (i.e., the cells mtrx[1,1], mtrx[2,3], 
mtrx[3,2]). By suffix I mean the last three digits of the row/col name.
I need a generic code (and not vec<-c(mtrx[1,1], mtrx[2,3], mtrx[3,2])) because 
I don't know in advance how many rows and columns I will have (currently 118). 
Thanks, Yoav  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Include C++ DLL, error in ...: C symbol name not in load table

2011-04-20 Thread Sascha Vieweg

Hello R experts

I am googling and reading around, however, I can't get it working 
(perhaps because I do not understand much C, however, I'll give it 
a try). I am trying to include C++ code into an R routine, where 
the C++ code looks:


#include 
using namespace std;
void foo (double* x, double* y, double* out)
{
out[0] = x[0] + y[0];
}

Back in R, the command

R CMD SHLIB --preclean -o xplusy

works fine resulting in two new files, xplusy.o and xplusy.so. The 
wrapper in R is:


dyn.load("xplusy.so")
xplusy <- function(x, y){
  .C("foo", as.double(x), as.double(y), out=double(1))$out
}
xplusy(1, 2)
dyn.unload("xplusy.so")

Now, dyn.load() works and xplusy also shows up in getLoadedDLLs(). 
However, when invoking the function, xplusy(1, 2), R complains:


Error in .C("foo", as.double(x), as.double(y), out = double(1)): C 
symbol name "foo" not in load table


I found some hints concerning Fortran code producing this error 
message, but no help concerning C code.


Any help tips?

Thanks, *S*

$. uname -a
Darwin * 10.7.3 Darwin Kernel Version 10.7.3[...]

$: c++ --version
i686-apple-darwin10-g++-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) 
(dot 3)


platform   i386-apple-darwin9.8.0
arch   i386
os darwin9.8.0
system i386, darwin9.8.0
status
major  2
minor  12.2
year   2011
month  02
day25
svn rev54585
language   R
version.string R version 2.12.2 (2011-02-25)

--
Sascha Vieweg, saschav...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error in dimnames(x) for Poisson EWMA model

2011-04-20 Thread David Winsemius


On Apr 20, 2011, at 3:05 PM, Ashley E. Jochim wrote:

I am attempting to run a Poisson EWMA model using Patrick Brandt's  
source code.  I get the following error when I run the code:


Error in dimnames(x) <- dn :
 length of 'dimnames' [1] not equal to array extent

Dimnames(x) looks like this:

[[1]]
NULL

[[2]]
[1] "mip"   "div"   "nom"   "unity" "mood"   
"times"

[7] "hmajparty" "smajparty" "pres"

It seems to me that the NULL value here is causing the problem but I  
can't figure out (1) why the null value is showing up and (2) how to  
get rid of it.  Background code is as follows:


cr.pois<- glm(countlaw ~ mip + div + nom + unity + mood + times +  
hmajparty + smajparty + pres, family = poisson)

summary(cr.pois)

init <- coef(cr.pois)

x<- cbind(mip,div,nom,unity,mood,times,hmajparty,smajparty,pres)


This returns a matrix, i.e. an object with two dimensions, and dimnames
returns a list with two character vectors (one of which is NULL).


> a = 1
> b=2
> cc=3
> cbind(a,b,cc)
 a b cc
[1,] 1 2  3
> dim(cbind(a,b,cc))
[1] 1 3
> dimnames(cbind(a,b,cc))
[[1]]
NULL

[[2]]
[1] "a"  "b"  "cc"

If you wanted to assert that the other dimname should be something  
then you need to insert a value into just the first element:


dimnames(x)[[1]] <- "something"

E.g.:

> dimnames(x)[[1]] <- "something"
> x
  a b cc
something 1 2  3



cr.PEWMA<-  
Pewma 
(y=countlaw,x=as.matrix(x),init.param=c(0.8,init[2:length(init)]))


Not sure why the error occurs. This appears to be using a package  
which you have chosen not to name.


--
David.




Which results in this output:

N = 10, M = 5 machine precision = 2.22045e-16

iterations 34
function evaluations 45
segments explored during Cauchy searches 35
BFGS updates skipped 0
active bounds at final generalized Cauchy point 1
norm of the final projected gradient 0.00806577
final function value 1.78439

final  value 1.784390
converged
Error in dimnames(x) <- dn :
 length of 'dimnames' [1] not equal to array extent

Any thoughts on what's going on here and how to fix it?

Many thanks in advance.

Ashley
--
Ashley E. Jochim
Graduate Fellow
Center for American Politics & Public Policy
Department of Political Science
University of Washington
a...@u.washington.edu
http://students.washington.edu/aew9

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error in dimnames(x) for Poisson EWMA model

2011-04-20 Thread Ashley E. Jochim
I am attempting to run a Poisson EWMA model using Patrick Brandt's source code. 
 I get the following error when I run the code:

Error in dimnames(x) <- dn : 
  length of 'dimnames' [1] not equal to array extent

Dimnames(x) looks like this:

[[1]]
NULL

[[2]]
[1] "mip"   "div"   "nom"   "unity" "mood"  "times"
[7] "hmajparty" "smajparty" "pres"  

It seems to me that the NULL value here is causing the problem but I can't 
figure out (1) why the null value is showing up and (2) how to get rid of it.  
Background code is as follows:

cr.pois<- glm(countlaw ~ mip + div + nom + unity + mood + times + hmajparty + 
smajparty + pres, family = poisson)
summary(cr.pois)

init <- coef(cr.pois)

x<- cbind(mip,div,nom,unity,mood,times,hmajparty,smajparty,pres)

cr.PEWMA<- 
Pewma(y=countlaw,x=as.matrix(x),init.param=c(0.8,init[2:length(init)]))

Which results in this output:

N = 10, M = 5 machine precision = 2.22045e-16

iterations 34
function evaluations 45
segments explored during Cauchy searches 35
BFGS updates skipped 0
active bounds at final generalized Cauchy point 1
norm of the final projected gradient 0.00806577
final function value 1.78439

final  value 1.784390 
converged
Error in dimnames(x) <- dn : 
  length of 'dimnames' [1] not equal to array extent

Any thoughts on what's going on here and how to fix it?

Many thanks in advance.

Ashley
--
Ashley E. Jochim
Graduate Fellow
Center for American Politics & Public Policy
Department of Political Science
University of Washington
a...@u.washington.edu
http://students.washington.edu/aew9

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] output usage info to stdout

2011-04-20 Thread Hui Du
Hi all,


I want to run an R script in batch mode on a UNIX system.
Inside that script, I have a usage() function that prints hints about the
parameters. For example:
usage = function()
{
msg = "R CMD BATCH --save \"--args path=\\\"input_path\\\" 
file=\\\"input_file\\\" [seed=\\\"integer\\\" proportion=\\\"[0-1]\\\" 
outpath=\\\"output_path\\\" outfile=\\\"output_file\\\"]\" 
/S/hxd/utils/sampleRecords.r.txt";
   print(msg);
}

When I run the code like

R CMD BATCH mycode.R, I want to print the usage info if the required 
parameters are missing. But how can I send the usage info listed above to 
standard output rather than to a file?


Thanks.

HXD


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] avoiding if-then statements for looped chi-square tests

2011-04-20 Thread Louis Plough
Hi,
I am trying to test for pairwise associations between genotypes (
Rows=individuals, Columns =genes, data are up to 4 genotypes per gene, some
with 2,3 or 4) where each chisquare comparison is different depending on the
genes tested.  The test is the observed multilocus (across columns for each
individual) genotypes vs the expectation, which is the product of the
individual frequency for each genotype times the total number of
individuals. Simple test.

I have set up a loop that pairs each gene together in a 2 column array, but
there are a number of different RxC tests based on how many genotypes (e.g.
2 genotypes vs 4 gives 8 cells, 2 by 2 gives you a 4 cell test etc.).
 Instead of writing a series of if then statements for the chi-square test
on the obs-exp numbers for each possible gene pair (I've started to do
this: 2x2, 2x3, 3x2, 2x4, 3x4, etc.), is there a generic way to pass the
number of unique elements for each gene (which would define the number of
genotypes for that gene) into a calculation of the obs-exp and then the
chisquare test?

Here is some example code to illustrate what I am trying to work with.

###example dataset

lets<-c("ac","ad","bc","bd")
epis<-cbind("m1"=sample(lets[1:3],25,
replace=TRUE),"m2"=sample(lets,25,replace=TRUE),"m3"=sample(lets[1:2],25,
replace=TRUE))

###the loop that binds all possible combinations of columns (genes)
pairwise###
for (i in 1:2){
   for (j in (i+1):3){
   epi2=cbind(epis[,i], epis[,j])
  print(epi2)
}
}
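
One generic route, sketched here rather than taken from the thread: table()
builds the contingency table whatever the number of genotypes per gene, and
chisq.test() accepts any RxC table (its expected counts are exactly the
products of the marginal frequencies times the total), so no per-case
if/then branches are needed.

results <- list()
for (i in 1:2) {
   for (j in (i+1):3) {
      tab <- table(epis[,i], epis[,j])   # observed counts; dimensions follow the data
      nm  <- paste(colnames(epis)[i], colnames(epis)[j], sep="-")
      ## chisq.test() warns when expected counts are small;
      ## simulate.p.value=TRUE is one option in that case
      results[[nm]] <- chisq.test(tab)
   }
}
results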


Thanks for any help and thanks to the previous responses on my post with the
error in my loop (Peter Dahlgard).

Louis

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Relabelling

2011-04-20 Thread Dennis Murphy
Hi:

How about

y <- rnorm(4000)
ymat <- rowMeans(replicate(1000, y[sample(4000)]))
hist(ymat)

system.time({y <- rnorm(4000); yy <- rowMeans(replicate(1000,
y[sample(4000)]))})
   user  system elapsed
   0.190.030.22

HTH,
Dennis

On Wed, Apr 20, 2011 at 7:04 AM, kmatthews  wrote:
> I have 4000 observations that I need to randomly relabel 1000 times and then
> calculate the mean of the 1000 values at each of the 4000 points.  Any ideas
> for where to begin?
>
> Thanks
> Kevin
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Random-Relabelling-tp3463100p3463100.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Relabelling

2011-04-20 Thread Kevin Matthews
I have a map of Iowa of with 4000 locations.  At each location, I have a
cancer mortality rate.  I need to test my null hypothesis; that the spatial
distribution of the mortality rates is  random.  For this test, I need to
establish a spatial reference distribution.

My reference distribution will be created by some random relabelling
algorithm.  The 4000 locations would remain fixed, but the observed
mortality rates would be randomly redistributed.  Then, I want 1000
permutations of the same algorithm.  For each of those 1000 times, I would
record the redistributed mortality rate at each location.  Then,  I would
calculate the mean of the 1000 points.  The result would be a spatial
reference distribution with a mean value of the random permutations at each
of the 4000 locations.

Thanks for the response,
Kevin

On Wed, Apr 20, 2011 at 11:08 AM, John Kane  wrote:

> Can you explain this a bit more. At the moment I don't see what you are
> trying to achieve.   "calculate the mean of the 1000 values at each of the
> 4000 points" does not seem to make sense.
>
> --- On Wed, 4/20/11, kmatthews  wrote:
>
> > From: kmatthews 
> > Subject: [R] Random Relabelling
> > To: r-help@r-project.org
> > Received: Wednesday, April 20, 2011, 10:04 AM
> > I have 4000 observations that I need
> > to randomly relabel 1000 times and then
> > calculate the mean of the 1000 values at each of the 4000
> > points.  Any ideas
> > for where to begin?
> >
> > Thanks
> > Kevin
> >
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Pattern match

2011-04-20 Thread Neeti
Hi ALL,

I have very simple question regarding pattern matching. Could anyone tell me
how to I can use R to retrieve string pattern from text file.  for example
my file contain following information

SpeciesCommon=(Human);SpeciesScientific=(Homo
sapiens);ReactiveCentres=(N,C,C,C,+
H,O,C,C,C,C,O,H);BondInvolved=(C-H);EzCatDBID=(S00343);BondFormed=(O-H,O-H);Bond+
255B);Cofactors=(Cu(II),CU,501,A,Cu(II),CU,502,A);CatalyticSwissProt=(P25006);Sp+
eciesScientific=(Achromobacter
cycloclastes);SpeciesCommon=(Bacteria);ReactiveCe+

and I want to extract “SpeciesScientific = (?)” information from this file.
Problem is in 3rd line where SpeciesScientific word is divided with +.  

Could anyone help me please?
Thank you


--
View this message in context: 
http://r.789695.n4.nabble.com/Pattern-match-tp3463625p3463625.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Can R replicate this data manipulation in SAS?

2011-04-20 Thread Ted Harding
[*** PLEASE NOTE: I am sending this message on behalf of
 Paul Miller:
 Paul Miller 
 (to whom this message has also been copied). He has been
 trying to send it, but it has never got through. Please
 do  not reply to me, but either to the list and/or to Paul
 at that address ***]
==
Hello Everyone,

I'm learning R and am trying to get a better sense of what it will and
will not
do. I'm hearing in some places that R may not be able to accomplish all
of the
data manipulation tasks that SAS can. In others, I'm hearing that R can do
pretty much any data manipulation that SAS can but the way in which it
does so
is likely to be quite different.

Below is some SAS syntax that that codes Highly Active Antiretroviral
Therapy
(HAART) regimens in HIV patients by retaining the values of variables.
Interspersed between the bits of code are printouts of data sets that are
created in the process of coding. I'm hoping this will come through
clearly and
that people will be able to see exactly what is being done. Basically,
the code
keeps track of how many drugs people are on and what types of drugs they
are
taking during specific periods of time and decides whether that
constitutes
HAART or not.

To me, this is a pretty tricky data manipulation in SAS. Is there any way
to
get the equivalent result in R?

Thanks,

Paul


 SAS syntax for coding HAART in HIV patients;
 Read in test data;

data haart;
input id drug_class $ start_date :mmddyy. stop_date :mmddyy.;
format start_date stop_date mmddyy8.;
cards;
1004 NRTI  07/24/95 01/05/99
1004 NRTI  11/20/95 12/10/95
1004 NRTI  01/10/96 01/05/99
1004 PI    05/09/96 11/16/97
1004 NRTI  06/01/96 02/01/97
1004 NRTI  07/01/96 03/01/97
     PI    01/02/03 .
 NNRTI 04/05/06 07/08/09
;
run;

proc print data=haart;
run;

                drug_     start_      stop_
Obs      id     class       date       date
  1    1004     NRTI    07/24/95   01/05/99
  2    1004     NRTI    11/20/95   12/10/95
  3    1004     NRTI    01/10/96   01/05/99
  4    1004     PI      05/09/96   11/16/97
  5    1004     NRTI    06/01/96   02/01/97
  6    1004     NRTI    07/01/96   03/01/97
  7             PI      01/02/03          .
  8             NNRTI   04/05/06   07/08/09

 Reshape data into series with 1 date rather than separate starts and
stops;

data changes (drop=start_date stop_date where=(not missing(date)));
set haart;
date = start_date;
change =  1;
output;
date =  stop_date;
change = -1;
output;
format date mmddyy10.;
run;

proc sort data=changes;
by id date;
run;

proc print data=changes;
run;

                drug_
Obs      id     class          date    change
  1    1004     NRTI     07/24/1995         1
  2    1004     NRTI     11/20/1995         1
  3    1004     NRTI     12/10/1995        -1
  4    1004     NRTI     01/10/1996         1
  5    1004     PI       05/09/1996         1
  6    1004     NRTI     06/01/1996         1
  7    1004     NRTI     07/01/1996         1
  8    1004     NRTI     02/01/1997        -1
  9    1004     NRTI     03/01/1997        -1
 10    1004     PI       11/16/1997        -1
 11    1004     NRTI     01/05/1999        -1
 12    1004     NRTI     01/05/1999        -1
 13             PI       01/02/2003         1
 14             NNRTI    04/05/2006         1
 15             NNRTI    07/08/2009        -1

 Get regimen information plus start and stop dates;

data cumulative(drop=drug_class change stop_date) 
 stop_dates(keep=id regimen stop_date);
set changes;
by id date;

if first.id then do;
  regimen = 0;
  NRTI = 0;
  NNRTI = 0;
  PI = 0;
end;

if drug_class = 'NNRTI' then NNRTI + change;
else if drug_class = 'NRTI' then NRTI + change;
else if drug_class = 'PI  ' then PI + change;

if last.date then do;
  stop_date = date - 1;
if regimen then output stop_dates;
   regimen + 1;
  alldrugs = NNRTI + NRTI + PI;
  HAART = (NRTI >= 3 AND NNRTI=0 AND PI=0) OR
(NRTI >= 2 AND (NNRTI >= 1 OR PI >= 1)) OR
(NRTI = 1 AND NNRTI >= 1 AND PI >= 1);
output cumulative;
end;

format stop_date mmddyy10.;
run;

proc print data=cumulative;
run;
Obs      id          date    regimen   NRTI   NNRTI   PI   alldrugs   HAART
  1    1004    07/24/1995          1      1       0    0          1       0
  2    1004    11/20/1995          2      2       0    0          2       0
  3    1004    12/10/1995          3      1       0    0          1       0
  4    1004    01/10/1996          4      2       0    0          2       0
  5    1004    05/09/1996          5      2       0    1          3       1
  6    1004    06/01/1996          6      3       0    1          4       1
  7    1004    07/01/1996          7      4       0    1          5       1
  8    1004    02/01/1997          8      3       0    1          4       1
  9    1004    03/01/1997          9      2       0    1          3       1
 10    1004    11/16/1997         10      2       0    0          2       0
 11    1004    01/05/1999         11      0

[R] Sqldf INSERT INTO

2011-04-20 Thread new2R
 Hi,

I am new to R and trying to migrate from SAS. I am trying to copy data from
one table to another table with the same columns using sqldf, but it is not
working and only returns "NULL".

I wrote the statement as sqldf("INSERT INTO new select * from data") but it
returns NULL.

Please help me in this regard.

Thank you

--
View this message in context: 
http://r.789695.n4.nabble.com/Sqldf-INSERT-INTO-tp3463533p3463533.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Polynomial question solved.

2011-04-20 Thread Erin Hodgess
Dear R People:

Here is the solution:

> polynomial(c(-2,1))*polynomial(c(-1,1))
2 - 3*x + x^2
>
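
(A quick base-R cross-check, not part of the original post; polyroot() also
takes coefficients in increasing order:)

polyroot(c(2, -3, 1))   # roots of 2 - 3*x + x^2, i.e. 1 and 2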
Sorry for the trouble.
Erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] polynomial question

2011-04-20 Thread Erin Hodgess
Dear R People:

Suppose I have (x-1)*(x-2) and would like to produce x^2 - 3x + 2.

I have tried
> polynomial(c(1,1)*c(2,1))
2 + x

from the polynom library, but as you can see, that doesn't do it.

Is there a function to do this please?  I'm sure that I've done this
before, but I can't remember the correct way.

Thanks,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Crouts algorithm

2011-04-20 Thread Adrienne Wootten
R-listers

Quick question for the group.  Is there any LU decomposition that
makes use of Crout's algorithm in R.  I've been looking for it and I
really haven't seen it among the R packages.
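
In case it helps, a small plain-R sketch of Crout's factorization (unit
diagonal on U, no pivoting), written for this note rather than taken from
any package:

## Crout: A = L %*% U with L lower-triangular and U unit upper-triangular
crout_lu <- function(A) {
  n <- nrow(A)
  L <- matrix(0, n, n)
  U <- diag(1, n)
  for (j in 1:n) {
    for (i in j:n) {                       # column j of L
      L[i, j] <- A[i, j] - sum(L[i, seq_len(j - 1)] * U[seq_len(j - 1), j])
    }
    if (j < n) {
      for (k in (j + 1):n) {               # row j of U
        U[j, k] <- (A[j, k] -
                    sum(L[j, seq_len(j - 1)] * U[seq_len(j - 1), k])) / L[j, j]
      }
    }
  }
  list(L = L, U = U)
}

A <- matrix(c(4, 3, 6, 3), 2, 2)
fit <- crout_lu(A)
all.equal(fit$L %*% fit$U, A)              # TRUE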

A

-- 
Adrienne Wootten
Graduate Research Assistant
State Climate Office of North Carolina
Department of Marine, Earth and Atmospheric Sciences
North Carolina State University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Relabelling

2011-04-20 Thread Jeremy Hetzel
Kevin,

The following follows John's suggestion, but without the loop.  It's quick 
for me.

Jeremy


Jeremy T. Hetzel
Boston University



## Generate sample data
n <- 4000
rep <- 1000
rate <- rnorm(n, mean = 15, sd = 2) / 10 # Mortality rates around 
15/100k

## Create an empty matrix with appropriate dimensions
permutations <- matrix(ncol = n, nrow = rep)

## Use apply() to resample
permutations <- apply(permutations, 1, function(x)
{
sample(rate, size = n, replace = F)
})

## Look at the matrix
dim(permutations)
head(permutations)

## Mean of the 1000 permuted rates at each of the 4000 locations (row means)
means <- apply(permutations, 1, mean)
means





On Wednesday, April 20, 2011 1:56:35 PM UTC-4, John Kane wrote:
>
> There is probably a better way to do this but a for loop like this should 
> work. You would just need to change the numbers to yours and then add on the 
> locations 
> = 
>
> scores  <- 1:5
> mydata <- matrix(data=NA, nrow=5, ncol=10)
>
> for(i in 1:10) {
> mydata[,i] <- sample(scores, 5, replace=FALSE)
> }
>
> =
> --- On Wed, 4/20/11, Kevin Matthews  wrote:
>
> From: Kevin Matthews 
> Subject: Re: [R] Random Relabelling
> To: "John Kane" 
> Cc: r-h...@r-project.org
> Received: Wednesday, April 20, 2011, 1:22 PM
>
> I have a map of Iowa of with 4000 locations.  At each location, I have a 
> cancer mortality rate.  I need to test my null hypothesis; that the spatial 
> distribution of the mortality rates is  random.  For this test, I need to 
> establish a spatial reference distribution.  
>
>
> My reference distribution will be created by some random relabelling 
> algorithm.  The 4000 locations would remain fixed, but the observed 
> mortality rates would be randomly redistributed.  Then, I want 1000 
> permutations of the same algorithm.  For each of those 1000 times, I would 
> record the redistributed mortality rate at each location.  Then,  I would 
> calculate the mean of the 1000 points.  The result would be a spatial 
> reference distribution with a mean value of the random permutations at each 
> of the 4000 locations.  
>
> Thanks for the response,Kevin
>
> On Wed, Apr 20, 2011 at 11:08 AM, John Kane  wrote:
>
>
> Can you explain this a bit more. At the moment I don't see what you are 
> trying to achieve.   "calculate the mean of the 1000 values at each of the 
> 4000 points" does not seem to make sense.
>
> --- On Wed, 4/20/11, kmatthews  wrote:
>
> > From: kmatthews 
>
> > Subject: [R] Random Relabelling
>
> > To: r-h...@r-project.org
>
> > Received: Wednesday, April 20, 2011, 10:04 AM
>
> > I have 4000 observations that I need
>
> > to randomly relabel 1000 times and then
>
> > calculate the mean of the 1000 values at each of the 4000
>
> > points.  Any ideas
>
> > for where to begin?
>
> >
>
> > Thanks
>
> > Kevin
>
> >
>
>
> [[alternative HTML version deleted]]
>
> __
> r-h...@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Matrix package transpose

2011-04-20 Thread Douglas Bates
On Wed, Apr 20, 2011 at 8:37 AM, Tobias Abenius
 wrote:

> Since I installed R 2.13 I cannot use the transpose method "t" on sparse
> matrices inside my package. Outside the package works. Is there something
> new that I have to import methods? Can I then import everything from the
> Matrix package? The problem is that R tries to use t.default which of course
> doesn't work.

As the tag line on each message says:

> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Please provide an example.  It will also help to include the output of

sessionInfo()

so we can determine exactly which version of R and the Matrix package
you are using on what platform.  Also try

> find("t")
[1] "package:Matrix" "package:base"

to see which version of "t" is the first on the search path.

It seems to still be  working for me, but "outside the package".

> library(Matrix)
Loading required package: lattice

Attaching package: 'Matrix'

The following object(s) are masked from 'package:base':

det

> sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C  LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] Matrix_0.9996875-0 lattice_0.19-17

loaded via a namespace (and not attached):
[1] grid_2.13.0
> example(spMatrix)
... output omitted
> A
10 x 20 sparse Matrix of class "dgTMatrix"

 [1,] . 7 . . .  .  .  .  .  . . . . . . . . . . .
 [2,] . . . . .  .  .  .  .  . . . . . . . . . . .
 [3,] . . . . .  .  .  . 14  . . . . . . . . . . .
 [4,] . . . . . 21  .  .  .  . . . . . . . . . . .
 [5,] . . . . .  . 28  .  .  . . . . . . . . . . .
 [6,] . . . . .  .  . 35  .  . . . . . . . . . . .
 [7,] . . . . .  .  .  . 42  . . . . . . . . . . .
 [8,] . . . . .  .  .  .  . 49 . . . . . . . . . .
 [9,] . . . . .  .  .  .  .  . . . . . . . . . . .
[10,] . . . . .  .  .  .  .  . . . . . . . . . . .
> t(A)
20 x 10 sparse Matrix of class "dgTMatrix"

 [1,] . .  .  .  .  .  .  . . .
 [2,] 7 .  .  .  .  .  .  . . .
 [3,] . .  .  .  .  .  .  . . .
 [4,] . .  .  .  .  .  .  . . .
 [5,] . .  .  .  .  .  .  . . .
 [6,] . .  . 21  .  .  .  . . .
 [7,] . .  .  . 28  .  .  . . .
 [8,] . .  .  .  . 35  .  . . .
 [9,] . . 14  .  .  . 42  . . .
[10,] . .  .  .  .  .  . 49 . .
[11,] . .  .  .  .  .  .  . . .
[12,] . .  .  .  .  .  .  . . .
[13,] . .  .  .  .  .  .  . . .
[14,] . .  .  .  .  .  .  . . .
[15,] . .  .  .  .  .  .  . . .
[16,] . .  .  .  .  .  .  . . .
[17,] . .  .  .  .  .  .  . . .
[18,] . .  .  .  .  .  .  . . .
[19,] . .  .  .  .  .  .  . . .
[20,] . .  .  .  .  .  .  . . .

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Relabelling

2011-04-20 Thread John Kane
There is probably a better way to do this but a for loop like this should work. 
You would just need to change the numbers to yours and then add on the 
locations 
= 

scores  <- 1:5
mydata <- matrix(data=NA, nrow=5, ncol=10)

for(i in 1:10) {
mydata[,i] <- sample(scores, 5, replace=FALSE)
}

=
--- On Wed, 4/20/11, Kevin Matthews  wrote:

From: Kevin Matthews 
Subject: Re: [R] Random Relabelling
To: "John Kane" 
Cc: r-help@r-project.org
Received: Wednesday, April 20, 2011, 1:22 PM

I have a map of Iowa of with 4000 locations.  At each location, I have a cancer 
mortality rate.  I need to test my null hypothesis; that the spatial 
distribution of the mortality rates is  random.  For this test, I need to 
establish a spatial reference distribution.  


My reference distribution will be created by some random relabelling algorithm. 
 The 4000 locations would remain fixed, but the observed mortality rates would 
be randomly redistributed.  Then, I want 1000 permutations of the same 
algorithm.  For each of those 1000 times, I would record the redistributed 
mortality rate at each location.  Then,  I would calculate the mean of the 1000 
points.  The result would be a spatial reference distribution with a mean value 
of the random permutations at each of the 4000 locations.  



Thanks for the response,Kevin

On Wed, Apr 20, 2011 at 11:08 AM, John Kane  wrote:


Can you explain this a bit more. At the moment I don't see what you are trying 
to achieve.   "calculate the mean of the 1000 values at each of the 4000 
points" does not seem to make sense.





--- On Wed, 4/20/11, kmatthews  wrote:



> From: kmatthews 

> Subject: [R] Random Relabelling

> To: r-help@r-project.org

> Received: Wednesday, April 20, 2011, 10:04 AM

> I have 4000 observations that I need

> to randomly relabel 1000 times and then

> calculate the mean of the 1000 values at each of the 4000

> points.  Any ideas

> for where to begin?

>

> Thanks

> Kevin

>






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 'Record' row values every time the binary value in a collumn changes

2011-04-20 Thread Phil Spector

Here's one way to do part 1:


rr = rle(Table[,'binary'])
cc = cumsum(rr$lengths)+1
thestarts =  c(1,cc[cc<=nrow(Table)])
theends = cc-1
answer = 
cbind(Table[thestarts,'Chromosome'],Table[thestarts,'start'],Table[theends,'start'],rr$values)
answer

     [,1] [,2] [,3] [,4]
[1,]    1   12   18    1
[2,]    1   20   36    0
[3,]    2   12   16    1
[4,]    2   17   19    0

If I understand you correctly, here's a way to do part 2:


Next = 
matrix(c(rep(1,12),rep(2,8),c(12,13,14,15,16,18,20,21,22,23,25,35,12,13,14,15,16,17,18,19)),ncol=2)
apply(Next,1,function(x)answer[answer[,1]==x[1] & x[2] >= answer[,2] & x[2] <= 
answer[,3],4])

 [1] 1 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 0 0 0

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu












On Wed, Apr 20, 2011 at 5:01 AM, baboon2010  wrote:

My question is twofold.

Part 1:
My data looks like this:

(example set, real data has 2*10^6 rows)
binary<-c(1,1,1,0,0,0,1,1,1,0,0)
Chromosome<-c(1,1,1,1,1,1,2,2,2,2,2)
start<-c(12,17,18,20,25,36,12,15,16,17,19)
Table<-cbind(Chromosome,start,binary)
     Chromosome start binary
 [1,]          1    12      1
 [2,]          1    17      1
 [3,]          1    18      1
 [4,]          1    20      0
 [5,]          1    25      0
 [6,]          1    36      0
 [7,]          2    12      1
 [8,]          2    15      1
 [9,]          2    16      1
[10,]          2    17      0
[11,]          2    19      0

As output I need a shortlist for each binary block: giving me the starting
and ending position of each block.
Which for these example would look like this:
    Chromosome2 position_start position_end binary2
[1,]           1             12           18       1
[2,]           1             20           36       0
[3,]           2             12           16       1
[4,]           2             17           19       0

Part 2:
Based on the output of part 1, I need to assign the binary to rows of
another data set. If the position value in this second data set falls in one
of the blocks defined in the shortlist made in part1,the binary value of the
shortlist should be assigned to an extra column for this row.  This would
look something like this:
    Chromosome3 position Value binary3
 [1,] "1"         "12"     "a"   "1"
 [2,] "1"         "13"     "b"   "1"
 [3,] "1"         "14"     "c"   "1"
 [4,] "1"         "15"     "d"   "1"
 [5,] "1"         "16"     "e"   "1"
 [6,] "1"         "18"     "f"   "1"
 [7,] "1"         "20"     "g"   "0"
 [8,] "1"         "21"     "h"   "0"
 [9,] "1"         "22"     "i"   "0"
[10,] "1"         "23"     "j"   "0"
[11,] "1"         "25"     "k"   "0"
[12,] "1"         "35"     "l"   "0"
[13,] "2"         "12"     "m"   "1"
[14,] "2"         "13"     "n"   "1"
[15,] "2"         "14"     "o"   "1"
[16,] "2"         "15"     "p"   "1"
[17,] "2"         "16"     "q"   "1"
[18,] "2"         "17"     "s"   "0"
[19,] "2"         "18"     "d"   "0"
[20,] "2"         "19"     "f"   "0"


Many thanks in advance,

Niels

--
View this message in context: 
http://r.789695.n4.nabble.com/Record-row-values-every-time-the-binary-value-in-a-collumn-changes-tp3462496p3462496.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Two Questions

2011-04-20 Thread Greg Snow
When running a large number of commands from a script that produces multiple
plots, it is often best to send the plots to the pdf device (or another file
device) that you can then page through after the script has finished.  You
could also specify par(ask=TRUE); then you would be prompted before each plot
is replaced (but no other code would execute while R waits at the prompt).
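
A minimal sketch of both approaches (the script file name below is only a
placeholder for your chemometrics script):

pdf("all_plots.pdf")              # route every plot to a single PDF file
source("chemometrics_script.R")   # run the script that draws several plots
dev.off()                         # close the device, then page through the PDF

par(ask = TRUE)                   # or: prompt before each new plot on screen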

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Stephen P Molnar
> Sent: Wednesday, April 20, 2011 7:23 AM
> To: R-help
> Subject: [R] Two Questions
> 
> Sorry for the somewhat nondescript subject line, but I have two
> questions:
> 
> 
> 
> 1.What is a really good book on R for a nonprogrammer?
> 
> 2.   How do I open more than one R Graphics: Device 2(ACTIVE).
> That
> what is the R command that I can use to keep more than one plot open.
> I am
> running a script from a book on Chemometrics that results in more than
> one
> graph during the execution, but it seems that R deletes each graph when
> the
> script calls for the next plot.
> 
> 
> 
> Thanks in advance
> 
> 
> 
> Stephen P. Molnar, Ph.D.  Life
> is a
> fuzzy set
> 
> Foundation for Chemistry
> Stochastic
> and multivariate
> 
> http://www.FoundationForChemistry.com
> 
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 'Record' row values every time the binary value in a collumn changes

2011-04-20 Thread Dennis Murphy
Hi:

Here are a couple more options using packages plyr and data.table. The
labels in the second part are changed because they didn't make sense
in a 2M line file (well, mine may not either, but it's a start). You
can always change them to something more pertinent.

# Question 1:
Table <- data.frame(binary, chromosome = Chromosome, start)

library(plyr)
(df <- ddply(Table, .(chromosome, binary), summarise, position_start =
min(start),
 position_end = max(start)))
  chromosome binary position_start position_end
1          1      0             20           36
2          1      1             12           18
3          2      0             17           19
4          2      1             12           16

library(data.table)
dTable <- data.table(Table, key = 'chromosome, binary')
(dt <- dTable[, list(position_start = min(start),
   position_end = max(start)), by = 'chromosome, binary'])
     chromosome binary position_start position_end
[1,]          1      0             20           36
[2,]          1      1             12           18
[3,]          2      0             17           19
[4,]          2      1             12           16

## Question 2:

For plyr, it's easy to write a function that takes a generic input data frame
(in this case, a single line) and then outputs a data frame with
positions and labels.

tfun <- function(df) {
 diff <- with(df, position_end - position_start + 1)
 position <- with(df, seq(position_start, position_end))
 value <- paste(df$chromosome, df$binary, letters[1:diff], sep = '.')
 data.frame(chromosome = df$chromosome, position, value, binary = df$binary)
}

# Then:

> ddply(df, .(chromosome, binary), tfun)
   chromosome position value binary
1   1   20 1.0.a  0
2   1   21 1.0.b  0
3   1   22 1.0.c  0
4   1   23 1.0.d  0
5   1   24 1.0.e  0
6   1   25 1.0.f  0
7   1   26 1.0.g  0
8   1   27 1.0.h  0
9   1   28 1.0.i  0
10  1   29 1.0.j  0
11  1   30 1.0.k  0
12  1   31 1.0.l  0
13  1   32 1.0.m  0
14  1   33 1.0.n  0
15  1   34 1.0.o  0
16  1   35 1.0.p  0
17  1   36 1.0.q  0
18  1   12 1.1.a  1
19  1   13 1.1.b  1
20  1   14 1.1.c  1
21  1   15 1.1.d  1
22  1   16 1.1.e  1
23  1   17 1.1.f  1
24  1   18 1.1.g  1
25  2   17 2.0.a  0
26  2   18 2.0.b  0
27  2   19 2.0.c  0
28  2   12 2.1.a  1
29  2   13 2.1.b  1
30  2   14 2.1.c  1
31  2   15 2.1.d  1
32  2   16 2.1.e  1

# For data.table, one can apply the internals of tfun directly:

dt[, list(chromosome = chromosome, position = seq(position_start, position_end),
value = paste(chromosome, binary,
  letters[1:(position_end - position_start + 1)],
sep = '.'),
binary = binary), by = 'chromosome, binary']
     chromosome binary chromosome.1 position value binary.1
              1      0            1       20 1.0.a        0
              1      0            1       21 1.0.b        0
              1      0            1       22 1.0.c        0
              1      0            1       23 1.0.d        0
              1      0            1       24 1.0.e        0
              1      0            1       25 1.0.f        0
              1      0            1       26 1.0.g        0
              1      0            1       27 1.0.h        0
              1      0            1       28 1.0.i        0
              1      0            1       29 1.0.j        0
              1      0            1       30 1.0.k        0
              1      0            1       31 1.0.l        0
              1      0            1       32 1.0.m        0
              1      0            1       33 1.0.n        0
              1      0            1       34 1.0.o        0
              1      0            1       35 1.0.p        0
              1      0            1       36 1.0.q        0
              1      1            1       12 1.1.a        1
              1      1            1       13 1.1.b        1
              1      1            1       14 1.1.c        1
              1      1            1       15 1.1.d        1
              1      1            1       16 1.1.e        1
              1      1            1       17 1.1.f        1
              1      1            1       18 1.1.g        1
              2      0            2       17 2.0.a        0
              2      0            2       18 2.0.b        0
              2      0            2       19 2.0.c        0
              2      1            2       12 2.1.a        1
              2      1            2       13 2.1.b        1
              2      1            2

Re: [R] How can I 'predict' from an nls model with a fit specified for separate groups?

2011-04-20 Thread peter dalgaard

On Apr 20, 2011, at 17:04 , Stuart Rosen wrote:

> Following an example on p 111 in 'Nonlinear Regression with R' by Ritz & 
> Streibig, I have been fitting nls models using square brackets with the 
> grouping variable inside. In their book is this example, in which 'state' is 
> a factor indicating whether a treatment has been used or not:
> 
> > Puromycin.m1 <- nls(rate ~ Vm[state] *
> + conc/(K[state] + conc), data = Puromycin,
> + start = list(K = c(0.1, 0.1),
> + Vm = c(200, 200)))
> 
> What I cannot figure out is how to specify the value of the grouping variable 
> in a 'predict' statement. In my own example, I can only seem to get the 
> predictions for the 1st specified level of the grouping variable. I promise 
> that I have read the documentation, and have tried a number of things, but 
> cannot get the correct predictions.

What's the problem? If I continue your example with

plot(Puromycin$conc, predict(Puromycin.m1), col=(1:2)[Puromycin$state])

I get a black and a red set of points, distinctly different.

It's difficult to say what you did wrong in the "number of things" that you 
tried, but I suspect you didn't set up your newdata properly:

> myconc <- seq(0,1.2,,11)
> lv <- levels(Puromycin$state)
> predict(Puromycin.m1, 
> newdata=data.frame(conc=myconc,state=factor("untreated",levels=lv)))
 [1]   0. 114.6850 133.7022 141.5248 145.7897 148.4743 150.3196 151.6661
 [9] 152.6918 153.4993 154.1514
> predict(Puromycin.m1, 
> newdata=data.frame(conc=myconc,state=factor("treated",levels=lv)))
 [1]   0. 138.6154 167.8413 180.5289 187.6203 192.1490 195.2916 197.6000
 [9] 199.3674 200.7641 201.8956


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 'Record' row values every time the binary value in acollumn changes

2011-04-20 Thread William Dunlap
> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of jim holtman
> Sent: Wednesday, April 20, 2011 9:59 AM
> To: baboon2010
> Cc: r-help@r-project.org
> Subject: Re: [R] 'Record' row values every time the binary 
> value in acollumn changes
> 
> Here is an answer to part 1:
> 
> > binary<-c(1,1,1,0,0,0,1,1,1,0,0)
> > Chromosome<-c(1,1,1,1,1,1,2,2,2,2,2)
> > start<-c(12,17,18,20,25,36,12,15,16,17,19)
> > Table<-cbind(Chromosome,start,binary)
> > # determine where the start/end of each group is
> > # use indices since the size is large
> > startEnd <- lapply(split(seq(nrow(Table))
> +  , list(Table[, "Chromosome"], Table[, 
> 'binary'])
> +  , drop = TRUE
> +  )
> +   , function(.indx){
> + se <- range(.indx)
> + c(Chromosome2 = unname(Table[se[1L], "Chromosome"])
> +   , position_start = unname(Table[se[1L], 'start'])
> +   , position_end = unname(Table[se[2L], 'start'])
> +   , binary2 = unname(Table[se[1L], 'binary'])
> +   )
> + })
> > do.call(rbind, startEnd)
> Chromosome2 position_start position_end binary2
> 1.0   1 20   36   0
> 2.0   2 17   19   0
> 1.1   1 12   18   1
> 2.1   2 12   16   1

The following will likely be a quicker way to find where
a column changes values than that lapply() when there
are lots of rows:

  f1 <- function (Table) {
  isFirstInRun <- function(x) c(TRUE, x[-1] != x[-length(x)])
  isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
  with(data.frame(Table), {
  first <- isFirstInRun(binary)
  last <- isLastInRun(binary)
  cbind(Chromosome2 = Chromosome[first], position_start = start[first], 
  position_end = start[last], binary2 = binary[first])
})
  }

E.g.,

  > f1(Table)
       Chromosome2 position_start position_end binary2
  [1,]           1             12           18       1
  [2,]           1             20           36       0
  [3,]           2             12           16       1
  [4,]           2             17           19       0

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> >
> >
> 
> 
> On Wed, Apr 20, 2011 at 5:01 AM, baboon2010 
>  wrote:
> > My question is twofold.
> >
> > Part 1:
> > My data looks like this:
> >
> > (example set, real data has 2*10^6 rows)
> > binary<-c(1,1,1,0,0,0,1,1,1,0,0)
> > Chromosome<-c(1,1,1,1,1,1,2,2,2,2,2)
> > start<-c(12,17,18,20,25,36,12,15,16,17,19)
> > Table<-cbind(Chromosome,start,binary)
> >      Chromosome start binary
> >  [1,]          1    12      1
> >  [2,]          1    17      1
> >  [3,]          1    18      1
> >  [4,]          1    20      0
> >  [5,]          1    25      0
> >  [6,]          1    36      0
> >  [7,]          2    12      1
> >  [8,]          2    15      1
> >  [9,]          2    16      1
> > [10,]          2    17      0
> > [11,]          2    19      0
> >
> > As output I need a shortlist for each binary block: giving 
> me the starting
> > and ending position of each block.
> > Which for these example would look like this:
> >     Chromosome2 position_start position_end binary2
> > [1,]           1             12           18       1
> > [2,]           1             20           36       0
> > [3,]           2             12           16       1
> > [4,]           2             17           19       0
> >
> > Part 2:
> > Based on the output of part 1, I need to assign the binary 
> to rows of
> > another data set. If the position value in this second data 
> set falls in one
> > of the blocks defined in the shortlist made in part1,the 
> binary value of the
> > shortlist should be assigned to an extra column for this 
> row.  This would
> > look something like this:
> >     Chromosome3 position Value binary3
> >  [1,] "1"         "12"     "a"   "1"
> >  [2,] "1"         "13"     "b"   "1"
> >  [3,] "1"         "14"     "c"   "1"
> >  [4,] "1"         "15"     "d"   "1"
> >  [5,] "1"         "16"     "e"   "1"
> >  [6,] "1"         "18"     "f"   "1"
> >  [7,] "1"         "20"     "g"   "0"
> >  [8,] "1"         "21"     "h"   "0"
> >  [9,] "1"         "22"     "i"   "0"
> > [10,] "1"         "23"     "j"   "0"
> > [11,] "1"         "25"     "k"   "0"
> > [12,] "1"         "35"     "l"   "0"
> > [13,] "2"         "12"     "m"   "1"
> > [14,] "2"         "13"     "n"   "1"
> > [15,] "2"         "14"     "o"   "1"
> > [16,] "2"         "15"     "p"   "1"
> > [17,] "2"         "16"     "q"   "1"
> > [18,] "2"         "17"     "s"   "0"
> > [19,] "2"         "18"     "d"   "0"
> > [20,] "2"         "19"     "f"   "0"
> >
> >
> > Many thanks in advance,
> >
> > Niels
> >
> > --
> > View this message in context: 
> http://r.789695.n4.nabble.com/Record-row-values-every-time-the
-binary-value-in-a-collumn-changes-tp3462496p346249

[R] Expanding a VCV Matrix

2011-04-20 Thread Maithula Chandrashekhar
Dear all, I have a special task to expand a given VCV matrix, but I have
not been able to accomplish it yet. Let's say I have the following VCV matrix:

> mat  <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
> colnames(mat) <- rownames(mat) <- paste("variable", 1:3)
> mat
           variable 1 variable 2 variable 3
variable 1          1        2.0        0.0
variable 2          2        5.0        0.5
variable 3          0        0.5        3.0

Now, say I have a general string vector like this:

> Str <- c(paste("variable", 1:4), "variable2")
> Str
[1] "variable 1" "variable 2" "variable 3" "variable 4" "variable 2"

Now, according to this string, I want my previous VCV matrix to be expanded
as well. As "variable 4" is not in the VCV matrix, it will be ignored.
Therefore the final VCV matrix will be of order 4 and will look like:

variable 1  variable 2  variable 3  variable 2
variable 1  1   2   0   2
variable 2  2   5   0.5 5
variable 3  0   0.5 3   0.5
variable 2  2   5   0.5 5

However, I am not sure this is a straightforward expansion that can be done
just with R's subsetting mechanism. Does anyone have an idea of how to do
this in the general case?
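
For what it's worth, plain character subsetting (dropping the name that is
not in the matrix and keeping the repeats) seems to give exactly the
expansion described above:

keep <- Str[Str %in% rownames(mat)]  # "variable 4" dropped, duplicates kept
mat[keep, keep]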

Thanks for your time,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 'Record' row values every time the binary value in a collumn changes

2011-04-20 Thread jim holtman
Here is an answer to part 1:

> binary<-c(1,1,1,0,0,0,1,1,1,0,0)
> Chromosome<-c(1,1,1,1,1,1,2,2,2,2,2)
> start<-c(12,17,18,20,25,36,12,15,16,17,19)
> Table<-cbind(Chromosome,start,binary)
> # determine where the start/end of each group is
> # use indices since the size is large
> startEnd <- lapply(split(seq(nrow(Table))
+  , list(Table[, "Chromosome"], Table[, 'binary'])
+  , drop = TRUE
+  )
+   , function(.indx){
+ se <- range(.indx)
+ c(Chromosome2 = unname(Table[se[1L], "Chromosome"])
+   , position_start = unname(Table[se[1L], 'start'])
+   , position_end = unname(Table[se[2L], 'start'])
+   , binary2 = unname(Table[se[1L], 'binary'])
+   )
+ })
> do.call(rbind, startEnd)
    Chromosome2 position_start position_end binary2
1.0           1             20           36       0
2.0           2             17           19       0
1.1           1             12           18       1
2.1           2             12           16       1
>
>


On Wed, Apr 20, 2011 at 5:01 AM, baboon2010  wrote:
> My question is twofold.
>
> Part 1:
> My data looks like this:
>
> (example set, real data has 2*10^6 rows)
> binary<-c(1,1,1,0,0,0,1,1,1,0,0)
> Chromosome<-c(1,1,1,1,1,1,2,2,2,2,2)
> start<-c(12,17,18,20,25,36,12,15,16,17,19)
> Table<-cbind(Chromosome,start,binary)
>      Chromosome start binary
>  [1,]          1    12      1
>  [2,]          1    17      1
>  [3,]          1    18      1
>  [4,]          1    20      0
>  [5,]          1    25      0
>  [6,]          1    36      0
>  [7,]          2    12      1
>  [8,]          2    15      1
>  [9,]          2    16      1
> [10,]          2    17      0
> [11,]          2    19      0
>
> As output I need a shortlist for each binary block: giving me the starting
> and ending position of each block.
> Which for these example would look like this:
>     Chromosome2 position_start position_end binary2
> [1,]           1             12           18       1
> [2,]           1             20           36       0
> [3,]           2             12           16       1
> [4,]           2             17           19       0
>
> Part 2:
> Based on the output of part 1, I need to assign the binary to rows of
> another data set. If the position value in this second data set falls in one
> of the blocks defined in the shortlist made in part1,the binary value of the
> shortlist should be assigned to an extra column for this row.  This would
> look something like this:
>     Chromosome3 position Value binary3
>  [1,] "1"         "12"     "a"   "1"
>  [2,] "1"         "13"     "b"   "1"
>  [3,] "1"         "14"     "c"   "1"
>  [4,] "1"         "15"     "d"   "1"
>  [5,] "1"         "16"     "e"   "1"
>  [6,] "1"         "18"     "f"   "1"
>  [7,] "1"         "20"     "g"   "0"
>  [8,] "1"         "21"     "h"   "0"
>  [9,] "1"         "22"     "i"   "0"
> [10,] "1"         "23"     "j"   "0"
> [11,] "1"         "25"     "k"   "0"
> [12,] "1"         "35"     "l"   "0"
> [13,] "2"         "12"     "m"   "1"
> [14,] "2"         "13"     "n"   "1"
> [15,] "2"         "14"     "o"   "1"
> [16,] "2"         "15"     "p"   "1"
> [17,] "2"         "16"     "q"   "1"
> [18,] "2"         "17"     "s"   "0"
> [19,] "2"         "18"     "d"   "0"
> [20,] "2"         "19"     "f"   "0"
>
>
> Many thanks in advance,
>
> Niels
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Record-row-values-every-time-the-binary-value-in-a-collumn-changes-tp3462496p3462496.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fibonacci

2011-04-20 Thread Georgina Imberger
Thank-you all!!
Very helpful.

-- Forwarded message --
From: Bart Joosen 
Date: 20 April 2011 15:46
Subject: Re: [R] Fibonacci
To: r-help@r-project.org


Another solution:

while (Fibonacci[1] < 500)  Fibonacci <- c(sum(Fibonacci[c(1,2)]),
Fibonacci)

While this adds the sum before the existing values, the length or tail
function or avoided, but even with reordering, its faster
(Fibonacci[length(Fibonacci):1])

Best regards

Bart


--
View this message in context:
http://r.789695.n4.nabble.com/Fibonacci-tp3462636p3463050.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Partial Least Squares Regression independent variables

2011-04-20 Thread tommym
Hi,

I'm running PLSR on the PLS package.  I have 507 independent and one
dependent variable.

model<-plsr(y~x1+x2+x3.., data = mydata, validation = "CV")

The problem with this is that writing out 507 variable names is not realistic,
as I run out of line space in the command window.

I cannot run the model thus:

model<-plsr(dataframeY~dataframeX...)

as it is specifically warned against in the documentation.

In the example the independent variables have been collated and are
represented by a single name ("NIR" represents numerous variables labelled
NIR.1, NIR.2, etc.).  Can anyone advise on how this is done?
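
A minimal sketch of the usual trick (the object names here are made up):
wrap all 507 predictors in a single matrix column with I(), then use that
one name in the formula.

library(pls)
X <- as.matrix(mydata[, -1])                   # assumes y is in column 1
plsdata <- data.frame(y = mydata$y, X = I(X))  # X stays one matrix column
model <- plsr(y ~ X, data = plsdata, validation = "CV")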

Thanks

Tommy McDermott

--
View this message in context: 
http://r.789695.n4.nabble.com/Partial-Least-Squares-Regression-independent-variables-tp3463478p3463478.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] survexp with weights

2011-04-20 Thread Mike Harwood
Hello,

I probably have a syntax error in trying to generate an expected
survival curve from a weighted cox model, but I can't see it.  I used
the help sample code to generate a weighted model, with the addition
of a "weights=albumin" argument (I only chose albumin because it had
no missing values, not because of any real relevance).  Below are my
code with the resulting error messages.  Thanks in advance!

> pfit <- coxph(Surv(time,status>0) ~ trt + log(bili) + log(protime) + age +
+ + platelet,  data=pbc
+ )
>
> pfit
Call:
coxph(formula = Surv(time, status > 0) ~ trt + log(bili) +
log(protime) +
age + +platelet, data = pbc)


                  coef exp(coef) se(coef)        z      p
trt          -0.000624     0.999  0.17304 -0.00360 1.
log(bili)     0.985497     2.679  0.08949 11.01262 0.
log(protime)  2.794001    16.346  0.95289  2.93215 0.0034
age           0.020590     1.021  0.00805  2.55666 0.0110
platelet     -0.001699     0.998  0.00085 -2.00130 0.0450

Likelihood ratio test=164  on 5 df, p=0  n= 308, number of events=
143
   (110 observations deleted due to missingness)
>
> plot(survfit(Surv(time, status>0) ~ trt, data=pbc))
> lines(survexp( ~ trt, ratetable=pfit, data=pbc), col='purple')
>
> pfit.wtd <- coxph(Surv(time,status>0) ~ trt + log(bili) + log(protime) + age +
+ + platelet,  weights=albumin, data=pbc
+ )
>
> pfit.wtd
Call:
coxph(formula = Surv(time, status > 0) ~ trt + log(bili) +
log(protime) +
age + +platelet, data = pbc, weights = albumin)


                 coef exp(coef) se(coef)      z       p
trt          -0.01354     0.987 0.094204 -0.144 8.9e-01
log(bili)     0.99282     2.699 0.048690 20.391 0.0e+00
log(protime)  2.54136    12.697 0.525797  4.833 1.3e-06
age           0.01913     1.019 0.004398  4.350 1.4e-05
platelet     -0.00165     0.998 0.000462 -3.578 3.5e-04

Likelihood ratio test=535  on 5 df, p=0  n= 308, number of events=
143
   (110 observations deleted due to missingness)
> plot(survfit(Surv(time, status>0) ~ trt, data=pbc))
> lines(survexp( ~ trt, ratetable=pfit.wtd, data=pbc), col='purple')
Error in eval(expr, envir, enclos) : object 'albumin' not found
> lines(survexp( ~ trt, ratetable=pfit.wtd, weights=albumin, data=pbc), 
> col='purple')
Error in eval(expr, envir, enclos) : object 'albumin' not found
In addition: Warning message:
In survexp(~trt, ratetable = pfit.wtd, weights = albumin, data =
pbc) :
  Weights ignored
> lines(survexp( ~ trt, ratetable=pfit.wtd, weights=pbc$albumin, data=pbc), 
> col='purple')
Error in eval(expr, envir, enclos) : object 'albumin' not found
In addition: Warning message:
In survexp(~trt, ratetable = pfit.wtd, weights = pbc$albumin, data =
pbc) :
  Weights ignored
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Two Questions

2011-04-20 Thread John Kane


> From: Stephen P Molnar 
> Subject: [R] Two Questions
> To: "R-help" 
> Received: Wednesday, April 20, 2011, 9:23 AM

> 1.        What is a really good book on
> R for a nonprogrammer?
Have a look at the books listed on the R website.

Books by Peter Dalgaard, Phil Spector, Michael Crawley & John Verzani are all 
possibilities.  Also have a look at the Contributed Documentation page on the
site.  It has some very useful material.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Class htest with non-numeric p-values

2011-04-20 Thread JiHO
Hi everyone,

For some tests, tables of critical values are readily available while
the algorithm to compute those critical values is not. In such cases,
only a range for the p-value can be evaluated (e.g. 0.05 < p < 0.1)
from the table of critical values and the statistic.

Is there anyway to include that in an object of class "htest"? I see
in print.htest() that the p.value element goes through format.pval()
which expect a numeric argument. Is there any workaround?

Thank you in advance. Sincerely,

JiHO
---
http://maururu.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave

2011-04-20 Thread Sarah Goslee
Hi,

On Wed, Apr 20, 2011 at 12:11 PM, R Heberto Ghezzo, Dr
 wrote:
> Hello,
> I never used Sweave before but now I try and got:

Have you installed LaTeX on your computer? It doesn't come installed on
Windows machines by default. I would suggest MiKTeX http://www.miktex.org/
but people who use Windows more often than I do may have other ideas.
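
Once a LaTeX distribution is installed, a quick sanity check from within R
is whether pdflatex can be found on the PATH:

Sys.which("pdflatex")  # an empty string means texi2dvi cannot find pdflatex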

Sarah


>>  rnwfile <- system.file("Sweave", "example-1.Rnw", package = "utils")
>>  Sweave(rnwfile)
> Writing to file example-1.tex
> Processing code chunks with options ...
>  1 : echo term verbatim
>  2 : term verbatim pdf
> You can now run (pdf)latex on 'example-1.tex'
>>  tools::texi2dvi("example-1.tex", pdf = TRUE)
> Error in tools::texi2dvi("example-1.tex", pdf = TRUE) :
>  pdflatex is not available
>>
> The code is from Sweave.pdf file in utils/doc
> I am using R-2.13.0 in Win7  Toshiba laptop
> Please any help?
>
> R.Heberto Ghezzo Ph.D.
> Montreal - Canada
> __



-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave

2011-04-20 Thread R Heberto Ghezzo, Dr
Hello,
I have never used Sweave before, but now I tried it and got:
> 
>  rnwfile <- system.file("Sweave", "example-1.Rnw", package = "utils")
>  Sweave(rnwfile)
Writing to file example-1.tex
Processing code chunks with options ...
 1 : echo term verbatim
 2 : term verbatim pdf
You can now run (pdf)latex on 'example-1.tex'
>  tools::texi2dvi("example-1.tex", pdf = TRUE)
Error in tools::texi2dvi("example-1.tex", pdf = TRUE) : 
  pdflatex is not available
>
The code is from Sweave.pdf file in utils/doc
I am using R-2.13.0 in Win7  Toshiba laptop
Please any help?

R.Heberto Ghezzo Ph.D.
Montreal - Canada
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Relabelling

2011-04-20 Thread John Kane
Can you explain this a bit more. At the moment I don't see what you are trying 
to achieve.   "calculate the mean of the 1000 values at each of the 4000 
points" does not seem to make sense.

--- On Wed, 4/20/11, kmatthews  wrote:

> From: kmatthews 
> Subject: [R] Random Relabelling
> To: r-help@r-project.org
> Received: Wednesday, April 20, 2011, 10:04 AM
> I have 4000 observations that I need
> to randomly relabel 1000 times and then
> calculate the mean of the 1000 values at each of the 4000
> points.  Any ideas
> for where to begin? 
> 
> Thanks
> Kevin 
> 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Test email #2: Please ignore

2011-04-20 Thread Paul Miller
Having trouble posting. This is a second test email to help determine if the 
problem has been resolved.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How can I 'predict' from an nls model with a fit specified for separate groups?

2011-04-20 Thread Stuart Rosen
Following an example on p 111 in 'Nonlinear Regression with R' by Ritz & 
Streibig, I have been fitting nls models using square brackets with the 
grouping variable inside. In their book is this example, in which 
'state' is a factor indicating whether a treatment has been used or not:


> Puromycin.m1 <- nls(rate ~ Vm[state] *
+ conc/(K[state] + conc), data = Puromycin,
+ start = list(K = c(0.1, 0.1),
+ Vm = c(200, 200)))

What I cannot figure out is how to specify the value of the grouping 
variable in a 'predict' statement. In my own example, I can only seem to 
get the predictions for the 1st specified level of the grouping 
variable. I promise that I have read the documentation, and have tried a 
number of things, but cannot get the correct predictions.


Thank you for any help.

Yours - Stuart

--
/**/
Stuart Rosen, PhD
Professor of Speech and Hearing Science
UCL Speech, Hearing and Phonetic Sciences

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple question about symbols()

2011-04-20 Thread murilofm
Thank you for the answer, and sorry about the bad post; I'll remember that
in the future.
By the way, the line code i used to read the data was

inv <- read.csv("data.csv", header=TRUE, sep=";")

I tried to use bg before, but for some reason it wasn't working out for
me. But now I've got it.

Thanks

--
View this message in context: 
http://r.789695.n4.nabble.com/Simple-question-about-symbols-tp3461676p3463373.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unlist command drops all my column names in the first row and adopts NAs

2011-04-20 Thread John Kane
Why not just data.matrix(UN2010) ?
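
A small illustration with a toy data frame (made up for the example):

df <- data.frame(f = factor(c("a", "b", "a")), x = c(1.5, 2, 3))
data.matrix(df)  # numeric matrix; the factor column becomes its integer codes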

--- On Wed, 4/20/11, Haillie  wrote:

> From: Haillie 
> Subject: [R] Unlist command drops all my column names in the first row and 
> adopts NAs
> To: r-help@r-project.org
> Received: Wednesday, April 20, 2011, 11:05 AM
> Hi Everyone, 
> 
> I am having trouble turning my data.frame into a matrix
> format. Because I
> wanted to change my data.frame with mostly factor variables
> into a numeric
> matrix, I used the following code -->
> UN2010frame<-data.matrix(lapply(UN2010,as.numeric))
> 
> However when i checked the mode of the UN2010frame, it
> still showed up as a
> list. Because the code I want to run (Ordrating) does not
> accept data in a
> list format, I used UN2010matrix <- unlist(UN2010frame)
> to unlist my matrix.
> When I did this, my first row ( which was formerly a row
> with column names)
> turned into NAs. This was a problem for me because when I
> tried to run an
> ordinal IRT model using this data set, I got the following
> error message. 
> 
> Error in 1:nrow(Y) : argument of length 0
> 
> I think it is because all the values in my first row are
> now gone 
> 
> If you could help me on any front, It would be deeply
> appreciated. Thank you
> very much!
> 
> Haillie 
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Unlist-command-drops-all-my-column-names-in-the-first-row-and-adopts-NAs-tp3463294p3463294.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org
> mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained,
> reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to get R plots with FastRweb

2011-04-20 Thread Simon Urbanek
Devi,

FastRWeb doesn't use Java - it is entirely R based, so all you need is a web
server with either CGI or PHP. The client it uses is either a C++ client (part
of FastRWeb in the cgi-bin directory of the installed package - just copy it to
your server's cgi-bin) or a PHP client (in Rserve/src/client/php/simple.php -
simply uncomment process_FastRWeb()). The location of the R scripts it serves
is defined by PROJECT_ROOT at the time you compile FastRWeb, which is
/var/FastRWeb by default, so foo.png.R would be /var/FastRWeb/web.R/foo.png.R.

Clearly, you could use Java to connect to the Rserve serving FastRWeb but it 
sort of defeats the purpose, because FastRWeb does all the work for you 
consisting of parsing the request and running the scripts.

If you still have issues, I can point you to a sample configuration including 
the setup etc. (for a unix server) if that helps.

Cheers,
Simon

PS: please use stats-rosuda-devel mailing list - I'm not monitoring R-help.


On Apr 20, 2011, at 3:14 AM, MLSC MANIPAL wrote:

> Dear friends,
> 
> I am working in a web service project which uses integration of Java with R. 
> I have used RJava to connect with Java and that is working fine. As R 
> produces more interactive plots, I would also like to pipe plots generate 
> from R on web page. I came to know that FastRWeb, R2HTML, brew and 
> WebGraphics, Cairo together can be used to do that. I have installed all 
> those successfully. I am developing my web service project with NetBeans. For 
> testing purpose I have used the excersie (kmeans) given in the FastRWeb 
> document.I have put the kmeans.png.R program in my netbeansProject/R 
> directory where other java code runs properly.
> 
> student@mlscubl30:~/NetBeansProjects/R$ ls
> build  build.xml  dist  kmeans.png.R  nbproject  src  test  web
> student@mlscubl30:~/NetBeansProjects/R$ 
> 
> When i try to execute it with http://localhost:8080/R/kmeans.png in browser 
> it does not execute it. Hence can you please tell me what exactly i have to 
> do in order to make it up and run?
> 
> thanks in advance.
> 
> Regards,
> Devi
> Manipal Life Sciences Center

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grid.table + splom: how to nicely align panel entries

2011-04-20 Thread Marius Hofert
Dear Baptiste,

very nice, indeed! 

Two minor issues that remain, are:
(1) I tried to omit the decimal dot for those numbers that do not have digits 
after the decimal dot. But somehow it does not work...
(2) Do you know how one can decrease the text size for the text appearing in 
the 
lower panel? I tried to work with "cex=0.5"... but it was ignored all the 
time.

Cheers,

Marius


library(lattice) 
library(grid)
library(gridExtra)

## function for correct digit alignment
align.digits <- function(l){
sp <- strsplit(as.character(l), "\\.")
chars <- sapply(sp, function(x) nchar(x)[1])
n <- max(chars)-chars
l0 <- sapply(n, function(x) paste(rep("0", x), collapse=""))
sapply(seq_along(sp), function(i){
if(length(sp[[i]])==1){ # this element has no digits after the decimal point
as.expression(bquote(phantom(.(l0[i])) * .(sp[[i]][1])))
}else{
as.expression(bquote(phantom(.(l0[i])) * 
.(sp[[i]][1])*.*.(sp[[i]][2])))
}
})
}

## splom with customized lower.panel
## x: data
## arr: array containing expressions which are plotted in a grid table in
##      the lower panel (i,j)
splom2 <- function(x, arr, nr){
## function for creating table 
table.fun <- function(vec){ # vector containing lines for table for *one* panel
grid.table(matrix(vec, nrow=nr, byrow=TRUE),
   parse=TRUE, # parse labels as expressions
   theme=theme.list(
   gpar.corefill=gpar(fill=NA, col=NA), # make bg transparent
   core.just="left", padding.h=unit(0,"mm")) # justification of 
labels
   ) 
}
## splom
splom(x, varname.cex=1.2,
  superpanel=function(z, ...){
  panel.pairs(z, upper.panel=panel.splom, lower.panel=function(i,j){
  table.fun(arr[i,j,])
  }, ...)
  })
}

## create data and array of expressions
d <- 4
x <- matrix(runif(d*1000), ncol=d) # data to be plotted with splom
nr <- 3 # number of rows for the panel entries
nc <- 3 # number of cols for the panel entries
arr <- array(list(rep(NA,nr*nc)), dim=c(d,d,nr*nc), dimnames=c("i","j","val")) 
# array containing the table entries per panel
f <- function(i,j) (i+j)*10 # dummy function
eq <- "phantom()==phantom()"
for(i in 1:d){
for(j in 1:d){
numbers <- align.digits(c(round(pi,4), round(pi, 6), f(i,j)))
arr[i,j,] <- c("alpha", eq, numbers[1],
   "italic(bbb)", eq, numbers[2],
   "gamma", eq, numbers[3])
}
}

## plot
splom2(x, arr, nr=3)


On 2011-04-20, at 11:56 , baptiste auguie wrote:

> On 20 April 2011 21:16, Marius Hofert  wrote:
>> Dear expeRts,
>> 
>> is there a way to get the entries in each panel correctly aligned according 
>> to the
>> equality signs?
>> 
>> Here is the "wish-list":
>> (1) the equality signs in each panel should be vertically aligned
> 
> You can put the equal signs in their own column,
> 
> library(gridExtra)
> d = matrix(c("italic(a)", "phantom()==phantom()", round(pi,4),
> "italic(b)", "phantom()==phantom()", round(pi,6)), ncol=3, byrow=T)
> grid.table(d, parse=T,theme=theme.list(core.just="left"))
> 
>> (2) the numbers should be aligned on the decimal point
> 
> You could place some phantom()s to do this,
> 
> align.digits = function(l)
> {
> 
> sp <- strsplit(as.character(l), "\\.")
> chars <- sapply(sp, function(x) nchar(x)[1])
> n = max(chars) - chars
> l0 = sapply(n, function(x) paste(rep("0", x), collapse=""))
> labels = sapply(seq_along(sp), function(i) {
>  as.expression(bquote(phantom(.(l0[i])) * .(sp[[i]][1])*.*.(sp[[i]][2])))})
> 
> return(labels)
> }
> 
> library(gridExtra)
> 
> d <- align.digits(l = c(125.3, 1.2344))
> grid.table(d, parse=T,core.just="left")
> 
> HTH,
> 
> baptiste
> 
>> One could adjust the phantom()-arguments by hand to achieve (1), but is 
>> there a
>> simpler solution? For (2) I have no idea.
>> 
>> Cheers,
>> 
>> Marius
>> 
>> 
>> library(lattice)
>> library(grid)
>> library(gridExtra)
>> 
>> ## splom with customized lower.panel
>> ## x: data
>> ## arr: array of containing expressions which are plotted in a grid table in 
>> the
>> ##  lower panel (i,j)]
>> splom2 <- function(x, arr){
>>## function for creating table
>>table.fun <- function(vec){ # vector containing lines for table for *one* 
>> panel
>>grid.table(matrix(vec, ncol=2, byrow=TRUE),
>>   parse=TRUE, # parse labels as expressions
>>   theme=theme.list(
>>   gpar.corefill=gpar(fill=NA, col=NA), # make bg transparent
>>   core.just="left", padding.h=unit(0,"mm")) # justification 
>> of labels
>>   )
>>}
>>## splom
>>splom(x, varname.cex=1.4,
>>  superpanel=function(z, ...){
>>  panel.pairs(z, upper.panel=panel.splom, 
>> lower.panel=function(i,j){
>>  table.fun(arr[i,j,])
>>  }, ...)
>>  })
>> }
>> 
>> ## create data and array of expr

[R] Unlist command drops all my column names in the first row and adopts NAs

2011-04-20 Thread Haillie
Hi Everyone, 

I am having trouble turning my data.frame into a matrix format. Because I
wanted to change my data.frame with mostly factor variables into a numeric
matrix, I used the following code -->
UN2010frame<-data.matrix(lapply(UN2010,as.numeric))

However, when I checked the mode of the UN2010frame, it still showed up as a
list. Because the code I want to run (Ordrating) does not accept data in a
list format, I used UN2010matrix <- unlist(UN2010frame) to unlist my matrix.
When I did this, my first row ( which was formerly a row with column names)
turned into NAs. This was a problem for me because when I tried to run an
ordinal IRT model using this data set, I got the following error message. 

Error in 1:nrow(Y) : argument of length 0

I think it is because all the values in my first row are now gone 

If you could help me on any front, It would be deeply appreciated. Thank you
very much!

Haillie 


--
View this message in context: 
http://r.789695.n4.nabble.com/Unlist-command-drops-all-my-column-names-in-the-first-row-and-adopts-NAs-tp3463294p3463294.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Matrix package transpose

2011-04-20 Thread Tobias Abenius

Hi,

Since I installed R 2.13 I cannot use the transpose method "t" on sparse 
matrices inside my package. Outside the package works. Is there 
something new that I have to import methods? Can I then import 
everything from the Matrix package? The problem is that R tries to use 
t.default which of course doesn't work.
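
For what it's worth, this kind of problem often comes down to the package
namespace not importing the S4 methods from Matrix; a sketch of the relevant
NAMESPACE directives (to be adapted to the actual package) might be:

import(methods)
importMethodsFrom(Matrix, t)   # or simply: import(Matrix)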


Happy easter, Tobias
--
Tobias Abenius
Ph.D. Student, M.Sc. in Computer Science

Mathematical Statistics
Mathematical Sciences
University of Gothenburg
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fibonacci

2011-04-20 Thread Bart Joosen
Another solution:

while (Fibonacci[1] < 500)  Fibonacci <- c(sum(Fibonacci[c(1,2)]),
Fibonacci)

While this adds the sum before the existing values, the length and tail
functions are avoided; even with the reordering at the end
(Fibonacci[length(Fibonacci):1]), it is still faster.
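
For completeness, a minimal run of this approach, assuming the vector is
seeded with the first two terms:

Fibonacci <- c(1, 1)            # newest value is kept at the front
while (Fibonacci[1] < 500)
  Fibonacci <- c(sum(Fibonacci[c(1, 2)]), Fibonacci)
Fibonacci[length(Fibonacci):1]  # 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610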

Best regards

Bart


--
View this message in context: 
http://r.789695.n4.nabble.com/Fibonacci-tp3462636p3463050.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Random Relabelling

2011-04-20 Thread kmatthews
I have 4000 observations that I need to randomly relabel 1000 times and then
calculate the mean of the 1000 values at each of the 4000 points.  Any ideas
for where to begin? 

Thanks
Kevin 

--
View this message in context: 
http://r.789695.n4.nabble.com/Random-Relabelling-tp3463100p3463100.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] BMA, logistic regression, odds ratio, model reduction etc

2011-04-20 Thread khosoda

Dear Prof. Harrel,

Thank you very much for your quick advice.
I will try the rms package.

Regarding model reduction, is my model 2 method (clustering and recoding 
that are blinded to the outcome) permissible?


Sincerely,

--
KH

(11/04/20 22:01), Frank Harrell wrote:

Deleting variables is a bad idea unless you make that a formal part of the
BMA so that the attempt to delete variables is penalized for.  Instead of
BMA I recommend simple penalized maximum likelihood estimation (see the lrm
function in the rms package) or pre-modeling data reduction that is blinded
to the outcome variable.
Frank


細田弘吉 wrote:


Hi everybody,
I apologize for long mail in advance.

I have data of 104 patients, which consists of 15 explanatory variables
and one binary outcome (poor/good). The outcome consists of 25 poor
results and 79 good results. I tried to analyze the data with logistic
regression. However, the 15 variables and 25 events means events per
variable (EPV) is much less than 10 (rule of thumb). Therefore, I used R
package, "BMA" to perform logistic regression with BMA to avoid this
problem.

model 1 (full model):
x1, x2, x3, x4 are continuous variables and others are binary data.


x16.bic.glm<- bic.glm(outcome ~ ., data=x16.df,

glm.family="binomial", OR20, strict=FALSE)

summary(x16.bic.glm)

(The output below has been cut off at the right edge to save space)

   62  models were selected
  Best  5  models (cumulative posterior probability =  0.3606 ):

  p!=0EV SDmodel 1model2
Intercept100-5.1348545  1.652424-4.4688  -5.15
-5.1536
age3.3   0.0001634  0.007258  .
sex4.0
.M   -0.0243145  0.220314  .
side  10.8
 .R   0.0811227  0.301233  .
procedure 46.9  -0.5356894  0.685148  .  -1.163
symptom3.8  -0.0099438  0.129690  .  .
stenosis   3.4  -0.0003343  0.005254  .
x13.7  -0.0061451  0.144084  .
x2   100.0   3.1707661  0.892034 3.2221 3.11
x351.3  -0.4577885  0.551466-0.9154 .
HT 4.6
   .positive  0.0199299  0.161769  .  .
DM 3.3
   .positive -0.0019986  0.105910  .  .
IHD3.5
.positive 0.0077626  0.122593  .  .
smoking9.1
.positive 0.0611779  0.258402  .  .
hyperlipidemia16.0
   .positive  0.1784293  0.512058  .  .
x4 8.2   0.0607398  0.267501  .  .


nVar   2  2
  1  3  3
BIC   -376.9082
-376.5588  -376.3094  -375.8468  -374.5582
post prob0.104
0.087  0.077  0.061  0.032

[Question 1]
Is it O.K to calculate odds ratio and its 95% confidence interval from
"EV" (posterior distribution mean) and“SD”(posterior distribution
standard deviation)?
For example, 95%CI of EV of x2 can be calculated as;

exp(3.1707661)

[1] 23.82573 ->  odds ratio

exp(3.1707661+1.96*0.892034)

[1] 136.8866

exp(3.1707661-1.96*0.892034)

[1] 4.146976
-->  95%CI (4.1 to 136.9)
Is this O.K.?

[Question 2]
Is it permissible to delete variables with small value of "p!=0" and
"EV", such as age (3.3% and 0.0001634) to reduce the number of
explanatory variables and reconstruct new model without those variables
for new session of BMA?

model 2 (reduced model):
I used R package, "pvclust", to reduce the model. The result suggested
x1, x2 and x4 belonged to the same cluster, so I picked up only x2.
Based on the subject knowledge, I made a simple unweighted sum, by
counting the number of clinical features. For 9 features (sex, side,
HT2, hyperlipidemia, DM, IHD, smoking, symptom, age), the sum ranges
from 0 to 9. This score was defined as ClinicalScore. Consequently, I
made up new data set (x6.df), which consists of 5 variables (stenosis,
x2, x3, procedure, and ClinicalScore) and one binary outcome
(poor/good). Then, for alternative BMA session...


BMAx6.glm<- bic.glm(postopDWI_HI ~ ., data=x6.df,

glm.family="binomial", OR=20, strict=FALSE)

summary(BMAx6.glm)

(The output below has been cut off at the right edge to save space)
Call:
bic.glm.formula(f = postopDWI_HI ~ ., data = x6.df, glm.family =
"binomial", strict = FALSE, OR = 20)


   13  models were selected
  Best  5  models (cumulative posterior probability =  0.7626 ):

 p!=0EV SD   model 1model 2
Intercept   100-5.6918362  1.81220-4.4688-6.3166
stenosis 

Re: [R] Extrapolating data points for individuals who lack them

2011-04-20 Thread Mike Marchywka





> From: de...@exeter.ac.uk
> To: r-help@r-project.org
> Date: Wed, 20 Apr 2011 12:41:29 +0100
> Subject: [R] Extrapolating data points for individuals who lack them
>
> Hi,
>
> We have an experiment where individuals responses were measured over 5 days. 
> Some responses were not obtained because we only allowed individuals to 
> respond within a limited time-frame. These individuals are given the maximum 
> response time as they did not respond, yet we feel they may have done if 
> given time (and by looking at the rest of their responses over time, the 
> non-response days stand out).
>
> We therefore want to extrapolate data points for individuals, on days when 
> they didn't respond, using a regression of days when they did.
>
> Does anyone know how we could do this quickly and easily in R?

You are probably talking about right censoring. See things like this, 
( you may have good luck just with "R" rather than "CRAN" ) 

http://www.google.com/#sclient=psy&hl=en&source=hp&q=CRAN+informative+%22right+censoring%22


If you post data maybe someone can try a few things. It isn't hard to take data
subsets, fit models, and replace data with model predictions, but it is easier and
more interesting to illustrate with your data.


Personally I would avoid making up data and of course extrapolation tends
to be the most error prone way of doing that. 
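
To make the "fit models and replace data with model predictions" idea concrete,
here is a rough sketch under assumed column names (subject, day, resp, censored)
for a hypothetical data frame dat -- not the poster's actual data:

filled <- do.call(rbind, lapply(split(dat, dat$subject), function(d) {
    # fit only on the days this subject actually responded, then predict the rest
    if (any(d$censored) && sum(!d$censored) >= 2) {
        fit <- lm(resp ~ day, data = d[!d$censored, ])
        d$resp[d$censored] <- predict(fit, newdata = d[d$censored, , drop = FALSE])
    }
    d
}))
# the survival package (Surv() with right censoring) is the more principled route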





>
> Thanks very much
> Dave

  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple question about symbols()

2011-04-20 Thread David Winsemius


On Apr 20, 2011, at 8:45 AM, murilofm wrote:


The link to the csv file is

http://www.filedropper.com/data_5

I use the "d" variable to create the radius:

radius <- sqrt( inv$d/ pi )

and i tried

symbols(inv$a, inv$b, circles=radius, inches=0.35, fg="white",
  bg="red", xlab="aa", ylab="bb",
  col=c("blue","red")[inv$c+1])


You should follow the Posting Guide's advice and offer the full code  
you used to read in that data. It's _not_ comma separated, and for  
some reason even using read.csv2 to properly read a semi-colon  
separated file, I also needed to to convert the first two variables  
from factor to numeric  ... please also read the FAQ item on this topic.


Once that was done this "works", since col is _not_ the proper  
argument for coloring with that function:


?symbols

plot.new()
 symbols(inv$a, inv$b, circles=radius, inches=0.35,
 bg=c("blue","red")[inv$c+1])

--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Yearly aggregates and matrices

2011-04-20 Thread mathijsdevaan
Thanks for clarifying that.

Best


Gabor Grothendieck wrote:
> 
> On Wed, Apr 20, 2011 at 5:49 AM, mathijsdevaan
>  wrote:
>> As a follow up on this post, I am trying to slightly adjust the solution
>> kindly provided by Gabor. However, I am getting some results that I do
>> not
>> understand. Example:
>>
>> # devel version of zoo
>> install.packages("zoo", repos = "http://r-forge.r-project.org")
>> library(zoo)
>>
>> DF1 = data.frame(read.table(textConnection("    B  C  D  E  F  G
>> 8025  1995  0  4  1  2
>> 8025  1997  1  1  3  4
>> 8026  1995  0  7  0  0
>> 8026  1996  1  2  3  0
>> 8026  1997  1  2  3  1
>> 8026  1998  6  0  0  4
>> 8026  1999  3  7  0  3
>> 8027  1997  1  2  3  9
>> 8027  1998  1  2  3  1
>> 8027  1999  6  0  0  2
>> 8028  1999  3  7  0  0
>> 8029  1995  0  2  3  3
>> 8029  1998  1  2  3  2
>> 8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE))
>>
>> a <- read.zoo(DF1, split = 1, index = 2, FUN = identity)
>> sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA
>> b <- rollapply(a, 3,  sum.na, align = "right", partial = TRUE)
>> newDF <- lapply(1:nrow(b), function(i)
>>       prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE,
>>               dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1))
>> names(newDF) <- time(a)
>> c<-lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2))))
>>
>> Now I would like the elements e in c to be equal to 1-e. However,
>>
>> c<-lapply(newDF, function(mat) 1 - tcrossprod(mat /
>> sqrt(rowSums(mat^2))))
>>
>> gives a value  of 2.220446e-16 for as.data.frame(c['1999'])[2,2] instead
>> of
>> 0
>>
>> What am I doing wrong here? Thanks a lot!
>>
> 
> See FAQ 7.31 at:
> 
> http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
> 
> 
> -- 
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


--
View this message in context: 
http://r.789695.n4.nabble.com/Yearly-aggregates-and-matrices-tp3438140p3463000.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Test email: Please disregard

2011-04-20 Thread Paul Miller
Having trouble posting. Thought it might have something to do with the 
particular message I'm sending. Sending this message to test that possibility.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Two Questions

2011-04-20 Thread David Winsemius


On Apr 20, 2011, at 9:23 AM, Stephen P Molnar wrote:

Sorry for the somewhat nondescript subject line, but I have two  
questions:




1. What is a really good book on R for a nonprogrammer?

2.   How do I open more than one R Graphics: Device 2 (ACTIVE)?  That is,
what is the R command that I can use to keep more than one plot open?


You can have more than one device available, but you need to address  
them serially. Only one device can receive input at a time.


?Devices
?dev.set


I am running a script from a book on Chemometrics that results in more
than one graph during the execution, but it seems that R deletes each
graph when the script calls for the next plot.


More likely you are seeing one graph displayed at a time on the screen  
device. On my screen device the cmd-left-arrow will bring up prior  
plots to a depth of 15 earlier results.


--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Two Questions

2011-04-20 Thread Duncan Murdoch

On 20/04/2011 9:23 AM, Stephen P Molnar wrote:

Sorry for the somewhat nondescript subject line, but I have two questions:



1. What is a really good book on R for a nonprogrammer?


Any book that teaches you the basics of programming would be good; it 
doesn't need to be about R.  If you want to use R and remain a 
nonprogrammer, you will not have an easy time.



2.   How do I open more than one R Graphics: Device 2 (ACTIVE)?  That is,
what is the R command that I can use to keep more than one plot open?  I am
running a script from a book on Chemometrics that results in more than one
graph during the execution, but it seems that R deletes each graph when the
script calls for the next plot.



dev.new()  will open a new plot window, and subsequent plotting commands 
will be drawn there.  dev.set() lets you switch back to drawing on the 
original one.
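
A quick illustration of that (just a sketch):

plot(1:10)            # drawn on the current device
dev.new()             # open a second window, which becomes the active device
hist(rnorm(100))      # drawn on the new window; the first plot is untouched
dev.set(dev.prev())   # switch back to the first window
abline(0, 1)          # adds to the original plot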


Duncan Murdoch




Thanks in advance



Stephen P. Molnar, Ph.D.                 Life is a fuzzy set
Foundation for Chemistry                 Stochastic and multivariate

http://www.FoundationForChemistry.com




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SPDEP Package, neighbours list for Moran's I on large grid dataset

2011-04-20 Thread Ben Haller
Laurent Jégou wrote:

> Hello list members, I'd like to calculate Moran's I on a large dataset,
> a 8640x3432 grid of values.
> 
> When I try to create the neighbours list with the cell2nb function, on
> such a scale, R works for several hours and seems to crash. I'm using the
> latest version (2.13), 64-bit, on a Mac Pro with 4 GB of memory.
> 
> I think that I'm doing it wrong :-)
> 
> I'd like to use a "moving window" weight matrix instead of a full-scale
> one, but I was not lucky enough to find a solution in the docs.

  I confronted this problem recently, and decided to roll my own.  For a large 
dataset, an exhaustive computation of Moran's I is very time-consuming, as you 
say; but an estimate of it, based on a subset of pairs chosen randomly, seemed 
(for my data, at least) to converge quite nicely.  Here's my code.  It assumes 
a square grid, so you'll need to adapt it for your non-square grid, but that 
should be trivial.  To determine the right number of pairs for a good estimate 
in my application, I called this function with various numbers of pairs, and 
plotted the estimate produced as a function of the number of pairs used, and 
observed good convergence by 200,000 pairs.  You will probably want to do this 
yourself, for your dataset.  200,000 pairs took just a second or two to run, on 
my machine, so this is much, much faster than an exhaustive calculation.  As a 
bonus, this code also gives you Geary's C.  Oh, and the code below assumes a 
wraparound lattice (i.e. a torus, i.e. periodic boundaries), which may not be 
what you want; just get rid of the d_wraparound stuff and the following pmax() 
call if you want non-wraparound boundaries, and that should work fine, I think. 
 Not sure what you mean by the moving window thing, so I leave that as an 
exercise for the reader.  :->
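
  A sketch of that convergence check (assuming correlation_stats() below is
adapted to return just the Moran's I estimate, and p is your landscape matrix):

ns  <- c(1000, 5000, 20000, 50000, 100000, 200000, 500000)
est <- sapply(ns, function(n) correlation_stats(p, n_pairs = n))
plot(ns, est, type = "b", log = "x",
     xlab = "number of pairs", ylab = "Moran's I estimate")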

  I've never made an R package, and I'm not sure there's much demand for this 
code (is there?), so I'm presently unlikely to package it up.  However, if 
someone who owns a package related to this problem would like to adopt this 
code, generalize it to non-square lattices, add a flag to choose periodic or 
non-periodic boundaries, etc., feel free to do so; I hereby place it in the 
public domain.  I'd just like to hear about it if someone does this, and 
receive credit somewhere, that's all.  I'm not super-good in R, either, so if 
there's a better way to do some things, like the somewhat hacked random index 
generation (trying to avoid floating point issues when generating a random 
integer), please let me know.  I'm always interested in learning how to do 
things better.

  And of course this code is provided without warranty, may have bugs, etc.

  Enjoy!

Ben Haller
McGill University


correlation_stats <- function(p, n_pairs=20)
{
    # Compute Moran's I and Geary's C for the landscape p.  This is tricky
    # because the exact calculation would be far too time-consuming, as it
    # involves comparison of every point to every other point.  So I am trying
    # to estimate it here with a small subset of all possible point comparisons.
    p_size <- NROW(p)       # 512
    p_length <- length(p)   # 262144
    mean_p <- mean(p)
    pv <- as.vector(p)

    # select points and look up their values; for speed, points are selected
    # by their vector index
    p1ix <- floor(runif(n_pairs, min=0, max=p_length - 0.001))
    p1x <- (p1ix %/% p_size) + 1
    p1y <- (p1ix %% p_size) + 1
    p1ix <- p1ix + 1

    p2ix <- floor(runif(n_pairs, min=0, max=p_length - 0.001))
    p2x <- (p2ix %/% p_size) + 1
    p2y <- (p2ix %% p_size) + 1
    p2ix <- p2ix + 1

    v1 <- pv[p1ix] - mean_p
    v2 <- pv[p2ix] - mean_p

    # Calculate distances between point pairs, both directly and wrapping around.
    # The average direct distance is much smaller than the average wraparound
    # distance, because points near the center vertically have a maximum direct
    # distance near 256, but a minimum wraparound distance near 256.  The
    # Moran's I estimate from wraparound distances is different, as a result.
    # Rupert recommends taking the shorter of the two distances, whichever it
    # is, because that keeps us within the meaningful scale of our space,
    # before it just starts repeating due to periodicity.
    d_direct <- 1 / sqrt(((p1x - p2x) ^ 2) + ((p1y - p2y) ^ 2))
    d_direct[is.infinite(d_direct)] <- 0

    d_wraparound <- 1 / sqrt(((p1x - p2x) ^ 2) + ((p_size - abs(p1y - p2y)) ^ 2))
    d_wraparound[is.infinite(d_wraparound)] <- 0

    d <- pmax(d_direct, d_wraparound)   # max because we want the min distance, and these are 1/distance

    # precalculate some shared terms
    sum_d <- sum(d)
    sum_v1_sq <- sum(v1 ^ 2)

    # Moran's I: -1 is perfect dispersion, 1 is perfect correlation, 0 is a
    # random spa
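    # [The archived message is cut off at this point.  A hedged guess at how
    #  the estimate might finish, based on the terms computed above and the
    #  standard Moran's I / Geary's C definitions -- not the original code:]
    moran_i <- (n_pairs / sum_d) * sum(d * v1 * v2) / sum_v1_sq
    geary_c <- ((n_pairs - 1) / (2 * sum_d)) * sum(d * (v1 - v2) ^ 2) / sum_v1_sq
    c(moran_i = moran_i, geary_c = geary_c)
}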

Re: [R] Simple question about symbols()

2011-04-20 Thread murilofm
The link to the csv file is

http://www.filedropper.com/data_5

I use the "d" variable to create the radius:

radius <- sqrt( inv$d/ pi )

and i tried

symbols(inv$a, inv$b, circles=radius, inches=0.35, fg="white",
   bg="red", xlab="aa", ylab="bb",
   col=c("blue","red")[inv$c+1])

Thanks for the help.

--
View this message in context: 
http://r.789695.n4.nabble.com/Simple-question-about-symbols-tp3461676p3462882.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple question about symbols()

2011-04-20 Thread Ben Bolker
David Winsemius  comcast.net> writes:

> 
> 
> On Apr 19, 2011, at 10:51 PM, murilofm wrote:
> 
> > Thanks for the answer; I see that col=c("blue","red")[inv$c+1]  
> > creates a
> > vector of "red" and "blue" associated with the binnary c.
> > But still I got everything red.
> 
> If you want tested solution, submit test data.
> 

  David also suggests (off-list) that you use "bg" instead of "col" as 
the argument name.  (Another strategy is to go to ?symbols
and search for "colour" -- I wouldn't expect a new user
to guess the c("blue","red")[inv$c+1] stuff, but you probably
could go to the help page and figure out that the appropriate
argument name is "bg" ...)
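
A tiny self-contained illustration of the indexing idea (made-up values, not
the poster's data):

cc <- c(0, 1, 1, 0, 1)                       # a binary variable like inv$c
symbols(1:5, 1:5, circles = rep(0.3, 5), inches = 0.35,
        bg = c("blue", "red")[cc + 1])       # 0 -> "blue", 1 -> "red"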

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Two Questions

2011-04-20 Thread Stephen P Molnar
Sorry for the somewhat nondescript subject line, but I have two questions:

 

1. What is a really good book on R for a nonprogrammer?

2.   How do I open more than one R Graphics: Device 2 (ACTIVE)?  That is,
what is the R command that I can use to keep more than one plot open?  I am
running a script from a book on Chemometrics that results in more than one
graph during the execution, but it seems that R deletes each graph when the
script calls for the next plot.

 

Thanks in advance

 

Stephen P. Molnar, Ph.D.                 Life is a fuzzy set
Foundation for Chemistry                 Stochastic and multivariate

http://www.FoundationForChemistry.com

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lapply sequence

2011-04-20 Thread Duncan Murdoch

On 20/04/2011 7:26 AM, Dean Marks wrote:

Good day,

My question is: Does the lapply function guarantee a particular sequence in
which elements are mapped? And, are we guaranteed that lapply will always be
sequential (i.e. never map elements in parallel) ?


No.


The reason I ask is if I use lapply with the mapping function set to
something that has side-effects that need to be executed in a particular
sequence.


Use a for loop.

If this is not possible, is there an alternate method other than using a for
loop?


while or repeat.

Duncan Murdoch
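
A minimal sketch of the for-loop alternative, with a placeholder function
do_something() and input list xs standing in for the side-effecting operation:

results <- vector("list", length(xs))
for (i in seq_along(xs)) {
    results[[i]] <- do_something(xs[[i]])   # side effects run strictly in index order
}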

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] BMA, logistic regression, odds ratio, model reduction etc

2011-04-20 Thread Frank Harrell
Deleting variables is a bad idea unless you make that a formal part of the
BMA so that the attempt to delete variables is penalized.  Instead of
BMA I recommend simple penalized maximum likelihood estimation (see the lrm
function in the rms package) or pre-modeling data reduction that is blinded
to the outcome variable.
Frank


細田弘吉 wrote:
> 
> Hi everybody,
> I apologize for long mail in advance.
> 
> I have data of 104 patients, which consists of 15 explanatory variables
> and one binary outcome (poor/good). The outcome consists of 25 poor
> results and 79 good results. I tried to analyze the data with logistic
> regression. However, the 15 variables and 25 events means events per
> variable (EPV) is much less than 10 (rule of thumb). Therefore, I used R
> package, "BMA" to perform logistic regression with BMA to avoid this
> problem.
> 
> model 1 (full model):
> x1, x2, x3, x4 are continuous variables and others are binary data.
> 
>> x16.bic.glm <- bic.glm(outcome ~ ., data=x16.df,
> glm.family="binomial", OR = 20, strict=FALSE)
>> summary(x16.bic.glm)
> (The output below has been cut off at the right edge to save space)
> 
>   62  models were selected
>  Best  5  models (cumulative posterior probability =  0.3606 ):
> 
>  p!=0EV SDmodel 1model2
> Intercept100-5.1348545  1.652424-4.4688  -5.15
> -5.1536
> age3.3   0.0001634  0.007258  .
> sex4.0
>.M   -0.0243145  0.220314  .
> side  10.8
> .R   0.0811227  0.301233  .
> procedure 46.9  -0.5356894  0.685148  .  -1.163
> symptom3.8  -0.0099438  0.129690  .  .
> stenosis   3.4  -0.0003343  0.005254  .
> x13.7  -0.0061451  0.144084  .
> x2   100.0   3.1707661  0.892034 3.2221 3.11
> x351.3  -0.4577885  0.551466-0.9154 .
> HT 4.6
>   .positive  0.0199299  0.161769  .  .
> DM 3.3
>   .positive -0.0019986  0.105910  .  .
> IHD3.5
>.positive 0.0077626  0.122593  .  .
> smoking9.1
>.positive 0.0611779  0.258402  .  .
> hyperlipidemia16.0
>   .positive  0.1784293  0.512058  .  .
> x4 8.2   0.0607398  0.267501  .  .
> 
> 
> nVar   2  2
>  1  3  3
> BIC   -376.9082
> -376.5588  -376.3094  -375.8468  -374.5582
> post prob0.104
> 0.087  0.077  0.061  0.032
> 
> [Question 1]
> Is it O.K to calculate odds ratio and its 95% confidence interval from
> "EV" (posterior distribution mean) and“SD”(posterior distribution
> standard deviation)?
> For example, 95%CI of EV of x2 can be calculated as;
>> exp(3.1707661)
> [1] 23.82573 -> odds ratio
>> exp(3.1707661+1.96*0.892034)
> [1] 136.8866
>> exp(3.1707661-1.96*0.892034)
> [1] 4.146976
> --> 95%CI (4.1 to 136.9)
> Is this O.K.?
> 
> [Question 2]
> Is it permissible to delete variables with small value of "p!=0" and
> "EV", such as age (3.3% and 0.0001634) to reduce the number of
> explanatory variables and reconstruct new model without those variables
> for new session of BMA?
> 
> model 2 (reduced model):
> I used R package, "pvclust", to reduce the model. The result suggested
> x1, x2 and x4 belonged to the same cluster, so I picked up only x2.
> Based on the subject knowledge, I made a simple unweighted sum, by
> counting the number of clinical features. For 9 features (sex, side,
> HT2, hyperlipidemia, DM, IHD, smoking, symptom, age), the sum ranges
> from 0 to 9. This score was defined as ClinicalScore. Consequently, I
> made up new data set (x6.df), which consists of 5 variables (stenosis,
> x2, x3, procedure, and ClinicalScore) and one binary outcome
> (poor/good). Then, for alternative BMA session...
> 
>> BMAx6.glm <- bic.glm(postopDWI_HI ~ ., data=x6.df,
> glm.family="binomial", OR=20, strict=FALSE)
>> summary(BMAx6.glm)
> (The output below has been cut off at the right edge to save space)
> Call:
> bic.glm.formula(f = postopDWI_HI ~ ., data = x6.df, glm.family =
> "binomial", strict = FALSE, OR = 20)
> 
> 
>   13  models were selected
>  Best  5  models (cumulative posterior probability =  0.7626 ):
> 
> p!=0EV SD   model 1model 2
> Intercept   100-5.6918362  1.81220-4.4688-6.3166
> stenosis  8.1  -0.0008417  0.00815  .  .
> x2  100.0   3.0606165  0

Re: [R] problem reading csv file

2011-04-20 Thread Petr PIKAL
Thank you. 

r-help-boun...@r-project.org napsal dne 19.04.2011 22:00:56:

> Not attached. You might succeed if you rename the file with a .txt 
> extension and re-post.
> 
> Almost surely an encoding issue. We may need your session Info to get 
> "locale".

Maybe you are right. After some further thorough documentation search I 
tried the encoding argument, which was not a success. After that I tried 
fileEncoding = "UCS-2LE" and bingo, that was it.

From the file help page:

The encodings ‘"UCS-2LE"’ and ‘"UTF-16LE"’ are treated specially,
 as they are appropriate values for Windows ‘Unicode’ text files.
 If the first two bytes are the Byte Order Mark ‘0xFFFE’ then these
 are removed as some implementations of ‘iconv’ do not accept BOMs.

˙ţ1 looks in a hex editor like 0xFFFE, so the problem is solved for now.
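
In other words, something along these lines (a sketch based on the description
above, not the exact command):

dat <- read.table("H2O.CSV", sep = ",", dec = ".",
                  fileEncoding = "UCS-2LE")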

Thanks again

Petr

> 
> -- 
> David
> On Apr 19, 2011, at 2:47 PM, Petr Pikal wrote:
> 
> > Dear all
> >
> > I have several files which claim to be *.csv (one attached, maybe it
> > will come through) . They can be read to Open Office without much
> > problem, however I can not read them into R. I tried
> > read.table("H2O.CSV", sep=",", dec=".")
> > V1
> > 1 ˙ţ1
> > 2
> > 3
> > 
> >> read.table("H2O.CSV", sep=",", dec=".", skip=1)
> > Error in read.table("H2O.CSV", sep = ",", dec = ".", skip = 1) :
> > empty beginning of file
> >
> >> readLines("H2O.CSV", 1)
> > [1] "˙ţ1"
> >
> >> readLines("H2O.CSV", 5)
> > [1] "˙ţ1" "" "" "" ""
> >
> >
> > readChar("H2O.CSV", 10)
> > [1] "˙ţ1"
> >> readBin("H2O.CSV", 10)
> > [1] 9.456937e-308
> >
> >
> > This is how first two lines appear in Notepad
> >
> > 1,1.77436423301697,"BV
> > ", 
> > 
91.0779418945313,7.35872077941895,0.178956836462021,1.70007145404816,1.90102112293243
> > 2,1.94783389568329,"VV
> > ", 
> > 
1341.51489257812,9.04244899749756,1.76539707183838,1.90102112293243,3.52783703804016
> > 
> >
> > The problem seems to be in first item "˙ţ1" which somehow blocks
> > further values to be read.
> >
> > Does anybody have idea what to do or where to look for some help? I do
> > not want to transfer files through spreadsheet manually.
> >
> > Best regards
> > Petr
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> David Winsemius, MD
> West Hartford, CT
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Yearly aggregates and matrices

2011-04-20 Thread Gabor Grothendieck
On Wed, Apr 20, 2011 at 5:49 AM, mathijsdevaan  wrote:
> As a follow up on this post, I am trying to slightly adjust the solution
> kindly provided by Gabor. However, I am getting some results that I do not
> understand. Example:
>
> # devel version of zoo
> install.packages("zoo", repos = "http://r-forge.r-project.org")
> library(zoo)
>
> DF1 = data.frame(read.table(textConnection("    B  C  D  E  F  G
> 8025  1995  0  4  1  2
> 8025  1997  1  1  3  4
> 8026  1995  0  7  0  0
> 8026  1996  1  2  3  0
> 8026  1997  1  2  3  1
> 8026  1998  6  0  0  4
> 8026  1999  3  7  0  3
> 8027  1997  1  2  3  9
> 8027  1998  1  2  3  1
> 8027  1999  6  0  0  2
> 8028  1999  3  7  0  0
> 8029  1995  0  2  3  3
> 8029  1998  1  2  3  2
> 8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE))
>
> a <- read.zoo(DF1, split = 1, index = 2, FUN = identity)
> sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA
> b <- rollapply(a, 3,  sum.na, align = "right", partial = TRUE)
> newDF <- lapply(1:nrow(b), function(i)
>       prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE,
>               dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1))
> names(newDF) <- time(a)
> c<-lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2))))
>
> Now I would like the elements e in c to be equal to 1-e. However,
>
> c<-lapply(newDF, function(mat) 1 - tcrossprod(mat / sqrt(rowSums(mat^2))))
>
> gives a value  of 2.220446e-16 for as.data.frame(c['1999'])[2,2] instead of
> 0
>
> What am I doing wrong here? Thanks a lot!
>

See FAQ 7.31 at:

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
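
In short, tiny non-zero values like 2.220446e-16 are floating-point fuzz;
compare with a tolerance instead of expecting exact zeros. A sketch:

x <- 0.1 + 0.2 - 0.3        # mathematically zero
x == 0                      # FALSE: x is about 5.6e-17
isTRUE(all.equal(x, 0))     # TRUE -- comparison with a tolerance
zapsmall(c(1, x))           # 1 0 -- zaps the fuzz when other values set the scale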


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

