[R] how to draw random numbers from many categorical distributions quickly?

2011-12-14 Thread Sean Zhang
Dear R helpers,

I have a question about drawing random numbers from many categorical
distributions.

Consider n individuals, each follows a categorical distribution defined
over k categories.
Consider a simple case in which n=4, k=3 as below

catDisMat <-
rbind(c(0.1,0.2,0.7),c(0.2,0.2,0.6),c(0.1,0.2,0.7),c(0.1,0.2,0.7))

outVec <- rep(NA,nrow(catDisMat))
for (i in 1:nrow(catDisMat)){
outVec[i] <- sample(1:3,1, prob=catDisMat[i,], replace = TRUE)
}

I can think of one way to potentially speed this up (in reality, my n is very
large, so speed matters). The approach above samples only one value at a
time. I could have sampled three values at once for c(0.1,0.2,0.7) because it
appears three times. So, with some manipulation, I think I can implement the
idea sample(1:3, 3, prob=c(0.1,0.2,0.7), replace = TRUE) to
improve speed a bit. But I wonder whether there is a better approach for
speed?

Thanks in advance.

-Sean
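
One possible vectorised approach, sketched under the assumption that every row of the probability matrix sums to 1 and that there are at least two categories: draw one uniform number per row and count how many cumulative probabilities it exceeds (inverse-CDF sampling). The helper name rcat_rows is hypothetical and not from any package.

```r
# Hypothetical helper: one categorical draw per row of a probability matrix.
rcat_rows <- function(probMat) {
  k <- ncol(probMat)
  # Row-wise cumulative probabilities (apply returns them column-wise, so transpose).
  cumMat <- t(apply(probMat, 1, cumsum))
  u <- runif(nrow(probMat))
  # Category = 1 + number of cumulative bounds below u. The last column is
  # left out so that rounding error can never produce category k + 1.
  1L + as.integer(rowSums(u > cumMat[, -k, drop = FALSE]))
}

catDisMat <- rbind(c(0.1, 0.2, 0.7), c(0.2, 0.2, 0.6),
                   c(0.1, 0.2, 0.7), c(0.1, 0.2, 0.7))
set.seed(1)
outVec <- rcat_rows(catDisMat)
```

This replaces n calls to sample() with a handful of whole-matrix operations, at the cost of holding an n-by-k matrix of cumulative sums in memory.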

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to efficiently extract elements of a list?

2011-02-07 Thread Sean Zhang
Dear R helper,

I wonder whether there is a quick way to extract some elements from a list.

For a vector we can do the following:

vec <- seq(3)
names(vec) <- LETTERS[1:3]

vec[c(1,3)]
vec[c('A','C')]


But for a list,
test.l <- list(c(1,3),array(NA,c(1,2)),array(0,c(2,3)))
names(test.l) <- LETTERS[1:3]

the following does not work. Is there some command (I was thinking of
do.call) that can do the job?

test.l[[c('A','B')]]
test.l[[c(1,3)]]
do.call('[',c(test.l,c(1,3)))
do.call('[[',c(test.l,c(1,3)))
do.call('[',c(test.l,c('A','C')))
do.call('[[',c(test.l,c('A','C')))

Thanks in advance.

-Sean
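
For comparison, a minimal sketch of how base R subsetting treats lists: single brackets return a sub-list and accept several positions or names, while double brackets extract exactly one element. The variable names byPos, byName and firstElt are illustrative only.

```r
test.l <- list(c(1, 3), array(NA, c(1, 2)), array(0, c(2, 3)))
names(test.l) <- LETTERS[1:3]

# Single brackets: a sub-list, selected by position or by name.
byPos  <- test.l[c(1, 3)]
byName <- test.l[c("A", "C")]

# Double brackets: the contents of one element.
firstElt <- test.l[["A"]]
```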



[R] using character vector as input argument to setkey (data.table package)

2011-02-06 Thread Sean Zhang
Dear R helpers,

I wonder how to use a character vector as an input argument to setkey
(data.table package).
The following works:

library(data.table)
test.dt <- data.table(expand.grid(a=1:30,b=LETTERS),c=seq(30*26))
setkey(test.dt,a,b)

I would like a similar function that can accept c('a','b') as an input
argument, as below:
setkey.wanted(test.dt,c('a','b'))

Your help will be highly appreciated.

-Sean
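
A short sketch, assuming the data.table package is installed: setkeyv() is the variant of setkey() that takes the key columns as a character vector. The variable keyCols is introduced here only for illustration.

```r
library(data.table)

test.dt <- data.table(expand.grid(a = 1:30, b = LETTERS), c = seq(30 * 26))

# Column names held in a character vector, e.g. built programmatically.
keyCols <- c("a", "b")
setkeyv(test.dt, keyCols)

# key() reports the columns the table is currently keyed on.
key(test.dt)
```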



[R] how to cut a multidimensional array along a chosen dimension and store each piece into a list

2011-01-17 Thread Sean Zhang
Dear R-Helpers,

I wonder whether there is a function which cuts a multidimensional array
along a chosen dimension and then stores each piece (still an array, with one
dimension fewer) in a list.
For example,

# I made a point of setting the length of the first dimension to 1, to test
# whether I need to worry about the drop=F option.
arr <- array(seq(1*2*3*4),dim=c(1,2,3,4))

brkArrIntoListAlong <- function(arr,alongWhichDim){

return(outlist)
}

I have tried splitter_a in the plyr package but did not get what I want.

library(plyr)
plyr:::splitter_a(arr,3)

I understand that I can write a for loop to make it happen but I am
searching for a better solution.

Thanks in advance.

-Sean
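
One way the stub above could be filled in with base R only, offered as a sketch rather than the definitive answer: build the index list for `[` programmatically, subset with drop = FALSE, and then remove only the dimension that was split so that other length-1 dimensions survive.

```r
# Hypothetical completion of the stub from the post.
brkArrIntoListAlong <- function(arr, alongWhichDim) {
  d <- dim(arr)
  lapply(seq_len(d[alongWhichDim]), function(i) {
    # TRUE selects everything in a dimension; only the chosen one gets index i.
    idx <- rep(list(TRUE), length(d))
    idx[[alongWhichDim]] <- i
    piece <- do.call(`[`, c(list(arr), idx, list(drop = FALSE)))
    # Remove just the split dimension, keeping any other length-1 dimensions.
    array(piece, dim = d[-alongWhichDim])
  })
}

arr <- array(seq(1 * 2 * 3 * 4), dim = c(1, 2, 3, 4))
outlist <- brkArrIntoListAlong(arr, 3)
```

With this input, outlist has three elements, each of dimension 1 x 2 x 4.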



[R] question about deparse(substitute(...))

2011-01-13 Thread Sean Zhang
Dear R helpers:

I would like to apply deparse(substitute()) to multiple arguments to collect
the names of the arguments into a character vector.
I used the function test.fun below. It works when there is only one input
argument, but it does not work for multiple arguments. Can someone kindly
help?

test.fun <- function(...){deparse(substitute(...))}
test.fun(x) #this works
test.fun(x,y,z) # I would like c('x','y','z') to be the output, but cannot get it.

Thanks in advance.

-Sean
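
A possible sketch: capture the whole argument list as an unevaluated call and deparse each element separately. It assumes each argument deparses to a single string, which holds for plain names like x, y and z.

```r
test.fun <- function(...) {
  # substitute(list(...)) is the unevaluated call list(x, y, z); dropping its
  # first element (the symbol `list`) leaves one entry per argument.
  args <- as.list(substitute(list(...)))[-1]
  vapply(args, deparse, character(1))
}

argNames <- test.fun(x, y, z)
```

Because nothing is evaluated, x, y and z do not need to exist.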



[R] how to replace a single forward slash with a double backward slash in a string?

2009-12-13 Thread Sean Zhang
Dear R-helpers.

Can someone kindly tell me how to replace a single forward slash with double
backward slash in a string?

i.e.,  from  a/b to a\\b

Many thanks in advance.

-Sean
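
A small sketch of the escaping involved, since it is easy to miscount: in R source code each backslash character is typed as two, and in a regex replacement string a backslash must itself be escaped. It is an assumption here that either one or two literal backslashes may be wanted, so both are shown.

```r
x <- "a/b"

# The replacement "\\\\" is two backslash characters, which gsub() turns
# into ONE literal backslash in the result.
oneBackslash <- gsub("/", "\\\\", x)

# Eight typed backslashes are four characters, giving TWO literal backslashes.
twoBackslashes <- gsub("/", "\\\\\\\\", x)

# cat() shows the actual characters; print() would show them escaped.
cat(oneBackslash, twoBackslashes, sep = "\n")
```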



[R] how to replace a single backward slash with a double backward slash?

2009-12-13 Thread Sean Zhang
Dear R-helpers:

Hours ago, I asked how to replace a single forward slash with a double
backward slash and received great help. Thanks again to everyone who replied.

In the meantime, I wonder how to replace a single backward slash with a
double backward slash?

e.g., I want to change c:\test into c:\\test

I tried the following, but it does not work.
gsub(\\\,,c:\test)

Can someone help?

Thanks a lot in advance.

-Sean



Re: [R] how to replace a single backward slash with a double backward slash?

2009-12-13 Thread Sean Zhang
David and William,
Thanks for your replies, which introduced me to the concept of escape
characters.

As David guessed, I was trying to write a function which will
accept a path copied from Windows Explorer,
and as you know Windows Explorer uses \.

e.g., c:\temp\function.r

I originally wanted the function to be able to change the example path
into c:/temp/function.r
David's final comment seems to suggest this is impossible...
If so, it is a limitation, because I have to manually change \ into /
each time.
But it is good to know about this limitation.

Correct me if I misunderstand and there is no such limitation.

Thanks again.

-Sean
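
For readers on later versions of R, a hedged sketch: raw string literals, available from R 4.0.0 (so not at the time of this thread), let a Windows path be pasted into code without escape processing, after which the slashes can be translated. The variable names are illustrative.

```r
# Raw string literal (R >= 4.0.0): backslashes inside r"(...)" are kept as typed.
winPath <- r"(c:\temp\function.r)"

# chartr() translates character by character; "\\" is a single backslash.
unixStyle <- chartr("\\", "/", winPath)
unixStyle
```

Text that arrives at run time, for example typed in response to readline(), is not parsed for escapes either, so the same chartr() call applies to it.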








On Sun, Dec 13, 2009 at 5:26 PM, David Winsemius dwinsem...@comcast.net wrote:


 On Dec 13, 2009, at 5:11 PM, Sean Zhang wrote:

  Dear R-helpers:

 Hours ago, I asked how to replace a single forward slash with a double
 backward slash and recieved great help. Thanks again for all the repliers.

 In the meantime, I wonder how to replace a single backward slash with a
 double backward slash?

 e.g., I want change c:\test into c:\\test

 I tried the following but does not work.
 gsub(\\\,,)

 Can someone help?


 Your problem may be that you think there actually is a \ in c:\test.
 There isn't:

 > grep("\\\\", "c:\test")  # which would have found a true \
 integer(0)

 It's an escaped t, which is the tab character = \t:

 > grep("\\\t", "c:\test")
 [1] 1
 > cat("rr\tqq")
 rr  qq

 If your goal is to make file paths in Windows correctly, then you have two
 choices:

 a) use doubled \\'s in the literal strings you type, or ...
 b) use /'s

 So maybe you should explain what you are doing? We don't request that
 background out of nosiness, but rather so we can give better answers

 --

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT





[R] how to convert character string with only month and year into date

2009-09-22 Thread Sean Zhang
Dear R helpers.

I am new to plotting time data using R
and wonder how to convert character time information into dates in R.
I searched the web but did not find an answer.

The input character string is something like 03_1993 or 03-1993, so the
precision is at the month level.  I tried the following but failed.
#R code below.

strptime(c("03_1993"),"%m_%Y")
strptime(c("03-1993"),"%m-%Y")

Can someone kindly show me how to do it?

Many thanks in advance!

-Sean
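
A base-R sketch, assuming it is acceptable to represent each month by its first day: a Date needs a day component, so one is prepended before parsing. The sample values and variable names are illustrative.

```r
x <- c("03_1993", "11-1994")

# Normalise the separator, then prepend day 01 so a complete date can be parsed.
monthStart <- as.Date(paste0("01-", sub("_", "-", x, fixed = TRUE)),
                      format = "%d-%m-%Y")

# Format back to year-month for axis labels and the like.
yearMonth <- format(monthStart, "%Y-%m")
```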



Re: [R] how to convert character string with only month and year into date

2009-09-22 Thread Sean Zhang
David, Gabor, and Henrique,
Thanks a lot for your help!
Another (inelegant) way is to use ts() and then supply the start and end
times. This inelegant way works (I guess at least for equally spaced data).

-Sean



On Tue, Sep 22, 2009 at 3:19 PM, David Winsemius dwinsem...@comcast.net wrote:


 On Sep 22, 2009, at 3:03 PM, Sean Zhang wrote:

 Dear R helpers.

 I am new to plotting time data using R.
 wonder how to convert character time info into date in R.
 I searched over the web but did not find answer.

 the input character string is something like 03_1993 or 03-1993, so the
 precision is at month level.  I tried the following but failed.
 #R code below.

 strptime(c("03_1993"),"%m_%Y")
 strptime(c("03-1993"),"%m-%Y")

 Can you someone kindly show me to do it?


 The usual R classes do not have a year-month version but package zoo does:

  library(zoo)

  as.yearmon("03_1993","%m_%Y")
 [1] Mar 1993


 --

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT





[R] why need a new database to store results generated from another database in filehash?

2009-08-19 Thread Sean Zhang
Dear R-Helpers:
I am trying filehash and would like to know whether I have to create a new
database to store results generated from another database.
Example code is presented below to show the question.

#R code below

library(filehash)
dbCreate("myDB1")
db1 <- dbInit("myDB1")
dbDelete(db1,"a")
dbInsert(db1, "a", data.frame(id=I(LETTERS[1:3])))
dbInsert(db1, "b", data.frame(id=I(LETTERS[2:3])))

#the following line does not work; a_and_b will not be created
dbInsert(db1,"a_and_b",merge(db1$a,db1$b,by='id'))
dbList(db1)
#however, a new database (db2) can store a_and_b
dbCreate("myDB2")
db2 <- dbInit("myDB2")
dbInsert(db2,"a_and_b",merge(db1$a,db1$b,by='id'))
dbList(db2)
db2$a_and_b

#R code above 

Is it possible to make dbInsert(db1,"a_and_b",merge(db1$a,db1$b,by='id'))
work?
I am interested in avoiding creating db2.

Many thanks in advance!

-Sean



[R] how to pass more than one argument to the function called by lapply?

2009-08-18 Thread Sean Zhang
Dear R helpers:

I wonder how to pass more than one argument to the function called by
lapply.
For example,

#R code below ---

indf <- data.frame(id=I(c('a','b')),y=c(1,10))
#I want to add an additional argument, cutoff, to the function called by
#lapply.
outside.fun <- function(indf, cutoff)
{
 unlist(lapply(split(indf, indf[,'id']), function(.x, cutoff) {.x[,'y'] 
cutoff} ))
}
#but the next line does not work
outside.fun(indf,3)

#as you would expect, hard-coding the cutoff works as below, but I do not like
#hard coding.
outside.fun.hardcode.cutoff <- function(indf, cutoff)
{
 unlist(lapply(split(indf, indf[,'id']), function(.x, cutoff) {.x[,'y']  3}
))
}
outside.fun.hardcode.cutoff(indf,)

#R code above

So, can someone kindly show me how to pass more than one argument to the
function called by lapply?

Many thanks in advance.

-Sean
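
A minimal sketch of one common pattern: any arguments given to lapply() after FUN are handed on to FUN for every element. The comparison operator is missing from the archived code, so the ">" below is an assumption, and the id column is a plain character vector to keep the sketch simple.

```r
indf <- data.frame(id = c("a", "b"), y = c(1, 10))

outside.fun <- function(indf, cutoff) {
  # cutoff = cutoff after the function is forwarded to it on each call.
  # The ">" comparison is assumed; the original operator was lost.
  unlist(lapply(split(indf, indf[, "id"]),
                function(.x, cutoff) .x[, "y"] > cutoff,
                cutoff = cutoff))
}

res <- outside.fun(indf, 3)
```

An alternative is to drop cutoff from the inner function's argument list altogether, in which case the inner function finds the outer cutoff through lexical scoping.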



[R] how to use do.call together with cbind and get inside a function

2009-07-26 Thread Sean Zhang
Dear R-helpers:
I have a question related to using do.call to call cbind and get.

#the following works
vec1 <- c(1,2)
vec2 <- c(3,4)
ColNameVec <- c('vec1','vec2')
mat <- do.call(cbind,lapply(ColNameVec,get))
mat

#put code above into a function then it does not work
#before doing so, first remove vec1 and vec2 from global environment
rm(vec1,vec2)

test <- function()
{
vec1 <- c(1,2)
vec2 <- c(3,4)
ColNameVec <- c('vec1','vec2')
mat <- do.call(cbind,lapply(ColNameVec,get))
return(mat)
}
test()

In my task, I have to run do.call(cbind,lapply(ColNameVec,get))
inside a function, can someone kindly help?

Many thanks in advance!

-Sean
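
A sketch of one way around this: say explicitly which environment to search. When get() is called from inside lapply(), its default search starts somewhere other than the body of test(), so the local vectors are not found. Here mget() is used with the function's own frame; the variable here is illustrative.

```r
test <- function() {
  vec1 <- c(1, 2)
  vec2 <- c(3, 4)
  ColNameVec <- c("vec1", "vec2")
  # Capture this function's own frame and look the names up there.
  here <- environment()
  do.call(cbind, mget(ColNameVec, envir = here))
}

mat <- test()
```

Because mget() returns a named list, the columns of mat are labelled vec1 and vec2.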



[R] Problem with storing a sequence of lmer() model fit into a list

2009-06-22 Thread Sean Zhang
Dear R-helpers:
May I ask a question about storing a number of lmer model fits in a
list?

Basically, I have a for-loop (see towards the bottom of this email).
In the loop, I am very sure that the i-th model fit (i.e., fit_i) is
successfully generated and the character string (i.e., tmp_i) is created
correctly.
The problem stems from the following line in the for-loop:
#trouble-making line below
fit.list[[tmp_i]] <- fit_i


I tried the following example, which stores glm() model fits without a
problem.
#the following code can store glm() model fits in a list
---
x1 <- runif(200)
x2 <- rnorm(200)
y <- x1+x2
testdf <- data.frame(y=y, x1=x1, x2=x2)
indepvec <- c('x1','x2')
fit.list <- NULL
fit_1 <- glm(y~x1,data=testdf)
fit_2 <- glm(y~x2,data=testdf)
fit.list[[paste('fit_',indepvec[1],sep='')]] <- fit_1
fit.list[[paste('fit_',indepvec[2],sep='')]] <- fit_2

So why can I not store lmer() model fits in a list?
Would someone kindly explain to me what the R error message(last line of
this email) really means?
Your kind help will be highly appreciated!

-Sean

#the following for-loop intends to store lmer() random poisson model output
#in a list (fit.list); it does not work
---
fit.list <- NULL
for (i in seq_along(depvar_vec))
{
  #I found that s_sex, ses1 and race are not useful
  fit_i <- lmer(as.formula(gen.ranpoisson.fml.jh(depvar_vec[i],
offsetvar ,factorindepvars,  nonfactorindepvars ,ranintvar )),
family=quasipoisson(link=log),verbose=F, data=indf)
  tmp_i <- paste('ranpoi_', depvar_vec[i], sep='')
  fit.list[[tmp_i]] <- fit_i
  #assign also does not work
  #assign(fit.list$parse(text = tmp_i), fit_i)
 }
---


#R gives the following error message.

Error in fit.list[[tmp_i]] <- fit_i : invalid type/length (S4/0) in vector
allocation
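
One plausible reading of the message, offered as an assumption rather than a confirmed diagnosis: fit.list starts as NULL instead of a list, and lmer fits are S4 objects, so the `[[<-` assignment has no list to put the object into. A sketch that starts from a real list, with a toy S4 class standing in for the fitted model (ToyFit and the outcome names are hypothetical):

```r
# Toy S4 class standing in for a fitted lmer model (hypothetical).
setClass("ToyFit", representation(coef = "numeric"))

depvar_vec <- c("y1", "y2")                      # hypothetical outcome names

# Pre-allocate a genuine list and name its slots up front.
fit.list <- vector("list", length(depvar_vec))
names(fit.list) <- paste("ranpoi_", depvar_vec, sep = "")

for (i in seq_along(depvar_vec)) {
  fit.list[[i]] <- new("ToyFit", coef = as.numeric(i))
}
```

Starting from fit.list <- list() and assigning by name should work the same way.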



[R] how to apply the dummy coding rule in a dataframe with complete factor levels to another dataframe with incomplete factor levels?

2009-06-19 Thread Sean Zhang
Dear R helpers:

Sorry to bother you with a basic question about model.matrix.
Basically, I want to apply the dummy coding rule from a dataframe with
complete factor levels to another dataframe with incomplete factor levels.
I used model.matrix, but could not get what I want.
The following is an example.

#Suppose I have two dataframes, dfA and dfB
dfA=data.frame(f1=factor(c('a','b','c')), f2=factor(c('aa','bb','cc')))
dfB =data.frame(f1=factor(c('a','b','b')), f2=factor(c('aa','bb','bb')))
#dfB's factor variables have fewer levels

#use model.matrix on dfA
(matA <- model.matrix(~f1+f2,data=dfA))
#use model.matrix on dfB
(matB <- model.matrix(~f1+f2,data=dfB))
#I actually would like to dummy code dfB using the dummy coding rule defined in
#model.matrix(~f1+f2,data=dfA)
#matB_wanted is below
(matB_wanted <- rbind(c(1,0,0,0,0),c(1,1,0,1,0),c(1,1,0,1,0)) )
colnames(matB_wanted) <- colnames(matA)
matB_wanted
Can someone kindly show me how to get matB_wanted?
Many thanks in advance!

-Sean
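
A sketch of one approach: model.matrix() builds one dummy column per factor level, so re-declaring dfB's factors with the full level set from dfA gives both data frames the same coding, even for levels that never occur in dfB.

```r
dfA <- data.frame(f1 = factor(c("a", "b", "c")), f2 = factor(c("aa", "bb", "cc")))
dfB <- data.frame(f1 = factor(c("a", "b", "b")), f2 = factor(c("aa", "bb", "bb")))

# Give dfB's factors the complete level set taken from dfA.
dfB$f1 <- factor(dfB$f1, levels = levels(dfA$f1))
dfB$f2 <- factor(dfB$f2, levels = levels(dfA$f2))

matA <- model.matrix(~ f1 + f2, data = dfA)
matB <- model.matrix(~ f1 + f2, data = dfB)
```

The columns for the unused levels (f1c and f2cc) are then present in matB and are all zero.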



[R] question related to fitting overdispersion count data using lmer quasipoisson

2009-04-11 Thread Sean Zhang
Dear R-helpers:
I have a question related to fitting overdispersed count data using lmer.
Basically, I simulate an overdispersed data set by adding an observation-level
normal random shock
into exp(+rnorm()).
Then I fit an lmer quasipoisson model.
The estimation results are way off (see the model output of fit.lmer.over.quasi
below).
Can someone kindly explain to me what went wrong?

Many thanks in advance.

-Sean

#data simulation (modified from code at
#http://markmail.org/message/j3zmgrklihe73p4p)
set.seed(100)
m <- 5
n <- 100
N <- n*m
#X <- cbind(1,runif(N))
X <- cbind(1,rnorm(N))
X <- cbind(runif(N),rnorm(N))
id <- rep(1:n,each=m)
#
Z <- kronecker(diag(n),rep(1,m))
#Poisson with group-level heterogeneity
z <- rpois(N, exp(X%*%matrix(c(1,2)) + Z%*%matrix(rnorm(n))))
#2*rnorm(n*m) is added to each observation to create overdispersion
z.overdis <- rpois(N, exp(X%*%matrix(c(1,2)) + Z%*%matrix(rnorm(n)) +
2*rnorm(n*m)))

#without the observation-level random shock, i.e., 2*rnorm(n*m), the estimates
#are very accurate
(fit.lmer <- lmer(z ~ X + (1|id), family=poisson,verbose=F))
#Generalized linear mixed model fit by the Laplace approximation
#Formula: z ~ X + (1 | id)
# AIC BIC logLik deviance
# 851 868   -422      843
#Random effects:
# Groups Name        Variance Std.Dev.
# id     (Intercept) 0.977    0.988
#Number of obs: 500, groups: id, 100
#
#Fixed effects:
#            Estimate Std. Error z value Pr(>|z|)
#(Intercept)  -0.0128     0.1116    -0.1      0.9
#X1            1.0615     0.0601    17.7   <2e-16 ***
#X2            2.0236     0.0214    94.7   <2e-16 ***
#---
#Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
#Correlation of Fixed Effects:
#   (Intr) X1
#X1 -0.349
#X2 -0.270  0.258




#Now you can see the results are very off
(fit.lmer.over.quasi <- lmer(z.overdis ~ X + (1|id),
family=quasipoisson(link=log),verbose=F))
#Generalized linear mixed model fit by the Laplace approximation
#Formula: z.overdis ~ X + (1 | id)
#   AIC   BIC logLik deviance
# 41867 41888 -20929    41857
#Random effects:
# Groups   Name        Variance Std.Dev.
# id       (Intercept) 175.8    13.26
# Residual              72.9     8.54
#Number of obs: 500, groups: id, 100
#
#Fixed effects:
#            Estimate Std. Error t value
#(Intercept)   1.3530     1.3492    1.00
#X1            1.0834     0.2273    4.77
#X2            1.3501     0.0783   17.25
#
#Correlation of Fixed Effects:
#   (Intr) X1
#X1 -0.099
#X2 -0.055  0.070



[R] Sean / Re: question related to fitting overdispersion count data using lmer quasipoisson

2009-04-11 Thread Sean Zhang
Hey Buddy,

Hope you have been doing well since last contact.

If you have the answer to the following question, please let me know.

If you have a chance to travel up north, let me know.

best,

-Sean


-- Forwarded message --
From: Sean Zhang seane...@gmail.com
Date: Sat, Apr 11, 2009 at 12:12 PM
Subject: question related to fitting overdispersion count data using lmer
quasipoisson
To: r-help@r-project.org
Cc: seane...@gmail.com





[R] How to handle tabular form data in lmer without expanding the data into binary outcome form?

2009-04-10 Thread Sean Zhang
Dear R-gurus:

I have a question about lmer.
Basically, I have a dataset in which each observation records the number of
trials (N) and the number of events (Y) for a given covariate combination (X)
and group id (grp_id).
So, my dataset is in tabular form. (In case my explanation of tabular form
is unclear,
please see the link:
http://www.stat.psu.edu/online/development/stat504/06_logreg/11_logreg_fitmodel.htm
)

My question: what is the lmer syntax for tabular data? (model Y/N=X is
what SAS does, as seen in the link above.)
Specifically, where can I add N (the number of trials) in the following line of
lmer code?
m1 <- lmer(Y ~ X+(1|grp_id), family=binomial(link=logit))
As you may expect, I am trying to avoid expanding the tabular data into
binary (0,1) outcome form, because doing so produces quite a large data
matrix in my study.
A link to a similar question is
https://stat.ethz.ch/pipermail/r-help/2008-May/161072.html
It seems to me that the approach in that link expands the data (they have
only 1600 obs after data expansion).
If someone knows a neat solution other than data expansion, please help.

Many thanks in advance!

-Sean
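
A sketch using base R's glm() to illustrate the usual convention for grouped binomial data: a two-column response of successes and failures, cbind(Y, N - Y). To my understanding the mixed-model fitters in lme4 accept the same response form, e.g. cbind(Y, N - Y) ~ X + (1 | grp_id), but that part is an assumption and is not run here. The data below are simulated purely for illustration.

```r
set.seed(42)
# Tabular data: one row per covariate value, Y events out of N trials.
tab <- data.frame(X = c(0, 1, 2, 3), N = c(20, 25, 30, 25))
tab$Y <- rbinom(nrow(tab), tab$N, plogis(-1 + 0.5 * tab$X))

# Aggregated fit: the response is a matrix of (successes, failures).
fitTab <- glm(cbind(Y, N - Y) ~ X, family = binomial(link = "logit"), data = tab)

# Expanded 0/1 data, built here only to check that the two fits agree.
long <- data.frame(
  X = rep(rep(tab$X, 2), c(tab$Y, tab$N - tab$Y)),
  y = rep(c(1, 0), c(sum(tab$Y), sum(tab$N - tab$Y)))
)
fitLong <- glm(y ~ X, family = binomial, data = long)
```

Both fits maximise the same likelihood, so their coefficients agree up to numerical tolerance while the aggregated version needs only one row per covariate pattern.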



Re: [R] What is the best package for large data cleaning (not statistical analysis)?

2009-03-15 Thread Sean Zhang
Dear Jim:

Thanks for your reply.
It looks to me like you were using batching.
I have used batching to digest large data in Matlab before.
I still wonder about the answers to the two specific questions without
resorting to batching.

Thanks.

-Sean
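
For reference, a base-R sketch of the batching idea discussed in this thread: read a fixed number of lines at a time from an open connection, clean each block, and keep only the cleaned rows. The file, the chunk size and the cleaning rule (dropping rows with a missing value) are all illustrative assumptions.

```r
# Write a small CSV to a temporary file so the sketch is self-contained.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:10, value = c(5, NA, 3, 8, NA, 1, 9, 2, 7, 4)),
          path, row.names = FALSE)

chunkSize <- 4                        # rows per batch (tiny on purpose)
con <- file(path, open = "r")
header <- readLines(con, n = 1)       # keep the header to re-attach to each block
cleaned <- list()
repeat {
  lines <- readLines(con, n = chunkSize)
  if (length(lines) == 0) break       # end of file
  chunk <- read.csv(text = c(header, lines))
  # Example cleaning step: drop rows with a missing value.
  cleaned[[length(cleaned) + 1]] <- chunk[!is.na(chunk$value), ]
}
close(con)
result <- do.call(rbind, cleaned)
```

For files of several gigabytes, each cleaned block would be written back to disk rather than accumulated in memory as it is here.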




On Sat, Mar 14, 2009 at 10:13 PM, jim holtman jholt...@gmail.com wrote:

 Exactly what type of cleaning do you want to do on them?  Can you read
 in the data a block at a time (e.g., 1M records), clean them up and
 then write them back out?  You would have the choice of putting them
 back as a text file or possibly storing them using 'filehash'.  I have
 used that technique to segment a year's worth of data that was
 probably 3GB of text into monthly objects that were about 70MB
 dataframes that I stored using filehash.  These I then read back in to
 do processing where I could summarize by month.  So it all depends on
 what you want to do.

 You could read in the chunks, clean them and then reshape them into
 dataframes that you could process later.  You will still probably have
 the problem that all the data still won't fit in memory.  Now one
 thing I did was that since the dataframes were stored as binary
 objects in filehash, it was pretty fast to retrieve them, pick out the
 data I needed from each month and create a subset of just the data I
 needed that would now fit in memory.

 So it all depends ...

 On Sat, Mar 14, 2009 at 8:46 PM, Sean Zhang seane...@gmail.com wrote:
  Dear R helpers:
 
  I am a newbie to R and have a question related to cleaning large data
 frames
  in R.
 
  So far, I have been using SAS for data cleaning because my data sets are
  relatively large (handling multiple files, each could be as large as 5-10
  G).
  I am not a fan of SAS at all and am eager to move data cleaning tasks
 into R
  completely.
 
  Seems to me, there are 3 options. Using SQL, ff or filehash. I do not
 want
  to learn sql. so my question is more related to ff and filehash.
 
  In specifics,
 
  (1) for merging two large data frames,  which one is better, ff vs.
  filehash?
  (2) for reshaping a large data frame (say from long to wide or the
 opposite)
  which one is better, ff vs. filehash?
 
  If you can provide examples, that will be even better.
 
  Many thanks in advance.
 
  -Sean
 
 



 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?




[R] What is the best package for large data cleaning (not statistical analysis)?

2009-03-14 Thread Sean Zhang
Dear R helpers:

I am a newbie to R and have a question related to cleaning large data frames
in R.

So far, I have been using SAS for data cleaning because my data sets are
relatively large (handling multiple files, each could be as large as 5-10
G).
I am not a fan of SAS at all and am eager to move data cleaning tasks into R
completely.

It seems to me there are 3 options: SQL, ff or filehash. I do not want
to learn SQL, so my question is more about ff and filehash.

Specifically,

(1) for merging two large data frames, which one is better, ff or
filehash?
(2) for reshaping a large data frame (say from long to wide or the opposite),
which one is better, ff or filehash?

If you can provide examples, that will be even better.

Many thanks in advance.

-Sean



Re: [R] Is there any difference between <- and =

2009-03-12 Thread Sean Zhang
Dear Jens and Wacek:

I appreciate your answers very much.

I came up with an example based on your comments.
I feel the example helped me to understand... (I could be missing your points
though :( )
If so, please let me know.
Simon pointed out the following link:
http://www.stat.auckland.ac.nz/mail/archive/r-downunder/2008-October/000300.html
I am still trying to understand it...
My question is how my conclusion (see the end of the example below), drawn
from a lexical scope perspective, is related to
an understanding from an environment perspective (if a valid understanding
from an environment perspective exists).

Thank you all again very much!

-Sean Zhang

#My little example is listed below
f1 <- function(a=1,b=2) {print(a); print(b); print(a-b) }
f1()  #get 3, makes sense
f1(2,) #get 0, makes sense
a <- 10
b <- 20
f1(a=a+1,b=a)
a  #get 10  a is not changed outside function scope
b  #get 20, b is not changed outside function scope
a <- 10
b <- 20
f1(a <- a+1, b <- a)
a   #a is now 11, a is changed outside function
b   #b is now 11  b is changed outside function
a <- 10
b <- 20
f1({a=a+1},{b = a})
a #a is changed into 11
b #b is changed into a(i.e., 11)

a <- 10
b <- 20
f1((a=a+1),(b = a))
a #a is changed into 11
b #b is changed into a(i.e., 11)
#my conclusion based on testing the example above is below
#say the argument is a; when used inside the parentheses of
#whatever.fun <- function()
#a<-something, (a=something), and {a<-something}
#are the same. They all change the values outside the function's scope.
#Typically, this breaks the desired lexical scope convention, so it is
#dangerous.
#Correct me if my understanding is off.
#Also, how should the above test results be interpreted from an environment
#perspective? environment vs. scope?

#big thanks. -Sean
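
On the environment question, a small sketch that may help: an argument expression is evaluated in the frame of the caller, and only when the called function actually uses it. The function names usesArg and caller are made up for this illustration.

```r
usesArg <- function(x) x              # forces its argument

caller <- function() {
  a <- 1
  usesArg(a <- a + 1)                 # the assignment runs in caller()'s frame
  a
}

a <- 100
insideValue <- caller()               # the local a inside caller() became 2
globalValue <- a                      # the a defined here is untouched
```

So the assignment does not escape to wherever the called function was defined; it lands in the environment from which the call was made, which for the tests above happened to be the global environment.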




On Thu, Mar 12, 2009 at 11:29 AM, Jens Oehlschlägel oehl_l...@gmx.de wrote:

 Sean,

  would like to receive expert opinion to avoid potential trouble
 [..]
  i think the following is the most secure way if one really
  really has to do assignment in a function call
 f({a=3})
  and if one keeps this convention, <- can be dropped altogether.

 "Secure" is relative: due to R's lazy evaluation you never know whether
 a function's argument is being evaluated. Look at:

 > f <- function(x)TRUE
 > x <- 1
 > f((x=2)) # obscured attempt to assign in a function call
 [1] TRUE
 > x
 [1] 1

 Thus there is dangerous advice in the referenced blog which reads:
 
 f(x <- 3)
 which means assign 3 to x, and call f with the first argument set to the
 value 3
 
 This might be the case in C but not in R. Actually in R f(x <- 3) means:
 call f with a first unevaluated argument x <- 3, and if and only if f
 decides to evaluate its first argument, then the assignment is done. To make
 this very clear:

 > f <- function(x)if(runif(1)>0.5) TRUE else x
 > x <- 1
 > print(f(x <- x + 1))
 [1] TRUE
 > print(f(x <- x + 1))
 [1] 2
 > print(f(x <- x + 1))
 [1] 3
 > print(f(x <- x + 1))
 [1] TRUE
 > print(f(x <- x + 1))
 [1] 4
 > print(f(x <- x + 1))
 [1] 5
 > print(f(x <- x + 1))
 [1] TRUE
 > print(f(x <- x + 1))
 [1] 6
 > print(f(x <- x + 1))
 [1] TRUE

 Here it is unpredictable whether your assignment takes place. Thus
 assigning like f({x=1}) or f((x=1)) is the most dangerous thing to do:
 even if you have a code reviewer and the guy is aware of the danger of
 f(x<-1), he will probably miss it because f((x=1)) looks too similar to a
 standard call f(x=1).
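
 The random example can also be made deterministic. The sketch below is
 only an illustration; f_ignores and f_uses are hypothetical names, not
 functions from the original message. One function never touches its
 argument and the other forces it, so the effect of lazy evaluation on an
 assignment placed inside a call can be seen directly.

```r
# Hypothetical helpers, introduced only for this illustration.
f_ignores <- function(x) TRUE           # never evaluates its argument
f_uses    <- function(x) { x; TRUE }    # evaluates (forces) its argument

x <- 1
f_ignores(x <- 2)        # the argument expression is never evaluated
x_after_ignored <- x     # the assignment did not happen

f_uses(x <- 2)           # the argument is forced, so the assignment runs
x_after_used <- x        # the assignment happened in the caller
```

 Whether the assignment takes effect depends entirely on the body of the
 called function, which is the reason for the warning above.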

 According to help("<-"), R's assignment operator is <- rather than =:

 "The operators <- and = assign into the environment in which they are
 evaluated. The operator <- can be used anywhere, whereas the operator = is
 only allowed at the top level (e.g., in the complete expression typed at the
 command prompt) or as one of the subexpressions in a braced list of
 expressions."

 So my recommendation is
 1) use R's assignment operator <- with a space on each side (or assign()) and
 don't obscure assignments by using C's assignment operator (or other
 languages' equality operator)
 2) do not assign in function arguments unless you have good reasons like in
 system.time(x <- something)

 HTH


 Jens Oehlschlägel

 P.S. Disclaimer: you can consider me biased towards <-, never trust
 experts, whether experienced or not.

 P.P.S. a puzzle, following an old tradition:

 What is going on here? (and what would you need to do to prove it?)

  > search()
 [1] ".GlobalEnv"        "package:stats"     "package:graphics"
     "package:grDevices" "package:utils"     "package:datasets"
     "package:methods"
 [8] "Autoloads"         "package:base"
  > ls(all.names = TRUE)
 [1] "y"
  > y
 [1] 1 2 3
  > identical(y, 1:3)
 [1] TRUE
  > y[] <- 1  # assigning 1 fails
  > y
 [1] 1 2 3
  > y[] <- 2  # assigning 2 works
  > y
 [1] 2 2 2
 
  # Tip: no standard packages modified, no extra packages loaded, neither
 classes nor methods defined, no print methods hiding anything, if you would
 investigate my R you would not find any false bottom anymore
 
  version
   _
 platform   i386-pc-mingw32
 arch   i386
 os mingw32
 system

[R] Is there any difference between <- and =

2009-03-11 Thread Sean Zhang
Dear R-helpers:

I have a question related to <- and =.

I saw very experienced R programmers use = rather than <- quite
consistently.
However, I heard from others that one should not use = but always stick to <-
when assigning values.

I personally like = because I was using MATLAB, but I would like to receive
expert opinion to avoid potential trouble.

Many thanks in advance.

-Sean
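
One concrete difference shows up inside a function call, where = names an
argument while <- performs an assignment. The sketch below is only an
illustration under that assumption; the environment env and the variable
names are hypothetical and chosen so the effect can be checked without
touching the workspace.

```r
# A scratch environment, so the check does not depend on the workspace.
env <- new.env()

# `=` inside a call names the argument x of mean(); nothing is assigned.
m1 <- evalq(mean(x = c(1, 5, 9)), env)
created_by_equals <- exists("x", envir = env, inherits = FALSE)

# `<-` inside a call is an assignment expression passed as the argument.
# mean() evaluates it, so x is created in the calling environment.
m2 <- evalq(mean(x <- c(1, 5, 9)), env)
created_by_arrow <- exists("x", envir = env, inherits = FALSE)
```

Both calls return the same mean; only the second one leaves a variable x
behind. At the top level, outside a call, the two operators behave alike.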

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to write a function that accepts unlimited number of input arguments?

2009-03-09 Thread Sean Zhang
Dear R-helpers:
I am an R newbie and have a question related to writing functions that
accept an unlimited number of input arguments.
(I tried to peek into functions such as paste and cbind, but failed; I
cannot see their code.)

Can someone kindly show me through a summation example?
Say we have the input scalars 1 2 3 4 5;
then the ideal function, say sum.test, can do
(1+2+3+4+5)==sum.test(1,2,3,4,5)

sum.test should also work as the number of input scalars changes.

Many thanks in advance!

-sean
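
A minimal sketch of the requested sum.test, assuming the arguments are
numeric scalars. It collects the arguments with ... and folds them with
Reduce(); the starting value 0 is an assumption made here so that a call
with no arguments returns 0 rather than NULL.

```r
# sum.test accepts any number of arguments through `...`.
sum.test <- function(...) {
  args <- list(...)        # gather all arguments into a list
  Reduce(`+`, args, 0)     # add them one by one, starting from 0
}

total <- sum.test(1, 2, 3, 4, 5)
```

For plain numbers sum(...) already does this; the Reduce() form is shown
because it generalises to other binary functions such as append.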



Re: [R] How to write a function that accepts unlimited number of input arguments?

2009-03-09 Thread Sean Zhang
Big thanks for your help and suggestion for email communication.
-s

On Mon, Mar 9, 2009 at 12:18 PM, baptiste auguie <ba...@exeter.ac.uk> wrote:


  On 9 Mar 2009, at 16:04, Sean Zhang wrote:

  Dear Baptiste:

 Many thanks for your help!

 Using the Reduce way, it works almost perfectly.
 I ran into this problem when thinking of appending vectors.
 Is it possible to not use list() within add()
 so add(vec1,vec2,vec3) below can work?


 add <- function(...) Reduce("+", list(...))
 add(1, 2, 3)


  Also, do you have some  quick hints on using '...'?

 Many Thanks in advance.



 I'm not sure of a good reference for this. I'd strongly suggest you read
 the Introduction to R manual ( also check the R project webpage for many
 other resources).

 Also, it'd be better if you could Cc R-help next time you ask for further
 information.

 Hope this helps,

 baptiste


 vec1 <- c(0,1)
 vec2 <- c(2,3)
 vec3 <- c(4,5)
 add <- function(x) Reduce(append, x)
 add(list(vec1, vec2))
 #add(vec1,vec2) does not work at the moment

 -sean



 On Mon, Mar 9, 2009 at 11:50 AM, baptiste auguie <ba...@exeter.ac.uk> wrote:

 Hi,

  On 9 Mar 2009, at 15:32, Sean Zhang wrote:

  Dear R-helpers:
 I am an R newbie and have a question related to writing functions that
 accept unlimited number of input arguments.


 it's usually through the ... argument, e.g in paste(...).

  (I tried to peek into functions such as paste and cbind, but failed, I
 cannot see their codes..)


 simply type their name in the R prompt

  > paste
  function (..., sep = " ", collapse = NULL)
 .Internal(paste(list(...), sep, collapse))
 <environment: namespace:base>

 etc...

 but that's not very useful here.

   Can someone kindly show me through a summation example?
 Say, we have input scalar,  1 2 3 4 5
 then the ideal function, say sum.test, can do
 (1+2+3+4+5)==sum.test(1,2,3,4,5)


 see ?Reduce for one way to do this:

 add <- function(x) Reduce("+", x)

 add(list(1, 2, 3))


 Also sum.test can work as the number of input scalar changes.

 Many thanks in advance!

 -sean



  _

 Baptiste Auguié

 School of Physics
 University of Exeter
 Stocker Road,
 Exeter, Devon,
 EX4 4QL, UK

 Phone: +44 1392 264187

 http://newton.ex.ac.uk/research/emag
 __








[R] How to make warning message colorful (or have sound)?

2009-02-25 Thread Sean Zhang
Dear R-helpers:

I am new to R and wonder how to make a warning message colorful (if
possible, having sound is also welcome). I did some research and failed
to find options that allow this. Is this a technical limitation so
far, or did I miss some information?

Many thanks in advance.

-sean
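
One possible approach, sketched under the assumption that R runs in a
terminal that understands ANSI escape codes and the bell character (many
GUIs do not): catch the warning with withCallingHandlers() and re-emit it
in red followed by "\a". The names colour_warning and loud are hypothetical
helpers, not existing R functions.

```r
# Wrap a message in ANSI red and append the terminal bell character.
colour_warning <- function(msg) {
  paste0("\033[31m", "Warning: ", msg, "\033[0m", "\a")
}

# Evaluate expr; print any warning in colour instead of the default way.
loud <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      message(colour_warning(conditionMessage(w)))
      invokeRestart("muffleWarning")   # suppress the standard warning
    }
  )
}

res <- loud({
  warning("something looks odd")
  42
})
```

The computation continues after the warning, so res still receives the
value of the expression.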



[R] how to NULL multiple variables of a df efficiently?

2009-02-24 Thread Sean Zhang
Dear R-helpers:

I am an R novice and would appreciate answer to the following question.

Want to delete many variables in a dataframe.
Am able to delete one variable by assigning it as NULL
Have a large number of variables and would like to delete them without using
a for loop.

Is there a command/function which does this job?

Many thanks in advance.

-Sean


#Small Example:

df <- data.frame(var.a=rnorm(10), var.b=rnorm(10),var.c=rnorm(10))
df[,'var.a'] <- NULL   #this works for one single variable
df[,c('var.a','var.b')] <- NULL  #does not work for multiple variables
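
Two loop-free ways to drop several columns, sketched on a small data frame
with fixed values. The vector name to.drop is hypothetical. Note that in
the example above var.a has already been removed by the time the second
assignment runs, which may be part of why that line fails.

```r
df <- data.frame(var.a = 1:3, var.b = 4:6, var.c = 7:9)
to.drop <- c("var.a", "var.b")

# Option 1: keep only the columns whose names are not in to.drop.
kept <- df[, setdiff(names(df), to.drop), drop = FALSE]

# Option 2: list-style indexing accepts several names at once.
df[to.drop] <- NULL
```

Option 1 leaves the original untouched and returns a copy; option 2
modifies df in place.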



[R] How to transfer a list of space delimited character elements into a char vector?

2009-02-20 Thread Sean Zhang
My dear R-helpers:

I am a novice in R and have the following text string manipulation question.

Is there a function that performs the job described below?
Say,
wanted_output <- c("ab", "cd", "ef")
#the function_wanted can generate c("ab", "cd", "ef") using "ab cd ef" as the
#single input argument
wanted_output <- function_wanted("ab cd ef")


Motivation: I have a very long list of character elements (like, ab cd ef gg
ww kwfl ..),

I try to avoid
typing "," between two adjacent elements,
typing " in front of the first element,
and typing " right after the last element
when using them to generate a character vector.

Many Thanks in advance.

-Sean
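
A minimal sketch of function_wanted, assuming the elements are separated
by one or more whitespace characters and contain no spaces themselves. It
relies on strsplit() and trimws() from base R.

```r
# Split a single string on runs of whitespace into a character vector.
function_wanted <- function(s) {
  strsplit(trimws(s), "[[:space:]]+")[[1]]
}

wanted_output <- function_wanted("ab cd ef gg ww kwfl")
```

strsplit() returns a list with one element per input string, hence the
[[1]] to pull out the vector for the single input.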



[R] How to apply table() on subdata and stack outputs

2009-02-11 Thread Sean Zhang
Dear R helpers:

I am an R novice and have a question about using table() to extract
frequencies over many sub-datasets.
A small example input dataframe and the wanted output dataframe are provided
below. The real data is very large, so a for loop is what I try to avoid.

Can someone enlighten me on how to use sapply or the like to achieve it?

Many thanks in advance!

-Sean

#example input dataframe
id <- c('tom', 'tom', 'tom', 'jack', 'jack', 'jack', 'jack')
var_interest <- c("happy","unhappy", "", "happy", "unhappy", 'soso','happy')
input.df <- data.frame(id=id, var_interest=var_interest)
input.df

#output dataframe I want
id_unique <- c('tom','jack')
happy_freq <- c(1,2)
unhappy_freq <- c(1,1)
soso_freq <- c(0,1)
miss_freq <- c(1,0)
output.df <- data.frame(id_unique=id_unique, happy_freq=happy_freq,
unhappy_freq=unhappy_freq, soso_freq=soso_freq, miss_freq=miss_freq)
output.df
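
A two-way table() may already give the wanted counts without any loop or
sapply. The sketch below uses the example data; recoding the empty string
to "miss" is an assumption made here so that the column has a usable name.

```r
id <- c('tom', 'tom', 'tom', 'jack', 'jack', 'jack', 'jack')
var_interest <- c("happy", "unhappy", "", "happy", "unhappy", "soso", "happy")

# Give the empty answers a label before tabulating.
var_interest[var_interest == ""] <- "miss"

# Rows are ids, columns are the distinct answers, cells are counts.
freq <- table(id, var_interest)

# Convert the table to a data frame with one row per id.
freq.df <- as.data.frame.matrix(freq)
```

The rows come out in sorted order (jack, then tom), and the ids are stored
as row names rather than as a column.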



[R] 3d scatter plot with both error bars and a flexibly fitted surface

2009-01-23 Thread Sean Zhang
Dear R-helpers:

 I, an entry-level R user, wonder how to make a 3d scatter plot with both error
bars and a flexibly fitted surface.

 Can anyone enlighten me?

 Many Thanks in advance.

-Sean



[R] misalignment of x-axis when overlaying two plots using latticeExtra

2009-01-14 Thread Sean Zhang
Dear R-helpers:
I am an entry-level R user and have a question related to overlaying a
barchart and an xyplot using latticeExtra.
My problem is that when I overlay them I fail to align their x-axes.
I show my problem below through an example.

#the example data frame is provided below

   vec <- c(1,5.056656,0.5977967,0.06126587,0.08557778,
 2,4.601049,0.5995989,0.05002188,0.11410027,
 3,4.932008,0.5502283,0.06727938,0.12531825,
 4,4.763798,0.5499489,0.06473846,0.10752641,
 5,4.944967,0.5328129,0.05445327,0.13663951,
 6,5.063504,0.5267245,0.06477738,0.12380332,
 7,4.735251,0.5528205,0.06851714,0.12196075,
 8,5.141733,0.5304151,0.07965567,0.15123277,
 9,5.215678,0.5219224,0.06694207,0.16476356,
10,4.930439,0.5712519,0.08591549,0.09710933,
11,5.075990,0.5615573,0.05778996,0.15361845,
12,4.909847,0.5683740,0.08711699,0.11189277,
13,4.863164,0.5652511,0.0727,0.12071060,
14,5.173818,0.5564918,0.09830620,0.11831926,
15,4.762325,0.5345888,0.08792658,0.11738642,
16,5.046225,0.5268459,0.09574746,0.13254236,
17,4.902188,0.5370394,0.07194955,0.13164327,
18,4.865935,0.5446562,0.06894994,0.12645103,
19,5.204060,0.5650887,0.06726925,0.09242551,
20,5.208138,0.5765187,0.09282935,0.11053842)

df <- as.data.frame(t(matrix(vec,nrow=5,ncol=20)))
names(df) <- c("group","outcome","proportion_1","proportion_2","proportion_3")

library(latticeExtra)
library(lattice)

#First generate barchart to plot the 3 proportions
prop.data <- subset(df,select=c(proportion_1,proportion_2,proportion_3))
prop.tab <- as.table(as.matrix(prop.data))
barchart.obj <- barchart(prop.tab, stack=TRUE, horizontal = FALSE)
#Second, generate the dots of outcome (I could have used type="l" but using
#type="p" makes the misalignment of the x-axis more obvious).
dot.outcome <- xyplot(outcome~group,df,type="p", col="blue")

#Last, overlay the two plots
barchart.obj + as.layer(dot.outcome,style=2,axes=c("y"), outside=TRUE)
#Now, you should be able to see that the x-axes of the two plots do not
#match, i.e., a dot is not at the center of its corresponding bar.

How can I fix this?

Your help will be highly appreciated. Many thanks in advance.

-Sean



[R] Object name vectcor as function input argument?

2008-12-27 Thread Sean Zhang
Dear R-helpers:

I am new to R and ran into the following question and would appreciate
your advice very much.

My question: How to use a character vector that records object names as
a function input argument?

I asked this question very recently and was advised to use get(). get()
works when passing one single object name,
but it does not work when passing multiple object names.

For example, I want to rbind many dfs into one df.
Below, I use 3 data frames for illustration.
df.1 <- data.frame(v1=rnorm(5), v2=rnorm(5))
df.2 <- data.frame(v1=rnorm(5), v2=rnorm(5))
df.3 <- data.frame(v1=rnorm(5), v2=rnorm(5))

all.dfs <- c("df.1","df.2","df.3")
# all.dfs is a character vector recording all object names and I would
# like to use all.dfs as an input argument for a function that performs rbind

# The following works, but I do not know how to use all.dfs as its input
# argument
output <- do.call(rbind,list(df.1,df.2,df.3))

# The desired function has the following form:

output <- desired.function(all.dfs)


# Show some homework I have done below:
# I tried the following things and they do not work
do.call(rbind,list(all.dfs))

one.string <- paste(all.dfs,collapse=",")

do.call(rbind,list(one.string))
do.call(rbind,list(get(one.string)))
do.call(rbind,list(parse(one.string)))
# By the way, the following loop.fun works but it is not what I like because
# I may have a large number of dfs
loop.fun <- function (all.dfs)
{
for (i in 1:length(all.dfs) )
ifelse ( i==1, output <- get(all.dfs[i]), output <-
rbind(output,get(all.dfs[i])) )
return(output)
}

output <- loop.fun(all.dfs)



#Your help is highly appreciated. Many thanks in advance.

-Sean Zhang, Ann Arbor
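
One way to write the desired function is to call get() once per name with
lapply() and hand the resulting list to do.call(rbind, ...). This is a
sketch only: stack_by_name is a hypothetical name, the data frames use
fixed values instead of rnorm(), and the envir argument is an assumption
added so the objects are looked up where the function is called from.

```r
# Look up each object by name, then row-bind all of them in one call.
stack_by_name <- function(obj.names, envir = parent.frame()) {
  force(envir)                                  # fix the lookup environment
  objs <- lapply(obj.names, get, envir = envir) # list of data frames
  do.call(rbind, objs)
}

df.1 <- data.frame(v1 = 1:2, v2 = 3:4)
df.2 <- data.frame(v1 = 5:6, v2 = 7:8)
df.3 <- data.frame(v1 = 9:10, v2 = 11:12)
all.dfs <- c("df.1", "df.2", "df.3")

output <- stack_by_name(all.dfs)
```

get() takes a single name, which is why pasting the names into one string
did not help; the list of objects has to be built first.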



[R] How to skip re-installing CRAN packages when updating R?

2008-12-24 Thread Sean Zhang
Dear R-helpers:

I am new to R and would like to seek your expert opinion on an installation
tip. Many thanks in advance.
I want to update my R to the newest version and have the following two
questions:

Question 1:
How can I install R and its contributed packages in such a way that, when
updating R in the future, I do NOT need to
re-install the contributed packages used by the previous version of R?

Question 2:
Is it an OK practice to just install all the CRAN packages (i.e.,
install.packages(available.packages()[,1]) )? Does anyone do so?
The reason I ask the second question is that if installing all available
packages does not consume too much time (say less than 2 hours) or too many
computer resources (I have a big hard drive, so disk space is probably not a
concern; I guess computing speed will not be affected, but I am not sure),
then I do not need to bother with Question 1 and will just install all
available packages when updating R.

Many Thanks in advance.

Merry Christmas!

-Sean
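
For Question 1, one common arrangement is to keep contributed packages in
a library directory outside the R installation and put it first on the
library search path. The sketch below only illustrates the mechanism: the
directory under tempdir() is a hypothetical stand-in for a permanent
location, which could instead be set through the R_LIBS_USER environment
variable.

```r
# Hypothetical location; a real setup would use a permanent directory.
shared.lib <- file.path(tempdir(), "R-library")
dir.create(shared.lib, recursive = TRUE, showWarnings = FALSE)

# Put the shared library first, so it is searched and installed into first.
.libPaths(c(shared.lib, .libPaths()))
first.lib <- .libPaths()[1]

# install.packages("somepkg") would now install into first.lib.
# After upgrading R, packages built under the old version can be refreshed
# with update.packages(checkBuilt = TRUE, ask = FALSE).
```

Packages kept this way survive an upgrade of R itself, although some may
still need to be rebuilt for the new version.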



[R] quotation problem/dataframe names as function input argument.

2008-12-23 Thread Sean Zhang
Dear R friends:

Can someone help me with the following problem? Many thanks in advance.

# Problem Description:
# I want to write functions which take a (character) vector of dataframe
# names as input argument.
# For example, I want to extract the number of observations from a number of
# dataframes.
# I tried the following:

nobs.fun <- function (dframe.vec)
{
  nobs.vec <- array(NA,c(length(dframe.vec),1))

  for (i in 1:length(dframe.vec))
  {
  nobs.vec[i] <- dim(dframe.vec[i])[1]
  }

  return(nobs.vec)
}

# To show the problem, I create a fake dataframe and store its name (i.e.,
# "dframe.1") in a vector (i.e., dframe.vec) of length 1.

# creation of fake dataframe
dframe.1 <- as.data.frame(matrix(seq(1:2),c(1,2)))
# store the dataframe name into a vector using c() function
dframe.vec <- c("dframe.1")

# The problem is that the following line does not work
nobs.fun(dframe.vec)

# It seems to me that the problem stems from the fact that dframe.vec[1] is
# interpreted by R as "dframe.1" (note: it is quoted)
# and dim("dframe.1")[1] gives NULL.
# Also, I realize the following line works as expected (note: dframe.1 is
# not quoted any more):
dim(dframe.1)[1]

So my question is then: how can I pass dataframe names as an input argument
for another function
without running into the quotation mark issue above?

Any hint?

Thank you in advance.
-Sean
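
A sketch of one possible fix: turn each name into the object it refers to
with get() before asking for its dimensions. The envir argument and the
second data frame dframe.2 are assumptions added for illustration.

```r
# Return the number of rows of each data frame named in dframe.vec.
nobs.fun <- function(dframe.vec, envir = parent.frame()) {
  force(envir)   # look the names up where nobs.fun was called
  vapply(dframe.vec,
         function(nm) nrow(get(nm, envir = envir)),
         integer(1))
}

dframe.1 <- as.data.frame(matrix(1:2, 1, 2))   # 1 row
dframe.2 <- data.frame(a = 1:3)                # 3 rows (hypothetical)
res <- nobs.fun(c("dframe.1", "dframe.2"))
```

The result is an integer vector named after the data frames, so the loop
and the pre-allocated array are no longer needed.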



[R] Extract values based on indexes without looping?

2008-12-23 Thread Sean Zhang
Dear R-Helpers:

I am an entry-level user of R.

I have the following question. Many thanks in advance.


# value.vec stores values
value.vec <- c('a','b','c')
# which.vec stores the locations/indexes of values in value.vec.
which.vec <- c(3, 2, 2, 1)
# How can I obtain the following vector based on the value.vec and which.vec
# mentioned above?
# vector.I.want <- c('c', 'b', 'b', 'a')
# (taken from positions 3, 2, 2, 1 of value.vec)


# I try to avoid using the following loop to achieve the goal because the
# which.vec in reality will be very long

vector.I.want <- rep(NA,length(which.vec))
for (i in 1:length(which.vec))
{ vector.I.want[i] <- value.vec[which.vec[i]] }

# is there a faster way than looping?

Thanks in advance.

-Sean
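
R's [ operator accepts a whole vector of positions, so the loop above can
be replaced by a single indexing expression. A minimal sketch with the
example data:

```r
value.vec <- c('a', 'b', 'c')
which.vec <- c(3, 2, 2, 1)

# Index with the whole position vector at once; repeats are allowed.
vector.I.want <- value.vec[which.vec]
```

This is vectorised internally, so it should scale to a very long
which.vec far better than an explicit for loop.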
