[R] Seeking help with an apparently simple recoding problem

2005-08-23 Thread Greg Blevins
Hello,

I have struggled, for longer than I care to admit, with this seemingly simple 
problem, but I cannot find a solution other than the use of long drawn out 
ifelse statements.  I know there has to be a better way.  Here is a stripped-down 
version of the situation:

I start with:
a <- c(1,0,1,0,0,0,0)
b <- c(1,1,1,1,0,0,0)
c <- c(1,1,0,1,0,0,0)

rbind(a,b,c)
  [,1] [,2] [,3] [,4] [,5] [,6] [,7]
a    1    0    1    0    0    0    0
b    1    1    1    1    0    0    0
c    1    1    0    1    0    0    0

I refer to column 3 as the target column, which at the end of the day will be 
NA in all instances.

The logic involved:

1) If columns 2, 4 thru 7 do NOT include at least one '1', then recode columns 
2 thru 7 to NA and recode column 1 to code 2.

2) If columns 2, 4 thru 7 contain at least one '1', then recode column 3 to NA.

Desired recoding of the above three rows:
  [,1] [,2] [,3] [,4] [,5] [,6] [,7]
a    2   NA   NA   NA   NA   NA   NA
b    1    1   NA    1    0    0    0
c    1    1   NA    1    0    0    0

Thank you.


Greg Blevins
The Market Solutions Group, Inc.

Windows XP, Version 2.1.1

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Seeking help with an apparently simple recoding problem

2005-08-23 Thread Marc Schwartz (via MN)
On Tue, 2005-08-23 at 10:12 -0500, Greg Blevins wrote:
[snip]


You left out one key detail in the explanation, which is that the
recoding appears to be done on a row by row basis, not overall.

The following gets the job done, though there may be a more efficient
approach:

a <- c(1,0,1,0,0,0,0)
b <- c(1,1,1,1,0,0,0)
c <- c(1,1,0,1,0,0,0)

d <- rbind(a, b, c)

d
  [,1] [,2] [,3] [,4] [,5] [,6] [,7]
a    1    0    1    0    0    0    0
b    1    1    1    1    0    0    0
c    1    1    0    1    0    0    0

# Recode a single row according to the two rules
mod.row <- function(x)
{
  if (all(x[c(2, 4:7)] == 0))
  {
    x[2:7] <- NA
    x[1] <- 2
  } else {
    x[3] <- NA
  }

  x
}

# apply() returns the recoded rows as columns, so transpose back
y <- t(apply(d, 1, mod.row))

y
  [,1] [,2] [,3] [,4] [,5] [,6] [,7]
a    2   NA   NA   NA   NA   NA   NA
b    1    1   NA    1    0    0    0
c    1    1   NA    1    0    0    0
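
For 0/1 data like this, the same recoding can probably also be done without
apply(), by building one logical index over the rows. A minimal sketch under
that assumption (the names keep and y2 are illustrative and not from the
thread):

```r
d <- rbind(a = c(1,0,1,0,0,0,0),
           b = c(1,1,1,1,0,0,0),
           c = c(1,1,0,1,0,0,0))

# TRUE for rows with at least one 1 in columns 2 and 4 to 7
keep <- rowSums(d[, c(2, 4:7), drop = FALSE] == 1) > 0

y2 <- d
y2[, 3] <- NA          # the target column ends up NA in every row
y2[!keep, 2:7] <- NA   # rule 1: blank out columns 2 to 7 ...
y2[!keep, 1] <- 2      # ... and recode column 1 to 2
y2
```

Because the index is computed once for all rows, this avoids the transpose
that apply() needs, at the cost of being tied to a 0/1 coding.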


HTH,

Marc Schwartz



Re: [R] Seeking help with a loop

2005-08-03 Thread Tony Plate
> x <- data.frame(q33a=3:4,q33b=5:6,q35a=1:2,q35b=2:1)
> y <- list()
> for (i in grep("q33", colnames(x), value=TRUE))
+    y[[sub("q33","",i)]] <- ifelse(x[[sub("q33","q35",i)]]==1, x[[i]], NA)
> as.data.frame(y)
   a  b
1  3 NA
2 NA  6
> # if you really want to create new variables rather
> # than have them in a data frame:
> # (use paste() or sub() to modify the names if you
> #  want something like newfielda)
> for (i in names(y)) assign(i, y[[i]])
> a
[1]  3 NA
> b
[1] NA  6
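
A loop-free variant of the same idea, offered only as a sketch: it assumes
every q33 column has a q35 partner with the same suffix, and the names q33,
q35 and new are illustrative.

```r
x <- data.frame(q33a = 3:4, q33b = 5:6, q35a = 1:2, q35b = 2:1)

q33 <- grep("^q33", names(x), value = TRUE)   # "q33a" "q33b"
q35 <- sub("^q33", "q35", q33)                # matching "q35a" "q35b"

new <- x[q33]                  # start from the q33 values
new[x[q35] != 1] <- NA         # blank every cell whose q35 partner is not 1
names(new) <- sub("^q33", "newfield", q33)
new
```

The logical matrix x[q35] != 1 has the same shape as new, so the assignment
masks all columns in one step.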
 

hope this helps,

Tony Plate

Greg Blevins wrote:
> Hello R Helpers,
> 
> After spending considerable time attempting to write a loop (and searching
> the help archives), I have decided to post my problem.
> 
> In a dataframe I have columns labeled:
> 
> q33a q33b q33c ... q33r  q35a q35b q35c ... q35r
> 
> What I want to do is create new variables based on the following logic:
> newfielda <- ifelse(q35a==1, q33a, NA)
> newfieldb <- ifelse(q35b==1, q33b, NA)
> ...
> newfieldr
> 
> What I did was create two new dataframes, one containing q33a-r, the other
> q35a-r, and tried to loop over both, but I could not get any of the loop
> syntax I tried to give me the result I was seeking.
> 
> Any help would be much appreciated.
> 
> Greg Blevins
> Partner
> The Market Solutions Group, Inc.
> Minneapolis, MN
> 
> Windows XP, R 2.1.1




Re: [R] Seeking help with a loop

2005-08-03 Thread Jean Eid
You can do the following without resorting to a hard-coded loop:

sapply(paste("q35", letters[1:grep("r", letters)], sep=""), function(x)
       ifelse(temp[, x] %in% 1, temp[, sub("5", "3", x)], NA))

as the following example shows:

temp <- matrix(sample(c(0,1), 360, replace=TRUE), nrow=10)
colnames(temp) <- c(paste("q33", letters[1:grep("r", letters)], sep=""),
                    paste("q35", letters[1:grep("r", letters)], sep=""))
sapply(paste("q35", letters[1:grep("r", letters)], sep=""), function(x)
       ifelse(temp[, x] %in% 1, temp[, sub("5", "3", x)], NA))


HTH

Jean
On Wed, 3 Aug 2005, Greg Blevins wrote:

[snip]




[R] Seeking help with a simple loop construction

2004-11-29 Thread Greg Blevins
Hello,

I have a df, pp, with five variables:

> nobs(pp)
  q10_1   q10_2   q10_3   q10_4 actcode 
   1620    1620    1620    1620    1620 

I want to create a loop to run four xtabs (the first four variables above by 
the fifth) and then store the results in a matrix.  Below I make my intent 
clear by showing the output of one xtab which is inserted into a matrix.

> a <- xtabs(q10_1 ~ actcode)
> a
actcode
 1  2  3  4  5  6  7  8  9 10 
 7 11  3 60 66 56 21 40  7  8 

> freq.mat <- matrix(0, 4, 10, byrow = TRUE)
> freq.mat[1,] <- a
> freq.mat
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    7   11    3   60   66   56   21   40    7     8
[2,]    0    0    0    0    0    0    0    0    0     0
[3,]    0    0    0    0    0    0    0    0    0     0
[4,]    0    0    0    0    0    0    0    0    0     0
===
I have spent a couple of hours searching the web and my texts but continue to 
strike out in my attempts to construct a correct formulation of this simple 
loop. Help would be appreciated.

Greg Blevins
The Market Solutions Group, Inc.
Windows XP
R 2.0.1
Pentium 4
512 memory



RE: [R] Seeking help with a simple loop construction

2004-11-29 Thread Andy Bunn
Does this do what you want?

foo.df <- data.frame(x = rnorm(12), y = runif(12), z = factor(rep(1:3,4)))
bar.mat <- matrix(NA,  nrow = ncol(foo.df)-1, ncol = nlevels(foo.df$z))
for(i in 1:(ncol(foo.df)-1))
{
    bar.mat[i,] <- xtabs(foo.df[,i] ~ foo.df$z)
}
bar.mat

There's probably a slicker way with apply...
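
One possible apply-style version, as a sketch only: with a numeric left-hand
side, xtabs() sums that variable within each level of the factor, so tapply()
with sum should give the same numbers. The names vars and bar.mat2 are
illustrative.

```r
set.seed(1)
foo.df <- data.frame(x = rnorm(12), y = runif(12), z = factor(rep(1:3, 4)))

# every column except the grouping factor
vars <- setdiff(names(foo.df), "z")

# one row per variable, one column per level of z
bar.mat2 <- t(sapply(foo.df[vars], function(v) tapply(v, foo.df$z, sum)))
bar.mat2
```

Unlike the loop, this keeps the variable names and factor levels as dimnames.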

HTH, Andy

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] Behalf Of Greg Blevins
 Sent: Monday, November 29, 2004 1:09 PM
 To: [EMAIL PROTECTED]
 Subject: [R] Seeking help with a simple loop construction
 
 
 [snip]


[R] Seeking help with multcomp

2004-04-18 Thread Greg Blevins
Hello R users,

I am having difficulty getting multcomp to run.

I have a dataframe attached with a numeric variable q12a and a numeric variable quota 
(which is really a classification variable).  

quota has 10 levels and unequal sample sizes.  
q12a has some missing data.

I am interested in doing pairwise testing across the 10 quota groups on q12a.  Using 
the ctest package, the following code ran, generating the p-value matrix, part of which I 
list below.  When I use the multcomp package and attempt to replicate this analysis, I 
cannot get it to work.  Below I show three attempts that failed.

Any help would be much appreciated.

> pairwise.t.test(q12a, quota, p.adj = "fdr")
$method
[1] "t tests with pooled SD"

$data.name
[1] "q12a and quota"

$p.value
             1  2  3  4  5  6  7  8  9
2 4.805732e-09 NA NA NA NA NA NA NA NA

> simtest(q12a ~ quota)
Error in parseformula(formula, data, subset, na.action, whichf, ...) : 
        at least one factor required

> simtest(q12a ~ factor(quota))
Error in parse(file, n, text, prompt) : parse error

> simtest(q12a ~ factor(quota), na.action=na.exclude)
Error in parse(file, n, text, prompt) : parse error
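
The first error says that simtest() wants a factor on the right-hand side, so
converting quota once, up front, looks like a natural first step. The sketch
below uses simulated data (the real q12a and quota are not shown in the post)
and base R's pairwise.t.test(). The optional last step uses glht() and mcp()
from current multcomp releases; that is an assumption about the installed
version, not the simtest() interface of the time, and it does not explain the
parse error.

```r
set.seed(123)
dat <- data.frame(quota = rep(1:10, times = 11:20))   # unequal group sizes
dat$q12a <- rnorm(nrow(dat), mean = dat$quota / 5)    # simulated response
dat$q12a[c(3, 40, 77)] <- NA                          # some missing data

dat$quota <- factor(dat$quota)                        # classification variable

# pairwise tests with FDR adjustment; note the quoted method name
pt <- pairwise.t.test(dat$q12a, dat$quota, p.adjust.method = "fdr")
dim(pt$p.value)

# all-pairs (Tukey) comparisons, only if multcomp is available
if (requireNamespace("multcomp", quietly = TRUE)) {
  fit <- aov(q12a ~ quota, data = dat)
  print(summary(multcomp::glht(fit, linfct = multcomp::mcp(quota = "Tukey"))))
}
```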

Greg Blevins

Partner, The Market Solutions Group

Windows XP, version 1.9.





RE: [R] Seeking help for outomating regression (over columns) and storing selected output

2004-04-03 Thread Liaw, Andy
I'm quite sure there're better ways, but this works for me:

> dat <- data.frame(y=rnorm(30), x1=runif(30), x2=runif(30), x3=runif(30),
+                   group=factor(rep(1:3, each=10)))
> 
> getCoef <- function(dat) {
+     apply(dat[,c("x1","x2","x3")], 2,
+           function(x) lm.fit(cbind(1, x), dat$y)$coefficients[2])
+ }
> clist <- by(dat[,c("y","x1","x2","x3")], dat$group, getCoef)
> cmat <- do.call("rbind", clist)
> cmat
          x1         x2         x3
1 -1.8646962  0.6182181 -1.7859563
2 -1.5031314 -1.0639626 -0.2982066
3 -0.8302013  0.8111539 -1.0372803
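
The original request also asked for a "Total sample" row. One way to add it,
sketched here with simulated data and illustrative names (xvars, slopes,
cmat4), is to apply the same slope function to the full data frame and to
each group:

```r
set.seed(42)
dat <- data.frame(y = rnorm(30), x1 = runif(30), x2 = runif(30), x3 = runif(30),
                  group = factor(rep(1:3, each = 10)))
xvars <- c("x1", "x2", "x3")

# slope of y on each x, fitted one predictor at a time
slopes <- function(d) sapply(xvars, function(v) coef(lm(d$y ~ d[[v]]))[[2]])

cmat4 <- rbind(Total = slopes(dat),
               do.call(rbind, lapply(split(dat, dat$group), slopes)))
cmat4
```

The result is a four-row by three-column matrix: the first row is the total
sample, the remaining rows are the groups.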

HTH,
Andy

 From: Greg Blevins
 
 [snip]


Re: [R] Seeking help for outomating regression (over columns) andstoring selected output

2004-04-03 Thread Robert W. Baer, Ph.D.
Here's one simplistic solution, perhaps there are better ones:
#  Make some test data and place in dataframe
x1=rnorm(20)
x2=rnorm(20)
x3=rnorm(20)
x4=as.factor(sample(c("G1","G2","G3"),20,replace=TRUE))
y1=2*x1+4*x2+0.5*x3+as.numeric(x4)+rnorm(20)

df=data.frame(y1,x1,x2,x3,x4)

# Now create the output dataframe described
out=data.frame(result=c("Intercept",levels(df$x4)))
out$X1=as.numeric(coef(lm(df$y1~df$x1+df$x4)))
out$X2=as.numeric(coef(lm(df$y1~df$x2+df$x4)))
out$X3=as.numeric(coef(lm(df$y1~df$x3+df$x4)))

# look at it
df
out



- Original Message - 
From: Greg Blevins [EMAIL PROTECTED]
To: R-Help [EMAIL PROTECTED]
Sent: Friday, April 02, 2004 9:03 PM
Subject: [R] Seeking help for outomating regression (over columns)
andstoring selected output


 [snip]


Re: [R] Seeking help for outomating regression (over columns) and storing selected output

2004-04-03 Thread Gabor Grothendieck


Note that there is a QUESTION at the end regarding
random effects.

Suppose your data frame is df and has components 
y, x1, x2, x3 and u where u is a factor.  

1. There was a problem posted about doing repeated regressions 
(search for Operating on windows of data) last month that 
has similarities to this one.  

Making use of those ideas, the first sapply below loops 
over the y~xi regressions and the next two loop over 
the usergroup specific regressions.  We just rbind 
them altogether:

xvars <- c("x1", "x2", "x3")
rbind(
   sapply( xvars, function(xi) coef( lm(y ~ df[,xi], data=df))[[2]] ), 
   sapply( xvars, function(xi)
      sapply( levels(df$u), function(ulev)
         coef(lm(y ~ df[,xi], subset=u==ulev, data=df))[[2]]
      )
   )
)


2. Another possibility is to create a giant regression that does 
all the usergroup specific regressions at once and then repeat 
it without the usergroup variable to get the rest.  

df2 is a new data frame that strings out all the x variables into 
a single long column and adds a new factor i that identifies
which x variable it is.  y and u are repeated three times to bring 
them into line with x.

xvars <- c("x1", "x2", "x3")
xm <- as.matrix(df[,xvars])
df2 <- data.frame(y=rep(df$y,3), x = c(xm), i=factor(c(col(xm))), u=rep(df$u,3))

# We could have alternately used reshape like this:
# df2 <- reshape(df,timevar="i",times=factor(1:3),
#    varying=list(xvars),direction="long",v.names="x")

# The slopes by usergroup and across user group are:

coef.u <- coef(lm(y ~ i/u/x, data=df2))
coef.all <- coef(lm(y ~ i/x, data=df2))

# Pick off the slopes (they are at the end of each coef vector) and reform.
# Within each usergroup the slopes are ordered by x variable, so fill by row:

z <- matrix( c( matrix( coef.all, nc=2)[,2], matrix( coef.u, nc=2)[,2] ), nc=3, byrow=TRUE)
colnames(z) <- xvars
rownames(z) <- c("All", levels(df$u))

3. Note that the giant regression approach works as long as you are only
interested in the coefficients, however, if you were interested in the
variances then this would not work since each of the two regressions uses a
pooled estimate of variance.

QUESTION:  As a matter of interest, would someone that is familiar with random
effects models show what the corresponding giant model is with separate
variances for each regression.


P.S. I tried the above out on the following which is similar
to the original problem except there are 4 levels in u:

data(state)
x <- state.x77[,1:3]
u <- state.region
y <- state.x77[,4]
df <- data.frame(y=y, x1=x[,1], x2=x[,2], x3=x[,3], u=factor(u))



Greg Blevins gblevins at mn.rr.com writes:

: [snip]


Re: [R] Seeking help for outomating regression (over columns) and storing selected output

2004-04-03 Thread Thomas Lumley
On Sat, 3 Apr 2004, Gabor Grothendieck wrote:

> 2. Another possibility is to create a giant regression that does
> all the usergroup specific regressions at once and then repeat
> it without the usergroup variable to get the rest.
>
> df2 is a new data frame that strings out all the x variables into
> a single long column and adds a new factor i that identifies
> which x variable it is.  y and u are repeated three times to bring
> them into line with x.

snip

> 3. Note that the giant regression approach works as long as you are only
> interested in the coefficients, however, if you were interested in the
> variances then this would not work since each of the two regressions uses a
> pooled estimate of variance.
>
> QUESTION:  As a matter of interest, would someone that is familiar with random
> effects models show what the corresponding giant model is with separate
> variances for each regression.

There are actually two answers to this.  The first is that if you use the
White/Huber robust/sandwich/model-agnostic variances you get the right
variances automatically.  This is useful when you want to compare
coefficients across models.
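
A small base-R sketch of that first point, using simulated data with two
groups of unequal error variance. The helper hc0() is hypothetical (it is not
from any package) and computes the plain White (HC0) sandwich estimate,
(X'X)^-1 X' diag(e^2) X (X'X)^-1, by hand:

```r
set.seed(1)
n <- 40
g <- factor(rep(1:2, each = n / 2))
x <- rnorm(n)
y <- ifelse(g == 1, 1 + 2 * x, -1 + 0.5 * x) +
     rnorm(n, sd = ifelse(g == 1, 1, 3))

# White (HC0) sandwich variance, written out by hand
hc0 <- function(fit) {
  X <- model.matrix(fit)
  e <- residuals(fit)
  bread <- solve(crossprod(X))
  bread %*% crossprod(X * e) %*% bread   # crossprod(X * e) = X' diag(e^2) X
}

giant <- lm(y ~ g/x)                 # both group regressions in one model
sep1  <- lm(y ~ x, subset = g == 1)  # group 1 fitted on its own

v.giant <- hc0(giant)["g1:x", "g1:x"]
v.sep   <- hc0(sep1)["x", "x"]
c(giant = v.giant, separate = v.sep)
```

In this setup the two sandwich variances for the group-1 slope agree, whereas
the usual vcov() values differ because the giant model pools the residual
variance across groups.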

On the other hand, I don't think you can get the answer you are looking
for.  The problem is that the giant regression estimates are not MLEs for
anything, and so I think you can't get lme() to simultaneously get the
right coefficients and the right variances.

-thomas



[R] Seeking help for outomating regression (over columns) and storing selected output

2004-04-02 Thread Greg Blevins
Hello, 

I have spent considerable time trying to figure out that which I am about to describe. 
This included searching Help, consulting my various R books, and trial and (always) 
error.  I have been assuming I would need to use a loop (looping over columns) but 
perhaps an apply function would do the trick.  I have unsuccessfully tried both.

A scaled down version of my situation is as follows:

I have a dataframe as follows:

ID   Y  x1  x2  x3   usergroup.

Y is a continuous criterion, x1-x3 continuous predictors, and usergroup is coded a 1, 2 
or 3 to indicate user status.

My end goal is a (dataframe or matrix) with just the regression coef from each of 12 
runs (each x regressed separately on Y for the total sample and for each usergroup).  
I envision output as follows, a three column by four row dataframe or matrix.

                Y and x1    Y and x2    Y and x3
Total sample:
usergroup 1:
usergroup 2:    (Regression Coefs fill the matrix)
usergroup 3:

Using 1.8.1
Windows 2000 and XP

Help would be most appreciated.

Greg Blevins, Partner
The Market Solutions Group


[R] seeking help with with()

2003-08-27 Thread Simon Fear
I tried to define a function like:

fnx <- function(x, by.vars=Month)
  print(by(x, by.vars, summary))

But this doesn't work (does not find x$Month; unlike other functions,
such as subset(), the INDICES argument to by does not look for variables
in dataset x. Is fully documented, but I forget every time). So I tried
using with:

fnxx <- function(x, by.vars=Month)
  print(with(x, by(x, by.vars, summary)))

Still fails to find object x$Month. 

I DO have a working solution (below) - this post is just to ask: Can
anyone explain what happened to the with()?

FYI solutions are to call like this:

fnx(airquality, airquality$Month)

but this will not work generically - e.g. in my real application the
dataset gets subsetted and by.vars needs to refer to the subsets. So
redefine like this:

fny <- function(x, by.vars=Month) {
  attach(x)
  print(by(x, by.vars, summary))
  detach(x)
}
 

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 69
Fax: +44 (0) 1379 65
email: [EMAIL PROTECTED]
web: http://www.synequanon.com
 


Re: [R] seeking help with with()

2003-08-27 Thread Prof Brian Ripley
On Wed, 27 Aug 2003, Simon Fear wrote:

> I tried to define a function like:
> 
> fnx <- function(x, by.vars=Month)
>   print(by(x, by.vars, summary))
> 
> But this doesn't work (does not find x$Month; unlike other functions,
> such as subset(), the INDICES argument to by does not look for variables
> in dataset x. Is fully documented, but I forget every time). So I tried
> using with:
> 
> fnxx <- function(x, by.vars=Month)
>   print(with(x, by(x, by.vars, summary)))
> 
> Still fails to find object x$Month. 

That's not the actual error message, is it?

> I DO have a working solution (below) - this post is just to ask: Can
> anyone explain what happened to the with()?

Nothing!

by.vars is a variable passed to fnxx, so despite lazy evaluation, it is
going to be evaluated in the environment calling fnxx().  If that fails to
find it, it looks for the default value, and evaluates that in the
environment of the body of fnxx.  It didn't really get as far as with.

(I often forget where default args are evaluated, but I believe that is 
correct in R as well as in S.)

I think you intended Month to be a name and not a variable.  With

X <- data.frame(z=rnorm(20), Month=factor(rep(1:2, each=10)))

fnx <- function(x, by.vars="Month")
   print(by(x, x[by.vars], summary))

will work, as will

fnx <- function(x, by.vars=Month)
   print(by(x, x[deparse(substitute(by.vars))], summary))
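
The scoping point can be checked directly. In this sketch, probe() is a
hypothetical helper that only reports whether the default argument could be
evaluated; it assumes there is no object called Month in the workspace.

```r
X <- data.frame(z = rnorm(20), Month = factor(rep(1:2, each = 10)))

# Default is the symbol Month: it is looked up in the function frame and
# its enclosing environments; the columns of x are never searched.
probe <- function(x, by.vars = Month)
  tryCatch({ by.vars; "found" }, error = function(e) "not found")

# Default is the string "Month", used to index x explicitly
fnx2 <- function(x, by.vars = "Month") by(x, x[by.vars], summary)

probe(X)
res <- fnx2(X)
res
```

Passing the column name as a string keeps the lookup inside the function, so
it also works when x is a subset created on the fly.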


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] seeking help with with()

2003-08-27 Thread Peter Dalgaard BSA
Simon Fear [EMAIL PROTECTED] writes:

 [snip]

Nothing, but by.vars is evaluated in the function frame where it is
not defined. I think you're looking for something like

function(x, by.vars) {
  if (missing(by.vars)) by.vars <- as.name("Month")
  print(eval.parent(substitute(with(x, by(x, by.vars, summary)))))
}

(Defining the default arg requires a bit of sneakiness...)
-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907



RE: [R] seeking help with with()

2003-08-27 Thread Simon Fear
Thank you so much for that fix (to my understanding).

I would be willing to add such an example to the help 
page for future releases - though I'm sure others would 
do it better - there are currently no examples where
 INDICES is a name.

In fact in my real application it is more or less essential
that INDICES is a name, or at least deparse(substitute())-ed 
as a subscript; in a slight elaboration of my previous fix

fnz <- function(dframe, by.vars=treat)
  for (pop in 1:2) {
    dframe.pop <- subset(dframe, ITT==pop)
    attach(dframe.pop)
    print(by(dframe.pop, by.vars, summary))
    detach(dframe.pop)
  }

the second call (when pop=2) to by() will crash because by.vars 
is not re-evaluated afresh - it retains its value 
from the first loop.

So, my fix was wrong and I am happy to stand corrected.
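
Both points can be illustrated with a short sketch. The data frame below is
made up (only the column names ITT and treat come from the function above),
and touch(), use.thrice() and fnz2() are illustrative names.

```r
# 1. An argument (a promise) is evaluated once, however often it is used
calls <- new.env()
calls$n <- 0
touch <- function() { calls$n <- calls$n + 1; "treat" }
use.thrice <- function(arg) { for (k in 1:3) arg; arg }
use.thrice(touch())
calls$n   # counts how often touch() actually ran

# 2. Passing the column name and indexing inside the loop avoids the problem
dframe <- data.frame(ITT   = rep(1:2, each = 6),
                     treat = factor(rep(c("A", "B"), times = 6)),
                     resp  = 1:12)

fnz2 <- function(dframe, by.var = "treat")
  lapply(1:2, function(pop) {
    dframe.pop <- subset(dframe, ITT == pop)
    by(dframe.pop, dframe.pop[by.var], summary)   # looked up afresh per subset
  })

out <- fnz2(dframe)
```

Because the grouping column is extracted from each subset inside the loop,
nothing depends on attach() or on when the argument was first evaluated.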


 -Original Message-
 From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
 Sent: 27 August 2003 14:08
 To: Simon Fear
 Cc: [EMAIL PROTECTED]
 Subject: Re: [R] seeking help with with()
 
 
 [snip]

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 69
Fax: +44 (0) 1379 65
email: [EMAIL PROTECTED]
web: http://www.synequanon.com
 