[R] Seeking help with an apparently simple recoding problem
Hello,

I have struggled, for longer than I care to admit, with this seemingly simple problem, but I cannot find a solution other than long, drawn-out ifelse statements. I know there has to be a better way.

Here is a stripped-down version of the situation. I start with:

    > a <- c(1,0,1,0,0,0,0)
    > b <- c(1,1,1,1,0,0,0)
    > c <- c(1,1,0,1,0,0,0)
    > rbind(a,b,c)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
    a    1    0    1    0    0    0    0
    b    1    1    1    1    0    0    0
    c    1    1    0    1    0    0    0

I refer to column 3 as the target column, which at the end of the day will be NA in all instances.

The logic involved:

1) If columns 2 and 4 thru 7 do NOT include at least one '1', then recode columns 2 thru 7 to NA and recode column 1 to code 2.
2) If columns 2 and 4 thru 7 contain at least one '1', then recode column 3 to NA.

Desired recoding of the above three rows:

      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
    a    2   NA   NA   NA   NA   NA   NA
    b    1    1   NA    1    0    0    0
    c    1    1   NA    1    0    0    0

Thank you.

Greg Blevins
The Market Solutions Group, Inc.
Windows XP, Version 2.1.1

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Seeking help with an apparently simple recoding problem
On Tue, 2005-08-23 at 10:12 -0500, Greg Blevins wrote:
> [snip]

You left out one key detail in the explanation, which is that the recoding appears to be done on a row-by-row basis, not overall.

The following gets the job done, though there may be a more efficient approach:

    > a <- c(1,0,1,0,0,0,0)
    > b <- c(1,1,1,1,0,0,0)
    > c <- c(1,1,0,1,0,0,0)
    > d <- rbind(a, b, c)
    > d
      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
    a    1    0    1    0    0    0    0
    b    1    1    1    1    0    0    0
    c    1    1    0    1    0    0    0

    mod.row <- function(x)
    {
      # no '1' anywhere in columns 2 and 4:7
      if (all(x[c(2, 4:7)] == 0))
      {
        x[2:7] <- NA
        x[1] <- 2
      }
      else
      {
        x[3] <- NA
      }
      x
    }

    > y <- t(apply(d, 1, mod.row))
    > y
      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
    a    2   NA   NA   NA   NA   NA   NA
    b    1    1   NA    1    0    0    0
    c    1    1   NA    1    0    0    0

HTH,

Marc Schwartz
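For comparison, the same row-wise rule can be written without apply(), using logical indexing on the whole matrix. This is only a sketch, not code from the thread: the name `none` is made up for illustration, and it assumes the input matrix contains only 0s and 1s with no missing values.

```r
# Same three rows as in the thread
d <- rbind(a = c(1, 0, 1, 0, 0, 0, 0),
           b = c(1, 1, 1, 1, 0, 0, 0),
           c = c(1, 1, 0, 1, 0, 0, 0))

# TRUE for rows that have no '1' in columns 2 and 4:7 (hypothetical helper name)
none <- rowSums(d[, c(2, 4:7), drop = FALSE] == 1) == 0

y <- d
y[, 3] <- NA          # the target column ends up NA in every row
y[none, 2:7] <- NA    # rule 1: blank out columns 2:7 ...
y[none, 1] <- 2       # ... and recode column 1 to 2
y
```

Because column 3 is NA under both rules, it can be set once for all rows; only rule 1 needs the row selector.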
Re: [R] Seeking help with a loop
    > x <- data.frame(q33a=3:4, q33b=5:6, q35a=1:2, q35b=2:1)
    > y <- list()
    > for (i in grep("q33", colnames(x), value=TRUE))
    +    y[[sub("q33", "", i)]] <- ifelse(x[[sub("q33", "q35", i)]]==1, x[[i]], NA)
    > as.data.frame(y)
       a  b
    1  3 NA
    2 NA  6
    > # if you really want to create new variables rather
    > # than have them in a data frame:
    > # (use paste() or sub() to modify the names if you
    > # want something like "newfielda")
    > for (i in names(y)) assign(i, y[[i]])
    > a
    [1]  3 NA
    > b
    [1] NA  6

hope this helps,

Tony Plate

Greg Blevins wrote:
> Hello R Helpers,
>
> After spending considerable time attempting to write a loop (and searching the help archives) I have decided to post my problem.
>
> In a dataframe I have columns labeled:
>
>     q33a q33b q33c ... q33r  q35a q35b q35c ... q35r
>
> What I want to do is create new variables based on the following logic:
>
>     newfielda <- ifelse(q35a==1, q33a, NA)
>     newfieldb <- ifelse(q35b==1, q33b, NA)
>     ...
>     newfieldr
>
> What I did was create two new dataframes, one containing q33a-r, the other q35a-r, and tried to loop over both, but I could not get any of the loop syntax I tried to give me the result I was seeking.
>
> Any help would be much appreciated.
>
> Greg Blevins
> Partner
> The Market Solutions Group, Inc.
> Minneapolis, MN
>
> Windows XP, R 2.1.1
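The pairing of each q33 column with its q35 partner can also be expressed with Map(), which walks two lists in parallel. This is a sketch under the assumption that every q33 column has a q35 column with the same suffix; the `newfield` prefix follows the original question, and the helper names `q33`, `q35` and `new` are chosen here for illustration.

```r
x <- data.frame(q33a = 3:4, q33b = 5:6, q35a = 1:2, q35b = 2:1)

q33 <- grep("^q33", names(x), value = TRUE)   # "q33a" "q33b"
q35 <- sub("^q33", "q35", q33)                # matching "q35a" "q35b"

# Keep the q33 value where the matching q35 column equals 1, else NA
new <- as.data.frame(Map(function(a, b) ifelse(b == 1, a, NA), x[q33], x[q35]))
names(new) <- sub("^q33", "newfield", q33)
new
```

The result can be attached to the original data with cbind(x, new) if the new fields should live alongside the old ones.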
Re: [R] Seeking help with a loop
You can do the following without resorting to a hard-coded loop:

    sapply(paste("q35", letters[1:grep("r", letters)], sep=""),
           function(x) ifelse(temp[, x] %in% 1, temp[, sub("5", "3", x)], NA))

as the following example shows:

    temp <- matrix(sample(c(0,1), 360, replace=TRUE), nrow=10)
    colnames(temp) <- c(paste("q33", letters[1:grep("r", letters)], sep=""),
                        paste("q35", letters[1:grep("r", letters)], sep=""))
    sapply(paste("q35", letters[1:grep("r", letters)], sep=""),
           function(x) ifelse(temp[, x] %in% 1, temp[, sub("5", "3", x)], NA))

HTH

Jean

On Wed, 3 Aug 2005, Greg Blevins wrote:
> [snip]
[R] Seeking help with a simple loop construction
Hello,

I have a df, pp, with five variables:

    > nobs(pp)
      q10_1   q10_2   q10_3   q10_4 actcode
       1620    1620    1620    1620    1620

I want to create a loop to run four xtabs (the first four variables above by the fifth) and then store the results in a matrix. Below I make my intent clear by showing the output of one xtab which is inserted into a matrix.

    > a <- xtabs(q10_1 ~ actcode)
    > a
    actcode
     1  2  3  4  5  6  7  8  9 10
     7 11  3 60 66 56 21 40  7  8

    > freq.mat <- matrix(0, 4, 10, byrow = TRUE)
    > freq.mat[1,] <- a
    > freq.mat
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    [1,]    7   11    3   60   66   56   21   40    7     8
    [2,]    0    0    0    0    0    0    0    0    0     0
    [3,]    0    0    0    0    0    0    0    0    0     0
    [4,]    0    0    0    0    0    0    0    0    0     0

I have spent a couple of hours searching the web and my texts but continue to strike out in my attempts to construct a correct formulation of this simple loop.

Help would be appreciated.

Greg Blevins
The Market Solutions Group, Inc.

Windows XP
R 2.0.1
Pentium 4
512 memory
RE: [R] Seeking help with a simple loop construction
Does this do what you want?

    foo.df <- data.frame(x = rnorm(12), y = runif(12), z = factor(rep(1:3,4)))
    # one row per variable, one column per level of the grouping factor
    bar.mat <- matrix(NA, nrow = ncol(foo.df)-1, ncol = nlevels(foo.df$z))
    for(i in 1:(ncol(foo.df)-1)) {
        bar.mat[i,] <- xtabs(foo.df[,i] ~ foo.df$z)
    }
    bar.mat

There's probably a slicker way with apply...

HTH,
Andy

-----Original Message-----
On Behalf Of Greg Blevins
Sent: Monday, November 29, 2004 1:09 PM
Subject: [R] Seeking help with a simple loop construction

> [snip]
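One possible form of the "slicker way with apply" mentioned in the reply is to let sapply() run xtabs() over the value columns and transpose the result. This is a sketch on the same kind of toy data, not the poster's data; `vars` is a name introduced here, and the seed is only there to make the example repeatable.

```r
set.seed(1)
foo.df <- data.frame(x = rnorm(12), y = runif(12), z = factor(rep(1:3, 4)))

vars <- setdiff(names(foo.df), "z")   # every column except the grouping factor

# Each xtabs() call sums one variable within the levels of z;
# sapply() binds the results as columns, t() turns them into rows.
bar.mat <- t(sapply(foo.df[vars], function(v) xtabs(v ~ foo.df$z)))
bar.mat
```

Unlike the preallocated matrix in the loop, this version carries the variable names as row names and the factor levels as column names.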
[R] Seeking help with multcomp
Hello R users,

I am having difficulty getting multcomp to run.

I have a dataframe attached with a numeric variable q12a and a numeric variable quota (which is really a classification variable). quota has 10 levels and unequal sample sizes. q12a has some missing data. I am interested in doing pairwise testing across the 10 quota groups on q12a.

Using the ctest package, the following code ran, generating the p-value matrix, part of which I list below. When I use the multcomp package and attempt to replicate this analysis, I cannot get it to work. Below I show three attempts that failed. Any help would be much appreciated.

    > pairwise.t.test(q12a, quota, p.adj = "fdr")
    $method
    [1] "t tests with pooled SD"

    $data.name
    [1] "q12a and quota"

    $p.value
                 1  2  3  4  5  6  7  8  9
    2 4.805732e-09 NA NA NA NA NA NA NA NA

    > simtest(q12a ~ quota)
    Error in parseformula(formula, data, subset, na.action, whichf, ...) :
            at least one factor required

    > simtest(q12a ~ factor(quota))
    Error in parse(file, n, text, prompt) : parse error

    > simtest(q12a ~ factor(quota), na.action=na.exclude)
    Error in parse(file, n, text, prompt) : parse error

Greg Blevins
Partner, The Market Solutions Group

Windows XP, version 1.9.
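The first error message says the grouping variable must be a factor. A minimal sketch of storing it as a factor once, then running pairwise comparisons with base R only, is given below. The data are simulated stand-ins (the names `dat`, `sizes` and `quota.f` are hypothetical), and whether a stored factor also cures the simtest() parse errors is not something the thread settles; TukeyHSD() is shown merely as a base-R simultaneous procedure, not as a replacement the poster asked for.

```r
set.seed(123)
# Hypothetical stand-in for the poster's data: 10 groups, unequal sizes, some NAs
sizes <- c(12, 15, 9, 20, 11, 14, 10, 18, 13, 16)
dat <- data.frame(quota = rep(1:10, times = sizes))
dat$q12a <- rnorm(nrow(dat), mean = dat$quota / 5)
dat$q12a[sample(nrow(dat), 8)] <- NA

# Store the grouping variable as a factor instead of converting inside the call
dat$quota.f <- factor(dat$quota)

# Pairwise t tests with pooled SD and FDR adjustment, as in the post
pt <- pairwise.t.test(dat$q12a, dat$quota.f, p.adjust.method = "fdr")
dim(pt$p.value)

# All pairwise differences with simultaneous (Tukey) intervals
tk <- TukeyHSD(aov(q12a ~ quota.f, data = dat))
nrow(tk$quota.f)
```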
RE: [R] Seeking help for automating regression (over columns) and storing selected output
I'm quite sure there are better ways, but this works for me:

    > dat <- data.frame(y=rnorm(30), x1=runif(30), x2=runif(30), x3=runif(30),
    +                   group=factor(rep(1:3, each=10)))
    > getCoef <- function(dat) {
    +     apply(dat[,c("x1","x2","x3")], 2,
    +           function(x) lm.fit(cbind(1, x), dat$y)$coefficients[2])
    + }
    > clist <- by(dat[,c("y","x1","x2","x3")], dat$group, getCoef)
    > cmat <- do.call("rbind", clist)
    > cmat
              x1         x2         x3
    1 -1.8646962  0.6182181 -1.7859563
    2 -1.5031314 -1.0639626 -0.2982066
    3 -0.8302013  0.8111539 -1.0372803

HTH,
Andy

> From: Greg Blevins
> [snip]
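The matrix above covers the three user groups; the original request also asked for a "Total sample" row. A sketch of one way to add it follows, using lm() rather than lm.fit() for readability. The function name `getSlopes` and the seed are assumptions made for this illustration.

```r
set.seed(42)
dat <- data.frame(y = rnorm(30), x1 = runif(30), x2 = runif(30), x3 = runif(30),
                  group = factor(rep(1:3, each = 10)))

# Slope of y on each predictor, fitted one predictor at a time
getSlopes <- function(d) {
  sapply(c("x1", "x2", "x3"),
         function(v) unname(coef(lm(d$y ~ d[[v]]))[2]))
}

cmat <- rbind(Total = getSlopes(dat),                               # whole sample
              do.call(rbind, lapply(split(dat, dat$group), getSlopes)))  # per group
cmat
```

This gives the four-row by three-column layout described in the question: 12 simple regressions in all.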
Re: [R] Seeking help for automating regression (over columns) and storing selected output
Here's one simplistic solution, perhaps there are better ones:

    # Make some test data and place in dataframe
    x1=rnorm(20)
    x2=rnorm(20)
    x3=rnorm(20)
    x4=as.factor(sample(c("G1","G2","G3"),20,replace=TRUE))
    y1=2*x1+4*x2+0.5*x3+as.numeric(x4)+rnorm(20)
    df=data.frame(y1,x1,x2,x3,x4)

    # Now create the output dataframe described
    # Note: coef() of y1 ~ x + x4 returns the intercept, the slope of x,
    # and the G2 and G3 offsets of a single additive model per predictor;
    # these are not slopes from separate per-group regressions.
    out=data.frame(result=c("Intercept",levels(df$x4)))
    out$X1=as.numeric(coef(lm(df$y1~df$x1+df$x4)))
    out$X2=as.numeric(coef(lm(df$y1~df$x2+df$x4)))
    out$X3=as.numeric(coef(lm(df$y1~df$x3+df$x4)))

    # look at it
    df
    out

----- Original Message -----
From: Greg Blevins
To: R-Help
Sent: Friday, April 02, 2004 9:03 PM
Subject: [R] Seeking help for outomating regression (over columns) andstoring selected output

> [snip]
Re: [R] Seeking help for automating regression (over columns) and storing selected output
Note that there is a QUESTION at the end regarding random effects.

Suppose your data frame is df and has components y, x1, x2, x3 and u, where u is a factor.

1. There was a problem posted about doing repeated regressions (search for "Operating on windows of data") last month that has similarities to this one. Making use of those ideas, the first sapply below loops over the y~xi regressions and the next two loop over the usergroup-specific regressions. We just rbind them all together:

    xvars <- c("x1", "x2", "x3")
    rbind(
      sapply(xvars, function(xi) coef(lm(y ~ df[,xi], data=df))[[2]]),
      sapply(xvars, function(xi)
        sapply(levels(df$u), function(ulev)
          coef(lm(y ~ df[,xi], subset=u==ulev, data=df))[[2]]))
    )

2. Another possibility is to create a giant regression that does all the usergroup-specific regressions at once and then repeat it without the usergroup variable to get the rest. df2 is a new data frame that strings out all the x variables into a single long column and adds a new factor i that identifies which x variable it is. y and u are repeated three times to bring them into line with x.

    xvars <- c("x1", "x2", "x3")
    xm <- as.matrix(df[,xvars])
    df2 <- data.frame(y=rep(df$y,3), x=c(xm), i=factor(c(col(xm))), u=rep(df$u,3))

    # We could alternatively have used reshape like this:
    # df2 <- reshape(df, timevar="i", times=factor(1:3),
    #                varying=list(xvars), direction="long", v.names="x")

    # The slopes by usergroup and across usergroups are:
    coef.u <- coef(lm(y ~ i/u/x, data=df2))
    coef.all <- coef(lm(y ~ i/x, data=df2))

    # Pick off the slopes (they are at the end of each coef vector) and reform.
    # Within each usergroup the slopes run over i (x1, x2, x3), so fill by row:
    z <- matrix(c(matrix(coef.all, nc=2)[,2], matrix(coef.u, nc=2)[,2]),
                nc=3, byrow=TRUE)
    colnames(z) <- xvars
    rownames(z) <- c("All", levels(df$u))

3. Note that the giant regression approach works as long as you are only interested in the coefficients. If you were interested in the variances, it would not work, since each of the two regressions uses a pooled estimate of variance.

QUESTION: As a matter of interest, would someone who is familiar with random effects models show what the corresponding giant model is with separate variances for each regression?

P.S. I tried the above out on the following, which is similar to the original problem except there are 4 levels in u:

    data(state)
    x <- state.x77[,1:3]
    u <- state.region
    y <- state.x77[,4]
    df <- data.frame(y=y, x1=x[,1], x2=x[,2], x3=x[,3], u=factor(u))

Greg Blevins <gblevins at mn.rr.com> writes:
: [snip]
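The claim in point 2, that the giant regression reproduces the slopes of the separate regressions, can be checked on the state data from the P.S. The sketch below picks the slopes out by coefficient name rather than by position (a choice made here, not the poster's) so that the check does not depend on coefficient ordering; `slope`, `sep` and `slope.name` are illustrative names.

```r
data(state)
df <- data.frame(y = state.x77[, 4], x1 = state.x77[, 1], x2 = state.x77[, 2],
                 x3 = state.x77[, 3], u = factor(state.region))
xvars <- c("x1", "x2", "x3")

# Reference: one simple regression per predictor, overall and per region
slope <- function(v, d) coef(lm(d$y ~ d[[v]]))[[2]]
sep <- rbind(All = sapply(xvars, slope, d = df),
             t(sapply(split(df, df$u),
                      function(d) sapply(xvars, slope, d = d))))

# Giant regressions on the stacked data
xm <- as.matrix(df[, xvars])
df2 <- data.frame(y = rep(df$y, 3), x = c(xm),
                  i = factor(c(col(xm))), u = rep(df$u, 3))
coef.u <- coef(lm(y ~ i/u/x, data = df2))
coef.all <- coef(lm(y ~ i/x, data = df2))

# Slopes are the i:x and i:u:x terms; look them up by name
slope.name <- function(lev) paste0("i", 1:3, ":u", lev, ":x")
z <- rbind(All = coef.all[paste0("i", 1:3, ":x")],
           t(sapply(levels(df$u), function(lev) coef.u[slope.name(lev)])))
colnames(z) <- xvars

all.equal(z, sep)
```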
Re: [R] Seeking help for automating regression (over columns) and storing selected output
On Sat, 3 Apr 2004, Gabor Grothendieck wrote:

> 2. Another possibility is to create a giant regression that does all the usergroup specific regressions at once and then repeat it without the usergroup variable to get the rest. df2 is a new data frame that strings out all the x variables into a single long column and adds a new factor i that identifies which x variable it is. y and u are repeated three times to bring them into line with x.

<snip>

> 3. Note that the giant regression approach works as long as you are only interested in the coefficients, however, if you were interested in the variances then this would not work since each of the two regressions uses a pooled estimate of variance.
>
> QUESTION: As a matter of interest, would someone that is familiar with random effects models show what the corresponding giant model is with separate variances for each regression.

There are actually two answers to this.

The first is that if you use the White/Huber robust/sandwich/model-agnostic variances you get the right variances automatically. This is useful when you want to compare coefficients across models.

On the other hand, I don't think you can get the answer you are looking for. The problem is that the giant regression estimates are not MLEs for anything, and so I think you can't get lme() to simultaneously get the right coefficients and the right variances.

	-thomas
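The first answer can be illustrated with a hand-rolled sandwich estimator. The sketch below uses the simplest (HC0) form, with no small-sample correction, on simulated data; `hc0` is a helper written for this example rather than a function from any package, and the comparison is limited to the "across usergroups" giant model y ~ i/x.

```r
set.seed(1)
n <- 40
df <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
xvars <- c("x1", "x2", "x3")

# Stack the predictors as in the giant-regression approach
xm <- as.matrix(df[, xvars])
df2 <- data.frame(y = rep(df$y, 3), x = c(xm), i = factor(c(col(xm))))
giant <- lm(y ~ i/x, data = df2)

# HC0 sandwich: (X'X)^-1 X' diag(e^2) X (X'X)^-1  (hypothetical helper)
hc0 <- function(fit) {
  X <- model.matrix(fit)
  e <- residuals(fit)
  bread <- solve(crossprod(X))
  bread %*% crossprod(X * e) %*% bread
}

slopes <- paste0("i", 1:3, ":x")
pooled.se <- sqrt(diag(vcov(giant)))[slopes]   # uses one pooled variance
robust.se <- sqrt(diag(hc0(giant)))[slopes]    # sandwich, giant model

# Sandwich standard errors from the three separate regressions
separate.se <- sapply(xvars, function(v)
  sqrt(diag(hc0(lm(df$y ~ df[[v]]))))[[2]])

rbind(pooled = unname(pooled.se), robust = unname(robust.se),
      separate = unname(separate.se))
```

The robust row from the giant model agrees with the separate-regression row, while the pooled row generally does not.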
[R] Seeking help for automating regression (over columns) and storing selected output
Hello,

I have spent considerable time trying to figure out that which I am about to describe. This included searching Help, consulting my various R books, and trial and (always) error. I have been assuming I would need to use a loop (looping over columns) but perhaps an apply function would do the trick. I have unsuccessfully tried both.

A scaled-down version of my situation is as follows:

I have a dataframe as follows:

    ID  Y  x1  x2  x3  usergroup

Y is a continuous criterion, x1-x3 are continuous predictors, and usergroup is coded 1, 2 or 3 to indicate user status.

My end goal is a dataframe or matrix with just the regression coef from each of 12 runs (each x regressed separately on Y for the total sample and for each usergroup). I envision output as follows, a three-column by four-row dataframe or matrix:

                    Y and x1    Y and x2    Y and x3
    Total sample:
    usergroup 1:
    usergroup 2:    (Regression Coefs fill the matrix)
    usergroup 3:

Using 1.8.1
Windows 2000 and XP

Help would be most appreciated.

Greg Blevins, Partner
The Market Solutions Group
[R] seeking help with with()
I tried to define a function like:

    fnx <- function(x, by.vars=Month)
      print(by(x, by.vars, summary))

But this doesn't work (does not find x$Month; unlike other functions, such as subset(), the INDICES argument to by does not look for variables in dataset x. Is fully documented, but I forget every time).

So I tried using with:

    fnxx <- function(x, by.vars=Month)
      print(with(x, by(x, by.vars, summary)))

Still fails to find object x$Month.

I DO have a working solution (below) - this post is just to ask: Can anyone explain what happened to the with()?

FYI solutions are to call like this:

    fnx(airquality, airquality$Month)

but this will not work generically - e.g. in my real application the dataset gets subsetted and by.vars needs to refer to the subsets. So redefine like this:

    fny <- function(x, by.vars=Month) {
      attach(x)
      print(by(x, by.vars, summary))
      detach(x)
    }

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 69
Fax: +44 (0) 1379 65
email: [EMAIL PROTECTED]
web: http://www.synequanon.com
Re: [R] seeking help with with()
On Wed, 27 Aug 2003, Simon Fear wrote:

> I tried to define a function like:
>
>     fnx <- function(x, by.vars=Month)
>       print(by(x, by.vars, summary))
>
> But this doesn't work (does not find x$Month; unlike other functions, such as subset(), the INDICES argument to by does not look for variables in dataset x. Is fully documented, but I forget every time).
>
> So I tried using with:
>
>     fnxx <- function(x, by.vars=Month)
>       print(with(x, by(x, by.vars, summary)))
>
> Still fails to find object x$Month.

That's not the actual error message, is it?

> I DO have a working solution (below) - this post is just to ask: Can anyone explain what happened to the with()?

Nothing!

by.vars is a variable passed to fnxx, so despite lazy evaluation, it is going to be evaluated in the environment calling fnxx(). If that fails to find it, it looks for the default value, and evaluates that in the environment of the body of fnxx. It didn't really get as far as with. (I often forget where default args are evaluated, but I believe that is correct in R as well as in S.)

I think you intended Month to be a name and not a variable. With

    X <- data.frame(z=rnorm(20), Month=factor(rep(1:2, each=10)))

    fnx <- function(x, by.vars="Month")
      print(by(x, x[by.vars], summary))

will work, as will

    fnx <- function(x, by.vars=Month)
      print(by(x, x[deparse(substitute(by.vars))], summary))

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
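The behaviour described in this reply can be reproduced in a few lines. In the sketch below the printing is dropped so the results can be inspected, the function names `f_bad`, `f_chr` and `f_sym` are made up for the illustration, and it assumes no object called `Month` exists in the calling workspace.

```r
X <- data.frame(z = rnorm(20), Month = factor(rep(1:2, each = 10)))

# Default 'Month' is evaluated in the function's own frame, where no such
# object exists, so the call fails even though with() is used.
f_bad <- function(x, by.vars = Month) with(x, by(x, by.vars, summary))
msg <- tryCatch(f_bad(X), error = function(e) conditionMessage(e))
msg

# Variant 1: pass the column name as a character string
f_chr <- function(x, by.vars = "Month") by(x, x[by.vars], summary)

# Variant 2: accept an unquoted name and turn it into a string
f_sym <- function(x, by.vars = Month)
  by(x, x[deparse(substitute(by.vars))], summary)

r1 <- f_chr(X)
r2 <- f_sym(X)          # default name
r3 <- f_sym(X, Month)   # name supplied by the caller
```

In both working variants the grouping column is looked up inside x itself, so the same function also works when x is a subset of a larger data frame.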
Re: [R] seeking help with with()
Simon Fear writes:

> I tried to define a function like:
>
>     fnx <- function(x, by.vars=Month)
>       print(by(x, by.vars, summary))
>
> But this doesn't work (does not find x$Month; unlike other functions, such as subset(), the INDICES argument to by does not look for variables in dataset x. Is fully documented, but I forget every time).
>
> So I tried using with:
>
>     fnxx <- function(x, by.vars=Month)
>       print(with(x, by(x, by.vars, summary)))
>
> Still fails to find object x$Month.
>
> I DO have a working solution (below) - this post is just to ask: Can anyone explain what happened to the with()?

Nothing, but by.vars is evaluated in the function frame where it is not defined. I think you're looking for something like

    function(x, by.vars) {
      if (missing(by.vars)) by.vars <- as.name("Month")
      print(eval.parent(substitute(with(x, by(x, by.vars, summary)))))
    }

(Defining the default arg requires a bit of sneakiness...)

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])             FAX: (+45) 35327907
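To see what the substitute()/eval.parent() construction does, it can be tried on the airquality data mentioned earlier in the thread. This sketch gives the function a name (`fnx`, as in the original post) and drops the print() so the result can be stored; both changes are made here for illustration.

```r
fnx <- function(x, by.vars) {
  # When no grouping variable is given, fall back to the symbol Month
  if (missing(by.vars)) by.vars <- as.name("Month")
  # Rebuild the call with the caller's own expressions, then run it
  # in the caller's frame, where with() can find the column.
  eval.parent(substitute(with(x, by(x, by.vars, summary))))
}

r.default <- fnx(airquality)                        # groups by Month
r.named   <- fnx(airquality, Month)                 # same, name given explicitly
r.subset  <- fnx(subset(airquality, Temp > 70))     # works on a subset too
length(r.default)
```

Because the data expression is substituted into the call as written by the caller, the subset() expression appears twice in the rebuilt call and is therefore evaluated twice.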
RE: [R] seeking help with with()
Thank you so much for that fix (to my understanding).

I would be willing to add such an example to the help page for future releases - though I'm sure others would do it better - there are currently no examples where INDICES is a name.

In fact in my real application it is more or less essential that INDICES is a name, or at least deparse(substitute())-ed as a subscript; in a slight elaboration of my previous fix

    fnz <- function(dframe, by.vars=treat)
      for (pop in 1:2) {
        dframe.pop <- subset(dframe, ITT==pop)
        attach(dframe.pop)
        print(by(dframe.pop, by.vars, summary))
        detach(dframe.pop)
      }

the second call (when pop=2) to by() will crash because by.vars is not re-evaluated afresh - it retains its value from the first loop.

So, my fix was wrong and I am happy to stand corrected.

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 69
Fax: +44 (0) 1379 65
email: [EMAIL PROTECTED]
web: http://www.synequanon.com

-----Original Message-----
From: Prof Brian Ripley
Sent: 27 August 2003 14:08
To: Simon Fear
Subject: Re: [R] seeking help with with()

> [snip]