Gabor Grothendieck <[EMAIL PROTECTED]> writes: > Douglas Bates produced an extremely useful post so I hesitate to > add more but here goes anyways. > > 1. The bug I referred to in my previous post is still in the code. > The last data point in women never gets used. I think this > goes beyond just making a simple error but gets to the point of > how to avoid index arithmetic in the first place due to its > error prone nature and excess complexity. Using embed() > or running() (the latter is in package gregmisc) would do that although > at the expense of more memory.
You're right. I miscounted and it is easy to do that when manipulating the indices explicitly. I wasn't aware of the embed function but, now that you have pointed it out, I agree that it should be used here. > 2. I find using stopifnot convenient for testing assertions. There > is also a bug in the assertion code shown (width should be allowed > to equal nr) and again I think its due to the introduction of complexity. > The problem is that its harder to get it right when you have to invert > the assertion condition. I have recently discovered the R > stopifnot function which allows one to state assertions without > inverting them and now use it all the time. Once you use stopifnot > the assertion becomes somewhat clearer since its not muddied by the > inversion and in my mind it starts to bring out a wider range of > considerations such as whether a width > with zero or fewer degrees of freedom should be allowed. Perhaps > the condition should be width > length(all.vars(formula)) or if > formulae such as y~x-1 are allowed then a more complex specification. I agree that stopifnot is the best construction for assertions. Thanks for pointing that out. It is one of the very useful utilities added by Martin Maechler, who also wrote the str function which gets my vote for the best debugging tool in R. If you don't know the total number of coefficients then you would need to assert only width > 0. If you do know the number of coefficients then you can assert width >= ncoef. > With these changes (except I've kept width > 0), movingWindow becomes > > movingWindow <- function(formula, data, width, ...) { > nr <- nrow(data) > width <- as.integer(width)[1] > stopifnot( width > 0, width <= nr ) > indices <- as.data.frame( t( embed( 1:nr, width ) ) ) > lapply(indices, function(st) summary(lm(formula, data[st,])) ) > } > > Similar comments could be applied to movingWindow2. Adopting your suggestions I would change the movingWindow2 code to movingWindow2 <- function(formula, data, width, ...) { mCall = match.call() mCall$width = NULL mCall[[1]] = as.name("lm") mCall$x = mCall$y = TRUE bigfit = eval(mCall, parent.frame()) ncoef = length(coef(bigfit)) nr = nrow(data) width = as.integer(width)[1] stopifnot(width >= ncoef, width <= nr) y = bigfit$y x = bigfit$x terms = bigfit$terms inds = embed(seq(nr), width)[, rev(seq(width))] sumrys <- lapply(seq(nrow(inds)), function(st) { ind = inds[st,] fit = lm.fit(x[ind,], y[ind]) fit$terms = terms class(fit) = "lm" summary(fit) }) list(coefficients = sapply(sumrys, function(sm) coef(sm)[,"Estimate"]), Std.Error = sapply(sumrys, function(sm) coef(sm)[,"Std. Error"]), sigma = sapply(sumrys, "[[", "sigma"), r.squared = sapply(sumrys, "[[", "r.squared")) } for which the test gives > data(women) > movingWindow2(weight ~ height, women, 9) $coefficients [,1] [,2] [,3] [,4] [,5] [,6] (Intercept) -59.77778 -67.127778 -73.422222 -81.97222 -91.77778 -106.127778 height 3.00000 3.116667 3.216667 3.35000 3.50000 3.716667 [,7] (Intercept) -122.955556 height 3.966667 $Std.Error [,1] [,2] [,3] [,4] [,5] [,6] (Intercept) 3.77647036 2.64466036 3.70537766 4.71400546 5.1522157 6.60219186 height 0.06085806 0.04194352 0.05784947 0.07246601 0.0780042 0.09846709 [,7] (Intercept) 7.7792788 height 0.1143188 $sigma [1] 0.4714045 0.3248931 0.4481000 0.5613193 0.6042180 0.7627228 0.8855094 $r.squared [1] 0.9971276 0.9987338 0.9977411 0.9967352 0.9965351 0.9951107 0.9942195 ______________________________________________ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html