G'day Rolf,

On Fri, 06 May 2011 09:58:50 +1200
Rolf Turner <rolf.tur...@xtra.co.nz> wrote:

> but it's strange that the dodgey code throws an error with gam(dat1$y
> ~ s(dat1$x)) but not with gam(dat2$cf ~ s(dat2$s))
>
> Something a bit subtle is going on; it would be nice to be able to
> understand it.

Well,

R> traceback()
3: eval(expr, envir, enclos)
2: eval(inp, data, parent.frame())
1: gam(dat$y ~ s(dat$x))

So the lines leading up to the problem seem to be the following from the
gam() function:

    vars <- all.vars(gp$fake.formula[-2])
    inp <- parse(text = paste("list(", paste(vars, collapse = ","), ")"))
    if (!is.list(data) && !is.data.frame(data))
        data <- as.data.frame(data)

Setting

R> options(error=recover)

running the code until the error occurs, and then examining the frame
number for the gam() call shows that "inp" is "expression(list( dat1,x ))"
in your first example and "expression(list( dat2,s ))" in your second
example. In both examples, "data" is "list()" (not surprisingly).

When

    dl <- eval(inp, data, parent.frame())

is executed, it tries to eval "inp". In both cases "dat1" and "dat2" are
found, obviously, in the parent frame. In your first example "x" is
(typically) not found and an error is thrown; in your second example an
object with name "s" is found in "package:mgcv" and the call to eval
succeeds. "dl" becomes a list with two components, the first being,
respectively, "dat1" or "dat2", and the second the function "s". (To
verify that, you should probably issue the command "debug(gam)" and step
through those first few lines of the function until you reach the above
command.)

The corollary is that you can use the name of any object that R will find
when it searches from the parent frame onwards. If it is another data set,
then that data set will become the second component of "dl".
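
If you want to see the lookup without stepping through gam(), here is a rough base-R sketch of the same three steps (all.vars, parse, eval). It is not the mgcv code: "lookup" and "env" are names I made up for illustration, the "s" defined in "env" is only a stand-in for the one in package:mgcv, and "env" is given baseenv() as its parent so that nothing in your workspace interferes.

```r
# Hypothetical helper mimicking the lines quoted from gam(): collect the
# symbols on the right-hand side of the formula, build the expression
# list(<symbols>), and evaluate it with an empty 'data' list.
lookup <- function(formula, env) {
  vars <- all.vars(formula[-2])   # drop the response, keep RHS symbols
  inp <- parse(text = paste("list(", paste(vars, collapse = ","), ")"))
  eval(inp, list(), env)          # 'data' is list(), so lookup falls to env
}

# Isolated environment standing in for the parent frame; its parent is
# baseenv(), so only base objects and what we put here are visible.
env <- new.env(parent = baseenv())
env$dat1 <- data.frame(x = 1:5, y = (1:5)^2)
env$dat2 <- data.frame(s = 1:5, cf = (1:5)^2)
env$s <- function(x) x            # stand-in for mgcv's s(), an assumption

# First case: the symbols are dat1 and x; x is not visible, so eval fails.
r1 <- tryCatch(lookup(dat1$y ~ s(dat1$x), env), error = function(e) e)

# Second case: the symbols are dat2 and s; both are visible, so eval succeeds.
r2 <- lookup(dat2$cf ~ s(dat2$s), env)

inherits(r1, "error")   # TRUE
is.function(r2[[2]])    # TRUE
```

The point of the sketch is only that the column name after "$" is looked up as if it were a free-standing object, so whether the call fails depends on what else happens to be visible under that name.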

E.g.:

R> dat=data.frame(min=1:100,cf=sin(1:100/50)+rnorm(100,0,.05))
R> gam(dat$cf ~ s(dat$min))

Family: gaussian
Link function: identity

Formula:
dat$cf ~ s(dat$min)

Estimated degrees of freedom:
3.8925  total = 4.892488

GCV score: 0.002704789

Or

R> dat=data.frame(BOD=1:100,cf=sin(1:100/50)+rnorm(100,0,.05))
R> gam(dat$cf ~ s(dat$BOD))

Family: gaussian
Link function: identity

Formula:
dat$cf ~ s(dat$BOD)

Estimated degrees of freedom:
3.9393  total = 4.939297

GCV score: 0.002666985

> Just out of pure academic interest. :-)

Hope your academic curiosity is now satisfied. :)

HTH.

Cheers,

        Berwin

========================== Full address ============================
A/Prof Berwin A Turlach               Tel.: +61 (8) 6488 3338 (secr)
School of Maths and Stats (M019)            +61 (8) 6488 3383 (self)
The University of Western Australia   FAX : +61 (8) 6488 1028
35 Stirling Highway
Crawley WA 6009                e-mail: berwin.turl...@gmail.com
Australia                        http://www.maths.uwa.edu.au/~berwin