Re: [R] question about update()
Hi, Berwin,

good to hear from you, and thanks for the detailed comments and suggestion.

Actually, my current experimental code already works the way you suggest, calling lm.fit and glm.fit directly. What I am trying to develop is an “improved” version of the code for distribution to other people. Hence I wanted to streamline the code, in particular avoiding a separate branch for each fitting procedure (lm.fit, glm.fit and possibly more). But I am now considering dropping the idea of the “improved” version and sticking to the direct calls to the fitting functions.

Duncan, thanks for your additional comments. It is true that my original message presented a very simplified picture of the problem, possibly an over-simplified one. If I presented the problem with the full version of the code, it would look quite long and messy. If I manage to construct a reasonably simplified version of the code, I shall post the question again.

Best wishes,

Adelchi

> On 4 May 2023, at 11:44, Berwin A Turlach wrote:
>
> G'day Adelchi,
>
> hope all is well with you.
>
> On Thu, 4 May 2023 10:34:00 +0200
> Adelchi Azzalini via R-help wrote:
>
>> Thanks, Duncan. What you indicate is surely the ideal route.
>> Unfortunately, in my case this is not feasible, because the
>> construction of xf and the update call are within an iterative
>> procedure where xf is changed at each iteration, so that the steps
>>
>> obj$data <- cbind(obj$data, xf=xf)
>> new.obj <- update(obj, . ~ . + xf)
>>
>> must be repeated hundreds of times, each with a different xf.
>
> If memory serves correctly, update() takes the object that is passed to
> it, looks at the call that created that object, modifies that call
> according to the additional arguments, and finally executes the
> modified call.
>
> So there is a lot of manipulation going on in update(). In particular,
> each update() would result in a call to lm(), glm() or whatever
> function was used to create the object. Inside any of these modelling
> functions a lot of symbolic manipulations/calculations are needed too
> (parsing the formula, creating the design matrix and response vector
> from the parsed formula and data frame, checking if weights are used).
>
> If you do essentially the same calculation over and over again, just
> with minor modifications, all these symbolic manipulations just
> consume time.
>
> IMHO, you will be better off bypassing update() and just using lm.fit()
> (for which lm() is a nice front-end) and glm.fit() (for which glm() is a
> nice front-end), or whatever routine does the grunt work of fitting the
> model to the data in your application (hopefully, the package creator
> used a setup in which XXX.fit() fits the model and is called by XXX(),
> which does all the fancy formula handling).
>
> Cheers,
>
> Berwin
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] question about update()
G'day Adelchi,

hope all is well with you.

On Thu, 4 May 2023 10:34:00 +0200
Adelchi Azzalini via R-help wrote:

> Thanks, Duncan. What you indicate is surely the ideal route.
> Unfortunately, in my case this is not feasible, because the
> construction of xf and the update call are within an iterative
> procedure where xf is changed at each iteration, so that the steps
>
> obj$data <- cbind(obj$data, xf=xf)
> new.obj <- update(obj, . ~ . + xf)
>
> must be repeated hundreds of times, each with a different xf.

If memory serves correctly, update() takes the object that is passed to it, looks at the call that created that object, modifies that call according to the additional arguments, and finally executes the modified call.

So there is a lot of manipulation going on in update(). In particular, each update() would result in a call to lm(), glm() or whatever function was used to create the object. Inside any of these modelling functions a lot of symbolic manipulations/calculations are needed too (parsing the formula, creating the design matrix and response vector from the parsed formula and data frame, checking if weights are used).

If you do essentially the same calculation over and over again, just with minor modifications, all these symbolic manipulations just consume time.

IMHO, you will be better off bypassing update() and just using lm.fit() (for which lm() is a nice front-end) and glm.fit() (for which glm() is a nice front-end), or whatever routine does the grunt work of fitting the model to the data in your application (hopefully, the package creator used a setup in which XXX.fit() fits the model and is called by XXX(), which does all the fancy formula handling).

Cheers,

Berwin
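A minimal sketch of the lm.fit()/glm.fit() route suggested above, under assumptions that are not in the thread: the data frame `dat`, the variables `x` and `y`, the count response `yc`, the loop length `n.iter` and the random `xf` are all made up for illustration. The formula handling (building the design matrix) is done once, and each iteration only appends the new column before calling the fitting routine.

```r
# Hypothetical data, for illustration only (not from the original thread)
set.seed(1)
dat <- data.frame(x = rnorm(50))
dat$y <- 1 + 2 * dat$x + rnorm(50)

# Formula handling happens once, outside the loop
X0 <- model.matrix(y ~ x, data = dat)   # columns "(Intercept)" and "x"
y  <- dat$y

n.iter <- 5                              # stands in for "hundreds of times"
coefs  <- matrix(NA_real_, nrow = n.iter, ncol = ncol(X0) + 1)

for (i in seq_len(n.iter)) {
  xf  <- rnorm(nrow(dat))                # placeholder for the xf built at each iteration
  X   <- cbind(X0, xf = xf)              # append the new column to the fixed design matrix
  fit <- lm.fit(X, y)                    # numerical fit only, no formula parsing
  coefs[i, ] <- fit$coefficients
}

# The glm analogue takes the same kind of design matrix plus a family object
yc   <- rpois(nrow(dat), lambda = exp(0.2 + 0.3 * dat$x))  # made-up count response
gfit <- glm.fit(X, yc, family = poisson())
```

Note that lm.fit() and glm.fit() return plain lists rather than objects of class "lm" or "glm", so methods such as summary() or predict() are not directly available on the result.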
Re: [R] question about update()
On 04/05/2023 4:34 a.m., Adelchi Azzalini wrote:
>
>> On 4 May 2023, at 10:26, Duncan Murdoch wrote:
>>
>> On 04/05/2023 4:05 a.m., Adelchi Azzalini via R-help wrote:
>>> Hi. There must be something about the use of update() which I do not grasp,
>>> as the next exercise indicates.
>>> Suppose that obj is an object returned by a call to lm() or glm().
>>> Next, a new variable xf is constructed using the same dataframe used
>>> for producing obj. Then
>>> obj$data <- cbind(obj$data, xf=xf)
>>> new.obj <- update(obj, . ~ . + xf)
>>> generates
>>> Error in eval(predvars, data, env) : object 'xf' not found
>>> Could somebody explain what I got wrong, and how to fix it?
>>
>> I don't think you should be modifying the obj$data element: as far as I can
>> see, it's not used during the update, which will just re-evaluate the
>> original call to glm(). So you should modify the dataframe that you passed
>> in when creating obj.
>
> Thanks, Duncan. What you indicate is surely the ideal route.
> Unfortunately, in my case this is not feasible, because the
> construction of xf and the update call are within an iterative
> procedure where xf is changed at each iteration, so that the steps
>
> obj$data <- cbind(obj$data, xf=xf)
> new.obj <- update(obj, . ~ . + xf)
>
> must be repeated hundreds of times, each with a different xf.

Sorry, that doesn't make sense. You didn't show us complete code, but presumably it's preceded by something like this:

  obj <- glm( ..., data = somedata)

So change your modification to this:

  somedata$xf <- xf

That can be done hundreds of times.

This will need to be more elaborate if the function doing the iteration has a copy of obj but doesn't have a copy of somedata, but there are lots of ways to resolve that. Without seeing complete code, I can't recommend which one to use.

Duncan Murdoch
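The suggestion above can be sketched as follows. This is only an illustration under stated assumptions: the contents of `somedata`, the Poisson model, the loop length `n.iter` and the random `xf` are hypothetical, since the complete code was not posted. The point is that overwriting the `xf` column of the data frame named in the original call lets update() find the new variable at every iteration.

```r
# Hypothetical data and model, for illustration only (not from the thread)
set.seed(1)
somedata <- data.frame(x = rnorm(50))
somedata$y <- rpois(50, lambda = exp(0.3 + 0.5 * somedata$x))
obj <- glm(y ~ x, family = poisson, data = somedata)

n.iter <- 5                                # stands in for "hundreds of times"
fits <- vector("list", n.iter)

for (i in seq_len(n.iter)) {
  xf <- rnorm(nrow(somedata))              # placeholder for the xf built at each iteration
  somedata$xf <- xf                        # overwrite the column in the original data frame
  fits[[i]] <- update(obj, . ~ . + xf)     # re-evaluates the glm() call with data = somedata
}

names(coef(fits[[1]]))                     # coefficient names of the first updated fit
```

This works here because the loop runs in the same environment that holds `somedata`; as noted above, a function that has `obj` but not `somedata` would need a more elaborate arrangement.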
Re: [R] question about update()
> On 4 May 2023, at 10:26, Duncan Murdoch wrote:
>
> On 04/05/2023 4:05 a.m., Adelchi Azzalini via R-help wrote:
>> Hi. There must be something about the use of update() which I do not grasp,
>> as the next exercise indicates.
>> Suppose that obj is an object returned by a call to lm() or glm().
>> Next, a new variable xf is constructed using the same dataframe used
>> for producing obj. Then
>> obj$data <- cbind(obj$data, xf=xf)
>> new.obj <- update(obj, . ~ . + xf)
>> generates
>> Error in eval(predvars, data, env) : object 'xf' not found
>> Could somebody explain what I got wrong, and how to fix it?
>
> I don't think you should be modifying the obj$data element: as far as I can
> see, it's not used during the update, which will just re-evaluate the
> original call to glm(). So you should modify the dataframe that you passed
> in when creating obj.
>

Thanks, Duncan. What you indicate is surely the ideal route. Unfortunately, in my case this is not feasible, because the construction of xf and the update call are within an iterative procedure where xf is changed at each iteration, so that the steps

  obj$data <- cbind(obj$data, xf=xf)
  new.obj <- update(obj, . ~ . + xf)

must be repeated hundreds of times, each with a different xf.

Adelchi
[R] question about update()
Hi. There must be something about the use of update() which I do not grasp, as the next exercise indicates.

Suppose that obj is an object returned by a call to lm() or glm(). Next, a new variable xf is constructed using the same dataframe used for producing obj. Then

  obj$data <- cbind(obj$data, xf=xf)
  new.obj <- update(obj, . ~ . + xf)

generates

  Error in eval(predvars, data, env) : object 'xf' not found

Could somebody explain what I got wrong, and how to fix it?

Best regards,

Adelchi Azzalini
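One way the error in the question can arise is sketched below. It rests on an assumption not stated in the post, namely that xf is created inside a function; the data frame `dat`, the Gaussian glm() fit and the helper `add_term()` are hypothetical names used only for illustration. Because update() re-evaluates the stored call, which still refers to the original data frame, the column added to obj$data is never consulted.

```r
# Hypothetical data and model, for illustration only (not from the post)
set.seed(1)
dat <- data.frame(x = rnorm(20))
dat$y <- 1 + dat$x + rnorm(20)
obj <- glm(y ~ x, data = dat)

# Hypothetical helper: xf exists only inside this function
add_term <- function(obj) {
  xf <- rnorm(nrow(obj$data))
  obj$data <- cbind(obj$data, xf = xf)   # changes the stored copy, not 'dat'
  update(obj, . ~ . + xf)                # re-runs glm(..., data = dat)
}

# Capture the error message instead of stopping
res <- tryCatch(add_term(obj), error = function(e) conditionMessage(e))

obj$call   # the stored call that update() modifies and re-evaluates
```

In this sketch xf is neither a column of `dat` nor visible from the environment of the model formula, which is where the modelling function looks for variables.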