On Sep 7, 2010, at 11:02 AM, Johann Hibschman wrote:

Is there any package that assists in saving and reconstituting glm and
nls fits without bringing along the accompanying data?  A quick search
on CRAN didn't turn up anything.

If not, how do other people deal with saving the coefficients of model
fits?

For example, I've run a glm fit that has 23 coefficients on a data set that
had 193,008 rows, by the time the fit was called.  When I save the
resulting fit object, I get a 491 MB object, which suggests that it's
pulling along all sorts of junk in the environment, as 23*193k*8 is only
34 MB.  Even so, I would prefer to only save the coefficients

Have you read through the Value section of glm's help page?

...and

?coef

and the
Hessian, not the fit data set.

I'm not sure whether there will be a Hessian in a glm object. Have you run str() on your objects? It's likely that the residuals, fitted.values, weights, prior.weights, and linear.predictors are going to be fairly large. You could use lapply to run object.size on each component to see whether I have missed any. When I do that on the first help page example, the model component is the second largest, but its inclusion is optional. The largest component is "family", but I suspect that is a family of functions and would not increase in size with larger models.
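As a rough sketch of that check (the data are the first example from ?glm; the names fit, sizes, and fit_small are just illustrative), something like this lists the components by size and then refits without the optional model frame and response. Note that object.size does not count environments attached to the formula or terms, so a saved file can still be much larger than these numbers suggest.

```r
# First example from ?glm (Dobson's randomized controlled trial data).
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
fit <- glm(counts ~ outcome + treatment, family = poisson())

# Size in bytes of each component of the fit, largest first.
sizes <- sapply(fit, function(component) as.numeric(object.size(component)))
print(sort(sizes, decreasing = TRUE))

# The model frame and the response vector are optional; drop them at fit time.
fit_small <- glm(counts ~ outcome + treatment, family = poisson(),
                 model = FALSE, y = FALSE)
print(names(fit_small))
```

The coefficients are unchanged by model = FALSE and y = FALSE; only the stored copies of the data go away.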

Is there anything I can do?  If I want to save several fits, 490 MB a
shot starts to add up very quickly. If I just save the coefficients, I have to manually hack up an object that I can then run 'predict' on when
I want to evaluate the model, and that feels very error-prone.

The predict.glm function is visible so you can just type its name to see the code. It appears that the section of the code that does the work is fairly short. This is my nomination for what happens in most cases:

if (!se.fit) {  # the usual case; predict is not generally invoked with se.fit=TRUE
    if (missing(newdata)) {
        # ... branch that returns the stored fitted values, omitted here
    }
    else {
        pred <- predict.lm(object, newdata, se.fit, scale = 1,
            type = ifelse(type == "link", "response", type),
            terms = terms, na.action = na.action)
        switch(type, response = {
            pred <- family(object)$linkinv(pred)
        }, link = , terms = )
    }
}

So maybe you should write a predict function that would work on a reduced glm object that has a class name of your choosing.
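Here is a minimal sketch of what such a reduced object might look like. The class name slim_glm and the helper of the same name are made up for illustration, not part of any package. It assumes a full-rank fit with no offset, supports only type "link" and "response", and does not compute standard errors. It keeps coef() and vcov() (the covariance matrix of the coefficients, which is what one would typically want from the Hessian) plus the few pieces needed to rebuild a model matrix from new data.

```r
# Hypothetical helper: keep only what prediction on new data needs.
slim_glm <- function(fit) {
  tt <- delete.response(terms(fit))
  # Point the terms at the global environment so that saving the object
  # does not drag along whatever environment the formula was created in.
  environment(tt) <- globalenv()
  structure(
    list(coefficients = coef(fit),
         vcov         = vcov(fit),
         terms        = tt,
         xlevels      = fit$xlevels,
         contrasts    = fit$contrasts,
         linkinv      = family(fit)$linkinv),
    class = "slim_glm")
}

# Assumes a full-rank fit (no NA coefficients) and no offset term.
predict.slim_glm <- function(object, newdata, type = c("link", "response"), ...) {
  type <- match.arg(type)
  mf <- model.frame(object$terms, newdata, xlev = object$xlevels)
  X <- model.matrix(object$terms, mf, contrasts.arg = object$contrasts)
  eta <- drop(X %*% object$coefficients)
  if (type == "response") object$linkinv(eta) else eta
}

# Usage on the first example from ?glm, here placed in a data frame.
d <- data.frame(counts = c(18, 17, 15, 20, 10, 20, 25, 13, 12),
                outcome = gl(3, 1, 9),
                treatment = gl(3, 3))
fit <- glm(counts ~ outcome + treatment, family = poisson(), data = d)
small <- slim_glm(fit)

nd <- d[c("outcome", "treatment")]
p_full <- predict(fit, newdata = nd, type = "response")
p_small <- predict(small, nd, type = "response")
print(all.equal(unname(p_small), unname(p_full)))
```

Checking the reduced predictions against predict.glm on a sample of the data, as in the last lines, is a cheap guard against the hand-rolled version drifting from the real one.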

--

David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
