Re: [R] How to obtain individual log-likelihood value from glm?

John Fox Sat, 29 Aug 2020 11:15:37 -0700

Dear John,

If you look at the code for logitreg() in the MASS text, you'll see thatthe casewise components of the log-likelihood are multiplied by thecorresponding weights. As far as I can see, this only makes sense if theweights are binomial trials. Otherwise, while the coefficientsthemselves will be the same as obtained for proportionally similarinteger weights (e.g., using your weights rather than weights/10),quantities such as the maximized log-likelihood, deviance, andcoefficient standard errors will be uninterpretable.

logitreg() is simply another way to compute the MLE, using ageneral-purpose optimizer rather than than iteratively weightedleast-squares, which is what glm() uses. That the two functions providethe same answer within rounding error is unsurprising -- they're solvingthe same problem. A difference between the two functions is that glm()issues a warning about non-integer weights, while logitreg() doesn't. AsI understand it, the motivation for writing logitreg() is to provide afunction that could easily be modified, e.g., to impose parameterconstraints on the solution.

I think that this discussion has gotten unproductive. If you feel thatproceeding with noninteger weights makes sense, for a reason that Idon't understand, then you should go ahead.


Best,
 John

On 2020-08-29 1:23 p.m., John Smith wrote:

In the book Modern Applied Statistics with S, 4th edition, 2002, byVenables and Ripley, there is a function logitreg on page 445, whichdoes provide the weighted logistic regression I asked, judging by theloss function. And interesting enough, logitreg provides the samecoefficients as glm in the example I provided earlier, even with weights< 1. Also for residual deviance, logitreg yields the same number as glm.Unless I misunderstood something, I am convinced that glm is avalid tool for weighted logistic regression despite the description onweights and somehow questionable logLik value in the case of non-integerweights < 1. Perhaps this is a bold claim: the description of weightscan be modified and logLik can be updated as well.

The stackexchange inquiry I provided is what I feel interesting, not thelink in that post. Sorry for the confusion.

On Sat, Aug 29, 2020 at 10:18 AM John Smith <jsw...@gmail.com<mailto:jsw...@gmail.com>> wrote:


    Thanks for very insightful thoughts. What I am trying to achieve
    with the weights is actually not new, something like
    
https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances.
    I thought my inquiry was not too strange, and I could utilize some
    existing codes. It is just an optimization problem at the end of
    day, or not? Thanks

    On Sat, Aug 29, 2020 at 9:02 AM John Fox <j...@mcmaster.ca
    <mailto:j...@mcmaster.ca>> wrote:

        Dear John,

        On 2020-08-29 1:30 a.m., John Smith wrote:
         > Thanks Prof. Fox.
         >
         > I am curious: what is the model estimated below?

        Nonsense, as Peter explained in a subsequent response to your
        prior posting.

         >
         > I guess my inquiry seems more complicated than I thought:
        with y being 0/1, how to fit weighted logistic regression with
        weights <1, in the sense of weighted least squares? Thanks

        What sense would that make? WLS is meant to account for
        non-constant
        error variance in a linear model, but in a binomial GLM, the
        variance is
        purely a function for the mean.

        If you had binomial (rather than binary 0/1) observations (i.e.,
        binomial trials exceeding 1), then you could account for
        overdispersion,
        e.g., by introducing a dispersion parameter via the quasibinomial
        family, but that isn't equivalent to variance weights in a LM,
        rather to
        the error-variance parameter in a LM.

        I guess the question is what are you trying to achieve with the
        weights?

        Best,
           John

         >
         >> On Aug 28, 2020, at 10:51 PM, John Fox <j...@mcmaster.ca
        <mailto:j...@mcmaster.ca>> wrote:
         >>
         >> Dear John
         >>
         >> I think that you misunderstand the use of the weights
        argument to glm() for a binomial GLM. From ?glm: "For a binomial
        GLM prior weights are used to give the number of trials when the
        response is the proportion of successes." That is, in this case
        y should be the observed proportion of successes (i.e., between
        0 and 1) and the weights are integers giving the number of
        trials for each binomial observation.
         >>
         >> I hope this helps,
         >> John
         >>
         >> John Fox, Professor Emeritus
         >> McMaster University
         >> Hamilton, Ontario, Canada
         >> web: https://socialsciences.mcmaster.ca/jfox/
         >>
         >>> On 2020-08-28 9:28 p.m., John Smith wrote:
         >>> If the weights < 1, then we have different values! See an
        example below.
         >>> How  should I interpret logLik value then?
         >>> set.seed(135)
         >>>   y <- c(rep(0, 50), rep(1, 50))
         >>>   x <- rnorm(100)
         >>>   data <- data.frame(cbind(x, y))
         >>>   weights <- c(rep(1, 50), rep(2, 50))
         >>>   fit <- glm(y~x, data, family=binomial(), weights/10)
         >>> res.dev <http://res.dev> <- residuals(fit, type="deviance")
         >>>   res2 <- -0.5*res.dev <http://res.dev>^2
         >>>   cat("loglikelihood value", logLik(fit), sum(res2), "\n")
         >>>> On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard
        <pda...@gmail.com <mailto:pda...@gmail.com>> wrote:
         >>>> If you don't worry too much about an additive constant,
        then half the
         >>>> negative squared deviance residuals should do. (Not quite
        sure how weights
         >>>> factor in. Looks like they are accounted for.)
         >>>>
         >>>> -pd
         >>>>
         >>>>> On 25 Aug 2020, at 17:33 , John Smith <jsw...@gmail.com
        <mailto:jsw...@gmail.com>> wrote:
         >>>>>
         >>>>> Dear R-help,
         >>>>>
         >>>>> The function logLik can be used to obtain the maximum
        log-likelihood
         >>>> value
         >>>>> from a glm object. This is an aggregated value, a
        summation of individual
         >>>>> log-likelihood values. How do I obtain individual values?
        In the
         >>>> following
         >>>>> example, I would expect 9 numbers since the response has
        length 9. I
         >>>> could
         >>>>> write a function to compute the values, but there are lots of
         >>>>> family members in glm, and I am trying not to reinvent
        wheels. Thanks!
         >>>>>
         >>>>> counts <- c(18,17,15,20,10,20,25,13,12)
         >>>>>      outcome <- gl(3,1,9)
         >>>>>      treatment <- gl(3,3)
         >>>>>      data.frame(treatment, outcome, counts) # showing data
         >>>>>      glm.D93 <- glm(counts ~ outcome + treatment, family
        = poisson())
         >>>>>      (ll <- logLik(glm.D93))
         >>>>>
         >>>>>        [[alternative HTML version deleted]]
         >>>>>
         >>>>> ______________________________________________
         >>>>> R-help@r-project.org <mailto:R-help@r-project.org>
        mailing list -- To UNSUBSCRIBE and more, see
         >>>>> https://stat.ethz.ch/mailman/listinfo/r-help
         >>>>> PLEASE do read the posting guide
         >>>> http://www.R-project.org/posting-guide.html
         >>>>> and provide commented, minimal, self-contained,
        reproducible code.
         >>>>
         >>>> --
         >>>> Peter Dalgaard, Professor,
         >>>> Center for Statistics, Copenhagen Business School
         >>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
         >>>> Phone: (+45)38153501
         >>>> Office: A 4.23
         >>>> Email: pd....@cbs.dk <mailto:pd....@cbs.dk>  Priv:
        pda...@gmail.com <mailto:pda...@gmail.com>
         >>>>
         >>>>
         >>>>
         >>>>
         >>>>
         >>>>
         >>>>
         >>>>
         >>>>
         >>>>
         >>>     [[alternative HTML version deleted]]
         >>> ______________________________________________
         >>> R-help@r-project.org <mailto:R-help@r-project.org> mailing
        list -- To UNSUBSCRIBE and more, see
         >>> https://stat.ethz.ch/mailman/listinfo/r-help
         >>> PLEASE do read the posting guide
        http://www.R-project.org/posting-guide.html
         >>> and provide commented, minimal, self-contained,
        reproducible code.


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to obtain individual log-likelihood value from glm?

Reply via email to