In the book Modern Applied Statistics with S, 4th edition, 2002, by Venables and Ripley, there is a function logitreg on page 445, which does provide the weighted logistic regression I asked, judging by the loss function. And interesting enough, logitreg provides the same coefficients as glm in the example I provided earlier, even with weights < 1. Also for residual deviance, logitreg yields the same number as glm. Unless I misunderstood something, I am convinced that glm is a valid tool for weighted logistic regression despite the description on weights and somehow questionable logLik value in the case of non-integer weights < 1. Perhaps this is a bold claim: the description of weights can be modified and logLik can be updated as well.
The stackexchange inquiry I provided is what I feel interesting, not the link in that post. Sorry for the confusion. On Sat, Aug 29, 2020 at 10:18 AM John Smith <jsw...@gmail.com> wrote: > Thanks for very insightful thoughts. What I am trying to achieve with the > weights is actually not new, something like > https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances. > I thought my inquiry was not too strange, and I could utilize some existing > codes. It is just an optimization problem at the end of day, or not? Thanks > > On Sat, Aug 29, 2020 at 9:02 AM John Fox <j...@mcmaster.ca> wrote: > >> Dear John, >> >> On 2020-08-29 1:30 a.m., John Smith wrote: >> > Thanks Prof. Fox. >> > >> > I am curious: what is the model estimated below? >> >> Nonsense, as Peter explained in a subsequent response to your prior >> posting. >> >> > >> > I guess my inquiry seems more complicated than I thought: with y being >> 0/1, how to fit weighted logistic regression with weights <1, in the sense >> of weighted least squares? Thanks >> >> What sense would that make? WLS is meant to account for non-constant >> error variance in a linear model, but in a binomial GLM, the variance is >> purely a function for the mean. >> >> If you had binomial (rather than binary 0/1) observations (i.e., >> binomial trials exceeding 1), then you could account for overdispersion, >> e.g., by introducing a dispersion parameter via the quasibinomial >> family, but that isn't equivalent to variance weights in a LM, rather to >> the error-variance parameter in a LM. >> >> I guess the question is what are you trying to achieve with the weights? >> >> Best, >> John >> >> > >> >> On Aug 28, 2020, at 10:51 PM, John Fox <j...@mcmaster.ca> wrote: >> >> >> >> Dear John >> >> >> >> I think that you misunderstand the use of the weights argument to >> glm() for a binomial GLM. From ?glm: "For a binomial GLM prior weights are >> used to give the number of trials when the response is the proportion of >> successes." That is, in this case y should be the observed proportion of >> successes (i.e., between 0 and 1) and the weights are integers giving the >> number of trials for each binomial observation. >> >> >> >> I hope this helps, >> >> John >> >> >> >> John Fox, Professor Emeritus >> >> McMaster University >> >> Hamilton, Ontario, Canada >> >> web: https://socialsciences.mcmaster.ca/jfox/ >> >> >> >>> On 2020-08-28 9:28 p.m., John Smith wrote: >> >>> If the weights < 1, then we have different values! See an example >> below. >> >>> How should I interpret logLik value then? >> >>> set.seed(135) >> >>> y <- c(rep(0, 50), rep(1, 50)) >> >>> x <- rnorm(100) >> >>> data <- data.frame(cbind(x, y)) >> >>> weights <- c(rep(1, 50), rep(2, 50)) >> >>> fit <- glm(y~x, data, family=binomial(), weights/10) >> >>> res.dev <- residuals(fit, type="deviance") >> >>> res2 <- -0.5*res.dev^2 >> >>> cat("loglikelihood value", logLik(fit), sum(res2), "\n") >> >>>> On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard <pda...@gmail.com> >> wrote: >> >>>> If you don't worry too much about an additive constant, then half the >> >>>> negative squared deviance residuals should do. (Not quite sure how >> weights >> >>>> factor in. Looks like they are accounted for.) >> >>>> >> >>>> -pd >> >>>> >> >>>>> On 25 Aug 2020, at 17:33 , John Smith <jsw...@gmail.com> wrote: >> >>>>> >> >>>>> Dear R-help, >> >>>>> >> >>>>> The function logLik can be used to obtain the maximum log-likelihood >> >>>> value >> >>>>> from a glm object. This is an aggregated value, a summation of >> individual >> >>>>> log-likelihood values. How do I obtain individual values? In the >> >>>> following >> >>>>> example, I would expect 9 numbers since the response has length 9. I >> >>>> could >> >>>>> write a function to compute the values, but there are lots of >> >>>>> family members in glm, and I am trying not to reinvent wheels. >> Thanks! >> >>>>> >> >>>>> counts <- c(18,17,15,20,10,20,25,13,12) >> >>>>> outcome <- gl(3,1,9) >> >>>>> treatment <- gl(3,3) >> >>>>> data.frame(treatment, outcome, counts) # showing data >> >>>>> glm.D93 <- glm(counts ~ outcome + treatment, family = >> poisson()) >> >>>>> (ll <- logLik(glm.D93)) >> >>>>> >> >>>>> [[alternative HTML version deleted]] >> >>>>> >> >>>>> ______________________________________________ >> >>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> >>>>> https://stat.ethz.ch/mailman/listinfo/r-help >> >>>>> PLEASE do read the posting guide >> >>>> http://www.R-project.org/posting-guide.html >> >>>>> and provide commented, minimal, self-contained, reproducible code. >> >>>> >> >>>> -- >> >>>> Peter Dalgaard, Professor, >> >>>> Center for Statistics, Copenhagen Business School >> >>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark >> >>>> Phone: (+45)38153501 >> >>>> Office: A 4.23 >> >>>> Email: pd....@cbs.dk Priv: pda...@gmail.com >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>> [[alternative HTML version deleted]] >> >>> ______________________________________________ >> >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> >>> https://stat.ethz.ch/mailman/listinfo/r-help >> >>> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> >>> and provide commented, minimal, self-contained, reproducible code. >> > [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.