Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread John Fox

Dear John,

If you look at the code for logitreg() in the MASS text, you'll see that 
the casewise components of the log-likelihood are multiplied by the 
corresponding weights. As far as I can see, this only makes sense if the 
weights are binomial trials. Otherwise, while the coefficients 
themselves will be the same as obtained for proportionally similar 
integer weights (e.g., using your weights rather than weights/10), 
quantities such as the maximized log-likelihood, deviance, and 
coefficient standard errors will be uninterpretable.


logitreg() is simply another way to compute the MLE, using a 
general-purpose optimizer rather than than iteratively weighted 
least-squares, which is what glm() uses. That the two functions provide 
the same answer within rounding error is unsurprising -- they're solving 
the same problem. A difference between the two functions is that glm() 
issues a warning about non-integer weights, while logitreg() doesn't. As 
I understand it, the motivation for writing logitreg() is to provide a 
function that could easily be modified, e.g., to impose parameter 
constraints on the solution.


I think that this discussion has gotten unproductive. If you feel that 
proceeding with noninteger weights makes sense, for a reason that I 
don't understand, then you should go ahead.


Best,
 John

On 2020-08-29 1:23 p.m., John Smith wrote:
In the book Modern Applied Statistics with S, 4th edition, 2002, by 
Venables and Ripley, there is a function logitreg on page 445, which 
does provide the weighted logistic regression I asked, judging by the 
loss function. And interesting enough, logitreg provides the same 
coefficients as glm in the example I provided earlier, even with weights 
< 1. Also for residual deviance, logitreg yields the same number as glm. 
Unless I misunderstood something, I am convinced that glm is a 
valid tool for weighted logistic regression despite the description on 
weights and somehow questionable logLik value in the case of non-integer 
weights < 1. Perhaps this is a bold claim: the description of weights 
can be modified and logLik can be updated as well.


The stackexchange inquiry I provided is what I feel interesting, not the 
link in that post. Sorry for the confusion.


On Sat, Aug 29, 2020 at 10:18 AM John Smith > wrote:


Thanks for very insightful thoughts. What I am trying to achieve
with the weights is actually not new, something like

https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances.
I thought my inquiry was not too strange, and I could utilize some
existing codes. It is just an optimization problem at the end of
day, or not? Thanks

On Sat, Aug 29, 2020 at 9:02 AM John Fox mailto:j...@mcmaster.ca>> wrote:

Dear John,

On 2020-08-29 1:30 a.m., John Smith wrote:
 > Thanks Prof. Fox.
 >
 > I am curious: what is the model estimated below?

Nonsense, as Peter explained in a subsequent response to your
prior posting.

 >
 > I guess my inquiry seems more complicated than I thought:
with y being 0/1, how to fit weighted logistic regression with
weights <1, in the sense of weighted least squares? Thanks

What sense would that make? WLS is meant to account for
non-constant
error variance in a linear model, but in a binomial GLM, the
variance is
purely a function for the mean.

If you had binomial (rather than binary 0/1) observations (i.e.,
binomial trials exceeding 1), then you could account for
overdispersion,
e.g., by introducing a dispersion parameter via the quasibinomial
family, but that isn't equivalent to variance weights in a LM,
rather to
the error-variance parameter in a LM.

I guess the question is what are you trying to achieve with the
weights?

Best,
   John

 >
 >> On Aug 28, 2020, at 10:51 PM, John Fox mailto:j...@mcmaster.ca>> wrote:
 >>
 >> Dear John
 >>
 >> I think that you misunderstand the use of the weights
argument to glm() for a binomial GLM. From ?glm: "For a binomial
GLM prior weights are used to give the number of trials when the
response is the proportion of successes." That is, in this case
y should be the observed proportion of successes (i.e., between
0 and 1) and the weights are integers giving the number of
trials for each binomial observation.
 >>
 >> I hope this helps,
 >> John
 >>
 >> John Fox, Professor Emeritus
 >> McMaster University
 >> Hamilton, Ontario, Canada
 >> web: https://socialsciences.mcmaster.ca/jfox/
 >>
 >>> On 2020-08-28 9:28 p.m., John Smith wrote:
 >>> If the weights < 1, then we have different values! See an
   

Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread John Smith
In the book Modern Applied Statistics with S, 4th edition, 2002, by
Venables and Ripley, there is a function logitreg on page 445, which does
provide the weighted logistic regression I asked, judging by the loss
function. And interesting enough, logitreg provides the same coefficients
as glm in the example I provided earlier, even with weights < 1. Also for
residual deviance, logitreg yields the same number as glm. Unless I
misunderstood something, I am convinced that glm is a valid tool for
weighted logistic regression despite the description on weights and somehow
questionable logLik value in the case of non-integer weights < 1. Perhaps
this is a bold claim: the description of weights can be modified and logLik
can be updated as well.

The stackexchange inquiry I provided is what I feel interesting, not the
link in that post. Sorry for the confusion.

On Sat, Aug 29, 2020 at 10:18 AM John Smith  wrote:

> Thanks for very insightful thoughts. What I am trying to achieve with the
> weights is actually not new, something like
> https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances.
> I thought my inquiry was not too strange, and I could utilize some existing
> codes. It is just an optimization problem at the end of day, or not? Thanks
>
> On Sat, Aug 29, 2020 at 9:02 AM John Fox  wrote:
>
>> Dear John,
>>
>> On 2020-08-29 1:30 a.m., John Smith wrote:
>> > Thanks Prof. Fox.
>> >
>> > I am curious: what is the model estimated below?
>>
>> Nonsense, as Peter explained in a subsequent response to your prior
>> posting.
>>
>> >
>> > I guess my inquiry seems more complicated than I thought: with y being
>> 0/1, how to fit weighted logistic regression with weights <1, in the sense
>> of weighted least squares? Thanks
>>
>> What sense would that make? WLS is meant to account for non-constant
>> error variance in a linear model, but in a binomial GLM, the variance is
>> purely a function for the mean.
>>
>> If you had binomial (rather than binary 0/1) observations (i.e.,
>> binomial trials exceeding 1), then you could account for overdispersion,
>> e.g., by introducing a dispersion parameter via the quasibinomial
>> family, but that isn't equivalent to variance weights in a LM, rather to
>> the error-variance parameter in a LM.
>>
>> I guess the question is what are you trying to achieve with the weights?
>>
>> Best,
>>   John
>>
>> >
>> >> On Aug 28, 2020, at 10:51 PM, John Fox  wrote:
>> >>
>> >> Dear John
>> >>
>> >> I think that you misunderstand the use of the weights argument to
>> glm() for a binomial GLM. From ?glm: "For a binomial GLM prior weights are
>> used to give the number of trials when the response is the proportion of
>> successes." That is, in this case y should be the observed proportion of
>> successes (i.e., between 0 and 1) and the weights are integers giving the
>> number of trials for each binomial observation.
>> >>
>> >> I hope this helps,
>> >> John
>> >>
>> >> John Fox, Professor Emeritus
>> >> McMaster University
>> >> Hamilton, Ontario, Canada
>> >> web: https://socialsciences.mcmaster.ca/jfox/
>> >>
>> >>> On 2020-08-28 9:28 p.m., John Smith wrote:
>> >>> If the weights < 1, then we have different values! See an example
>> below.
>> >>> How  should I interpret logLik value then?
>> >>> set.seed(135)
>> >>>   y <- c(rep(0, 50), rep(1, 50))
>> >>>   x <- rnorm(100)
>> >>>   data <- data.frame(cbind(x, y))
>> >>>   weights <- c(rep(1, 50), rep(2, 50))
>> >>>   fit <- glm(y~x, data, family=binomial(), weights/10)
>> >>>   res.dev <- residuals(fit, type="deviance")
>> >>>   res2 <- -0.5*res.dev^2
>> >>>   cat("loglikelihood value", logLik(fit), sum(res2), "\n")
>>  On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard 
>> wrote:
>>  If you don't worry too much about an additive constant, then half the
>>  negative squared deviance residuals should do. (Not quite sure how
>> weights
>>  factor in. Looks like they are accounted for.)
>> 
>>  -pd
>> 
>> > On 25 Aug 2020, at 17:33 , John Smith  wrote:
>> >
>> > Dear R-help,
>> >
>> > The function logLik can be used to obtain the maximum log-likelihood
>>  value
>> > from a glm object. This is an aggregated value, a summation of
>> individual
>> > log-likelihood values. How do I obtain individual values? In the
>>  following
>> > example, I would expect 9 numbers since the response has length 9. I
>>  could
>> > write a function to compute the values, but there are lots of
>> > family members in glm, and I am trying not to reinvent wheels.
>> Thanks!
>> >
>> > counts <- c(18,17,15,20,10,20,25,13,12)
>> >  outcome <- gl(3,1,9)
>> >  treatment <- gl(3,3)
>> >  data.frame(treatment, outcome, counts) # showing data
>> >  glm.D93 <- glm(counts ~ outcome + treatment, family =
>> poisson())
>> >  (ll <- logLik(glm.D93))
>> >
>> >[[alternative HTML version deleted]]
>> >
>> 

Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread John Fox

Dear John,

On 2020-08-29 11:18 a.m., John Smith wrote:

Thanks for very insightful thoughts. What I am trying to achieve with the
weights is actually not new, something like
https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances.
I thought my inquiry was not too strange, and I could utilize some existing
codes. It is just an optimization problem at the end of day, or not? Thanks


So the object is to fit a regularized (i.e, penalized) logistic 
regression rather than to fit by ML. glm() won't do that.


I took a quick look at the stackexchange link that you provided and the 
document referenced in that link.  The penalty proposed in the document 
is just a multiple of the sum of squared regression coefficients, what 
usually called an L2 penalty in the machine-learning literature.  There 
are existing implementations of regularized logistic regression in R -- 
see the machine learning CRAN taskview 
. I believe 
that the penalized package will fit a regularized logistic regression 
with an L2 penalty.


As well, unless my quick reading was inaccurate, I think that you, and 
perhaps the stackexchange poster, might have been confused by the 
terminology used in the document: What's referred to as "weights" in the 
document is what statisticians more typically call "regression 
coefficients," and the "bias weight" is the "intercept" or "regression 
constant." Perhaps I'm missing some connection -- I'm not the best 
person to ask about machine learning.


Best,
 John



On Sat, Aug 29, 2020 at 9:02 AM John Fox  wrote:


Dear John,

On 2020-08-29 1:30 a.m., John Smith wrote:

Thanks Prof. Fox.

I am curious: what is the model estimated below?


Nonsense, as Peter explained in a subsequent response to your prior
posting.



I guess my inquiry seems more complicated than I thought: with y being

0/1, how to fit weighted logistic regression with weights <1, in the sense
of weighted least squares? Thanks

What sense would that make? WLS is meant to account for non-constant
error variance in a linear model, but in a binomial GLM, the variance is
purely a function for the mean.

If you had binomial (rather than binary 0/1) observations (i.e.,
binomial trials exceeding 1), then you could account for overdispersion,
e.g., by introducing a dispersion parameter via the quasibinomial
family, but that isn't equivalent to variance weights in a LM, rather to
the error-variance parameter in a LM.

I guess the question is what are you trying to achieve with the weights?

Best,
   John




On Aug 28, 2020, at 10:51 PM, John Fox  wrote:

Dear John

I think that you misunderstand the use of the weights argument to glm()

for a binomial GLM. From ?glm: "For a binomial GLM prior weights are used
to give the number of trials when the response is the proportion of
successes." That is, in this case y should be the observed proportion of
successes (i.e., between 0 and 1) and the weights are integers giving the
number of trials for each binomial observation.


I hope this helps,
John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/


On 2020-08-28 9:28 p.m., John Smith wrote:
If the weights < 1, then we have different values! See an example

below.

How  should I interpret logLik value then?
set.seed(135)
   y <- c(rep(0, 50), rep(1, 50))
   x <- rnorm(100)
   data <- data.frame(cbind(x, y))
   weights <- c(rep(1, 50), rep(2, 50))
   fit <- glm(y~x, data, family=binomial(), weights/10)
   res.dev <- residuals(fit, type="deviance")
   res2 <- -0.5*res.dev^2
   cat("loglikelihood value", logLik(fit), sum(res2), "\n")

On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard 

wrote:

If you don't worry too much about an additive constant, then half the
negative squared deviance residuals should do. (Not quite sure how

weights

factor in. Looks like they are accounted for.)

-pd


On 25 Aug 2020, at 17:33 , John Smith  wrote:

Dear R-help,

The function logLik can be used to obtain the maximum log-likelihood

value

from a glm object. This is an aggregated value, a summation of

individual

log-likelihood values. How do I obtain individual values? In the

following

example, I would expect 9 numbers since the response has length 9. I

could

write a function to compute the values, but there are lots of
family members in glm, and I am trying not to reinvent wheels.

Thanks!


counts <- c(18,17,15,20,10,20,25,13,12)
  outcome <- gl(3,1,9)
  treatment <- gl(3,3)
  data.frame(treatment, outcome, counts) # showing data
  glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
  (ll <- logLik(glm.D93))

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide


Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread John Smith
Thanks for very insightful thoughts. What I am trying to achieve with the
weights is actually not new, something like
https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances.
I thought my inquiry was not too strange, and I could utilize some existing
codes. It is just an optimization problem at the end of day, or not? Thanks

On Sat, Aug 29, 2020 at 9:02 AM John Fox  wrote:

> Dear John,
>
> On 2020-08-29 1:30 a.m., John Smith wrote:
> > Thanks Prof. Fox.
> >
> > I am curious: what is the model estimated below?
>
> Nonsense, as Peter explained in a subsequent response to your prior
> posting.
>
> >
> > I guess my inquiry seems more complicated than I thought: with y being
> 0/1, how to fit weighted logistic regression with weights <1, in the sense
> of weighted least squares? Thanks
>
> What sense would that make? WLS is meant to account for non-constant
> error variance in a linear model, but in a binomial GLM, the variance is
> purely a function for the mean.
>
> If you had binomial (rather than binary 0/1) observations (i.e.,
> binomial trials exceeding 1), then you could account for overdispersion,
> e.g., by introducing a dispersion parameter via the quasibinomial
> family, but that isn't equivalent to variance weights in a LM, rather to
> the error-variance parameter in a LM.
>
> I guess the question is what are you trying to achieve with the weights?
>
> Best,
>   John
>
> >
> >> On Aug 28, 2020, at 10:51 PM, John Fox  wrote:
> >>
> >> Dear John
> >>
> >> I think that you misunderstand the use of the weights argument to glm()
> for a binomial GLM. From ?glm: "For a binomial GLM prior weights are used
> to give the number of trials when the response is the proportion of
> successes." That is, in this case y should be the observed proportion of
> successes (i.e., between 0 and 1) and the weights are integers giving the
> number of trials for each binomial observation.
> >>
> >> I hope this helps,
> >> John
> >>
> >> John Fox, Professor Emeritus
> >> McMaster University
> >> Hamilton, Ontario, Canada
> >> web: https://socialsciences.mcmaster.ca/jfox/
> >>
> >>> On 2020-08-28 9:28 p.m., John Smith wrote:
> >>> If the weights < 1, then we have different values! See an example
> below.
> >>> How  should I interpret logLik value then?
> >>> set.seed(135)
> >>>   y <- c(rep(0, 50), rep(1, 50))
> >>>   x <- rnorm(100)
> >>>   data <- data.frame(cbind(x, y))
> >>>   weights <- c(rep(1, 50), rep(2, 50))
> >>>   fit <- glm(y~x, data, family=binomial(), weights/10)
> >>>   res.dev <- residuals(fit, type="deviance")
> >>>   res2 <- -0.5*res.dev^2
> >>>   cat("loglikelihood value", logLik(fit), sum(res2), "\n")
>  On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard 
> wrote:
>  If you don't worry too much about an additive constant, then half the
>  negative squared deviance residuals should do. (Not quite sure how
> weights
>  factor in. Looks like they are accounted for.)
> 
>  -pd
> 
> > On 25 Aug 2020, at 17:33 , John Smith  wrote:
> >
> > Dear R-help,
> >
> > The function logLik can be used to obtain the maximum log-likelihood
>  value
> > from a glm object. This is an aggregated value, a summation of
> individual
> > log-likelihood values. How do I obtain individual values? In the
>  following
> > example, I would expect 9 numbers since the response has length 9. I
>  could
> > write a function to compute the values, but there are lots of
> > family members in glm, and I am trying not to reinvent wheels.
> Thanks!
> >
> > counts <- c(18,17,15,20,10,20,25,13,12)
> >  outcome <- gl(3,1,9)
> >  treatment <- gl(3,3)
> >  data.frame(treatment, outcome, counts) # showing data
> >  glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
> >  (ll <- logLik(glm.D93))
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
>  http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
>  --
>  Peter Dalgaard, Professor,
>  Center for Statistics, Copenhagen Business School
>  Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>  Phone: (+45)38153501
>  Office: A 4.23
>  Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> >>> [[alternative HTML version deleted]]
> >>> __
> >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, 

Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread John Fox

Dear John,

On 2020-08-29 1:30 a.m., John Smith wrote:

Thanks Prof. Fox.

I am curious: what is the model estimated below?


Nonsense, as Peter explained in a subsequent response to your prior posting.



I guess my inquiry seems more complicated than I thought: with y being 0/1, how to 
fit weighted logistic regression with weights <1, in the sense of weighted 
least squares? Thanks


What sense would that make? WLS is meant to account for non-constant 
error variance in a linear model, but in a binomial GLM, the variance is 
purely a function for the mean.


If you had binomial (rather than binary 0/1) observations (i.e., 
binomial trials exceeding 1), then you could account for overdispersion, 
e.g., by introducing a dispersion parameter via the quasibinomial 
family, but that isn't equivalent to variance weights in a LM, rather to 
the error-variance parameter in a LM.


I guess the question is what are you trying to achieve with the weights?

Best,
 John




On Aug 28, 2020, at 10:51 PM, John Fox  wrote:

Dear John

I think that you misunderstand the use of the weights argument to glm() for a binomial 
GLM. From ?glm: "For a binomial GLM prior weights are used to give the number of 
trials when the response is the proportion of successes." That is, in this case y 
should be the observed proportion of successes (i.e., between 0 and 1) and the weights 
are integers giving the number of trials for each binomial observation.

I hope this helps,
John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/


On 2020-08-28 9:28 p.m., John Smith wrote:
If the weights < 1, then we have different values! See an example below.
How  should I interpret logLik value then?
set.seed(135)
  y <- c(rep(0, 50), rep(1, 50))
  x <- rnorm(100)
  data <- data.frame(cbind(x, y))
  weights <- c(rep(1, 50), rep(2, 50))
  fit <- glm(y~x, data, family=binomial(), weights/10)
  res.dev <- residuals(fit, type="deviance")
  res2 <- -0.5*res.dev^2
  cat("loglikelihood value", logLik(fit), sum(res2), "\n")

On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard  wrote:
If you don't worry too much about an additive constant, then half the
negative squared deviance residuals should do. (Not quite sure how weights
factor in. Looks like they are accounted for.)

-pd


On 25 Aug 2020, at 17:33 , John Smith  wrote:

Dear R-help,

The function logLik can be used to obtain the maximum log-likelihood

value

from a glm object. This is an aggregated value, a summation of individual
log-likelihood values. How do I obtain individual values? In the

following

example, I would expect 9 numbers since the response has length 9. I

could

write a function to compute the values, but there are lots of
family members in glm, and I am trying not to reinvent wheels. Thanks!

counts <- c(18,17,15,20,10,20,25,13,12)
 outcome <- gl(3,1,9)
 treatment <- gl(3,3)
 data.frame(treatment, outcome, counts) # showing data
 glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
 (ll <- logLik(glm.D93))

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com











[[alternative HTML version deleted]]
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread peter dalgaard
Briefly, you shouldn't. One way of seeing it is if you switch the model to y~1, 
you still get logLik==0.

The root cause is the rounding in binomial()$aic:

> binomial()$aic
function (y, n, mu, wt, dev) 
{
m <- if (any(n > 1)) 
n
else wt
-2 * sum(ifelse(m > 0, (wt/m), 0) * dbinom(round(m * y), 
round(m), mu, log = TRUE))
}

which, if wt is small enough ends up calculating dbinom(0, 0, p, log=TRUE) 
which is zero. 

(Not rounding gives you NaN, because you're trying to fit a model with a 
non-integer number of observations.)

-pd

> On 29 Aug 2020, at 03:28 , John Smith  wrote:
> 
> If the weights < 1, then we have different values! See an example below. How  
> should I interpret logLik value then?
> 
> set.seed(135)
>  y <- c(rep(0, 50), rep(1, 50))
>  x <- rnorm(100)
>  data <- data.frame(cbind(x, y))
>  weights <- c(rep(1, 50), rep(2, 50))
>  fit <- glm(y~x, data, family=binomial(), weights/10)
>  res.dev <- residuals(fit, type="deviance")
>  res2 <- -0.5*res.dev^2
>  cat("loglikelihood value", logLik(fit), sum(res2), "\n")
> 
> On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard  wrote:
> If you don't worry too much about an additive constant, then half the 
> negative squared deviance residuals should do. (Not quite sure how weights 
> factor in. Looks like they are accounted for.)
> 
> -pd
> 
> > On 25 Aug 2020, at 17:33 , John Smith  wrote:
> > 
> > Dear R-help,
> > 
> > The function logLik can be used to obtain the maximum log-likelihood value
> > from a glm object. This is an aggregated value, a summation of individual
> > log-likelihood values. How do I obtain individual values? In the following
> > example, I would expect 9 numbers since the response has length 9. I could
> > write a function to compute the values, but there are lots of
> > family members in glm, and I am trying not to reinvent wheels. Thanks!
> > 
> > counts <- c(18,17,15,20,10,20,25,13,12)
> > outcome <- gl(3,1,9)
> > treatment <- gl(3,3)
> > data.frame(treatment, outcome, counts) # showing data
> > glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
> > (ll <- logLik(glm.D93))
> > 
> >   [[alternative HTML version deleted]]
> > 
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> 
> 
> 
> 
> 
> 
> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-29 Thread peter dalgaard



> On 25 Aug 2020, at 18:40 , peter dalgaard  wrote:
> 
> If you don't worry too much about an additive constant, then half the 
> negative squared deviance residuals should do. (Not quite sure how weights 
> factor in. Looks like they are accounted for.)
> 
> -pd
> 
>> On 25 Aug 2020, at 17:33 , John Smith  wrote:
>> 
>> Dear R-help,
>> 
>> The function logLik can be used to obtain the maximum log-likelihood value
>> from a glm object. This is an aggregated value, a summation of individual
>> log-likelihood values. How do I obtain individual values? In the following
>> example, I would expect 9 numbers since the response has length 9. I could
>> write a function to compute the values, but there are lots of
>> family members in glm, and I am trying not to reinvent wheels. Thanks!
>> 
>> counts <- c(18,17,15,20,10,20,25,13,12)
>>outcome <- gl(3,1,9)
>>treatment <- gl(3,3)
>>data.frame(treatment, outcome, counts) # showing data
>>glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
>>(ll <- logLik(glm.D93))
>> 
>>  [[alternative HTML version deleted]]
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> 
> 
> 
> 
> 
> 
> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-28 Thread John Smith
Thanks Prof. Fox. 

I am curious: what is the model estimated below?

I guess my inquiry seems more complicated than I thought: with y being 0/1, how 
to fit weighted logistic regression with weights <1, in the sense of weighted 
least squares? Thanks

> On Aug 28, 2020, at 10:51 PM, John Fox  wrote:
> 
> Dear John
> 
> I think that you misunderstand the use of the weights argument to glm() for a 
> binomial GLM. From ?glm: "For a binomial GLM prior weights are used to give 
> the number of trials when the response is the proportion of successes." That 
> is, in this case y should be the observed proportion of successes (i.e., 
> between 0 and 1) and the weights are integers giving the number of trials for 
> each binomial observation.
> 
> I hope this helps,
> John
> 
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
> 
>> On 2020-08-28 9:28 p.m., John Smith wrote:
>> If the weights < 1, then we have different values! See an example below.
>> How  should I interpret logLik value then?
>> set.seed(135)
>>  y <- c(rep(0, 50), rep(1, 50))
>>  x <- rnorm(100)
>>  data <- data.frame(cbind(x, y))
>>  weights <- c(rep(1, 50), rep(2, 50))
>>  fit <- glm(y~x, data, family=binomial(), weights/10)
>>  res.dev <- residuals(fit, type="deviance")
>>  res2 <- -0.5*res.dev^2
>>  cat("loglikelihood value", logLik(fit), sum(res2), "\n")
>>> On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard  wrote:
>>> If you don't worry too much about an additive constant, then half the
>>> negative squared deviance residuals should do. (Not quite sure how weights
>>> factor in. Looks like they are accounted for.)
>>> 
>>> -pd
>>> 
 On 25 Aug 2020, at 17:33 , John Smith  wrote:
 
 Dear R-help,
 
 The function logLik can be used to obtain the maximum log-likelihood
>>> value
 from a glm object. This is an aggregated value, a summation of individual
 log-likelihood values. How do I obtain individual values? In the
>>> following
 example, I would expect 9 numbers since the response has length 9. I
>>> could
 write a function to compute the values, but there are lots of
 family members in glm, and I am trying not to reinvent wheels. Thanks!
 
 counts <- c(18,17,15,20,10,20,25,13,12)
 outcome <- gl(3,1,9)
 treatment <- gl(3,3)
 data.frame(treatment, outcome, counts) # showing data
 glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
 (ll <- logLik(glm.D93))
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
>>> 
>>> --
>>> Peter Dalgaard, Professor,
>>> Center for Statistics, Copenhagen Business School
>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>> Phone: (+45)38153501
>>> Office: A 4.23
>>> Email: pd@cbs.dk  Priv: pda...@gmail.com
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>[[alternative HTML version deleted]]
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-28 Thread John Fox

Dear John

I think that you misunderstand the use of the weights argument to glm() 
for a binomial GLM. From ?glm: "For a binomial GLM prior weights are 
used to give the number of trials when the response is the proportion of 
successes." That is, in this case y should be the observed proportion of 
successes (i.e., between 0 and 1) and the weights are integers giving 
the number of trials for each binomial observation.


I hope this helps,
 John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

On 2020-08-28 9:28 p.m., John Smith wrote:

If the weights < 1, then we have different values! See an example below.
How  should I interpret logLik value then?

set.seed(135)
  y <- c(rep(0, 50), rep(1, 50))
  x <- rnorm(100)
  data <- data.frame(cbind(x, y))
  weights <- c(rep(1, 50), rep(2, 50))
  fit <- glm(y~x, data, family=binomial(), weights/10)
  res.dev <- residuals(fit, type="deviance")
  res2 <- -0.5*res.dev^2
  cat("loglikelihood value", logLik(fit), sum(res2), "\n")

On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard  wrote:


If you don't worry too much about an additive constant, then half the
negative squared deviance residuals should do. (Not quite sure how weights
factor in. Looks like they are accounted for.)

-pd


On 25 Aug 2020, at 17:33 , John Smith  wrote:

Dear R-help,

The function logLik can be used to obtain the maximum log-likelihood

value

from a glm object. This is an aggregated value, a summation of individual
log-likelihood values. How do I obtain individual values? In the

following

example, I would expect 9 numbers since the response has length 9. I

could

write a function to compute the values, but there are lots of
family members in glm, and I am trying not to reinvent wheels. Thanks!

counts <- c(18,17,15,20,10,20,25,13,12)
 outcome <- gl(3,1,9)
 treatment <- gl(3,3)
 data.frame(treatment, outcome, counts) # showing data
 glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
 (ll <- logLik(glm.D93))

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com












[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-28 Thread Mark Leeds
Hii: It's been a long time but John Fox's "Companion to Appied Regression"
book has the expressions
for the likelihood of the binomial glm. ( and probably the others also ).
Just running logLik is not so useful
because it could be leaving out multiplicative factors. If you can get your
hands on any edition of John's book,
I remember it being quite helpful in terms of providing all of the gory
details and  for understanding GLM's in
general.







On Fri, Aug 28, 2020 at 9:28 PM John Smith  wrote:

> If the weights < 1, then we have different values! See an example below.
> How  should I interpret logLik value then?
>
> set.seed(135)
>  y <- c(rep(0, 50), rep(1, 50))
>  x <- rnorm(100)
>  data <- data.frame(cbind(x, y))
>  weights <- c(rep(1, 50), rep(2, 50))
>  fit <- glm(y~x, data, family=binomial(), weights/10)
>  res.dev <- residuals(fit, type="deviance")
>  res2 <- -0.5*res.dev^2
>  cat("loglikelihood value", logLik(fit), sum(res2), "\n")
>
> On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard  wrote:
>
> > If you don't worry too much about an additive constant, then half the
> > negative squared deviance residuals should do. (Not quite sure how
> weights
> > factor in. Looks like they are accounted for.)
> >
> > -pd
> >
> > > On 25 Aug 2020, at 17:33 , John Smith  wrote:
> > >
> > > Dear R-help,
> > >
> > > The function logLik can be used to obtain the maximum log-likelihood
> > value
> > > from a glm object. This is an aggregated value, a summation of
> individual
> > > log-likelihood values. How do I obtain individual values? In the
> > following
> > > example, I would expect 9 numbers since the response has length 9. I
> > could
> > > write a function to compute the values, but there are lots of
> > > family members in glm, and I am trying not to reinvent wheels. Thanks!
> > >
> > > counts <- c(18,17,15,20,10,20,25,13,12)
> > > outcome <- gl(3,1,9)
> > > treatment <- gl(3,3)
> > > data.frame(treatment, outcome, counts) # showing data
> > > glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
> > > (ll <- logLik(glm.D93))
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> > --
> > Peter Dalgaard, Professor,
> > Center for Statistics, Copenhagen Business School
> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> > Phone: (+45)38153501
> > Office: A 4.23
> > Email: pd@cbs.dk  Priv: pda...@gmail.com
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-28 Thread John Smith
If the weights < 1, then we have different values! See an example below.
How  should I interpret logLik value then?

set.seed(135)
 y <- c(rep(0, 50), rep(1, 50))
 x <- rnorm(100)
 data <- data.frame(cbind(x, y))
 weights <- c(rep(1, 50), rep(2, 50))
 fit <- glm(y~x, data, family=binomial(), weights/10)
 res.dev <- residuals(fit, type="deviance")
 res2 <- -0.5*res.dev^2
 cat("loglikelihood value", logLik(fit), sum(res2), "\n")

On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard  wrote:

> If you don't worry too much about an additive constant, then half the
> negative squared deviance residuals should do. (Not quite sure how weights
> factor in. Looks like they are accounted for.)
>
> -pd
>
> > On 25 Aug 2020, at 17:33 , John Smith  wrote:
> >
> > Dear R-help,
> >
> > The function logLik can be used to obtain the maximum log-likelihood
> value
> > from a glm object. This is an aggregated value, a summation of individual
> > log-likelihood values. How do I obtain individual values? In the
> following
> > example, I would expect 9 numbers since the response has length 9. I
> could
> > write a function to compute the values, but there are lots of
> > family members in glm, and I am trying not to reinvent wheels. Thanks!
> >
> > counts <- c(18,17,15,20,10,20,25,13,12)
> > outcome <- gl(3,1,9)
> > treatment <- gl(3,3)
> > data.frame(treatment, outcome, counts) # showing data
> > glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
> > (ll <- logLik(glm.D93))
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>
>
>
>
>
>
>
>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-27 Thread John Smith
Thanks Peter for a very promising tip.

On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard  wrote:

> If you don't worry too much about an additive constant, then half the
> negative squared deviance residuals should do. (Not quite sure how weights
> factor in. Looks like they are accounted for.)
>
> -pd
>
> > On 25 Aug 2020, at 17:33 , John Smith  wrote:
> >
> > Dear R-help,
> >
> > The function logLik can be used to obtain the maximum log-likelihood
> value
> > from a glm object. This is an aggregated value, a summation of individual
> > log-likelihood values. How do I obtain individual values? In the
> following
> > example, I would expect 9 numbers since the response has length 9. I
> could
> > write a function to compute the values, but there are lots of
> > family members in glm, and I am trying not to reinvent wheels. Thanks!
> >
> > counts <- c(18,17,15,20,10,20,25,13,12)
> > outcome <- gl(3,1,9)
> > treatment <- gl(3,3)
> > data.frame(treatment, outcome, counts) # showing data
> > glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
> > (ll <- logLik(glm.D93))
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>
>
>
>
>
>
>
>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-25 Thread peter dalgaard
If you don't worry too much about an additive constant, then half the negative 
squared deviance residuals should do. (Not quite sure how weights factor in. 
Looks like they are accounted for.)

-pd

> On 25 Aug 2020, at 17:33 , John Smith  wrote:
> 
> Dear R-help,
> 
> The function logLik can be used to obtain the maximum log-likelihood value
> from a glm object. This is an aggregated value, a summation of individual
> log-likelihood values. How do I obtain individual values? In the following
> example, I would expect 9 numbers since the response has length 9. I could
> write a function to compute the values, but there are lots of
> family members in glm, and I am trying not to reinvent wheels. Thanks!
> 
> counts <- c(18,17,15,20,10,20,25,13,12)
> outcome <- gl(3,1,9)
> treatment <- gl(3,3)
> data.frame(treatment, outcome, counts) # showing data
> glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
> (ll <- logLik(glm.D93))
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to obtain individual log-likelihood value from glm?

2020-08-25 Thread Bert Gunter
If you look at

stats:::logLik.glm  #3 ":" because it's unexported, as is true of most
methods

it should be obvious.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Aug 25, 2020 at 8:34 AM John Smith  wrote:

> Dear R-help,
>
> The function logLik can be used to obtain the maximum log-likelihood value
> from a glm object. This is an aggregated value, a summation of individual
> log-likelihood values. How do I obtain individual values? In the following
> example, I would expect 9 numbers since the response has length 9. I could
> write a function to compute the values, but there are lots of
> family members in glm, and I am trying not to reinvent wheels. Thanks!
>
> counts <- c(18,17,15,20,10,20,25,13,12)
>  outcome <- gl(3,1,9)
>  treatment <- gl(3,3)
>  data.frame(treatment, outcome, counts) # showing data
>  glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
>  (ll <- logLik(glm.D93))
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.