Re: [R] Likelihood ratio test in polr (MASS)
Dear Achim Zeileis, dear John Fox,

Thank you for your time! Both worked well.

    lrtest(Restrict, Full)
      #Df  LogLik Df  Chisq Pr(>Chisq)
    1  27 -882.00
    2  28 -866.39  1 31.212  2.313e-08 ***

    anova(Restrict, Full)
      Resid. df Resid. Dev   Test Df LR stat.      Pr(Chi)
    1      2121   1763.999
    2      2120   1732.787 1 vs 2  1 31.21204 2.313266e-08

And both seem to reject the null hypothesis. Thanks again!

Best,
Faradj

> 27 jul 2016 kl. 13:35 skrev Fox, John <j...@mcmaster.ca>:
>
> Dear Faradj Koliev,
>
> There is an anova() method for "polr" objects that computes LR chi-square
> tests for nested models, so a short answer to your question is
> anova(Full, Restricted).
>
> The question, however, seems to reflect some misunderstandings. First,
> aov() fits linear analysis-of-variance models, which assume normally
> distributed errors. These are different from the ordinal regression
> models, such as the proportional-odds model, fit by polr(). For the
> former, F-tests *are* LR tests; for the latter, F-tests aren't
> appropriate.
>
> I hope this helps,
> John
>
> --
> John Fox, Professor
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> Web: socserv.mcmaster.ca/jfox
>
>> -----Original Message-----
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Faradj Koliev
>> Sent: July 27, 2016 4:50 AM
>> To: r-help@r-project.org
>> Subject: [R] Likelihood ratio test in polr (MASS)
>>
>> Dear all,
>>
>> A quick question: Let's say I have a full and a restricted model that
>> look something like this:
>>
>>     Full <- polr(Y ~ X1+X2+X3+X4, data=data, Hess = TRUE,
>>                  method="logistic")  # ordered logistic regression
>>
>>     Restricted <- polr(Y ~ X1+X2+X3, data=data, Hess = TRUE,
>>                        method="logistic")  # ordered logistic regression
>>
>> I wanted to conduct the F-test (using the aov command) in order to
>> determine whether the information from the X4 variable statistically
>> improves our understanding of Y. However, I've been told that the
>> likelihood ratio test is a better alternative. So, I would like to
>> conduct the LR test. In the rms package this is easy -- lrtest(Full,
>> Restricted) -- I'm just curious how to perform the same using polr.
>> Thanks!

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
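To make the thread's advice concrete, here is a self-contained, hedged sketch on simulated data (all variable names below are invented for illustration). It computes the LR test both via the anova() method for "polr" objects and by hand from logLik():

```r
library(MASS)  # polr()

set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
# latent scale: x2 has a real effect, so the restricted model should fit worse
eta <- 0.8 * x1 + 0.6 * x2 + rlogis(n)
y <- cut(eta, breaks = c(-Inf, -1, 1, Inf),
         labels = c("low", "mid", "high"), ordered_result = TRUE)
d <- data.frame(y, x1, x2)

full       <- polr(y ~ x1 + x2, data = d, Hess = TRUE, method = "logistic")
restricted <- polr(y ~ x1,      data = d, Hess = TRUE, method = "logistic")

# LR test via the anova() method for "polr" objects
print(anova(restricted, full))

# the same statistic by hand: 2 * (logLik_full - logLik_restricted)
lr <- as.numeric(2 * (logLik(full) - logLik(restricted)))
p  <- pchisq(lr, df = 1, lower.tail = FALSE)
cat("LR =", lr, " p =", p, "\n")
```

lmtest::lrtest(restricted, full) would report the same statistic, as in the output quoted above.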
Re: [R] Likelihood ratio test in polr (MASS)
Dear Faradj Koliev,

There is an anova() method for "polr" objects that computes LR chi-square tests for nested models, so a short answer to your question is anova(Full, Restricted).

The question, however, seems to reflect some misunderstandings. First, aov() fits linear analysis-of-variance models, which assume normally distributed errors. These are different from the ordinal regression models, such as the proportional-odds model, fit by polr(). For the former, F-tests *are* LR tests; for the latter, F-tests aren't appropriate.

I hope this helps,
John

--
John Fox, Professor
McMaster University
Hamilton, Ontario
Canada L8S 4M4
Web: socserv.mcmaster.ca/jfox

> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Faradj Koliev
> Sent: July 27, 2016 4:50 AM
> To: r-help@r-project.org
> Subject: [R] Likelihood ratio test in polr (MASS)
>
> Dear all,
>
> A quick question: Let's say I have a full and a restricted model that
> look something like this:
>
>     Full <- polr(Y ~ X1+X2+X3+X4, data=data, Hess = TRUE,
>                  method="logistic")  # ordered logistic regression
>
>     Restricted <- polr(Y ~ X1+X2+X3, data=data, Hess = TRUE,
>                        method="logistic")  # ordered logistic regression
>
> I wanted to conduct the F-test (using the aov command) in order to
> determine whether the information from the X4 variable statistically
> improves our understanding of Y. However, I've been told that the
> likelihood ratio test is a better alternative. So, I would like to
> conduct the LR test. In the rms package this is easy -- lrtest(Full,
> Restricted) -- I'm just curious how to perform the same using polr.
> Thanks!
Re: [R] Likelihood ratio test in polr (MASS)
On Wed, 27 Jul 2016, Faradj Koliev wrote:

> Dear all,
>
> A quick question: Let's say I have a full and a restricted model that
> look something like this:
>
>     Full <- polr(Y ~ X1+X2+X3+X4, data=data, Hess = TRUE,
>                  method="logistic")  # ordered logistic regression
>
>     Restricted <- polr(Y ~ X1+X2+X3, data=data, Hess = TRUE,
>                        method="logistic")  # ordered logistic regression
>
> I wanted to conduct the F-test (using the aov command) in order to
> determine whether the information from the X4 variable statistically
> improves our understanding of Y. However, I've been told that the
> likelihood ratio test is a better alternative. So, I would like to
> conduct the LR test. In the rms package this is easy -- lrtest(Full,
> Restricted) -- I'm just curious how to perform the same using polr.
> Thanks!

One generic possibility to conduct the likelihood ratio test is the lrtest() function in package "lmtest", i.e.,

    library("lmtest")
    lrtest(Restricted, Full)
[R] Likelihood ratio test in polr (MASS)
Dear all,

A quick question: Let's say I have a full and a restricted model that look something like this:

    Full <- polr(Y ~ X1+X2+X3+X4, data=data, Hess = TRUE,
                 method="logistic")  # ordered logistic regression

    Restricted <- polr(Y ~ X1+X2+X3, data=data, Hess = TRUE,
                       method="logistic")  # ordered logistic regression

I wanted to conduct the F-test (using the aov command) in order to determine whether the information from the X4 variable statistically improves our understanding of Y. However, I've been told that the likelihood ratio test is a better alternative. So, I would like to conduct the LR test. In the rms package this is easy -- lrtest(Full, Restricted) -- I'm just curious how to perform the same using polr. Thanks!
[R] likelihood ratio test for mean difference assuming unequal variance
Hello,

I am so sorry, but I have been struggling with the code for the entire day. I have a very simple dataset that looks like this:

    response = c(45,47,24,35,47,56,29)
    sub      = c("A","A","B","B","C","C","C")
    time     = c(1,2,1,2,1,2,3)
    gdata    = cbind(response, sub, time)

Namely, for three subjects, each has 2 or 3 observations. Assuming that each subject has a different variance, I want to test whether the means for the three subjects are equal. I tried:

    library(nlme)
    weight <- varIdent(form = ~ 1 | sub)
    lme1   <- lme(response ~ sub, weights = weight, data = gdata)

However, it shows an error:

    Error in getGroups.data.frame(dataMix, groups) :
      invalid formula for groups

Can anyone help me? Eventually I want to try this:

    lme1 <- lme(response ~ sub, weights = weight, data = gdata)
    lme2 <- lme(response ~ 1,   weights = weight, data = gdata)
    lrtest(lme1, lme2)

Am I on the right track? Or can anyone think of a better method?

Thank you very much!

Best regards,
Amanda
Re: [R] likelihood ratio test for mean difference assuming unequal variance
Amanda Li <amandali at uchicago.edu> writes:

> I have a very simple dataset that looks like this:
>
>     response = c(45,47,24,35,47,56,29)
>     sub      = c("A","A","B","B","C","C","C")
>     time     = c(1,2,1,2,1,2,3)
>     gdata    = cbind(response, sub, time)
>
> Namely, for three subjects, each has 2 or 3 observations. Assuming that
> each subject has a different variance, I want to test whether the means
> for the three subjects are equal. I tried:
>
>     library(nlme)
>     weight <- varIdent(form = ~ 1 | sub)
>     lme1   <- lme(response ~ sub, weights = weight, data = gdata)
>
> However, it shows an error:
>
>     Error in getGroups.data.frame(dataMix, groups) :
>       invalid formula for groups
>
> Am I on the right track? Or can anyone think of a better method?

A variety of issues:

(1) You should use data.frame() rather than cbind() to combine your data: cbind() makes it into a matrix, which converts everything to character vectors.

(2) You should use gls() rather than lme() -- you don't have a random effect.

(3) I really hope this is a toy example and that your real data are larger, because estimating means and variances for three groups from 7 data points is *extremely* optimistic. (If you look at intervals(lme1) you'll see that the confidence intervals are very wide ...)

(4) I'm guessing lrtest() is from the lmtest package, but you didn't tell us that ...

Ben Bolker
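A hedged sketch of points (1) and (2): a data.frame instead of cbind(), and gls() with varIdent() for per-subject residual variances. The data set below is invented and deliberately a bit larger than the original 7 points, so the fit is not hopeless; anova() on two ML fits then gives the LR test for equal means:

```r
library(nlme)  # gls(), varIdent()

# invented toy data: three subjects with different means and different spreads
set.seed(2)
gdata <- data.frame(
  sub      = rep(c("A", "B", "C"), each = 10),
  response = c(rnorm(10, 45, 2), rnorm(10, 30, 8), rnorm(10, 44, 4))
)

# gls() because there is no random effect; varIdent() gives one residual
# variance per level of sub; method = "ML" so the LR test is valid
m1 <- gls(response ~ sub, weights = varIdent(form = ~ 1 | sub),
          data = gdata, method = "ML")
m0 <- gls(response ~ 1,   weights = varIdent(form = ~ 1 | sub),
          data = gdata, method = "ML")

# likelihood ratio test for equal group means
print(anova(m0, m1))
```

With REML fits (the nlme default) the two mean structures would not be comparable, which is why method = "ML" is set explicitly.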
[R] Likelihood
Hi all,

I have run a regression and want to calculate the likelihood of obtaining the sample. Is there a way in which I can use R to get this likelihood value? Appreciate your help on this. The following are the details:

    raw_ols1 <- lm(data$LOSS ~ data$GDP + data$HPI + data$UE)
    summary(raw_ols1)

    Call:
    lm(formula = data$LOSS ~ data$GDP + data$HPI + data$UE)

    Residuals:
           Min         1Q     Median         3Q        Max
    -0.0023859 -0.0006236  0.0002444  0.0006739  0.0017713

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept) -3.940e-02  6.199e-03  -6.356 9.54e-06 ***
    data$GDP     3.467e-09  7.652e-09   0.453 0.656580
    data$HPI     7.935e-05  1.875e-05   4.232 0.000635 ***
    data$UE      6.858e-04  2.800e-04   2.449 0.026227 *
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.001198 on 16 degrees of freedom
    Multiple R-squared: 0.9528,  Adjusted R-squared: 0.944
    F-statistic: 107.8 on 3 and 16 DF,  p-value: 7.989e-11

Thanks and regards,
Preetam

--
Preetam Pal
(+91)-9432212774
M-Stat 2nd Year, Room No. N-114
Statistics Division, C.V. Raman Hall
Indian Statistical Institute, B.H.O.S.
Kolkata.
Re: [R] Likelihood
> I have run a regression and want to calculate the likelihood of
> obtaining the sample. Is there a way in which I can use R to get this
> likelihood value?

See ?logLik

And see also ?help.search and ??. You would have found the above by typing ??likelihood at the command line in R.

S Ellison
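For example, with the built-in cars data (a sketch; the original poster's data are not available):

```r
# logLik() extracts the maximised log-likelihood from a fitted model object
fit <- lm(dist ~ speed, data = cars)
ll  <- logLik(fit)

print(ll)                    # log-likelihood, with a "df" attribute
                             # (2 coefficients + the residual sigma = 3)
print(exp(as.numeric(ll)))   # the likelihood itself -- typically a
                             # vanishingly small number, hence the log scale
```

The same logLik() call works on glm, nls, and many other fitted-model classes.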
[R] Likelihood ratio test for comparing non-nested cox models
Dear All,

I fitted two non-nested proportional hazards models using the coxph() function from package survival. Now I would like to apply a model selection test, e.g. the likelihood ratio test proposed by Vuong. I found an implementation of Vuong's test in the package 'pscl', but that does not work for Cox models. Is there any package available that provides an implementation of that test, or can someone point me to an implementation of a similar test?

Thanks in advance!
Hans-Ulrich

--
Dr. Hans-Ulrich Klein
Institute of Medical Informatics
University of Münster
Albert-Schweitzer-Campus 1, Gebäude A11
48149 Münster, Germany
Tel.: +49 (0)251 83-58405
[R] Likelihood ratio
Hi All,

I have to run multiple estimations and to compute the likelihood ratio. If I compute the ls function with coef and summary I can extract the outputs that I need. I am not able to find something similar for the log-likelihood.

Can you please tell me, running an ls function of x on y, how to extract, if possible, the LR statistic, or the likelihood, or the log-likelihood?

Many thanks in advance. If you send me some manual or webpage guide I will be very happy. I will try to improve my posts and questions.
Re: [R] Likelihood ratio
On Nov 10, 2012, at 11:57 AM, bgnumis bgnum wrote:

> I have to run multiple estimations and to compute the likelihood ratio.
> If I compute the ls function with coef and summary I can extract the
> outputs that I need.

Do you mean lm?

> I am not able to find something similar for the log-likelihood.

You can use glm with family=gaussian. Deviance is -2*LL.

> Can you please tell me, running an ls function of x on y, how to
> extract, if possible, the LR statistic, or the likelihood, or the
> log-likelihood?

Or you can look at the code and see how it does it.

> Many thanks in advance. If you send me some manual or webpage guide I
> will be very happy.

Please read the Posting Guide. There are many helpful ideas about how to get further information and avoid the appearance of cluelessness.

--
David Winsemius, MD
Alameda, CA, USA
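A quick check of the glm route, sketched on the built-in cars data: a gaussian glm and the corresponding lm report the same maximised log-likelihood. One caveat worth flagging: for the gaussian family, deviance() returns the residual sum of squares, so it equals -2*logLik only up to additive constants, not exactly.

```r
f_lm  <- lm(dist ~ speed, data = cars)
f_glm <- glm(dist ~ speed, family = gaussian, data = cars)

print(logLik(f_lm))    # same value ...
print(logLik(f_glm))   # ... from both fits

# for family = gaussian, deviance(f_glm) is the RSS, not -2 * logLik
print(deviance(f_glm))
print(-2 * as.numeric(logLik(f_glm)))
```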
Re: [R] likelihood function involving integration, error in nlm
On 19-10-2012, at 04:40, stats12 wrote:

> Dear R users,
>
> I am trying to find the mle that involves integration. I am using the
> following code and get an error when I use the nlm function:
>
>     d <- matrix(c(1,1,0,0,0,0,0,0,2,1,0,0,1,1,0,1,2,2,1,0), nrow=10, ncol=2)
>     h <- matrix(runif(20,0,1), 10)
>     integ <- matrix(0, nrow=10, ncol=2)
>     ll <- function(p){
>       for (k in 1:2){
>         for (s in 1:10){
>           integrand <- function(x)
>             x^d[s,k] * exp(-x*gamma(1+1/p))^p * p * x^(p-1) * exp(-x*h[s,k])
>           integ[s,k] <- integrate(integrand, 0, Inf)$value
>         }
>       }
>       lik <- colSums(integ)
>       -lik
>     }
>     initial <- c(1)
>     t <- nlm(ll, initial)
>
>     Error in nlm(ll, initial) : invalid function value in 'nlm' optimizer

Before the call of nlm you should insert ll(initial) to check. You'll see that your function returns a vector and not a scalar as it should. I guess that ll() should return sum(-lik), or better -sum(integ).

Berend
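A hedged, runnable version of that fix. The statistical model is the poster's, reproduced as-is; whether the objective should be -sum(integ) or -sum(log(integ)) depends on the likelihood the poster intends (the log form is used below, on the assumption that each integral is a per-observation likelihood contribution):

```r
d <- matrix(c(1,1,0,0,0,0,0,0,2,1,0,0,1,1,0,1,2,2,1,0), nrow = 10, ncol = 2)
set.seed(42)
h <- matrix(runif(20, 0, 1), 10)

ll <- function(p) {
  integ <- matrix(0, nrow = 10, ncol = 2)   # local matrix, not a global
  for (k in 1:2) {
    for (s in 1:10) {
      integrand <- function(x)
        x^d[s, k] * exp(-x * gamma(1 + 1/p))^p * p * x^(p - 1) * exp(-x * h[s, k])
      integ[s, k] <- integrate(integrand, 0, Inf)$value
    }
  }
  # a single scalar, as nlm() requires; log taken assuming the integrals
  # are per-observation likelihood contributions
  -sum(log(integ))
}

print(ll(1))                           # now one finite number, not a vector
fit <- try(nlm(ll, 1), silent = TRUE)  # no longer "invalid function value"
```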
Re: [R] likelihood function involving integration, error in nlm
Thanks for pointing that out. Made some modifications and it worked.
[R] likelihood function involving integration, error in nlm
Dear R users,

I am trying to find the mle that involves integration. I am using the following code and get an error when I use the nlm function:

    d <- matrix(c(1,1,0,0,0,0,0,0,2,1,0,0,1,1,0,1,2,2,1,0), nrow=10, ncol=2)
    h <- matrix(runif(20,0,1), 10)
    integ <- matrix(0, nrow=10, ncol=2)
    ll <- function(p){
      for (k in 1:2){
        for (s in 1:10){
          integrand <- function(x)
            x^d[s,k] * exp(-x*gamma(1+1/p))^p * p * x^(p-1) * exp(-x*h[s,k])
          integ[s,k] <- integrate(integrand, 0, Inf)$value
        }
      }
      lik <- colSums(integ)
      -lik
    }
    initial <- c(1)
    t <- nlm(ll, initial)

    Error in nlm(ll, initial) : invalid function value in 'nlm' optimizer

Any suggestions will be greatly appreciated. Thank you in advance.
[R] Likelihood: Multiple Maxima
Hi everyone,

I estimate structural equation models and use maximum likelihood estimation. However, the estimates differ depending on the starting values I choose, so I guess there are multiple local maxima. I'm new to R (and statistics... ;), does anybody maybe know how I solve this best?

Thanks a lot! ;)
Felix
[R] Likelihood ratio test
Hello there,

I want to perform a likelihood ratio test to check whether a single exponential or a sum of 2 exponentials provides the best fit to my data. I am new to R programming and I am not sure if there is a direct function for doing this and what's the best way to go about it.

    # data
    x <- c(1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100)
    y <- c(0.033823, 0.014779, 0.004698, 0.001584, -0.002017, -0.003436,
           -0.06, -0.004626, -0.004626, -0.004626, -0.004626)
    data <- data.frame(x, y)

Specifically, I would like to test whether model 1 or model 2 provides the best fit to the data:

    model 1: y = a*exp(-m*x) + c
    model 2: y = a*exp(-(m1+m2)*x) + c

    Likelihood ratio test = L(data | model1) / L(data | model2)

Any help would be most appreciated. Thanks in advance.

Diviya
Re: [R] Likelihood ratio test
Hi Diviya,

Take a look at the lrtest function in the lmtest package:

    install.packages('lmtest')
    require(lmtest)
    ?lrtest

HTH,
Jorge

On Sun, Jun 12, 2011 at 1:16 PM, Diviya Smith wrote:

> I want to perform a likelihood ratio test to check whether a single
> exponential or a sum of 2 exponentials provides the best fit to my data.
>
> Specifically, I would like to test whether model 1 or model 2 provides
> the best fit to the data:
>
>     model 1: y = a*exp(-m*x) + c
>     model 2: y = a*exp(-(m1+m2)*x) + c
>
>     Likelihood ratio test = L(data | model1) / L(data | model2)
Re: [R] Likelihood ratio test
On Sun, 12 Jun 2011, Jorge Ivan Velez wrote:

> Take a look at the lrtest function in the lmtest package:
>
>     install.packages('lmtest')
>     require(lmtest)
>     ?lrtest

Yes, when you have two nls() fits, say m1 and m2, you can do lrtest(m1, m2).

However, I don't think that both m1 and m2 can be identified in

    y = a * exp(-(m1+m2) * x) + c

Unless I'm missing something, only the sum (m1+m2) is identified anyway.

Best,
Z
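For completeness, a hedged sketch of the comparable nested pair on simulated data (the original data are too short and noisy to guarantee convergence): the richer model is written with two separate amplitudes and rates, so both rate constants are identified, and the two nls() fits are compared via logLik() with a nominal chi-square. lmtest::lrtest(m1, m2) would report the same statistic:

```r
# simulated two-exponential decay with a small amount of noise
set.seed(4)
x <- seq(0, 10, length.out = 100)
y <- 2 * exp(-0.5 * x) + 1 * exp(-2 * x) + 0.1 + rnorm(100, sd = 0.02)
dat <- data.frame(x, y)

# one-exponential (restricted) and two-exponential (full) fits;
# starting values are guesses near the truth
m1 <- nls(y ~ a * exp(-m * x) + c, data = dat,
          start = list(a = 2.5, m = 0.7, c = 0.1))
m2 <- nls(y ~ a1 * exp(-k1 * x) + a2 * exp(-k2 * x) + c, data = dat,
          start = list(a1 = 2, k1 = 0.5, a2 = 1, k2 = 2, c = 0.1))

# LR statistic with a nominal df = 2 (two extra parameters);
# boundary/identifiability caveats apply when the second term vanishes
lr <- as.numeric(2 * (logLik(m2) - logLik(m1)))
p  <- pchisq(lr, df = 2, lower.tail = FALSE)
cat("LR =", lr, " p =", p, "\n")
```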
[R] likelihood ratio test
Dear R-help,

Can anybody tell me which R package has the Lo-Mendell-Rubin LR test and the bootstrap LR test to compare the model fit between a k-class and a (k+1)-class model for latent class analysis?

Thanks in advance,

Warm regards,
Ms. Karunambigai M
PhD Scholar
Dept. of Biostatistics
NIMHANS, Bangalore, India
Re: [R] likelihood ratio test
karuna m <m_karuna2002 at yahoo.com> writes:

> Can anybody tell me which R package has the Lo-Mendell-Rubin LR test and
> the bootstrap LR test to compare the model fit between a k-class and a
> (k+1)-class model for latent class analysis?

I don't know, but

    library(sos)
    findFn("Lo-Mendell")
    findFn("{latent class analysis}")

might help you track down the answers.
[R] likelihood for new data from fitted nlme object
Dear list,

I have a fitted nlme object from which I want to produce estimates of the (marginal) likelihood for new data, given the fitted model. I am trying to cross-validate a number of nonlinear mixed effects models, and I would like to find a way to calculate the marginal likelihood for virgin data quickly. Since nlme must approximate the marginal likelihood many times when fitting a model, I feel like there should be an easy way to do this for new data.

Is there an easy way to get this using the nlme model object? Is there a straightforward way to manually generate estimates using components of the fitted model objects (e.g. using equation 4.2 from Lindstrom and Bates (1990), "Nonlinear Mixed Effects Models for Repeated Measures Data")?

Many thanks (and apologies if I have failed to conform to any of the list guidelines),

Andrew Bateman
PhD Candidate
Department of Zoology
University of Cambridge
[R] Likelihood of deviation
Hi,

I have a dataset of 78,903 news articles pertaining to 340 corporate takeovers.

    Mean      231.3871  [articles per takeover]
    Std. Dev. 673.6395

I would like to calculate the probability of a certain number of news articles if I had more takeovers available. How likely is it to have X articles if I had Y takeovers? How can I do that?

Thank you very much,
Michael
Re: [R] Likelihood of deviation
On Mar 25, 2011, at 12:17 PM, Michael Hecker wrote:

> I have a dataset of 78,903 news articles pertaining to 340 corporate
> takeovers. I would like to calculate the probability of a certain number
> of news articles if I had more takeovers available. How likely is it to
> have X articles if I had Y takeovers?

You should submit homework (or self-learning general-stats questions) to other lists. In the old days this meant one of the sci.stat.* groups, but they are now over-run by spammers. Now the stats.stackexchange.com website is probably the best resource.

--
David Winsemius, MD
West Hartford, CT
[R] Likelihood ratio based confidence intervals for logistic regression
I'm applying logistic regression to a moderately sized data set for which I believe Wald-based confidence intervals on the B coefficients are too conservative. Some of the literature recommends using confidence intervals based on the likelihood ratio in such cases, but I'm having difficulty locating a package that can do these. Any help would be immensely appreciated.

Best,
Jeff Hanna
Re: [R] Likelihood ratio based confidence intervals for logistic regression
jh556 wrote:

> I'm applying logistic regression to a moderately sized data set for
> which I believe Wald-based confidence intervals on the B coefficients
> are too conservative. Some of the literature recommends using confidence
> intervals based on the likelihood ratio in such cases, but I'm having
> difficulty locating a package that can do these.

Are you looking for profile-likelihood based CIs? Package MASS has those, via the confint function (if my memory is correct...):

    fm1 <- glm(..., family = binomial)
    confint(fm1)
Re: [R] Likelihood ratio based confidence intervals for logistic regression
Some quick googling suggests that they are the same thing. Thanks for the help!
Re: [R] Likelihood ratio based confidence intervals for logistic regression
On 2010-04-30 12:42, jh556 wrote:

> Some quick googling suggests that they are the same thing. Thanks for
> the help!

And note that profile likelihood CIs are produced by default on glm objects, i.e. R uses MASS's confint.glm for glm objects. confint.default(your model) lets you compare with Wald CIs.

--
Peter Ehlers
University of Calgary
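A minimal sketch with a built-in data set (mtcars is used purely for illustration; the original poster's data are not available):

```r
# profile-likelihood vs Wald intervals for a binomial glm
library(MASS)  # supplied confint.glm historically; in recent R it is in stats

fit <- glm(am ~ wt, family = binomial, data = mtcars)

ci_profile <- confint(fit)          # profile-likelihood intervals
ci_wald    <- confint.default(fit)  # Wald intervals
print(ci_profile)
print(ci_wald)
# with only 32 observations the two sets differ noticeably, and the profile
# intervals need not be symmetric about the point estimates
```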
[R] likelihood profiles for multiple parameters
Hi,

I'm using mle2 for optimization of a multinomial likelihood. The model fit is not very good and I would like to look at the likelihood profiles. However, I'm optimizing many parameters (~40), so using the function profile() takes forever and is not practical. How can I calculate approximate likelihood profiles? Is there a function for this in R?

Thanks a lot for the help,
Beni
[R] likelihood ratio test between glmer and glm
I am currently running a generalized linear mixed effect model using glmer and I want to estimate how much of the variance is explained by my random factor. summary(glmer(cbind(female,male)~date+(1|dam),family=binomial,data= liz3)) Generalized linear mixed model fit by the Laplace approximation Formula: cbind(female, male) ~ date + (1 | dam) Data: liz3 AIC BIC logLik deviance 241.3 258.2 -117.7235.3 Random effects: Groups NameVariance Std.Dev. dam(Intercept) 00 Number of obs: 2068, groups: dam, 47 Fixed effects: Estimate Std. Error z value Pr(|z|) (Intercept) 1.480576 1.778648 0.83240.405 date -0.005481 0.007323 -0.74840.454 Correlation of Fixed Effects: (Intr) date2 -0.997 summary(glm(cbind(female,male)~date,family=binomial,data= liz3)) Call: glm(formula = cbind(female, male) ~ date, family = binomial, data = liz3) Deviance Residuals: Min 1Q Median 3Q Max -1.678 0.000 0.000 0.000 1.668 Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept) 1.484429 1.778674 0.8350.404 date -0.005497 0.007324 -0.7510.453 (Dispersion parameter for binomial family taken to be 1) Null deviance: 235.90 on 174 degrees of freedom Residual deviance: 235.34 on 173 degrees of freedom AIC: 255.97 Number of Fisher Scoring iterations: 3 Based on a discussion found on the R mailing list but dating back to 2008, I have compared the log-likelihoods of the glm model and of the glmer model as follows: lrt - function (obj1, obj2){ L0 - logLik(obj1) L1 - logLik(obj2) L01 - as.vector(- 2 * (L0 - L1)) df - attr(L1, df) - attr(L0, df) list(L01 = L01, df = df, p-value = pchisq(L01, df, lower.tail = FALSE)) } gm0 - glm(cbind(female,male)~date,family = binomial, data = liz3) gm1 - glmer(cbind(female,male)~date+(1|dam),family=binomial,data= liz3) lrt(gm0, gm1) and I have compared the deviances as follows: (d0 - deviance(gm0)) (d1 - deviance(gm1)) (LR - d0 - d1) pchisq(LR, 1, lower.tail = FALSE) As in some of the examples posted, although the variance of my random effect is zero, the p-value obtained 
by comparing the logLik is highly significant:

$L01
[1] 16.63553

$df
[1] 1

$`p-value`
[1] 4.529459e-05

while comparing the deviances gives me the following output, which I don't quite understand but which seems (to the best of my understanding) to indicate that the random effect explains none of the variance:

(LR <- d0 - d1)
           ML
-4.739449e-06
pchisq(LR, 1, lower = FALSE)
ML
 1

I would be very thankful if someone could clarify my mind about how to estimate the variance explained by my random factor and also when to use lower = FALSE versus TRUE. Many thanks in advance

Davnah Urbach
Post-doctoral researcher
Dartmouth College
Department of Biological Sciences
401 Gilman Hall
Hanover, NH 03755 (USA)
lab/office: (603) 646-9916
davnah.urb...@dartmouth.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] likelihood ratio test between glmer and glm
> Based on a discussion found on the R mailing list but dating back to 2008, I have compared the log-likelihoods of the glm model and of the glmer model as follows:
>
> lrt <- function(obj1, obj2) {
>   L0  <- logLik(obj1)
>   L1  <- logLik(obj2)
>   L01 <- as.vector(-2 * (L0 - L1))
>   df  <- attr(L1, "df") - attr(L0, "df")
>   list(L01 = L01, df = df, "p-value" = pchisq(L01, df, lower.tail = FALSE))
> }
>
> gm0 <- glm(cbind(female, male) ~ date, family = binomial, data = liz3)
> gm1 <- glmer(cbind(female, male) ~ date + (1 | dam), family = binomial, data = liz3)
> lrt(gm0, gm1)

It is _very_ dangerous to do this as the likelihoods are unlikely to be comparable/compatible.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] likelihood ratio test between glmer and glm
Thanks for this answer, but does that mean that working with the deviances is better? Or how else could I evaluate the importance of my random terms? Many thanks, Davnah

On Mar 14, 2010, at 8:12 PM, hadley wickham wrote:

> Based on a discussion found on the R mailing list but dating back to 2008, I have compared the log-likelihoods of the glm model and of the glmer model as follows: [...]
>
> It is _very_ dangerous to do this as the likelihoods are unlikely to be comparable/compatible.
>
> Hadley
> --
> Assistant Professor / Dobelman Family Junior Chair
> Department of Statistics / Rice University
> http://had.co.nz/

Davnah Urbach
Post-doctoral researcher
Dartmouth College
Department of Biological Sciences
401 Gilman Hall
Hanover, NH 03755 (USA)
lab/office: (603) 646-9916
davnah.urb...@dartmouth.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] likelihood ratio test between glmer and glm
Davnah Urbach Davnah.Urbach at dartmouth.edu writes: Thanks for this answer but does that mean that working with the deviances is better? Or how else could I evaluate the importance of my random terms? You should probably (a) search the archives of the r-sig-mixed-models mailing list and (b) ask this question again there. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
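[A sketch, following up on this thread. One point often made on r-sig-mixed-models is that testing a variance component against zero puts the parameter on the boundary of its space, so the naive chi-square(1) p-value is conservative; a common rough correction is to halve it (a 50:50 mixture of chi-square(0) and chi-square(1)). The helper below is hypothetical and assumes only that both fits have logLik methods and were fit by ML to the same data.]

```r
## Hypothetical helper: LRT for a single variance component, with the
## 50:50 chi-square(0)/chi-square(1) boundary correction (halved p-value).
lrt_boundary <- function(fit0, fit1) {
  LR <- as.numeric(2 * (logLik(fit1) - logLik(fit0)))
  LR <- max(LR, 0)  # tiny negative values (like -4.7e-06 above) are numerical noise
  p.naive <- pchisq(LR, df = 1, lower.tail = FALSE)
  c(LR = LR, p.naive = p.naive, p.corrected = p.naive / 2)
}

## Illustration with two nested lm() fits standing in for the glm/glmer pair:
f0 <- lm(dist ~ 1,     data = cars)
f1 <- lm(dist ~ speed, data = cars)
lrt_boundary(f0, f1)
```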
[R] Likelihood Optimization With Categorical Variables
Dear all, I have the following problem: I have been using the routine optim in order to maximize a joint likelihood (basically a mixture with modelled weights) with quantitative variables; so far so good. Now I need to plug a categorical variable (namely, age classes) into the model. Obviously, given the way optim works, it won't allow me to treat it directly (i.e. just plugging it in with a single parameter). I would like to avoid having to construct some heavy ifelse structure inside the function to be passed to optim; that would be quite inefficient. Do you have any suggestions? Is there any other routine which could deal with categorical (along with non-categorical) variables for the purposes of optimization? Thanks in advance, Federico Andreis

Università degli Studi di Milano Bicocca, PhD Student
MEB, Karolinska Institutet, Visiting PhD Student

--
View this message in context: http://n4.nabble.com/Likelihood-Optimization-With-Categorical-Variables-tp1590874p1590874.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
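[One standard way to avoid the ifelse structure, sketched here with made-up data and a made-up Poisson objective since the poster's actual likelihood is not shown: expand the factor into dummy columns once with model.matrix(), so the function passed to optim only ever sees a numeric design matrix and a coefficient vector.]

```r
## Invented example data; 'ageclass' is the categorical variable
d <- data.frame(y = c(2, 5, 3, 8, 1, 4),
                x = c(0.1, 0.9, 0.4, 1.2, 0.0, 0.7),
                ageclass = factor(c("young", "old", "young", "old", "mid", "mid")))

## Expand the factor ONCE, outside the objective: intercept + x + 2 dummies
X <- model.matrix(~ x + ageclass, data = d)

## Illustrative Poisson log-linear negative log-likelihood in the expanded
## parameters; no branching on the factor is needed inside the function
negll <- function(beta, X, y) {
  eta <- drop(X %*% beta)
  -sum(dpois(y, lambda = exp(eta), log = TRUE))
}

fit <- optim(rep(0, ncol(X)), negll, X = X, y = d$y, method = "BFGS")
```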
[R] likelihood
Hi all, Does anyone know how to write the likelihood function for the Poisson distribution in R when the data contain zeros (P(x = 0))? For the normal case it can be written as follows:

n * mean(dat) * log(lambda) - lambda * n

Any help is highly appreciated. Ashta

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
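[For what it's worth, dpois(..., log = TRUE) copes with zeros automatically, since 0! = 1, so no special case is needed; a sketch with made-up data, showing the full log-likelihood alongside the constant-free kernel quoted above.]

```r
## Full Poisson log-likelihood; safe for dat containing zeros
poisLL <- function(lambda, dat) sum(dpois(dat, lambda, log = TRUE))

## Dropping the constant -sum(log(dat!)) gives the kernel:
poisLL_kernel <- function(lambda, dat) {
  n <- length(dat)
  n * mean(dat) * log(lambda) - lambda * n
}

dat <- c(0, 0, 1, 3, 2, 0, 4)   # invented data with zeros

## The two differ only by a constant in lambda, so both are maximized
## at the sample mean:
optimize(poisLL, c(0.01, 10), dat = dat, maximum = TRUE)$maximum  # ~ mean(dat)
```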
Re: [R] Likelihood Ratio Tests
Jim Silverton wrote:
> Is there any package available in R to do the following hypothesis tests? Testing the means of two Poissons (equivalent to the difference of two proportions). Testing the equality of two proportions from binomials. Testing the equality of proportions of two negative binomials (both conditional and unconditional tests). No large-sample approximation tests... I need exact tests. Thanks, Jim

Well, poisson.test and binom.test exist. Wouldn't know what you're expecting for the negative binomial case (nor what you mean by the parenthesis re. Poisson means).

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark       Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)               FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
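[Minimal usage sketches of the base-R exact tests mentioned above; the counts and exposure times are invented.]

```r
## Two Poisson counts with known exposure times: exact conditional test
## (internally a binomial test on the split of the total count)
poisson.test(c(10, 30), T = c(100, 100))   # H0: equal rates

## One binomial proportion against a null value, exactly
binom.test(8, 20, p = 0.5)

## Equality of two binomial proportions, exactly, via Fisher's test
fisher.test(matrix(c(8, 12, 15, 5), nrow = 2))
```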
Re: [R] Likelihood Ratio Tests
Is there any package available in R to do the following hypothesis tests? Testing the means of two Poissons (equivalent to the difference of two proportions). Testing the equality of two proportions from binomials. Testing the equality of proportions of two negative binomials (both conditional and unconditional tests). No large-sample approximation tests... I need exact tests. Thanks, Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Likelihood Function for Multinomial Logistic Regression and its partial derivatives
Hi, I would like to apply the L-BFGS optimization algorithm to compute the MLE of a multilevel multinomial logistic regression. The likelihood formula for this model has as one of its summands the formula for the likelihood of an ordinary (single-level) multinomial logit regression, so I would basically need the R implementation of this formula. The L-BFGS algorithm also requires computing the partial derivatives of that formula with respect to all parameters. I would appreciate it if you can point me to existing implementations that can do the above. Nick

PS. The long story for the above: My data are as follows:
- a vector of observed values (length = D) of the dependent multinomial variable, each element belonging to one of N levels of that variable
- a matrix of corresponding observed values (O x P) of the independent variables (P in total; most of them are binary but a few are integer-valued)
- a vector of current estimates (or starting values) for the Beta coefficients of the independent variables (length = P).

This data is available for 4 different pools. The partially-pooled model that I want to compute has as a likelihood function a sum of several elements, one being the classical likelihood function of a multinomial logit regression for each of the 4 pools. This is the same model as in Finkel and Manning, Hierarchical Bayesian Domain Adaptation (2009).

--
View this message in context: http://www.nabble.com/Likelihood-Function-for-Multinomial-Logistic-Regression-and-its-partial-derivatives-tp24772731p24772731.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Likelihood Function for Multinomial Logistic Regression and its partial derivatives
Hi,

Providing the gradient function is generally a good idea in optimization; however, it is not necessary. Almost all optimization routines will compute it using a simple finite-difference approximation if it is not user-specified. If your function is very complicated, then you are more likely to make a mistake in computing the analytic gradient, although many optimization routines also provide a check to see whether the gradient is correctly specified or not. But you can do this yourself using the `grad' function in the numDeriv package.

Hope this is helpful,
Ravi.

Ravi Varadhan, Ph.D.
Assistant Professor, Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University
Ph. (410) 502-2619
email: rvarad...@jhmi.edu

- Original Message -
From: nikolay12 nikola...@gmail.com
Date: Sunday, August 2, 2009 3:04 am
Subject: [R] Likelihood Function for Multinomial Logistic Regression and its partial derivatives
To: r-help@r-project.org

> Hi, I would like to apply the L-BFGS optimization algorithm to compute the MLE of a multilevel multinomial logistic regression. [...]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
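[The finite-difference point above can be sketched in a few lines of base R (numDeriv::grad automates and refines this); here a hand-written analytic gradient of a simple binomial negative log-likelihood is checked against central differences. All names and data below are illustrative.]

```r
## Binomial (logit) negative log-likelihood and its analytic gradient t(X)(p - y)
negll <- function(b, x, y) {
  p <- plogis(x %*% b)
  -sum(y * log(p) + (1 - y) * log(1 - p))
}
negll_grad <- function(b, x, y) {
  drop(t(x) %*% (plogis(x %*% b) - y))
}

## Central-difference approximation of the gradient of f at b
num_grad <- function(f, b, eps = 1e-6, ...) {
  sapply(seq_along(b), function(i) {
    e <- replace(numeric(length(b)), i, eps)
    (f(b + e, ...) - f(b - e, ...)) / (2 * eps)
  })
}

set.seed(1)
x <- cbind(1, rnorm(50)); y <- rbinom(50, 1, 0.5); b <- c(0.2, -0.1)
max(abs(negll_grad(b, x, y) - num_grad(negll, b, x = x, y = y)))  # effectively zero
```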
Re: [R] Likelihood Function for Multinomial Logistic Regression and its partial derivatives
Thanks a lot. The info about computing the gradient will be helpful.

I admit that I am somewhat confused about the likelihood function itself. It is often said that you need to set a reference category. However, I found two different implementations in Matlab for which setting the reference category is not really obvious. Thus I wanted to find a single, indisputable implementation of the likelihood function for multinomial logistic regression in R. Can't believe there is no such thing available.

Nick

Ravi Varadhan wrote:
> Providing the gradient function is generally a good idea in optimization; however, it is not necessary. Almost all optimization routines will compute it using a simple finite-difference approximation if it is not user-specified. [...] But you can do this yourself using the `grad' function in the numDeriv package. [...]

--
View this message in context: http://www.nabble.com/Likelihood-Function-for-Multinomial-Logistic-Regression-and-its-partial-derivatives-tp24772731p24781298.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Likelihood Function for Multinomial Logistic Regression and its partial derivatives
Hi: John Fox's CAR book has some very nice examples of how the multinomial likelihood is estimated, computationally speaking. But you mentioned multilevel earlier, which sounds more complex?

On Aug 2, 2009, nikolay12 nikola...@gmail.com wrote:

> Thanks a lot. The info about computing the gradient will be helpful. I admit that I am somewhat confused about the likelihood function itself. It is often said that you need to set a reference category. However, I found two different implementations in Matlab for which setting the reference category is not really obvious. Thus I wanted to find a single, indisputable implementation of the likelihood function for multinomial logistic regression in R. Can't believe there is no such thing available. Nick [...]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Likelihood Function for Multinomial Logistic Regression and its partial derivatives
You may refer to mlogit for the ordinary multinomial regression. As far as I know, there are no functions for multilevel multinomial regression.

Ronggui

2009/8/2 nikolay12 nikola...@gmail.com:
> Hi, I would like to apply the L-BFGS optimization algorithm to compute the MLE of a multilevel multinomial logistic regression. [...]

--
HUANG Ronggui, Wincent
PhD Candidate
Dept of Public and Social Administration
City University of Hong Kong
Home page: http://asrr.r-forge.r-project.org/rghuang.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Likelihood Function for Multinomial Logistic Regression and its partial derivatives
Thanks for the book suggestion. I'll check it out tomorrow when the library opens.

Yes, it is a multilevel model, but its likelihood function is the sum of the likelihood functions for the individual levels (i.e. simple multinomial logits) and some other terms (the priors). It is, essentially, the hierarchical-priors model of Finkel and Manning (HLT, 2009). So to compute the above composite likelihood function I need just the simple multinomial logit likelihood for each of the separate pools (there are just two levels: the lower consisting of 4 pools, the upper of just one, i.e. it's a hierarchical relationship). So I am really surprised that I could not find an R package containing a function that computes the simple multinomial logit likelihood for a given set of parameter values (rather than simply supplying the final fit).

statquant wrote:
> Hi: John Fox's CAR book has some very nice examples of how the multinomial likelihood is estimated, computationally speaking. But you mentioned multilevel earlier, which sounds more complex?

--
View this message in context: http://www.nabble.com/Likelihood-Function-for-Multinomial-Logistic-Regression-and-its-partial-derivatives-tp24772731p24783615.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Likelihood Function for Multinomial Logistic Regression and its partial derivatives
Thanks, I had a look at mlogit. It seems it does fit a multinomial logit regression, but, just as nnet or VGAM do, it gives you the likelihood at the fitted values, not the value you get with an arbitrary set of parameters (which might not be the optimal ones). Or am I wrong on this?

Ronggui Huang wrote:
> You may refer to mlogit for the ordinary multinomial regression. As far as I know, there are no functions for multilevel multinomial regression. Ronggui

--
View this message in context: http://www.nabble.com/Likelihood-Function-for-Multinomial-Logistic-Regression-and-its-partial-derivatives-tp24772731p24784053.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
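[For what is being asked here, i.e. evaluating rather than fitting, the single-level multinomial-logit log-likelihood is short enough to write directly. A sketch with an illustrative parameterization: category 1 is the reference, and B holds one coefficient column per non-reference category.]

```r
## Log-likelihood of a multinomial logit at an ARBITRARY coefficient matrix B.
## B: p x (K-1); X: n x p; y: integer response in 1..K (1 = reference).
mlogitLL <- function(B, X, y) {
  eta  <- cbind(0, X %*% B)              # linear predictors, reference fixed at 0
  logp <- eta - log(rowSums(exp(eta)))   # row-wise log-softmax
  ## (for large |eta|, subtract the row maximum from eta first for stability)
  sum(logp[cbind(seq_along(y), y)])      # pick log p of each observed category
}

set.seed(42)
X <- cbind(1, rnorm(30))                 # invented design: intercept + 1 covariate
y <- sample(1:3, 30, replace = TRUE)     # 3 categories
B <- matrix(0, nrow = 2, ncol = 2)       # all-zero coefficients
mlogitLL(B, X, y)                        # = 30 * log(1/3)
```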
[R] likelihood
I have a problem related to measuring likelihood between
- an observed presence/absence dataset (containing 0 or 1)
- a predicted simulation matrix of the same dimensions (containing values from 0 to 1)

This must be a common problem but I am struggling to find the answer in the literature. Within the simulation model I have a parameter 'm' which I can alter to find the best fit (maximum likelihood). Currently I just use a 'sum of squares of differences' to measure fit, i.e.

likelihood = sum((obs - pred)^2)

It is then very easy to find (using numerical optimisation techniques) the value of 'm' which gives my maximum likelihood (least sum of squares). However, I do not think this likelihood function is the correct one to be using for this purpose. Firstly, if sum of squares is the correct method, maybe I should be taking the square root (makes no difference to the optimum) and possibly the mean values of the datasets may need to be included in my calculations. More importantly, sum of squares suggests my data are normally distributed, which they are clearly not:

Obs: boolean, 0 or 1
Pred: beta, 0 to 1
Difference: beta, -1 to 1

My guess is that I should be using a beta (or uniform) defined likelihood measure, or maybe just a simple transformation. Any help greatly appreciated. Mark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
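[A sketch of one natural alternative: if each cell is treated as an independent 0/1 outcome with success probability pred, the likelihood is Bernoulli, and maximizing its log is the same as minimizing cross-entropy (least squares here corresponds to the Brier score, a different criterion). The data and the sim() function mentioned in the comment are illustrative.]

```r
## Bernoulli log-likelihood of 0/1 observations given predicted probabilities
bernLL <- function(obs, pred, eps = 1e-12) {
  pred <- pmin(pmax(pred, eps), 1 - eps)   # guard against log(0)
  sum(obs * log(pred) + (1 - obs) * log(1 - pred))
}

obs  <- c(1, 0, 0, 1, 1)                   # invented observed presence/absence
pred <- c(0.8, 0.2, 0.4, 0.9, 0.6)         # invented simulated probabilities
bernLL(obs, pred)

## One would then pick m to maximize bernLL(obs, sim(m)), e.g.
##   optimize(function(m) bernLL(obs, sim(m)), interval, maximum = TRUE)
## where sim() is the poster's simulation model (hypothetical here).
```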
Re: [R] Likelihood of a ridge regression (lm.ridge)?
Joris,

Ridge regression is a type of regularized estimation approach. The least-squares objective function, (Y - Xb)^t (Y - Xb), is modified by adding a quadratic penalty, k b^t b. Because of this, the log-likelihood of the penalized fit (equivalently, its residual sum of squares), for a fixed k, does not have much meaning and is not really useful. However, a key issue in such regularized estimation is how to choose the regularization parameter k. You can see that lm.ridge gives two different ways to estimate k (there are other ways).

Ravi.

Ravi Varadhan, Ph.D.
Assistant Professor, Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University
Ph. (410) 502-2619
email: rvarad...@jhmi.edu

- Original Message -
From: joris meys jorism...@gmail.com
Date: Tuesday, March 17, 2009 7:37 pm
Subject: [R] Likelihood of a ridge regression (lm.ridge)?
To: R-help Mailing List r-help@r-project.org

> Dear all, I want to get the likelihood (or AIC or BIC) of a ridge regression model using lm.ridge from the MASS library. Yet, I can't really find it. As lm.ridge does not return a standard fit object, it doesn't work with functions like e.g. BIC (nlme package). Is there a way around it? I would calculate it myself, but I'm not sure how to do that for a ridge regression. Thank you in advance. Kind regards, Joris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
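[One workaround, an assumption on my part rather than anything lm.ridge itself reports: treat ridge as a linear smoother, take its effective degrees of freedom df(k) = sum_i d_i^2 / (d_i^2 + k) from the singular values d_i of the scaled design, and use that in an AIC-style criterion n log(RSS/n) + 2 df(k). The helper name is hypothetical.]

```r
## Effective df and an AIC-style score for a ridge fit with penalty k
ridge_df_aic <- function(X, y, k) {
  Xs <- scale(X)                 # lm.ridge also works on a scaled design
  ys <- y - mean(y)
  d2 <- svd(Xs)$d^2              # squared singular values
  edf <- sum(d2 / (d2 + k))      # trace of the ridge hat matrix
  b   <- solve(crossprod(Xs) + k * diag(ncol(Xs)), crossprod(Xs, ys))
  rss <- sum((ys - Xs %*% b)^2)
  n   <- length(y)
  c(edf = edf, AIC = n * log(rss / n) + 2 * edf)
}

ridge_df_aic(as.matrix(mtcars[, c("wt", "hp", "disp")]), mtcars$mpg, k = 1)
```

Comparing this score across a grid of k values gives one (heuristic) way to choose the penalty, alongside the GCV and HKB estimates that lm.ridge prints.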
[R] Likelihood of a ridge regression (lm.ridge)?
Dear all, I want to get the likelihood (or AIC or BIC) of a ridge regression model using lm.ridge from the MASS library. Yet, I can't really find it. As lm.ridge does not return a standard fit object, it doesn't work with functions like e.g. BIC (nlme package). Is there a way around it? I would calculate it myself, but I'm not sure how to do that for a ridge regression. Thank you in advance. Kind regards, Joris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Likelihood ratio test using R
Hi! I am working on logistic regression using R. My R script is as follows:

ONS <- read.csv("ONS.csv", header = TRUE)
ONS.glm <- glm(Y ~ Age1 + Age2 + Sex + Education + Profession + TimeInJob +
               TimeInCurrentJob + OwnResidence + Dependents + ApplIncome +
               FamilyInco + IIR + FOIR + YearsAtBank + SavingsAccount + CurrentAccount,
               family = binomial(link = "logit"), data = ONS)
summary(ONS.glm)

OUTPUT (RESULT) is as follows:

Deviance Residuals:
    Min      1Q  Median      3Q     Max
-1.7888 -0.6910 -0.3220 -0.2179  3.1608

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)      -1.454e+00  1.380e-01 -10.532  < 2e-16 ***
Age1              2.313e-01  9.902e-02   2.336   0.0195 *
Age2             -6.648e-01  5.209e-02 -12.761  < 2e-16 ***
Sex              -7.012e-01  7.325e-02  -9.572  < 2e-16 ***
Education        -5.271e-01  7.963e-02  -6.619 3.61e-11 ***
Profession       -3.512e-01  4.908e-02  -7.155 8.38e-13 ***
TimeInJob        -3.107e-01  1.980e-01  -1.569   0.1166
TimeInCurrentJob  2.752e+00  1.895e-01  14.521  < 2e-16 ***
OwnResidence      1.608e+00  1.858e-01   8.654  < 2e-16 ***
Dependents       -4.066e-01  1.867e-01  -2.178   0.0294 *
ApplIncome       -6.401e-02  4.319e-03 -14.820  < 2e-16 ***
FamilyInco        7.148e-02  7.389e-03   9.674  < 2e-16 ***
IIR               5.765e-02  1.353e-02   4.262 2.03e-05 ***
FOIR              7.383e-07  1.056e-07   6.990 2.74e-12 ***
YearsAtBank      -1.926e-07  3.519e-08  -5.473 4.42e-08 ***
SavingsAccount   -2.083e-07  1.785e-07  -1.167   0.2432
CurrentAccount   -1.550e+00  1.107e-01 -14.008  < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 39300 on 42499 degrees of freedom
Residual deviance: 32534 on 42483 degrees of freedom
AIC: 32568

Question No (1): What does this mean?

Deviance Residuals:
    Min      1Q  Median      3Q     Max
-1.7888 -0.6910 -0.3220 -0.2179  3.1608

Question No (2): What is the significance of these codes?

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Question No (3): What does this mean?
    Null deviance: 39300 on 42499 degrees of freedom
Residual deviance: 32534 on 42483 degrees of freedom

And this is the last question. Question No (4): To test H0: all regression coefficients are zero, I need to use the likelihood ratio test (G) and the related p-value. When I use Minitab, I get the value of G and the related p, but using R, how do I get the value of G and the p-value?

I am sincerely sorry for raising so many questions, but I am sure there will be many like me who will also be immensely benefited by the replies I may get pertaining to these questions.

Thanking you all in advance, with warm regards,
Maithili Shiva

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
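[Regarding question (4): with a glm fit, G is just the drop in deviance from the null model, so no extra package is needed. Sketched with the numbers printed in the summary() output above.]

```r
## G statistic and p-value from the summary() output above
G  <- 39300 - 32534                 # null deviance - residual deviance
df <- 42499 - 42483                 # 16 slope coefficients
pchisq(G, df, lower.tail = FALSE)   # essentially zero here

## In general, straight from the fitted object:
## with(ONS.glm, pchisq(null.deviance - deviance,
##                      df.null - df.residual, lower.tail = FALSE))
```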
[R] Likelihood ratio test using R
Re: [R] Likelihood ratio test using R
For the third one: see ?anova.glm; with test = "Chisq" it will give the LRT. For the first two, you can find the answer in an ordinary statistics book.

On Wed, Nov 5, 2008 at 1:11 PM, Maithili Shiva [EMAIL PROTECTED] wrote:
Hi! I am working on logistic regression using R. [...]

--
CH Chan
Research Assistant - KWH
http://www.macgrass.com
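The ?anova.glm route suggested above gives the same likelihood ratio test without any manual arithmetic. A sketch with simulated data (the original ONS data are not available):

```r
# Hedged sketch: anova() with test = "Chisq" performs sequential LR tests,
# comparing each model term against the model without it.
# Simulated stand-in data, not the poster's ONS data.
set.seed(2)
x <- rnorm(300)
y <- rbinom(300, 1, plogis(0.5 + x))
fit <- glm(y ~ x, family = binomial)

tab <- anova(fit, test = "Chisq")  # analysis-of-deviance table
tab
# The 'Deviance' entry for x is the LR statistic (G) for adding x;
# 'Pr(>Chi)' is its chi-squared p-value.
```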
Re: [R] Likelihood between observed and predicted response
Christophe LOOTS <Christophe.Loots at ifremer.fr> writes:

Thank you so much for your help. The function dbinom seems to work very well. However, I'm a bit lost with the dnorm function. Apparently I have to compute the mean mu and the standard deviation sd, but what does that mean exactly? I only have a vector of predicted responses and a vector of observed responses that I would like to compare! What are mu and sigma?

Thanks again.

Christophe

mu is the mean (which you might as well set to the predicted value). sd is the standard deviation; in order to calculate the likelihood in this case, you'll need an *independent* estimate (from somewhere) of the standard deviation. Without thinking about it too carefully, I think you could probably get this from

sqrt(sum((predicted - observed)^2)/(n - 1))
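Putting the pieces of this exchange together, the Gaussian case can be sketched as below. The observed and predicted vectors are hypothetical stand-ins for the poster's data, and the residual-based sd is the rough estimate suggested above, not an independent one:

```r
# Hedged sketch: -2 log-likelihood of observed values under a normal model
# centred on the predictions, with sd estimated from the residuals.
observed  <- c(2.1, 3.4, 1.9, 4.2, 3.8, 2.7)   # hypothetical observations
predicted <- c(2.0, 3.1, 2.2, 4.0, 3.5, 3.0)   # hypothetical predictions

n     <- length(observed)
sigma <- sqrt(sum((predicted - observed)^2) / (n - 1))  # rough sd estimate
m2ll  <- -2 * sum(dnorm(observed, mean = predicted, sd = sigma, log = TRUE))
m2ll
```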
Re: [R] Likelihood ratio test between glm and glmer fits
This particular case with a random intercept model can be handled by glmmML, by bootstrapping the p-value.

Best, Göran

On Thu, Jul 17, 2008 at 1:29 PM, Douglas Bates [EMAIL PROTECTED] wrote:
In that example I think the problem may be that I have not yet written the code to adjust the deviance of the glmer fit for the null deviance. [...]
Re: [R] Likelihood ratio test between glm and glmer fits
2008/7/16 Dimitris Rizopoulos [EMAIL PROTECTED]:
well, for computing the p-value you need to use pchisq() [...] lrt(gm0, gm1)

Yes, that seems quite natural, but then try to compare with the deviance:

logLik(gm0)
logLik(gm1)
(d0 <- deviance(gm0))
(d1 <- deviance(gm1))
(LR <- d0 - d1)
pchisq(LR, 1, lower = FALSE)

Obviously the deviance in glm is *not* twice the negative log-likelihood as it is in glmer. The question remains which of these two quantities is appropriate for the comparison. I am not sure exactly how the deviance and/or log-likelihood are calculated in glmer, but my feeling is that one should trust the deviance rather than the log-likelihoods for these purposes. This is supported by the following comparison: add an arbitrary random effect with a close-to-zero variance and note the deviance:

tmp <- rep(1:4, each = nrow(cbpp)/4)
gm2 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | tmp),
             family = binomial, data = cbpp)
(d2 <- deviance(gm2))

This deviance is very close to that obtained from the glm model. I have included the mixed-models mailing list in the hope that someone could explain how the deviance is computed in glmer and why deviances, but not likelihoods, are comparable to glm fits.

Best
Rune
Re: [R] Likelihood ratio test between glm and glmer fits
On Thu, Jul 17, 2008 at 2:50 AM, Rune Haubo [EMAIL PROTECTED] wrote:
I have included the mixed-models mailing list in the hope that someone could explain how the deviance is computed in glmer and why deviances, but not likelihoods, are comparable to glm fits. [...]

In that example I think the problem may be that I have not yet written the code to adjust the deviance of the glmer fit for the null deviance.
[R] Likelihood ratio test between glm and glmer fits
Dear list,

I am fitting a logistic multi-level regression model and need to test the difference between the ordinary logistic regression from a glm() fit and the mixed-effects fit from glmer(); basically I want to do a likelihood ratio test between the two fits.

The data are like this: my outcome is a (1,0) indicator for health status; I have several (1,0) dummy variables RURAL, SMOKE, DRINK, EMPLOYED, highereduc, INDIG, male, divorced, SINGLE, chronic, vigor_d and moderat_d, and AGE is continuous (20 to 100). My higher level is called munid and has 581 levels. The data have 45243 observations.

Here are my program statements:

#GLM fit
ph.fit.2 <- glm(poorhealth ~ RURAL + SMOKE + DRINK + EMPLOYED + highereduc +
                INDIG + AGE + male + divorced + SINGLE + chronic + vigor_d +
                moderat_d, family = binomial(), data = mx.merge)
#GLMER fit
ph.fit.3 <- glmer(poorhealth ~ RURAL + SMOKE + DRINK + EMPLOYED + INSURANCE +
                  highereduc + INDIG + AGE + male + divorced + SINGLE +
                  chronic + vigor_d + moderat_d + (1 | munid),
                  family = binomial(), data = mx.merge)

I cannot find a method in R that will do the LR test between a glm and a glmer fit, so I try to do it using the likelihoods from both models:

#form the likelihood ratio test between the glm and glmer fits
x2 <- -2*(logLik(ph.fit.2) - logLik(ph.fit.3))

      ML
79.60454
attr(,"nobs")
    n
45243
attr(,"nall")
    n
45243
attr(,"df")
[1] 14
attr(,"REML")
[1] FALSE
attr(,"class")
[1] "logLik"

#Get the associated p-value
dchisq(x2, 14)

         ML
5.94849e-15

Which looks like an improvement in model fit to me. Am I seeing this correctly, or are the two models even able to be compared? They are both estimated via maximum likelihood, so they should be, I think. Any help would be appreciated.

Corey

Corey S. Sparks, Ph.D.
Assistant Professor
Department of Demography and Organization Studies
University of Texas San Antonio
One UTSA Circle
San Antonio, TX 78249
email: [EMAIL PROTECTED]
web: https://rowdyspace.utsa.edu/users/ozd504/www/index.htm
Re: [R] Likelihood ratio test between glm and glmer fits
well, for computing the p-value you need to use pchisq() and not dchisq() (check ?dchisq for more info). For model fits with a logLik method you can directly use the following simple function:

lrt <- function (obj1, obj2) {
    L0 <- logLik(obj1)
    L1 <- logLik(obj2)
    L01 <- as.vector(-2 * (L0 - L1))
    df <- attr(L1, "df") - attr(L0, "df")
    list(L01 = L01, df = df,
         "p-value" = pchisq(L01, df, lower.tail = FALSE))
}

library(lme4)
gm0 <- glm(cbind(incidence, size - incidence) ~ period,
           family = binomial, data = cbpp)
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             family = binomial, data = cbpp)
lrt(gm0, gm1)

However, there are some issues regarding this likelihood ratio test.

1) The null hypothesis is on the boundary of the parameter space, i.e., you test whether the variance of the random effect is zero. For this case the assumed chi-squared distribution for the LRT may *not* be totally appropriate and may produce conservative p-values. There is some theory regarding this issue, which has shown that the reference distribution for the LRT in this case is a mixture of a chi-squared(df = 0) and a chi-squared(df = 1). Another option is a simulation-based approach, where you approximate the reference distribution of the LRT under the null using simulation. You may check below for an illustration of this procedure (not tested):

X <- model.matrix(gm0)
coefs <- coef(gm0)
pr <- plogis(c(X %*% coefs))
n <- length(pr)
new.dat <- cbpp
Tobs <- lrt(gm0, gm1)$L01
B <- 200
out.T <- numeric(B)
for (b in 1:B) {
    y <- rbinom(n, cbpp$size, pr)
    new.dat$incidence <- y
    fit0 <- glm(formula(gm0), family = binomial, data = new.dat)
    fit1 <- glmer(formula(gm1), family = binomial, data = new.dat)
    out.T[b] <- lrt(fit0, fit1)$L01
}
# estimate p-value
(sum(out.T >= Tobs) + 1) / (B + 1)

2) For the glmer fit you have to note that you work with an approximation to the log-likelihood (obtained using numerical integration) and not the actual log-likelihood.

I hope it helps.

Best,
Dimitris

Dimitris Rizopoulos
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
http://www.student.kuleuven.be/~m0390867/dimitris.htm

Quoting COREY SPARKS [EMAIL PROTECTED]:
Dear list, I am fitting a logistic multi-level regression model and need to test the difference between the ordinary logistic regression from a glm() fit and the mixed-effects fit from glmer(); basically I want to do a likelihood ratio test between the two fits. [...]
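For the boundary problem described in point 1), a common quick correction uses the 50:50 chi-squared mixture mentioned above: since the chi-squared(df = 0) component is a point mass at zero, the mixture p-value is simply half the chi-squared(df = 1) tail probability. A sketch; the statistic value 5.0 below is made up for illustration, standing in for the L01 value returned by lrt():

```r
# Hedged sketch: p-value for a single variance-component LRT under the
# 0.5*chisq(0) + 0.5*chisq(1) mixture reference distribution.
L01 <- 5.0  # hypothetical observed LRT statistic

p.naive   <- pchisq(L01, df = 1, lower.tail = FALSE)  # plain chisq(1) tail
p.mixture <- 0.5 * p.naive  # chisq(0) point mass at 0 adds nothing for L01 > 0
c(naive = p.naive, mixture = p.mixture)
```

The mixture p-value is always smaller than the naive one, which is why the naive chi-squared(1) test is conservative here.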
Re: [R] Likelihood between observed and predicted response
Christophe LOOTS <Christophe.Loots at ifremer.fr> writes:

Hi, I have two fitted models: one binomial model with presence-absence data that predicts probability of presence, and one Gaussian model (normal or log-normal abundances). I would like to evaluate these models not on their goodness of fit but on their predictive ability, by calculating the (log-)likelihood between predicted and observed values for each type of model. I found the following formula for a Bernoulli model:

-2 log lik = -2 sum( y*log(phat) + (1-y)*log(1-phat) )

where phat is the probability (between 0 and 1) and y is the observed value (0 or 1).

1) Can anybody tell me if this formula is statistically correct?

This looks correct.

2) Can someone tell me what the formula of the likelihood between observed and predicted values is for a Gaussian model?

-2 log L = sum((x_i - mu_i)^2)/sigma^2 + 2*n*log(sigma) + C

assuming independence and equal variances; but don't trust my algebra: see ?dnorm and take the log of the likelihood shown there for yourself.

You're reinventing the wheel a bit here:

-2*sum(dbinom(y, prob = phat, size = 1, log = TRUE))

and

-2*sum(dnorm(x, mean = mu, sd = sigma, log = TRUE))

will do what you want.

Ben Bolker
[R] Likelihood between observed and predicted response
Hi,

I have two fitted models: one binomial model with presence-absence data that predicts probability of presence, and one Gaussian model (normal or log-normal abundances). I would like to evaluate these models not on their goodness of fit but on their predictive ability, by calculating the (log-)likelihood between predicted and observed values for each type of model.

I found the following formula for a Bernoulli model:

-2 log lik = -2 sum( y*log(phat) + (1-y)*log(1-phat) )

where phat is the probability (between 0 and 1) and y is the observed value (0 or 1).

1) Can anybody tell me if this formula is statistically correct?
2) Can someone tell me what the formula of the likelihood between observed and predicted values is for a Gaussian model?

Thanks

--
Christophe LOOTS
PhD student - Hydrobiological modelling of fish habitats
Sea Fisheries Laboratory - IFREMER Boulogne sur Mer
150, Quai Gambetta. BP 699
62321 Boulogne sur Mer - FRANCE
Tél : +33(0)3 21 99 56 86
Fax : +33(0)3 21 99 56 01
E-mail : [EMAIL PROTECTED]
http://www.ifremer.fr/drvboulogne/labo/equipe.htm
[R] Likelihood ratio test for competing risks regression
R-helpers:

I have a question regarding the crr function of the cmprsk package for performing competing risks regression. Specifically, I was wondering if the standard likelihood ratio test for a categorical covariate applies. For example:

# Make up a fake example using the PBC dataset in the survival package
attach(pbc)
stage2 <- ifelse(stage == 2, 1, 0)
stage3 <- ifelse(stage == 3, 1, 0)
stage4 <- ifelse(stage == 4, 1, 0)
newstatus <- ifelse(status == 1, 1, ifelse(status == 0 & trt == 1, 2, 0))

fit1 <- crr(ftime = time, fstatus = newstatus, cencode = 0, failcode = 1,
            cov1 = cbind(age, stage2, stage3, stage4))
fit1
fit2 <- crr(ftime = time, fstatus = newstatus, cencode = 0, failcode = 1,
            cov1 = cbind(age))
1 - pchisq(2*(fit1$loglik - fit2$loglik), df = 3)

If I were interested in the overall impact of stage as a predictor in the fake model in fit1 (stage being represented by the three dummy variables stage2, stage3 and stage4), would the LR test calculated in the last line of code be valid? Given the pseudolikelihood used in the crr fit, I am not sure if this is a valid statistical test.

Thanks for any comments in advance,

Brant Inman
Mayo Clinic

--
> sessionInfo()
R version 2.6.1 (2007-11-26)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] splines   grid      stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] cmprsk_2.1-7       mitools_1.0        lme4_0.99875-9
[4] Matrix_0.999375-3  nlme_3.1-86        latticeExtra_0.3-1
[7] RColorBrewer_1.0-2 survival_2.34      MASS_7.2-39
[10] foreign_0.8-23    RGraphics_1.0-6    lattice_0.17-4

loaded via a namespace (and not attached):
[1] rcompgen_0.1-17
Re: [R] Likelihood optimization numerically
Dear List, Probably i am missing something important in optimize:

llk.1st <- function(alpha){
  x <- c(20.0, 23.9, 20.9, 23.8, 25.0, 24.0, 21.7, 23.8,
         22.8, 23.1, 23.1, 23.5, 23.0, 23.0)
  n <- length(x)
  llk1 <- -n*log(gamma(alpha)) - n*alpha*log(sum(x)/(n*alpha)) +
          (alpha - 1)*(sum(log(x))) - (sum(x))/(sum(x)/(n*alpha))
  return(llk1)
}
plot(llk.1st, 1, 1000)
optimize(f=llk.1st, interval=c(0,1000), tol=0.0001)

The reported result is approximately alpha = 500, beta = 0.05. But i get:

$minimum
[1] 1000   ## whatever the upper bound is!!

$objective
[1] -Inf

There were 50 or more warnings (use warnings() to see the first 50)

also,

optim(par = 500, fn = llk.1st)
Error in optim(par = 500, fn = llk.1st) :
  function cannot be evaluated at initial parameters
In addition: Warning messages:
1: In optim(par = 500, fn = llk.1st) :
  one-diml optimization by Nelder-Mead is unreliable: use optimize
2: In fn(par, ...) : value out of range in 'gammafn'

Can anyone provide me a hint? Thank you for your time.

Ehsan
http://www.youtube.com/profile_play_list?user=wildsc0p

- Original Message
From: Mohammad Ehsanul Karim [EMAIL PROTECTED]
To: r-help@r-project.org
Sent: Sunday, January 27, 2008 12:47:40 AM
Subject: Likelihood optimization numerically

Dear List, I am not sure how i should optimize a log-likelihood numerically. Here is a textbook example from Statistical Inference, 2nd Edition, by George Casella and Roger L. Berger (2002, p. 355, ex. 7.2.b):

data = x = c(20.0, 23.9, 20.9, 23.8, 25.0, 24.0, 21.7, 23.8, 22.8, 23.1, 23.1, 23.5, 23.0, 23.0)
n <- length(x)
# likelihood from a 2 parameter Gamma(alpha, beta), both unknown
llk = -n*log(gamma(alpha)) - n*alpha*log(beta) + (alpha - 1)*(sum(log(x))) - (sum(x))/beta
# analytic 1st derivative solution w.r.t alpha, assuming beta known
# by putting MLE of beta = sum(x)/(n*alpha)
# (to simplify as far as possible analytically)
llk.1st = -n*digamma(alpha) - n*(log(sum(x)/(n*alpha))+1) + (sum(log(x)))

It feels like i should use nls(...
, trace=T, start=c(alpha=...), nls.control(maxiter=100, tol=.1)) but not sure how.

R.Version() $platform [1] i386-pc-mingw32 $arch [1] i386 $os [1] mingw32 $system [1] i386, mingw32 $status [1] $major [1] 2 $minor [1] 6.1 $year [1] 2007 $month [1] 11 $day [1] 26 $`svn rev` [1] 43537 $language [1] R $version.string [1] R version 2.6.1 (2007-11-26)
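Two likely culprits in the code above, sketched as a possible fix (not from the thread): first, optimize() minimizes by default, so it needs maximum = TRUE; second, gamma(alpha) overflows to Inf for alpha above roughly 171, which is why the objective becomes -Inf and optim complains about 'gammafn' — lgamma() avoids the overflow:

```r
x <- c(20.0, 23.9, 20.9, 23.8, 25.0, 24.0, 21.7, 23.8,
       22.8, 23.1, 23.1, 23.5, 23.0, 23.0)
n <- length(x)

## Profile log-likelihood in alpha, with beta replaced by its
## conditional MLE sum(x)/(n*alpha). lgamma() is used instead of
## log(gamma()) so large alpha does not overflow.
llk <- function(alpha) {
  beta.hat <- sum(x) / (n * alpha)
  -n * lgamma(alpha) - n * alpha * log(beta.hat) +
    (alpha - 1) * sum(log(x)) - sum(x) / beta.hat
}
## ask for the maximum, not the (default) minimum
optimize(llk, interval = c(1, 1000), maximum = TRUE)
## the maximizer is the MLE of alpha; compare fitdistr(x, "gamma")
```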
[R] Likelihood optimization numerically
Dear List, I am not sure how i should optimize a log-likelihood numerically. Here is a textbook example from Statistical Inference, 2nd Edition, by George Casella and Roger L. Berger (2002, p. 355, ex. 7.2.b):

data = x = c(20.0, 23.9, 20.9, 23.8, 25.0, 24.0, 21.7, 23.8, 22.8, 23.1, 23.1, 23.5, 23.0, 23.0)
n <- length(x)
# likelihood from a 2 parameter Gamma(alpha, beta), both unknown
llk = -n*log(gamma(alpha)) - n*alpha*log(beta) + (alpha - 1)*(sum(log(x))) - (sum(x))/beta
# analytic 1st derivative solution w.r.t alpha, assuming beta known
# by putting MLE of beta = sum(x)/(n*alpha)
# (to simplify as far as possible analytically)
llk.1st = -n*digamma(alpha) - n*(log(sum(x)/(n*alpha))+1) + (sum(log(x)))

It feels like i should use nls(... , trace=T, start=c(alpha=...), nls.control(maxiter=100, tol=.1)) but not sure how.

Can anyone provide me a hint? Thank you for your time.

Ehsan
http://www.youtube.com/profile_play_list?user=wildsc0p

R.Version() $platform [1] i386-pc-mingw32 $arch [1] i386 $os [1] mingw32 $system [1] i386, mingw32 $status [1] $major [1] 2 $minor [1] 6.1 $year [1] 2007 $month [1] 11 $day [1] 26 $`svn rev` [1] 43537 $language [1] R $version.string [1] R version 2.6.1 (2007-11-26)
Re: [R] Likelihood optimization numerically
You asked for a hint.

library(MASS)
x <- c(20.0, 23.9, 20.9, 23.8, 25.0, 24.0, 21.7, 23.8,
       22.8, 23.1, 23.1, 23.5, 23.0, 23.0)
fitdistr(x, "gamma")
      shape         rate
  316.563872    13.780766
 (119.585534)  ( 5.209952)

To do it with more general and elementary tools, look at ?optim

nls(...)? Not relevant at all, no matter how it feels.

Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163 AUSTRALIA
Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary): +61 7 3826 7304
Mobile: +61 4 8819 4402
Home Phone: +61 7 3286 7700
mailto:[EMAIL PROTECTED]
http://www.cmis.csiro.au/bill.venables/

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mohammad Ehsanul Karim
Sent: Sunday, 27 January 2008 4:48 PM
To: r-help@r-project.org
Subject: [R] Likelihood optimization numerically

Dear List, I am not sure how i should optimize a log-likelihood numerically. Here is a textbook example from Statistical Inference, 2nd Edition, by George Casella and Roger L. Berger (2002, p. 355, ex. 7.2.b):

data = x = c(20.0, 23.9, 20.9, 23.8, 25.0, 24.0, 21.7, 23.8, 22.8, 23.1, 23.1, 23.5, 23.0, 23.0)
n <- length(x)
# likelihood from a 2 parameter Gamma(alpha, beta), both unknown
llk = -n*log(gamma(alpha)) - n*alpha*log(beta) + (alpha - 1)*(sum(log(x))) - (sum(x))/beta
# analytic 1st derivative solution w.r.t alpha, assuming beta known
# by putting MLE of beta = sum(x)/(n*alpha)
# (to simplify as far as possible analytically)
llk.1st = -n*digamma(alpha) - n*(log(sum(x)/(n*alpha))+1) + (sum(log(x)))

It feels like i should use nls(... , trace=T, start=c(alpha=...), nls.control(maxiter=100, tol=.1)) but not sure how.

Can anyone provide me a hint? Thank you for your time.
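Following the ?optim pointer, a minimal two-parameter sketch (not from the thread): maximize the gamma log-likelihood by minimizing its negative, with the parameters on the log scale so the search stays unconstrained:

```r
x <- c(20.0, 23.9, 20.9, 23.8, 25.0, 24.0, 21.7, 23.8,
       22.8, 23.1, 23.1, 23.5, 23.0, 23.0)

## negative log-likelihood of Gamma(shape, rate), parameters on log scale;
## dgamma(..., log = TRUE) keeps the arithmetic stable for large shape
nll <- function(par) {
  shape <- exp(par[1]); rate <- exp(par[2])
  -sum(dgamma(x, shape = shape, rate = rate, log = TRUE))
}
fit <- optim(c(log(10), log(1)), nll)
exp(fit$par)   # should roughly match fitdistr(x, "gamma")'s shape and rate
```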
Ehsan
http://www.youtube.com/profile_play_list?user=wildsc0p
Re: [R] likelihood from test result
Dear David, no, distrTEst won't help. It has a different intention. We are currently working on a new package distrMod (cf. https://r-forge.r-project.org/projects/distrmod/) which sometime might have such a functionality. Best, Matthias

David Bickel wrote:

Is there any automatic mechanism for extracting a likelihood or test statistic distribution (PDF or CDF) from an object of class htest or from another object of a general class encoding a hypothesis test result? I would like to have a function that takes x, an object of class htest, as its only argument and that returns the likelihood or test statistic distribution that was used to compute the p-value. It seems the only way to write such a function is to manually assign each test its statistic's distribution, e.g., like this:

FUN <- if (names(x$statistic) == "t") dt else
       if (names(x$statistic) == "X-squared") dchisq # etc.

Is there a general S3 or S4 class other than htest that would better accommodate such extraction of distributions or likelihoods? I would also appreciate any suggestions for strategies or contributed packages that may facilitate automation. For example, would the distrTEst package help? David

__
David R. Bickel
Ottawa Institute of Systems Biology
http://www.oisb.ca/members.htm
[R] likelihood from test result
Is there any automatic mechanism for extracting a likelihood or test statistic distribution (PDF or CDF) from an object of class htest or from another object of a general class encoding a hypothesis test result? I would like to have a function that takes x, an object of class htest, as its only argument and that returns the likelihood or test statistic distribution that was used to compute the p-value. It seems the only way to write such a function is to manually assign each test its statistic's distribution, e.g., like this:

FUN <- if (names(x$statistic) == "t") dt else
       if (names(x$statistic) == "X-squared") dchisq # etc.

Is there a general S3 or S4 class other than htest that would better accommodate such extraction of distributions or likelihoods? I would also appreciate any suggestions for strategies or contributed packages that may facilitate automation. For example, would the distrTEst package help? David

__
David R. Bickel
Ottawa Institute of Systems Biology
http://www.oisb.ca/members.htm
Re: [R] likelihood from test result
You could create an S3 generic that does it. That is not initially any less work than the if statement, but if you add a new distribution no existing code need be modified. Just add a new method for each distribution to be supported:

getDistr <- function(x) {
  .Class <- names(x$value$statistic)
  NextMethod("getDistr")
}

# initial list of distributions supported
getDistr.t <- function(x) dt
`getDistr.X-squared` <- function(x) dchisq

# test the two distributions
example(t.test)
getDistr(.Last.value)
example(prop.test)
getDistr(.Last.value)

On Jan 9, 2008 10:46 AM, David Bickel [EMAIL PROTECTED] wrote:

Is there any automatic mechanism for extracting a likelihood or test statistic distribution (PDF or CDF) from an object of class htest or from another object of a general class encoding a hypothesis test result? I would like to have a function that takes x, an object of class htest, as its only argument and that returns the likelihood or test statistic distribution that was used to compute the p-value. It seems the only way to write such a function is to manually assign each test its statistic's distribution, e.g., like this:

FUN <- if (names(x$statistic) == "t") dt else
       if (names(x$statistic) == "X-squared") dchisq # etc.

Is there a general S3 or S4 class other than htest that would better accommodate such extraction of distributions or likelihoods? I would also appreciate any suggestions for strategies or contributed packages that may facilitate automation. For example, would the distrTEst package help? David

__
David R. Bickel
Ottawa Institute of Systems Biology
http://www.oisb.ca/members.htm
Re: [R] likelihood from test result
David Bickel wrote:

Is there any automatic mechanism for extracting a likelihood or test statistic distribution (PDF or CDF) from an object of class htest or from another object of a general class encoding a hypothesis test result? I would like to have a function that takes x, an object of class htest, as its only argument and that returns the likelihood or test statistic distribution that was used to compute the p-value. It seems the only way to write such a function is to manually assign each test its statistic's distribution, e.g., like this:

FUN <- if (names(x$statistic) == "t") dt else
       if (names(x$statistic) == "X-squared") dchisq # etc.

Just take the p-value itself as the test statistic and use dunif. I think doing this may also show you that you have a conceptual problem (What would you do with the likelihood?)

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                 FAX: (+45) 35327907
[R] Likelihood ratio test for proportional odds logistic regression
Hi, I want to do a global likelihood ratio test for the proportional odds logistic regression model and am unsure how to go about it. I am using the polr() function in library(MASS).

1. Is the p-value from the likelihood ratio test obtained by anova(fit1,fit2), where fit1 is the polr model with only the intercept and fit2 is the full polr model (refer to example below)? So in the case of the example below, the p-value would be 1.

2. For the model in which there is only one independent variable, I would expect the Wald test and the likelihood ratio test to give similar p-values. However the p-values obtained from anova(fit1,fit3) (refer to example below) are very different (0.0002622986 vs. 1). Why is this so?

library(MASS)
fit1 <- polr(housing$Sat ~ 1)
fit2 <- polr(housing$Sat ~ housing$Infl)
fit3 <- polr(housing$Sat ~ housing$Cont)

summary(fit1)
Re-fitting to get Hessian
Call: polr(formula = housing$Sat ~ 1)
No coefficients

Intercepts:
            Value   Std. Error t value
Low|Medium  -0.6931 0.2500     -2.7726
Medium|High  0.6931 0.2500      2.7726

Residual Deviance: 158.2002
AIC: 162.2002

summary(fit2)
Re-fitting to get Hessian
Call: polr(formula = housing$Sat ~ housing$Infl)

Coefficients:
                          Value Std. Error      t value
housing$InflMedium 6.347464e-06  0.5303301 1.196889e-05
housing$InflHigh   6.347464e-06  0.5303301 1.196889e-05

Intercepts:
            Value   Std. Error t value
Low|Medium  -0.6931 0.3953     -1.7535
Medium|High  0.6932 0.3953      1.7536

Residual Deviance: 158.2002
AIC: 166.2002

summary(fit3)
Re-fitting to get Hessian
Call: polr(formula = housing$Sat ~ housing$Cont)

Coefficients:
                        Value Std. Error      t value
housing$ContHigh 0.0001135777  0.4330091 0.0002622986

Intercepts:
            Value   Std. Error t value
Low|Medium  -0.6931 0.3307     -2.0956
Medium|High  0.6932 0.3307      2.0960

Residual Deviance: 158.2002
AIC: 164.2002

anova(fit1,fit2)
Likelihood ratio tests of ordinal regression models

Response: housing$Sat
          Model Resid. df Resid. Dev   Test Df      LR stat. Pr(Chi)
1             1        70   158.2002
2  housing$Infl        68   158.2002 1 vs 2  2 -6.375558e-10       1

anova(fit1,fit3)
Likelihood ratio tests of ordinal regression models

Response: housing$Sat
          Model Resid. df Resid. Dev   Test Df      LR stat. Pr(Chi)
1             1        70   158.2002
2  housing$Cont        69   158.2002 1 vs 2  1 -1.224427e-07       1

Thank you, Xinyi

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Likelihood ratio test for proportional odds logistic regression
On Sat, 5 Jan 2008, xinyi lin wrote:

> Hi, I want to do a global likelihood ratio test for the proportional odds
> logistic regression model and am unsure how to go about it. I am using the
> polr() function in library(MASS).
>
> 1. Is the p-value from the likelihood ratio test obtained by
> anova(fit1,fit2), where fit1 is the polr model with only the intercept and
> fit2 is the full polr model (refer to example below)? So in the case of the
> example below, the p-value would be 1.

There is no improvement in fit, as the near-zero coefficients show. You are not calling polr correctly on this example: 'why is this so?' given that it *is* the example on the help page and all you had to do was to read the help (or the book for which this is support software).

> 2. For the model in which there is only one independent variable, I would
> expect the Wald test and the likelihood ratio test to give similar p-values.
> However the p-values obtained from anova(fit1,fit3) (refer to example below)
> are very different (0.0002622986 vs. 1). Why is this so?

Because you compared a t-value to a p-value, not at all the same thing.

> library(MASS)
> fit1 <- polr(housing$Sat ~ 1)
> fit2 <- polr(housing$Sat ~ housing$Infl)
> fit3 <- polr(housing$Sat ~ housing$Cont)
>
> summary(fit1)
> Re-fitting to get Hessian
> Call: polr(formula = housing$Sat ~ 1)
> No coefficients
>
> Intercepts:
>             Value   Std. Error t value
> Low|Medium  -0.6931 0.2500     -2.7726
> Medium|High  0.6931 0.2500      2.7726
>
> Residual Deviance: 158.2002
> AIC: 162.2002
>
> summary(fit2)
> Re-fitting to get Hessian
> Call: polr(formula = housing$Sat ~ housing$Infl)
>
> Coefficients:
>                           Value Std. Error      t value
> housing$InflMedium 6.347464e-06  0.5303301 1.196889e-05
> housing$InflHigh   6.347464e-06  0.5303301 1.196889e-05
>
> Intercepts:
>             Value   Std. Error t value
> Low|Medium  -0.6931 0.3953     -1.7535
> Medium|High  0.6932 0.3953      1.7536
>
> Residual Deviance: 158.2002
> AIC: 166.2002
>
> summary(fit3)
> Re-fitting to get Hessian
> Call: polr(formula = housing$Sat ~ housing$Cont)
>
> Coefficients:
>                         Value Std. Error      t value
> housing$ContHigh 0.0001135777  0.4330091 0.0002622986
>
> Intercepts:
>             Value   Std. Error t value
> Low|Medium  -0.6931 0.3307     -2.0956
> Medium|High  0.6932 0.3307      2.0960
>
> Residual Deviance: 158.2002
> AIC: 164.2002
>
> anova(fit1,fit2)
> Likelihood ratio tests of ordinal regression models
>
> Response: housing$Sat
>           Model Resid. df Resid. Dev   Test Df      LR stat. Pr(Chi)
> 1             1        70   158.2002
> 2  housing$Infl        68   158.2002 1 vs 2  2 -6.375558e-10       1
>
> anova(fit1,fit3)
> Likelihood ratio tests of ordinal regression models
>
> Response: housing$Sat
>           Model Resid. df Resid. Dev   Test Df      LR stat. Pr(Chi)
> 1             1        70   158.2002
> 2  housing$Cont        69   158.2002 1 vs 2  1 -1.224427e-07       1
>
> Thank you, Xinyi

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road,      +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
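For reference, the help-page call the reply alludes to: housing is aggregated data, so each row carries a frequency and the fit must weight by Freq. A sketch of the corrected comparison:

```r
library(MASS)
## the ?polr example weights by the cell frequencies in housing
house.plr0 <- polr(Sat ~ 1, weights = Freq, data = housing)
house.plr1 <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
anova(house.plr0, house.plr1)   # global likelihood ratio test
```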
[R] Likelihood ratio test on glm
I would like to try a likelihood ratio test in place of waldtest. Ideally I'd like to provide two glm models, the second a submodel of the first, in the style of lrt (http://www.pik-potsdam.de/~hrust/tools/farismahelp/lrt.html). [lrt takes farisma objects] Does anyone know of such a likelihood ratio test?

Chris Elsaesser, PhD
Principal Scientist, Machine Learning
SPADAC Inc.
7921 Jones Branch Dr. Suite 600
McLean, VA 22102
703.371.7301 (m)
703.637.9421 (o)
Re: [R] Likelihood ratio test on glm
Chris Elsaesser wrote:

I would like to try a likelihood ratio test in place of waldtest. Ideally I'd like to provide two glm models, the second a submodel of the first, in the style of lrt (http://www.pik-potsdam.de/~hrust/tools/farismahelp/lrt.html). [lrt takes farisma objects] Does anyone know of such a likelihood ratio test?

I think anova(model1, model2, test = "Chisq") will do what you want.

--
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Department of Public Health Sciences
Faculty of Medicine, University of Toronto
email: [EMAIL PROTECTED] Tel: 416.864.5776 Fax: 416.864.6057
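In full, a sketch of that call on a nested pair of GLMs (the data here are made up for illustration, not from the thread):

```r
## nested logistic models; the LR test compares their deviances
set.seed(1)                                   # illustrative data only
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$y <- rbinom(100, 1, plogis(0.5 * d$x1))
full <- glm(y ~ x1 + x2, family = binomial, data = d)
sub  <- glm(y ~ x1,      family = binomial, data = d)
anova(sub, full, test = "Chisq")   # LR (deviance) chi-squared test of x2
```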
Re: [R] Likelihood ratio test on glm
chris, as long as you know the log likelihood functions and the # of parameters in both models, a pencil and a piece of paper should be enough to calculate the LR test.

On 9/21/07, Chris Elsaesser [EMAIL PROTECTED] wrote:

I would like to try a likelihood ratio test in place of waldtest. Ideally I'd like to provide two glm models, the second a submodel of the first, in the style of lrt (http://www.pik-potsdam.de/~hrust/tools/farismahelp/lrt.html). [lrt takes farisma objects] Does anyone know of such a likelihood ratio test?

Chris Elsaesser, PhD
Principal Scientist, Machine Learning
SPADAC Inc.
7921 Jones Branch Dr. Suite 600
McLean, VA 22102
703.371.7301 (m)
703.637.9421 (o)

--
===
I am dying with the help of too many physicians.
- Alexander the Great, on his deathbed
===
WenSui Liu (http://spaces.msn.com/statcompute/blog)
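R will also hold the pencil. A minimal sketch (the helper name is made up), assuming both fitted objects support logLik(), which glm fits do:

```r
## generic LR test for two nested fits; df is taken from the
## "df" attribute that logLik() attaches to its result
lr.stat <- function(full, reduced) {
  lr <- as.numeric(2 * (logLik(full) - logLik(reduced)))
  df <- attr(logLik(full), "df") - attr(logLik(reduced), "df")
  c(chisq = lr, df = df, p.value = pchisq(lr, df, lower.tail = FALSE))
}
## usage: lr.stat(full.glm, sub.glm)
```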
Re: [R] Likelihood ratio test on glm
On Fri, 21 Sep 2007, Wensui Liu wrote:

chris, as long as you know the log likelihood functions and the # of parameters in both models, a pencil and a piece of paper should be enough to calculate the LR test.

True enough for the LR statistic. Or follow the instructions in the _posting guide_ and try

RSiteSearch("glm likelihood")

and page thru the results looking for entries like this one: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/76603.html

Chuck

On 9/21/07, Chris Elsaesser [EMAIL PROTECTED] wrote:

I would like to try a likelihood ratio test in place of waldtest. Ideally I'd like to provide two glm models, the second a submodel of the first, in the style of lrt (http://www.pik-potsdam.de/~hrust/tools/farismahelp/lrt.html). [lrt takes farisma objects] Does anyone know of such a likelihood ratio test?

Chris Elsaesser, PhD
Principal Scientist, Machine Learning
SPADAC Inc.
7921 Jones Branch Dr. Suite 600
McLean, VA 22102
703.371.7301 (m)
703.637.9421 (o)

Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]
UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/
La Jolla, San Diego 92093-0901