Re: [R] overdispersion and quasibinomial model

2009-11-25 Thread Peter Ehlers



djpren wrote:

Thanks for the reply. Naturally I already searched the site and help for the
answers to these questions. I think I've figured out how to run a
quasi-binomial model, but I cannot figure out how to test for
over-dispersion or how to apply a shapiro-wilk test.

This is not homework, neither do I have an instructor who is proficient in
using R. This program was suggested to me by another researcher after he
witnessed my frustration with the inflexibility of SPSS and other such
programs. I am on a very tight schedule and I don't have time to become a
statistician and computer scientist, which is why I wrote 3 very quick
questions asking for commands that i had already tried to find myself.

Testing for over-dispersion is probably something I can eventually get to
grips with, since I just have to get variance for the real and modelled data.
However, I cannot find a command to do shapiro-wilks on the site or on these
forums. Also, why do you say that most people here wouldn't recommend this
procedure?


The customary way (well, at least to me) to check for overdispersion
is to look at the ratio of the sum of squared Pearson residuals
to the residual degrees of freedom. This is well discussed in
MASS (the book).

Example:

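library(MASS)   # provides the birthwt data set
fm1 <- glm(low ~ age + race, family = binomial, data = birthwt)
# sum of squared Pearson residuals over residual degrees of freedom
phi <- sum(resid(fm1, type = "pearson")^2) / df.residual(fm1)
phi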
#[1] 1.011612

For a binomial glm, this value is expected to be near 1.0
as it is here. So there is no indication of overdispersion
in this example.
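
The calculation above can be wrapped in a small helper so that it
can be reused on other fits. This is only a sketch: the name
dispersion_ratio is made up here and is not a function from MASS
or base R.

```r
library(MASS)  # for the birthwt data used in the example above

# Hypothetical helper (not part of any package): Pearson chi-square
# statistic divided by the residual degrees of freedom.
dispersion_ratio <- function(model) {
  sum(residuals(model, type = "pearson")^2) / df.residual(model)
}

fm1 <- glm(low ~ age + race, family = binomial, data = birthwt)
phi <- dispersion_ratio(fm1)
phi  # values near 1 give no indication of overdispersion
```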

I don't know of a specific test for overdispersion. Personally,
I start to worry about the adequacy of the model if the data
set is large and phi is greater than about 1.2. For small data
sets I wouldn't be too concerned if phi is less than 1.5.
But this all depends crucially on what you want to do with
your model results. Allowing phi to be greater than 1.0 will
provide more conservative inference about the parameters.
Note that using family="quasibinomial" won't change the
parameter estimates, just their SEs.

fm2 <- glm(low ~ age + race, family = quasibinomial, data = birthwt)

Now you can compare summary(fm1) with summary(fm2).
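
A sketch of that comparison, assuming the same birthwt fits as
above (the object names s1, s2 and se_ratio are mine): the
coefficients should agree, and the quasibinomial standard errors
should be the binomial ones multiplied by the square root of the
dispersion that summary() estimates from the Pearson residuals.

```r
library(MASS)

fm1 <- glm(low ~ age + race, family = binomial, data = birthwt)
fm2 <- glm(low ~ age + race, family = quasibinomial, data = birthwt)

s1 <- summary(fm1)
s2 <- summary(fm2)

# Same point estimates under both families
all.equal(coef(fm1), coef(fm2))

# Dispersion: fixed at 1 for binomial, estimated for quasibinomial
s1$dispersion
s2$dispersion

# Ratio of standard errors; each should equal sqrt(s2$dispersion)
se_ratio <- s2$coefficients[, "Std. Error"] / s1$coefficients[, "Std. Error"]
se_ratio
sqrt(s2$dispersion)
```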

What Shapiro-Wilk has to do with this is: Nothing!

 -Peter Ehlers



David Winsemius wrote:


On Nov 24, 2009, at 3:41 PM, djpren wrote:


I am looking for the correct commands to do the following things:

1. I have a binomial logistic regression model and i want to test for
overdispersion.

Under the teach a man to fish precept,   ... try:

RSiteSearch("test over dispersion binomial models")

2. If I do indeed have overdispersion i need to then run a
quasi-binomial model, but I'm not sure of the command.

?glm
# and follow the appropriate links

3. I can get the residuals of the model, but i need to then apply a
shapiro wilk test to test them. Does anyone know the command for this?


RSiteSearch("shapiro-wilks")   # not that people here recommend this procedure


The overall flavor of these questions is "homework", so I'm  
speculating that you may want to consult your instructors.


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.








Re: [R] overdispersion and quasibinomial model

2009-11-25 Thread David Winsemius


On Nov 25, 2009, at 7:04 AM, djpren wrote:



Thanks for the reply. Naturally I already searched the site and help for the
answers to these questions. I think I've figured out how to run a
quasi-binomial model, but I cannot figure out how to test for
over-dispersion or how to apply a shapiro-wilk test.

This is not homework, neither do I have an instructor who is proficient in
using R. This program was suggested to me by another researcher after he
witnessed my frustration with the inflexibility of SPSS and other such
programs. I am on a very tight schedule and I don't have time to become a
statistician and computer scientist, which is why I wrote 3 very quick
questions asking for commands that i had already tried to find myself.


"Quick questions" are somewhat deprecated here. Have you read the  
Posting Guide? Its overall message is that the list readership expects  
more detail rather than less. Perhaps with a better search method and  
a pointer to the glm()  function, which will do what was requested,   
you might compose a more complete description of the data and the  
problem, and offer code that shows what progress you are making.




Testing for over-dispersion is probably something I can eventually get to
grips with, since I just have to get variance for the real and modelled data.
However, I cannot find a command to do shapiro-wilks on the site or on these
forums.


I would have thought my original reply would have pointed the way to  
more effective searching. The obvious search strategy using the  
RSiteSearch function would seem to be:


> RSiteSearch("shapiro wilks")
A search query has been submitted to http://search.r-project.org
The results page should open in your browser shortly

A Browser window did open up and there were 8 hits, at least two of  
which were to functions that would do what you appear to be determined  
to do on a rather dubious basis.




Also, why do you say that most people here wouldn't recommend this
procedure?


Are you doing this because some reviewer asked you to do so or because  
you are copying a path that someone else laid out for you? Testing for  
normality in a binomial model seems rather puzzling on the face of it.


--
David.




David Winsemius wrote:



On Nov 24, 2009, at 3:41 PM, djpren wrote:



I am looking for the correct commands to do the following things:

1. I have a binomial logistic regression model and i want to test for
overdispersion.


Under the teach a man to fish precept,   ... try:

RSiteSearch("test over dispersion binomial models")


2. If I do indeed have overdispersion i need to then run a
quasi-binomial model, but I'm not sure of the command.


?glm
# and follow the appropriate links


3. I can get the residuals of the model, but i need to then apply a
shapiro wilk test to test them. Does anyone know the command for this?



RSiteSearch("shapiro-wilks")   # not that people here recommend this procedure

The overall flavor of these questions is "homework", so I'm
speculating that you may want to consult your instructors.

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT





--
View this message in context: 
http://old.nabble.com/overdispersion-and-quasibinomial-model-tp26502728p26511410.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] overdispersion and quasibinomial model

2009-11-25 Thread djpren

Thanks for the reply. Naturally I already searched the site and help for the
answers to these questions. I think I've figured out how to run a
quasi-binomial model, but I cannot figure out how to test for
over-dispersion or how to apply a shapiro-wilk test.

This is not homework, neither do I have an instructor who is proficient in
using R. This program was suggested to me by another researcher after he
witnessed my frustration with the inflexibility of SPSS and other such
programs. I am on a very tight schedule and I don't have time to become a
statistician and computer scientist, which is why I wrote 3 very quick
questions asking for commands that i had already tried to find myself.

Testing for over-dispersion is probably something I can eventually get to
grips with, since I just have to get variance for the real and modelled data.
However, I cannot find a command to do shapiro-wilks on the site or on these
forums. Also, why do you say that most people here wouldn't recommend this
procedure?


David Winsemius wrote:
> 
> 
> On Nov 24, 2009, at 3:41 PM, djpren wrote:
> 
>>
>> I am looking for the correct commands to do the following things:
>>
>> 1. I have a binomial logistic regression model and i want to test for
>> overdispersion.
> 
> Under the teach a man to fish precept,   ... try:
> 
> RSiteSearch("test over dispersion binomial models")
> 
>> 2. If I do indeed have overdispersion i need to then run a
>> quasi-binomial model, but I'm not sure of the command.
> 
> ?glm
> # and follow the appropriate links
> 
>> 3. I can get the residuals of the model, but i need to then apply a
>> shapiro wilk test to test them. Does anyone know the command for this?
> 
> 
> RSiteSearch("shapiro-wilks")   # not that people here recommend this procedure
> 
> The overall flavor of these questions is "homework", so I'm  
> speculating that you may want to consult your instructors.
> 
> -- 
> 
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/overdispersion-and-quasibinomial-model-tp26502728p26511410.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] overdispersion and quasibinomial model

2009-11-25 Thread Ben Bolker



djpren wrote:
> 
> Thanks for the reply. Naturally I already searched the site and help for
> the answers to these questions. I think I've figured out how to run a
> quasi-binomial model, but I cannot figure out how to test for
> over-dispersion or how to apply a shapiro-wilk test.
> 
> This is not homework, neither do I have an instructor who is proficient in
> using R. This program was suggested to me by another researcher after he
> witnessed my frustration with the inflexibility of SPSS and other such
> programs. I am on a very tight schedule and I don't have time to become a
> statistician and computer scientist, which is why I wrote 3 very quick
> questions asking for commands that i had already tried to find myself.
> 
> Testing for over-dispersion is probably something I can eventually get to
> grips with, since I just have to get variance for the real and modelled data.
> However, I cannot find a command to do shapiro-wilks on the site or on
> these forums. Also, why do you say that most people here wouldn't
> recommend this procedure?
> 
> 

??shapiro
stats::shapiro.test Shapiro-Wilk Normality Test

(maybe you were searching for "shapiro-wilks" (sic)?)

People often disrecommend statistical tests of normality because they 
have low power for small data sets (hence you don't have power to
detect non-normality when it is present) and high power for large
data sets even when the degree of non-normality detected is not
enough to invalidate the results of some statistical procedures.
Under what circumstances are the residuals from a quasibinomial
GLM expected to be normally distributed ... ?
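
For completeness, a sketch of how shapiro.test() is called on the
residuals of a fitted model. The birthwt model from earlier in this
thread is used purely for illustration; the original poster's data
were never shown. Because the response there is 0/1, the deviance
residuals fall into two clusters, so the test can be expected to
reject normality regardless of whether the model is adequate.

```r
library(MASS)

fm1 <- glm(low ~ age + race, family = binomial, data = birthwt)

# Deviance residuals: positive where low == 1, negative where low == 0
r <- residuals(fm1, type = "deviance")

sw <- shapiro.test(r)  # stats::shapiro.test, accepts 3 to 5000 values
sw$statistic
sw$p.value
```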







-- 
View this message in context: 
http://old.nabble.com/overdispersion-and-quasibinomial-model-tp26502728p26512593.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] overdispersion and quasibinomial model

2009-11-24 Thread David Winsemius


On Nov 24, 2009, at 3:41 PM, djpren wrote:



I am looking for the correct commands to do the following things:

1. I have a binomial logistic regression model and i want to test for
overdispersion.


Under the teach a man to fish precept,   ... try:

RSiteSearch("test over dispersion binomial models")

2. If I do indeed have overdispersion i need to then run a
quasi-binomial model, but I'm not sure of the command.


?glm
# and follow the appropriate links

3. I can get the residuals of the model, but i need to then apply a
shapiro wilk test to test them. Does anyone know the command for this?



RSiteSearch("shapiro-wilks")   # not that people here recommend this procedure


The overall flavor of these questions is "homework", so I'm  
speculating that you may want to consult your instructors.


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] overdispersion and quasibinomial model

2009-11-24 Thread djpren

I am looking for the correct commands to do the following things:

1. I have a binomial logistic regression model and i want to test for
overdispersion.
2. If I do indeed have overdispersion i need to then run a quasi-binomial
model, but I'm not sure of the command.
3. I can get the residuals of the model, but i need to then apply a shapiro
wilk test to test them. Does anyone know the command for this?

Any help would be hugely appreciated,

Thanks,

Djp


-- 
View this message in context: 
http://old.nabble.com/overdispersion-and-quasibinomial-model-tp26502728p26502728.html
Sent from the R help mailing list archive at Nabble.com.
