I know this is quite an old post, but I am dealing with a similar warning
message and, even after reading all the posts here, I am still unsure
what I should do with my analysis.
I have a dataset where the binary response variable is sex, and the
predictors are several variables (they
[analogs]=x), sp+se - 1))
[1] 0.9443561 0.9269231 0.8712792
So it appears that the dataset is quite well separated into two samples at the
cutpoint 0.209
Re: [R] OT: (quasi-?) separation in a logistic GLM
Grant Izmirlian
NCI
On 15 Dec 2008, at 18:03, Gavin Simpson wrote:
Dear List
sorry for reposting. Some code was missing in my previous email...
--
Dear Gavin
glm reported exactly what it noticed, giving a warning that some very
small fitted probabilities have been found.
However, your data are **not** quasi-separated. The
On Tue, 2008-12-16 at 13:31 +0100, vito muggeo wrote:
dear Gavin,
I do not know whether this comment may still be useful..
Very much so, Thank you.
Why are you unsure about quasi-separation?
I think that it is quite evident in the plot
Unsure in the sense that I had been unable to
dear Gavin,
I do not know whether this comment may still be useful..
Why are you unsure about quasi-separation?
I think that it is quite evident in the plot
plot(analogs ~ Dij, data = dat)
Also it may be useful to see the plot of the monotone (profile) deviance
(or the log-lik) for the coef
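The message is cut off here, so the following is only a rough sketch of what a profile deviance plot for the slope could look like, not the poster's own code. It uses simulated stand-in data (the values, the seed and the grid `slopes` are assumptions; only the names `analogs` and `Dij` come from the thread). The slope is held fixed through `offset()` while the intercept is re-estimated. Under (quasi-)separation the curve would keep decreasing towards the edge of the grid; an interior minimum suggests a finite estimate.

```r
# Simulated stand-in data; NOT the data discussed in the thread
set.seed(2)
Dij <- runif(300, 0, 1)
analogs <- rbinom(300, 1, plogis(3 - 12 * Dij))
dat <- data.frame(analogs = analogs, Dij = Dij)

# Grid of fixed slope values (hypothetical range chosen for illustration)
slopes <- seq(-30, 0, by = 0.5)

# For each fixed slope b, re-fit the intercept only and record the deviance
prof_dev <- sapply(slopes, function(b) {
  deviance(glm(analogs ~ 1 + offset(b * Dij), data = dat, family = binomial))
})

# A monotone curve would point to separation; a clear minimum does not
plot(slopes, prof_dev, type = "l",
     xlab = "fixed slope for Dij", ylab = "profile deviance")

best_slope <- slopes[which.min(prof_dev)]
print(best_slope)
```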
Dear Gavin,
glm reported exactly what it noticed, giving a warning that some very
small fitted probabilities have been found.
However, your data are **not** quasi-separated. The maximum likelihood
estimates are really those reported by glm.
A first elementary way is to change the tolerance
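The sentence above is truncated, so this is only one plausible reading of the tolerance suggestion: refit with a stricter convergence tolerance and more iterations, then compare the coefficients. The data below are simulated stand-ins (only the names `dat`, `analogs` and `Dij` follow the thread; everything else is assumed). If the estimates were really diverging, as they do under separation, the two fits would differ noticeably.

```r
# Simulated stand-in data; NOT the data discussed in the thread
set.seed(1)
n <- 500
Dij <- runif(n, 0, 1.5)
analogs <- rbinom(n, 1, plogis(4 - 15 * Dij)) == 1
dat <- data.frame(analogs = analogs, Dij = Dij)

# Fit with glm's default control settings (epsilon = 1e-8, maxit = 25)
fit_default <- glm(analogs ~ Dij, data = dat, family = binomial)

# Refit with a much stricter tolerance and a larger iteration limit
fit_strict <- glm(analogs ~ Dij, data = dat, family = binomial,
                  control = glm.control(epsilon = 1e-12, maxit = 200))

# Stable estimates across tolerances argue against separation
coef_diff <- max(abs(coef(fit_default) - coef(fit_strict)))
print(cbind(default = coef(fit_default), strict = coef(fit_strict)))

# Direct check for a single predictor: do the two groups overlap in Dij?
overlap <- max(dat$Dij[dat$analogs]) > min(dat$Dij[!dat$analogs])
print(overlap)
```

With a single predictor, overlapping ranges in the two groups rule out complete separation, which is why the second check is a useful companion to the tolerance comparison.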
Dear List,
Apologies for this off-topic post but it is R-related in the sense that
I am trying to understand what R is telling me with the data to hand.
ROC curves have recently been used to determine a dissimilarity
threshold for identifying whether two samples are from the same type
or not.
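As a rough illustration of how such a threshold might be picked, the sketch below scans candidate cutpoints and keeps the one maximising `se + sp - 1`, the same quantity that appears in the code fragment quoted earlier in the thread. The data are simulated and the rule "analog when `Dij <= cutpoint`" is an assumption for this example, not necessarily the original poster's procedure.

```r
# Simulated stand-in data; NOT the data discussed in the thread
set.seed(42)
Dij <- c(rnorm(100, 0.15, 0.05), rnorm(300, 0.45, 0.15))
analogs <- rep(c(TRUE, FALSE), c(100, 300))

# Candidate cutpoints: every observed dissimilarity
cuts <- sort(unique(Dij))

# A pair is classed as an analog when its dissimilarity is at or below the cutpoint
se <- sapply(cuts, function(x) mean(Dij[analogs] <= x))   # sensitivity
sp <- sapply(cuts, function(x) mean(Dij[!analogs] > x))   # specificity

# Pick the cutpoint that maximises sensitivity + specificity - 1
youden <- se + sp - 1
best <- which.max(youden)
print(c(cutpoint = cuts[best], sensitivity = se[best], specificity = sp[best]))
```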
If you look at the distribution of those with analogs==TRUE versus the
whole group, it is not surprising that the upper range of Dij values results
in a very low probability estimate:
plot(density(dat$Dij))
lines(density(dat[dat$analogs == TRUE, 2]))
Appears as though more than 25% of the