Re: [R] logistic regression for a data set with perfect separation

David Firth Wed, 10 Sep 2003 11:42:01 -0700

On Wednesday, Sep 10, 2003, at 18:50 Europe/London, Christoph Lehmann wrote:

Dear R experts

I have the following data
          V1 V2
1 -5.8000000  0
2 -4.8000000  0
3 -2.8666667  0
4 -0.8666667  0
5 -0.7333333  0
6 -1.6666667  0
7 -0.1333333  1
8  1.2000000  1
9  1.3333333  1

and I want to know whether V1 can predict V2. Of course it can, since
there is perfect separation between cases 1..6 and 7..9.

How can I test whether this conclusion (being able to assign an
observation i to class j knowing only its value on variable V1) also
holds for the population our data were drawn from?

For this you really need more data. The only way you'll ever be able to reject that hypothesis is by finding an instance of 010 or 101 in the sample ordered by V1. If you find such a pattern, you can reject with certainty.
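The 010/101 check can be sketched in a few lines of Python. This is only an illustration: the function name `label_changes` is made up here, and it assumes no two cases share the same V1 value with different labels.

```python
# Data from the question: V1 is the predictor, V2 the 0/1 outcome.
v1 = [-5.8, -4.8, -2.8666667, -0.8666667, -0.7333333,
      -1.6666667, -0.1333333, 1.2, 1.3333333]
v2 = [0, 0, 0, 0, 0, 0, 1, 1, 1]


def label_changes(x, y):
    """Count how often the label flips when cases are ordered by x."""
    ordered = [label for _, label in sorted(zip(x, y))]
    return sum(a != b for a, b in zip(ordered, ordered[1:]))


changes = label_changes(v1, v2)
# At most one flip: consistent with a threshold rule on V1.
# Two or more flips: the ordered sample contains 010 or 101.
print("label changes:", changes)
print("threshold hypothesis rejected:", changes >= 2)
```

With more data, the same check applies unchanged: a single 010 or 101 run in the ordered labels is enough to reject.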


That is, which inference procedure is recommended? Logistic regression,
for obvious reasons, makes no sense.

Not so obvious to me! Logistic regression still makes sense, but care is needed in the method of estimation/inference. The maximum likelihood solution in the above case is a model which says V2 is 1 with certainty at some values of V1, and is zero with certainty at other values; and that seems an unwarranted inference with so little data. That's a criticism of maximum likelihood, rather than a criticism of logistic regression. (Think about the more extreme situation of tossing a coin once: if a head is observed, the ML solution is that the coin lands heads with certainty, i.e. that there is no chance of tails.)
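To see why maximum likelihood behaves this way here, the following Python sketch evaluates the logistic log-likelihood for ever steeper slopes. The cut point (midpoint of the gap between the two classes) and the slope values are assumptions chosen for illustration; nothing here is a fitted model.

```python
import math

v1 = [-5.8, -4.8, -2.8666667, -0.8666667, -0.7333333,
      -1.6666667, -0.1333333, 1.2, 1.3333333]
v2 = [0, 0, 0, 0, 0, 0, 1, 1, 1]

# Assumed cut point: midpoint of the gap between the two classes.
cut = (max(x for x, y in zip(v1, v2) if y == 0)
       + min(x for x, y in zip(v1, v2) if y == 1)) / 2


def log_lik(slope):
    """Log-likelihood of the model logit P(V2 = 1) = slope * (V1 - cut)."""
    total = 0.0
    for x, y in zip(v1, v2):
        z = slope * (x - cut)
        s = z if y == 1 else -z  # positive when the case is on its own side
        total += -math.log1p(math.exp(-s))  # log of the fitted prob. of y
    return total


slopes = [1, 10, 100]
log_liks = [log_lik(b) for b in slopes]
for b, ll in zip(slopes, log_liks):
    print(b, ll)
```

The log-likelihood keeps rising towards 0 as the slope grows, so no finite slope maximises it: the ML "fit" is the step function described above.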

There are alternative (Bayesian and pseudo-Bayesian) methods of inference which can yield more sensible answers in general. [One such is implemented in package brlr ("bias reduced logistic regression") on CRAN.] To "test" the hypothesis described above, though, with the data you have, would seem to require a fully Bayesian analysis whose conclusions would depend strongly on the prior probability attached to the hypothesis, i.e. you need more data...
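As a rough illustration of the pseudo-Bayesian idea, take the single coin toss again. The Python sketch below assumes a Jeffreys-prior penalty on the likelihood, which for this intercept-only case gives the estimate (heads + 1/2) / (tosses + 1). It is not the brlr interface, and the function names are made up for the example.

```python
import math


def ml_estimate(heads, tosses):
    """Maximum likelihood estimate of P(heads)."""
    return heads / tosses


def penalized_log_lik(p, heads, tosses):
    """Binomial log-likelihood plus a Jeffreys-prior penalty, 0.5*log(p*(1-p))."""
    return (heads + 0.5) * math.log(p) + (tosses - heads + 0.5) * math.log(1 - p)


def penalized_estimate(heads, tosses, steps=1000):
    """Maximise the penalized log-likelihood over a grid of p values in (0, 1)."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda p: penalized_log_lik(p, heads, tosses))


p_ml = ml_estimate(1, 1)          # one toss, one head
p_pen = penalized_estimate(1, 1)  # grid maximiser, matches (1 + 0.5) / (1 + 1)
print(p_ml, p_pen)
```

The penalized estimate stays strictly inside (0, 1), so one observed head no longer implies that tails is impossible.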

I hope that helps in some way!

Regards,
David

Many thanks for your help

Christoph
--
Christoph Lehmann <[EMAIL PROTECTED]>

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

