Re: [R] Consistency of Logistic Regression

Uwe Ligges Sat, 13 Nov 2010 07:49:55 -0800


On 12.11.2010 20:11, Marc Schwartz wrote:

You are not creating your data set properly.

Your 'mat' is:

mat

    column1 column2
1        1       0
2        1       0
3        0       1
4        0       0
5        1       1
6        1       0
7        1       0
8        0       1
9        0       0
10       1       1
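
A note on why 'mat' comes out this way, offered as a sketch rather than a definitive diagnosis: in the call quoted further down, nrow, ncol and dimnames are named, so the two unnamed vectors are matched in order to the remaining formals of matrix(), which are data and byrow. Only the first vector is used as data, and the printout above is consistent with it being recycled and filled by row. The names y, x and mat_ok below are illustrative and do not appear in the original post.

```r
# Argument order of matrix(): the second unnamed vector lands in 'byrow'.
names(formals(matrix))

y <- c(1, 0, 1, 0, 0, 1, 0, 0, 1, 1)
x <- c(5, 4, 1, 6, 3, 6, 5, 3, 7, 9)

# One way to get a 10 x 2 matrix holding both vectors as columns.
mat_ok <- cbind(column1 = y, column2 = x)

# Index the columns by name instead of by linear position.
g <- glm(mat_ok[, "column1"] ~ mat_ok[, "column2"],
         family = binomial(link = logit))
g$converged
```

Indexing by column name avoids relying on how the matrix happens to be laid out in memory.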


What you really want is:

DF <- data.frame(y = c(1,0,1,0,0,1,0,0,1,1), x = c(5,4,1,6,3,6,5,3,7,9))

Actually, it is in general safer to have a factor y rather than a numeric y for classification tasks.


Best,
Uwe
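
To illustrate the point about a factor response, a minimal sketch: for family = binomial, glm() treats the first level of a factor response as failure and all other levels as success, so with levels 0 and 1 (in that order) the fit should match the numeric 0/1 coding. DF_f, MOD_f and MOD_n are names made up for this example.

```r
y <- c(1, 0, 1, 0, 0, 1, 0, 0, 1, 1)
x <- c(5, 4, 1, 6, 3, 6, 5, 3, 7, 9)

# Factor response: the first level (0) is the failure, level 1 the success.
DF_f <- data.frame(y = factor(y, levels = c(0, 1)), x = x)
MOD_f <- glm(y ~ x, data = DF_f, family = binomial)

# Numeric 0/1 response for comparison.
MOD_n <- glm(y ~ x, data = data.frame(y = y, x = x), family = binomial)

# TRUE if both codings give the same coefficients.
all.equal(coef(MOD_f), coef(MOD_n))
```

Setting the levels explicitly makes it visible which outcome is being modelled, which is the same choice the "descending" option controls in the SAS code quoted in this thread.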

DF

    y x
1  1 5
2  0 4
3  1 1
4  0 6
5  0 3
6  1 6
7  0 5
8  0 3
9  1 7
10 1 9



MOD <- glm(y ~ x, data = DF, family = binomial)

summary(MOD)


Call:
glm(formula = y ~ x, family = binomial, data = DF)

Deviance Residuals:
     Min       1Q   Median       3Q      Max
-1.3353  -1.0229  -0.1239   0.9956   1.7477

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  -1.6118     1.7833  -0.904    0.366
x             0.3293     0.3383   0.973    0.330

(Dispersion parameter for binomial family taken to be 1)

     Null deviance: 13.863  on 9  degrees of freedom
Residual deviance: 12.767  on 8  degrees of freedom
AIC: 16.767

Number of Fisher Scoring iterations: 4
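
For a side-by-side check against SAS, the coefficient table can be pulled out of the summary object. A small sketch, with DF and MOD rebuilt here so the snippet runs on its own; 'sas' is a hypothetical name holding the SAS figures quoted later in this thread.

```r
DF <- data.frame(y = c(1, 0, 1, 0, 0, 1, 0, 0, 1, 1),
                 x = c(5, 4, 1, 6, 3, 6, 5, 3, 7, 9))
MOD <- glm(y ~ x, data = DF, family = binomial)

# Estimates and standard errors as a 2 x 2 numeric matrix.
est <- coef(summary(MOD))[, c("Estimate", "Std. Error")]

# SAS PROC LOGISTIC figures as posted in the thread (4 decimals).
sas <- matrix(c(-1.6118, 0.3293, 1.7833, 0.3383), ncol = 2,
              dimnames = list(c("Intercept", "col2"), c("Estimate", "SE")))

# Largest absolute difference between the two sets of numbers.
max(abs(est - sas))
```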


HTH,

Marc Schwartz


On Nov 12, 2010, at 12:56 PM, Benjamin Godlove wrote:

I think it is likely I am missing something.  Here is a very simple example:

R code:

mat<- matrix(nrow = 10, ncol = 2, c(1,0,1,0,0,1,0,0,1,1),
c(5,4,1,6,3,6,5,3,7,9), dimnames = list(c(1,2,3,4,5,6,7,8,9,10),
c("column1","column2")))

g<- glm(mat[1:10] ~ mat[11:20], family = binomial (link = logit))

g$converged


SAS code:

data mat;
input col1 col2;
datalines;
1 5
0 4
1 1
0 6
0 3
1 6
0 5
0 3
1 7
1 9
;

proc logistic data=mat descending;
model col1 = col2 / link=logit;
run;

SAS output (in case you don't have access to SAS):
Convergence criterion satisfied

                  Estimate       SE
Intercept    -1.6118          1.7833
col2            0.3293          0.3383


Of course, with an example this small, it is not so surprising that the two
methods differ; and they hardly differ by a single SE.  But as the datasets
get larger, the difference is more pronounced.  Let me know if you would
like me to send you a large dataset.  I get the feeling I am doing something
wrong in R, so please let me know what you think.

Thank you!

Ben Godlove

On Thu, Nov 11, 2010 at 1:59 PM, Albyn Jones<jo...@reed.edu>  wrote:

do you have factors (categorical variables) in the model?  it could be
just a parameterization difference.

albyn

On Thu, Nov 11, 2010 at 12:41:03PM -0500, Benjamin Godlove wrote:

Dear R developers,

I have noticed a discrepancy between the coefficients returned by R's glm()
for logistic regression and SAS's PROC LOGISTIC.  I am using dist = binomial
and link = logit for both R and SAS.  I believe R uses IRLS whereas SAS uses
Fisher's scoring, but the difference is something like 100 SE on the
intercept.  What accounts for such a huge difference?
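
On the IRLS versus Fisher scoring point: for the logit link, which is the canonical link for the binomial family, iteratively reweighted least squares is the usual way of carrying out Fisher scoring (the summary() output elsewhere in this thread reports "Number of Fisher Scoring iterations"), so the two labels should not by themselves lead to different estimates. A hand-rolled sketch on the ten observations posted in this thread follows; it is not glm()'s internal code, and the variable names are illustrative.

```r
y <- c(1, 0, 1, 0, 0, 1, 0, 0, 1, 1)
x <- c(5, 4, 1, 6, 3, 6, 5, 3, 7, 9)
X <- cbind(1, x)                  # design matrix with an intercept column

beta <- c(0, 0)                   # starting values
for (i in 1:25) {
  eta <- drop(X %*% beta)         # linear predictor
  mu  <- plogis(eta)              # fitted probabilities
  w   <- mu * (1 - mu)            # IRLS weights (binomial variance)
  z   <- eta + (y - mu) / w       # working response
  # Weighted least squares step: solve (X'WX) beta = X'Wz.
  beta_new <- drop(solve(crossprod(X, w * X), crossprod(X, w * z)))
  done <- max(abs(beta_new - beta)) < 1e-10
  beta <- beta_new
  if (done) break
}

# Compare the hand-rolled estimates with glm().
rbind(irls = unname(beta), glm = unname(coef(glm(y ~ x, family = binomial))))
```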

Thank you for your time.

Ben Godlove


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Albyn Jones
Reed College
jo...@reed.edu


