On Wed, 10 Nov 2004, Jean Eid wrote:
Dear All,
I have been struggling to understand why for the housing data in MASS
library R and stata give coef. estimates that are really different. I also
tried to come up with many many examples myself (see below, of course I
did not have the set.seed command included) and all of my
`random' examples seem to give verry similar output. For the housing data,
I have changed the data into numeric vectors instead of factors/ordered
factors. I did so to try and get the same results as in STATA and to have
the housing example as close as possible to the one I constructed.
I run a debian sid, kernel 2.4, R 2.0.0, and STATA version 8.2, MASS
version 7.2-8.
here's the example ( I assume that you have STATA installed and can run in
batch mode, if not the output is also given below)
That example shows the same results with Stata and polr() from MASS.
For the housing data, I also get the same coefficients in Stata as with
polr():
In R:
library(MASS)
library(foreign)
write.dta(housing, file="housing.dta")
house.probit<-polr(Sat ~ Infl + Type + Cont, data = housing, weights =
Freq, method = "probit")
summary(house.probit)
-------------------------
Re-fitting to get Hessian
Call:
polr(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq,
method = "probit")
Coefficients:
Value Std. Error t value
InflMedium 0.3464233 0.06413706 5.401297
InflHigh 0.7829149 0.07642620 10.244063
TypeApartment -0.3475372 0.07229093 -4.807480
TypeAtrium -0.2178874 0.09476607 -2.299213
TypeTerrace -0.6641737 0.09180004 -7.235005
ContHigh 0.2223862 0.05812267 3.826153
Intercepts:
Value Std. Error t value
Low|Medium -0.2998 0.0762 -3.9371
Medium|High 0.4267 0.0764 5.5850
Residual Deviance: 3479.689
AIC: 3495.689
------------------------
In Stata
-----------------
. use housing.dta
. xi: oprobit Sat i.Infl i.Type i.Cont [fw=Freq]
i.Infl _IInfl_1-3 (naturally coded; _IInfl_1 omitted)
i.Type _IType_1-4 (naturally coded; _IType_1 omitted)
i.Cont _ICont_1-2 (naturally coded; _ICont_1 omitted)
Iteration 0: log likelihood = -1824.4388
Iteration 1: log likelihood = -1739.9254
Iteration 2: log likelihood = -1739.8444
Ordered probit estimates Number of obs = 1681
LR chi2(6) = 169.19
Prob > chi2 = 0.0000
Log likelihood = -1739.8444 Pseudo R2 = 0.0464
------------------------------------------------------------------------------
Sat | Coef. Std. Err. z P>|z| [95% Conf.
Interval]
-------------+----------------------------------------------------------------
_IInfl_2 | .3464228 .064137 5.40 0.000 .2207165 .472129
_IInfl_3 | .7829146 .076426 10.24 0.000 .6331224 .9327069
_IType_2 | -.3475367 .0722908 -4.81 0.000 -.4892241 -.2058493
_IType_3 | -.2178875 .094766 -2.30 0.021 -.4036254 -.0321497
_IType_4 | -.6641735 .0917999 -7.24 0.000 -.844098 -.484249
_ICont_2 | .2223858 .0581226 3.83 0.000 .1084676 .336304
-------------+----------------------------------------------------------------
_cut1 | -.2998279 .0761537 (Ancillary parameters)
_cut2 | .4267208 .0764043
------------------------------------------------------------------------------
-thomas