On 22 Jul 2004, at 06:09, [EMAIL PROTECTED] wrote:
Message: 5
Date: Wed, 21 Jul 2004 13:48:53 +0200
From: [EMAIL PROTECTED] (Bjørn-Helge Mevik)
Subject: Re: [R] Precision in R
To: [EMAIL PROTECTED]
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=iso-8859-1
Since you didn't say anything about _what_ you did, either in SAS or R, my first thought was: Have you checked that you use the same parametrization of the models in R and SAS?
Well, I'm running Poisson regressions for the incidence of childhood acute lymphoblastic leukemia in a set of US counties (and in this data set, for some reason, Hawaii counts as an entire county). Separate models are calculated for males and females. The independent variables of interest are race ("white", "black", "other") and, in the model for males only, -log(proportion of people in county who moved between 1985 and 1990) (AKA "minus log proportion moved" or "MLPM").
SAS code:
title "Males";
proc genmod data=males order=formatted;
class race sex;
model observed = race mlpm*mlpm*mlpm mlpm*mlpm mlpm / dist=poisson link=log offset=lPYAR covb;
run;
title "Females";
proc genmod data=females order=formatted;
class race sex;
model observed = race / dist=poisson link=log offset=lPYAR;
run;
R code:
Female.model <- glm(Observed ~ Black + Other, family = poisson(link=log), offset=log(PYAR), data=Females)
Male.model <- glm(Observed ~ Black + Other + I(Minus.log.proportion.moved^3) +
    I(Minus.log.proportion.moved^2) + Minus.log.proportion.moved,
    family = poisson(link=log), offset=log(PYAR), data=Males)
The difference in how race is included in the models is due to me wanting both programs to use "whites" as the referent group (seeing as I have more data from them than "blacks" and "others").
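For illustration only, here is a minimal sketch on made-up data of why the indicator coding above makes whites the referent group. The `toy` data frame, its `Race` column, and the simulated counts are all hypothetical and are not the data from this post. It fits the same Poisson model twice, once with `Black` and `Other` indicators and once with a factor releveled so that "W" is the reference level, and checks that the coefficients agree.

```r
# Hypothetical toy data; only the variable names mirror the post.
set.seed(1)
toy <- data.frame(
  Race = factor(sample(c("B", "O", "W"), 200, replace = TRUE)),
  PYAR = runif(200, 1e4, 1e5)
)
toy$Observed <- rpois(200, lambda = toy$PYAR * 5e-5)

# Indicator coding as in the post: whites are the implicit referent.
toy$Black <- toy$Race == "B"
toy$Other <- toy$Race == "O"
fit.ind <- glm(Observed ~ Black + Other, family = poisson(link = log),
               offset = log(PYAR), data = toy)

# Same model with a factor whose reference level is set to "W".
toy$Race <- relevel(toy$Race, ref = "W")
fit.fac <- glm(Observed ~ Race, family = poisson(link = log),
               offset = log(PYAR), data = toy)

# The two parametrizations should agree up to coefficient names.
same.coef <- all.equal(unname(coef(fit.ind)), unname(coef(fit.fac)))
print(same.coef)
```

With this coding the intercept is the log rate for whites, and the other two coefficients are log rate ratios relative to whites, which is what SAS reports when "W" sorts last under order=formatted.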
SAS results:
Males 12:08 Wednesday, April 21, 2004 173
The GENMOD Procedure
Model Information
Data Set              WORK.MALES
Distribution          Poisson
Link Function         Log
Dependent Variable    Observed
Offset Variable       lPYAR
Observations Used     526
Class Level Information
Class    Levels    Values
Race     3         B O W
Sex      1         M
Parameter Information
Parameter    Effect            Race
Prm1         Intercept
Prm2         Race              B
Prm3         Race              O
Prm4         Race              W
Prm5         mlPM*mlPM*mlPM
Prm6         mlPM*mlPM
Prm7         mlPM
Criteria For Assessing Goodness Of Fit
Criterion DF Value Value/DF
Deviance 520 239.5025 0.4606
Scaled Deviance 520 239.5025 0.4606
Pearson Chi-Square 520 360.5677 0.6934
Scaled Pearson X2 520 360.5677 0.6934
Log Likelihood 320.5910
Algorithm converged.
Estimated Covariance Matrix
             Prm1       Prm2       Prm3       Prm5       Prm6       Prm7
Prm1      9.25071   -0.01841    0.04877  -13.71192   37.88798  -33.20414
Prm2     -0.01841    0.03392   0.002521    0.03045   -0.07720    0.06191
Prm3      0.04877   0.002521    0.02027   -0.07622    0.21457   -0.18748
Prm5    -13.71192    0.03045   -0.07622   22.11044  -59.26190   50.49281
Prm6     37.88798   -0.07720    0.21457  -59.26190     160.70    -138.32
Prm7    -33.20414    0.06191   -0.18748   50.49281    -138.32     120.18
Analysis Of Parameter Estimates
                              Standard Wald 95% Confidence     Chi-
Parameter       DF  Estimate     Error       Limits          Square  Pr > ChiSq
Intercept        1  -15.8294    3.0415  -21.7907   -9.8682    27.09      <.0001
Race B           1   -0.6646    0.1842   -1.0256   -0.3036    13.02      0.0003
Race O           1   -0.1058    0.1424   -0.3848    0.1733     0.55      0.4575
Race W           0    0.0000    0.0000    0.0000    0.0000        .           .
mlPM*mlPM*mlPM   1   15.4205    4.7022    6.2044   24.6366    10.75      0.0010
mlPM*mlPM        1  -36.8423   12.6768  -61.6884  -11.9961     8.45      0.0037
mlPM             1   27.2989   10.9627    5.8124   48.7855     6.20      0.0128
Scale            0    1.0000    0.0000    1.0000    1.0000
NOTE: The scale parameter was held fixed.
Females 12:08 Wednesday, April 21, 2004 175
The GENMOD Procedure
Model Information
Data Set              WORK.FEMALES
Distribution          Poisson
Link Function         Log
Dependent Variable    Observed
Offset Variable       lPYAR
Observations Used     534
Class Level Information
Class    Levels    Values
Race     3         B O W
Sex      1         F
Criteria For Assessing Goodness Of Fit
Criterion DF Value Value/DF
Deviance 531 245.2305 0.4618
Scaled Deviance 531 245.2305 0.4618
Pearson Chi-Square 531 484.8219 0.9130
Scaled Pearson X2 531 484.8219 0.9130
Log Likelihood 183.8640
Algorithm converged.
Analysis Of Parameter Estimates
                              Standard Wald 95% Confidence     Chi-
Parameter       DF  Estimate     Error       Limits          Square  Pr > ChiSq
Intercept        1   -9.7630    0.0577   -9.8762   -9.6499  28595.0      <.0001
Race B           1   -1.0917    0.2493   -1.5803   -0.6030    19.17      <.0001
Race O           1    0.0014    0.1569   -0.3061    0.3088     0.00      0.9931
Race W           0    0.0000    0.0000    0.0000    0.0000        .           .
Scale            0    1.0000    0.0000    1.0000    1.0000
NOTE: The scale parameter was held fixed.
R results:
> summary(Female.model)
Call:
glm(formula = Observed ~ Black + Other, family = poisson(link = log),
    data = Females, offset = log(PYAR))
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.4060  -0.5315  -0.1109  -0.0284   2.6520
Coefficients:
             Estimate Std. Error  z value Pr(>|z|)
(Intercept) -9.763025   0.057735 -169.101  < 2e-16 ***
BlackTRUE   -1.091679   0.249309   -4.379 1.19e-05 ***
OtherTRUE    0.001363   0.156876    0.009    0.993
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
(Dispersion parameter for poisson family taken to be 1)
    Null deviance: 272.49 on 533 degrees of freedom
Residual deviance: 245.23 on 531 degrees of freedom
AIC: 520.71
Number of Fisher Scoring iterations: 7
> summary(Male.model)
Call:
glm(formula = Observed ~ Black + Other + I(Minus.log.proportion.moved^3) +
I(Minus.log.proportion.moved^2) + Minus.log.proportion.moved,
family = poisson(link = log), data = Males, offset = log(PYAR))
Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.24568  -0.49137  -0.10197  -0.03262   3.88346
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -16.39065 3.31644 -4.942 7.72e-07 ***
BlackTRUE -0.66461 0.18418 -3.608 0.000308 ***
OtherTRUE -0.09513 0.14278 -0.666 0.505245
I(Minus.log.proportion.moved^3) 24.39920 7.51188 3.248 0.001162 **
I(Minus.log.proportion.moved^2) -51.17011 17.75857 -2.881 0.003959 **
Minus.log.proportion.moved 33.48773 13.52491 2.476 0.013286 *
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
(Dispersion parameter for poisson family taken to be 1)
    Null deviance: 278.68 on 525 degrees of freedom
Residual deviance: 240.54 on 520 degrees of freedom
AIC: 582.68
Number of Fisher Scoring iterations: 6
Now, you'll notice (after scrolling up and down a lot) that the models for females have identical results, but the models for males have different results. Anybody have any ideas why I'm getting a difference and which program (if either) is giving me the right answer? Thanks in advance again.
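One possible check, sketched on made-up data: the male fits report different deviances (239.5025 in SAS, 240.54 in R), and the deviance at the maximum-likelihood fit should not depend on how the same model is parametrized. The sketch below fits a cubic once with raw powers and once with orthogonal polynomials from poly(), using a tighter convergence tolerance, and compares the deviances. The `toy` data frame, the `mlpm` column, and the simulated rates are hypothetical, and whether this explains the discrepancy above is an open question.

```r
# Hypothetical toy data; the real Males data set is not reproduced here.
set.seed(2)
n <- 500
toy <- data.frame(mlpm = runif(n, 0.5, 1.2),
                  PYAR = runif(n, 1e4, 1e5))
toy$Observed <- rpois(n, lambda = toy$PYAR * 5e-5 * exp(0.5 * toy$mlpm))

# Tighter convergence criterion than the glm() default (an assumption
# about what might matter, not a known fix).
ctrl <- glm.control(epsilon = 1e-10, maxit = 100)

# Raw powers, as in the post: these columns are strongly correlated.
fit.raw <- glm(Observed ~ I(mlpm^3) + I(mlpm^2) + mlpm,
               family = poisson(link = log), offset = log(PYAR),
               data = toy, control = ctrl)

# Orthogonal polynomials span the same model space with better conditioning.
fit.orth <- glm(Observed ~ poly(mlpm, 3),
                family = poisson(link = log), offset = log(PYAR),
                data = toy, control = ctrl)

# If both fits converged, the deviances should agree closely.
dev <- c(raw = deviance(fit.raw), orthogonal = deviance(fit.orth))
print(dev)
```

If the two deviances matched on the real data but still differed from the SAS value, that would point toward a difference in the input data (for example in how MLPM was computed or rounded) rather than in the fitting algorithm.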
Aaron
-------------
Aaron Solomon (ben Saul Joseph) Adelman
E-mail: [EMAIL PROTECTED]
Web site: http://people.musc.edu/~adelmaas/
AOL Instant Messenger & Yahoo! Messenger: Hiergargo
AIM chat-room (preferred): Adelmania
