I have a linear model y~x1+x2 of some data where the coefficient for x1 is higher than I would have expected from theory (0.88 estimated vs. 0.7 expected). I wondered whether this could be an artifact of x1 and x2 being correlated, even though the variance inflation factor is not too high (1.065).

I used perturbation analysis to evaluate collinearity:

> library(perturb)
> P<-perturb(A,pvars=c("x1","x2"),prange=c(1,1))
> summary(P)

Perturb variables:
x1      normal(0,1)
x2      normal(0,1)

Impact of perturbations on coefficients:
               mean   s.d.     min     max
(Intercept) -26.067  0.270 -27.235 -25.481
x1            0.726  0.025   0.672   0.882
x2            0.060  0.011   0.037   0.082

I get a mean for x1 of 0.726, which is closer to what is expected.

I am not a statistical expert, so I'd like to know whether my evaluation of the effects of collinearity is correct and, if so, whether there are any solutions for obtaining a reliable linear model.

Thanks,

Manuel

Some more detailed information:

> A<-lm(y~x1+x2)
> summary(A)

Call:
lm(formula = y ~ x1 + x2)

Residuals:
      Min        1Q    Median        3Q       Max
-4.221946 -0.484055 -0.004762  0.397508  2.542769

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -27.23472    0.27996 -97.282  < 2e-16 ***
x1            0.88202    0.02475  35.639  < 2e-16 ***
x2            0.08180    0.01239   6.604 2.53e-10 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Residual standard error: 0.823 on 241 degrees of freedom
Multiple R-Squared: 0.8411,     Adjusted R-squared: 0.8398
F-statistic: 637.8 on 2 and 241 DF,  p-value: < 2.2e-16

> cor.test(x1,x2)

        Pearson's product-moment correlation

data:  x1 and x2
t = -3.9924, df = 242, p-value = 8.678e-05
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.3628424 -0.1269618
sample estimates:
      cor
-0.248584

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
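
For reference, the reported variance inflation factor can be cross-checked by hand. With exactly two predictors, the VIF of each one depends only on their pairwise correlation r. The short calculation below plugs in the cor.test estimate quoted above. It is only an arithmetic check under that two-predictor assumption and says nothing about other sources of bias in the x1 coefficient.

```latex
\mathrm{VIF} = \frac{1}{1 - r^{2}}
             = \frac{1}{1 - (-0.248584)^{2}}
             \approx \frac{1}{1 - 0.061794}
             \approx 1.066
```

This is in line with the value of 1.065 given in the post.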