Re: [R] Regression Error: Otherwise good variable causes singularity. Why?
@JLucke: As for the africa variable: I took it out of the model so that we can rule out both the variable itself and collinearity between africa and litrate as causes of the litrate problem. Doing so also removed the singularity remark at the top. However, the problem of litrate being treated as a factor with many levels remains. Just to clarify: the second results table is fictional and only illustrates where I was headed with my regression. Anyway, thanks for the quick answer.

@David: Thanks for the pointer. It was in fact a bad variable, but one I had created myself. I changed the data set halfway through my calculations and thought I had adjusted everything. It turns out that I forgot to adjust the set length, which is reset between the two steps of my Heckman procedure. In any case: thanks for the quick and helpful reply. :-)

--
View this message in context: http://r.789695.n4.nabble.com/Regression-Error-Otherwise-good-variable-causes-singularity-Why-tp2322780p2322925.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regression Error: Otherwise good variable causes singularity. Why?
On Aug 12, 2010, at 10:35 AM, asdir wrote:

> This command
>
> cdmoutcome <- glm(log(value) ~ factor(year)
>                   + log(gdppcpppconst) + log(gdppcpppconstAII)
>                   + log(co2eemisspc) + log(co2eemisspcAII)
>                   + log(dist)
>                   + fdiboth
>                   + odapartnertohost
>                   + corrupt
>                   + log(infraindex)
>                   + litrate
>                   + africa
>                   + imr
>                   , data = cdmdata2, subset = zero == 1,
>                   gaussian(link = "identity"))
>
> results in this table
>
> Coefficients: (1 not defined because of singularities)
>                         Estimate Std. Error t value Pr(>|t|)
> (Intercept)            1.216e+01  5.771e+01   0.211   0.8332
> factor(year)2006      -1.403e+00  5.777e-01  -2.429   0.0157 *
> factor(year)2007      -2.799e-01  7.901e-01  -0.354   0.7234
> log(gdppcpppconst)     2.762e-01  5.517e+00   0.050   0.9601
> log(gdppcpppconstAII) -1.344e-01  9.025e-01  -0.149   0.8817
> log(co2eemisspc)       5.655e+00  2.903e+00   1.948   0.0523 .
> log(co2eemisspcAII)   -1.411e-01  4.245e-01  -0.332   0.7399
> log(dist)             -2.938e-01  4.023e-01  -0.730   0.4658
> fdiboth                1.326e-04  1.133e-04   1.171   0.2425
> odapartnertohost       2.319e-03  1.437e-03   1.613   0.1078
> corrupt                1.875e+00  3.313e+00   0.566   0.5718
> log(infraindex)        4.783e+00  1.091e+01   0.438   0.6615

You have probably created litrate as a factor without realizing it. That can easily happen if you just use read.table and one of the values cannot be gracefully interpreted as a numeric. Either read the data in with stringsAsFactors=FALSE or as.is=TRUE and then coerce the column to numeric.

Or, if you want to fix an existing factor f%^&-up, the FAQ tells you to use something like:

    cdmdata2$f_ed_variable <- as.numeric(as.character(cdmdata2$f_ed_variable))

> litrate0.47           -2.485e+01  3.190e+01  -0.779   0.4365
> litrate0.499          -1.657e+01  2.591e+01  -0.639   0.5230
> litrate0.523          -2.440e+01  3.427e+01  -0.712   0.4769
> litrate0.528          -9.184e+00  1.379e+01  -0.666   0.5060
> litrate0.595          -2.309e+01  2.776e+01  -0.832   0.4062
> litrate0.66           -1.451e+01  2.734e+01  -0.531   0.5961
> litrate0.675          -1.707e+01  2.813e+01  -0.607   0.5444
> litrate0.68           -6.346e+00  1.063e+01  -0.597   0.5509
> litrate0.699           2.717e+00  3.541e+00   0.768   0.4434
> litrate0.706          -1.960e+01  2.933e+01  -0.668   0.5046
> litrate0.714          -2.586e+01  4.002e+01  -0.646   0.5186
> litrate0.736           5.641e+00  1.561e+01   0.361   0.7181
> litrate0.743          -2.692e+01  4.253e+01  -0.633   0.5273
> litrate0.762          -2.208e+01  3.100e+01  -0.712   0.4767
> litrate0.802          -2.325e+01  3.766e+01  -0.617   0.5375
> litrate0.847          -2.620e+01  3.948e+01  -0.664   0.5075
> litrate0.86           -3.576e+01  4.950e+01  -0.722   0.4707
> litrate0.864          -4.482e+01  6.274e+01  -0.714   0.4755
> litrate0.872          -1.946e+01  2.715e+01  -0.717   0.4739
> litrate0.877          -2.710e+01  3.702e+01  -0.732   0.4646
> litrate0.879          -3.460e+01  5.147e+01  -0.672   0.5020
> litrate0.886          -3.276e+01  4.860e+01  -0.674   0.5008
> litrate0.889          -4.120e+01  5.755e+01  -0.716   0.4746
> litrate0.904          -2.282e+01  2.985e+01  -0.764   0.4453
> litrate0.91           -3.478e+01  5.037e+01  -0.691   0.4904
> litrate0.923          -1.762e+01  2.551e+01  -0.691   0.4902
> litrate0.925          -2.445e+01  3.611e+01  -0.677   0.4990
> litrate0.926          -2.995e+01  4.565e+01  -0.656   0.5123
> litrate0.928          -2.839e+01  3.933e+01  -0.722   0.4710
> litrate0.937          -2.571e+01  3.795e+01  -0.677   0.4986
> litrate0.94           -2.109e+01  3.051e+01  -0.691   0.4900
> litrate0.959          -2.078e+01  2.895e+01  -0.718   0.4735
> litrate0.96           -3.403e+01  4.798e+01  -0.709   0.4787
> litrate0.962          -4.084e+01  5.755e+01  -0.710   0.4785
> litrate0.971          -3.743e+01  5.247e+01  -0.713   0.4761
> litrate0.98           -3.709e+01  5.170e+01  -0.717   0.4737
> litrate0.986          -2.663e+01  4.437e+01  -0.600   0.5488
> litrate0.991          -3.045e+01  4.166e+01  -0.731   0.4654
> litrate1              -2.732e+01  4.459e+01  -0.613   0.5405
> africa                        NA         NA      NA       NA
> imr                    2.160e+00  9.357e-01   2.309   0.0216 *
>
> although it should result in something similar to this:
>
> Coefficients: (1 not defined because of singularities)
>                         Estimate Std. Error t value Pr(>|t|)
> (Intercept)            1.216e+01  5.771e+01   0.211   0.8332
> factor(year)2006      -1.403e+00  5.777e-01  -2.429   0.0157 *
> factor(year)2007      -2.799e-01  7.901e-01  -0.354   0.7234
> log(gdppcpppconst)     2.762e-01  5.517e+00   0.050   0.9601
> log(gdppcpppconstAII) -1.344e-01  9.025e-01  -0.149   0.8817
> log(co2eemisspc)       5.655e+00  2.903e+00   1.948   0.0523 .
> log(co2eemisspcAII)   -1.411e-01  4.245e-01  -0.332   0.7399
> log(dist)             -2.938e-01  4.023e-01  -0.730   0.4658
> fdiboth
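
The factor problem described in this reply can be reproduced on a small scale. The following is only a sketch under stated assumptions: the inline CSV text, the "n/a" entry, and the object names `txt`, `bad`, `codes`, `fixed` and `raw` are all made up for illustration and are not taken from the poster's data. It shows why a single non-numeric entry turns a column into a factor, and why the conversion has to go through `as.character()`.

```r
# Hypothetical input: one literacy value ("n/a") cannot be read as a number.
txt <- "country,litrate
A,0.47
B,0.499
C,n/a
D,0.91"

# stringsAsFactors = TRUE mimics the default of R versions before 4.0.0.
bad <- read.csv(text = txt, stringsAsFactors = TRUE)
class(bad$litrate)   # "factor": a model would get one dummy per level

# as.numeric() on a factor returns the internal level codes,
# not the literacy rates themselves.
codes <- as.numeric(bad$litrate)

# Converting via as.character() recovers the values; the unparseable
# entry becomes NA (the coercion warning is suppressed here).
fixed <- suppressWarnings(as.numeric(as.character(bad$litrate)))

# Alternative: keep the column as character on input, then coerce.
raw <- read.csv(text = txt, as.is = TRUE)
class(raw$litrate)   # "character"
```

Inspecting which entries end up as `NA` after the coercion is usually the quickest way to find the offending values in the input file.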
Re: [R] Regression Error: Otherwise good variable causes singularity. Why?
There appears to be a problem in both regressions, since a singularity is also reported in the second regression analysis. It appears that the litrate variable is treated as a factor in the first analysis and as continuous in the second. There also appears to be collinearity between the litrate variable and the africa variable. Look at the lm.influence function for regression diagnostics.

asdir
Sent by: r-help-boun...@r-project.org
08/12/2010 10:35 AM

To: r-help@r-project.org
cc:
Subject: [R] Regression Error: Otherwise good variable causes singularity. Why?

This command

    cdmoutcome <- glm(log(value) ~ factor(year)
                      + log(gdppcpppconst) + log(gdppcpppconstAII)
                      + log(co2eemisspc) + log(co2eemisspcAII)
                      + log(dist)
                      + fdiboth
                      + odapartnertohost
                      + corrupt
                      + log(infraindex)
                      + litrate
                      + africa
                      + imr
                      , data = cdmdata2, subset = zero == 1,
                      gaussian(link = "identity"))

results in this table

    Coefficients: (1 not defined because of singularities)
                            Estimate Std. Error t value Pr(>|t|)
    (Intercept)            1.216e+01  5.771e+01   0.211   0.8332
    factor(year)2006      -1.403e+00  5.777e-01  -2.429   0.0157 *
    factor(year)2007      -2.799e-01  7.901e-01  -0.354   0.7234
    log(gdppcpppconst)     2.762e-01  5.517e+00   0.050   0.9601
    log(gdppcpppconstAII) -1.344e-01  9.025e-01  -0.149   0.8817
    log(co2eemisspc)       5.655e+00  2.903e+00   1.948   0.0523 .
    log(co2eemisspcAII)   -1.411e-01  4.245e-01  -0.332   0.7399
    log(dist)             -2.938e-01  4.023e-01  -0.730   0.4658
    fdiboth                1.326e-04  1.133e-04   1.171   0.2425
    odapartnertohost       2.319e-03  1.437e-03   1.613   0.1078
    corrupt                1.875e+00  3.313e+00   0.566   0.5718
    log(infraindex)        4.783e+00  1.091e+01   0.438   0.6615
    litrate0.47           -2.485e+01  3.190e+01  -0.779   0.4365
    litrate0.499          -1.657e+01  2.591e+01  -0.639   0.5230
    litrate0.523          -2.440e+01  3.427e+01  -0.712   0.4769
    litrate0.528          -9.184e+00  1.379e+01  -0.666   0.5060
    litrate0.595          -2.309e+01  2.776e+01  -0.832   0.4062
    litrate0.66           -1.451e+01  2.734e+01  -0.531   0.5961
    litrate0.675          -1.707e+01  2.813e+01  -0.607   0.5444
    litrate0.68           -6.346e+00  1.063e+01  -0.597   0.5509
    litrate0.699           2.717e+00  3.541e+00   0.768   0.4434
    litrate0.706          -1.960e+01  2.933e+01  -0.668   0.5046
    litrate0.714          -2.586e+01  4.002e+01  -0.646   0.5186
    litrate0.736           5.641e+00  1.561e+01   0.361   0.7181
    litrate0.743          -2.692e+01  4.253e+01  -0.633   0.5273
    litrate0.762          -2.208e+01  3.100e+01  -0.712   0.4767
    litrate0.802          -2.325e+01  3.766e+01  -0.617   0.5375
    litrate0.847          -2.620e+01  3.948e+01  -0.664   0.5075
    litrate0.86           -3.576e+01  4.950e+01  -0.722   0.4707
    litrate0.864          -4.482e+01  6.274e+01  -0.714   0.4755
    litrate0.872          -1.946e+01  2.715e+01  -0.717   0.4739
    litrate0.877          -2.710e+01  3.702e+01  -0.732   0.4646
    litrate0.879          -3.460e+01  5.147e+01  -0.672   0.5020
    litrate0.886          -3.276e+01  4.860e+01  -0.674   0.5008
    litrate0.889          -4.120e+01  5.755e+01  -0.716   0.4746
    litrate0.904          -2.282e+01  2.985e+01  -0.764   0.4453
    litrate0.91           -3.478e+01  5.037e+01  -0.691   0.4904
    litrate0.923          -1.762e+01  2.551e+01  -0.691   0.4902
    litrate0.925          -2.445e+01  3.611e+01  -0.677   0.4990
    litrate0.926          -2.995e+01  4.565e+01  -0.656   0.5123
    litrate0.928          -2.839e+01  3.933e+01  -0.722   0.4710
    litrate0.937          -2.571e+01  3.795e+01  -0.677   0.4986
    litrate0.94           -2.109e+01  3.051e+01  -0.691   0.4900
    litrate0.959          -2.078e+01  2.895e+01  -0.718   0.4735
    litrate0.96           -3.403e+01  4.798e+01  -0.709   0.4787
    litrate0.962          -4.084e+01  5.755e+01  -0.710   0.4785
    litrate0.971          -3.743e+01  5.247e+01  -0.713   0.4761
    litrate0.98           -3.709e+01  5.170e+01  -0.717   0.4737
    litrate0.986          -2.663e+01  4.437e+01  -0.600   0.5488
    litrate0.991          -3.045e+01  4.166e+01  -0.731   0.4654
    litrate1              -2.732e+01  4.459e+01  -0.613   0.5405
    africa                        NA         NA      NA       NA
    imr                    2.160e+00  9.357e-01   2.309   0.0216 *

although it should result in something similar to this:

    Coefficients: (1 not defined because of singularities)
                            Estimate Std. Error t value Pr(>|t|)
    (I
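
One plausible mechanism for the collinearity mentioned in this reply, and for the `africa NA NA NA NA` row, is that a literacy rate stored as a factor yields one dummy per distinct value. If each value belongs to a single country, a country-level indicator such as africa is then an exact linear combination of those dummies. The sketch below uses invented data (six hypothetical countries, random response) purely to illustrate that effect; it is an assumption about the cause, not a reconstruction of the poster's data set.

```r
set.seed(1)

# Hypothetical panel: 6 countries, 3 observations each.
# litrate and africa are both constant within a country.
d <- data.frame(
  litrate = rep(c(0.47, 0.528, 0.66, 0.86, 0.91, 0.96), each = 3),
  africa  = rep(c(1, 1, 1, 0, 0, 0), each = 3)
)
d$y <- rnorm(nrow(d))

# litrate as a factor: the dummies identify each country, so africa
# adds no new information and its coefficient cannot be estimated.
fit_factor <- lm(y ~ factor(litrate) + africa, data = d)
coef(fit_factor)["africa"]   # NA
alias(fit_factor)            # shows africa as a combination of other terms

# litrate as a numeric covariate: a single slope, and africa is estimable.
fit_numeric <- lm(y ~ litrate + africa, data = d)
coef(fit_numeric)
```

`alias()` is part of the stats package and also accepts `glm` fits, so the same check could be run on a model like `cdmoutcome`.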
[R] Regression Error: Otherwise good variable causes singularity. Why?
This command

    cdmoutcome <- glm(log(value) ~ factor(year)
                      + log(gdppcpppconst) + log(gdppcpppconstAII)
                      + log(co2eemisspc) + log(co2eemisspcAII)
                      + log(dist)
                      + fdiboth
                      + odapartnertohost
                      + corrupt
                      + log(infraindex)
                      + litrate
                      + africa
                      + imr
                      , data = cdmdata2, subset = zero == 1,
                      gaussian(link = "identity"))

results in this table

    Coefficients: (1 not defined because of singularities)
                            Estimate Std. Error t value Pr(>|t|)
    (Intercept)            1.216e+01  5.771e+01   0.211   0.8332
    factor(year)2006      -1.403e+00  5.777e-01  -2.429   0.0157 *
    factor(year)2007      -2.799e-01  7.901e-01  -0.354   0.7234
    log(gdppcpppconst)     2.762e-01  5.517e+00   0.050   0.9601
    log(gdppcpppconstAII) -1.344e-01  9.025e-01  -0.149   0.8817
    log(co2eemisspc)       5.655e+00  2.903e+00   1.948   0.0523 .
    log(co2eemisspcAII)   -1.411e-01  4.245e-01  -0.332   0.7399
    log(dist)             -2.938e-01  4.023e-01  -0.730   0.4658
    fdiboth                1.326e-04  1.133e-04   1.171   0.2425
    odapartnertohost       2.319e-03  1.437e-03   1.613   0.1078
    corrupt                1.875e+00  3.313e+00   0.566   0.5718
    log(infraindex)        4.783e+00  1.091e+01   0.438   0.6615
    litrate0.47           -2.485e+01  3.190e+01  -0.779   0.4365
    litrate0.499          -1.657e+01  2.591e+01  -0.639   0.5230
    litrate0.523          -2.440e+01  3.427e+01  -0.712   0.4769
    litrate0.528          -9.184e+00  1.379e+01  -0.666   0.5060
    litrate0.595          -2.309e+01  2.776e+01  -0.832   0.4062
    litrate0.66           -1.451e+01  2.734e+01  -0.531   0.5961
    litrate0.675          -1.707e+01  2.813e+01  -0.607   0.5444
    litrate0.68           -6.346e+00  1.063e+01  -0.597   0.5509
    litrate0.699           2.717e+00  3.541e+00   0.768   0.4434
    litrate0.706          -1.960e+01  2.933e+01  -0.668   0.5046
    litrate0.714          -2.586e+01  4.002e+01  -0.646   0.5186
    litrate0.736           5.641e+00  1.561e+01   0.361   0.7181
    litrate0.743          -2.692e+01  4.253e+01  -0.633   0.5273
    litrate0.762          -2.208e+01  3.100e+01  -0.712   0.4767
    litrate0.802          -2.325e+01  3.766e+01  -0.617   0.5375
    litrate0.847          -2.620e+01  3.948e+01  -0.664   0.5075
    litrate0.86           -3.576e+01  4.950e+01  -0.722   0.4707
    litrate0.864          -4.482e+01  6.274e+01  -0.714   0.4755
    litrate0.872          -1.946e+01  2.715e+01  -0.717   0.4739
    litrate0.877          -2.710e+01  3.702e+01  -0.732   0.4646
    litrate0.879          -3.460e+01  5.147e+01  -0.672   0.5020
    litrate0.886          -3.276e+01  4.860e+01  -0.674   0.5008
    litrate0.889          -4.120e+01  5.755e+01  -0.716   0.4746
    litrate0.904          -2.282e+01  2.985e+01  -0.764   0.4453
    litrate0.91           -3.478e+01  5.037e+01  -0.691   0.4904
    litrate0.923          -1.762e+01  2.551e+01  -0.691   0.4902
    litrate0.925          -2.445e+01  3.611e+01  -0.677   0.4990
    litrate0.926          -2.995e+01  4.565e+01  -0.656   0.5123
    litrate0.928          -2.839e+01  3.933e+01  -0.722   0.4710
    litrate0.937          -2.571e+01  3.795e+01  -0.677   0.4986
    litrate0.94           -2.109e+01  3.051e+01  -0.691   0.4900
    litrate0.959          -2.078e+01  2.895e+01  -0.718   0.4735
    litrate0.96           -3.403e+01  4.798e+01  -0.709   0.4787
    litrate0.962          -4.084e+01  5.755e+01  -0.710   0.4785
    litrate0.971          -3.743e+01  5.247e+01  -0.713   0.4761
    litrate0.98           -3.709e+01  5.170e+01  -0.717   0.4737
    litrate0.986          -2.663e+01  4.437e+01  -0.600   0.5488
    litrate0.991          -3.045e+01  4.166e+01  -0.731   0.4654
    litrate1              -2.732e+01  4.459e+01  -0.613   0.5405
    africa                        NA         NA      NA       NA
    imr                    2.160e+00  9.357e-01   2.309   0.0216 *

although it should result in something similar to this:

    Coefficients: (1 not defined because of singularities)
                            Estimate Std. Error t value Pr(>|t|)
    (Intercept)            1.216e+01  5.771e+01   0.211   0.8332
    factor(year)2006      -1.403e+00  5.777e-01  -2.429   0.0157 *
    factor(year)2007      -2.799e-01  7.901e-01  -0.354   0.7234
    log(gdppcpppconst)     2.762e-01  5.517e+00   0.050   0.9601
    log(gdppcpppconstAII) -1.344e-01  9.025e-01  -0.149   0.8817
    log(co2eemisspc)       5.655e+00  2.903e+00   1.948   0.0523 .
    log(co2eemisspcAII)   -1.411e-01  4.245e-01  -0.332   0.7399
    log(dist)             -2.938e-01  4.023e-01  -0.730   0.4658
    fdiboth                1.326e-04  1.133e-04   1.171   0.2425
    odapartnertohost       2.319e-03  1.437e-03   1.613   0.1078
    corrupt                1.875e+00  3.313e+00   0.566   0.5718
    log(infraindex)        4.783e+00  1.091e+01   0