I am asking whether corporal punishment of children is associated
with behavior problems. I am using data from the National
Longitudinal Survey of Youth. I am interested in the results of a
question that asks how often the parent has spanked the child in
the last week. The responses are extremely right-skewed, with some
extreme outliers; most are zeros and ones. Square root and log
transforms have very little effect on the right skew. (To avoid
taking the log of zero, I added 1 to each score before taking the
log.)
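
A minimal sketch of why the transforms struggle here, using made-up counts (these are hypothetical frequencies, not the NLSY responses). It computes the moment-based skewness of the raw scores, their square roots, and log(x + 1); the function and variable names are my own.

```python
import math

def skewness(values):
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical spanking counts as {value: frequency}; NOT the NLSY data.
counts = {0: 700, 1: 250, 2: 30, 3: 10, 5: 5, 10: 3, 20: 2}
raw = [value for value, freq in counts.items() for _ in range(freq)]

skew_raw = skewness(raw)
skew_sqrt = skewness([math.sqrt(v) for v in raw])
skew_log = skewness([math.log(v + 1) for v in raw])  # add 1 so zeros are defined

print(f"raw: {skew_raw:.2f}  sqrt: {skew_sqrt:.2f}  log(x+1): {skew_log:.2f}")
```

In this made-up example both transforms pull in the outliers, but the large block of zeros stays piled at the bottom of the scale, so the distribution remains clearly right-skewed after either transform.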
The regression (output below) has such a small R-squared (0.051)
that there appears to be no meaningful association, although the
slope is significantly different from zero (t = 7.719). However, on
general principle: Is there some way to properly transform such
skewed data? If not, can they still be used in a regression? Of
what errors must I be aware if I were to use them?
Milton Steinberg
============================
Dep Var: BEHPROBS   N: 1107   Multiple R: 0.226   Squared multiple R: 0.051
Adjusted squared multiple R: 0.050   Standard error of estimate: 14.780

Effect       Coefficient   Std Error   Std Coef   Tolerance         t   P(2 Tail)
CONSTANT         102.839       0.538      0.000           .   191.289       0.000
SPANK9235          1.381       0.179      0.226       1.000     7.719       0.000

Analysis of Variance
Source       Sum-of-Squares     df   Mean-Square   F-ratio       P
Regression        13015.793      1     13015.793    59.583   0.000
Residual         241384.857   1105       218.448
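
The summary figures can be recovered from the ANOVA table alone, which makes clear how a highly significant slope and a tiny R-squared coexist with N = 1107. A minimal sketch using only the numbers printed in the output above (variable names are my own):

```python
import math

# Values copied from the ANOVA table and header of the output.
ss_regression = 13015.793
ss_residual = 241384.857
df_regression = 1
df_residual = 1105
n = 1107

# Share of the variance in BEHPROBS accounted for by the predictor.
r_squared = ss_regression / (ss_regression + ss_residual)
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual

# F-ratio and standard error of estimate from the mean squares.
ms_residual = ss_residual / df_residual
f_ratio = (ss_regression / df_regression) / ms_residual
std_error_of_estimate = math.sqrt(ms_residual)

# With a single predictor, the slope's t statistic is the square root of F.
t_slope = math.sqrt(f_ratio)

print(f"R^2 = {r_squared:.3f}, adjusted R^2 = {adj_r_squared:.3f}")
print(f"F = {f_ratio:.3f}, t = {t_slope:.3f}, SEE = {std_error_of_estimate:.3f}")
```

The predictor accounts for about 5% of the variance, yet with 1105 residual degrees of freedom even that small share yields a large F-ratio.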