Have you plotted the data?  Impossible to tell much from a simple 
regression analysis;  especially without any definition of the two 
variables.  If I were compelled to guess, I'd suppose that BEHPROBS 
(your dependent variable) was the number of behavioral problems 
reported, probably over some defined time span (perhaps the week 
mentioned with respect to "how often the parent has spanked the 
child", which I presume to be the independent variable SPANK9235?). 
But if you haven't even _looked_at_ the bivariate relationship, you 
can't tell whether a _linear_ functional relation makes any sense. 

On Wed, 19 Jan 2000, steinberg wrote:

> I am asking whether corporal punishment of children is associated
> with behavior problems.  

        Controlling for what other variables?  The analysis you report 
below shows none;  but surely there are many that need to be controlled 
(such as propensity for administering punishment at all, propensity for 
corporal versus other kinds of punishment, whether corporal punishment 
is administered by only one parent or by both, the severity of the 
(alleged?) behavior problems, ...).

> I am using data from the National
> Longitudinal Survey of Youth.  I am interested in the results of a 
> question that asks how often the parent has spanked the child in
> the last week.  This data is extremely right skewed with some
> extreme outliers.  Most of the responses are zeros and ones.
> Square root and log transforms have very little effect on the
> right skew.  (I added 1 to each score and took the log to avoid
> zeros.)

But the important question is, what effect (if any) do these 
transformations have on the bivariate relationship?  Does it look 
more (or less) linear in one form than in the others?
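Where a scatterplot is awkward because most of the X values are 0 or 1, one rough substitute is to tabulate the mean of Y at each distinct X value and see whether the means climb in even steps on the raw, square-root, or log(X+1) scale.  A minimal sketch in Python follows;  the data pairs and the helper name group_means are invented purely for illustration and are not taken from the NLSY file.

```python
import math
from collections import defaultdict

# Hypothetical (spank count, behavior score) pairs, invented for
# illustration only; substitute the real SPANK9235 / BEHPROBS values.
pairs = [(0, 98), (0, 101), (0, 104), (1, 103), (1, 107),
         (2, 108), (2, 110), (5, 112), (12, 114)]

def group_means(pairs):
    """Return the mean of y at each distinct x value, ordered by x."""
    groups = defaultdict(list)
    for x, y in pairs:
        groups[x].append(y)
    return {x: sum(ys) / len(ys) for x, ys in sorted(groups.items())}

means = group_means(pairs)

# Show each x on the three candidate scales next to its mean y.
print("  x  sqrt(x)  log(x+1)  mean y")
for x, m in means.items():
    print(f"{x:>3}  {math.sqrt(x):7.2f}  {math.log(x + 1):8.2f}  {m:6.1f}")
```

If the mean of Y rises by roughly equal amounts per unit of one of the three X columns, that scale is the better candidate for a linear fit;  if it rises on none of them, a straight line is probably the wrong model.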

> The regression (output below) shows such a small R-squared that
> there would appear to be no meaningful association, although the
> slope is significantly different from zero. 

Again:  If you haven't examined the scatterplot, you cannot tell whether 
there is an association or not.  It is not at all clear that a simple 
linear association is to be expected;  especially if your respondents 
include parents who refuse to use corporal punishment at all, however 
great the behavioral provocation, as well as parents who believe firmly 
in the dictum "Spare the rod and spoil the child".  
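        With 1105 degrees of freedom, quite small effects can be found 
formally significant;  but your analysis reports  r = .226  (about 5% of 
the variance), which is well above the smallest correlation that would 
reach formal significance in a sample of this size.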
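To put rough numbers on that point, here is a small sketch that recomputes the t statistic from the reported r and df, and then finds the smallest correlation that would reach two-sided significance at the 5% level.  The critical value 1.96 is my assumption (the normal approximation, reasonable for df this large);  everything else comes from the quoted output.

```python
import math

r = 0.226    # Multiple R from the quoted output
df = 1105    # residual degrees of freedom from the quoted output

# t statistic for testing that the correlation is zero
t = r * math.sqrt(df) / math.sqrt(1 - r**2)

# Smallest |r| significant at the two-sided 5% level, using the
# normal approximation 1.96 for the critical t (an assumption that
# is reasonable for df this large)
t_crit = 1.96
r_crit = t_crit / math.sqrt(t_crit**2 + df)

print(f"t = {t:.2f}")
print(f"smallest significant |r| is about {r_crit:.3f}")
```

The recomputed t is close to the reported 7.719 (the small difference comes from r being rounded to three places), while a correlation of only about .06 would already be formally significant with this many cases.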

> ... However, on general principle:  Is there some way to properly 
> transform such skewed data? 

It sounds as though you have addressed that reasonably well, at least at 
the simple level of bivariate regression, insofar as one can tell without 
looking at the data.

> If not, can it still be used in a regression? 

        Certainly.

> Of what errors must I be aware if I were to use it?

Mainly, oversimplified models, I should think.  You might profitably 
spend some time thinking about how the data you have might have arisen, 
and what other variables will affect the relationship you wish to 
consider.  AND you might also think about whether you've got the 
relationship the right way round.  You're using number of spankings in a 
week to predict (number of?) behavior problems;  it would not be 
unreasonable, from one point of view, to predict the number of spankings 
from the number (or intensity?) of the problems.
        An assumption embedded in your analysis is that it makes sense to 
think of spanking as inducing (or causing) behavior problems.  Parents 
who spank, if asked, will ordinarily claim that they are trying to reduce 
or prevent behavior problems, and that spanking is a response to overt 
behavior problems, not a cause of them. 
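The regression itself cannot settle the direction.  In a simple two-variable regression the product of the two slopes (Y on X, and X on Y) equals r squared, so the slope for the reversed prediction can be sketched from the quoted output alone.  The variable names below are mine, and the result is only as precise as the rounded values in the output.

```python
r = 0.226       # Multiple R from the quoted output
b_yx = 1.381    # reported slope: BEHPROBS regressed on SPANK9235

# For simple regression the two slopes satisfy b_yx * b_xy = r**2,
# so the reversed slope follows without refitting anything.
b_xy = r**2 / b_yx

print(f"implied slope of SPANK9235 on BEHPROBS: {b_xy:.3f}")
```

Both regressions share the same r and the same t test for the slope, so the significance reported in the output says nothing about which variable ought to be treated as the predictor.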

> ============================
> 
> Dep Var: BEHPROBS   N: 1107   Multiple R: 0.226   
>                                       Squared multiple R: 0.051
>  
> Adjusted squared multiple R: 0.050  
> Standard error of estimate:  14.780
>  
> Effect  Coefficient  Std Error   Std Coef Tolerance     t       P
>  
> CONSTANT    102.839      0.538     0.000      .     191.289    0.000
> SPANK9235     1.381      0.179     0.226     1.000    7.719    0.000 
>
>                              Analysis of Variance
>  
> Source      Sum-of-Squares   df  Mean-Square     F-ratio       P
>  
> Regression      13015.793     1    13015.793     59.583       0.000
> Residual       241384.857  1105      218.448
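For anyone reading the quoted output, the summary figures can all be recovered from the two sums of squares in the ANOVA table.  A short sketch of that arithmetic, using only numbers from the output (the variable names are mine):

```python
import math

ss_reg = 13015.793    # Regression sum of squares
ss_res = 241384.857   # Residual sum of squares
df_reg = 1            # Regression degrees of freedom
df_res = 1105         # Residual degrees of freedom

# Proportion of variance in BEHPROBS accounted for by SPANK9235
r2 = ss_reg / (ss_reg + ss_res)

# Residual mean square and the standard error of estimate
ms_res = ss_res / df_res
see = math.sqrt(ms_res)

# F ratio: regression mean square over residual mean square
f_ratio = (ss_reg / df_reg) / ms_res

print(f"R-squared = {r2:.3f}, R = {math.sqrt(r2):.3f}")
print(f"standard error of estimate = {see:.3f}")
print(f"F = {f_ratio:.3f}")
```

These reproduce the reported R-squared, standard error of estimate, and F ratio, and make plain that about 95% of the variance in BEHPROBS is left unexplained by this one predictor.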


 ------------------------------------------------------------------------
 Donald F. Burrill                                 [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,          [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264                                 603-535-2597
 184 Nashua Road, Bedford, NH 03110                          603-471-7128  
