- sorry to take so long to get to this -

On 18 Sep 2003 01:03:35 -0700, [EMAIL PROTECTED] (emma) wrote:

> Rich Ulrich <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...

 [ snip ]
> >
> > You need to explore your data, and explore the *logical*
> > relationships of your measures.  Figure out what *might*
> > make sense, before you even look at a scatter chart.
> > Then look at a scatter chart before you even think of
> > doing any transformation or regression.
> > 
> > You might try Judd and McClelland's book since it
> > emphasizes models.
 
> ok - thanx for your reply - my DV is a proportion because,
> theoretically, I'm interested in the overlap between 2 variables
> rather than either in isolation.  I'm afraid I wasn't aware of any
> particular problem with transforming proportions??

Yes, one particular problem is that, for various reasons,
the transformation you select should be symmetric about 50%
if the data use the full range.  The logit is popular these
days:  the log of the ratio p/(1-p).

Or, for two non-zero numbers that make up part of a
total, using the log of their ratio *might* make sense.
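
A minimal sketch of those two options, in Python with only the
standard library.  The function names logit and log_ratio are
made up for this illustration.  It shows the symmetry about 50%
(the logit of p and of 1-p differ only in sign), and that the
log ratio of two parts equals the logit of the first part's
share of their sum.

```python
import math

def logit(p):
    """Log-odds of a proportion p; defined only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit needs 0 < p < 1")
    return math.log(p / (1.0 - p))

def log_ratio(a, b):
    """Log of the ratio of two strictly positive parts a and b."""
    if a <= 0 or b <= 0:
        raise ValueError("both parts must be positive")
    return math.log(a / b)

# Symmetry about 50%: logit(p) and logit(1 - p) mirror each other.
for p in (0.10, 0.25, 0.50, 0.75, 0.90):
    print(f"p={p:.2f}  logit(p)={logit(p):+.4f}  logit(1-p)={logit(1 - p):+.4f}")

# Two parts of a total: log(a/b) is the logit of a/(a+b).
a, b = 30.0, 70.0
print(log_ratio(a, b), logit(a / (a + b)))
```

Note that both functions refuse values at the boundary, so
proportions of exactly 0 or 1 need a separate decision before
either transformation can be used.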

> 
> Neither variable was arbitrarily transformed - both were transformed in

I guess I use "arbitrarily transformed" to describe any
transformation whose justification I consider too weak.
The best reason is that you know it needs it, from what
you know about the measurement and the model --
log of some biological assays;  square root of counts
that you expect to be Poisson.  Or:  The model demands it.
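
To illustrate the square-root case, here is a small sketch
(Python, standard library only; the helper names poisson_pmf and
variance_of are invented for this example).  It computes exact
variances from the Poisson probabilities instead of simulating.
The variance of a raw Poisson count equals its mean, while the
variance of its square root stays near 1/4 once the mean is not
too small.

```python
import math

def poisson_pmf(lam, kmax):
    """Yield (k, P(X = k)) for k = 0..kmax, built up iteratively."""
    p = math.exp(-lam)
    for k in range(kmax + 1):
        yield k, p
        p *= lam / (k + 1)

def variance_of(transform, lam):
    """Variance of transform(X) for X ~ Poisson(lam), by direct summation."""
    # Truncate far out in the upper tail, where the probabilities are negligible.
    kmax = int(lam + 20 * math.sqrt(lam) + 50)
    mean = sum(p * transform(k) for k, p in poisson_pmf(lam, kmax))
    mean_sq = sum(p * transform(k) ** 2 for k, p in poisson_pmf(lam, kmax))
    return mean_sq - mean ** 2

for lam in (10, 40, 160):
    raw = variance_of(lambda k: k, lam)
    rooted = variance_of(math.sqrt, lam)
    print(f"mean={lam:4d}  var(X)={raw:8.3f}  var(sqrt X)={rooted:.4f}")
```

That roughly constant variance is what makes the square root a
candidate when groups with different mean counts are compared.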

The next reason is that examination of the numbers has
shown an extreme range with a natural, unreachable zero,
so that taking the log is natural and "not arbitrary"  for
many models.  Or:  Homogeneity demands it, so that the
linear ANOVA tests will have a chance to be reasonable.
(Herman Rubin might say, Construct your model so you
don't need to use ANOVA, if that is the case;  but most of
us try to use the tools we are familiar with.)

Correcting for *skew*, based only on a Shapiro-Wilk (S-W)
test, is probably "arbitrary" in my lexicon.  If the N is
large enough, the S-W will essentially always reject for
proportions as data, since bounded values cannot be exactly
normal.  If the proportions are all between 20% and 80%, it
*may* remain sound to use the original values.
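
One way to look at the 20%-to-80% remark, offered as an
illustration and not as the whole argument:  over that range the
logit is very nearly a straight-line function of p, so a linear
analysis of the raw proportions and one of the logits should tell
much the same story.  The sketch below (Python, standard library
only; pearson_r and linearity are names invented here) correlates
p with logit(p) on an evenly spaced grid, once for the middle
range and once for nearly the full range.

```python
import math

def logit(p):
    """Log-odds of a proportion p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def pearson_r(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def linearity(lo, hi, steps=61):
    """Correlation between p and logit(p) on an even grid from lo to hi."""
    ps = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return pearson_r(ps, [logit(p) for p in ps])

print(f"20%..80%:  r = {linearity(0.20, 0.80):.4f}")
print(f" 1%..99%:  r = {linearity(0.01, 0.99):.4f}")
```

The grid is an assumption of the sketch;  real data that pile up
near one end of the range would behave less kindly.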

> order to correct a positive skew which led to a significant non-normal
> distribution (Shapiro-Wilk).  The covariate wasn't actually
> transformed for consistency - although perhaps this is important
> seeing as the covariate overlaps with the outcome variable (as u
> suggest).

Oh, yes, I almost forgot.  If you take the log of
"X0" in the analysis, for period 0, the variable
would have to be different at the other periods - say,
scored in different ranges - to justify NOT taking the
log at every period.  Logging some periods and not others,
without such a reason, would *definitely* be "arbitrary".

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization." 