- sorry to take so long to get to this -

On 18 Sep 2003 01:03:35 -0700, [EMAIL PROTECTED] (emma) wrote:

> Rich Ulrich <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
[ snip ]
RU> >
> > You need to explore your data, and explore the *logical*
> > relationships of your measures. Figure out what *might*
> > make sense, before you even look at a scatter chart.
> > Then look at a scatter chart before you even think of
> > doing any transformation or regression.
> >
> > You might try Judd and McClelland's book since it
> > emphasizes models.

> ok - thanx for your reply - my DV is a proportion because,
> theoretically, I'm interested in the overlap between 2 variables
> rather than either in isolation. I'm afraid I wasn't aware of any
> particular problem with transforming proportions??

Yes, one particular problem is that, for various reasons, the
transformation you select should be symmetric at 50%, if the data
use the full range. The logit is popular, these days: the log of
the ratio p/(1-p). Or, for two non-zero numbers that make up part
of a total, using the log of their ratio *might* make sense.

>
> Neither variable was arbitarily transformed - both were transformed in

I guess I use "arbitrarily transformed" to describe whatever I
consider to be too weak as a justification.

The best reason is that you know it needs it, from what you know
about the measurement and the model -- log of some biological
assays; square root of counts that you expect to be Poisson.
Or: The model demands it.

The next reason is that examination of the numbers has shown
an extreme range with a natural, unreachable zero, so that taking
the log is natural and "not arbitrary" for many models.
Or: Homogeneity demands it, so that the linear ANOVA tests will
have a chance to be reasonable. (Herman Rubin might say,
Construct your model so you don't need to use ANOVA, if that
is the case; but most of us try to use the tools we are familiar with.)

Correcting for *skew*, based only on an S-W test, is probably
"arbitrary" in my lexicon. If the N is large enough, the S-W will
always reject, for proportions as data.
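
For anyone who wants to see the symmetry property in numbers, here is a
minimal sketch of the logit and of the log-ratio of two parts mentioned
above. The function names `logit` and `log_ratio` and the sample values
are only illustrative assumptions, not taken from any particular
package or from the data under discussion.

```python
import math


def logit(p):
    """Log-odds of a proportion p, defined only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is undefined at 0 and 1")
    return math.log(p / (1.0 - p))


def log_ratio(a, b):
    """Log of the ratio of two positive parts of a total."""
    if a <= 0 or b <= 0:
        raise ValueError("both parts must be positive")
    return math.log(a / b)


# Symmetry about 50%: logit(p) and logit(1 - p) differ only in sign,
# and logit(0.5) is zero.
for p in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(f"p={p:.2f}  logit(p)={logit(p):+.4f}  logit(1-p)={logit(1 - p):+.4f}")
```

When the two parts are exhaustive (a + b is the total), `log_ratio(a, b)`
is the same quantity as `logit(a / (a + b))`, which is why the two
suggestions are so closely related. Note that neither is defined when a
proportion is exactly 0 or 1, so data that reach those limits need a
separate decision.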
If the proportions are between 20% and 80%, it *may* remain
sound to use the original values.

> order to correct a positive skew which led to a significant non-normal
> distribution (Shapiro-Wilks). The covariate wasn't actually
> transformed for consistency - although perhaps this is important
> seeing as the covariate overlaps with the outcome variable (as u
> suggest).

Oh, yes, I almost forgot. If you take the log of "X0" in the
analysis, for period 0, the variable would have to be different -
say, scored in different ranges - to justify NOT taking the log
at every period. That would *definitely* be "arbitrary".

--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization."
