Hello all,

Thanks to all who responded to my question about whether backtransformed CIs are kosher. Like most questions in ecology, the answer is "it depends...". Strictly speaking, the backtransformation is valid and useful for interpretation because it returns estimates to the original measurement scale. However, once data have been transformed, interpreting what the transformed (or backtransformed) means, CIs, and differences among means represent requires special care and is not necessarily intuitive.
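For anyone who wants to see the point numerically, here is a minimal Python sketch (not the SAS analysis from my original post). It assumes simulated lognormal data, a sample large enough that a normal quantile can stand in for the t quantile, and a hypothetical helper named backtransformed_ci; none of these come from the responses. It builds the CI on the log scale, exponentiates the two limits, and compares the result with the sample median and the arithmetic mean.

```python
import math
import random
from statistics import NormalDist, fmean, median, stdev


def backtransformed_ci(values, level=0.95):
    """Build a CI for the mean of ln(values), then exponentiate the limits.

    Hypothetical helper for illustration. Uses a large-sample normal
    quantile instead of the t quantile (an assumption of this sketch).
    Returns (estimate, lower, upper) on the original scale.
    """
    logs = [math.log(v) for v in values]
    n = len(logs)
    mean_log = fmean(logs)
    se_log = stdev(logs) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Back-transform the estimate and each confidence *limit* separately.
    return (
        math.exp(mean_log),
        math.exp(mean_log - z * se_log),
        math.exp(mean_log + z * se_log),
    )


random.seed(1)
# Simulated data (assumption): ln(x) ~ Normal(mu=2, sigma=1)
data = [random.lognormvariate(2.0, 1.0) for _ in range(5000)]

est, lo, hi = backtransformed_ci(data)
print(f"backtransformed mean of logs: {est:.2f}  CI: ({lo:.2f}, {hi:.2f})")
print(f"sample median:                {median(data):.2f}")
print(f"sample arithmetic mean:       {fmean(data):.2f}")
```

With lognormal data like this, the backtransformed estimate tracks the sample median, the interval is wider above the estimate than below it, and the arithmetic mean falls well above the interval.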
In particular, a log transform changes an additive, linear model into a multiplicative model, with the following (and other) consequences. If the transformation is monotonic (as the log is) and the transformed distribution is symmetric (e.g., normal, as for lognormal data), the mean of the transformed data best represents the median of the original data. Consequently, the backtransformed mean should be interpreted and reported as the median with the 95% CI of the median, not as the mean and CI of the untransformed data. The backtransformed CI will be asymmetric around the estimate of the median. Moreover, if one calculates a difference between two group means using the transformed data (i.e., mean(log(x1)) - mean(log(x2))), as in my ANOVA example, the backtransformed CI of that difference is actually a CI for the ratio of medians between the two groups. In short, the old caution applies: if you transform data to meet the assumptions of a statistical test, biological interpretation of the output should be made with care.

I found Mike Colvin's pointer to the McArdle and Anderson paper quite instructive (special thanks to Mike!). His post and several others are pasted below.

Thanks again,
Chris C.

My original post:

Here's a question that I feel like I should know the answer to... I've conducted an ANOVA in SAS on a large data set using log(e) transformed data. I'd like to plot means and 95% CIs using the SAS output. Is it kosher to simply back-transform the CIs? (I have a nagging feeling that it isn't.) Thanks in advance.

Mike Colvin's post (repeated for those who missed it):

Chris, check out the article:

Variance heterogeneity, transformations, and models of species abundance: a cautionary tale
Brian H. McArdle and Marti J. Anderson
Can. J. Fish. Aquat. Sci.
61(7): 1294-1302 (2004)
http://pubs.nrc-cnrc.gc.ca/cgi-bin/rp/rp2_abst_e?cjfas_f04-051_61_ns_nf_cjfas7-04

McArdle and Anderson address the back-calculation of transformed confidence intervals on page 1296, in the discussion of the effects of transformations on the model. Apparently, according to this article, you can exponentiate the confidence limits to approximate the intervals for the original data. However, there are some cautions about interpreting the mean of the logged data versus the arithmetic mean, because the log of the arithmetic mean is not the same as the mean of the logged data, depending on the skewness of your original data.

Mike Colvin

Selected other responses:

Yes, it's fine to do that. The ends of the CIs are the 2.5% and 97.5% quantiles, so that 2.5% of the probability lies below the lower confidence limit. If you, for example, squared the CI, 2.5% would still lie below it. All you need is that the transformation is monotonic: i.e., if A<B on the original scale, then this is still true after the transformation, no matter what the values of A and B are. The exponential transformation is monotonic, so you're OK. The problem is that the mean doesn't act like this: technically, you need to multiply it by the Jacobian of the transformation. Fortunately, this is a standard result: for normally distributed x, the mean of e^x is exp(mu + 0.5*s^2), where mu and s^2 are the mean and variance of x (i.e., on the log scale).

Chris, you may want to see Gotelli and Ellison (2004), A Primer of Ecological Statistics. They have a great discussion on transformations in Chapter 8. They state that results and CIs should be back-transformed so that they are in the original units (see pages 233-4).

It's OK to back-transform the confidence *limits*, not the confidence *intervals*. Remember, you log-transformed the data because it was non-normal; it was log-normal, in fact. The 95% confidence interval tells you that you are 95% sure the true mean lies within that interval.
With a symmetric, normal distribution, this interval will be symmetric around the mean. Because a log-normal distribution is not symmetric (at least in untransformed, "real" space), the interval in that space won't be, either. So what you need to do is figure out what the 95% upper and lower confidence limits are in log space, then transform those values back to real space. Your confidence intervals won't be symmetric, nor should they be.

[note: my understanding is that the above response is an example where the backtransformed CI relates to the CI of the median value in the original measurement scale ('real space') and does not represent a good estimate of the CI of the mean of the untransformed data, which is how I think most would be tempted to interpret the backtransformed CI. CCC]

Christopher C. Caudill
Department of Fish and Wildlife Resources
College of Natural Resources
University of Idaho
Moscow, ID 83844-1136
208-885-7614 (voice)
208-885-9080 (fax)
http://www.cnr.uidaho.edu/UIFERL/Christopher_C._Caudill.htm
NOTE NEW EMAIL: [EMAIL PROTECTED]
