Hello all,

Thanks to all who responded to my question about whether 
backtransformed CI's are kosher.  Like most questions in ecology, the 
answer is "it depends...".  Strictly speaking, the backtransformation 
is valid and useful for interpretation because it returns data to the 
original measurement scale.  However, once data have been 
transformed, interpretation of what the transformed (or 
backtransformed) mean, CI's and differences among means represent 
requires special care and is not necessarily intuitive.

In particular, with a log transform, a model that is additive and 
linear on the transformed scale is multiplicative on the original 
scale, with the following (and other) consequences:

If the transformation is monotonic (as the log is), and the 
transformed distribution is symmetric (e.g., normal, as for lognormal 
data), the mean of the transformed data best represents the median of 
the original data. Consequently the backtransformed mean should be 
interpreted and reported as the median +/- the 95% CI of the median, 
not the mean and CI of the untransformed data.  The backtransformed CI 
will be asymmetric around the estimate of the median.
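
A minimal sketch of that calculation in Python, for anyone who wants 
to see the asymmetry directly. The function name, the simulated 
sample, and the t critical value (2.045 for 29 degrees of freedom) are 
my own assumptions for illustration, not taken from anyone's analysis:

```python
import math
import random
import statistics

def backtransformed_ci(values, t_crit):
    """Return (estimate, lower, upper) on the original scale.

    The CI is built for the mean of log(values) and then exponentiated,
    so the estimate is the geometric mean (the median, if the data are
    lognormal), not the arithmetic mean.
    """
    logs = [math.log(v) for v in values]
    n = len(logs)
    mean_log = statistics.fmean(logs)
    se_log = statistics.stdev(logs) / math.sqrt(n)
    lower = math.exp(mean_log - t_crit * se_log)
    upper = math.exp(mean_log + t_crit * se_log)
    return math.exp(mean_log), lower, upper

random.seed(1)
# Simulated lognormal sample (hypothetical data, n = 30)
sample = [random.lognormvariate(1.0, 0.8) for _ in range(30)]

# 2.045 is the two-sided 95% t critical value for 29 degrees of freedom
est, lo, hi = backtransformed_ci(sample, t_crit=2.045)
print("estimate of median:", est)
print("95% CI:", lo, hi)
print("upper arm longer than lower arm:", (hi - est) > (est - lo))
print("arithmetic mean of raw data:", statistics.fmean(sample))
```

The limits are symmetric on the log scale, so on the original scale 
the upper arm is always the longer one, and the arithmetic mean of the 
raw data sits above the backtransformed estimate.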

Moreover, if one calculates a mean difference between two groups 
using the transformed data (i.e. mean(log(x1)) - mean(log(x2))) as in 
my ANOVA example, the CI of the backtransformed difference is actually 
a CI for the ratio of medians between the two groups.
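
As a rough illustration of the ratio interpretation, here is a small 
Python sketch for two groups. The data are made up, the function name 
is hypothetical, and the pooled-variance t interval (critical value 
2.306 for 8 degrees of freedom) is an assumption standing in for 
whatever the ANOVA software reports:

```python
import math
import statistics

def ratio_ci(group1, group2, t_crit):
    """CI for the ratio of geometric means (medians under lognormality).

    Works on the log scale with a pooled-variance difference of means,
    then exponentiates the difference and its limits.
    """
    log1 = [math.log(v) for v in group1]
    log2 = [math.log(v) for v in group2]
    n1, n2 = len(log1), len(log2)
    diff = statistics.fmean(log1) - statistics.fmean(log2)
    pooled_var = (
        (n1 - 1) * statistics.variance(log1)
        + (n2 - 1) * statistics.variance(log2)
    ) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (
        math.exp(diff),
        math.exp(diff - t_crit * se),
        math.exp(diff + t_crit * se),
    )

# Made-up measurements for two groups of five
group1 = [2.1, 3.4, 5.0, 8.2, 13.5]
group2 = [1.0, 1.6, 2.7, 4.1, 6.9]

# 2.306 is the two-sided 95% t critical value for 8 degrees of freedom
ratio, lower, upper = ratio_ci(group1, group2, t_crit=2.306)
print("ratio of medians (group1 / group2):", ratio)
print("95% CI for the ratio:", lower, upper)
```

A ratio CI that excludes 1 corresponds to a log-scale difference CI 
that excludes 0.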

In short, the old caution applies: if you transform data to meet the 
assumptions of a statistical test, biological interpretation of the 
output should be made with care.

I found Mike Colvin's pointer to the McArdle and Anderson paper quite 
instructive (special thanks to Mike!).  His post and several others 
are pasted below.

Thanks again,
Chris C.



My original post:

Here's a question that I feel like I should know the answer to...

I've conducted an ANOVA in SAS on a large data set using log(e) 
transformed data.  I'd like to plot means and 95% CI's using the SAS output.

Is it kosher to simply back-transform the CI's?  (I have a nagging 
feeling that it isn't.)  Thanks in advance.


Mike Colvin's post (repeated for those who missed it):

Chris,

check out the article:
Variance heterogeneity, transformations, and models of species abundance:
a cautionary tale
Brian H. McArdle and Marti J. Anderson Can. J. Fish. Aquat. Sci. 61(7):
1294-1302 (2004)

http://pubs.nrc-cnrc.gc.ca/cgi-bin/rp/rp2_abst_e?cjfas_f04-051_61_ns_nf_cjfas7-04

by McArdle and Anderson. They address the back calculation of transformed
confidence intervals on page 1296, in the section on the effects of
transformations on the model.  Apparently, according to this article, you
can exponentiate the confidence limits to approximate the intervals of the
original data. However, there are some cautions about interpreting the mean
of the logged data versus the arithmetic mean, as the log of the arithmetic
mean is not the same as the mean of the logged data; how much they differ
depends on the skewness of your original data.


Mike Colvin


Selected other responses:

Yes, it's fine to do that.  The ends of the CIs are the 2.5% and 97.5% 
quantiles, so that 2.5% of the probability lies below the lower 
confidence limit.  If you, for example, squared the CI, 2.5% would 
still lie below it.  All you need is that the transformation is 
monotonic: i.e. if A<B on the original scale, then this is still true 
after the transformation, no matter what the values of A and B 
are.  The exponential transformation is monotonic, so you're OK.

The problem is that the mean doesn't act like this: technically, you 
need to account for the Jacobian of the 
transformation.  Fortunately, this is a standard result for normally 
distributed x, so the mean of e^x is exp(mu + 0.5*s^2), where mu and 
s^2 are the mean and variance of x, i.e. on the log scale.
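
For a lognormal variable, the standard result is that the mean on the 
original scale is exp(mu + s^2/2), with mu and s^2 the mean and 
variance on the log scale. A quick numerical check in Python, using 
simulated data with assumed parameters (mu = 0.5, s = 0.75) chosen 
only for illustration:

```python
import math
import random
import statistics

random.seed(42)
mu, sigma = 0.5, 0.75  # assumed log-scale mean and standard deviation

# x is on the log scale (normal); y = exp(x) is on the original scale
x = [random.gauss(mu, sigma) for _ in range(100_000)]
y = [math.exp(v) for v in x]

mean_x = statistics.fmean(x)
var_x = statistics.variance(x)

naive = math.exp(mean_x)                    # estimates the median, exp(mu)
corrected = math.exp(mean_x + 0.5 * var_x)  # estimates the mean of y
arithmetic = statistics.fmean(y)            # direct mean on original scale

print("exp(mean of logs):        ", naive)
print("exp(mean + 0.5 * variance):", corrected)
print("arithmetic mean of y:      ", arithmetic)
```

The plain backtransformed mean falls below the arithmetic mean, while 
the corrected value should land close to it when the lognormal 
assumption holds.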




Chris,
You may want to see Gotelli and Ellison (2004) A Primer of Ecological 
Statistics.  They have a great discussion on transformations in 
Chapter 8. They state that results and CIs should be back transformed 
so that they are in the original units (see pages 233-4).


It's OK to back-transform the confidence *limits*, not the confidence
*intervals*.  Remember, you log-transformed the data because it was
non-normal; it was log-normal, in fact.  The 95% confidence interval
tells you that you are 95% sure the true mean lies within that
interval.  With a symmetric, normal distribution, this interval will
be symmetric around the mean.  Because a log-normal distribution is
not symmetric (at least in untransformed, "real" space), the interval
in that space won't be, either.  So what you need to do is figure out
what the 95% upper and lower confidence limits are in log space, then
transform those values back to real space.  Your confidence intervals
won't be symmetric, nor should they be.

[note:  my understanding is that the above response is an example 
where the backtransformed CI relates to CI of the median value in the 
original measurement scale ('real space') and does not represent a 
good estimate of the CI of the mean of untransformed data, which is 
how I think most would be tempted to interpret the backtransformed CI. CCC]




Christopher C. Caudill
Department of Fish and Wildlife Resources
College of Natural Resources
University of Idaho
Moscow, ID 83844-1136
208-885-7614 (voice)
208-885-9080 (fax)

http://www.cnr.uidaho.edu/UIFERL/Christopher_C._Caudill.htm

NOTE NEW EMAIL:
[EMAIL PROTECTED]  
