Hello,

I was looking through the source code of trimmean() and I just realized 
that, in general, it does not remove the same number of elements from the 
top and the bottom. Here is the source:


"""
    trimmean(x, p)

Compute the trimmed mean of `x`, i.e. the mean after removing a
proportion `p` of its highest- and lowest-valued elements.
"""
function trimmean(x::RealArray, p::Real)
    n = length(x)
    n > 0 || error("x can not be empty.")
    0 <= p < 1 || error("p must be non-negative and less than 1.")
    rn = min(round(Int, n * p), n-1)

    sx = sort(x)
    nl = rn >> 1
    nh = (rn - nl)
    s = 0.0
    for i = (1+nl) : (n-nh)
        @inbounds s += sx[i]
    end
    return s / (n - rn)
end


So this removes `nl` elements from the bottom and `nh` elements from the 
top. Sometimes these are the same number, and sometimes (whenever `rn` is 
odd) `nh` is one higher. This means that trimmean() sometimes removes 
values unevenly. This is not how I have seen the trimmed mean defined. 
Every source that I know says that the trimmed mean removes the same 
number of elements from the top and bottom. For example, Wilcox (2010) 
says: "More generally, if we round [p * n] down to the nearest integer g, 
remove the g smallest and largest values and average the n - 2g values 
that remain". This distinction matters. There are theorems about how to 
compute the variance and confidence intervals for the trimmed mean that 
rely on one particular definition of the trimmed mean. If the definition 
changes, I can no longer compute a confidence interval for the computed 
value.
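
For concreteness, here is a minimal sketch of the definition quoted from 
Wilcox (2010). The name `symmetric_trimmean` is hypothetical and is not 
part of any package; I am also assuming that p is the proportion removed 
from each end, so it must be less than 0.5.

```julia
# Hypothetical helper, written only to illustrate the quoted definition.
# Assumption: `p` is the proportion trimmed from EACH end (0 <= p < 0.5).
function symmetric_trimmean(x::AbstractVector{<:Real}, p::Real)
    n = length(x)
    n > 0 || error("x can not be empty.")
    0 <= p < 0.5 || error("p must be non-negative and less than 0.5.")

    g = floor(Int, p * n)          # round p*n down to the nearest integer g
    n - 2g > 0 || error("trimming would remove every element.")

    sx = sort(x)
    s = 0.0
    for i in (1 + g):(n - g)       # skip the g smallest and the g largest
        s += sx[i]
    end
    return s / (n - 2g)            # average the n - 2g values that remain
end
```

With eight values and p = 0.25 this gives g = 2, so the two smallest and 
the two largest values are dropped and the middle four are averaged.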

Another difference between the trimmean() function and the usual definition 
is that the "p% trimmed mean" should mean that you remove p% from the top 
and p% from the bottom, whereas in the trimmean() function it means that 
you remove (p/2)% from the top and (p/2)% from the bottom.
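
To make both differences concrete, the sketch below only counts how many 
elements each convention removes from the bottom and the top. The two 
function names are hypothetical; the first mirrors the arithmetic in the 
source quoted above, and the second follows the Wilcox (2010) wording.

```julia
# Hypothetical helpers for illustration only; neither exists in a package.

# Counts (bottom, top) removed by the arithmetic in the quoted trimmean().
function legacy_trim_counts(n::Integer, p::Real)
    rn = min(round(Int, n * p), n - 1)  # total number of elements removed
    nl = rn >> 1                        # removed from the bottom
    nh = rn - nl                        # removed from the top
    return (nl, nh)
end

# Counts (bottom, top) removed under the quoted textbook definition.
function wilcox_trim_counts(n::Integer, p::Real)
    g = floor(Int, p * n)               # the same g from each end
    return (g, g)
end
```

For n = 20 and p = 0.25, `legacy_trim_counts` returns (2, 3) while 
`wilcox_trim_counts` returns (5, 5): the first is uneven and removes about 
p/2 from each end, while the second removes p from each end.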


Is there any chance that the definition of trimmean() could be changed in a 
future release to agree with Wilcox (2010) and other texts?


Cheers,
Daniel.
