On Fri, 19 Oct 2001 11:58:18 +1000, "Glen Barnett"
<[EMAIL PROTECTED]> wrote:

> 
> Rich Strauss <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]...
> > However, the arcsin transformation is for proportions (with fixed
> 
> It's also designed for stabilising variance rather than specifically inducing
> symmetry. Does it actually produce symmetry as well?
> 
> > denominator), not for ratios (with variable denominator).  The "proportion
> > of sentences in a number of texts that belong to a certain category" sounds
> > like a problem in ratios, since the total number of sentences undoubtedly
> > varies among texts.  Log transformations work well because they linearize
> > such ratios.
> 
> Additionally for small proportions logs are close to logits, so logs are
> sometimes helpful even if the data really are proportions. Logs also go 
> some way to reducing the skewness and stabilising the variance, though 
> they don't stabilise it as well as the arcsin square root that's specifically 
> designed for it.
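
To put a number on the "logs are close to logits" remark: logit(p) - log(p)
equals -log(1-p), which is roughly p when p is small. A quick sketch in
Python (the function names are mine and purely illustrative, not from any
of the posts):

```python
import math

def logit(p):
    """Log-odds of a proportion p, assuming 0 < p < 1."""
    return math.log(p) - math.log1p(-p)

def logit_minus_log(p):
    """Gap between logit(p) and log(p); algebraically this is -log(1 - p)."""
    return logit(p) - math.log(p)

if __name__ == "__main__":
    # Print log, logit and their gap for a few small and large proportions.
    for p in (0.001, 0.01, 0.05, 0.2, 0.5):
        print(f"p={p:<6} log={math.log(p):8.4f} "
              f"logit={logit(p):8.4f} gap={logit_minus_log(p):.4f}")
```

The gap grows with p, so the two transformations should only be treated as
interchangeable when the proportions are small.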

The arcsin transformation is okay but not great for proportions
less than (say) 5%.   Jez Hill followed up on a reference that
I gave him this summer, and posted further detail:
============== from June 27, 2001, Jez Hill.
Subject: Re: [Q ] transforming binomial proportions 
Newsgroups: sci.stat.math

Rich Ulrich wrote in article
[EMAIL PROTECTED]:

> The fixed variance was the main appeal of the approximation,
> "arcsin(sqrt(p))". [snip]
> "A more accurate transformation for small n has been tabulated by
> Mosteller and Youtz."[ Biometrika 48 (1961):433.]

Thanks very much for that - it looks pretty good to me at n=500,
6<=np<=494

FYI: Following up on your reference, Mosteller and Youtz give the
following formula from Freeman and Tukey [Ann. Math. Statist.
21(1950):607].

    arcsin(sqrt( np/(n+1) ))/2 + arcsin(sqrt( (np+1)/(n+1) ))/2

which gives asymptotic variance 821/(n+0.5) "for a substantial range
of p if  n is not too small". I find that the improvement is quite
significant, to the point where I would be quite happy to use it even
for np=1, 2 or 3 at  n=500, minor glitches in that region
notwithstanding.
=========== end of post
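
For anyone who wants to check the variance claim numerically, here is a
minimal sketch in Python. It is not from either post. It assumes angles in
radians, where the nominal variances are 1/(4n) for the plain arcsin(sqrt(p))
and 1/(4n+2) for the averaged form above; multiplying by (180/pi)^2 converts
to degrees, which is where a figure like 821/(n+0.5) comes from. The function
names are made up for illustration, and x stands for the observed count np.

```python
import math

# Conversion factor: variance in radians^2 -> variance in degrees^2.
RAD2_TO_DEG2 = (180.0 / math.pi) ** 2

def arcsin_sqrt(x, n):
    """Plain angular transform of the observed proportion x/n (radians)."""
    return math.asin(math.sqrt(x / n))

def freeman_tukey(x, n):
    """Averaged two-term form quoted above (radians); x is the count n*p."""
    a = math.asin(math.sqrt(x / (n + 1)))
    b = math.asin(math.sqrt((x + 1) / (n + 1)))
    return 0.5 * (a + b)

def binom_pmf(x, n, p):
    """Binomial probability via log-gamma to avoid overflow; needs 0 < p < 1."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
               + x * math.log(p) + (n - x) * math.log1p(-p))
    return math.exp(log_pmf)

def exact_variance(transform, n, p):
    """Variance of transform(X, n) for X ~ Binomial(n, p), summed over all x."""
    weights = [binom_pmf(x, n, p) for x in range(n + 1)]
    values = [transform(x, n) for x in range(n + 1)]
    mean = sum(w * v for w, v in zip(weights, values))
    return sum(w * (v - mean) ** 2 for w, v in zip(weights, values))

def variance_ratios(n, p):
    """Return (plain, Freeman-Tukey) variances relative to their nominal values."""
    plain = exact_variance(arcsin_sqrt, n, p) * (4 * n)
    ft = exact_variance(freeman_tukey, n, p) * (4 * n + 2)
    return plain, ft

if __name__ == "__main__":
    n = 500
    for count in (1, 2, 3, 6, 50, 250):
        plain, ft = variance_ratios(n, count / n)
        print(f"np={count:3d}  plain ratio={plain:.3f}  "
              f"Freeman-Tukey ratio={ft:.3f}")
```

A ratio near 1 means the variance is close to its nominal value. If the
quoted claim holds, the Freeman-Tukey ratio should stay nearer 1 than the
plain one at np = 1, 2 or 3.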
-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html


