In article <[EMAIL PROTECTED]>,
Rich Ulrich  <[EMAIL PROTECTED]> wrote:
>On Wed, 17 Oct 2001 15:50:35 +0200, Tobias Richter
><[EMAIL PROTECTED]> wrote:



>> We have collected variables that represent proportions (i.e., the
>> proportion of sentences in a number of texts that belong to a certain
>> category). The distributions of these variables are highly skewed (the
>> proportions  for most of the texts are zero or rather low). So my

>Low proportions, and a lot at  zero? 

>There is no way you can transform to "symmetry"  when 
>there is a clump at one end and a long tail at the other.

If one has a continuous distribution, one can always
transform to symmetry.  It is by no means clear that
it does any good.
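
One way to see the claim about continuous distributions is the rank-based normal-scores transform: replace each value by the normal quantile of its rank. The sketch below is only an illustration, not a recommendation. The function names `to_normal_scores` and `skewness`, the exponential test data, and the `(rank - 0.5) / n` plotting position are all assumptions chosen for the example.

```python
import random
from statistics import NormalDist


def to_normal_scores(values):
    """Replace each value by the standard normal quantile of its rank.

    Assumes there are no ties, which is what a continuous
    distribution gives you (hypothetical helper for illustration).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # (rank - 0.5) / n keeps the probability strictly inside (0, 1).
        scores[i] = std_normal.inv_cdf((rank - 0.5) / n)
    return scores


def skewness(xs):
    """Moment-based sample skewness (third standardized moment)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


random.seed(0)
skewed = [random.expovariate(1.0) for _ in range(1000)]  # long right tail
scores = to_normal_scores(skewed)

print("skewness before:", skewness(skewed))
print("skewness after: ", skewness(scores))
```

The transformed values are symmetric by construction, which says nothing about whether they are useful. With many exact zeros the no-ties assumption fails, so this sketch does not apply directly to data with a clump at zero.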

>First thought:  the dichotomy of None/Some  sometimes
>contains most of the information that is useful.  Dummy Var1.

>Related thought:  "none"  is sometimes a separate dimension
>from what is implicitly measured by the continuous values above zero.
>If that dimension does seem useful:  Dummy Var2.
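
A minimal sketch of one reading of the None/Some suggestion: code each proportion as an indicator for "some" plus, separately, the amount when it is above zero. The function name `code_proportions`, the field layout, and the sample values are hypothetical and only serve the illustration.

```python
def code_proportions(props):
    """Split each proportion into (some, amount).

    some   -- 1 if the category occurs at all in the text, else 0
    amount -- the proportion itself when positive, else None, so the
              zeros do not enter an analysis of the positive values
    """
    coded = []
    for p in props:
        some = 1 if p > 0 else 0
        amount = p if p > 0 else None
        coded.append((some, amount))
    return coded


# Made-up proportions for six texts; most are zero, as in the question.
props = [0.0, 0.0, 0.05, 0.0, 0.20, 0.10]
coded = code_proportions(props)

n_some = sum(some for some, _ in coded)
positive_amounts = [amount for _, amount in coded if amount is not None]
print("texts with some occurrence:", n_some)
print("positive proportions:", positive_amounts)
```

Whether the indicator is used alone or alongside the positive amounts is a modelling decision that depends on the subject matter.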


>> question is: Is there a function that transforms the proportions into
>> symmetrically distributed variables? And is there a reliable statistics
>> text that discusses such transformations?

>"Symmetry"  might happen, and it is good to have for the sake
>of testing.  However, describing a scientific  model  with 
>meaningful parameters  is a better starting point, and you 
>can devise tests from there.   

This is VERY important.  The model should always come from
the subject field, not from the data, or from mathematical
convenience.  However, one can use robustness results, in
the real sense of the term, to see that procedures devised
for certain models are good for others.

>I mean: it is useful if you have a "Poisson model with a 
>Poisson parameter", say, at the stage of setting up a model.  
>You might want to take the square root before you do testing, 
>and you know that is appropriate for the Poisson;  but the raw 
>Poisson parameter is a number that is ordinarily additive.
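
The square-root remark can be checked by simulation. For a Poisson count the raw variance equals the mean, while the variance of the square root is roughly 1/4 once the mean is not too small. The sketch below assumes means of 5, 20, and 80 and 10000 draws each. The helper `poisson_draw` is a hypothetical name and uses Knuth's multiplication method, which is adequate only for moderate means.

```python
import math
import random
from statistics import variance


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) count by Knuth's multiplication method."""
    limit = math.exp(-lam)
    count = 0
    product = rng.random()
    # Multiply uniforms until the product falls to exp(-lam) or below.
    while product > limit:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(1)
results = {}
for lam in (5, 20, 80):
    draws = [poisson_draw(lam, rng) for _ in range(10000)]
    raw_var = variance(draws)
    sqrt_var = variance([math.sqrt(d) for d in draws])
    results[lam] = (raw_var, sqrt_var)
    print(f"mean {lam}: raw variance {raw_var:.2f}, "
          f"sqrt variance {sqrt_var:.3f}")
```

The raw variance grows with the mean, and the square-root scale is where it stays roughly constant. That is why the test may be done after the transformation while the model is still stated in terms of the raw, additive Poisson parameter.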

>I have not seen many texts that tackle transformations in
>the abstract.  Finney's classic text on bioassay has a few pages.
>Or, I think, Mosteller and Tukey, "Data Analysis and Regression."
-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
[EMAIL PROTECTED]         Phone: (765)494-6054   FAX: (765)494-0558

