In article <[EMAIL PROTECTED]>,
Donald F. Burrill <[EMAIL PROTECTED]> wrote:
>Bob, is this not isomorphic to a chi-square goodness of fit test?
>For each coin, if you know or can estimate (separately from these 
>data!) the probability of heads, you can calculate an expected number 
>of heads from the number of times the coin is flipped.  You have an 
>observed number of heads for that coin.  Sum ((O-E)^2/E) for a chi-sq 
>with (presumably) 99 d.f.
>       The low _observed_ frequencies you mention are not a problem; 
>the problem that occasions the usual caution has to do with a low (e.g., 
>fractional) expected frequency coupled with a positive observed 
>frequency, which can unreasonably inflate the total chi-sq.  (If the 
>observed frequency is 0, the contribution of that coin to the total 
>chi-sq. is merely the expected frequency.)

Some of these cells are too small for this.  Total numbers
of flips can be as low as 4, and dichotomizing the Bernoulli
trials loses information.  

With samples of this size, the discrepancy between the
actual distribution of the chi-squared goodness-of-fit
statistic and the chi-squared distribution cannot be ignored;
instead of combining goodness-of-fit sums of squares, one
should combine log likelihood functions.
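
As a rough sketch of what combining log likelihoods could look
like here, assuming each coin's head count is Binomial(n_i, p_i)
with p_i known, and assuming (my reading, not stated above) that
the alternative lets every coin have its own free probability.
The function names are made up for illustration.

```python
import math

def binom_logpmf(x, n, p):
    """Log of the Binomial(n, p) probability of x heads (0 < p < 1)."""
    return (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
            + x * math.log(p) + (n - x) * math.log(1 - p))

def xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def null_loglik(heads, flips, probs):
    """Log likelihood of all coins under the stated probabilities (summed)."""
    return sum(binom_logpmf(x, n, p) for x, n, p in zip(heads, flips, probs))

def deviance(heads, flips, probs):
    """Twice the log likelihood ratio of the per-coin fit versus the null."""
    total = 0.0
    for x, n, p in zip(heads, flips, probs):
        total += xlogy(x, x / (n * p))
        total += xlogy(n - x, (n - x) / (n * (1 - p)))
    return 2.0 * total
```

The deviance is still a small-sample statistic, so its null
distribution should be computed or simulated under the model
rather than read from a chi-squared table.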

There is often good reason to use asymptotics, but building
an overall test from 100 small-sample test statistics is not
such a case.  If the chi-squared statistic is to be used at
all, use its exact distribution under the given theoretical
model.  The chi-squared approximation applied to a 2x100
table is likely to be rather distorted.
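
One way to get at the distribution of the statistic under the
model is simulation.  The sketch below is a Monte Carlo
approximation, not the exact enumeration, and it uses the
heads-only sum of (O-E)^2/E from the quoted message; the
function names, the number of replications and the seed are
arbitrary choices of mine.

```python
import random

def pearson_stat(heads, flips, probs):
    """Sum over coins of (O - E)^2 / E for the heads count, with E = n * p."""
    return sum((x - n * p) ** 2 / (n * p)
               for x, n, p in zip(heads, flips, probs))

def simulate_heads(flips, probs, rng):
    """Draw one head count per coin under the null model."""
    return [sum(rng.random() < p for _ in range(n))
            for n, p in zip(flips, probs)]

def monte_carlo_pvalue(heads, flips, probs, stat=pearson_stat,
                       reps=2000, seed=0):
    """Estimate P(stat >= observed) under the null by simulation."""
    rng = random.Random(seed)
    observed = stat(heads, flips, probs)
    exceed = sum(
        stat(simulate_heads(flips, probs, rng), flips, probs) >= observed
        for _ in range(reps)
    )
    # Add-one correction keeps the estimate away from exactly zero.
    return (exceed + 1) / (reps + 1)
```

Any other statistic with the same three arguments, such as a
summed log likelihood ratio, can be passed as `stat`.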

>To some degree, should the 
>problem arise (i.e., should there be largeish contributions to chi-sq. 
>from a few cells because of observed frequencies of 2 and expected 
>frequencies of, say, 0.05 [close to your worst case] and consequent 
>contributions of 76 to the total chi-sq), you can reduce the intensity 
>of the problem by combining categories;  although before doing that I 
>would always want to look at the large contributors to see whether 
>there may be something interesting going on.  One out of a hundred 
>coins showing this behavior I'm prepared to ignore;  four or five 
>doing so would excite my suspicion.
>                                       -- DFB.
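
The per-coin figures quoted above can be checked directly.  A
minimal sketch, with a function name of my own choosing:

```python
def cell_contribution(observed, expected):
    """One coin's term in the sum of (O - E)^2 / E."""
    return (observed - expected) ** 2 / expected

# Two heads observed where 0.05 were expected: the "76" case.
print(round(cell_contribution(2, 0.05), 2))   # about 76.05

# No heads observed: the term reduces to the expected count itself.
print(round(cell_contribution(0, 0.05), 2))   # about 0.05
```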

>On Thu, 30 Mar 2000, Bob Parks wrote:

>> Consider the following problem (which has a real world
>> problem behind it)

>> You have 100 coins, each of which has a different
>> probability of heads (assume that you know that
>> probability or worse can estimate it).

>> Each coin is labeled.  You ask one person (or machine
>> if you will) to flip each coin a different number of times,
>> and you record the number of heads.

>> Assume that the (known/estimated) probability of heads
>> is between .01 and .20, and the number of flips for
>> each coin is between 4 and 40.

>> The question is how to test that the person/machine
>> doing the flipping is flipping 'randomly/fairly'.  That is,
>> the person/machine might not flip 'randomly/fairly/...'
>> and you want to test that hypothesis.

>> One can easily state the null hypothesis as

>>   p_hat_i = p_know_i  for i=1 to 100

>> where p_hat_i is the observed # heads / # flips for each i.

>> Since each coin has a different probability of heads,
>> you can not directly aggregate.

>> Since the expected number of heads is low, asymptotics for
>> chi-squares will not apply (each coin has substantial
>> probability of obtaining 0 heads so empirically you obtain
>> lots of cells with 0,1,2 etc.).

>> Given that, I have failed to come up with a statistic to test it.

>> TIA for any pointers to help.

>> Bob

> ------------------------------------------------------------------------
> Donald F. Burrill                                 [EMAIL PROTECTED]
> 348 Hyde Hall, Plymouth State College,          [EMAIL PROTECTED]
> MSC #29, Plymouth, NH 03264                                 603-535-2597
> 184 Nashua Road, Bedford, NH 03110                          603-471-7128  



>===========================================================================
>This list is open to everyone.  Occasionally, less thoughtful
>people send inappropriate messages.  Please DO NOT COMPLAIN TO
>THE POSTMASTER about these messages because the postmaster has no
>way of controlling them, and excessive complaints will result in
>termination of the list.

>For information about this list, including information about the
>problem of inappropriate messages and information about how to
>unsubscribe, please see the web page at
>http://jse.stat.ncsu.edu/
>===========================================================================


-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
[EMAIL PROTECTED]         Phone: (765)494-6054   FAX: (765)494-0558


