Re: testing a coin flipper

2000-04-02 Thread Herman Rubin

In article [EMAIL PROTECTED],
Donald F. Burrill [EMAIL PROTECTED] wrote:
Bob, is this not isomorphic to a chi-square goodness of fit test?
For each coin, if you know or can estimate (separately from these 
data!) the probability of heads, you can calculate an expected number 
of heads from the number of times the coin is flipped.  You have an 
observed number of heads for that coin.  Sum ((O-E)^2/E) for a chi-sq 
with (presumably) 99 d.f.
   The low _observed_ frequencies you mention are not a problem; 
the problem that occasions the usual caution has to do with a low (e.g., 
fractional) expected frequency coupled with a positive observed 
frequency, which can unreasonably inflate the total chi-sq.  (If the 
observed frequency is 0, the contribution of that coin to the total 
chi-sq. is merely the expected frequency.)
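
A minimal sketch of that statistic (the probabilities, flip counts, and
head counts below are made-up placeholders, not data from the thread):

    import numpy as np
    from scipy.stats import chi2

    p = np.array([0.05, 0.10, 0.15])   # hypothetical known P(heads) per coin
    n = np.array([10, 25, 40])         # hypothetical flips per coin
    obs = np.array([1, 3, 5])          # hypothetical observed heads per coin

    expected = n * p                                 # expected heads, E
    stat = ((obs - expected)**2 / expected).sum()    # Sum ((O-E)^2/E)
    # With 100 coins this would be referred to ~99 d.f.; the 3 toy coins
    # above get 3 d.f. purely for illustration.
    print(stat, chi2.sf(stat, df=len(p)))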

Some of these cells are too small for this.  Total numbers
of flips can be as low as 4, and dichotomizing the Bernoulli
trials loses information.  

With samples of this size, the discrepancy between the
actual distribution of the chi-squared goodness-of-fit
statistic and the chi-squared distribution cannot be
ignored; instead of combining goodness-of-fit sums of
squares, one should combine log likelihood functions.

There is often good reason to use asymptotics, but invoking
it on each of 100 test statistics to get an overall test is
not one of them.  If the chi-squared statistic is to be used
at all, use its exact distribution under the given
theoretical model.  Applying the chi-squared test to a
2x100 table is likely to give rather distorted results.
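
One way to act on this advice (a sketch only; the p_i and n_i are
placeholders, and a Monte Carlo sample stands in for the exact null
distribution of the combined log likelihood):

    import numpy as np
    from scipy.stats import binom

    rng = np.random.default_rng(1)
    p = rng.uniform(0.01, 0.20, size=100)   # placeholder known P(heads)
    n = rng.integers(4, 41, size=100)       # placeholder flips (4..40)

    def combined_loglik(heads):
        # sum of per-coin binomial log likelihoods under the null
        return binom.logpmf(heads, n, p).sum()

    # Null reference distribution of the statistic under the exact model,
    # approximated by simulation rather than by asymptotics.
    null = np.array([combined_loglik(rng.binomial(n, p))
                     for _ in range(20000)])

    heads_obs = rng.binomial(n, p)          # stand-in for the real counts
    pval = np.mean(null <= combined_loglik(heads_obs))
    print(pval)                             # small p-value = poor fit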

 To some degree, should the 
problem arise (i.e., should there be largish contributions to chi-sq. 
from a few cells, because of observed frequencies of 2 and expected 
frequencies of, say, 0.05 [close to your worst case], with consequent 
contributions of 76 to the total chi-sq.), you can reduce the intensity 
of the problem by combining categories;  although before doing that I 
would always want to look at the large contributors to see whether 
there may be something interesting going on.  One out of a hundred 
coins showing this behavior I'm prepared to ignore;  four or five 
doing so would excite my suspicion.
   -- DFB.

On Thu, 30 Mar 2000, Bob Parks wrote:

 Consider the following problem (which has a real world
 problem behind it)

 You have 100 coins, each of which has a different
 probability of heads (assume that you know that
 probability or, at worst, can estimate it).

 Each coin is labeled.  You ask one person (or machine
 if you will) to flip each coin a different number of times,
 and you record the number of heads.

 Assume that the (known/estimated) probability of heads
 is between .01 and .20, and the number of flips for
 each coin is between 4 and 40.

 The question is how to test that the person/machine
 doing the flipping is flipping 'randomly/fairly'.  That is,
 the person/machine might not flip 'randomly/fairly/...'
 and you want to test that hypothesis.

 One can easily state the null hypothesis as

   p_hat_i = p_know_i  for i=1 to 100

 where p_hat_i is the observed # heads / # flips for each i.

 Since each coin has a different probability of heads,
 you can not directly aggregate.

 Since the expected number of heads is low, asymptotics for
 chi-squares will not apply (each coin has substantial
 probability of obtaining 0 heads so empirically you obtain
 lots of cells with 0,1,2 etc.).

 Given that, I have failed to come up with a statistic to test it.

 TIA for any pointers to help.

 Bob

 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128  





-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
[EMAIL PROTECTED] Phone: (765)494-6054   FAX: (765)494-0558



Re: testing a coin flipper

2000-03-31 Thread David A. Heiser


- Original Message -
From: Bob Parks [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, March 30, 2000 6:44 AM
Subject: testing a coin flipper


 Consider the following problem (which has a real world
 problem behind it)

 You have 100 coins, each of which has a different
 probability of heads (assume that you know that
 probability or, at worst, can estimate it).

 Each coin is labeled.  You ask one person (or machine
 if you will) to flip each coin a different number of times,
 and you record the number of heads.

..
Incidentally, I found that William Feller, in chapter III (vol. I) of his
classic book "An Introduction to Probability Theory and Its Applications",
covers coin flipping nicely.

The sequence is treated as a random walk (heads is +1, tails is -1). The
probability of a sign reversal is low, indicating long intervals between
successive crossings of the axis. His Theorem 1 (page 84) states that the
probability that up to epoch 2n+1 (i.e., in 2n+1 flips) there occur
exactly r changes of sign equals twice the probability that the walk
stands at 2r+1 at epoch 2n+1, i.e. 2 P(S_{2n+1} = 2r+1). (This involves
the number of paths reaching 2r+1 in 2n+1 trials.)

His table on page 85 gives the probability of zero sign reversals in 99
trials as 0.1592, which is surprisingly high.
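
A quick numerical check of that table entry, using the theorem with r = 0
and 2n+1 = 99 (the function name here is mine):

    from math import comb

    def prob_sign_changes(r, n):
        # P(exactly r changes of sign up to epoch 2n+1) = 2 P(S_{2n+1} = 2r+1);
        # the walk stands at 2r+1 when heads = n + r + 1 of the 2n+1 fair flips.
        steps = 2 * n + 1
        return 2 * comb(steps, n + r + 1) / 2**steps

    print(round(prob_sign_changes(0, 49), 4))   # 99 trials -> 0.1592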

DAHeiser






Re: testing a coin flipper

2000-03-31 Thread A. G. McDowell

Here is a somewhat DIY approach. Comments?

In article [EMAIL PROTECTED], Bob Parks
[EMAIL PROTECTED] writes
Consider the following problem (which has a real world
problem behind it)

You have 100 coins, each of which has a different
probability of heads (assume that you know that
 probability or, at worst, can estimate it).

Each coin is labeled.  You ask one person (or machine
if you will) to flip each coin a different number of times,
and you record the number of heads.

Assume that the (known/estimated) probability of heads
is between .01 and .20, and the number of flips for
each coin is between 4 and 40.
So there are only about 41 possible different results (# of heads seen)
for each individual coin, and it is possible to calculate the
probability of each of those results under the null hypothesis: the
observed count for coin i is Binomial(n_i, p_know_i), with n_i at most 40.
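
For one coin, those null probabilities take a couple of lines (the p and
flip count below are placeholders):

    import numpy as np
    from scipy.stats import binom

    p_i, n_i = 0.07, 40                  # placeholder coin
    k = np.arange(n_i + 1)               # the 41 possible head counts, 0..40
    null_probs = binom.pmf(k, n_i, p_i)  # P(k heads | null hypothesis)
    print(null_probs.sum())              # sanity check: sums to 1.0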

The question is how to test that the person/machine
doing the flipping is flipping 'randomly/fairly'.  That is,
the person/machine might not flip 'randomly/fairly/...'
and you want to test that hypothesis.

One can easily state the null hypothesis as

  p_hat_i = p_know_i  for i=1 to 100

where p_hat_i is the observed # heads / # flips for each i.

Since each coin has a different probability of heads,
you can not directly aggregate.

But here I assume that, for each coin, you can attach some sort of
'score' to each of its 41 possible results. This might be (observed -
expected)^2/expected, or -log(prob observed | null hypothesis), or
something that reflects your desired alternative hypothesis more
closely: e.g. if you are looking for a consistent bias to heads you
might include the sign of the deviation in the score, or if you are
looking for a trend effect you might set scores for a coin according to
its position in your list of 100 coins.

I also assume that the final statistic is produced by summing the
individual scores. The remaining question is how to estimate the
significance of the result.

Chances are, your scores are small floating point numbers. Shift, scale
and round them to convert them all to integers of reasonable size - say
in the range 0,1,2,... 1000. The total score over 100 coins is then in
the range 0..100,000 or so. It isn't quite as powerful a statistic as the
original one, but it is susceptible to exact calculation. The distribution
of an integer valued score can be represented by an array of floating
point numbers: the probabilities that the score is equal to 0, 1, 2, ...,
100,000. What is more, the distribution of an independent sum of two such
scores is computed by simply convolving the two distributions. Even
without the FFT, convolving arrays of 1000 and 40,000 floats looks
doable on a modern machine. In fact, it's easier than that because only
41 of the 1000 floats in the smaller of the two arrays to be convolved
at each stage are non-zero. Repeat this process 100 times and you've got
the exact distribution of your final (integer-valued) score.
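
A sketch of that scheme (numpy/scipy assumed; the -log(prob) score and
all coin parameters are placeholder choices, not prescriptions from the
post):

    import numpy as np
    from scipy.stats import binom

    rng = np.random.default_rng(0)
    p = rng.uniform(0.01, 0.20, size=100)   # placeholder known P(heads)
    n = rng.integers(4, 41, size=100)       # placeholder flips (4..40)
    MAX_SCORE = 1000                        # per-coin integer score range

    total = np.array([1.0])                 # distribution of an empty sum
    for p_i, n_i in zip(p, n):
        k = np.arange(n_i + 1)              # possible head counts, 0..n_i
        pmf = binom.pmf(k, n_i, p_i)        # null probability of each count
        raw = -np.log(pmf)                  # score = -log(prob | null)
        score = np.round(raw / raw.max() * MAX_SCORE).astype(int)
        dist = np.zeros(MAX_SCORE + 1)      # this coin's score distribution;
        np.add.at(dist, score, pmf)         # at most 41 entries are non-zero
        total = np.convolve(total, dist)    # exact distribution of running sum

    # total[s] is the exact null probability that the overall integer score
    # equals s; an upper-tail p-value is total[s_observed:].sum().
    print(len(total) - 1)                   # maximum possible total score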
-- 
A. G. McDowell

