Re: testing a coin flipper
In article [EMAIL PROTECTED], Donald F. Burrill [EMAIL PROTECTED] wrote:

> Bob, is this not isomorphic to a chi-square goodness-of-fit test?
> For each coin, if you know or can estimate (separately from these
> data!) the probability of heads, you can calculate an expected number
> of heads from the number of times the coin is flipped.  You have an
> observed number of heads for that coin.  Sum ((O-E)^2/E) for a
> chi-sq. with (presumably) 99 d.f.  The low _observed_ frequencies you
> mention are not a problem; the problem that occasions the usual
> caution has to do with a low (e.g., fractional) expected frequency
> coupled with a positive observed frequency, which can unreasonably
> inflate the total chi-sq.  (If the observed frequency is 0, the
> contribution of that coin to the total chi-sq. is merely the expected
> frequency.)

Some of these cells are too small for this.  Total numbers of flips can
be as low as 4, and dichotomizing the Bernoulli trials loses
information.  With samples of this size, the discrepancy between the
actual distribution of the chi-squared goodness-of-fit statistic and
the chi-squared distribution cannot be ignored; instead of combining
goodness-of-fit sums of squares, one should combine log-likelihood
functions.  There is often good reason to use asymptotics, but using
them on each of 100 test statistics to get an overall test is not one
of them.  If the chi-squared statistic is to be used at all, use its
exact distribution under the given theoretical model.  Applying the
chi-squared test to a 2x100 table is likely to be rather distorted.

> To some degree, should the problem arise (i.e., should there be
> largeish contributions to chi-sq. from a few cells because of
> observed frequencies of 2 and expected frequencies of, say, 0.05
> [close to your worst case], and consequent contributions of 76 to the
> total chi-sq), you can reduce the intensity of the problem by
> combining categories; although before doing that I would always want
> to look at the large contributors to see whether there may be
> something interesting going on.  One out of a hundred coins showing
> this behavior I'm prepared to ignore; four or five doing so would
> excite my suspicion.  -- DFB.
>
> On Thu, 30 Mar 2000, Bob Parks wrote:
>
> > Consider the following problem (which has a real-world problem
> > behind it).  You have 100 coins, each of which has a different
> > probability of heads (assume that you know that probability, or at
> > worst can estimate it).  Each coin is labeled.  You ask one person
> > (or machine, if you will) to flip each coin a different number of
> > times, and you record the number of heads.  Assume that the
> > (known/estimated) probability of heads is between .01 and .20, and
> > the number of flips for each coin is between 4 and 40.
> >
> > The question is how to test that the person/machine doing the
> > flipping is flipping 'randomly/fairly'.  That is, the person/
> > machine might not flip 'randomly/fairly/...' and you want to test
> > that hypothesis.  One can easily state the null hypothesis as
> >
> >     p_hat_i = p_know_i   for i = 1 to 100
> >
> > where p_hat_i is the observed (# heads)/(# flips) for each i.
> > Since each coin has a different probability of heads, you can not
> > directly aggregate.  Since the expected number of heads is low,
> > asymptotics for chi-squares will not apply (each coin has a
> > substantial probability of obtaining 0 heads, so empirically you
> > obtain lots of cells with 0, 1, 2, etc.).  Given that, I have
> > failed to come up with a statistic to test it.
> >
> > TIA for any pointers to help.    Bob
>
> Donald F. Burrill                              [EMAIL PROTECTED]
> 348 Hyde Hall, Plymouth State College,         [EMAIL PROTECTED]
> MSC #29, Plymouth, NH 03264                    603-535-2597
> 184 Nashua Road, Bedford, NH 03110             603-471-7128

===
This list is open to everyone.  Occasionally, less thoughtful people
send inappropriate messages.  Please DO NOT COMPLAIN TO THE POSTMASTER
about these messages because the postmaster has no way of controlling
them, and excessive complaints will result in termination of the list.
For information about this list, including information about the
problem of inappropriate messages and information about how to
unsubscribe, please see the web page at http://jse.stat.ncsu.edu/
===

-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN
47907-1399  [EMAIL PROTECTED]  Phone: (765)494-6054  FAX: (765)494-0558
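The suggestion to use the exact distribution of the statistic rather
than the chi-squared asymptotics can be illustrated with a small
parametric-simulation sketch.  The coin probabilities, flip counts,
observed head counts, and all names below are hypothetical, not from
the thread; the real problem has 100 coins with p in [.01, .20] and
flip counts in [4, 40]:

```python
import random

# Hypothetical stand-ins for the thread's 100 coins.
p = [0.05, 0.10, 0.20, 0.02]   # known head-probabilities
n = [4, 10, 40, 25]            # flips per coin

def chi_sq(heads):
    """Sum of (O-E)^2/E over the heads and tails cells of every coin."""
    total = 0.0
    for h, ni, pi in zip(heads, n, p):
        e_heads, e_tails = ni * pi, ni * (1.0 - pi)
        total += (h - e_heads) ** 2 / e_heads
        total += ((ni - h) - e_tails) ** 2 / e_tails
    return total

def draw_null(rng):
    """One value of the statistic under H0: flips really are Bernoulli(p_i)."""
    heads = [sum(rng.random() < pi for _ in range(ni))
             for ni, pi in zip(n, p)]
    return chi_sq(heads)

rng = random.Random(1)
null = sorted(draw_null(rng) for _ in range(20000))

observed = chi_sq([1, 2, 9, 0])   # hypothetical observed head counts
p_value = sum(d >= observed for d in null) / len(null)
```

With expected counts this small, the simulated null is discrete and
lumpy and need not be close to a chi-squared density with the nominal
degrees of freedom, which is exactly the concern raised above.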
Re: testing a coin flipper
----- Original Message -----
From: Bob Parks [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, March 30, 2000 6:44 AM
Subject: testing a coin flipper

> Consider the following problem (which has a real-world problem behind
> it).  You have 100 coins, each of which has a different probability
> of heads (assume that you know that probability, or at worst can
> estimate it).  Each coin is labeled.  You ask one person (or machine,
> if you will) to flip each coin a different number of times, and you
> record the number of heads.  ...

Incidentally, I found that William Feller, in chapter III (vol. I) of
his classic book "An Introduction to Probability Theory and Its
Applications", covers coin flipping nicely.  The sequence of flips is
treated as a random walk.  The probability of a sign reversal (i.e.,
heads is +1 and tails is -1) is low, indicating long intervals between
successive crossings of the axis.  His Theorem 1 (page 84) states that
the probability that up to epoch 2n+1 (that is, in 2n+1 flips) there
occur exactly r changes of sign equals twice the probability that the
sum of the steps equals 2r+1 in 2n+1 trials.  (This involves the
number of paths to 2r+1 out of 2n+1 trials.)  His table on page 85
gives the probability of zero sign reversals in 99 trials as 0.1592,
which is surprisingly high.

DAHeiser
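That table entry is easy to check numerically.  A small sketch (the
function name is mine), using the fact that for a fair coin a sum of
2r+1 after 2n+1 steps means n+r+1 heads, so
P{S_(2n+1) = 2r+1} = C(2n+1, n+r+1) / 2^(2n+1):

```python
from math import comb

def prob_sign_changes(r, epochs):
    """P{exactly r changes of sign up to epoch `epochs`} for a fair
    coin, per the theorem quoted above: twice P{S_epochs = 2r+1}."""
    n = (epochs - 1) // 2               # epochs = 2n + 1
    return 2 * comb(epochs, n + r + 1) / 2 ** epochs

print(round(prob_sign_changes(0, 99), 4))   # -> 0.1592, the cited value
```

The probabilities over r = 0, 1, ..., n sum to 1, which is a handy
sanity check on the formula.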
Re: testing a coin flipper
Here is a somewhat DIY approach.  Comments?

In article [EMAIL PROTECTED], Bob Parks [EMAIL PROTECTED] writes:

> Consider the following problem (which has a real-world problem behind
> it).  You have 100 coins, each of which has a different probability
> of heads (assume that you know that probability, or at worst can
> estimate it).  Each coin is labeled.  You ask one person (or machine,
> if you will) to flip each coin a different number of times, and you
> record the number of heads.  Assume that the (known/estimated)
> probability of heads is between .01 and .20, and the number of flips
> for each coin is between 4 and 40.

So there are only about 41 possible different results (# of heads
seen) for each individual coin, and it is possible to calculate the
probability of each of those results under the null hypothesis:

    prob(observed) ~ Binomial(p_know_i, 40) or something

> The question is how to test that the person/machine doing the
> flipping is flipping 'randomly/fairly'.  That is, the person/machine
> might not flip 'randomly/fairly/...' and you want to test that
> hypothesis.  One can easily state the null hypothesis as
>
>     p_hat_i = p_know_i   for i = 1 to 100
>
> where p_hat_i is the observed (# heads)/(# flips) for each i.  Since
> each coin has a different probability of heads, you can not directly
> aggregate.

But here I assume that, for each coin, you can attach some sort of
'score' to each of its (up to) 41 possible results.  This might be
(observed - expected)^2/expected, or -log(prob observed | null
hypothesis), or something that reflects your desired alternative
hypothesis more closely: e.g., if you are looking for a consistent
bias to heads you might include the sign of the deviation in the
score, or if you are looking for a trend effect you might set scores
for a coin according to its position in your list of 100 coins.  I
also assume that the final statistic is produced by summing the
individual scores.  The remaining question is how to estimate the
significance of the result.
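As a sketch of the scoring idea, here is the -log-probability variant
for a single coin (function names and the example coin are mine):

```python
from math import comb, log

def binom_pmf(k, n, p):
    """Null probability of k heads in n flips."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def score_table(n, p):
    """One coin's score for each possible result: -log(prob | null)."""
    return [-log(binom_pmf(k, n, p)) for k in range(n + 1)]

# e.g. a coin flipped 4 times with p = 0.2: the rarer the outcome,
# the larger its score, so 4 heads scores far above 0 heads
s = score_table(4, 0.2)
```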
Chances are, your scores are small floating-point numbers.  Shift,
scale and round them to convert them all to integers of reasonable
size - say in the range 0, 1, 2, ..., 1000.  The total score over 100
coins is then in the range 0..100,000 or so.  It isn't quite as
powerful a statistic as the original one, but it is susceptible to
exact calculation.

The distribution of an integer-valued score can be represented by an
array of floating-point numbers: the probabilities that the score is
equal to 0, 1, 2, ..., up to its maximum value.  What is more, the
distribution of an independent sum of two such scores is computed by
simply convolving the two distributions.  Even without the FFT,
convolving arrays of 1,000 and 40,000 floats looks doable on a modern
machine.  In fact, it's easier than that, because only 41 of the 1,000
entries in the smaller of the two arrays to be convolved at each stage
are non-zero.  Repeat this process 100 times and you've got the exact
distribution of your final (integer-valued) score.
-- 
A. G. McDowell
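The convolution scheme can be sketched in a few lines of pure Python.
The coin list, the scale factor, and all names are mine; a tiny scale
is used to keep the sketch fast (a larger scale loses less information
to rounding), and a real implementation would skip the zero entries as
noted above:

```python
from math import comb, log

coins = [(4, 0.05), (10, 0.10), (40, 0.20), (25, 0.02)]   # (flips, p)
SCALE = 10   # turns -log probabilities into small integers

def pmf(n, p):
    """Null distribution of the # of heads for one coin."""
    return [comb(n, k) * p ** k * (1.0 - p) ** (n - k)
            for k in range(n + 1)]

def convolve(a, b):
    """Distribution of the sum of two independent integer scores."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# Build the exact null distribution of the total integer score by
# convolving in one coin at a time.
total = [1.0]
for n, p in coins:
    probs = pmf(n, p)
    scores = [round(-SCALE * log(q)) for q in probs]
    dist = [0.0] * (max(scores) + 1)
    for q, s in zip(probs, scores):
        dist[s] += q
    total = convolve(total, dist)

def p_value(t):
    """Exact tail probability P(total score >= t) under the null."""
    return sum(total[t:])
```

Because the final array is an exact probability distribution, it sums
to 1, and the p-value of any observed total score is read straight off
its upper tail.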