"seferiad" <[EMAIL PROTECTED]> wrote in message Yk%k9.3546$[EMAIL PROTECTED]">news:Yk%k9.3546$[EMAIL PROTECTED]... > I'm not sure how to do a T test for binomial distributions. > > Let's say I pull 2 sets of samples (100 each). I want to compare to see if > they came from the same parent distribution. I take 100 and do some process > to them, I take the other 100 and do something else to those. Some of these > parts in both sets fail. So the mean probability of failure is u1 (for set > 1) and u2 (for set 2). u1 = n x p1 , u2 = n x p2, where n= 100. The > variances are automatically different, since variance = n x p x q (and p1 > and p2 are different). > > Here is where I get confused. If I do the conventional t-test, what do I > assume for the standard deviation? To do the T-test, I can use the stdev > from Set #1 or Set #2, or I can pool the stdev. Since I don't know which is > the corect stdev to use, does it make sense to do Ttest for all 3 stdev's, > and conclude the following: > > Pooled stdev will give the most likely probability of being correct, but > that using stdev1 and stdev2 (in the denomiator) will effectively give 2 > confidence intervals that "bound" the correct answer? What is the accepted > approach. > > > Also, > When we do a T-test for a normal distribution, the stdev is divided by the > square root of n (which makes sense to me). But for the binomial population, > the stdev we calculate is sqrt (n x p x q), which is already associated to a > stdev taken n-samples at a time. As such, it seems to me that when doing a > T-test for binominal distribution, we shouldn't divide the stdev by square > root of n, since it has already been included. Otherwise, we would be double > counting. Just wanted to check, since I'm confused about this. Is my > interpretation correct? > > Thanks, > Jay > > The T-test is derived on the basis that the denominator of the test statistic is derived from sums of squares of Normal variables. 
In the binomial case, the estimates of the variance are instead based on the mean(s) of the observations, so that derivation does not carry over. The appropriate procedure for hypothesis testing is to use a 2 x 2 contingency table.
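As a rough illustration of the 2 x 2 contingency table approach, here is a minimal sketch of the Pearson chi-square test without continuity correction. The function name `chi_square_2x2` and the counts in the usage note (10 and 20 failures out of 100) are hypothetical and do not come from the original question.

```python
import math


def chi_square_2x2(fail1, n1, fail2, n2):
    """Pearson chi-square test (no continuity correction) for a 2 x 2 table.

    Rows are the two sample sets; columns are fail / pass counts.
    Returns (chi2, p_value) using 1 degree of freedom.
    """
    observed = [[fail1, n1 - fail1], [fail2, n2 - fail2]]
    row_totals = [n1, n2]
    col_totals = [fail1 + fail2, (n1 - fail1) + (n2 - fail2)]
    total = n1 + n2

    if 0 in col_totals or 0 in row_totals:
        raise ValueError("every row and column total must be positive")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under the hypothesis of a common failure rate.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Upper tail of chi-square with 1 df equals the two-sided Normal tail
    # at sqrt(chi2), which is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

With the hypothetical counts, `chi_square_2x2(10, 100, 20, 100)` gives a statistic of about 3.92, just past the usual 5% cutoff of 3.84 for 1 degree of freedom. With small expected counts an exact test would normally be preferred over this approximation.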
Significance tests are rarely useful with large samples, as a statistically significant difference may be too small to matter in practice. In any case, if a significant difference is found, one is always interested in how big it is. Confidence limits for a difference in proportions are outlined in most elementary texts. The statistic used there can also be used for a (t-like) test, referring it to Normal (not t) tables.

Hope this helps
Jim Snow
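The confidence limits and the Normal-table test mentioned above can be sketched as follows. This is a minimal sketch using the standard large-sample formulas, not code from the original post; the function name `diff_proportions`, the default critical value 1.96 (a 95% interval) and the counts in the usage note are assumptions for illustration.

```python
import math


def diff_proportions(fail1, n1, fail2, n2, z_crit=1.96):
    """Large-sample interval and z test for a difference in proportions.

    Returns (diff, (lower, upper), z, p_value). z_crit=1.96 is an assumed
    default giving an approximate 95% confidence interval.
    """
    p1 = fail1 / n1
    p2 = fail2 / n2
    diff = p1 - p2

    # Interval: each sample contributes its own variance p*q/n.
    se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

    # Test: under the null hypothesis both samples share one failure rate,
    # so the standard error uses the pooled proportion.
    pooled = (fail1 + fail2) / (n1 + n2)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se_pooled == 0:
        raise ValueError("pooled proportion is 0 or 1; the test is undefined")
    z = diff / se_pooled

    # Two-sided p-value from the standard Normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return diff, ci, z, p_value
```

For hypothetical counts of 10 and 20 failures out of 100 each, `diff_proportions(10, 100, 20, 100)` gives a difference of -0.10 with an interval of roughly (-0.198, -0.002). On the proportion scale the standard error is sqrt(p x q / n), so the sample size enters exactly once and nothing is counted twice. The square of the z statistic equals the uncorrected chi-square statistic from the 2 x 2 table.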
