"Gottfried Helms" <[EMAIL PROTECTED]> wrote in message [EMAIL PROTECTED]">news:[EMAIL PROTECTED]... > Hi , > > there was a tricky problem, recently, with the chi-square-density > of higher dgf's. > I discussed thath in sci.stat.consult and in a german newsgroup, > got some answers and also think to have understood the real point. > > But I would like to have a smoother explanation, as I have to > deal with it in my seminars. Maybe someone out has an idea or > a better shortcut, how to describe it. > To illustrate this I just copy&paste an exchange from s.s.consult; > hope you forgive my lazyness. On the other hand: maybe the > true point comes out better this way. > > Regards > Gottfried > > > 3 postings added: > ---(1/3)------------------------------- > [Gottfried] > > Hi - > > > > im stumbling in the dark... eventually only missing any > > simple hint. > > I'm trying to explain the concept of significance of the > > deviation of an empirical sample from a given, expected > > distribution. > > If we discuss the chi-square-distribution > > | > > |* > > | * > > | * > > | * > > | * > > | * > > | * > > | * > > +--------------------------------- > > > > then this graph illustrates us very well, that and how a > > small deviation is more likely to happen than a high deviation - > > thus backing the concept of the 95%tiles etc. in the beginners > > literature. > > Just cutting it in equal slices this curve gives us expected > > frequencies of occurences of samples with individual chi-squared > > deviations from the expected occurences. > > > > If I have more df's, then the curve changes its shape; in this > > case a 5 df-curve for samples of thrown dices, where I count > > the frequencies of occurences of each number and the deviation > > of these frequencies from the uniformity. > > > > | > > | > > | > > | > > | * > > | * * > > | * * > > | * * > > | * * > > +------------------------------------------------- > > 0 X²(df=5) > > > > Now the slices with the highest frequency of occurences > > are not the ones with the smallest deviation from the > > expected distribution (X²=0) - and even if I accept, that this > > is at least so for the cumulative distribution, it is > > suddenly no more "self-explaining". It is congruent with > > the reality, but our common language is different: > > the most likely chisquare-deviation from the uniformity > > is now an area which is not at the zero-mark. > > So, now: do we EXPECT a deviation from uniformity? > > That the count of frequencies of the occurences of the > > 6 dices numbers is NOT most likely uniform? HÄH? > > Is this suddenly the Nullhypothesis? And do we calculate > > the deviation of our empirical sample then from this new > > Nullhypothesis??? > > > > I never thought about that in this way, but since I do > > now, I feel a bit confused, maybe I only have to step > > aside a bit? > > Any good hint appreciated - > > > > Gottfried. > > > ---------------------------------------------------------------------- > > ---(2/3)--------------------------------------- > Then one participant answered: > > > Actually, that corresponds to the notion that if a "random" sequence is > > *too* uniform, it isn't really random. For example, if you were to toss a > > coin 1000 times, you'd be a little surprised if you got *exactly* 500 > > heads and 500 tails. 
> > If you think in terms of taking samples from a multinomial
> > population, the non-monotonicity of the chi-square density means
> > that a *small* amount of sampling error is more probable than
> > *no* sampling error, as well as more probable than a *large*
> > sampling error, which I think corresponds pretty well to our
> > intuition.
>
> -------------------------------------------------------------------
>
> --(3/3)-----------------------------------------
> I was not really satisfied with this and answered, after I had
> gained some more insight:
>
> [Gottfried]
> > [xxxx] wrote:
> > > Actually, that corresponds to the notion that if a "random"
> > > sequence is *too* uniform, it isn't really random. For example,
> > > if you were to toss a coin 1000 times, you'd be a little
> > > surprised if you got *exactly* 500 heads and 500 tails. If you
> > > think in terms of taking samples from a
> >
> > Yes, this is true. But it is the same with every other
> > combination: no single one is more likely to occur than another
> > (or better, should one say "variation"?).
> > But then, a student would ask, how can you still claim that, in
> > general, a variation near the expected one is more likely than a
> > variation far away from it?
> >
> > The reason is that we don't argue about one specific variation,
> > but about a property of a variation, or in this case of a
> > combination. We commonly select the property of "having a
> > distance from the expected variation", measured in terms of
> > squared deviation. The trouble is that with this criterion, in a
> > multinomial configuration, there are plenty of variations
> > sharing the same distance in terms of the squared deviation -
> > their number growing up to a local maximum.
> > My difficulty is making this clear in simple words; ideally in
> > words as simple as those I used when I explained the rationale
> > of chi-square and significance...
> > OK, maybe it's more a subject for news://sci.stat.edu, I guess.
> >
> > Thanks again for your input -
> >
> > Gottfried Helms.

For the smoother explanation that you want, point out that a
chi-squared rv on n degrees of freedom is the sum of (at least) n
observations. (Sometimes n+r observations with r linear constraints.)
Your students will be conscious of this as the way they always
calculate it. If we add n random variables, each of which is positive,
the most likely value of the sum will obviously increase as n
increases, and the probability of very small values will decrease. In
addition, the Central Limit Theorem shows that the density of the sum
approaches a normal density.
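To make that concrete for the seminar, a short simulation can show
both effects. The sketch below is my own illustration, not part of
the original thread; it assumes Python 3 with NumPy, and every name
in it is made up. It first rolls a fair die repeatedly and tabulates
the goodness-of-fit statistic (6 faces, so 5 df), then rebuilds
chi-square variates "by hand" as sums of squared standard normals:

    # Illustrative sketch (assumed: Python 3 + NumPy; names are mine).
    import numpy as np

    rng = np.random.default_rng(0)   # fixed seed for reproducibility

    # --- Dice: goodness-of-fit statistic, 6 - 1 = 5 df --------------
    n_rolls, n_experiments = 600, 10_000
    expected = n_rolls / 6.0         # 100 expected counts per face
    faces = rng.integers(0, 6, size=(n_experiments, n_rolls))
    observed = np.stack([(faces == k).sum(axis=1) for k in range(6)],
                        axis=1)
    stats = ((observed - expected) ** 2 / expected).sum(axis=1)

    # Equal-width slices of the statistic: the slice next to zero is
    # NOT the most frequent one.
    print("share with X^2 < 1:     ", np.mean(stats < 1))
    print("share with 2 <= X^2 < 3:",
          np.mean((stats >= 2) & (stats < 3)))

    # --- Chi-square as a sum of squared standard normals ------------
    for df in (1, 5, 20):
        x = (rng.standard_normal((100_000, df)) ** 2).sum(axis=1)
        counts, edges = np.histogram(x, bins=200)
        print(f"df={df:2d}: empirical mode ~ "
              f"{edges[counts.argmax()]:.1f} "
              f"(theory: max(df - 2, 0) = {max(df - 2, 0)})")

With 5 degrees of freedom, the slice 2 <= X^2 < 3, which sits near the
theoretical mode df - 2 = 3, comes up several times more often than
the slice next to zero - exactly the picture in the second ASCII plot
above - and the mode growing with df illustrates why the density looks
more and more normal as the degrees of freedom increase.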