Raul Miller wrote:

> Sure -- that's why I called that numerical model a way of "checking
> my work" instead of a "proof".  But, if the math is valid, then the math
> should remain valid when I plug in the numbers.

This is where the problem lies.

Let S^2=(1/n)\sum (X_i-\mu)^2.  Then E(S^2)=\sigma^2, the population
variance.

However, if you replace \mu by \bar X, it is not true that

E((1/n)\sum (X_i-\bar X)^2)=\sigma^2.

You are assuming that E((X-Y)^2)=E((X-E(Y))^2), which is false.  Assuming
this is equivalent to saying E(Y^2)=E(Y)^2.
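
To spell out where the bias comes from, here is a short derivation.  It is
only a sketch, and it assumes the X_i are independent with common mean \mu
and variance \sigma^2, so that Var(\bar X)=\sigma^2/n.

```latex
\begin{align*}
\sum_{i=1}^n (X_i-\mu)^2
  &= \sum_{i=1}^n (X_i-\bar X)^2 + n(\bar X-\mu)^2
  && \text{(the cross term sums to zero)} \\
n\sigma^2
  &= E\Big(\sum_{i=1}^n (X_i-\bar X)^2\Big) + n\,\mathrm{Var}(\bar X)
   = E\Big(\sum_{i=1}^n (X_i-\bar X)^2\Big) + \sigma^2 \\
E\Big(\sum_{i=1}^n (X_i-\bar X)^2\Big)
  &= (n-1)\sigma^2
\end{align*}
```

So dividing the sum of squares by n gives expectation ((n-1)/n)\sigma^2,
while dividing by n-1 gives \sigma^2.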

>> If you are trying to calculate an expected value by averaging it over
>> samples taken from the population, you will get an estimate, but what
>> does it mean?  This is precisely what estimation is about.
>
> In this case, my "samples" precisely represent the entire population.
>
> For example, let's consider your hypothetical case of number of
> heads from a coin toss.
>
> With a fair coin, the population, with its distribution, is:
>    0: 50%
>    1: 50%
>
> The possible samples for a sample size of 2 are then
>    0 0: 25%
>    0 1: 25%
>    1 0: 25%
>    1 1: 25%
>
> I don't actually need to enumerate probabilities for this
> case, since it's evenly distributed.  However, if I had an
> unfair coin I could deal with that as well, using basically
> the same approach:
>    0: 25%
>    1: 75%
>
> with possibilities:
>    0 0: 6.25%
>    0 1: 18.75%
>    1 0: 18.75%
>    1 1: 56.25%
>
> Thus, for E(\sum (X_i-\bar X)^2)=\sum (x_i-\bar x)^2

I still don't get this.  I am using the notation x_i for an actual sample
value.  The left hand side is a number.  The right hand side will vary
based on the actual sample chosen.

>
> I can determine \sum(X_i-\bar X)^2 for each of those
> potential sample cases (0, 0.5, 0.5, 0) and then
> average them.  For the fair coin, this average is
> 0.25.  For that unfair coin, I get 0.1875 for this
> average.

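This precisely illustrates the point.  The population variance is 3%16
(p(1-p) with p=3%4).  Your 0.1875 is the average of \sum (X_i-\bar X)^2
with no denominator at all, which for n=2 is the same as dividing by
n-1=1.  Dividing by n=2 instead gives 3%32, the wrong answer; using
denominator n-1 gives the correct answer.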
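
One way to check the arithmetic is to enumerate every possible sample with
exact fractions.  The following is a minimal sketch in Python rather than
J; the function name expected_sum_of_squares is made up for this
illustration, and it assumes independent 0/1 tosses with P(1)=p.

```python
from fractions import Fraction
from itertools import product


def expected_sum_of_squares(p, n):
    """Exact E(sum (X_i - Xbar)^2) over all 0/1 samples of size n, P(1) = p."""
    total = Fraction(0)
    for sample in product((0, 1), repeat=n):
        # Probability of this particular sample under independent tosses.
        prob = Fraction(1)
        for x in sample:
            prob *= p if x == 1 else 1 - p
        mean = Fraction(sum(sample), n)
        total += prob * sum((x - mean) ** 2 for x in sample)
    return total


p, n = Fraction(3, 4), 2
ss = expected_sum_of_squares(p, n)
print("population variance:", p * (1 - p))   # 3/16
print("denominator n:      ", ss / n)        # 3/32
print("denominator n-1:    ", ss / (n - 1))  # 3/16
```

Only the n-1 denominator reproduces p(1-p); the same check with
p=Fraction(1, 2) gives 1/4 for the fair coin.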

Best wishes,

John

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm