Raul Miller wrote:
> On 6/27/07, John Randall <[EMAIL PROTECTED]> wrote:
>> This is trivial.  Suppose X1,...,Xn are independent random variables,
>> and
>> a1,...,an are numbers.  Then
>>
>> E(a1 X1+...+an Xn)=a1 E(X1)+...+an E(Xn),
>>
>> that is, E is linear (indeed affine) on independent variables.  This
>> just
>> follows from the linearity of summation or integration.
>
> Ok, but I do not think that works for standard deviation, because
> standard deviation is not a linear operation.  Then again, I still
> do not understand your proof, so this may be a moot point..
>

Standard deviation is not linear, as you say.  If V denotes variance, the
corresponding formula is

V(a1 X1+...+an Xn)=a1^2 V(X1)+...+an^2 V(Xn).

For standard deviations, you can regard this as Pythagoras' theorem: the
standard deviations combine like lengths along orthogonal axes, where
orthogonality corresponds to independence, i.e. zero covariance.
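
If you want to check the variance formula concretely, here is a rough
J sketch enumerating two independent fair dice (the names var, d,
pairs, combo and the weights 2 and 3 are just my choices):

   var   =: (+/@:*:@:(- +/%#)) % #   NB. population variance of a list
   d     =: 1 + i. 6                 NB. faces of a fair die
   pairs =: > , { 2 # < d            NB. all 36 equally likely (X1,X2) outcomes
   combo =: +/"1 (2 3) *"1 pairs     NB. 2 X1 + 3 X2 for each outcome
   var combo                         NB. variance of the combination
   (4 * var d) + 9 * var d           NB. 2^2 V(X1) + 3^2 V(X2)

The last two lines should print the same number (13 times 35%12, about
37.917), because giving all 36 pairs equal weight makes X1 and X2
independent.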

> Anyways, I'm still stuck working through your earlier post:
>
> Let's take the roll of a (six sided) die, and a sample of that
> happens to be the values 1 2 3.  You asserted
>    $E(S^2)=E((1/n-1)\sum (X_i-\bar X)^2$
>
> The expected value for X_i-\bar X is a random value from the
> set _2.5 _1.5 _0.5 0.5 1.5 2.5.

The expected value for X_i-\bar X is a number, not a random variable.
The random variable X_i-\mu (not \bar X) takes the values you give.

Let's take an even simpler example: tossing a single unbiased coin and
counting the number of heads.  Then \mu=1/2, \sigma^2=1/4.

Suppose we take a sample of size 2, so X_1 and X_2 are random
variables having the above distribution, \bar X=(X_1+X_2)/2.

Then \sum (X_i-\bar X)^2=((X_1-X_2)^2)/2, and by direct computation,
we get E(\sum (X_i-\bar X)^2)=1/4.  Thus in this case

E((1/n)\sum (X_i-\bar X)^2)=1/8=\sigma^2 / 2

but

E((1/(n-1))\sum (X_i-\bar X)^2)=1/4=\sigma^2 .

The proof I gave establishes this result in general.
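
If it helps, here is that computation done by brute force in J,
enumerating the four equally likely outcomes (the names are mine):

   outcomes =: > , { 2 # < 0 1               NB. 0 0, 0 1, 1 0, 1 1
   xbar     =: (+/"1 outcomes) % 2           NB. \bar X for each outcome
   ssq      =: +/"1 *: outcomes -"1 0 xbar   NB. \sum (X_i-\bar X)^2 = ((X_1-X_2)^2)/2
   (+/ ssq % 2) % 4                          NB. E((1/n)\sum ...)     -> 0.125
   (+/ ssq % 1) % 4                          NB. E((1/(n-1))\sum ...) -> 0.25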

> The expected value for
> (X_i-\bar X)^2 is a random value from the set 0.25 2.25 6.25.
> So the right hand side of that assertion seems
> to be, for this case, one of the possibilities for
>    0.5*+/(?3#3){0.25 2.25 6.25
> If I assign an equal chance to each of these possibilities, I
> get:

(X_i-\bar X)^2 need not assume these values.  For example, if n=3, x_1=1,
x_2=1, x_3=2, then (x_i-\bar x)^2 is either 1/9 or 4/9.

Why do you assign equal probabilities?  If the die is fair, large values of
(X_i-\bar X)^2 are less likely than small values.
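
You can see this by enumerating all 216 equally likely samples of size 3
in J (a sketch; the names are mine, and by symmetry I tabulate only
(X_1-\bar X)^2):

   d  =: 1 + i. 6
   s  =: > , { 3 # < d                  NB. 216 x 3 table of equally likely samples
   n2 =: *: (3 * {."1 s) - +/"1 s       NB. 9*(X_1-\bar X)^2, kept as exact integers
   ((~. n2) % 9) ,. (#/.~ n2) % # n2    NB. value of (X_1-\bar X)^2, probability

The possible values are far from equally likely; the large squared
deviations turn out to be much rarer than the small ones.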

> If you're not sick of me pestering you on this subject, could you take
> a look over the above and tell me where you think we differ?
>

I hope the coin-tossing example elucidates things.

I think your main issue is the difference between a random variable
and its value.  Random variables are poorly named: they are neither
random, nor variables.  In the discrete examples above, if the sample
space is S, a random variable X is a function X:S->R and its
distribution is another function f:S->R, and P(X=x) is
\sum f(s) | X(s)=x .  You cannot find the expected value of a random
variable by choosing one outcome s and evaluating X(s).
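
In J, that setup looks something like this for two tosses of the fair
coin (a sketch; the names S, f, X, P are hypothetical, just mine):

   S =: > , { 2 # < 0 1       NB. sample space: the four outcomes of two tosses
   f =: 4 # 0.25              NB. distribution on S (each outcome equally likely)
   X =: +/"1 S                NB. the random variable X(s) = number of heads
   P =: 3 : '+/ f * X = y'    NB. P(X=x) = \sum f(s) | X(s)=x
   P"0 ] 0 1 2                NB. 0.25 0.5 0.25

Evaluating X at one outcome s gives a single value of X, not E(X); the
expectation needs the whole of f.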

A sample is a list of random variables.  The sample mean is a random
variable.  The value of the sample mean on an actual sample is a
number giving a point estimate for the population mean, but we have no
idea how good an estimate it is without knowing something about the
distribution of the sample mean.  This general point is what
estimation is about.
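
For instance, a quick J simulation (a sketch; 10000 samples of size 3
from the fair die, names mine) shows how much the sample mean spreads
around \mu=3.5:

   mean =: +/ % #
   xbar =: mean"1 (1 + ? 10000 3 $ 6)           NB. 10000 sample means, 3 rolls each
   (mean xbar) , %: mean *: xbar - mean xbar    NB. near 3.5 and near %:35%36 (about 0.99)

The second number is the (simulated) standard deviation of \bar X,
which is what tells you how good the point estimate is likely to be.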

Best wishes,

John



----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
