In sci.stat.consult Rich Ulrich <[EMAIL PROTECTED]> wrote:
> On Wed, 18 Sep 2002 19:31:02 +0000 (UTC), Ronald Bloom
> <[EMAIL PROTECTED]> wrote:
>>
>> One has M>p multiple measurements of a p-dimensional random vector.
>>
>> The objective is to build up, from these M measurements, an
>> estimate S of the "true" covariance matrix C.
>>
>> This sample covariance is intended to be used in the standard
>> role in the "hotelling" quadratic form
>>
>>     D = X'*inv(S)*X
>>
>> in order to make standard inferential tests on subsequent p-dimensional
>> random vectors X, (drawn from the putatively stable process from
>> which the sample covariance was estimated), subject to the
>> standard assumptions.
>>
>> For standard inference on D to make any sense, D must be non-negative,
>> for all X.
>>
>> Question is:
>>
>> I am concerned about the positive definiteness of the form D
>> being robust with respect to sampling.
>>
> [ snip - redundant questioning ]

> Any covariance matrix computed on full data is assured to be
> non-negative definite.

Are you saying that it is a Theorem that any sample covariance
matrix computed from the array of M p-dimensional observations:

   X(1,1),...,X(1,p)   (vector observation 1)
   X(2,1),...,X(2,p)   (vector observation 2)
   .
   .
   X(M,1),...,X(M,p)   (vector observation M)

is guaranteed to be positive definite? I do not believe that
is a theorem of algebra. In fact I know it is not.

Sampling from a multivariate normal population with population
covariance matrix C, one gets various sample covariance matrices S
that should be, for large enough sample sizes, "close" to C, and so,
by and large, they will tend to be positive definite like C.
However, due to sampling fluctuation, one will occasionally
turn up a sample whose sample covariance matrix S is *not*
positive definite.

Unless you can quote a theorem of algebra to the contrary,
I don't believe that there's any hidden mechanism in sampling
fluctuation that *prevents* a random symmetric sample covariance
matrix from having a zero or negative eigenvalue.
To put it another way, through sampling fluctuation alone, one
sweeps out the entire space of symmetric matrices. Not all symmetric
matrices are positive definite.

----------------------------------

The question stands: what do you do with a sample covariance
matrix that is defective in this regard? Do you wait for more
data to come in; or do you "adjust" the matrix? Or do you adjust
the data used to compute the matrix?
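
To make the check concrete, here is a minimal sketch in plain Python of how one might compute S from the M-by-p array of observations and test whether it is positive definite before forming D = X'*inv(S)*X. Everything in it is an illustrative assumption rather than anything prescribed in this thread: the function names (sample_covariance, cholesky, quadratic_form) are hypothetical, the divisor M-1 is the usual unbiased convention, the Cholesky factorization is simply one common way to test definiteness, and the tolerance is an arbitrary absolute value that would need rescaling for real data.

```python
import math


def sample_covariance(rows):
    """Sample covariance S (p x p) from M observations of a p-vector.

    Assumes the usual unbiased divisor M - 1.
    """
    m = len(rows)
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / m for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (m - 1) for j in range(p)]
        for i in range(p)
    ]


def cholesky(s, tol=1e-12):
    """Return lower-triangular L with L*L' = S, or None if a pivot is <= tol.

    A None result means S is not (numerically) positive definite.
    The tolerance is an assumed absolute threshold, not a recommendation.
    """
    p = len(s)
    low = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            acc = s[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if acc <= tol:
                    return None
                low[i][i] = math.sqrt(acc)
            else:
                low[i][j] = acc / low[j][j]
    return low


def quadratic_form(low, x):
    """D = x' * inv(S) * x, computed as |z|^2 where L*z = x."""
    z = []
    for i in range(len(x)):
        z.append((x[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    return sum(v * v for v in z)
```

In use, one would call cholesky(sample_covariance(rows)) and only go on to quadratic_form when the result is not None. A None result flags exactly the "defective" case asked about above, without saying anything about which remedy is appropriate.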
