Robert Kern wrote:
> Neal Becker wrote:
>> I noticed that if I generate complex rv i.i.d. with var=1, that numpy
>> says:
>>
>> var (<real part>) -> (close to 1.0)
>> var (<imag part>) -> (close to 1.0)
>>
>> but
>>
>> var (complex array) -> (close to complex 0)
>>
>> Is that not a strange definition?
>
> There is some discussion on this in the tracker.
>
> http://projects.scipy.org/scipy/numpy/ticket/638
>
> The current state of affairs is that the implementation of var() just
> naively applies the standard formula for real numbers:
>
>     mean((x - mean(x)) ** 2)
>
> I think this is pretty obviously wrong prima facie. AFAIK, no one
> considers this a valid definition of variance for complex RVs or, in
> fact, a useful value. I think we should change this. Unfortunately,
> there is no single alternative but several.
>
> 1. Punt. Complex numbers are inherently multidimensional, and a single
> scale parameter doesn't really describe most distributions of complex
> numbers. Instead, you need a real covariance matrix, which you can get
> with cov([z.real, z.imag]). This estimates the covariance matrix of a
> 2-D Gaussian distribution over RR^2 (interpreted as CC).
>
> 2. Take a slightly less naive formula for the variance, which seems to
> show up in some texts:
>
>     mean(absolute(z - mean(z)) ** 2)
>
> This estimates the single parameter of a circular Gaussian over RR^2
> (interpreted as CC). It is also the trace of the covariance matrix
> above.
>
> 3. Take the variances of the real and imaginary components
> independently. This is equivalent to taking the diagonal of the
> covariance matrix above. This wouldn't be the definition of "*the*
> complex variance" that anyone else uses, but rather another form of
> punting: "There isn't a single complex variance to give you, but in the
> spirit of broadcasting, we'll compute the marginal variances of each
> dimension independently."
>
> Personally, I like 1 a lot.
> I'm hesitant to support 2 until I've seen an actual application of that
> definition. The references I have been given in the ticket comments are
> all early parts of books where the authors are laying out definitions
> without applications. Personally, it feels to me like the authors are
> just sticking in the absolute()'s ex post facto just so they can extend
> the definition they already have to complex numbers. I'm also not a fan
> of the expectation-centric treatments of random variables. IMO, the
> variance of an arbitrary RV isn't an especially important quantity.
> It's a parameter of a Gaussian distribution, and in this case, I see no
> reason to favor circular Gaussians in CC over general ones.
>
> But if someone shows me an actual application of the definition, I can
> amend my view.
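For concreteness, here is a quick numpy sketch (mine, not from the ticket) of what the naive formula and each of the three proposed options actually compute on the kind of data Neal described:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
# Complex samples: real and imaginary parts i.i.d., each with variance
# 1/2, so the total power E[|z - E[z]|^2] is 1.
z = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)

# The naive real-number formula: mean((z - mean(z)) ** 2).
# For a circular distribution this tends to complex 0, not the power.
naive = np.mean((z - z.mean()) ** 2)

# Option 1: full 2x2 real covariance matrix of (Re z, Im z).
C = np.cov(np.vstack([z.real, z.imag]))

# Option 2: mean(|z - mean(z)| ** 2); equals the trace of C
# (up to the ddof convention).
circ = np.mean(np.abs(z - z.mean()) ** 2)

# Option 3: marginal variances of the real and imaginary parts,
# i.e. the diagonal of C.
marg = np.array([z.real.var(), z.imag.var()])
```

Note that option 2's value is exactly `marg.sum()`: the squared modulus splits into the squared real and imaginary deviations, so the circular estimate is the trace of the covariance matrix.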
2 is what I expected. Suppose I have a complex signal x with additive
Gaussian noise n (i.i.d.; real and imaginary parts independent):

    y = x + n

Consider the estimate \hat{x} = y. What is the mean-squared error
E[|y - x|^2]? Definition 2 is consistent with that, and gets my vote.
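A minimal numpy check of that argument (the signal and noise model here are my own illustrative choices): with \hat{x} = y, the MSE E[|y - x|^2] is just the noise power, which is exactly what definition 2 reports for the noise.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# A unit-modulus complex signal with additive complex Gaussian noise;
# real and imaginary noise parts are independent, variance 1/2 each
# (total noise power 1).
x = np.exp(2j * np.pi * rng.random(n))
noise = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
y = x + noise

# MSE of the trivial estimate x_hat = y:
mse = np.mean(np.abs(y - x) ** 2)

# Definition 2 applied to the noise samples:
var2 = np.mean(np.abs(noise - noise.mean()) ** 2)
```

Both quantities come out near 1, matching the total noise power; the naive real-number formula applied to the same noise would come out near complex 0.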