Hi Oliver, Robert, Thanks to both of you for your speedy and enlightening responses. They really cleared things up.
Now my simulations seem to work as intended. All the best, Josh On 11/9/11 12:14 PM, Robert Kern wrote: > On Wed, Nov 9, 2011 at 16:40, Joshua Anthony Reyes > <[email protected]> wrote: >> Hi List, >> >> I'm new to Numpy and I'm a little confused about the behavior of >> numpy.random.multivariate_normal(). I'm not sure I'm passing the >> variances correctly. My goal is to sample from a bivariate normal, but >> the kooky behavior shows up when I sample from a univariate distribution. >> >> In short, the multivariate normal function doesn't seem to give me >> values in the appropriate ranges. Here's an example of what I mean. >> >> In[1]: from numpy.random import normal, multivariate_normal >> >> In [2]: normal(100, 100, 10) >> Out[2]: >> array([ 258.62344586, 70.16378815, 49.46826997, 49.58567724, >> 182.68652256, 226.67292034, 92.03801549, 18.2686146 , >> 94.09104313, 80.35697507]) > Here, you are asking for a Gaussian distribution with a mean of 100 > and a stdev of 100 (or equivalently, a variance of 10000). > >> The samples look about right to me. But then when I try to do the same >> using the multivariate_normal, the values it draws look too close to the >> mean. >> >> In [3]: multivariate_normal([100], [[100]], 10) > Here you are asking for a Gaussian with a mean of 100 and a variance > of 100 (or equivalently, a stdev of 10). Note the differences between > the two signatures: > > np.random.normal(mean, stdev) > np.random.multivariate_normal(mean, covariance) > >> Out[3]: >> array([[ 109.10083984], >> [ 97.43526716], >> [ 108.43923772], >> [ 97.87345947], >> [ 103.405562 ], >> [ 110.2963736 ], >> [ 103.96445585], >> [ 90.58168544], >> [ 91.20549222], >> [ 104.4051761 ]]) >> >> These values all fall within 10 units of the mean. >> >> In [4]: multivariate_normal([100], [[1000]], 10) >> Out[4]: >> array([[ 62.04304611], >> [ 123.29364557], >> [ 83.53840083], >> [ 64.67679778], >> [ 127.82433157], >> [ 11.3305656 ], >> [ 95.4101701 ], >> [ 126.53213908], >> [ 104.68868736], >> [ 32.45886112]]) >> >> In [5]: normal(100, 1000, 10) >> Out[5]: >> array([ 1052.93998938, -1254.12576419, 258.75390045, 688.32715327, >> -2.36543806, -1570.54842269, 472.90045029, 394.62908014, >> 137.10320437, 1741.85017871]) >> >> And just to exaggerate things a little more: >> >> In [6]: multivariate_normal([100], [[10000]], 10).T][0] >> Out[6]: >> array([ 274.45446694, 85.79359682, 245.03248721, 120.10912405, >> -34.76526896, 134.93446664, 47.6768889 , 93.34140984, >> 80.27244669, 229.64700591]) >> >> Whereas >> In [7]: normal(100, 10000, 10) >> Out[7]: >> array([ -554.68666687, 3724.59638363, -14873.55303901, -3111.22162495, >> -10813.66412901, 4688.81310356, -9510.92470735, -12689.02667559, >> -10379.01381925, -4534.60480031]) >> >> I'm shocked that I don't get some negative values in Out[4]. And Out[6] >> really ought to have some numbers in the thousands. >> >> I'd totally be willing to believe that I don't understand the >> multivariate normal and/or variance. Can someone tell me whether these >> numbers look sane? >> >> For the bivariate case I do something like this: >> >> means = [100, 100] >> variances = [100, 1000] >> Sx, Sy = variances >> sx, sy = map(sqrt, variances) >> cor = 0.7 >> cov = [[Sx, cor*sx*sy], [cor*sy*sx, Sy]] >> draws = 10 >> samples = multivariate_normal(means, cov, draws) >> >> As mentioned before, the samples are shockingly close to their means. > They look right to me. > > [~] > |19> means = [100, 100] > > [~] > |20> variances = [100, 1000] > > [~] > |21> Sx, Sy = variances > > [~] > |22> sx, sy = map(sqrt, variances) > > [~] > |23> cor = 0.7 > > [~] > |24> cov = np.array([[Sx, cor*sx*sy], [cor*sy*sx, Sy]]) > > [~] > |26> samples = np.random.multivariate_normal(means, cov, 10000) > > [~] > |27> cov > array([[ 100. , 221.35943621], > [ 221.35943621, 1000. ]]) > > [~] > |28> np.cov(samples.T) > array([[ 101.16844481, 222.00301056], > [ 222.00301056, 1001.58403922]]) > _______________________________________________ NumPy-Discussion mailing list [email protected] http://mail.scipy.org/mailman/listinfo/numpy-discussion
