Hi Oliver, Robert,

Thanks to both of you for your speedy and enlightening responses. They 
really cleared things up.

Now my simulations seem to work as intended.

All the best,
Josh

On 11/9/11 12:14 PM, Robert Kern wrote:
> On Wed, Nov 9, 2011 at 16:40, Joshua Anthony Reyes
> <[email protected]>  wrote:
>> Hi List,
>>
>> I'm new to Numpy and I'm a little confused about the behavior of
>> numpy.random.multivariate_normal(). I'm not sure I'm passing the
>> variances correctly. My goal is to sample from a bivariate normal, but
>> the kooky behavior shows up when I sample from a univariate distribution.
>>
>> In short, the multivariate normal function doesn't seem to give me
>> values in the appropriate ranges. Here's an example of what I mean.
>>
>> In[1]: from numpy.random import normal, multivariate_normal
>>
>> In [2]: normal(100, 100, 10)
>> Out[2]:
>> array([ 258.62344586,   70.16378815,   49.46826997,   49.58567724,
>>          182.68652256,  226.67292034,   92.03801549,   18.2686146 ,
>>           94.09104313,   80.35697507])
> Here, you are asking for a Gaussian distribution with a mean of 100
> and a stdev of 100 (or equivalently, a variance of 10000).
>
>> The samples look about right to me. But then when I try to do the same
>> using the multivariate_normal, the values it draws look too close to the
>> mean.
>>
>> In [3]: multivariate_normal([100], [[100]], 10)
> Here you are asking for a Gaussian with a mean of 100 and a variance
> of 100 (or equivalently, a stdev of 10). Note the differences between
> the two signatures:
>
>    np.random.normal(mean, stdev)
>    np.random.multivariate_normal(mean, covariance)
>
>> Out[3]:
>> array([[ 109.10083984],
>>         [  97.43526716],
>>         [ 108.43923772],
>>         [ 97.87345947],
>>         [ 103.405562  ],
>>         [ 110.2963736 ],
>>         [ 103.96445585],
>>         [ 90.58168544],
>>         [  91.20549222],
>>         [ 104.4051761 ]])
>>
>> These values all fall within 10 units of the mean.
>>
>> In [4]: multivariate_normal([100], [[1000]], 10)
>> Out[4]:
>> array([[  62.04304611],
>>         [ 123.29364557],
>>         [ 83.53840083],
>>         [ 64.67679778],
>>         [ 127.82433157],
>>         [  11.3305656 ],
>>         [  95.4101701 ],
>>         [ 126.53213908],
>>         [ 104.68868736],
>>         [  32.45886112]])
>>
>> In [5]: normal(100, 1000, 10)
>> Out[5]:
>> array([ 1052.93998938, -1254.12576419,   258.75390045,   688.32715327,
>>            -2.36543806, -1570.54842269,   472.90045029,   394.62908014,
>>           137.10320437,  1741.85017871])
>>
>> And just to exaggerate things a little more:
>>
>> In [6]: multivariate_normal([100], [[10000]], 10).T][0]
>> Out[6]:
>> array([ 274.45446694,   85.79359682,  245.03248721,  120.10912405,
>>          -34.76526896,  134.93446664,   47.6768889 ,   93.34140984,
>>           80.27244669,  229.64700591])
>>
>> Whereas
>> In [7]: normal(100, 10000, 10)
>> Out[7]:
>> array([  -554.68666687,   3724.59638363, -14873.55303901,  -3111.22162495,
>>         -10813.66412901,   4688.81310356,  -9510.92470735, -12689.02667559,
>>         -10379.01381925,  -4534.60480031])
>>
>> I'm shocked that I don't get some negative values in Out[4]. And Out[6]
>> really ought to have some numbers in the thousands.
>>
>> I'd totally be willing to believe that I don't understand the
>> multivariate normal and/or variance. Can someone tell me whether these
>> numbers look sane?
>>
>> For the bivariate case I do something like this:
>>
>> means = [100, 100]
>> variances = [100, 1000]
>> Sx, Sy = variances
>> sx, sy = map(sqrt, variances)
>> cor = 0.7
>> cov = [[Sx, cor*sx*sy], [cor*sy*sx, Sy]]
>> draws = 10
>> samples = multivariate_normal(means, cov, draws)
>>
>> As mentioned before, the samples are shockingly close to their means.
> They look right to me.
>
> [~]
> |19>  means = [100, 100]
>
> [~]
> |20>  variances = [100, 1000]
>
> [~]
> |21>  Sx, Sy = variances
>
> [~]
> |22>  sx, sy = map(sqrt, variances)
>
> [~]
> |23>  cor = 0.7
>
> [~]
> |24>  cov = np.array([[Sx, cor*sx*sy], [cor*sy*sx, Sy]])
>
> [~]
> |26>  samples = np.random.multivariate_normal(means, cov, 10000)
>
> [~]
> |27>  cov
> array([[  100.        ,   221.35943621],
>         [  221.35943621,  1000.        ]])
>
> [~]
> |28>  np.cov(samples.T)
> array([[  101.16844481,   222.00301056],
>         [  222.00301056,  1001.58403922]])
>

_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Reply via email to