# Re: [Numpy-discussion] Question about numpy.random.choice with probabilities

```2017-01-18 9:35 GMT+01:00 Nadav Har'El <n...@scylladb.com>:

>
> On Wed, Jan 18, 2017 at 1:58 AM, aleba...@gmail.com <aleba...@gmail.com>
> wrote:
>
>>
>>
>>
>>>
>>> On Tue, Jan 17, 2017 at 7:18 PM, aleba...@gmail.com <aleba...@gmail.com>
>>> wrote:
>>>
>>>>
>>>> I may be wrong, but I think that the result of the current
>>>> implementation is actually the expected one.
>>>> Using your example: probabilities for items 1, 2 and 3 are: 0.2, 0.4 and
>>>> 0.4
>>>>
>>>> P([1,2]) = P([2] | 1st=[1]) P([1]) + P([1] | 1st=[2]) P([2])
>>>>
>>>
>>> Yes, this formula does fit well with the actual algorithm in the code.
>>> But, my question is *why* we want this formula to be correct:
>>>
>>> Just a note: this formula is correct and it is one of the fundamental
>> laws of statistics: https://en.wikipedia.org/wiki/Law_of_total_probability
>> + https://en.wikipedia.org/wiki/Bayes%27_theorem
>>
>
> Hi,
>
> Yes, of course the formula is correct, but it doesn't mean we're not
> applying it in the wrong context.
>
> I'll be honest here: I came to numpy.random.choice after I actually coded
> a similar algorithm (with the same results) myself, because like you I
> thought this was the "obvious" and correct algorithm. Only then did I realize
> that its output doesn't actually produce the desired probabilities
> specified by the user - even in the cases where that is possible. And I
> started wondering if existing libraries - like numpy - do this differently.
> And it turns out, numpy does it (basically) in the same way as my algorithm.
>
>
>>
>> Thus, the result we get from random.choice IMHO definitely makes sense.
>>
>
> Let's look at what the user asked of this function, and what it returns:
>
> User asks: please give me random pairs of the three items, where item 1
> has probability 0.2, item 2 has 0.4, and 3 has 0.4.
>
> Function returns: random pairs, where if you draw many random
> results (as in the law of large numbers) and look at the items they
> contain, item 1 is 0.2333 of the items, item 2 is 0.38333, and item 3 is
> 0.38333.
> These are not (quite) the probabilities the user asked for...
>
> Can you explain a sense in which the user's requested probabilities (0.2,
> 0.4, 0.4) are actually adhered to in the results which random.choice returns?
>```
```
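The figures quoted above (0.2333 and 0.38333) can be reproduced exactly by enumerating every ordered pair. The following is only a sketch: it assumes the draw-one-item-then-renormalise model that the formula in the thread describes, not numpy's actual source, and the function names `sequential_pair_probs` and `item_shares` are made up for this illustration. Items 1, 2 and 3 are indexed 0, 1 and 2.

```python
from fractions import Fraction
from itertools import permutations


def sequential_pair_probs(p):
    """Probability of each unordered pair when two items are drawn one
    after the other and the remaining weights are renormalised."""
    probs = {}
    for i, j in permutations(range(len(p)), 2):
        # P(1st = i) * P(2nd = j | 1st = i), the law of total probability
        pr = p[i] * p[j] / (1 - p[i])
        key = frozenset((i, j))
        probs[key] = probs.get(key, 0) + pr
    return probs


def item_shares(pair_probs, n_items, size=2):
    """Long-run share of each item among all items returned."""
    shares = [Fraction(0)] * n_items
    for pair, pr in pair_probs.items():
        for i in pair:
            shares[i] += pr / size
    return shares


p = [Fraction(1, 5), Fraction(2, 5), Fraction(2, 5)]  # 0.2, 0.4, 0.4
pair_probs = sequential_pair_probs(p)
shares = item_shares(pair_probs, len(p))
print([round(float(s), 5) for s in shares])  # [0.23333, 0.38333, 0.38333]
```

Under this model the pair {2, 3} comes up with probability 8/15, more than twice as often as either pair containing item 1 (7/30 each), which is why item 1 ends up with a share above the requested 0.2.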
I think that the question the user is asking by specifying p is a slightly
different one:
"please give me random pairs of items extracted from a
population of 3 items where item 1 has a probability of being extracted of
0.2, item 2 has 0.4, and item 3 has 0.4. Also, please remove items once
extracted."
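
To make this reading concrete, here is a small simulation using only the standard library. It is a sketch of the extraction procedure as worded above, not numpy's implementation; the helper `draw_without_replacement` is a hypothetical name. The corresponding numpy call would be `numpy.random.choice(3, size=2, replace=False, p=[0.2, 0.4, 0.4])`.

```python
import random
from collections import Counter


def draw_without_replacement(weights, size, rng):
    """Extract `size` distinct indices one at a time, removing each
    extracted item from the population before the next extraction."""
    remaining = list(range(len(weights)))
    drawn = []
    for _ in range(size):
        # random.choices treats the weights as relative, so the weights of
        # the remaining items are effectively renormalised at each step.
        pick = rng.choices(remaining, weights=[weights[i] for i in remaining])[0]
        drawn.append(pick)
        remaining.remove(pick)
    return drawn


rng = random.Random(0)  # fixed seed so the run is repeatable
weights = [0.2, 0.4, 0.4]
n_trials = 100_000

first_counts = Counter()  # which item was extracted first
all_counts = Counter()    # every item that appears in a returned pair
for _ in range(n_trials):
    pair = draw_without_replacement(weights, 2, rng)
    first_counts[pair[0]] += 1
    all_counts.update(pair)

first_freq = [first_counts[i] / n_trials for i in range(3)]
overall_share = [all_counts[i] / (2 * n_trials) for i in range(3)]
print("first extraction:", first_freq)
print("share of all items:", overall_share)
```

In this sketch the frequencies of the first extraction come out close to 0.2, 0.4, 0.4, while the shares over all returned items come out close to 0.233, 0.383, 0.383. So p is honoured as the probability of a single extraction from the full population, not as the composition of the returned pairs.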

> Thanks,
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
