Re: [Rdkit-discuss] MaxMinPicker Bug

2017-05-18 Thread Greg Landrum
Hi Steve,

That is indeed a bug. Thanks for the detailed report!

Here's a very small reproducible example that demonstrates it:

def pick2(n=1000,m=10,seed=2748):
def func(i, j):
  assert(i [...]

https://github.com/rdkit/rdkit/issues/1421

-greg





On Thu, May 18, 2017 at 8:59 PM, Steven Wilkens wrote:

> I've been using MaxMinPicker() to run a series of simulations where I
> select several small subsets of molecules from a larger set and I've come
> across some odd behavior. In summary, this is my algorithm:
>
> 1. select a small subset using MaxMinPicker.Pick()
> 2. remove that subset from the input set
> 3. repeat until the desired number of subsets is reached
> 4. store subsets, and restart the process to generate a new set of subsets
>
> The process seems to work fine for a few simulations. Eventually, however,
> and seemingly at random, MaxMinPicker.Pick() returns an index that is one
> position past the end of the input array. After debugging the behavior, I
> added error checking to detect this situation. This fix works fine on
> Linux, but not on Windows: the error condition is detected, but Python
> still crashes.
>
> The most obvious source of the bug is that I'm making an error when I
> construct the input matrix. However, I've gone over my code several times
> and I'm quite sure I'm doing it right. Also, successful simulations produce
> subsets that are diverse by the desired metric. Unfortunately, the random
> nature of the bug makes it difficult to pinpoint the root cause. My current
> hunch is that MaxMinPicker has some static variables that are hanging
> around from one run to the next. If that is the case, one would only
> encounter the bug if one were to repeatedly call the Pick() method within a
> single script like I am doing (maybe that is why no one has encountered
> this bug yet?)
>
> Any help would be most appreciated. Thanks!
> Regards,
> Steve
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] MaxMinPicker Bug

2017-05-18 Thread Steven Wilkens
I've been using MaxMinPicker() to run a series of simulations where I
select several small subsets of molecules from a larger set and I've come
across some odd behavior. In summary, this is my algorithm:

1. select a small subset using MaxMinPicker.Pick()
2. remove that subset from the input set
3. repeat until the desired number of subsets is reached
4. store subsets, and restart the process to generate a new set of subsets
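
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the original script: maxmin_pick is a plain-Python greedy stand-in for MaxMinPicker (it is not RDKit's implementation), the molecules are replaced by numbers with abs(a - b) as the distance, and the names pick_disjoint_subsets, points and remaining are hypothetical. The part that matters is the bookkeeping: the picker returns positions in the shrinking pool, and each position is range-checked before it is mapped back to an original index.

```python
import random


def maxmin_pick(dist, pool_size, pick_size, seed=0):
    """Greedy max-min selection over 0..pool_size-1 (stand-in, not RDKit)."""
    if pick_size > pool_size:
        raise ValueError("pick_size exceeds pool_size")
    rng = random.Random(seed)
    first = rng.randrange(pool_size)
    picks, picked = [first], {first}
    # min_d[i] = distance from candidate i to its nearest already-picked item
    min_d = [dist(i, first) for i in range(pool_size)]
    while len(picks) < pick_size:
        # take the candidate that is farthest from everything picked so far
        nxt = max((i for i in range(pool_size) if i not in picked),
                  key=lambda i: min_d[i])
        picks.append(nxt)
        picked.add(nxt)
        for i in range(pool_size):
            min_d[i] = min(min_d[i], dist(i, nxt))
    return picks


def pick_disjoint_subsets(points, n_subsets, subset_size, seed=0):
    """Steps 1-3: pick a subset, remove it from the pool, repeat."""
    remaining = list(range(len(points)))  # original indices still in the pool
    subsets = []
    for k in range(n_subsets):
        def dist(i, j):
            # i and j are positions in the current pool, not original indices
            return abs(points[remaining[i]] - points[remaining[j]])

        local = maxmin_pick(dist, len(remaining), subset_size, seed=seed + k)
        for idx in local:
            # guard against an index one past the end of the pool
            if not 0 <= idx < len(remaining):
                raise IndexError(f"pick {idx} outside pool of {len(remaining)}")
        chosen = [remaining[i] for i in local]
        subsets.append(chosen)
        chosen_set = set(chosen)
        remaining = [r for r in remaining if r not in chosen_set]
    return subsets
```

Step 4 would then amount to calling pick_disjoint_subsets again with a different seed and storing each result.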

The process seems to work fine for a few simulations. Eventually, however,
and seemingly at random, MaxMinPicker.Pick() returns an index that is one
position past the end of the input array. After debugging the behavior, I
added error checking to detect this situation. This fix works fine on
Linux, but not on Windows: the error condition is detected, but Python
still crashes.

The most obvious source of the bug is that I'm making an error when I
construct the input matrix. However, I've gone over my code several times
and I'm quite sure I'm doing it right. Also, successful simulations produce
subsets that are diverse by the desired metric. Unfortunately, the random
nature of the bug makes it difficult to pinpoint the root cause. My current
hunch is that MaxMinPicker has some static variables that are hanging
around from one run to the next. If that is the case, one would only
encounter the bug if one were to repeatedly call the Pick() method within a
single script like I am doing (maybe that is why no one has encountered
this bug yet?)
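
One possible way to take the randomness out of the hunt is to drive the picker with recorded seeds and stop at the first seed that yields an out-of-range index. The sketch below assumes the picker can be wrapped in a function pick(pool_size, pick_size, seed) and that it honors the seed. first_bad_seed and flaky_pick are hypothetical names; flaky_pick is a deliberately broken stand-in that imitates the off-by-one symptom and has nothing to do with RDKit's code.

```python
import random


def first_bad_seed(pick, pool_size, pick_size, seeds):
    """Return (seed, picks) for the first seed whose picks leave
    the range 0..pool_size-1, or None if every seed behaves."""
    for seed in seeds:
        picks = list(pick(pool_size, pick_size, seed))
        if any(not 0 <= i < pool_size for i in picks):
            return seed, picks
    return None


def flaky_pick(pool_size, pick_size, seed):
    # Hypothetical stand-in, NOT RDKit: samples from 0..pool_size inclusive,
    # so it sometimes returns the one-past-the-end index pool_size.
    rng = random.Random(seed)
    return rng.sample(range(pool_size + 1), pick_size)


result = first_bad_seed(flaky_pick, pool_size=10, pick_size=5, seeds=range(200))
if result is not None:
    seed, picks = result
    print("first failing seed:", seed, "picks:", picks)
```

A seed found this way, together with the pool and pick sizes, makes the failure repeatable in a fresh process, which would also help separate state carried over between calls from a problem inside a single call.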

Any help would be most appreciated. Thanks!
Regards,
Steve