[Rdkit-discuss] format.py in Pandas 0.20.1 Has Moved

2017-06-01 Thread Steven Wilkens
This may have already been addressed in the next release, but I wanted to
be sure. It appears that Pandas was refactored in the 0.20.1 release in a
way that breaks PandasTools:

Python 3.6.1 |Continuum Analytics, Inc.| (default, May 11 2017, 13:25:24)
[MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> print('pandas version: ' + pd.__version__)
pandas version: 0.20.1
>>> from rdkit.Chem import PandasTools
Traceback (most recent call last):
  File "C:\miniconda3\envs\rdkit-2017_03_1\lib\site-packages\rdkit\Chem\PandasTools.py", line 152, in <module>
    from pandas.formats import format as fmt
ImportError: cannot import name 'format'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\miniconda3\envs\rdkit-2017_03_1\lib\site-packages\rdkit\Chem\PandasTools.py", line 154, in <module>
    from pandas.core import format as fmt  # older versions
ImportError: cannot import name 'format'

Specifically, format.py was moved from pandas/formats to
pandas/io/formats. A quick workaround is to copy
pandas/io/formats/format.py into pandas/formats.
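
A less invasive alternative to copying files inside the pandas install might be to try the known module locations in order. The following is only a sketch of that idea, not the fix that went into RDKit: the helper name `import_first` and the tuple `PANDAS_FORMAT_MODULES` are made up for illustration, and the three paths are simply the ones mentioned in this message and in the traceback above.

```python
import importlib


def import_first(candidates):
    """Return the first module in `candidates` that imports cleanly.

    Hypothetical helper: tries each dotted module path in order and
    raises ImportError only if every one of them fails.
    """
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            errors.append("{}: {}".format(name, exc))
    raise ImportError("no candidate could be imported: " + "; ".join(errors))


# Locations taken from this thread (assumed, not checked against every
# pandas release):
PANDAS_FORMAT_MODULES = (
    "pandas.io.formats.format",  # where format.py lives in 0.20.1
    "pandas.formats.format",     # first location PandasTools tries
    "pandas.core.format",        # "older versions" per the traceback
)
```

With pandas installed, `fmt = import_first(PANDAS_FORMAT_MODULES)` would then bind whichever location exists, so the same code keeps working on either side of the 0.20.1 refactoring.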

Cheers,
Steve
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] MaxMinPicker Bug

2017-05-18 Thread Steven Wilkens
I've been using MaxMinPicker() to run a series of simulations where I
select several small subsets of molecules from a larger set and I've come
across some odd behavior. In summary, this is my algorithm:

1. select a small subset using MaxMinPicker.Pick()
2. remove that subset from the input set
3. repeat until the desired number of subsets is reached
4. store subsets, and restart the process to generate a new set of subsets
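
For anyone trying to reproduce this, the steps above can be sketched roughly as follows. This is not the original script: `greedy_maxmin` is a small pure-Python stand-in for a max-min picker (it is NOT RDKit's MaxMinPicker), and `draw_subsets`, `dist_between` and the other names are hypothetical. The point it illustrates is that the picker returns indices into the current, shrinking pool, so they have to be mapped back to items before the pool is rebuilt.

```python
import random


def greedy_maxmin(dist, n_items, n_pick, rng):
    """Stand-in max-min picker: returns n_pick indices in range(n_items)."""
    picked = [rng.randrange(n_items)]  # random starting point
    while len(picked) < n_pick:
        remaining = [i for i in range(n_items) if i not in picked]
        # Take the candidate whose nearest already-picked item is farthest.
        best = max(remaining, key=lambda i: min(dist(i, j) for j in picked))
        picked.append(best)
    return picked


def draw_subsets(items, dist_between, subset_size, n_subsets, seed=None):
    """Steps 1-3: pick a subset, remove it from the pool, repeat."""
    rng = random.Random(seed)
    pool = list(items)
    subsets = []
    for _ in range(n_subsets):
        if len(pool) < subset_size:
            raise ValueError("pool too small for another subset")
        # Indices returned here refer to the CURRENT pool, not to `items`.
        picks = greedy_maxmin(
            lambda i, j: dist_between(pool[i], pool[j]),
            len(pool), subset_size, rng)
        subsets.append([pool[i] for i in picks])
        chosen = set(picks)
        pool = [x for i, x in enumerate(pool) if i not in chosen]
    return subsets
```

Step 4 would then be an outer loop that calls `draw_subsets` once per simulation and stores the result.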

The process seems to work fine for a few simulations. However, at some
unpredictable point MaxMinPicker.Pick() returns an index that is one
position past the end of the input array. After debugging the behavior, I
added error checking to detect this situation. This fix works fine on
Linux. However, it does not work on Windows: the error condition is
detected, but Python still crashes.
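
The kind of error check described here might look like the minimal sketch below. The function name `validate_picks` is hypothetical and this is not the code from the original script. It only guards the Python-side indexing that follows a pick.

```python
def validate_picks(picks, pool_size):
    """Return picks as a list, or raise if any index is out of range.

    Hypothetical guard: a valid index satisfies 0 <= i < pool_size, so
    an index equal to pool_size (one past the end) is rejected.
    """
    picks = list(picks)
    bad = [i for i in picks if not 0 <= i < pool_size]
    if bad:
        raise IndexError(
            "picker returned out-of-range indices {} for a pool of size {}"
            .format(bad, pool_size))
    return picks
```

Calling it right after each pick, before any indexing into the pool, lets the simulation discard or retry the offending run instead of reading past the end of the array.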

The most obvious source of the bug is that I'm making an error when I
construct the input matrix. However, I've gone over my code several times
and I'm quite sure I'm doing it right. Also, successful simulations produce
subsets that are diverse by the desired metric. Unfortunately, the random
nature of the bug makes it difficult to pinpoint the root cause. My current
hunch is that MaxMinPicker has some static variables that are hanging
around from one run to the next. If that is the case, one would only
encounter the bug when calling the Pick() method repeatedly within a single
script, as I am doing (which may be why no one has encountered this bug
yet).

Any help would be most appreciated. Thanks!
Regards,
Steve