Hi JP,
On Wed, Jun 19, 2013 at 7:07 PM, JP jeanpaul.ebe...@inhibox.com wrote:
Dearest RDKitters,
I am trying to help a friend of mine, with an RDKit issue (using the
latest RDKit) and I am surprised by some output we are getting. Perhaps
someone here has an explanation.
import rdkit
from rdkit import Chem
from rdkit.Chem import AllChem
core = Chem.MolFromSmiles('c1cncs1') # first molecule
print AllChem.EmbedMolecule(core)
AllChem.UFFOptimizeMolecule(core)
Chem.MolToMolBlock(core) # we have some coordinates
print
mol = Chem.MolFromSmiles('C(=O)(O)c1cncs1')
AllChem.ConstrainedEmbed(mol, core, randomseed=123)
Chem.MolToMolBlock(mol)
We get the following error:
---
ValueErrorTraceback (most recent call last)
/home/jp/ipython-input-30-f4834f0dae19 in module()
11 mol = Chem.MolFromSmiles('C(=O)(O)c1cncs1')
12
--- 13 AllChem.ConstrainedEmbed(mol, core, randomseed=123)
14 Chem.MolToMolBlock(mol)
/opt/RDKit_2012_12_1/rdkit/Chem/AllChem.pyc in ConstrainedEmbed(mol, core,
useTethers, coreConfId, randomseed)
295 ci = EmbedMolecule(mol,coordMap=coordMap,randomSeed=randomseed)
296 if ci0:
-- 297 raise ValueError,'Could not embed molecule.'
298
299 algMap=[(j,i) for i,j in enumerate(match)]
ValueError: Could not embed molecule.
0
*(0) Why does this happen?* We have coordinates in core and the match
between core and mol is obvious.
I can reproduce the problem, but I don't have a quick answer as to what's
causing it.Something is going wrong in the construction/smoothing of the
bounds matrix with the constraints. I'm sure it's fixable, but this isn't
one of those easy ones.
Do you mind doing a bug report on github for this?
(1) If we change the aromatic s to o, the code works and we get coorindates
for mol - but these coordinates do not match exactly core. Why is this?
Chem.MolToMolBlock(mol)
RDKit 3D
8 8 0 0 0 0 0 0 0 0999 V2000
-2.49270.70250.1023 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.82501.88750.5227 O 0 0 0 0 0 0 0 0 0 0 0 0
-3.5158 -0.2123 -0.3041 O 0 0 0 0 0 0 0 0 0 0 0 0
-1.08220.31300.0474 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.6171 -0.9082 -0.3781 C 0 0 0 0 0 0 0 0 0 0 0 0
0.7280 -0.8842 -0.2830 N 0 0 0 0 0 0 0 0 0 0 0 0
1.03220.34820.1959 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.06081.13120.4178 O 0 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0
1 3 1 0
1 4 1 0
4 5 2 0
5 6 1 0
6 7 2 0
7 8 1 0
8 4 1 0
M END
Chem.MolToMolBlock(core) # we have some coordinates
RDKit 3D
5 5 0 0 0 0 0 0 0 0999 V2000
-1.08800.31450.0476 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.6149 -0.9044 -0.3765 C 0 0 0 0 0 0 0 0 0 0 0 0
0.7299 -0.8840 -0.2828 N 0 0 0 0 0 0 0 0 0 0 0 0
1.03360.34700.1955 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.06061.12690.4162 O 0 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0
2 3 1 0
3 4 2 0
4 5 1 0
5 1 1 0
M END
Can you help explain the mystery please?
Sure.
First some expectation setting: the coordinates are never going to be
exactly the same, but you can get close.
The trick is to use the useTethers argument to ConstrainedEmbed. Here's
what the docstring says:
- useTethers: (optional) if True, the final conformation will be
optimized subject to a series of extra forces that pull the
matching atoms to the positions of the core atoms. Otherwise
simple distance constraints based on the core atoms will be
used in the optimization.
And here's what it does in your case (using Os instead of Ss, of course):
In [20]: AllChem.ConstrainedEmbed(mol, core, randomseed=123,useTethers=True)
Out[20]: rdkit.Chem.rdchem.Mol at 0x106fa1360
In [21]: print Chem.MolToMolBlock(core)
RDKit 3D
5 5 0 0 0 0 0 0 0 0999 V2000
-0.9106 -0.6751 -0.0067 C 0 0 0 0 0 0 0 0 0 0 0 0
0.4012 -1.08450.0287 C 0 0 0 0 0 0 0 0 0 0 0 0
1.18040.01570.0247 N 0 0 0 0 0 0 0 0 0 0 0 0
0.31861.0608 -0.0130 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.98950.6830 -0.0337 O 0 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0
2 3 1 0
3 4 2 0
4 5 1 0
5 1 1 0
M END
In [22]: print Chem.MolToMolBlock(mol)
RDKit 3D
8 8 0 0 0 0 0 0 0 0999 V2000
-2.0709 -1.5584 -0.0148 C 0 0 0 0 0 0 0 0 0 0 0 0
-3.2705 -1.0572 -0.0497 O 0 0 0 0 0 0 0 0 0 0 0 0
-1.9028 -2.97950.0154 O 0 0 0 0 0 0 0 0 0 0 0 0
-0.9059 -0.6714 -0.0067 C 0 0