Thank you, Sereina. I understand the importance of adding hydrogens to get reasonable 3D coordinates, but the situation may not be that simple.
1. Adding hydrogens is only required when the template coordinates are supplied from an external file. If the template coordinates are generated by RDKit embedding, constrained embedding works without explicit hydrogens.
2. I found an opposite example where adding hydrogens breaks constrained embedding when custom template coordinates are used. Again, if I generate the template coordinates with RDKit, everything works without adding Hs.
These observations suggest that there is some issue with using custom coordinates for constrained embedding.
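One way to test the hypothesis about custom coordinates is to check whether the template's interatomic distances fit inside the distance bounds that the embedder derives for the child molecule. The following is only a minimal sketch under stated assumptions: `bounds_violations` is a hypothetical helper (not part of RDKit), the 0.05 tolerance is arbitrary, the toy numbers are invented for illustration, and the `bounds` layout (upper bounds above the diagonal, lower bounds below) is assumed to match what `rdDistGeom.GetMoleculeBoundsMatrix` returns.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def bounds_violations(
    coords: Dict[int, Point],
    bounds: Sequence[Sequence[float]],
    tol: float = 0.05,
) -> List[Tuple[int, int, float, float, float]]:
    """Return (i, j, distance, lower, upper) for each pair of fixed atoms
    whose distance lies outside [lower - tol, upper + tol].

    coords maps atom indices of the child molecule to fixed 3D positions.
    bounds is a square matrix; ASSUMPTION: bounds[i][j] with i < j is the
    upper bound and bounds[j][i] is the lower bound.
    """
    bad = []
    idxs = sorted(coords)
    for pos, i in enumerate(idxs):
        for j in idxs[pos + 1:]:
            d = math.dist(coords[i], coords[j])
            upper = bounds[i][j]  # above the diagonal
            lower = bounds[j][i]  # below the diagonal
            if d < lower - tol or d > upper + tol:
                bad.append((i, j, d, lower, upper))
    return bad


# Toy data (invented): three atoms placed on a straight line, while the
# bounds only allow a bent 0-1-2 arrangement.
toy_bounds = [
    [0.0, 1.6, 2.7],
    [1.4, 0.0, 1.6],
    [2.3, 1.4, 0.0],
]
toy_coords = {0: (0.0, 0.0, 0.0), 1: (1.5, 0.0, 0.0), 2: (3.0, 0.0, 0.0)}

for i, j, d, lo, hi in bounds_violations(toy_coords, toy_bounds):
    print(f"atoms {i}-{j}: d={d:.2f}, allowed [{lo:.2f}, {hi:.2f}]")
```

With RDKit, `coords` would presumably be built from the template conformer's atom positions, keyed by the child's matched atom indices from `GetSubstructMatch`, and `bounds` would come from the child molecule. Comparing the result for the child with and without `AddHs` might show whether the coordinates read from the mol file conflict with the bounds in one case but not the other.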
The code and output are below.

Code:

    from rdkit import Chem
    from rdkit.Chem import AllChem

    # (template mol file, template SMILES, child SMILES)
    data = [
        ('1.mol', 'C[C@@H]1CCCCC1=O', 'C[C@@H]1CC[C@H](O)CC1=O'),
        ('2.mol', 'CCCCCCCC[C@@H](CCC)NC(=O)c1ccc(F)cc1',
         'CCCC[C@H](CCC[C@@H](CCC)NC(=O)c1ccc(F)cc1)NC(=O)c1ccco1'),
    ]

    for i, (mol_fname, smi_template, smi_child) in enumerate(data):
        print('iteration', i)

        # Template coordinates from the mol file, child without explicit Hs
        mode = 'read template mol file, no AddHs'
        print(mode)
        mol_template = Chem.MolFromMolFile(mol_fname)
        mol_child = Chem.MolFromSmiles(smi_child)
        try:
            mol = AllChem.ConstrainedEmbed(mol_child, mol_template)
            print(mol.GetProp('EmbedRMS'))
        except ValueError as e:
            print(e)

        # Template coordinates from the mol file, child with explicit Hs
        mode = 'read template mol file, AddHs'
        print(mode)
        mol_template = Chem.MolFromMolFile(mol_fname)
        mol_child = Chem.MolFromSmiles(smi_child)
        try:
            mol = AllChem.ConstrainedEmbed(Chem.AddHs(mol_child), mol_template)
            print(mol.GetProp('EmbedRMS'))
        except ValueError as e:
            print(e)

        # Template coordinates generated by RDKit, child without explicit Hs
        mode = 'embed template mol in rdkit, no AddHs'
        print(mode)
        mol_template = Chem.MolFromSmiles(smi_template)
        AllChem.EmbedMolecule(mol_template)
        mol_child = Chem.MolFromSmiles(smi_child)
        try:
            mol = AllChem.ConstrainedEmbed(mol_child, mol_template)
            print(mol.GetProp('EmbedRMS'))
        except ValueError as e:
            print(e)

Output:

    iteration 0
    read template mol file, no AddHs
    Could not embed molecule.
    read template mol file, AddHs
    0.05014807519735495
    embed template mol in rdkit, no AddHs
    0.12358989886023371
    iteration 1
    read template mol file, no AddHs
    0.057937898735270194
    read template mol file, AddHs
    Could not embed molecule.    # <-- here rdkit spends a lot of time but fails
    embed template mol in rdkit, no AddHs
    0.1012757033705761

Pavel.

On 07/07/2020 21:41, Sunhwan Jo wrote:
Makes sense :)

On Jul 7, 2020, at 12:35 PM, Sereina Riniker <sereina.rini...@gmail.com> wrote:

Dear Pavel and Sunhwan,

Please note that hydrogens should always be added for the embedding algorithm to work properly (i.e. it’s not a workaround but what should be done). See also the section “Working with 3D Molecules” in https://www.rdkit.org/docs/GettingStartedInPython.html

Best regards,
Sereina

On 7 Jul 2020, at 21:26, Sunhwan Jo <sunhw...@gmail.com> wrote:

The reason the constrained embed didn’t work is that the molecule simply can’t be embedded using RDKit’s algorithm.

    In [25]: mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')
    In [26]: AllChem.EmbedMolecule(mol_child)
    Out[26]: -1

See more discussion here: https://github.com/rdkit/rdkit/issues/2996

The SMILES you posted looks valid to me and doesn’t look that complicated, but somehow RDKit’s algorithm tripped up and couldn’t finish embedding without some help. Hopefully someone with more in-depth insight can help here. Anyway, as a workaround, adding Hs seems to do the trick:

    In [39]: mol = AllChem.AddHs(mol_child)
    In [40]: AllChem.EmbedMolecule(mol)
    Out[40]: 0  # worked
    In [41]: AllChem.ConstrainedEmbed(mol, mol_parent)
    Out[41]: <rdkit.Chem.rdchem.Mol at 0x7fe8000f6f80>  # also worked

Sunhwan

On Jul 7, 2020, at 12:36 AM, Pavel Polishchuk <pavel_polishc...@ukr.net> wrote:

Hi all,

I have an issue with ConstrainedEmbed and I cannot figure out what exactly causes it. I have a molecule, C[C@@H]1CCCCC1=O, with 3D coordinates in the 1.mol file (attached). I want to generate coordinates for another structure with this core: C[C@@H]1CC[C@H](O)CC1=O.

This is the usual way, which causes an embedding issue and the corresponding error.

    mol_parent = Chem.MolFromMolFile('1.mol')
    mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')
    try:
        mol = AllChem.ConstrainedEmbed(mol_child, mol_parent)
    except ValueError as e:
        print(e)

If I add explicit hydrogens, the issue disappears.

    mol_parent = Chem.MolFromMolFile('1.mol')
    mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')
    mol = AllChem.ConstrainedEmbed(Chem.AddHs(mol_child), mol_parent)

If I do not use pre-defined coordinates, everything works well.

    mol_parent = Chem.MolFromSmiles('C[C@@H]1CCCCC1=O')
    AllChem.EmbedMolecule(mol_parent)
    mol_child = Chem.MolFromSmiles('C[C@@H]1CC[C@H](O)CC1=O')
    mol = AllChem.ConstrainedEmbed(mol_child, mol_parent)

Do the ugly coordinates in the 1.mol file cause the embedding issue? Or is the issue caused by some implicit properties of the molecule? How can this be solved properly?

Kind regards,
Pavel.

<1.mol>

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
1.mol
Description: MOL mdl chemical test
2.mol
Description: MOL mdl chemical test