On Apr 16, 2018, at 05:37, Patrick Walters <wpwalt...@gmail.com> wrote:
> 
> Thanks Andrew, the SMILES approach seemed to have quite a few edge cases so I 
> wrote something to work directly on a molecule. 

That's the approach I started with, until I figured out that it doesn't 
preserve chirality.

If I change the end of your code to:

==========
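from rdkit import Chem  # needed for Chem.CanonSmiles below
from mmpdblib import smiles_syntax

# Weld by rewriting the SMILES text: each labeled wildcard becomes a ring
# closure, and the two fragments are joined as one dot-disconnected SMILES.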
def weld_dalke(core, r_groups):
    s1 = smiles_syntax.convert_labeled_wildcards_to_closures(core)
    s2 = smiles_syntax.convert_labeled_wildcards_to_closures(r_groups)
    return Chem.CanonSmiles(s1+"."+s2)

if __name__ == "__main__":
    mol_to_weld = Chem.MolFromSmiles(
        "[*:1][C@](F)(Cl)O.N[*:1]")
    welded_mol = weld_r_groups(mol_to_weld)
    print("Expected  :", Chem.CanonSmiles("N[C@](F)(Cl)O"))
    print("Direct    :", Chem.MolToSmiles(welded_mol, isomericSmiles=True))
    print("Via SMILES:", weld_dalke("[*:1][C@](F)(Cl)O", "N[*:1]"))
==========

These should print identical SMILES strings, but instead give:

Expected  : N[C@](O)(F)Cl
Direct    : N[C@@](O)(F)Cl
Via SMILES: N[C@](O)(F)Cl


If chirality preservation isn't a concern, then there's no problem.
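
As a rough illustration of why working on the SMILES text keeps the
stereocenter intact, here is a minimal sketch of the wildcard-to-closure idea.
It is not mmpdb's implementation: the names wildcard_to_closure and
weld_sketch, and the choice of %91..%99 as closure numbers, are assumptions
made for this example. It handles only one attachment point per fragment,
placed either at the very start or directly after the last atom, and it
refuses anything else rather than guess. The point is that the closure takes
the wildcard's position in the neighbor order, so the @/@@ mark keeps its
meaning.

```python
import re

# One SMILES atom token: a bracket atom or an organic-subset atom.
_ATOM = r"(\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops])"
_LEADING = re.compile(r"^\[\*:(\d+)\]" + _ATOM)   # e.g. "[*:1][C@](F)(Cl)O"
_TRAILING = re.compile(_ATOM + r"\[\*:(\d+)\]$")  # e.g. "N[*:1]"


def _closure(label):
    # Assumed numbering for this sketch: label n becomes ring closure %9n.
    if not 1 <= label <= 9:
        raise ValueError("only labels 1-9 are handled in this sketch")
    return "%9" + str(label)


def wildcard_to_closure(smiles):
    """Replace a single labeled wildcard with a ring-closure number."""
    m = _LEADING.match(smiles)
    if m:
        label, atom = int(m.group(1)), m.group(2)
        # With an implicit H on a chiral bracket atom, moving the wildcard
        # would change the neighbor order, so this sketch refuses that case.
        if "@" in atom and "H" in atom[atom.rindex("@"):]:
            raise ValueError("chiral atom with implicit H is not handled")
        # The closure goes right after the atom, so it is still the atom's
        # first neighbor, just as the leading wildcard was.
        result = atom + _closure(label) + smiles[m.end():]
    else:
        m = _TRAILING.search(smiles)
        if m is None:
            raise ValueError("unsupported wildcard placement: %r" % smiles)
        atom, label = m.group(1), int(m.group(2))
        result = smiles[:m.start()] + atom + _closure(label)
    if "[*" in result:
        raise ValueError("only one attachment point per fragment is handled")
    return result


def weld_sketch(core, r_group):
    """Join two fragments into one dot-disconnected SMILES string."""
    return wildcard_to_closure(core) + "." + wildcard_to_closure(r_group)


print(weld_sketch("[*:1][C@](F)(Cl)O", "N[*:1]"))
```

The resulting string can then be handed to Chem.CanonSmiles, as weld_dalke
does above, to parse and canonicalize it.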

BTW, your current code assumes there will only be one attachment point on an 
atom. For example, the input
 [*:1][C@]([*:2])(Cl)O.N[*:1].F[*:2]
creates the output
 N.O[C](F)Cl

It's not hard to fix, and I think more of a d'oh! issue.


In a quick benchmark I put together just now, I found that my SMILES syntax 
manipulation approach was about twice as fast at turning the two core/R-group 
SMILES strings into a molecule.

Cheers,


                                Andrew
                                da...@dalkescientific.com



_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
