Hi Thilo,

This is a bug in the SMARTS generation in the MCS code. It's not a new
one, and I was pretty sure that it had already been filed on GitHub, but
I can't seem to find it. I'll file the bug report and will hopefully be
able to get it fixed in the not-too-distant future.

-greg



On Fri, May 5, 2017 at 11:37 AM, Thilo Bauer <thilo.ba...@fau.de> wrote:

> Dear mailing list members,
>
> In RDKit, when doing an MCS search for molecules bearing a stereocenter,
> is it possible (and if so, how) to preserve the stereochemical
> information when exporting the subgraph to a SMARTS string?
>
> Consider the following three molecules:
>
>  >>> from rdkit import Chem
>  >>> from rdkit.Chem import rdFMCS
>  >>> mol_ccw = Chem.MolFromSmiles('C1=C[C@H](Cl)CCC1')
>  >>> mol_cw  = Chem.MolFromSmiles('C1=C[C@@H](Cl)CCC1')
>  >>> mol_lin = Chem.MolFromSmiles('C=C[C@H](Cl)CCC')
>
> Doing a chirality-sensitive subgraph search leads to the somewhat
> expected result:
>
>  >>> rdFMCS.FindMCS([mol_ccw, mol_cw], matchChiralTag=True).smartsString
> '[#6](=[#6])-[#6]-[#6]-[#6]-[#6]-[#17]'
>  >>> rdFMCS.FindMCS([mol_ccw, mol_lin], matchChiralTag=True).smartsString
> '[#6]=[#6]-[#6](-[#17])-[#6]-[#6]-[#6]'
>  >>> rdFMCS.FindMCS([mol_cw, mol_lin], matchChiralTag=True).smartsString
> '[#6]-[#6]-[#6]-[#6]-[#6]'
>
> The subgraph over mol_ccw and mol_lin includes the stereocenter.
> Unfortunately, the chirality information is not stored in the SMARTS
> string. As a result, a pattern built from
> [#6]=[#6]-[#6](-[#17])-[#6]-[#6]-[#6] matches all three molecules, even
> in a chirality-sensitive substructure search:
>
>  >>> patt = Chem.MolFromSmarts('[#6]=[#6]-[#6](-[#17])-[#6]-[#6]-[#6]')
>  >>> [len(m.GetSubstructMatches(patt, useChirality=True))
>  ...  for m in (mol_cw, mol_ccw, mol_lin)]
> [1, 1, 1]
>
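A quick way to confirm that the stereo information is missing from the pattern itself, rather than being ignored during matching, is to look at the chiral tags on the query atoms. This is only a minimal sketch: the helper name chiral_query_atoms is made up for illustration, and the two SMARTS strings are the ones from this message.

```python
from rdkit import Chem


def chiral_query_atoms(smarts):
    """Return the indices of query atoms that carry a chiral tag.

    Hypothetical helper for illustration; it only inspects the parsed
    pattern and does not run any substructure search.
    """
    patt = Chem.MolFromSmarts(smarts)
    return [
        atom.GetIdx()
        for atom in patt.GetAtoms()
        if atom.GetChiralTag() != Chem.ChiralType.CHI_UNSPECIFIED
    ]


# SMARTS as returned by FindMCS for mol_ccw and mol_lin (from this thread)
mcs_smarts = '[#6]=[#6]-[#6](-[#17])-[#6]-[#6]-[#6]'
# Same SMARTS with the stereocenter edited by hand
edited_smarts = '[#6]=[#6]-[#6@H](-[#17])-[#6]-[#6]-[#6]'

print(chiral_query_atoms(mcs_smarts))
print(chiral_query_atoms(edited_smarts))
```

If the first list comes back empty, useChirality=True has nothing to compare against on the query side, which would explain why all three molecules match.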
> Manually inserting a lazy @H or a &@&H1 at the stereocenter leads, of
> course, to the desired result:
>
>  >>> patt = Chem.MolFromSmarts('[#6]=[#6]-[#6@H](-[#17])-[#6]-[#6]-[#6]')
>  >>> [len(m.GetSubstructMatches(patt, useChirality=True))
>  ...  for m in (mol_ccw, mol_lin)]
> [1, 1]
>  >>> len(mol_cw.GetSubstructMatches(patt, useChirality=True))
> 0
>
> Now, when matching mol_ccw and mol_lin, how do I get the
> stereochemistry-aware SMARTS string
> [#6]=[#6]-[#6&@&H1](-[#17])-[#6]-[#6]-[#6] as the substructure in the
> first place?
>
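Until the SMARTS generation is fixed, one possible stopgap is to map the achiral MCS pattern back onto one of the input molecules and write the matched atoms out as an isomeric fragment SMILES, which can then be read back as a pattern. The sketch below is an assumption-laden workaround, not the planned fix: the function name stereo_pattern_from_reference is hypothetical, it takes whichever mapping GetSubstructMatch returns first (which may not be the mapping the MCS search used for symmetric molecules), and a SMILES read as SMARTS is slightly stricter than the original MCS SMARTS (for example, C means aliphatic carbon only).

```python
from rdkit import Chem


def stereo_pattern_from_reference(ref, mcs_smarts):
    """Rebuild an MCS pattern so that it keeps the reference's stereo marks.

    Hypothetical helper: `ref` is one of the molecules the MCS was computed
    from, `mcs_smarts` is the (achiral) smartsString returned by FindMCS.
    """
    query = Chem.MolFromSmarts(mcs_smarts)
    match = ref.GetSubstructMatch(query)
    if not match:
        raise ValueError('MCS pattern does not match the reference molecule')

    # Keep only the reference bonds that correspond to bonds in the MCS,
    # so that bonds outside the MCS (e.g. a ring closure) are left out.
    bond_ids = [
        ref.GetBondBetweenAtoms(match[b.GetBeginAtomIdx()],
                                match[b.GetEndAtomIdx()]).GetIdx()
        for b in query.GetBonds()
    ]

    # Isomeric fragment SMILES carries the @/@@ marks of the reference.
    frag = Chem.MolFragmentToSmiles(ref, atomsToUse=list(match),
                                    bondsToUse=bond_ids, isomericSmiles=True)
    # Assumption: reading the fragment SMILES as SMARTS is acceptable here.
    return Chem.MolFromSmarts(frag)


mol_ccw = Chem.MolFromSmiles('C1=C[C@H](Cl)CCC1')
mol_cw = Chem.MolFromSmiles('C1=C[C@@H](Cl)CCC1')
mol_lin = Chem.MolFromSmiles('C=C[C@H](Cl)CCC')

patt = stereo_pattern_from_reference(
    mol_lin, '[#6]=[#6]-[#6](-[#17])-[#6]-[#6]-[#6]')

for name, mol in (('ccw', mol_ccw), ('cw', mol_cw), ('lin', mol_lin)):
    print(name, mol.HasSubstructMatch(patt, useChirality=True))
```

The resulting pattern should be checked against all input molecules with useChirality=True before relying on it, since it only reflects the stereochemistry of the single reference molecule.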
>
> Thank you & kind regards,
> Thilo
>
>
> --
> Dr. Thilo Bauer
>
> Computer-Chemie-Centrum
> Friedrich-Alexander-Universität
> Nägelsbachstr. 25
> 91052 Erlangen
>
> +49 170 9738141
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>