Hi Jens,

Ed provided some good pointers to the matched molecular pair (MMP)
literature, which is certainly one way to approach the problem. There are
RDKit implementations of MMP analysis.

A somewhat more direct (though limited) approach would be to use the MCS
code to find the maximum common substructure of your set of molecules and
then remove that with the DeleteSubstructs() function:

In [13]: from rdkit import Chem

In [14]: smis = ('c1ccccc1F', 'c1ccc(Cl)cc1F', 'c1cc(CCO)ccc1',
    ...:         'c1cc(C1CC1)ccc1F')

In [15]: ms = [Chem.MolFromSmiles(x) for x in smis]

In [16]: from rdkit.Chem import rdFMCS

In [17]: mcs = rdFMCS.FindMCS(ms)

In [18]: patt = mcs.smartsString

In [19]: patt
Out[19]: '[#6](:,-[#6]:,-[#6]:,-[#6]):,-[#6]:,-[#6]'

In [20]: commonMol = Chem.MolFromSmarts(patt)

In [21]: diffs = [Chem.DeleteSubstructs(x,commonMol) for x in ms]

In [22]: [Chem.MolToSmiles(x) for x in diffs]
Out[22]: ['F', 'Cl.F', 'O', 'C.CC.F']

This won't solve every problem, but it's a quick start.
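
For the lipid example in the original question, a rough sketch along the
same lines. The SMILES strings are written by hand and the variable names
are only illustrative. Besides DeleteSubstructs(), it uses
GetSubstructMatch() to find which atom of the common core carries the extra
piece, and ReplaceCore() to get the side chain with a dummy atom marking the
attachment point.

```python
from rdkit import Chem
from rdkit.Chem import rdFMCS

# Palmitic acid (C16:0) and stearic acid (C18:0), built by hand:
# 15 or 17 chain carbons followed by the carboxylic acid carbon.
palmitic = Chem.MolFromSmiles('C' * 15 + 'C(=O)O')
stearic = Chem.MolFromSmiles('C' * 17 + 'C(=O)O')

mcs = rdFMCS.FindMCS([palmitic, stearic])
core = Chem.MolFromSmarts(mcs.smartsString)

# What is left of stearic acid once the common core is deleted.
diff = Chem.MolToSmiles(Chem.DeleteSubstructs(stearic, core))

# Atoms of stearic acid covered by the core; match[i] is the stearic acid
# atom index that corresponds to core atom i.
match = stearic.GetSubstructMatch(core)
matched = set(match)

# Any bond with exactly one end inside the core is an attachment point.
attachment = []
for bond in stearic.GetBonds():
    a, b = bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()
    if (a in matched) != (b in matched):
        attachment.append(a if a in matched else b)

# Degree of each attachment atom within the core itself; a degree of 1
# means the extra piece sits on a chain terminus.
core_degrees = [core.GetAtomWithIdx(match.index(i)).GetDegree()
                for i in attachment]

# Side chain(s) with dummy atoms marking where they were attached.
side_chains = Chem.ReplaceCore(stearic, core)
n_carbons = sum(1 for at in side_chains.GetAtoms() if at.GetAtomicNum() == 6)
n_dummies = sum(1 for at in side_chains.GetAtoms() if at.GetAtomicNum() == 0)

print(diff, attachment, core_degrees)
print(Chem.MolToSmiles(side_chains))
```

Double bonds and phospholipids with two chains will need more care, since
the core can then match in more than one way.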

-greg



On Fri, Nov 3, 2017 at 8:15 AM, Jens Kristian Munk <
jens.kristian.m...@regionh.dk> wrote:

> Hi list,
>
> I’ve searched far and wide for an answer to this; I apologize if the
> answer is obvious...
>
> I can use rdFMCS
> (http://www.rdkit.org/Python_Docs/rdkit.Chem.rdFMCS-module.html) to find
> the maximum common substructure of a set of molecules... But how do I find
> the difference(s) between two (or more) molecules?
>
> I work with lipids a lot, so for example, the difference between palmitic
> acid (C16:0) and stearic acid (C18:0) is SMILES ‘CC’. I would like RDKit to
> tell me just that, as well as tell me where on the maximum common
> substructure (which in this example is palmitic acid) to add the ‘CC’ to
> get stearic acid – i.e. on the terminus of the fatty chain.
>
> Any ideas?
>
> The example above is just the first step. After that comes identifying and
> locating double bonds in the fatty chains... And then jump to
> phospholipids, with two fatty chains and a head group... :)
>
> Kind regards,
>
> *Jens Kristian Munk*
> Chemist, Cand. Scient., Ph.D.
>
> Phone: 3862 0398
> Mobile: 5142 3483
> E-mail: *jens.kristian.m...@regionh.dk*
>
> Department of Clinical Biochemistry
> Amager og Hvidovre Hospital
> Kettegård Allé 30
> 2650 Hvidovre
>
> Web: www.regionh.dk
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>