Hi,
Thank you, Ed and Greg, for the good suggestions. Your input is certainly useful.
I come from R but had failed to find tools there that could do the job, which
is how I found RDKit. Your input actually led me back to R and the ChemmineR
package fmcsR.
Thanks again.
Kind regards
Jens Kristian Munk
Chemist, Cand. Scient., Ph.D.
Phone: 3862 0398
Mobile: 5142 3483
E-mail: jens.kristian.m...@regionh.dk
Department of Clinical Biochemistry
Amager og Hvidovre Hospital
Kettegård Allé 30
2650 Hvidovre
Web: www.regionh.dk
From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: Friday, November 03, 2017 9:41 PM
To: Jens Kristian Munk
Cc: Rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Find difference(s) between molecules
Hi Jens,
Ed provided some good pointers to the matched pair literature, which is
certainly one way to approach the problem. There are RDKit implementations of
MMP analysis.
A somewhat more direct (though limited) approach would be to use the MCS code
to find the maximum common substructure of your set of molecules and then
remove that with the DeleteSubstructs() function:
In [13]: from rdkit import Chem
In [14]: smis = ('c1ccccc1F','c1ccc(Cl)cc1F','c1cc(CCO)ccc1','c1cc(C1CC1)ccc1F')
In [15]: ms = [Chem.MolFromSmiles(x) for x in smis]
In [16]: from rdkit.Chem import rdFMCS
In [17]: mcs = rdFMCS.FindMCS(ms)
In [18]: patt = mcs.smartsString
In [19]: patt
Out[19]: '[#6](:,-[#6]:,-[#6]:,-[#6]):,-[#6]:,-[#6]'
In [20]: commonMol = Chem.MolFromSmarts(patt)
In [21]: diffs = [Chem.DeleteSubstructs(x, commonMol) for x in ms]
In [22]: [Chem.MolToSmiles(x) for x in diffs]
Out[22]: ['F', 'Cl.F', 'O', 'C.CC.F']
This won't solve every problem, but it's a quick start.
-greg
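The session above can be wrapped into a small reusable function. This is only a sketch under stated assumptions: the function name difference_fragments is hypothetical, the example SMILES are illustrative and not taken from the thread, and FindMCS is called with its default settings, so other atom or bond comparison options may give different fragments.

```python
from rdkit import Chem
from rdkit.Chem import rdFMCS


def difference_fragments(smiles_list):
    """Return (MCS SMARTS, MCS atom count, per-molecule leftover SMILES).

    Hypothetical helper: finds the maximum common substructure of all
    molecules, then deletes it from each molecule and reports what is left.
    """
    mols = [Chem.MolFromSmiles(smi) for smi in smiles_list]
    if any(m is None for m in mols):
        raise ValueError("could not parse one or more SMILES strings")

    # Default FindMCS settings: atoms compared by element, bonds by order.
    mcs = rdFMCS.FindMCS(mols)
    core = Chem.MolFromSmarts(mcs.smartsString)

    # DeleteSubstructs removes every atom matched by the core pattern.
    leftovers = [Chem.DeleteSubstructs(m, core) for m in mols]
    return mcs.smartsString, mcs.numAtoms, [Chem.MolToSmiles(x) for x in leftovers]


if __name__ == "__main__":
    # Illustrative input: three monosubstituted benzenes.
    smarts, n_atoms, diffs = difference_fragments(
        ["c1ccccc1F", "c1ccccc1Cl", "c1ccccc1CCO"]
    )
    print(smarts, n_atoms, diffs)
```

Note that the leftover fragments carry no information about where they were attached to the common core; that has to be recovered separately from the substructure match.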
On Fri, Nov 3, 2017 at 8:15 AM, Jens Kristian Munk
<jens.kristian.m...@regionh.dk> wrote:
Hi list,
I’ve searched far and wide for an answer to this; I apologize if the answer is
obvious...
I can use rdFMCS
(http://www.rdkit.org/Python_Docs/rdkit.Chem.rdFMCS-module.html) to find the
maximum common substructure of a set of molecules... But how do I find the
difference(s) between two (or more) molecules?
I work with lipids a lot, so for example, the difference between palmitic acid
(C16:0) and stearic acid (C18:0) is SMILES 'CC'. I would like RDKit to tell me
just that, as well as tell me where on the maximum common substructure (which
in this example is palmitic acid) to add the 'CC' to get stearic acid, i.e.
on the terminus of the fatty chain.
Any ideas?
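One possible way to get both the difference and its attachment point for the palmitic/stearic example is to map the MCS back onto the larger molecule and look at the bonds that cross the boundary of the match. The sketch below assumes that approach; the function name locate_difference is hypothetical, atom indices refer to the order of atoms in the input SMILES of the larger molecule, and only the first substructure match is used.

```python
from rdkit import Chem
from rdkit.Chem import rdFMCS


def locate_difference(small_smiles, large_smiles):
    """Return (SMILES of the extra atoms, list of (core_atom, extra_atom)).

    Hypothetical helper: indices are atom indices in the larger molecule.
    Each pair is a bond joining an MCS atom to an atom outside the MCS.
    """
    small = Chem.MolFromSmiles(small_smiles)
    large = Chem.MolFromSmiles(large_smiles)
    if small is None or large is None:
        raise ValueError("could not parse one of the SMILES strings")

    mcs = rdFMCS.FindMCS([small, large])
    core = Chem.MolFromSmarts(mcs.smartsString)

    # Atoms of the larger molecule covered by the common substructure.
    matched = set(large.GetSubstructMatch(core))
    extra = [a.GetIdx() for a in large.GetAtoms() if a.GetIdx() not in matched]

    # Bonds with exactly one end inside the match mark the attachment points.
    attachments = []
    for bond in large.GetBonds():
        a, b = bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()
        if (a in matched) != (b in matched):
            attachments.append((a, b) if a in matched else (b, a))

    extra_smiles = Chem.MolFragmentToSmiles(large, atomsToUse=extra) if extra else ""
    return extra_smiles, attachments


if __name__ == "__main__":
    palmitic = "C" * 16 + "(=O)O"  # C16:0, carboxyl carbon written last
    stearic = "C" * 18 + "(=O)O"   # C18:0
    print(locate_difference(palmitic, stearic))
```

Because the carboxyl group pins one end of the match, the two unmatched carbons in this example sit at the methyl end of the chain. For molecules with several equivalent matches, the reported attachment point depends on which match RDKit returns first.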
The example above is just the first step. After that comes identifying and
locating double bonds in the fatty chains... And then jump to phospholipids,
with two fatty chains and a head group... ☺
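For the double-bond step, a minimal sketch could match C=C with a SMARTS pattern and count carbons from the carboxyl carbon (taken as C1) along the shortest path. The function name double_bond_positions and both SMARTS patterns are assumptions made for illustration; the sketch handles a single free fatty acid only and ignores cis/trans geometry.

```python
from rdkit import Chem

# Assumed patterns: a free carboxylic acid and a carbon-carbon double bond.
ACID = Chem.MolFromSmarts("[CX3](=O)[OX2H1]")
DOUBLE_BOND = Chem.MolFromSmarts("C=C")


def double_bond_positions(smiles):
    """Return sorted delta positions of C=C bonds, counting the carboxyl C as 1."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES")

    acid_match = mol.GetSubstructMatch(ACID)
    if not acid_match:
        raise ValueError("no carboxylic acid group found")
    c1 = acid_match[0]  # first atom of the pattern is the carboxyl carbon

    positions = []
    for a, b in mol.GetSubstructMatches(DOUBLE_BOND):
        # The path includes both end atoms, so its length is the carbon number.
        n_a = len(Chem.GetShortestPath(mol, c1, a))
        n_b = len(Chem.GetShortestPath(mol, c1, b))
        positions.append(min(n_a, n_b))
    return sorted(positions)


if __name__ == "__main__":
    oleic = "C" * 8 + "C=C" + "C" * 7 + "C(=O)O"                # C18:1
    linoleic = "C" * 5 + "C=C" + "C" + "C=C" + "C" * 7 + "C(=O)O"  # C18:2
    print(double_bond_positions(oleic))
    print(double_bond_positions(linoleic))
```

Phospholipids would need an extra step that first splits the molecule into head group and individual acyl chains, which this sketch does not attempt.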
Kind regards
Jens Kristian Munk
________________________________
This e-mail contains confidential information. If you are not the intended
recipient of this e-mail, or if you have received it by mistake, please inform
the sender of the error by using the reply function. Please also delete the
e-mail immediately without forwarding or copying it.
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss