Re: [Rdkit-discuss] rearomatize only benzene rigns after kekulize + clearAromaticFlags
Ok guys, I came up with a possible solution for rearomatizing six-membered C/N aromatic rings after kekulization. I still need to find a way to do the same for guanidinium salts.

    from numpy import zeros
    from rdkit import Chem

    # Note: Occ(pattern, mol) is used below but is not defined in this message.

    def TestL_n(L, n, aro):
        suppl = Chem.SDMolSupplier('/Users/mbp/Downloads/molecules-20-18279-s001/Compounds List for Heat-of-Combustion Calculations.sdf', removeHs=False)
        i = 0
        cp = 0
        v = zeros(len(L))
        for mol in suppl:
            if aro:
                # round-trip through SMILES, then apply the custom aromaticity
                s = Chem.MolToSmiles(mol)
                m = rearomatization(s)
            else:
                m = mol
            i = i + 1
            for j in range(0, len(L)):
                if j == n - 1:
                    try:
                        c = Occ(L[j], m)
                        v[j] += c
                        if c > 0:
                            cp += 1
                            print(Chem.MolToSmiles(Chem.RemoveHs(m)))
                    except Exception:
                        print("error")
        return v[n - 1], cp

    # r6 (C or N) rearomatization: C/N & guanidinium
    def keep6aro(m):
        # [N;v3X3,v4X4+][CX3](=[N;v3X2,v4X3+])[N;v3X3,v4X4+]  # greg version
        # ring of 6 of C or N only!
        r6 = Chem.MolFromSmarts('[#6,#7;a]1[#6,#7;a][#6,#7;a][#6,#7;a][#6,#7;a][#6,#7;a]1')
        atomkeep = m.GetSubstructMatches(r6)
        ri = m.GetRingInfo()
        BondRing = ri.BondRings()
        bondkeep = []
        for bondring in BondRing:
            if len(bondring) == 6 and isRingAromatic(m, bondring):
                bondkeep.append(bondring)
        return atomkeep, bondkeep

    def Aromatics6ring2(m, atomkeep, bondkeep):
        # [N;v3X3,v4X4+][CX3](=[N;v3X2,v4X3+])[N;v3X3,v4X4+]  # greg version
        # restore the aromatic flags on the atoms and bonds recorded by keep6aro()
        for match in atomkeep:
            for mi in match:
                m.GetAtomWithIdx(mi).SetIsAromatic(True)
        for bondring in bondkeep:
            for bondid in bondring:
                mb = m.GetBondWithIdx(bondid)
                mb.SetBondType(Chem.rdchem.BondType.AROMATIC)
                mb.SetIsAromatic(True)
        return m

    def isRingAromatic(mol, BondRing):
        for bondid in BondRing:
            if not mol.GetBondWithIdx(bondid).GetIsAromatic():
                return False
        return True

    def rearomatization(s):
        mol = Chem.MolFromSmiles(s)
        # record the aromatic six-membered rings before the flags are cleared
        atomkeep, bondkeep = keep6aro(mol)
        Chem.rdmolops.Kekulize(mol, clearAromaticFlags=True)
        mol = Aromatics6ring2(mol, atomkeep, bondkeep)
        return mol

Dr.
Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8

From: Greg Landrum
Sent: Thursday, 22 September 2016 10:22
To: Guillaume GODIN
Cc: RDKit Discuss
Subject: Re: [Rdkit-discuss] rearomatize only benzene rigns after kekulize + clearAromaticFlags

On Wed, Sep 21, 2016 at 4:31 PM, Guillaume GODIN wrote:

> After testing the code, it works perfectly, thanks!

Well, there's at least that. ;-)

> Unfortunately, I discovered that it's still not compatible with the
> aromaticity method used in the article I mentioned in another post, from
> Rudolf Naef.

Before going further with this I have a question for you (note that I still haven't had time to read the paper in detail): I understand that in order to exactly reproduce the results from that paper you do need to reproduce the aromaticity model used. However, if you were to borrow the methods and data from the paper, you could theoretically build your own models based on RDKit aromaticity. This would likely be more efficient at runtime than re-perceiving aromaticity.

> I need to keep the aromaticity of all 6-membered rings (containing C or N,
> which is possible using your function), but also keep the aromaticity of
> fused 6-membered rings (e.g. naphthalene, ...) + convert/keep guanidinium
> moieties aromatic too.
>
> So, it would be more interesting to find a fast process to revoke
> aromaticity only on rings that are not 6-membered, which should preserve
> all 6-membered rings + fused aromatic rings, and also set guanidinium
> salts as aromatic.

"Revoking" aromaticity is tricky because you really need to also kekulize the rings that you remove aromaticity from. I think you're going to be better off just describing the features that are aromatic and applying the method I described in the previous message.

The SMARTS I sent to you should certainly work for fused rings like naphthalene and could be adapted to support heteroatoms.

Guanidinium is a different problem though... the RDKit does not tolerate aromatic bonds/atoms that aren't in rings. What exactly do you want to do there?

-greg
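For readers who want to try the record / kekulize / restore idea from the message above on a single molecule, here is a minimal sketch. It is not Guillaume's exact code: the function name `rearomatize_six_rings` and the helper `count_aromatic_atoms` are made up for this illustration, and it uses ring info alone (no SMARTS match) to decide which rings to keep. Guanidinium is deliberately not handled.

```python
from rdkit import Chem


def rearomatize_six_rings(smiles):
    """Kekulize, then restore aromatic flags on six-membered C/N rings only.

    Hypothetical helper written for this illustration.
    """
    mol = Chem.MolFromSmiles(smiles)
    ring_info = mol.GetRingInfo()
    # Remember six-membered rings made of C/N whose bonds are all aromatic
    # *before* the aromatic flags are cleared.
    keep = [
        (atoms, bonds)
        for atoms, bonds in zip(ring_info.AtomRings(), ring_info.BondRings())
        if len(bonds) == 6
        and all(mol.GetAtomWithIdx(a).GetAtomicNum() in (6, 7) for a in atoms)
        and all(mol.GetBondWithIdx(b).GetIsAromatic() for b in bonds)
    ]
    Chem.Kekulize(mol, clearAromaticFlags=True)
    # Put the aromatic flags back on the recorded rings only.
    for atoms, bonds in keep:
        for idx in atoms:
            mol.GetAtomWithIdx(idx).SetIsAromatic(True)
        for idx in bonds:
            bond = mol.GetBondWithIdx(idx)
            bond.SetBondType(Chem.BondType.AROMATIC)
            bond.SetIsAromatic(True)
    return mol


def count_aromatic_atoms(mol):
    return sum(atom.GetIsAromatic() for atom in mol.GetAtoms())


indole = rearomatize_six_rings("c1ccc2[nH]ccc2c1")    # benzene ring kept, pyrrole ring kekulized
furan = rearomatize_six_rings("c1ccoc1")              # five-membered ring: nothing kept
naphthalene = rearomatize_six_rings("c1ccc2ccccc2c1")  # both fused six-rings kept
print(count_aromatic_atoms(indole), count_aromatic_atoms(furan),
      count_aromatic_atoms(naphthalene))
```

Note that the resulting molecule is no longer in a state RDKit's sanitizer produced, so re-sanitizing it would re-perceive aromaticity with the RDKit model and undo the change.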
Re: [Rdkit-discuss] (no subject)
Hello:

Greg and Curt, your comments are very much appreciated.
Thanks for getting back to me!

Best,
Markus

On Wed, Sep 21, 2016 at 8:31 PM, Greg Landrum wrote:

> Hi Markus,
>
> Curt's instincts are dead on: the problem here is the rings.
>
> I'll show the fix and then explain what's going on. You just need to add
> one line to your code:
>
> core = "[a]12[a][a][a][a][a]1[a][a][a]2"
> pattern = Chem.MolFromSmarts(core)
> Chem.GetSSSR(pattern)
> AllChem.Compute2DCoords(pattern)
>
> when I do this, I get the following depiction for "c1(ocn2)c21":
>
> (The highlighting is due to the substructure match that's done during the
> generation of coordinates).
>
> So why is this necessary?
> The code that generates 2D coordinates uses information about the size of
> ring systems in the molecule as part of the coordinate generation. If no
> ring information is present (which is true of molecules generated from
> SMARTS since they are not fully sanitized on construction) then the code
> calls FastFindRings(). This function is perfectly capable of identifying
> all ring atoms and bonds, but it isn't very good at getting ring sizes
> correct for fused systems (it finds rings, but not the smallest rings). The
> consequences are the badly generated coordinates for fused ring systems
> that you were seeing.
>
> I think the current behavior of the code "isn't really ideal": the
> coordinate generation code should call the SSSR algorithm in these cases so
> that it can generate better coordinates. I'll take a look at the code and
> think about changing it.
>
> As an aside: if you're puzzled by the behavior of
> AllChem.GenerateDepictionMatching2DStructure() you can always just take a
> look at the drawing of the query molecule itself. It's not always the most
> informative depiction when it comes to what the atom and bond queries are,
> but you at least will see the coordinates.
>
> A second aside: the molecule depictions in that notebook indicate that you
> are stuck using the fallback drawing code, which creates fairly ugly
> pictures. You can get better drawings by either installing cairo and
> pycairo (in which case the code should automatically use those) or telling
> the drawing code to use SVG for the rendering:
>
> from rdkit.Chem.Draw import IPythonConsole
> IPythonConsole.ipython_useSVG=True
>
> It really does make the drawings a lot better.
>
> I hope this helps,
> -greg
>
> On Wed, Sep 21, 2016 at 8:47 PM, Markus Metz wrote:
>
>> Hello all:
>>
>> I am trying to perform a 2D alignment of molecules by using a pattern for
>> which I am using Compute2DCoords.
>>
>> If I use a SMARTS string matching naphthalene, the 2D depiction is as one
>> would expect.
>> However, if I switch to a 5,6 aromatic SMARTS pattern, the 2D structure of
>> the matched benzoxazole looks rather unusual.
>>
>> Is there a way to get the 5,6 pattern to behave like the 6,6 pattern?
>>
>> Any hint is very much appreciated,
>>
>> Markus
>>
>> P.S. a workbook is attached.

--
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
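A self-contained version of the fix Greg describes might look like the sketch below. The SMARTS and the four calls on `pattern` are taken from his message; the benzoxazole SMILES and the `ring_sizes` check are assumptions added here for illustration, not something from the original notebook.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# 5,6-fused aromatic query from the thread
core = "[a]12[a][a][a][a][a]1[a][a][a]2"
pattern = Chem.MolFromSmarts(core)

# Molecules built from SMARTS carry no ring information yet; run the SSSR
# perception first so coordinate generation sees the smallest rings.
Chem.GetSSSR(pattern)
AllChem.Compute2DCoords(pattern)

# Sanity check: the query should now report one 5-ring and one 6-ring.
ring_sizes = sorted(len(ring) for ring in pattern.GetRingInfo().AtomRings())
print(ring_sizes)

# Assumed example molecule (benzoxazole) to lay out on top of the pattern.
mol = Chem.MolFromSmiles("c1ccc2ocnc2c1")
AllChem.GenerateDepictionMatching2DStructure(mol, pattern)
```

Printing the ring sizes before depicting anything is a cheap way to confirm that the ring perception step actually ran on the query.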
Re: [Rdkit-discuss] error while calculating rms and shape tanimoto
Hi

>> Files 1.pdb and 2.pdb do not contain CONECT records (so missing bond orders?).

I removed the CONECT records from files a.pdb and b.pdb, and the code still runs fine on them.

>> File 1.pdb contains an atom with name BR43. Maybe the PDB parser can't parse that (seems valid to me, FWTW).

I also tried running after removing BR43 from the file. Still the same error.

Amit

On Thu, Sep 22, 2016 at 4:26 PM, Paul Emsley wrote:

> File 1.pdb contains an atom with name BR43. Maybe the PDB parser can't
> parse that (seems valid to me, FWTW).
>
> Paul
>
> On 22/09/16 11:15, Amit singh wrote:
>
>> Hi
>>
>> Files a.pdb and b.pdb are from RDKit test data (working fine)
>> Files 1.pdb and 2.pdb (other than test data, which are giving error)
>>
>> On Thu, Sep 22, 2016 at 3:03 PM, Greg Landrum wrote:
>>
>>> Hi Amit,
>>>
>>> On Thu, Sep 22, 2016 at 9:23 AM, Amit singh wrote:
>>>
>>>> I am a new entry in this discussion forum and also for RDKit
>>>
>>> Welcome!
>>>
>>>> I am trying to calculate shape tanimoto and rms between two molecules
>>>> (PDB files) from 3D functionality of RDKit.
>>>> Code is working fine for the pdb files given in test data.
>>>> But gives error whenever I use other pdb files
>>>>
>>>> >>> rms = rdMolAlign.AlignMol(mol1, mol2)
>>>> Traceback (most recent call last):
>>>>   File "", line 1, in
>>>> RuntimeError: std::exception
>>>> ---
>>>> It looks like there is a problem in input files, but help required
>>>
>>> In order to be able to answer the question, we need a bit more
>>> information. Can you please share what files you loaded mol1 and mol2 from
>>> so that we can reproduce the problem?
>>>
>>> -greg
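The cause of the error was not established in this thread, but a couple of checks are cheap to run before calling AlignMol. The sketch below is only an assumption about what might be going wrong: `Chem.MolFromPDBFile` returns None when a file cannot be parsed, and, as far as I understand, AlignMol without an explicit atomMap needs the probe to match the reference as a substructure. The helpers `embed` and `can_align` are hypothetical names, and small molecules built from SMILES stand in for the PDB files so the example runs on its own.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign, rdShapeHelpers


def embed(smiles, seed=42):
    """Build a small 3D test molecule (stand-in for a molecule read from PDB)."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    return Chem.RemoveHs(mol)  # keep heavy atoms only; coordinates are preserved


def can_align(probe, ref):
    """Pre-flight check before rdMolAlign.AlignMol(probe, ref)."""
    if probe is None or ref is None:  # e.g. Chem.MolFromPDBFile(...) failed
        return False
    # Without an atom map, the probe has to be found inside the reference.
    return ref.HasSubstructMatch(probe)


ref = embed("CCO")
probe = Chem.Mol(ref)   # identical copy, conformer included
other = embed("CCN")    # different molecule: no atom mapping exists

print(can_align(probe, ref), can_align(other, ref))

if can_align(probe, ref):
    rms = rdMolAlign.AlignMol(probe, ref)
    shape_dist = rdShapeHelpers.ShapeTanimotoDist(probe, ref)
    print(rms, shape_dist)
```

If the two PDB files hold different molecules, passing an explicit `atomMap` to AlignMol, or using a shape-based comparison that does not rely on an atom mapping, may be more appropriate than the default call.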
Re: [Rdkit-discuss] error while calculating rms and shape tanimoto
Files 1.pdb and 2.pdb do not contain CONECT records (so missing bond orders?).

File 1.pdb contains an atom with name BR43. Maybe the PDB parser can't parse that (seems valid to me, FWTW).

Paul

On 22/09/16 11:15, Amit singh wrote:

> Hi
>
> Files a.pdb and b.pdb are from RDKit test data (working fine)
> Files 1.pdb and 2.pdb (other than test data, which are giving error)
>
> On Thu, Sep 22, 2016 at 3:03 PM, Greg Landrum wrote:
>
>> Hi Amit,
>>
>> On Thu, Sep 22, 2016 at 9:23 AM, Amit singh wrote:
>>
>>> I am a new entry in this discussion forum and also for RDKit
>>
>> Welcome!
>>
>>> I am trying to calculate shape tanimoto and rms between two molecules
>>> (PDB files) from 3D functionality of RDKit.
>>> Code is working fine for the pdb files given in test data.
>>> But gives error whenever I use other pdb files
>>>
>>> >>> rms = rdMolAlign.AlignMol(mol1, mol2)
>>> Traceback (most recent call last):
>>>   File "", line 1, in
>>> RuntimeError: std::exception
>>> ---
>>> It looks like there is a problem in input files, but help required
>>
>> In order to be able to answer the question, we need a bit more
>> information. Can you please share what files you loaded mol1 and mol2 from
>> so that we can reproduce the problem?
>>
>> -greg
>
> --
> Dr. Amit Kumar
> Scientist B
> National Institute of Cancer Prevention and Research
> (Formerly Institute of Cytology and Preventive Oncology)
> I-7, Sector - 39, Noida - 201301
> Uttar Pradesh
Re: [Rdkit-discuss] error while calculating rms and shape tanimoto
Hi

Files a.pdb and b.pdb are from the RDKit test data (working fine).
Files 1.pdb and 2.pdb are not from the test data; these are the ones giving the error.

On Thu, Sep 22, 2016 at 3:03 PM, Greg Landrum wrote:

> Hi Amit,
>
> On Thu, Sep 22, 2016 at 9:23 AM, Amit singh wrote:
>
>> I am a new entry in this discussion forum and also for RDKit
>
> Welcome!
>
>> I am trying to calculate shape tanimoto and rms between two molecules
>> (PDB files) from 3D functionality of RDKit.
>> Code is working fine for the pdb files given in test data.
>> But gives error whenever I use other pdb files
>>
>> >>> rms = rdMolAlign.AlignMol(mol1, mol2)
>> Traceback (most recent call last):
>>   File "", line 1, in
>> RuntimeError: std::exception
>> ---
>> It looks like there is a problem in input files, but help required
>
> In order to be able to answer the question, we need a bit more
> information. Can you please share what files you loaded mol1 and mol2 from
> so that we can reproduce the problem?
>
> -greg

--
Dr. Amit Kumar
Scientist B
National Institute of Cancer Prevention and Research
(Formerly Institute of Cytology and Preventive Oncology)
I-7, Sector - 39, Noida - 201301
Uttar Pradesh

Attachments: b.pdb, a.pdb, 2.pdb, 1.pdb
Re: [Rdkit-discuss] error while calculating rms and shape tanimoto
Hi Amit,

On Thu, Sep 22, 2016 at 9:23 AM, Amit singh wrote:

> I am a new entry in this discussion forum and also for RDKit

Welcome!

> I am trying to calculate shape tanimoto and rms between two molecules
> (PDB files) from 3D functionality of RDKit.
> Code is working fine for the pdb files given in test data.
> But gives error whenever I use other pdb files
>
> >>> rms = rdMolAlign.AlignMol(mol1, mol2)
> Traceback (most recent call last):
>   File "", line 1, in
> RuntimeError: std::exception
> ---
> It looks like there is a problem in input files, but help required

In order to be able to answer the question, we need a bit more information. Can you please share what files you loaded mol1 and mol2 from so that we can reproduce the problem?

-greg
Re: [Rdkit-discuss] rearomatize only benzene rigns after kekulize + clearAromaticFlags
On Wed, Sep 21, 2016 at 4:31 PM, Guillaume GODIN <guillaume.go...@firmenich.com> wrote:

> After testing the code, it works perfectly, thanks!

Well, there's at least that. ;-)

> Unfortunately, I discovered that it's still not compatible with the
> aromaticity method used in the article I mentioned in another post, from
> Rudolf Naef.

Before going further with this I have a question for you (note that I still haven't had time to read the paper in detail): I understand that in order to exactly reproduce the results from that paper you do need to reproduce the aromaticity model used. However, if you were to borrow the methods and data from the paper, you could theoretically build your own models based on RDKit aromaticity. This would likely be more efficient at runtime than re-perceiving aromaticity.

> I need to keep the aromaticity of all 6-membered rings (containing C or N,
> which is possible using your function), but also keep the aromaticity of
> fused 6-membered rings (e.g. naphthalene, ...) + convert/keep guanidinium
> moieties aromatic too.
>
> So, it would be more interesting to find a fast process to revoke
> aromaticity only on rings that are not 6-membered, which should preserve
> all 6-membered rings + fused aromatic rings, and also set guanidinium
> salts as aromatic.

"Revoking" aromaticity is tricky because you really need to also kekulize the rings that you remove aromaticity from. I think you're going to be better off just describing the features that are aromatic and applying the method I described in the previous message.

The SMARTS I sent to you should certainly work for fused rings like naphthalene and could be adapted to support heteroatoms.

Guanidinium is a different problem though... the RDKit does not tolerate aromatic bonds/atoms that aren't in rings. What exactly do you want to do there?

-greg
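To make the two points in Greg's reply concrete, here is a small check. The six-membered C/N ring SMARTS is the heteroatom-tolerant variant that appears elsewhere in this thread; the example SMILES (naphthalene, quinoline, guanidinium) and the variable names are assumptions chosen for illustration. Each ring of a fused system is matched separately, while guanidinium has no ring atoms at all, which is why it cannot be described with a ring pattern.

```python
from rdkit import Chem

# Six-membered aromatic ring made of C or N atoms only
six_ring = Chem.MolFromSmarts(
    "[#6,#7;a]1[#6,#7;a][#6,#7;a][#6,#7;a][#6,#7;a][#6,#7;a]1")

naphthalene = Chem.MolFromSmiles("c1ccc2ccccc2c1")
quinoline = Chem.MolFromSmiles("c1ccc2ncccc2c1")
guanidinium = Chem.MolFromSmiles("NC(N)=[NH2+]")

# One (uniquified) match per six-membered ring, fused or not.
n_naphthalene = len(naphthalene.GetSubstructMatches(six_ring))
n_quinoline = len(quinoline.GetSubstructMatches(six_ring))

# Guanidinium is acyclic, so there is nothing for a ring pattern to find.
guanidinium_ring_atoms = [
    atom.GetIdx() for atom in guanidinium.GetAtoms() if atom.IsInRing()
]

print(n_naphthalene, n_quinoline, guanidinium_ring_atoms)
```

Since RDKit does not accept aromatic atoms or bonds outside rings, one option for guanidinium is to tag the matched atoms with a custom property instead of an aromatic flag; whether that fits the model from the paper is left open here.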
[Rdkit-discuss] error while calculating rms and shape tanimoto
Dear All

I am new to this discussion forum and also to RDKit.

I am trying to calculate the shape Tanimoto and RMS between two molecules (PDB files) using the 3D functionality of RDKit.
The code works fine for the PDB files given in the test data, but gives an error whenever I use other PDB files:

>>> rms = rdMolAlign.AlignMol(mol1, mol2)
Traceback (most recent call last):
  File "", line 1, in
RuntimeError: std::exception
---
It looks like there is a problem in the input files, but I need help finding it.

Thanks
Amit