Thanks, Paolo, for a good and clear example.
I adapted your code into my workflow to calculate some Lipinski-properties of RNA pdb-structures, and ran into some issues. I'm not sure if I should make a new thread or throw this onto this one I already created? I used the following code under from rdkit import Chem from rdkit.Chem import rdmolops, Lipinski from urllib.request import urlopen import gzip import pprint pp = pprint.PrettyPrinter(indent=4) Lipinski_dic = {'FractionCSP3':Lipinski.FractionCSP3, 'HeavyAtomCount':Lipinski.HeavyAtomCount, 'NHOHCount': Lipinski.NHOHCount, "NOCount":Lipinski.NOCount, "NumAliphaticCarbocycles": Lipinski.NumAliphaticCarbocycles, "NumAliphaticHeterocycles" : Lipinski.NumAliphaticHeterocycles, 'NumAliphaticRings' : Lipinski.NumAliphaticRings, 'NumAromaticCarbocycles' : Lipinski.NumAromaticCarbocycles, 'NumAromaticHeterocycles' : Lipinski.NumAromaticHeterocycles, 'NumAromaticRings' : Lipinski.NumAromaticRings, 'NumHAcceptors' : Lipinski.NumHAcceptors, 'NumHDonors' : Lipinski.NumHDonors, 'NumHeteroatoms' : Lipinski.NumHeteroatoms, 'NumRotatableBonds' : Lipinski.NumRotatableBonds, 'NumSaturatedCarbocycles' : Lipinski.NumSaturatedCarbocycles, 'NumSaturatedHeterocycles' : Lipinski.NumSaturatedHeterocycles, 'NumSaturatedRings' : Lipinski.NumSaturatedRings, 'RingCount' : Lipinski.RingCount } url = "https://files.rcsb.org/download/1arj.pdb.gz" pdb_data = gzip.decompress(urlopen(url).read()) mol = Chem.RWMol(Chem.MolFromPDBBlock(pdb_data)) bonds_to_cleave = {(b.GetBeginAtomIdx(), b.GetEndAtomIdx()) for b in mol.GetBonds() if b.GetBeginAtom().GetPDBResidueInfo().GetIsHeteroAtom() ^ b.GetEndAtom().GetPDBResidueInfo().GetIsHeteroAtom()} [mol.RemoveBond(*b) for b in bonds_to_cleave] hetatm_frags = [f for f in rdmolops.GetMolFrags(mol, asMols=True, sanitizeFrags=True) if f.GetNumAtoms() and f.GetAtomWithIdx(0).GetPDBResidueInfo().GetIsHeteroAtom()] for hetatm in hetatm_frags: res_name = hetatm.GetAtomWithIdx(0).GetPDBResidueInfo().GetResidueName() calculated_props = {} for prop in Lipinski_dic: function = Lipinski_dic[prop] x = function(hetatm) calculated_props[prop] = x pp.pprint(calculated_props) and as you can see the properties of the ligand doesn't match up with what is expected (The number of SP3-atoms doesn't match up). When parsing through the structure 3got, it fails to recognize the aromatic rings of the ligand A2F. I'm assuming this is caused by RDKit not assigning bond orders correctly when reading in RNA and DNA pdb files (something which I have reported in earlier on this mailing list)? Running hetatm.UpdatePropertyCache(strict=True) does not remedy this problem. Is there a clever way I can fix this quickly without waiting for this to be fixed in a future version? Illimar Rekand Ph.D. candidate, Brenk-lab, Haug-lab Department of Biomedicine Department of Chemistry University of Bergen ________________________________ From: Illimar Hugo Rekand Sent: Monday, December 16, 2019 5:55:56 PM To: Paolo Tosco Subject: Re: [Rdkit-discuss] Constructing a mol object from a PDB ligand Hey, Paolo, thanks for a good and clear example! all the best, Illimar Rekand Ph.D. candidate, Brenk-lab, Haug-lab Department of Biomedicine Department of Chemistry University of Bergen ________________________________ From: Paolo Tosco <paolo.tosco.m...@gmail.com> Sent: Monday, December 16, 2019 5:52:18 PM To: Illimar Hugo Rekand; rdkit-discuss@lists.sourceforge.net Subject: Re: [Rdkit-discuss] Constructing a mol object from a PDB ligand Hi Illimar, this gist: https://gist.github.com/ptosco/2ee199b219b27e01052a7a1433b3bd22 shows a way to achieve this. Cheers, p. On 16/12/2019 16:07, Illimar Hugo Rekand wrote: > Hello, everyone > > > Is there a simple way to create a mol object from just the HETATM/ligand > lines from a pdb-file? > > Would it be viable to create a function where you could create a mol object > from specific lines within a pdb-file? > > > Illimar Rekand > Ph.D. candidate, > Brenk-lab, Haug-lab > Department of Biomedicine > Department of Chemistry > University of Bergen > > > > _______________________________________________ > Rdkit-discuss mailing list > Rdkit-discuss@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss _______________________________________________ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss