On 05/10/2019 at 12:46, Chris Swain via Rdkit-discuss wrote:
> Hi,
> I have a number of PDB files (foo.pdb.gz) and I want to separate each chain in
> each file out into a separate file. So if a file contains 4 chains it will
> generate 4 separate files.
> Can I do this using RDKit and, if so, how?
> Cheers
> Chris
Dear Chris,
Even though this could be done in RDKit, I would recommend using
an external tool, for instance Biopython and its Bio.PDB module
(https://biopython.org/wiki/The_Biopython_Structural_Bioinformatics_FAQ),
or ProDy (http://prody.csb.pitt.edu/).
RDKit needs to process a lot of atom definitions to load a PDB file
properly, and that takes time (minutes on my machine, which is a decent
workstation :-).
By comparison, Bio.PDB or ProDy will be lightning fast.
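As a rough illustration of the Bio.PDB route, here is a minimal sketch, assuming Python 3 with Biopython installed. The names split_chains and ChainSelect and the foo_A.pdb output naming are only illustrative choices, not anything imposed by Biopython, and only the chains of the first model are listed:

```python
import gzip
from pathlib import Path

from Bio.PDB import PDBIO, PDBParser, Select


class ChainSelect(Select):
    """Hypothetical helper: keep only the chain with the given ID."""

    def __init__(self, chain_id):
        self.chain_id = chain_id

    def accept_chain(self, chain):
        return chain.get_id() == self.chain_id


def split_chains(pdb_gz_path, out_dir="."):
    """Write one PDB file per chain of a gzipped PDB file; return the paths."""
    pdb_gz_path = Path(pdb_gz_path)
    stem = pdb_gz_path.name.split(".")[0]  # 'foo.pdb.gz' -> 'foo'

    # PDBParser accepts an open text handle, so the file is never unpacked on disk
    parser = PDBParser(QUIET=True)
    with gzip.open(pdb_gz_path, "rt") as handle:
        structure = parser.get_structure(stem, handle)

    io = PDBIO()
    io.set_structure(structure)

    written = []
    first_model = next(iter(structure))  # chain IDs are taken from the first model
    for chain in first_model:
        out_path = Path(out_dir) / "{}_{}.pdb".format(stem, chain.id)
        io.save(str(out_path), ChainSelect(chain.id))
        written.append(out_path)
    return written
```

Calling split_chains("foo.pdb.gz") on a file with 4 chains should then leave 4 files next to it, one per chain ID.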
If you still want to use RDKit only and need to reuse its
representation of the PDB file, then pickle it with cPickle (Python 2):
    import cPickle
    from rdkit import Chem

    def processReceptor(r):
        # Reuse the cached molecule if a pickle already exists
        try:
            with open('receptor.pkl', 'rb') as h:
                receptor = cPickle.load(h)
        except Exception:
            # No usable cache: parse the PDB file (slow), then cache the result
            receptor = Chem.MolFromPDBFile(r)
            with open('receptor.pkl', 'wb') as f:
                cPickle.dump(receptor, f)
        return receptor
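For the chain splitting itself within RDKit, something along these lines might work. It is a minimal sketch, assuming Python 3 and an RDKit build that provides Chem.SplitMolByPDBChainId; the function name split_chains_rdkit and the prefix argument are hypothetical. Note that real PDB files may fail RDKit's default sanitization, in which case MolFromPDBBlock returns None:

```python
import gzip

from rdkit import Chem


def split_chains_rdkit(pdb_gz_path, prefix):
    """Write one PDB file per chain using RDKit only; return {chain_id: path}."""
    # Read the gzipped file as text and parse it from memory
    with gzip.open(pdb_gz_path, "rt") as handle:
        mol = Chem.MolFromPDBBlock(handle.read())
    if mol is None:
        raise ValueError("RDKit could not parse {}".format(pdb_gz_path))

    # Returns a dict mapping each PDB chain ID to a molecule holding that chain
    by_chain = Chem.SplitMolByPDBChainId(mol)

    written = {}
    for chain_id, chain_mol in by_chain.items():
        out_path = "{}_{}.pdb".format(prefix, chain_id)
        Chem.MolToPDBFile(chain_mol, out_path)
        written[chain_id] = out_path
    return written
```

The slow step is still the parsing, so this fits best when the RDKit molecule is needed afterwards anyway.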
HTH,
Stéphane
--
Assistant Professor in BioInformatics, UFIP, UMR 6286 CNRS, Team Protein
Design In Silico
UFR Sciences et Techniques, 2, rue de la Houssinière, Bât. 25, 44322
Nantes cedex 03, France
Tél : +33 251 125 636 / Fax : +33 251 125 632
http://www.ufip.univ-nantes.fr/ - http://www.steletch.org