Hi Chris,

The following, though quite inefficient, will work:

from rdkit import Chem

mol = Chem.MolFromPDBFile("1CX2.pdb")
chains = {a.GetPDBResidueInfo().GetChainId() for a in mol.GetAtoms()}
# One editable copy of the whole structure per chain
chain_mols = {c: Chem.RWMol(mol) for c in chains}
for c, m in chain_mols.items():
    # Bonds with at least one atom outside chain c
    bonds_to_remove = [
        (b.GetBeginAtomIdx(), b.GetEndAtomIdx())
        for b in m.GetBonds()
        if b.GetBeginAtom().GetPDBResidueInfo().GetChainId() != c
        or b.GetEndAtom().GetPDBResidueInfo().GetChainId() != c
    ]
    atoms_to_remove = [
        a.GetIdx() for a in m.GetAtoms()
        if a.GetPDBResidueInfo().GetChainId() != c
    ]
    for b in bonds_to_remove:
        m.RemoveBond(*b)
    # Remove from the highest index down so the remaining indices stay valid
    for a in sorted(atoms_to_remove, reverse=True):
        m.RemoveAtom(a)
    Chem.MolToPDBFile(m, "{0:s}.pdb".format(c))

Each chain is saved to <chain_id>.pdb.

As the chains will be separate fragments, a more efficient way would be to use 
rdmolops.GetMolFrags(mol, asMols=True), which would avoid the bond/atom removal.
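
A rough, untested-on-1CX2 sketch of that idea follows. One caveat: waters and 
ligands are fragments of their own, so the fragments have to be grouped by 
chain ID and merged again. The names split_by_chain and write_chains are made 
up for this example, and the sketch assumes no bond links two different chains 
(otherwise both chains end up in the same fragment).

```python
from collections import defaultdict

from rdkit import Chem


def split_by_chain(mol):
    """Return {chain_id: Mol}, built from the connected fragments of mol.

    Assumes every fragment belongs to a single chain, i.e. there are no
    inter-chain bonds.
    """
    frags_by_chain = defaultdict(list)
    for frag in Chem.GetMolFrags(mol, asMols=True, sanitizeFrags=False):
        # Use the first atom of the fragment to decide which chain it is in
        info = frag.GetAtomWithIdx(0).GetPDBResidueInfo()
        chain_id = info.GetChainId() if info is not None else ""
        frags_by_chain[chain_id].append(frag)

    chain_mols = {}
    for chain_id, frags in frags_by_chain.items():
        # Merge the protein, ligands and waters of one chain into one Mol
        combined = frags[0]
        for frag in frags[1:]:
            combined = Chem.CombineMols(combined, frag)
        chain_mols[chain_id] = combined
    return chain_mols


def write_chains(pdb_path):
    """Write each chain of the PDB file at pdb_path to <chain_id>.pdb."""
    mol = Chem.MolFromPDBFile(pdb_path)
    for chain_id, chain_mol in split_by_chain(mol).items():
        Chem.MolToPDBFile(chain_mol, "{0:s}.pdb".format(chain_id))
```

Calling write_chains("1CX2.pdb") should then give the same files as the loop 
above. Merging many small fragments one by one with CombineMols is not free 
either, so for structures with a lot of waters it is worth timing both.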

Sorry for the poor formatting, but this is what I could come up with using 
IPython on the iPhone :-(

p.

> On 5 Oct 2019, at 12:46, Chris Swain via Rdkit-discuss 
> <rdkit-discuss@lists.sourceforge.net> wrote:
> 
> Hi,
> 
> I have a number of PDB files (foo.pdb.gz) and I want to separate each chain 
> in each file out into a separate file. So if a file contains 4 chains it will 
> generate 4 separate files.
> 
> Can I do this using RDKit, if so how?
> 
> Cheers
> 
> Chris
> 
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

