Hi Chris,
The following, though quite inefficient, will work:

from rdkit import Chem

mol = Chem.MolFromPDBFile("1CX2.pdb")
# Distinct chain IDs present in the structure.
chains = {a.GetPDBResidueInfo().GetChainId() for a in mol.GetAtoms()}
# One editable copy of the whole structure per chain.
chain_mols = {c: Chem.RWMol(mol) for c in chains}
for c, m in chain_mols.items():
    # Bonds with at least one atom outside chain c.
    bonds_to_remove = [(b.GetBeginAtomIdx(), b.GetEndAtomIdx())
                       for b in m.GetBonds()
                       if b.GetBeginAtom().GetPDBResidueInfo().GetChainId() != c
                       or b.GetEndAtom().GetPDBResidueInfo().GetChainId() != c]
    # Atoms outside chain c.
    atoms_to_remove = [a.GetIdx() for a in m.GetAtoms()
                       if a.GetPDBResidueInfo().GetChainId() != c]
    for begin_idx, end_idx in bonds_to_remove:
        m.RemoveBond(begin_idx, end_idx)
    # Remove from the highest index down so the remaining indices stay valid.
    for idx in sorted(atoms_to_remove, reverse=True):
        m.RemoveAtom(idx)
    Chem.MolToPDBFile(m, "{0:s}.pdb".format(c))

Each chain is saved to its own file, <chain_id>.pdb.
As chains will be separate fragments, a more efficient way would be to use
rdmolops.GetMolFrags(mol, asMols=True), which would avoid the bond/atom removal.
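
Here is a rough, untested-on-1CX2 sketch of that GetMolFrags idea. The helper
names (read_pdb_gz, split_by_chain, write_chains) are made up for
illustration. It assumes every atom in a fragment carries the same chain ID
as the fragment's first atom, which would not hold if two chains are
covalently linked (e.g. by a disulfide). Waters and ligands come back as
separate fragments, so fragments that share a chain ID are merged again.

```python
import gzip

from rdkit import Chem


def read_pdb_gz(path):
    """Load a gzipped PDB file (hypothetical helper for the foo.pdb.gz case)."""
    with gzip.open(path, "rt") as handle:
        return Chem.MolFromPDBBlock(handle.read())


def split_by_chain(mol):
    """Group the disconnected fragments of a PDB-derived molecule by chain ID."""
    by_chain = {}
    for frag in Chem.GetMolFrags(mol, asMols=True):
        info = frag.GetAtomWithIdx(0).GetPDBResidueInfo()
        # Assumption: the first atom's chain ID applies to the whole fragment.
        chain_id = info.GetChainId() if info is not None else ""
        if chain_id in by_chain:
            # Merge with earlier fragments (ligands, waters) of the same chain.
            by_chain[chain_id] = Chem.CombineMols(by_chain[chain_id], frag)
        else:
            by_chain[chain_id] = frag
    return by_chain


def write_chains(mol):
    """Write each chain to <chain_id>.pdb in the current directory."""
    for chain_id, chain_mol in split_by_chain(mol).items():
        Chem.MolToPDBFile(chain_mol, "{0:s}.pdb".format(chain_id))
```

Usage would be along the lines of write_chains(read_pdb_gz("foo.pdb.gz")).
Recent RDKit versions also provide Chem.SplitMolByPDBChainId(mol), which as
far as I know returns a dict of molecules keyed by chain ID and may be the
simpler option if your version has it.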
Sorry for the poor formatting, but this is what I could come up with using
IPython on the iPhone :-(
p.
> On 5 Oct 2019, at 12:46, Chris Swain via Rdkit-discuss
> <[email protected]> wrote:
>
> Hi,
>
> I have a number of PDB files (foo.pdb.gz) and I want to separate each chain
> in each file out into a separate file. So if a file contains 4 chains it will
> generate 4 separate files.
>
> Can I do this using RDKit, if so how?
>
> Cheers
>
> Chris
>
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss