Dear Adrian and Markus,

On Fri, May 23, 2008 at 11:41 AM, Adrian Schreyer
<[email protected]> wrote:
>
> I do not use RDKit for handling protein structures (I do not think
> this is supported);

Indeed it's not. The system cannot currently represent macromolecules
in anything like an efficient manner.

> instead, I use Bio.PDB to parse PDB files. This is
> one of the best PDB parsers (in my opinion), and better than the Open
> Babel or OEChem implementations. Unfortunately, there is no software I
> am aware of which can handle small molecules as well as protein
> structures reliably, i.e. keep track of disordered atoms, insertion
> codes, ligand identifiers, atom names, alternative atom locations etc.
>
> There is an implementation of the KDTree algorithm in Bio.PDB, which I
> use to get all contacts between ligand and amino acid atoms. I am
> working on a paper at the moment which describes the methods I used to
> get the SIFTs from protein-ligand complexes. I intend to make my
> protein-ligand interaction database including the API publicly
> available (as database dumps); maybe that will be helpful.

I'd be interested in this and the paper. SIFTs (or PLIFs as CCG calls
them) are a very interesting method.

> Bio.PDB uses the Biopython license which is extremely liberal; maybe
> there is a way to use it as the foundation for a PDBMolSupplier and a
> RDBioMol class...okay I am dreaming a bit here! ;)

I'm not sure that it's too extreme a dream. Since most of the RDKit
functionality doesn't make sense for proteins and most protein
functionality doesn't make sense for small molecules, it would
primarily be a matter of having some code to extract ligand
information from the Biopython complex and convert that into an RDKit
molecule, probably via a mol block. If you store the PDB atom ID <->
RDKit atom index mapping, you could then map results from
small-molecule operations (e.g. substructure searches) back onto
ligand atoms in the complex.
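
A rough sketch of the bookkeeping, in plain Python so that it doesn't
depend on either toolkit. The names (LigandAtom, ligand_to_molblock,
match_to_serials) are made up for illustration, and I'm assuming the
ligand bonds come from somewhere else (CONECT records, a ligand
dictionary), since PDB coordinates alone don't carry bond orders:

```python
from typing import List, NamedTuple, Sequence, Tuple


class LigandAtom(NamedTuple):
    """Hypothetical container for one ligand atom taken from a PDB file."""
    serial: int    # PDB atom serial number
    name: str      # PDB atom name, e.g. "C1"
    element: str   # element symbol as found in the PDB file, e.g. "C" or "CL"
    x: float
    y: float
    z: float


def ligand_to_molblock(
    atoms: Sequence[LigandAtom],
    bonds: Sequence[Tuple[int, int, int]],
    title: str = "ligand",
) -> Tuple[str, List[int]]:
    """Write a V2000 mol block and return it with an index -> serial map.

    bonds holds (serial_a, serial_b, bond_order) tuples; where they come
    from is left open (an assumption of this sketch).
    """
    # Position in the atom list is the 0-based atom index a toolkit would
    # see after parsing the mol block.
    serial_to_index = {atom.serial: i for i, atom in enumerate(atoms)}

    lines = [title, "", ""]  # three header lines: title, program, comment
    lines.append(f"{len(atoms):3d}{len(bonds):3d}" + "  0" * 8 + "999 V2000")
    for atom in atoms:
        symbol = atom.element.capitalize()  # PDB "CL" -> mol block "Cl"
        lines.append(
            f"{atom.x:10.4f}{atom.y:10.4f}{atom.z:10.4f} {symbol:<3s} 0"
            + "  0" * 11
        )
    for serial_a, serial_b, order in bonds:
        # Mol block bond lines use 1-based atom numbers.
        a = serial_to_index[serial_a] + 1
        b = serial_to_index[serial_b] + 1
        lines.append(f"{a:3d}{b:3d}{order:3d}  0")
    lines.append("M  END")

    index_to_serial = [atom.serial for atom in atoms]
    return "\n".join(lines) + "\n", index_to_serial


def match_to_serials(match: Sequence[int], index_to_serial: Sequence[int]) -> Tuple[int, ...]:
    """Translate 0-based atom indices (e.g. a substructure match) to PDB serials."""
    return tuple(index_to_serial[i] for i in match)


# Example with a three-atom fragment; the serials and coordinates are invented.
example_atoms = [
    LigandAtom(2001, "C1", "C", 0.000, 0.000, 0.000),
    LigandAtom(2002, "C2", "C", 1.500, 0.000, 0.000),
    LigandAtom(2003, "O1", "O", 2.100, 1.200, 0.000),
]
example_bonds = [(2001, 2002, 1), (2002, 2003, 1)]
molblock, index_to_serial = ligand_to_molblock(example_atoms, example_bonds)
```

On the RDKit side the mol block would go through
Chem.MolFromMolBlock(molblock, removeHs=False), keeping the hydrogens
so that the atom indices stay aligned with the list above. The tuple
returned by mol.GetSubstructMatch(query) can then be passed to
match_to_serials to get back to the PDB atoms.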

or something

-greg
