Hi Markus,

On Fri, May 23, 2008 at 10:49 AM, markus <[email protected]> wrote:
> Hi there,
>
> some questions about File input:
> 1. I want to read in large multiconformer sdf file.
>  This could be done by checking the canonical isomeric smiles.
>  However I wonder if that could be (or already is) implemented in some
> Supplier
>  as I suspect my Python Code to be slow.

There is currently no multi-conformer SD file reader. It would not be
hard to implement one, but first there has to be an agreed-upon
structure for the file. Matching molecules using canonical SMILES
would be very inefficient, since every record has to be parsed and
canonicalized just to find out which molecule it belongs to. It would
be much better to use a property to identify the conformers (either
the molecule name field or an SD property).
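
To make the name-based approach concrete, here is a minimal sketch in
plain Python (no RDKit calls) that groups the records of an SD file by
their title line. It assumes that all conformers of a molecule share the
same name in the first line of the record, which is a convention you
would have to enforce yourself; the function name
group_sd_records_by_name is made up for this example.

```python
from collections import OrderedDict


def group_sd_records_by_name(sd_text):
    """Group SD file records by their title line (the molecule name).

    Assumption: every conformer of a molecule carries the same name in
    the first line of its record. Returns an ordered mapping from name
    to a list of mol blocks (record text without the "$$$$" line).
    """
    groups = OrderedDict()
    record = []
    for line in sd_text.splitlines():
        if line.strip() == "$$$$":
            # End of a record: file it under its title line.
            if record:
                name = record[0].strip()
                groups.setdefault(name, []).append("\n".join(record) + "\n")
            record = []
        else:
            record.append(line)
    # Keep a final record that is missing its "$$$$" terminator.
    if any(part.strip() for part in record):
        name = record[0].strip()
        groups.setdefault(name, []).append("\n".join(record) + "\n")
    return groups
```

Grouping this way only compares strings, so no molecule has to be
perceived or canonicalized to decide where a record belongs. Each block
in a group could then be parsed (for example with Chem.MolFromMolBlock)
and its coordinates added to the first molecule as an extra conformer,
but that step is left out here.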

As an aside: SDF is a very inefficient way to store multi-conformer
data, because the atom and bond tables are repeated for every
conformer. The RDKit supports a format called TPL (an old MSI/Biosym
format) that's a lot more efficient.

> 2. How about Proteins / binding Pockets? As RDKit explicitly is designed
> for
> small Mols, I wonder if anyone has already loaded a Protein?
> I found a post by  Adrian Schreyer (*[Rdkit-discuss] Atom-level SMARTS
> matching*)
> mentioning SIFTs, This implies the perception of Pocket-residues.
> Was that done / Can that be done by the RDKit?

I'll say more about this in my reply to Adrian's reply.

-greg
