Dear George,

On Wed, Mar 4, 2009 at 6:15 PM, George Oakman <[email protected]> wrote:
> Hi all,
>
> I have noticed that the MACCS Fingerprints are only available via the Python
> framework. Has anyone implemented a C++ version on RDKit already?

Not that I'm aware of, but it would be something useful to have.

>
> If not, I would be very grateful if someone could give me a few pointers to
> get me started on implementing a C++ equivalent to
> the Chem.MACCSKeys.GenMACCSKeys Python function.

Translating RDKit Python code into C++ is frequently pretty
straightforward; the data structures and interfaces are usually not
massively different.

A possible starting point for figuring out how to port the MACCS keys
is the code for the Crippen descriptors in
$RDBASE/Code/GraphMol/Descriptors/Crippen.cpp
That has some of the pieces you would need, including some code for
reading a set of SMARTS definitions from a text string in a flexible
manner that one could use as a model.
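
To make the "definitions from a text string" idea concrete, here is a
minimal sketch using only the standard library. It is not the Crippen
code: the tab-separated line format (key id, SMARTS, count), the
KeyDefinition struct and the parseKeyDefinitions function are all
hypothetical names invented for illustration. In a real port the SMARTS
strings would additionally be compiled into query molecules.

```cpp
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// One key definition: bit index, SMARTS pattern, and the target count
// used when deciding whether to set the bit. Hypothetical layout.
struct KeyDefinition {
  unsigned int keyId;
  std::string smarts;
  unsigned int minCount;
};

// Parses definitions from a text block holding one
// "id<TAB>SMARTS<TAB>count" entry per line (an assumed format).
// Blank lines and lines starting with '#' are skipped; malformed
// lines are silently ignored in this sketch.
std::vector<KeyDefinition> parseKeyDefinitions(const std::string &text) {
  std::vector<KeyDefinition> defs;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line)) {
    if (line.empty() || line[0] == '#') continue;
    std::istringstream fields(line);
    std::string idField, smarts, countField;
    if (!std::getline(fields, idField, '\t') ||
        !std::getline(fields, smarts, '\t') ||
        !std::getline(fields, countField, '\t')) {
      continue;  // fewer than three fields
    }
    try {
      KeyDefinition def;
      def.keyId = static_cast<unsigned int>(std::stoul(idField));
      def.smarts = smarts;
      def.minCount = static_cast<unsigned int>(std::stoul(countField));
      defs.push_back(def);
    } catch (const std::exception &) {
      continue;  // non-numeric id or count
    }
  }
  return defs;
}
```

Keeping the definitions in a text block like this makes it easy to
carry the table over from the Python module without hand-editing each
entry.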

Once you have the definitions for the MACCS keys, you need to apply
them to molecules, count the number of matches, and compare those
counts to the targets in the parameter set. The fingerprints can be
returned as an ExplicitBitVect. Note that keys 125 and 166 can't be
defined purely as SMARTS, but the definitions for these are relatively
simple.
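
The matching step described above could be sketched roughly as follows.
This is only an outline under stated assumptions: std::vector<bool>
stands in for ExplicitBitVect, the substructure search is hidden behind
a caller-supplied callback (in a real port it would run the SMARTS
query against the molecule), and the rule "set the bit when the match
count exceeds the target" is my assumption about how the counts are
compared. KeyTarget, MatchCounter and buildFingerprint are hypothetical
names. Keys 125 and 166 would need their own special-case code, which
is not shown.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// A key with its SMARTS pattern and target count (hypothetical layout).
struct KeyTarget {
  unsigned int keyId;
  std::string smarts;
  unsigned int target;
};

// Stand-in for the substructure search: returns how many times the
// given SMARTS pattern matches the molecule being fingerprinted.
using MatchCounter = std::function<unsigned int(const std::string &)>;

// Builds a bit vector with one bit per key id. A bit is set when the
// number of matches is greater than the key's target (assumed rule).
// Keys whose id does not fit into numBits are ignored.
std::vector<bool> buildFingerprint(const std::vector<KeyTarget> &keys,
                                   const MatchCounter &countMatches,
                                   std::size_t numBits) {
  std::vector<bool> bits(numBits, false);
  for (const auto &key : keys) {
    if (key.keyId >= numBits) continue;
    if (countMatches(key.smarts) > key.target) {
      bits[key.keyId] = true;
    }
  }
  return bits;
}
```

Passing the matcher in as a callback keeps the bit-setting logic
testable without a molecule; the real version would take the molecule
directly and return an ExplicitBitVect.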

That's not exactly a recipe for the reimplementation, but it might be
enough to get you started if you choose to do it.

> This may be silly, but could I simply write a C++ wrapper around the Python
> function?

You probably could, but I'm guessing that it would be more trouble
than the reimplementation.

-greg
