Dear George,

On Wed, Mar 4, 2009 at 6:15 PM, George Oakman <[email protected]> wrote:
> Hi all,
>
> I have noticed that the MACCS Fingerprints are only available via the Python
> framework. Has anyone implemented a C++ version on RDKit already?

Not that I'm aware of, but it would be something useful to have.

> If not, I would be very grateful if someone could give me a few pointers to
> get me started on implementing a C++ equivalent to
> the Chem.MACCSKeys.GenMACCSKeys Python function.

Translating RDKit Python code into C++ is frequently pretty
straightforward; the data structures and interfaces are usually not
massively different.

A possible starting point for figuring out how to port the MACCS keys
is the code for the Crippen descriptors in
$RDBASE/Code/GraphMol/Descriptors/Crippen.cpp
That has some of the pieces you would need, including some code for
reading a set of SMARTS definitions from a text string in a flexible
manner, which one could use as a model.

Once you have the definitions for the MACCS keys, you need to apply
them to molecules, count the number of matches, and compare those
counts to the targets in the parameter set. The fingerprints can be
returned as an ExplicitBitVect. Note that keys 125 and 166 can't be
defined purely as SMARTS, but the definitions for these are relatively
simple.

That's not exactly a recipe for the reimplementation, but it might be
enough to get you started if you choose to do it.

> This may be silly, but could I simply write a C++ wrapper around the Python
> function?

You probably could, but I'm guessing that it would be more trouble
than the reimplementation.

-greg
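
For illustration, the "read definitions from a text string, count
matches, compare to the targets" step described above might be sketched
roughly as below. This is plain standard C++ and deliberately makes no
RDKit calls. The names KeyDef, parseKeyDefs, generateKeys and
countSubstring, the one-definition-per-line "key pattern threshold"
format, and the rule that a bit is set when the match count exceeds the
threshold are all assumptions made for the example, not taken from the
RDKit sources. The substring counter is only a toy stand-in for real
SMARTS matching, and a real port would presumably return an
ExplicitBitVect rather than a std::vector<bool>.

```cpp
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// One key definition: key number, pattern text, match-count threshold.
// (Hypothetical structure, for illustration only.)
struct KeyDef {
  unsigned int key;
  std::string pattern;
  unsigned int threshold;
};

// Parse definitions from a text string, one "key pattern threshold" per
// line. Blank lines, malformed lines and lines starting with '#' are
// skipped. The format itself is an assumption of this sketch.
std::vector<KeyDef> parseKeyDefs(const std::string &text) {
  std::vector<KeyDef> defs;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line)) {
    if (line.empty() || line[0] == '#') continue;
    std::istringstream fields(line);
    KeyDef def;
    if (fields >> def.key >> def.pattern >> def.threshold) {
      defs.push_back(def);
    }
  }
  return defs;
}

// Callback that counts how often a pattern matches a molecule. In a real
// port this would wrap SMARTS substructure matching.
using MatchCounter =
    std::function<unsigned int(const std::string &, const std::string &)>;

// Set bit `key` when the match count exceeds that key's threshold
// (assumed comparison rule). Keys that need special handling, such as
// 125 and 166, are not covered here.
std::vector<bool> generateKeys(const std::string &mol,
                               const std::vector<KeyDef> &defs,
                               std::size_t numBits,
                               const MatchCounter &countMatches) {
  std::vector<bool> bits(numBits, false);
  for (const auto &def : defs) {
    if (def.key >= numBits) continue;  // ignore out-of-range keys
    if (countMatches(mol, def.pattern) > def.threshold) {
      bits[def.key] = true;
    }
  }
  return bits;
}

// Toy stand-in for substructure matching: counts non-overlapping
// occurrences of `pattern` in `mol`. This is NOT SMARTS matching.
unsigned int countSubstring(const std::string &mol,
                            const std::string &pattern) {
  if (pattern.empty()) return 0;
  unsigned int n = 0;
  for (std::size_t pos = mol.find(pattern); pos != std::string::npos;
       pos = mol.find(pattern, pos + pattern.size())) {
    ++n;
  }
  return n;
}
```

Passing the counter in as a callback keeps the count-and-compare loop
separate from the matching itself, so the toy counter could later be
swapped for one built on real substructure matching without touching
generateKeys.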

