I'm trying to get to grips with using the RDKit cartridge, and so far it's going well. One thing I'm concerned about is molecule standardization, along the lines of the ChemAxon Standardizer, which allows substructure searches to be done in a way that is largely independent of the quirks of structure representation. The classic example is how nitro groups are represented: it shouldn't matter which nitro representation is used in the query or the target structures, because both are converted to a canonical form.
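To make the nitro example concrete, here is a minimal sketch using the RDKit Python API rather than the cartridge. It relies on the clean-up step of RDKit's default sanitization, which converts neutral pentavalent nitro groups to the charge-separated form. The helper name `canonical_form` is made up for illustration, and the sketch does not establish what the cartridge itself does on input.

```python
from rdkit import Chem


def canonical_form(smiles):
    """Hypothetical helper: parse a SMILES, let RDKit sanitize it,
    strip explicit hydrogens, and return the canonical SMILES."""
    # Default sanitization includes a clean-up step that rewrites
    # neutral N(=O)=O as the charge-separated [N+](=O)[O-].
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %s" % smiles)
    # Explicit hydrogens are removed so they do not affect the result.
    mol = Chem.RemoveHs(mol)
    return Chem.MolToSmiles(mol)


# Two representations of nitromethane, plus one with explicit hydrogens.
pentavalent = canonical_form("CN(=O)=O")
charged = canonical_form("C[N+](=O)[O-]")
with_hs = canonical_form("[H]C([H])([H])N(=O)=O")

# A query drawn in one nitro form against a target drawn in the other.
query = Chem.MolFromSmiles(canonical_form("CN(=O)=O"))
target = Chem.MolFromSmiles(canonical_form("c1ccccc1C[N+](=O)[O-]"))
is_match = target.HasSubstructMatch(query)
```

With both sides passed through the same function, the three nitromethane inputs should end up as the same canonical SMILES, and the substructure query should match regardless of which nitro form was drawn.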
My initial thoughts are that this would be done by:

1. loading the "raw" structures into a source column that would never be changed
2. defining a function that performed the necessary transform to generate the canonical form of a molecule
3. generating a "canonical" structure column that was the result of passing the raw structures through that function
4. building the SSS index on that canonical column
5. executing queries using that function to canonicalize the query structure

The problem I'm finding is that there do not seem to be postgres functions defined for doing molecular transforms (essentially a reaction transform) and doing things like removing explicit hydrogens. At least not in the functions listed on this page:
http://rdkit.org/docs/Cartridge.html#functions

Am I missing something here, or might I be barking up completely the wrong tree?

Tim

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss