On 2015-09-24 16:22, Tim Dudgeon wrote:
> I'm trying to get to grips with using the RDKit cartridge, and so far
> it's going well.
> One thing I'm concerned about is molecule standardization, along the
> lines of the ChemAxon Standardizer that allows substructure searches to
> be done in a way that is largely independent of the quirks of structure
> representation. The classic example would be how nitro groups are
> represented, so that it didn't matter which nitro representation was in
> the query or target structures, because both were converted to a
> canonical form.
>
> My initial thoughts are that this would be done by:
> 1. loading the "raw" structures into a source column that would never be
> changed
> 2. defining a function that performed the necessary transform to
> generate the canonical form of a molecule.
> 3. generating a "canonical" structure column that was the result of
> passing the raw structures through that function
> 4. building the SSS index on that canonical column
> 5. executing queries using that function to canonicalize the query structure
>
> The problem I'm finding is that there do not seem to be postgres
> functions defined for doing molecular transforms (essentially a reaction
> transform) and doing things like removing explicit hydrogens. At least
> not in the functions listed on this page:
> http://rdkit.org/docs/Cartridge.html#functions
>
> Am I missing something here, or might I be barking up completely the
> wrong tree?
>
> Tim

Hi Tim,

We are in much the same situation, and we're adding standardization 
(beyond what RDKit implicitly does when it sanitizes the molecule) 
through Python stored procedures. You will need to build and maintain a 
normal Python-enabled RDKit installation in parallel to the cartridge. 
The Python stored procedures can then import that RDKit installation 
and run whatever Python code is necessary to do additional molecule 
cleanup.
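
To make that concrete, here is a minimal sketch of what such a stored
procedure might look like. It is only an illustration, not our actual
code: the function name standardize_smiles, the text-in/text-out
signature and the language name plpython3u are all assumptions (older
setups would use plpythonu), and the only cleanup shown is removing
explicit hydrogens and writing canonical SMILES. The Python script just
builds the CREATE FUNCTION statement so it can be fed to psql.

```python
# Body of the (hypothetical) stored procedure. PL/Python wraps this text in
# a function; the SQL argument "smiles" is available as a local variable.
PROCEDURE_BODY = """\
from rdkit import Chem

mol = Chem.MolFromSmiles(smiles)   # parsing also sanitizes the molecule
if mol is None:                    # unparsable input becomes SQL NULL
    return None
mol = Chem.RemoveHs(mol)           # drop explicit hydrogens
return Chem.MolToSmiles(mol)       # canonical SMILES
"""


def build_create_function(name="standardize_smiles", language="plpython3u"):
    """Return a CREATE FUNCTION statement wrapping PROCEDURE_BODY.

    Both defaults are illustrative assumptions, not fixed names.
    """
    return (
        f"CREATE OR REPLACE FUNCTION {name}(smiles text) RETURNS text AS $$\n"
        f"{PROCEDURE_BODY}"
        f"$$ LANGUAGE {language};\n"
    )


if __name__ == "__main__":
    # Print the DDL so it can be piped into psql.
    print(build_create_function())
```

Once a function like this exists, it could presumably be applied both
when filling the canonical column and to the query structure, so that
both sides go through the same cleanup. Any further transforms would be
added to the procedure body in the same way.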

You will need to tweak your Postgres environment so the Python stored 
procedures can load RDKit. This is what I have defined in an environment 
file on CentOS:

RDBASE=/opt/rdkit
LD_LIBRARY_PATH=/opt/rdkit/lib
PYTHONPATH=/opt/rdkit

On Ubuntu this would go into /etc/postgresql/9.x/main/environment (in a 
slightly different format where the values have to be single-quoted).
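
As a rough illustration of the difference between the two formats, the
small helper below rewrites KEY=VALUE lines into a single-quoted form.
The exact layout it emits (KEY = 'VALUE', with spaces around the equals
sign) is my assumption about what the Ubuntu environment file expects,
and to_pg_environment is just a made-up name. It also assumes the
values contain no quote characters.

```python
def to_pg_environment(lines):
    """Convert KEY=VALUE lines to KEY = 'VALUE' lines.

    Blank lines and '#' comments are skipped. Values are assumed to
    contain no single quotes, so no escaping is attempted.
    """
    converted = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, _, value = line.partition("=")
        converted.append(f"{key.strip()} = '{value.strip()}'")
    return "\n".join(converted)


# The three settings from the CentOS environment file above.
CENTOS_STYLE = """\
RDBASE=/opt/rdkit
LD_LIBRARY_PATH=/opt/rdkit/lib
PYTHONPATH=/opt/rdkit
"""

if __name__ == "__main__":
    print(to_pg_environment(CENTOS_STYLE.splitlines()))
```

After changing the environment file, Postgres has to be restarted
before the stored procedures see the new variables.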

Cheers
-- Jan, Biochemfusion

------------------------------------------------------------------------------
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
