Re: [Rdkit-discuss] comparing two or more tables of molecules

2016-11-28 Thread Dimitri Maziuk
On 11/28/2016 10:25 AM, Stephen O'hagan wrote:
> Has anyone come up with a fool-proof way of matching structurally equivalent
> molecules?
This is somewhat convoluted and there is no proof that it's fool-proof. A few years ago we had good results from running the graphpowerhash() function here:

Re: [Rdkit-discuss] comparing two or more tables of molecules

2016-11-28 Thread Rocco Moretti
On Mon, Nov 28, 2016 at 11:31 AM, Christos Kannas wrote:
> I think it would be better to use a similarity metric based on fingerprints.
Hi Christos, Fingerprints will only work if the fingerprint method you use captures all of the salient information you're interested
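
To illustrate the caveat about fingerprints, here is a minimal sketch of a Tanimoto comparison. It assumes fingerprints are modeled as plain Python sets of on-bit indices; the bit values are made up for illustration and are not produced by any real toolkit. A score of 1.0 only says the two fingerprints are identical, not that the molecules are, which is the case when the fingerprint does not encode a feature you care about.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical fingerprints (made-up bit indices).
fp_one = {3, 17, 42, 101}
fp_two = {3, 17, 42, 101}    # same bits, yet possibly a different molecule
fp_three = {3, 17, 42, 250}  # shares three of five distinct bits with fp_one

print(tanimoto(fp_one, fp_two))
print(tanimoto(fp_one, fp_three))
```

If exact structural identity is the goal, a similarity threshold of this kind can at best shortlist candidates for a stricter comparison.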

Re: [Rdkit-discuss] comparing two or more tables of molecules

2016-11-28 Thread Rajarshi Guha
It really boils down to how you standardize molecules such that you end up with a canonical structure. SMILES is not the issue here - if your standardizer does a proper job with aromaticity, tautomers, etc., then you can get a canonical SMILES. You can use the InChI model as well to generate a
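
Once every structure is reduced to a canonical key, comparing tables becomes a join on that key. The following is a minimal sketch under stated assumptions, not a definitive implementation: `match_tables` and the table layout (name mapped to a list of `(row_id, structure)` pairs) are hypothetical, and `canonical_smiles` assumes RDKit is installed. RDKit's canonical SMILES normalizes Kekule vs aromatic input and explicit hydrogens written in the SMILES, but it does not unify tautomers; that would need a separate standardization step.

```python
from collections import defaultdict


def canonical_smiles(smiles):
    """Canonical key for one structure. Assumes RDKit is installed (imported lazily)."""
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None  # unparsable input gets no key
    return Chem.MolToSmiles(mol)  # canonical by default


def match_tables(tables, key=canonical_smiles):
    """Group (table_name, row_id) pairs by canonical key.

    `tables` maps a table name to a list of (row_id, structure) pairs.
    Only keys that occur in more than one table are returned.
    """
    groups = defaultdict(list)
    for table_name, rows in tables.items():
        for row_id, structure in rows:
            k = key(structure)
            if k is not None:
                groups[k].append((table_name, row_id))
    return {
        k: hits
        for k, hits in groups.items()
        if len({table for table, _ in hits}) > 1
    }
```

With RDKit available, `match_tables({"a": [(1, "C1=CC=CC=C1")], "b": [(7, "c1ccccc1")]})` should pair row 1 with row 7, since both inputs describe benzene. Any other canonicalizer, such as an InChI-based one, can be passed in as `key`.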

Re: [Rdkit-discuss] comparing two or more tables of molecules

2016-11-28 Thread Christos Kannas
Hi Steve, I think it would be better to use a similarity metric based on fingerprints. Regards, Christos Christos Kannas Researcher Ph.D Student On 28 November 2016 at 18:25, Stephen O'hagan

[Rdkit-discuss] comparing two or more tables of molecules

2016-11-28 Thread Stephen O'hagan
Has anyone come up with a fool-proof way of matching structurally equivalent molecules? Unique SMILES or InChI string comparisons don't appear to work, presumably because there are different but equivalent structures, e.g. explicit vs non-explicit H's, Kekule vs aromatic, isomeric forms vs

[Rdkit-discuss] tuning postgres for substructure search

2016-11-28 Thread Alexander Klenner-Bajaja
Dear all, Does anyone have experience with postgresql.conf and tuning its parameters for efficient substructure search in collections of about 5-10 million compounds? We have about 250GB of RAM and 40 CPUs at hand. Concurrency will be around 10 simultaneous connections/queries, so any memory

Re: [Rdkit-discuss] Using ETKDG for terminal ureas and thioureas

2016-11-28 Thread Paolo Tosco
Hi Susan, that's an interesting one. As a separate problem from the fact that the NH2 is tilted out of plane, according to this paper: Godfrey, Peter D., Ronald D. Brown, and Andrew N. Hunter. "The shape of urea." /Journal of molecular structure/ 413 (1997): 405-414.