Hi Stephan,

On Thu, Mar 7, 2013 at 6:08 PM, Stephan Reiling <[email protected]> wrote:
> I am encountering the following problem with the rdkit postgresql cartridge:
> The smiles below can be inserted to a table as a datatype mol. I can
> calculate fingerprints on it, no problem. is_valid_smiles() returns true,
> mol_numatoms() works.
> But when I try to retrieve the structure I get the error:
> #ERROR: makeMolText: Unknown exception
> which causes backups of the database using pg_dump to fail. This is with
> RDKit version 2012_12_1 and postgresql 9.1.5
>
> The problematic Smiles is:
> "C1CC2CC3C45C2C2C6C7C8C9C%10C(C1)C1C%11%10C%109C98C87C76C42C24C65C3C3C56C64C4%12C72C28C79C8%10C9%11C1C1C%109C98C87C42C24C7%12C%116C65C3C3C56C6%11C%117C74C4%12C82C29C8%10C1C1C98C42C24C89C1C1C98C84C4%10C%122C27C7%11C%116C65C3C3C56C6%11C%117C42C24C7%11C%116C65C3C3C56C6%11C%117C74C4%12C%102C28C89C1C1C98C42C24C89C1C1C98C84C4%10C%122C27C7%11C%116C65C3C3C56C6%11C%117C42C24C7%11C%116C65C3C3C56C6%11C%117C74C4%12C%102C28C89C1C1C98C42C24C89C1CC8C4C1C%122C27C4%11C76C65C3CC6C7C4C12"
>

Here's what's going on.
That SMILES (which, by the way, looks like one of the pathological
"someone transcribing the structures in a patent drew a table using
ChemDraw" things that Noel blogged about last year:
http://nextmovesoftware.com/blog/2012/11/02/patently-wrong-tracing-the-origin-of-an-unusual-molecule-in-pubchem/)
can be happily processed by the RDKit:

In [6]: 
smi="C1CC2CC3C45C2C2C6C7C8C9C%10C(C1)C1C%11%10C%109C98C87C76C42C24C65C3C3C56C64C4%12C72C28C79C8%10C9%11C1C1C%109C98C87C42C24C7%12C%116C65C3C3C56C6%11C%117C74C4%12C82C29C8%10C1C1C98C42C24C89C1C1C98C84C4%10C%122C27C7%11C%116C65C3C3C56C6%11C%117C42C24C7%11C%116C65C3C3C56C6%11C%117C74C4%12C%102C28C89C1C1C98C42C24C89C1C1C98C84C4%10C%122C27C7%11C%116C65C3C3C56C6%11C%117C42C24C7%11C%116C65C3C3C56C6%11C%117C74C4%12C%102C28C89C1C1C98C42C24C89C1CC8C4C1C%122C27C4%11C76C65C3CC6C7C4C12"

In [7]: m = Chem.MolFromSmiles(smi)


But it cannot be converted back to SMILES, because the SMILES-creation
algorithm ends up having more than 99 rings open at once:

In [8]: Chem.MolToSmiles(m)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-6529fcd6b60a> in <module>()
----> 1 Chem.MolToSmiles(m)

ValueError: Too many rings open at once. SMILES cannot be generated.


It appears that PostgreSQL calls makeMolText, which generates a
SMILES, when you back up the database with pg_dump. That is what
triggers the error.
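
Until that is sorted out, one workaround is to screen structures
before they are loaded, flagging anything the RDKit can parse but
cannot write back out. What follows is only a minimal sketch of that
idea, not something the cartridge provides: the names rdkit_roundtrip
and find_unwritable are made up for illustration, and the broad
"except Exception" is an assumption, since the exact exception type
may differ between RDKit versions.

```python
from typing import Callable, Iterable, List, Tuple


def rdkit_roundtrip(smiles: str) -> str:
    """Parse a SMILES with the RDKit and write it back out."""
    # Imported lazily so find_unwritable() has no hard RDKit dependency.
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("SMILES could not be parsed")
    # This is the call that fails for the molecule discussed above.
    return Chem.MolToSmiles(mol)


def find_unwritable(
    records: Iterable[Tuple[str, str]],
    roundtrip: Callable[[str], str] = rdkit_roundtrip,
) -> List[Tuple[str, str]]:
    """Return (record_id, error message) for each SMILES failing the round trip."""
    failures = []
    for record_id, smiles in records:
        try:
            roundtrip(smiles)
        except Exception as exc:  # assumption: exception type varies by version
            failures.append((record_id, str(exc)))
    return failures
```

Calling find_unwritable() on (id, SMILES) pairs before the INSERT
gives a list of rows to hold back or store in some other form, so
they never reach a table that pg_dump has to serialize.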

I have to look back at how to control which functions pg_dump
calls; it probably shouldn't be using SMILES (which is computationally
expensive to parse) in backups.

Thanks for the bug report.
-greg

_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
