Hi Stephan,

On Thu, Mar 7, 2013 at 6:08 PM, Stephan Reiling <[email protected]> wrote:
> I am encountering the following problem with the rdkit postgresql cartridge:
> The smiles below can be inserted to a table as a datatype mol. I can
> calculate fingerprints on it, no problem. is_valid_smiles() returns true,
> mol_numatoms() works.
> But when I try to retrieve the structure I get the error:
> #ERROR: makeMolText: Unknown exception
> which causes backups of the database using pg_dump to fail. This is with
> RDKit version 2012_12_1 and postgresql 9.1.5
>
> The problematic Smiles is:
> "C1CC2CC3C45C2C2C6C7C8C9C%10C(C1)C1C%11%10C%109C98C87C76C42C24C65C3C3C56C64C4%12C72C28C79C8%10C9%11C1C1C%109C98C87C42C24C7%12C%116C65C3C3C56C6%11C%117C74C4%12C82C29C8%10C1C1C98C42C24C89C1C1C98C84C4%10C%122C27C7%11C%116C65C3C3C56C6%11C%117C42C24C7%11C%116C65C3C3C56C6%11C%117C74C4%12C%102C28C89C1C1C98C42C24C89C1C1C98C84C4%10C%122C27C7%11C%116C65C3C3C56C6%11C%117C42C24C7%11C%116C65C3C3C56C6%11C%117C74C4%12C%102C28C89C1C1C98C42C24C89C1CC8C4C1C%122C27C4%11C76C65C3CC6C7C4C12"
>
Here's what's going on. That SMILES (which, by the way, looks like one of the pathological "someone transcribing the structures in a patent drew a table using ChemDraw" things that Noel blogged about last year: http://nextmovesoftware.com/blog/2012/11/02/patently-wrong-tracing-the-origin-of-an-unusual-molecule-in-pubchem/) can be happily processed by the RDKit:

In [6]: smi="C1CC2CC3C45C2C2C6C7C8C9C%10C(C1)C1C%11%10C%109C98C87C76C42C24C65C3C3C56C64C4%12C72C28C79C8%10C9%11C1C1C%109C98C87C42C24C7%12C%116C65C3C3C56C6%11C%117C74C4%12C82C29C8%10C1C1C98C42C24C89C1C1C98C84C4%10C%122C27C7%11C%116C65C3C3C56C6%11C%117C42C24C7%11C%116C65C3C3C56C6%11C%117C74C4%12C%102C28C89C1C1C98C42C24C89C1C1C98C84C4%10C%122C27C7%11C%116C65C3C3C56C6%11C%117C42C24C7%11C%116C65C3C3C56C6%11C%117C74C4%12C%102C28C89C1C1C98C42C24C89C1CC8C4C1C%122C27C4%11C76C65C3CC6C7C4C12"

In [7]: m = Chem.MolFromSmiles(smi)

But it cannot be converted back to SMILES, because the SMILES-creation algorithm ends up having more than 99 rings open at once:

In [8]: Chem.MolToSmiles(m)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-6529fcd6b60a> in <module>()
----> 1 Chem.MolToSmiles(m)

ValueError: Too many rings open at once. SMILES cannot be generated.

It appears that postgresql is using makeMolText, which creates a SMILES, when you're backing up the database. That is what leads to the problem. I have to look back at how to control which functions pg_dump calls; it probably shouldn't be using SMILES (which is computationally expensive to parse) in backups.

Thanks for the bug report.

-greg
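To make the "rings open at once" limit concrete, here is a minimal sketch that counts how many ring-closure labels are open simultaneously while reading a SMILES string left to right. This is not how the RDKit does it internally; the function name max_open_ring_closures is made up for illustration, and the sketch assumes only single-digit and two-digit %nn labels (no toolkit-specific extensions). Since two-digit labels stop at %99, a writer whose atom ordering would need more labels open than that at some point has nothing left to assign.

```python
def max_open_ring_closures(smiles: str) -> int:
    """Return the largest number of ring-closure labels open at the same time.

    Illustrative helper (hypothetical name), not part of the RDKit API.
    """
    open_labels = set()
    max_open = 0
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "[":
            # Skip bracket atoms: digits inside are isotopes, H counts or
            # charges, not ring closures.
            i = smiles.index("]", i) + 1
            continue
        if ch == "%":
            # Two-digit ring-closure label, e.g. %12.
            label = int(smiles[i + 1:i + 3])
            i += 3
        elif ch in "0123456789":
            # Single-digit ring-closure label.
            label = int(ch)
            i += 1
        else:
            i += 1
            continue
        if label in open_labels:
            # Second occurrence closes the ring and frees the label.
            open_labels.remove(label)
        else:
            # First occurrence opens a ring.
            open_labels.add(label)
            max_open = max(max_open, len(open_labels))
    return max_open


print(max_open_ring_closures("C1CC2CCCC2C1"))
```

The input SMILES in this thread only uses labels up to %12, which is why parsing succeeds. The failure shows up on output, where the writer picks its own atom ordering and, for this molecule, ends up needing more labels open than are available.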
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

