Hi Andrew,

You could replace the aromatic atoms with non-organic atoms (e.g. aromatic carbon with calcium [Ca], aromatic nitrogen with sodium [Na], and so on, with single bonds between them), then use the atom-deleting procedure as normal. The structures should still be valid, so you should be able to use the usual RDKit functions on them.
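For the brute-force check mentioned in the quoted message (try all 2**n assignments and see which bond orders are possible for each aromatic bond), here is a minimal pure-Python sketch that does not use RDKit. The function name `kekule_bond_orders` and its arguments are hypothetical, made up for illustration. It assumes the caller already knows which aromatic atoms must carry exactly one double bond (for example aromatic c and pyridine-like n) and which carry none (for example o, s, or [nH]); charges and unusual valences are not handled.

```python
def kekule_bond_orders(n_atoms, bonds, needs_double):
    """Brute-force the Kekule assignments of one aromatic system.

    n_atoms:      number of atoms, indexed 0..n_atoms-1
    bonds:        list of (i, j) index pairs, one per aromatic bond
    needs_double: set of atom indices that must have exactly one double
                  bond (ASSUMPTION supplied by the caller); every other
                  atom must have none
    Returns a dict mapping each bond to the set of orders (1 and/or 2)
    it takes across all valid assignments.
    """
    seen = {bond: set() for bond in bonds}
    # Each bit of `mask` says whether the corresponding bond is double.
    for mask in range(2 ** len(bonds)):
        doubles = [b for k, b in enumerate(bonds) if (mask >> k) & 1]
        count = [0] * n_atoms
        for i, j in doubles:
            count[i] += 1
            count[j] += 1
        # Keep only assignments where every atom has the required number
        # of double bonds.
        valid = all(
            count[a] == (1 if a in needs_double else 0)
            for a in range(n_atoms)
        )
        if valid:
            for b in bonds:
                seen[b].add(2 if b in doubles else 1)
    return seen


# "c1cocn1": atoms 0=c, 1=c, 2=o, 3=c, 4=n; the o takes no double bond.
oxazole = kekule_bond_orders(
    5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)], {0, 1, 3, 4}
)
# "c1cnncn1": atoms 0=c, 1=c, 2=n, 3=n, 4=c, 5=n; all need a double bond.
triazine = kekule_bond_orders(
    6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)], set(range(6))
)
print(oxazole[(1, 2)])   # orders possible for the first "co" bond
print(triazine[(2, 3)])  # orders possible for the "nn" bond
```

A bond whose set is `{1}` or `{2}` has a forced order, and `{1, 2}` means both cut products would need to be generated. The loop is exponential in the number of aromatic bonds, which seems tolerable only because the fragments in question have at most 14 atoms.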
Thanks,
Jameed

________________________________
From: Andrew Dalke <da...@dalkescientific.com>
Sent: 20 August 2019 21:06
To: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Subject: [Rdkit-discuss] aromatic bonds and graph edit distance

Hi all,

Someone asked me recently about finding the graph edit distance of two small (<= 14 atom) fragments. I figured this was something that could be brute forced.

Following SmallWorld's example at https://cisrg.shef.ac.uk/shef2016/talks/oral13.pdf , given a fragment, incrementally delete terminals (except the "*" connection point atom), and ring bonds.

For chain bonds, and non-aromatic bonds, it's easy to delete the bond and add the correct number of hydrogens to either side.

But, what should I do when I cut an aromatic bond?

For something like the first "co" in "c1cocn1", I want the result to be C=CN=CO. That's because the "o" can only be "-O-" in Kekule form.

For something like "c1cnncn1", breaking on the "nn", I think I would like to get both 'N=CC=NC=N' and 'NC=CN=CN' because the "nn" can be a single or a double bond, depending on the Kekule representation, as in:

  >>> Chem.CanonSmiles("C-1=N-N=C-C=N-1")
  'c1cnncn1'
  >>> Chem.CanonSmiles("C-1=N.N=C-C=N-1")
  'N=CC=NC=N'
  >>> Chem.CanonSmiles("C=1-N=N-C=C-N=1")
  'c1cnncn1'
  >>> Chem.CanonSmiles("C=1-N-[HH].[HH]N-C=C-N=1")
  'NC=CN=CN'

Problem is, I don't know how to figure out if a given aromatic bond must be a "-" or "=", or can be both. (Well, I could brute-force enumerate all 2**n possible aromatic bond assignments, then canonicalize, and see if both assignments are possible for a given bond.)

As a non-chemist, I also ask if I'm even on a chemically meaningful track.
Andrew
da...@dalkescientific.com

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss