Hi Andrew,

You could replace the aromatic atoms with non-organic atoms (e.g. aromatic 
carbon with calcium [Ca], aromatic nitrogen with sodium [Na], and so on, with 
single bonds between them), and then use the atom-deleting procedure as normal. 
The structures should still be valid, so you should be able to use the usual 
RDKit functions on them.

Thanks
Jameed


________________________________
From: Andrew Dalke <da...@dalkescientific.com>
Sent: 20 August 2019 21:06
To: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Subject: [Rdkit-discuss] aromatic bonds and graph edit distance

Hi all,

  Someone asked me recently about finding the graph edit distance of two small 
(<= 14 atom) fragments.

I figured this was something that could be brute forced. Following SmallWorld's 
example at https://cisrg.shef.ac.uk/shef2016/talks/oral13.pdf , given a 
fragment, incrementally delete terminals (except the "*" connection point 
atom), and ring bonds.

For chain bonds, and non-aromatic bonds, it's easy to delete the bond and add 
the correct number of hydrogens to either side.

But, what should I do when I cut an aromatic bond?

For something like the first "co" in "c1cocn1", I want the result to be 
C=CN=CO. That's because the "o" can only be "-O-" in Kekule form.

For something like "c1cnncn1", breaking on the "nn", I think I would like to 
get both 'N=CC=NC=N' and 'NC=CN=CN' because the "nn" can be a single or a 
double bond, depending on the Kekule representation, as in:

>>> Chem.CanonSmiles("C-1=N-N=C-C=N-1")
'c1cnncn1'
>>> Chem.CanonSmiles("C-1=N.N=C-C=N-1")
'N=CC=NC=N'

>>> Chem.CanonSmiles("C=1-N=N-C=C-N=1")
'c1cnncn1'
>>> Chem.CanonSmiles("C=1-N-[HH].[HH]N-C=C-N=1")
'NC=CN=CN'

Problem is, I don't know how to figure out if a given aromatic bond must be a 
"-" or "=", or can be both.

(Well, I could brute-force enumerate all 2**n possible aromatic bond 
assignments, then canonicalize, and see if both assignments are possible for a 
given bond.)
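
A minimal sketch of that brute-force idea, in plain Python rather than RDKit. It tries every single/double assignment over the aromatic bonds and keeps those where each atom ends up with the number of double bonds it needs. The function name and the `double_bonds_needed` input are assumptions made for illustration: the caller has to supply, per aromatic atom, whether it takes one double bond (like "c" or a pyridine-type "n") or none (like "o"). Canonicalization is replaced here by that simple valence check.

```python
from itertools import product

def kekule_bond_orders(double_bonds_needed, bonds):
    """Return, for each aromatic bond, the set of orders (1 and/or 2)
    it takes across all valid Kekule assignments.

    double_bonds_needed[i] is how many double bonds atom i must have
    (hypothetical input: 1 for "c" or pyridine-type "n", 0 for "o").
    bonds is a list of (atom_index, atom_index) pairs.
    """
    required = list(double_bonds_needed)
    possible = [set() for _ in bonds]
    # Brute force: 2**n candidate assignments for n aromatic bonds.
    for orders in product((1, 2), repeat=len(bonds)):
        counts = [0] * len(required)
        for (a, b), order in zip(bonds, orders):
            if order == 2:
                counts[a] += 1
                counts[b] += 1
        # Keep the assignment only if every atom is satisfied.
        if counts == required:
            for i, order in enumerate(orders):
                possible[i].add(order)
    return possible

# Oxazole "c1cocn1": atoms c0, c1, o2, c3, n4 in ring order.
oxazole = kekule_bond_orders(
    [1, 1, 0, 1, 1],
    [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)],
)

# 1,2,4-triazine "c1cnncn1": atoms c0, c1, n2, n3, c4, n5 in ring order.
triazine = kekule_bond_orders(
    [1] * 6,
    [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)],
)
```

A bond whose set is `{1}` must be "-", `{2}` must be "=", and `{1, 2}` can be either. In this sketch the first "co" bond of oxazole, index 1, comes out as single only, while every ring bond of the triazine, including the "nn" bond, comes out as either. With at most 14 atoms the 2**n loop stays small.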

As a non-chemist, I also ask if I'm even on a chemically meaningful track.


                                Andrew
                                da...@dalkescientific.com




_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
