Hi Janusz, I'm not 100% sure what you're looking for, but I think it has something to do with including information about bond conjugation in the MCS procedure.
To confirm, can you please give a couple of examples of what you would like to have as output from the algorithm? Something like this, with the input molecules on the left and the desired result on the right, would help:

['CNC=CC', 'C=CNC=CC'] -> 'CNC=CC'

(I realize that specific example is not what you're looking for; it's just intended to be an example.)

Once I've seen that I can try to figure out if it is currently doable and, if not, if it's possible to modify the code to support it.

Best,
-greg

On Fri, Nov 13, 2015 at 9:17 PM, Janusz Petkowski <[email protected]> wrote:
> Dear RDKit Community,
>
> I am looking for a way to use the MCS module in RDKit to compare the atoms
> and bonding of two molecules while also taking the hybridization of each
> atom into consideration.
>
> A solution to a similar problem was suggested before (inspired by this
> RDKit-discuss thread started by Liz Wylie:
> http://www.mail-archive.com/[email protected]/msg03676.html
> and see here http://sourceforge.net/p/rdkit/mailman/message/31830412/ ),
> but even if it is computationally correct it does not necessarily mirror
> some nuances of chemistry, and one may want to modify it in certain
> specific cases. It works most of the time for cases like those proposed in
> the solution to the Liz Wylie case:
>
> smis = ['CC(C)=C','CC(C)C']
>
> or
>
> smis2 = ['CC(C)=C','CC(C)=N']
>
> If we check whether the 'CCC' substructure is present in the molecules from
> those two data sets, then with Greg Landrum's solution CCC will be found
> only in 'CC(C)C', taking into account the atoms, the bonding and the
> hybridization of the atoms. It is all correct and cool!
>
> But let's look at another example:
>
> Let's look for the N\CC\N substructure in 'C\C=C\NCCN\C=C\C', or the 'NCN'
> substructure in NCN-C=C or 'C=CNCNC=C'. It will not be found there even if
> "structurally speaking" it is there.
>
> The problem is as follows: an electronegative atom next to a C=C bond
> will pull electron density from that bond, and so the N-C bond in NCN-C=C
> will have a 'bit of' double-bond character, even if technically it is a
> single bond. The current solution to the Liz Wylie problem does not ignore
> that and distinguishes between a regular N-C bond and an N-C bond next to a
> C=C bond (as in NCN-C=C; because of that it will not find NCN in this
> structure). NCS in NCSC=C is matched because S is more electropositive
> than N or O and so the bond does not have that double-bond character.
>
> My question to the RDKit community is: how can Greg Landrum's solution to
> the Liz Wylie case be modified to successfully match the cases I mentioned
> above, while still retaining the hybridization check? (We do want to have
> the hybridization match, we just want the bonding to be more important.)
> The problem is that the atoms that are not matched, like the N atoms
> above, have sp2 hybridization but technically are bonded by single bonds
> on all sides.
>
> Thanks a lot for your help, time and consideration. This is my first post
> on the RDKit forum, and I am new to RDKit and Python in general, so I
> apologize if anything is not clear.
>
> I would really appreciate your help!
>
> Best regards,
>
> Janusz Petkowski
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
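For anyone following along, here is a rough sketch of one common way to make an MCS search hybridization-aware: store a code for element plus hybridization in each atom's isotope field and ask FindMCS to compare isotopes. This is only an illustration, not necessarily the exact code from the earlier thread. The helper names (label_by_hybridization, find_mcs) and the 100 * Z + hybridization encoding are assumptions made up for this example; it uses the rdFMCS module rather than the older pure-Python MCS module.

```python
from rdkit import Chem
from rdkit.Chem import rdFMCS


def label_by_hybridization(mol):
    """Return a copy of `mol` with element + hybridization stored as the isotope.

    The encoding (100 * atomic number + hybridization enum value) is an
    arbitrary, illustrative choice; any injective mapping would do.
    """
    labeled = Chem.Mol(mol)  # copy so the input molecule is left unchanged
    for atom in labeled.GetAtoms():
        code = 100 * atom.GetAtomicNum() + int(atom.GetHybridization())
        atom.SetIsotope(code)
    return labeled


def find_mcs(smiles_list, use_hybridization=True):
    """Run FindMCS on the SMILES, optionally comparing atoms by the labels above."""
    mols = [Chem.MolFromSmiles(smi) for smi in smiles_list]
    if use_hybridization:
        mols = [label_by_hybridization(m) for m in mols]
        atom_compare = rdFMCS.AtomCompare.CompareIsotopes
    else:
        atom_compare = rdFMCS.AtomCompare.CompareElements
    return rdFMCS.FindMCS(
        mols,
        atomCompare=atom_compare,
        bondCompare=rdFMCS.BondCompare.CompareOrder,
    )


if __name__ == "__main__":
    # Compare plain and hybridization-aware results for two pairs from the thread.
    for smis in (['CC(C)=C', 'CC(C)C'], ['NCN', 'C=CNCNC=C']):
        plain = find_mcs(smis, use_hybridization=False)
        labeled = find_mcs(smis)
        print(smis, plain.smartsString, labeled.smartsString)
```

With this kind of labelling an atom only matches another atom of the same element and the same hybridization, which is what makes the check strict. As described in the question, a nitrogen next to C=C is reported as sp2, so it gets a different label from an ordinary amine nitrogen. Relaxing the match for such atoms would mean changing how the label is computed, and that choice depends on the desired outputs.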

