Dear Kovas,
It looks like GetSubstructMatch() only finds a match if the dummy atom
is in the query, not if it is in the molecule that you are matching the
query against.
This notebook presents a possible solution off the top of my head:
https://gist.github.com/ptosco/a35ac28a14103b47096f6d6af1aec831
which does not involve changes to the C++ layer, although it is
computationally more expensive and will fail with disconnected fragments
as it uses FindMCS(). There may be better solutions; this is what I
came up with last night in the little time I had available.
Cheers,
P.
On 08/22/18 19:34, Kovas Palunas wrote:
Hi All,
I’m interested in having GetSubstructMatches return non-“null” results
in the following example. The results should lead to a match where
atom 1 maps to atom 11, 2 to 12, etc.
from rdkit import Chem
m1 = Chem.MolFromSmiles('[*:1][CH2:2][C:3]([CH3:4])=[CH2:5]')
m2 = Chem.MolFromSmiles('[F:11][CH2:12][C:13]([*:14])=[CH2:15]')
### do something here so that the mols will match ###
qp = Chem.AdjustQueryParameters()
qp.makeDummiesQueries = True
m1 = Chem.AdjustQueryProperties(m1, qp)
m2 = Chem.AdjustQueryProperties(m2, qp)
# I’d like both of the following to return results
m1.GetSubstructMatches(m2)
m2.GetSubstructMatches(m1)
My understanding of why these mols currently do not match is as
follows: because only the dummy atoms are made queries (based on my
query parameter adjustment), when one mol is matched to the other, dummy
1 may match F:11, but methyl:4 will then not match dummy 14.
This is because (as I understand it) normal atoms can only be matched by
queries, and cannot themselves match queries.
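The asymmetry described above can be illustrated with a toy matcher in
plain Python. This is only a sketch and does not use RDKit: the atom and
bond tables, atoms_match(), find_mapping() and the symmetric flag are all
hypothetical names made up for this illustration, and hydrogens, charges
and aromaticity are ignored. With symmetric=False a dummy acts as a
wildcard only on the query side, so neither direction gives a match; with
symmetric=True a dummy on either side is a wildcard and the desired
mapping (1 to 11, 2 to 12, etc.) is found.

```python
from typing import Dict, List, Optional, Tuple

# Toy molecules (NOT RDKit objects): atom map number -> element symbol,
# where '*' is a dummy atom, plus bonds as (atom, atom, bond order).
Atoms = Dict[int, str]
Bonds = List[Tuple[int, int, int]]

M1_ATOMS: Atoms = {1: "*", 2: "C", 3: "C", 4: "C", 5: "C"}
M1_BONDS: Bonds = [(1, 2, 1), (2, 3, 1), (3, 4, 1), (3, 5, 2)]
M2_ATOMS: Atoms = {11: "F", 12: "C", 13: "C", 14: "*", 15: "C"}
M2_BONDS: Bonds = [(11, 12, 1), (12, 13, 1), (13, 14, 1), (13, 15, 2)]


def atoms_match(query: str, target: str, symmetric: bool) -> bool:
    """Query-side dummies match anything; target-side ones only if symmetric."""
    if query == "*" or query == target:
        return True
    return symmetric and target == "*"


def bond_table(bonds: Bonds) -> Dict[Tuple[int, int], int]:
    """Look up a bond order from either direction."""
    table = {}
    for a, b, order in bonds:
        table[(a, b)] = order
        table[(b, a)] = order
    return table


def find_mapping(q_atoms: Atoms, q_bonds: Bonds,
                 t_atoms: Atoms, t_bonds: Bonds,
                 symmetric: bool) -> Optional[Dict[int, int]]:
    """Backtracking search for a query -> target atom mapping, or None."""
    q_tab, t_tab = bond_table(q_bonds), bond_table(t_bonds)
    order = sorted(q_atoms)

    def extend(mapping: Dict[int, int]) -> Optional[Dict[int, int]]:
        if len(mapping) == len(order):
            return dict(mapping)
        q = order[len(mapping)]
        for t in sorted(t_atoms):
            if t in mapping.values():
                continue
            if not atoms_match(q_atoms[q], t_atoms[t], symmetric):
                continue
            # Each query bond to an already mapped atom must exist in the
            # target with the same bond order.
            bonds_ok = all(
                t_tab.get((t, mapping[other])) == q_tab[(q, other)]
                for other in mapping if (q, other) in q_tab
            )
            if bonds_ok:
                mapping[q] = t
                result = extend(mapping)
                if result is not None:
                    return result
                del mapping[q]
        return None

    return extend({})


# One-directional wildcards: no match in either direction.
print(find_mapping(M1_ATOMS, M1_BONDS, M2_ATOMS, M2_BONDS, symmetric=False))
print(find_mapping(M2_ATOMS, M2_BONDS, M1_ATOMS, M1_BONDS, symmetric=False))
# Wildcards honoured on both sides: the desired mapping is found.
print(find_mapping(M1_ATOMS, M1_BONDS, M2_ATOMS, M2_BONDS, symmetric=True))
```

The only difference between the two modes is the last line of
atoms_match(), which is roughly what any of the approaches listed below
would have to achieve inside the real atom-matching code.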
Potential ideas to make this work as I’d like:
1. Override atom.Match in the python code – not sure that this would
work since the C++ version of this function is what would be
called during GetSubstructMatches
2. Override atom.Match in the C++ code – not quite sure how to do
this, or what side effects it might have. Ideally the changes I
make would only affect this example (and other similar ones)
3. Make all atoms in both molecules QueryAtoms, but otherwise leave
them unchanged. I’m not quite sure how to do this!
Does anyone have any ideas for what the best approach here would be,
or knows if there is already built in functionality for something like
this? I’d prefer to not use SMARTS to construct my molecules if
possible, since I don’t really think of them as queries, just as other
molecules in the system that happen to not be fully specified.
- Kovas
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss