Is it possible to search for a fragment that is not a valid structure
itself, but part of a structure?

Problem: "Given a structure and a decomposition of that structure,
highlight each part with a different color."
The decomposition is always given as one SMILES and n SMILES fragments.
The SMILES fragments mark each broken ("connection") bond with an asterisk.

For example:
mol: CCC=C
decomposition:  C*   CC    *=C

A human can tell at a glance which fragment is which, but how would you
approach it in code?

- I cannot match the SMARTS "C=": it is not valid SMARTS.
- I cannot match the fragments without the broken bonds: I would lose the
difference between C* and C=*.
- I cannot match the fragments as they are: the asterisk will match the
first atom of the neighbouring fragment. (Is there a way to find out which
atom of the pattern matched which atom of the molecule? In that case I
could remove the atom that matched the asterisk...)
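A minimal sketch of that last idea, in case it helps the discussion. It
assumes the fragments can be parsed as SMARTS, that their real atoms are
plain element symbols, and that the asterisk parses to a query atom with
atomic number 0. It relies on GetSubstructMatches returning atom indices in
the order of the query atoms. The helper name candidate_atom_sets is made up
for this example; it is not an RDKit function.

```python
from rdkit import Chem


def candidate_atom_sets(mol, fragment):
    """Atom-index sets in `mol` that `fragment` could cover, ignoring '*' atoms."""
    query = Chem.MolFromSmarts(fragment)
    if query is None:
        raise ValueError(f"could not parse fragment: {fragment!r}")
    # Positions in the query that are real atoms (assumption: '*' has
    # atomic number 0, plain element symbols do not).
    real = [a.GetIdx() for a in query.GetAtoms() if a.GetAtomicNum() != 0]
    candidates = set()
    # uniquify=False keeps both orientations of a match such as C* on C-C;
    # they differ only in which atom the asterisk landed on.
    for match in mol.GetSubstructMatches(query, uniquify=False):
        # match[i] is the molecule atom matched by query atom i
        candidates.add(frozenset(match[i] for i in real))
    return candidates


mol = Chem.MolFromSmiles("CCC=C")
for fragment in ["C*", "CC", "*=C"]:
    sets = candidate_atom_sets(mol, fragment)
    print(fragment, sorted(sorted(s) for s in sets))
```

A single fragment can still have several candidate positions (here *=C
could sit on atom 2 or on atom 3), so this only narrows things down; it does
not decide which fragment goes where.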

Maybe there is an easy way to represent the pattern 'C=' in SMARTS, but
the Daylight manual is not clear about it. Or maybe I'm just too lazy to
get it...

In other words: is it possible to write n SMARTS that together match the
whole structure (all the atoms and all the bonds, with no overlaps and
no gaps)? If each SMARTS must be a complete structure (without
"unbonded" bonds), then that is actually not possible.
Thank you
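
P.S. To make the "no overlaps and no gaps" requirement concrete, here is a
rough sketch in plain Python. It assumes each fragment has already been
reduced to a list of candidate atom-index sets (how those are obtained is
left open here), and it only checks atoms, not bonds. The function name
assign_fragments and the candidate sets for CCC=C are written by hand for
illustration.

```python
def assign_fragments(n_atoms, candidates):
    """Pick one atom set per fragment so that every atom is used exactly once.

    candidates[i] holds the atom-index sets that fragment i might cover.
    Returns the chosen sets (one per fragment), or None if no choice works.
    """
    def search(i, used, chosen):
        if i == len(candidates):
            # No gaps: every atom of the molecule must have been assigned.
            return chosen if len(used) == n_atoms else None
        for atoms in candidates[i]:
            if used.isdisjoint(atoms):  # no overlap with earlier fragments
                result = search(i + 1, used | atoms, chosen + [atoms])
                if result is not None:
                    return result
        return None

    return search(0, frozenset(), [])


# Hand-written candidates for mol CCC=C (atoms numbered 0..3)
candidates = [
    [frozenset({0}), frozenset({1}), frozenset({2})],  # C*
    [frozenset({0, 1}), frozenset({1, 2})],            # CC
    [frozenset({2}), frozenset({3})],                  # *=C
]
result = assign_fragments(4, candidates)
print(result)
```

The search stops at the first assignment it finds. If a decomposition admits
more than one, extra checks on the bonds next to the asterisks would be
needed to choose between them.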
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
