Hello, I tried out RGroupDecompose on a set of indazoles, using "c1ccc2[nH]ncc2c1" as the core molecule. Most of them gave a valid core SMILES:

n1c([*:2])c2c([*:1])c([*:7])c([*:6])c([*:5])c2n1[*:4]

However, some gave this core SMILES:

[nH]1c2c([*:5])c([*:6])c([*:7])c([*:1])c2c([*:2])n1[*:3]

which RDKit itself then refuses to convert to a molecule, although other software, such as Dotmatics Vortex, does appear to accept it.

Any idea what may be going wrong?

I noticed that the tautomeric form of the indazole ring is different in the molecules that gave rise to the 'wrong' core: in those, the H (or other substituent) is on the nitrogen atom that is not attached to the benzene ring.

[In fact, that also raises the question of why a tautomer of the original core was matched by RGroupDecompose at all, and how one could instead force matching of the chosen tautomer only.]

Thanks,
Giovanni Tricarico
Principal Scientist Computational Chemistry
Galapagos
Generaal De Wittelaan L11 A3, 2800 Mechelen, Belgium
T: +32 15 6514 30
www.glpg.com
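
For reference, a minimal sketch that reproduces the two observations above at the level of SMILES parsing and plain substructure matching, without calling RGroupDecompose itself. The two N-methylindazoles and the stricter SMARTS core are illustrative assumptions, not molecules or queries from the original data set, and whether RGroupDecompose treats such a SMARTS core in the same way has not been checked here.

```python
from rdkit import Chem, RDLogger

RDLogger.DisableLog("rdApp.*")  # silence the expected kekulization error message

# The two core SMILES quoted in the message.
good_core = "n1c([*:2])c2c([*:1])c([*:7])c([*:6])c([*:5])c2n1[*:4]"
bad_core = "[nH]1c2c([*:5])c([*:6])c([*:7])c([*:1])c2c([*:2])n1[*:3]"

good_mol = Chem.MolFromSmiles(good_core)
bad_mol = Chem.MolFromSmiles(bad_core)  # sanitization fails, so this is None

def three_connected_nitrogens(smiles):
    """Count N atoms with three connections (heavy neighbours + explicit H)."""
    mol = Chem.MolFromSmiles(smiles, sanitize=False)  # parse only, no kekulization
    return sum(
        1
        for atom in mol.GetAtoms()
        if atom.GetAtomicNum() == 7
        and atom.GetDegree() + atom.GetNumExplicitHs() == 3
    )

print("good core parses:", good_mol is not None)
print("bad core parses:", bad_mol is not None)
print("3-connected N in good core:", three_connected_nitrogens(good_core))
print("3-connected N in bad core:", three_connected_nitrogens(bad_core))

# Hypothetical test molecules: 1-methyl-1H-indazole and 2-methyl-2H-indazole.
mol_n1 = Chem.MolFromSmiles("Cn1ncc2ccccc21")
mol_n2 = Chem.MolFromSmiles("Cn1cc2ccccc2n1")

# A core built from SMILES does not compare hydrogen counts when matching.
plain_core = Chem.MolFromSmiles("c1ccc2[nH]ncc2c1")

# Assumed stricter query: the N bonded to the benzene ring must have three
# connections (X3) and the other N only two (X2).
strict_core = Chem.MolFromSmarts("c1ccc2[nX3][nX2]cc2c1")

for name, mol in (("N1-methyl", mol_n1), ("N2-methyl", mol_n2)):
    print(
        name,
        "plain core match:", mol.HasSubstructMatch(plain_core),
        "strict core match:", mol.HasSubstructMatch(strict_core),
    )
```

In the rejected core both ring nitrogens carry three connections (an H on one, an R group on the other), which is consistent with the ring system not being kekulizable. The plain SMILES core matches both N-methyl isomers because hydrogen counts are not part of the match, whereas the connectivity-constrained SMARTS in this sketch matches only the N1-substituted one.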
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss