Thanks Tricarico - I was afraid this might be the answer, but thanks for your suggestion. I'm not entirely sure I understand why adding an enhanced stereo collection that reflects the status of the chiral flag is a problem when going from V2000 to V3000; it would be good to see some examples. I know the chiral flag is a nightmare when reading V3000, but when reading V2000, if it's not set correctly then the file is already broken, and setting the enhanced collection doesn't make it more broken. It would be nice if creating an enhanced collection from the chiral flag (when reading V2000 only) was available as an option.
Cheers
Nick

From: Giovanni Tricarico <giovanni.tricar...@glpg.com>
Sent: Wednesday, January 31, 2024 9:55 AM
To: Tomkinson, Nicholas <nick.tomkin...@astrazeneca.com>; rdkit-discuss@lists.sourceforge.net
Subject: RE: V2000 to V3000 enhanced stereo question

Hello Nick,

We faced a (seemingly) related problem a while ago. In our case we were trying to convert V2000 CTABs to CXSMILES, and we were expecting that the V2000 chirality flag would translate to an enhanced stereo string in the CXSMILES. That is not so, by design.

See my question, and the answer it got, here:
V2000 chiral flag does not seem to be read by Chem.MolFromMolBlock() · Issue #6062 · rdkit/rdkit · GitHub<https://github.com/rdkit/rdkit/issues/6062>

I imagine that the reason why the V2000 to V3000 conversion does not use the V2000 chirality flag is conceptually the same, but indeed worth checking.

FYI, the practical solution for our workflow was:

* create a function 'chiral_flag_from_molblock' that detects if a CTAB is V2000 or V3000; if V2000, reads the flag (by simple text parsing) and returns it (0 or 1); if V3000, returns -1
* create a function 'CTAB_to_CXSMILES' that calls the above:
  - for V3000, the rdkit-generated CXSMILES is (or usually is) already correct;
  - for V2000, if the flag is 1, the SMILES is identical to the CXSMILES;
  - for V2000, if the flag is 0, the function loops through all atoms, identifies those that have tetrahedral stereochemistry, and uses their indices to put together an '&1' enhanced stereo group string, which is then appended to the SMILES (as a V2000 CTAB with chirality flag 0 can only represent a racemic mixture where all configurations are inverted together, so it only needs one '&' group - of course with all the exceptions and issues you can imagine: meso stereoisomers or moieties, etc.)

Probably not ideal, but lacking any suggestion or a better 'native' solution, that's what we went for, and it seems to have worked so far.
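The text-parsing step described above could be sketched roughly as follows. This is a minimal illustration and not the code used in the workflow: it assumes the standard fixed-width V2000 counts line (fourth line of the block, chiral flag in columns 13-15), and the helper 'and_group_extension' and the example header are hypothetical names introduced here. The atom indices passed to 'and_group_extension' are assumed to already follow the atom order of the SMILES string, which is not necessarily the atom order of the CTAB.

```python
def chiral_flag_from_molblock(molblock: str) -> int:
    """Return the V2000 chiral flag (0 or 1), or -1 for a V3000 block.

    Assumes the counts line is the fourth line of the mol block and that,
    for V2000, it uses the fixed-width layout aaabbblllfffccc... where
    'ccc' (columns 13-15) holds the chiral flag.
    """
    lines = molblock.splitlines()
    if len(lines) < 4:
        raise ValueError("mol block is too short to contain a counts line")
    counts_line = lines[3]
    if "V3000" in counts_line:
        return -1
    if "V2000" not in counts_line:
        raise ValueError("no V2000/V3000 version tag on the counts line")
    field = counts_line[12:15].strip()
    # A blank field is treated as 'flag not set' (assumption).
    return int(field) if field else 0


def and_group_extension(stereo_atom_indices) -> str:
    """Build a single '&1' CXSMILES extension from 0-based atom indices.

    Hypothetical helper: the indices must refer to the atom order in the
    SMILES string that the extension is appended to.
    """
    indices = sorted(set(stereo_atom_indices))
    if not indices:
        return ""
    return "|&1:" + ",".join(str(i) for i in indices) + "|"


# Illustrative header only (no atom or bond block), chiral flag set to 1.
EXAMPLE_HEADER = "\n".join([
    "example",                                  # line 1: molecule name
    "  hypothetical-program",                   # line 2: program / timestamp
    "",                                         # line 3: comment
    "  5  4  0  0  1  0  0  0  0  0999 V2000",  # line 4: counts line
])

flag = chiral_flag_from_molblock(EXAMPLE_HEADER)
extension = and_group_extension([3, 1])
```

With a flag of 0, the extension would be appended to the SMILES after a space; with a flag of 1 or a V3000 input, nothing is appended. Meso compounds and the other exceptions mentioned above are not handled by this sketch.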
[I'll mention for completeness that we also run a further standardisation function on CXSMILES, which takes care of removing the enhanced stereo flags from meso moieties].

I hope this helps.

Regards

Giovanni Tricarico
Principal Scientist Chemoinformatics
+32 15 6514 30
giovanni.tricar...@glpg.com
Galapagos NV
Generaal De Wittelaan L11 A3
2800 Mechelen, Belgium
<https://www.glpg.com/>

From: Tomkinson, Nicholas <nick.tomkin...@astrazeneca.com>
Sent: Tuesday, January 30, 2024 5:28 PM
To: rdkit-discuss@lists.sourceforge.net
Subject: [Rdkit-discuss] V2000 to V3000 enhanced stereo question

I am trying to convert a simple V2000 molfile, with or without the chiral flag, into a V3000 molfile, but this does not create an enhanced stereo collection in the V3000 molfile. This is a requirement for another application that does not handle V2000/V3000 mixtures well. Is there any way of forcing the writing of the enhanced collection in this context?

Thanks
Nick

________________________________

AstraZeneca UK Limited is a company incorporated in England and Wales with registered number:03674842 and its registered office at 1 Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge, CB2 0AA. This e-mail and its attachments are intended for the above named recipient only and may contain confidential and privileged information.
If they have come to you in error, you must not copy or show them to anyone; instead, please reply to this e-mail, highlighting the error to the sender and then immediately delete the message. For information about how AstraZeneca UK Limited and its affiliates may process information, personal data and monitor communications, please see our privacy notice at www.astrazeneca.com<https://www.astrazeneca.com/>

This e-mail and its attachment(s) (if any) may contain confidential and/or proprietary information and is intended for its addressee(s) only. Any unauthorized use of the information contained herein (including, but not limited to, alteration, reproduction, communication, distribution or any other form of dissemination) is strictly prohibited. If you are not the intended addressee, please notify the originator promptly and delete this e-mail and its attachment(s) (if any) subsequently. Neither Galapagos nor any of its affiliates shall be liable for direct, special, indirect or consequential damages arising from alteration of the contents of this message (by a third party) or as a result of a virus being passed on.
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss