Thanks for your helpful answer. I learned a lot. I have a few more questions:

1. How do you generate a non-standard InChI? Is it available in RDKit?

2. What are the 15T and KET options?

3. Can your solution be applied systematically? As a systematic solution I tried:

    from rdkit import Chem
    from rdkit.Chem.MolStandardize import rdMolStandardize

    enumerator = rdMolStandardize.TautomerEnumerator()
    for smi in my_smi_list:
        m = Chem.MolFromSmiles(smi)
        m = enumerator.Canonicalize(m)
        inchi = Chem.rdinchi.MolToInchi(m)

   The problem with this solution is that for very big molecules (for example, macrocycles) I get a 'MemoryError'.

4. In another case (not tautomers), I can't tell whether the InChI output is correct:

    C[N+]1=C(\C=C\C2=CNC=C2)C=CC2=CC=CC=C12
    C[N+]1=C(\C=C/C2=CNC=C2)C=CC2=CC=CC=C12

   Usually, when I enter two E/Z stereoisomers, I get two different InChIs (and the difference is in the /b or /t layers, as it should be). However, this time (both in RDKit and OpenBabel) I get:

    InChI=1S/C16H14N2/c1-18-15(8-6-13-10-11-17-12-13)9-7-14-4-2-3-5-16(14)18/h2-12H,1H3/p+1
    InChI=1S/C16H14N2/c1-18-15(8-6-13-10-11-17-12-13)9-7-14-4-2-3-5-16(14)18/h2-12H,1H3/p+1

   Only if I remove the charge (hydrogen instead of carbon on the methylquinoline) or modify the pyrrole group on the other side do I get different InChIs. Why?

Thanks a lot,
Benny

From: Markus Sitzmann [mailto:markus.sitzm...@gmail.com]
Sent: Tuesday, July 21, 2020 2:47 PM
To: Da'Adoosh Binyamin <daado...@tauex.tau.ac.il>
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] RDKit/tautomers

Hi Benny,

that is a pure InChI problem (not an RDKit one). Back when the Standard InChI was defined, the 15T and KET options for the InChI calculation were either not yet available or still experimental (I don't remember :-)), so they didn't make it into the standard set of options for the Standard InChI calculation. Hence it isn't too surprising that this tautomer pair doesn't yield the same Standard InChI (InChI isn't/wasn't particularly strong regarding tautomerism outside rings).

You might use (non-standard) InChI and switch the 15T and KET options on; that should fix your particular case. In general there are still ongoing efforts to make InChI stronger regarding tautomerism:
https://pubmed.ncbi.nlm.nih.gov/32043883/

Markus

On Tue, Jul 21, 2020 at 12:11 PM Da'Adoosh Binyamin <daado...@tauex.tau.ac.il> wrote:

Hi,

I have a question about RDKit/tautomers. Let's say I have the following SMILES input:

    C[CH]2CCC(=O)C1=C(O)[CH](O)C[CH](O)[CH]12
    C[CH]2CCC(O)=C1C(=O)[CH](O)C[CH](O)[CH]12

Now, if I run this code for each input:

    m = Chem.MolFromSmiles(input)
    inchi = Chem.rdinchi.MolToInchi(m)

I get different InChIs:

    InChI=1S/C11H16O4/c1-5-2-3-6(12)10-9(5)7(13)4-8(14)11(10)15/h5,7-9,13-15H,2-4H2,1H3
    InChI=1S/C11H16O4/c1-5-2-3-6(12)10-9(5)7(13)4-8(14)11(10)15/h5,7-9,12-14H,2-4H2,1H3

My question is why this happens. Usually if I enter two tautomers, they have the same InChI (as they are supposed to, according to the literature). What is different in this example?

Thanks,
Benny

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
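
To see exactly where two of the InChI strings in this thread diverge, it can help to split them into layers and compare them layer by layer. The following is a minimal sketch in plain Python (no RDKit needed). The helper names `split_inchi_layers` and `differing_layers` are made up for illustration and are not part of RDKit or the InChI software, and the parsing is simplified: it assumes each layer prefix occurs only once, which holds for the strings quoted here but not for InChIs with fixed-H or isotopic sublayers.

```python
def split_inchi_layers(inchi):
    """Split an InChI string into a {layer_prefix: content} dict.

    Simplified sketch: assumes every layer prefix appears at most once.
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string: %r" % inchi)
    parts = inchi[len(prefix):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:  # skip empty segments
            layers[part[0]] = part[1:]
    return layers


def differing_layers(inchi_a, inchi_b):
    """Return the sorted layer prefixes whose content differs."""
    a = split_inchi_layers(inchi_a)
    b = split_inchi_layers(inchi_b)
    return sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))


# Tautomer pair from the original message
taut_a = ("InChI=1S/C11H16O4/c1-5-2-3-6(12)10-9(5)7(13)4-8(14)11(10)15"
          "/h5,7-9,13-15H,2-4H2,1H3")
taut_b = ("InChI=1S/C11H16O4/c1-5-2-3-6(12)10-9(5)7(13)4-8(14)11(10)15"
          "/h5,7-9,12-14H,2-4H2,1H3")
print(differing_layers(taut_a, taut_b))

# InChI reported for both E/Z inputs in question 4
ez = ("InChI=1S/C16H14N2/c1-18-15(8-6-13-10-11-17-12-13)9-7-14-4-2-3-5-16(14)18"
      "/h2-12H,1H3/p+1")
print("b" in split_inchi_layers(ez))
```

For the tautomer pair this reports that only the /h (hydrogen) layer differs, and for the string from question 4 it confirms that no /b (double-bond stereo) layer was emitted at all.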