The reason is that it prevents ambiguities caused by nonstandard, higher valences. For S and P the standard valences are 2 and 3 respectively, just like for O and N. But S also has higher valences available: 4 and 6, as in sulfoxides and sulfones. P can commonly have a valence of 5, as in phosphoranes.

The S in your SMILES has a valence of at least 3, exceeding the standard valence of 2. This creates an ambiguity, where the SMILES parser has to decide whether the S has a valence of 4 or 6. Because of this, the hydrogen count cannot be inferred unambiguously, so it must be specified explicitly.

Likewise, a roundtrip through RDKit converts the SMILES "FP(F)(F)F" into "F[PH](F)(F)F". This notation is consistent with F[PH2](F)F and distinguishable from FP(F)F.

In general, when no higher valence state is possible, RDKit will throw a valence error, but there are more examples of this behaviour. For example, "CIC" becomes C[IH]C.
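To make the reasoning concrete, here is a small toy model in plain Python. It is only a sketch of the logic described above and not RDKit's actual implementation: the valence table and the function names (ALLOWED_VALENCES, implicit_h_count, atom_token) are made up for illustration. It picks the smallest allowed valence that fits the bonds drawn in the SMILES and writes the hydrogens in brackets whenever that valence is higher than the standard one.

```python
# Illustrative valence table (an assumption for this sketch, not RDKit's data).
# The first entry is the "standard" valence; later entries are higher states.
ALLOWED_VALENCES = {
    "O": [2],
    "N": [3],
    "S": [2, 4, 6],
    "P": [3, 5],
    "I": [1, 3, 5],
}


def implicit_h_count(element, bond_order_sum):
    """Return (valence, n_hydrogens) for the smallest allowed valence that fits."""
    for valence in ALLOWED_VALENCES[element]:
        if valence >= bond_order_sum:
            return valence, valence - bond_order_sum
    raise ValueError(f"{element} cannot reach a valence of {bond_order_sum}")


def atom_token(element, bond_order_sum):
    """Write the atom as it would have to appear in SMILES under this toy model."""
    valence, n_h = implicit_h_count(element, bond_order_sum)
    standard = ALLOWED_VALENCES[element][0]
    if valence == standard or n_h == 0:
        # Standard valence, or nothing left to fill: no brackets needed.
        return element
    # Higher valence with hydrogens: the count must be spelled out.
    h_part = "H" if n_h == 1 else f"H{n_h}"
    return f"[{element}{h_part}]"


# S in CCS=O: one single bond plus one double bond gives a bond order sum of 3.
print(atom_token("S", 3))
# P in FP(F)(F)F: four single bonds.
print(atom_token("P", 4))
# P in FP(F)F: three single bonds, standard valence, nothing to add.
print(atom_token("P", 3))
# I in CIC: two single bonds.
print(atom_token("I", 2))
```

In this model an O with three single bonds raises a ValueError, because the table lists no higher valence for it. That mirrors the case where RDKit reports a valence error instead of adding a hydrogen.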
best wishes
wim

On Sat, Apr 29, 2023 at 12:20 PM Thomas <odioidenti...@gmail.com> wrote:

> I am not a chemist, so it can be a silly question, but I am interested in
> the logic behind it, also because other libraries (like OpenBabel) behave
> differently.
>
> Why sometimes RDKit writes hydrogens explicitly?
>
> mol = rdkit.MolFromSmiles('CCS=O', sanitize=False)
> rdkit.MolToSmiles(mol)
> 'CC[SH]=O'
>
> The input SMILES is intended as a pattern, not a molecule. I make a mol
> out of it only to get the canonical SMILES, that will be then used as
> SMARTS.
> Logically, I don't understand how the number of H attached to the S can be
> "guessed" by the library, still it cannot be left implicit.
>
> Furthermore, I have seen this behaviour only with S and P. I was wondering
> if it's a confined issue, or it can happen with any element.
> Thank you
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss