Hi Rocco et al.,
I too found this a very clear explanation of the different classes of
hydrogen, so many thanks for taking the time. Where would a chiral H fit in?
The sort of H you get from Cl[C@H](F)Br? That one needs to stay even if you
collapse all explicit H atoms to implicit.
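
To make the question concrete, here is a minimal sketch of how one might
check what happens to that H. It assumes a reasonably recent RDKit with the
standard Python wrappers; the variable names are just illustrative, and the
exact behaviour may differ between releases, so treat it as something to try
rather than a statement of how RDKit is specified to behave.

```python
from rdkit import Chem

# A bracket atom in SMILES carries its H count as an annotation on the
# heavy atom, so the H is not a separate atom in the graph after parsing.
mol = Chem.MolFromSmiles("Cl[C@H](F)Br")
carbon = mol.GetAtomWithIdx(1)  # atom order follows the SMILES: Cl, C, F, Br

num_atoms_parsed = mol.GetNumAtoms()
explicit_hs_parsed = carbon.GetNumExplicitHs()
total_hs_parsed = carbon.GetTotalNumHs()
is_chiral = carbon.GetChiralTag() != Chem.ChiralType.CHI_UNSPECIFIED

# Promote the H to a real atom in the graph, then strip it again.
with_hs = Chem.AddHs(mol)
num_atoms_with_hs = with_hs.GetNumAtoms()
round_trip = Chem.RemoveHs(with_hs)

# Compare the canonical SMILES before and after the round trip.
smiles_before = Chem.MolToSmiles(mol)
smiles_after = Chem.MolToSmiles(round_trip)
print(smiles_before, smiles_after)
```

If the stereocentre survives the round trip, the two SMILES printed at the
end should match and should still contain the "@" marker.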

On the subject of the documentation, I would encourage you to find the
GettingStartedWithRDKit.rst in the Docs directory, find somewhere where
this discussion fits, add it, and send the new version to Greg. If everyone
did this every time they spent time working out how to do something, the
documentation would grow very rapidly and by definition grow fastest in
areas that people are actively using. We don't need to wait for Greg to do
it all! He's busy enough as it is and, let's face it, writing docs is dull,
so I'm sure he would appreciate the help.

Cheers,
Dave


On Friday, 9 September 2016, Greg Landrum <greg.land...@gmail.com> wrote:

> Thanks for this writeup, Rocco. You're right that there isn't an
> easy-to-find, easy-to-understand collection of this information. That's one
> of those gaps in the documentation that I should eventually address. This is
> already a pretty good start, though.
>
> -greg
>
>
> On Thu, Sep 8, 2016 at 9:37 PM, Rocco Moretti <rmoretti...@gmail.com> wrote:
>
>> Greg can correct me if I'm wrong(1), but in RDKit there are actually three
>> "levels" of hydrogens:
>>
>> * "Physical" hydrogens, which are represented as actual, independent
>> atoms in the atom graph. ("Physical hydrogens" is what I'm calling them - I
>> don't know if RDKit has an official term for them.)
>>
>> * "Explicit" hydrogens, which are represented as a numeric annotation on
>> their attached heavy atom. (And *not* as a separate atom object.)
>>
>> * "Implicit" hydrogens, which aren't actually represented anywhere, but
>> are calculated from the standard valence of the heavy atom and how much of
>> that valence is already occupied by actual atoms and explicit hydrogens.
>>
>> Generally, except for some coordinate calculations, RDKit seems to be
>> built around working with molecules with explicit or implicit hydrogens.
>> This is why when you read in a molecule, RDKit normally removes any
>> physical hydrogens. (Note that for most file reading code there's a
>> removeHs parameter you can set to False to change this behavior, and read
>> explicitly listed hydrogens as physical hydrogens.)
>>
>> By default "removing hydrogens" means turning them into implicit
>> hydrogens(2), but the RemoveHs() function has an "updateExplicitCount"
>> parameter which will cause the removed hydrogens to be turned into explicit
>> hydrogens instead. The standard MOL file loading code doesn't use this
>> option, though, so the hydrogens in the molecule are usually converted into
>> implicit hydrogens when you read things in.
>>
>> AddHs(), of course, turns explicit and implicit hydrogens into physical
>> hydrogens. (Though the "explicitOnly" parameter can be used to control
>> this.) It does annotate whether each of these physical hydrogens came from
>> the implicit or the explicit pool, so you can round trip things through
>> AddHs() and RemoveHs() appropriately. (There's also an "implicitOnly"
>> parameter on RemoveHs() which will only remove those hydrogens.)
>>
>> Regards,
>> -Rocco
>>
>> (1) I don't think the RDKit hydrogen model has ever been formalized in
>> one place for user-facing documentation, so this is the understanding I've
>> gotten from banging my head against various hydrogen-related issues.
>>
>> (2) There are special complications here: certain structures, such as
>> imidazole, need a physical or explicit hydrogen on one of the nitrogens in
>> order to Kekulize properly. If you're implicit only, the RDKit sanitizer
>> will choke. Thus, there's special casing in the various Add/RemoveHs
>> functions to avoid implicit-izing these critical hydrogens.
>>
>> On Thu, Sep 8, 2016 at 1:46 PM, Dimitri Maziuk <dmaz...@bmrb.wisc.edu> wrote:
>>
>>> On 09/08/2016 10:25 AM, Greg Landrum wrote:
>>> ...
>>> > Why do you want 2D drawings that include H atoms?
>>>
>>> On the subject of H atoms: when I read in the MOL file that has them, I
>>> need to explicitly call AddHs() in order to have them drawn.
>>>
>>> Question: do they actually get stripped off by the reader and re-added
>>> by AddHs()? Or are they there "hidden" somehow and AddHs() just
>>> "unhides" them?
>>>
>>> TIA
>>> --
>>> Dimitri Maziuk
>>> Programmer/sysadmin
>>> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>>>
>>>
>>> ------------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>
>>>
