Craig A. James wrote:
> I've run into a problem: OpenBabel adds a hydrogen to any potentially-chiral 
> carbon that doesn't have four bonds, for example, when it parses "ClC(Br)I" 
> (but not "ClC(Cl)Br" since it can't be chiral).  I'm pretty sure it's the new 
> stereo code that's doing this, but I haven't dug into it yet.
> 
> In our web interface, we allow the chemist to draw explicit "blocking" 
> hydrogens using Rich's ChemWriter program, and when we parse these molecules 
> and generate a SMILES, the SMILES writer's "keep hydrogens" option writes the 
> hydrogens explicitly.  Any hydrogen in your drawing will also be in the 
> SMILES, so that when it's used as a SMARTS, the hydrogen blocks substituents 
> at that position.
> 
> Unfortunately, when OpenBabel parses "ClC(Br)I", the "keep hydrogens" option 
> writes out "Cl[CH](Br)I" because OpenBabel has added a H to the carbon, 
> resulting in a SMARTS that doesn't match what you intended.
> 
> Assuming my analysis is correct, is there some way to mark or otherwise find 
> these added hydrogens?  We don't have to remove them, I just need to know 
> which ones were in the user's original drawing (SD file).

Writing out the explicit hydrogens as separate atoms, like "ClC([H])I", 
would give the intended meaning when used as SMARTS with the blocking 
hydrogens: there is at least one H on the carbon (there could be two in 
this case). "Cl[CH]I" means there is exactly one hydrogen. The SMILES 
"ClCI" is matched by the first SMARTS but not by the second. I was about 
to make a Feature Request for the explicit H option to write out in the 
first form.
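
To make the difference concrete, here is a minimal sketch of the two 
matching rules. It is a toy model only, not OpenBabel code: ToyCarbon and 
both function names are hypothetical, and the carbon is reduced to nothing 
but its total hydrogen count.

```cpp
#include <cassert>

// Hypothetical toy model (not an OpenBabel type): a carbon atom
// described only by the total number of hydrogens attached to it.
struct ToyCarbon {
    int hydrogens;
};

// "ClC([H])I" used as SMARTS: one neighbour must be a hydrogen and the
// total count is otherwise unconstrained, so at least one H is required.
bool matchesExplicitHAtom(const ToyCarbon& c) {
    return c.hydrogens >= 1;
}

// "Cl[CH]I" used as SMARTS: the hydrogen count must be exactly one.
bool matchesBracketH1(const ToyCarbon& c) {
    return c.hydrogens == 1;
}
```

With the CH2 carbon of "ClCI" (two hydrogens), matchesExplicitHAtom 
returns true and matchesBracketH1 returns false, which is the mismatch 
described above.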

> This is actually an example of a deeper question: What is the meaning of an 
> SD file, or any connection-table file?  We often say, "It represents a 
> molecule."  But in some cases, the correct answer is, "It represents what the 
> user drew," which is quite a different thing.

A feature that might help here is the SetIsPatternStructure() hack, 
which stops implicit Hs or radical centres being added to a non-molecule.

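OBMol obFrag;
obFrag.SetIsPatternStructure();    // treat as a pattern: no implicit Hs or radical centres added
conv.ReadString(&obFrag, smiles);  // conv: an OBConversion with its input format set; smiles: the input text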

Incidentally, the line in the smilesformat Description should be 
changed to something like:
"  h  Output explicit hydrogens as such\n"

Chris


_______________________________________________
OpenBabel-Devel mailing list
OpenBabel-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-devel