Craig A. James wrote:
> I've run into a problem: OpenBabel adds a hydrogen to any potentially-chiral
> carbon that doesn't have four bonds, for example, when it parses "ClC(Br)I"
> (but not "ClC(Cl)Br" since it can't be chiral). I'm pretty sure it's the new
> stereo code that's doing this, but I haven't dug into it yet.
>
> In our web interface, we allow the chemist to draw explicit "blocking"
> hydrogens using Rich's ChemWriter program, and when we parse these molecules
> and generate a SMILES, the SMILES writer's "keep hydrogens" option writes the
> hydrogens explicitly. Any hydrogen in your drawing will also be in the
> SMILES, so that when it's used as a SMARTS, the hydrogen blocks substituents
> at that position.
>
> Unfortunately, when OpenBabel parses "ClC(Br)I", the "keep hydrogens" option
> writes out "Cl[CH](Br)I" because OpenBabel has added a H to the carbon,
> resulting in a SMARTS that doesn't match what you intended.
>
> Assuming my analysis is correct, is there some way to mark or otherwise find
> these added hydrogens? We don't have to remove them, I just need to know
> which ones were in the user's original drawing (SD file).

Writing out the explicit hydrogens like "ClC([H])I" would give the intended
meaning when used as SMARTS with the blocking hydrogens: it says there is at
least one H on the carbon (there could be two in this case). "Cl[CH]I" means
there is exactly one hydrogen. The SMILES "ClCI" is matched by the first
SMARTS but not by the second. I was about to make a Feature Request for the
explicit H option to write hydrogens out in the first form.

> This is actually an example of a deeper question: What is the meaning of an
> SD file, or any connection-table file? We often say, "It represents a
> molecule." But in some cases, the correct answer is, "It represents what the
> user drew," which is quite a different thing.

A feature that might help here is the SetIsPatternStructure() hack, which
stops implicit Hs or radical centres being added to a non-molecule.

    // conv is an OBConversion; string holds the text to parse
    OBMol obFrag;
    obFrag.SetIsPatternStructure();   // must be set before reading
    conv.ReadString(&obFrag, string);

Incidentally, the line in the smilesformat Description should be changed to
something like:

    " h Output explicit hydrogens as such\n"

Chris
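The difference between the two SMARTS forms discussed above can be sketched
as a toy model. This is only an illustration of the "at least one H" versus
"exactly one H" reading described in this message; HRule and satisfies() are
hypothetical names, not part of OpenBabel, and no real SMARTS matching is
performed.

```cpp
#include <cassert>

// Hypothetical toy model (not OpenBabel API): the hydrogen requirement that
// each SMARTS form places on the carbon atom it matches.
enum class HRule {
    AtLeastOne,  // "ClC([H])I": a blocking H is present, more are allowed
    ExactlyOne   // "Cl[CH]I":   the carbon carries one H and no more
};

// Returns true if a carbon carrying `hydrogens` H atoms satisfies the rule.
bool satisfies(HRule rule, int hydrogens) {
    assert(hydrogens >= 0);  // a negative hydrogen count is meaningless
    switch (rule) {
        case HRule::AtLeastOne: return hydrogens >= 1;
        case HRule::ExactlyOne: return hydrogens == 1;
    }
    return false;  // unreachable for valid enum values
}
```

Under this model the carbon of "ClCI", which carries two hydrogens, satisfies
HRule::AtLeastOne but not HRule::ExactlyOne, which is the mismatch described
above.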