On Jan 26, 2011, at 11:54 AM, Chris Morley wrote:
> As may have been discussed here earlier, maybe this option should output
> explicit hydrogen as [H] rather than a hydrogen count on another atom.
> SMARTS [CH2] matches a C with exactly two Hs; SMARTS [H]C[H] will match a
> carbon with at least two Hs and is more versatile in substructure searches.
> Would there be any objections to me changing it in the development code?
None from me. Personally, if I have explicit hydrogens in the structure then I
want hydrogens in the SMILES output.
I will point out that Pascal asked for one of
> [H]OC([H])[H] or [OX2H1][CX4H2]
as output. Your fix gives him the first. My code does not give him the
second. For that he needs something more like:
Step 1, encode the connectivity into the isotope
>>> import pybel
>>> mol = pybel.readstring("smi", "OCN")
>>> for atom in mol.atoms:
...     atom.OBAtom.SetIsotope(100 + atom.implicitvalence)
...
>>> mol.write("can")
'[103NH2][104CH2][102OH]\t\n'
Step 2, remove the atom(s)
>>> mol.OBMol.DeleteAtom(mol.atoms[-1].OBAtom)
True
Step 3, create the SMARTS by a syntactic transformation of the SMILES
>>> import re
>>> re.sub(r"\[10(\d)", r"[X\1", mol.write("can"))
'[X4CH3][X2OH]\t\n'
If you (like me) think that SMARTS looks ugly, then put the X term last:
>>> re.sub(r"\[10(\d)([^]]+)\]", r"[\2X\1]", mol.write("can"))
'[CH3X4][OHX2]\t\n'
Not exactly matching Pascal's second option, but it's an equivalent SMARTS.
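
For anyone who wants Step 3 without the interactive session, here is a minimal
sketch of the same substitution as a standalone function. It needs only the
re module and works on a string, so it assumes you already have the
isotope-tagged SMILES from Steps 1 and 2. The function name
tagged_smiles_to_smarts is made up for this example; it is not part of pybel
or Open Babel. Because the tag is 100 + connectivity and the pattern captures
a single digit, it only handles connectivity values 0 through 9.

```python
import re

# Matches a bracket atom whose isotope is 10N: group 1 is the digit N,
# group 2 is the rest of the atom expression (e.g. "CH3").
_ISOTOPE_TAG = re.compile(r"\[10(\d)([^]]+)\]")


def tagged_smiles_to_smarts(tagged_smiles):
    """Rewrite each [10N<atom>] term as [<atom>XN].

    Hypothetical helper: the input is the text returned by mol.write("can")
    after the isotope of each atom was set to 100 + connectivity.
    """
    # strip() drops the trailing "\t\n" that the SMILES writer emits.
    return _ISOTOPE_TAG.sub(r"[\2X\1]", tagged_smiles.strip())


if __name__ == "__main__":
    print(tagged_smiles_to_smarts("[104CH3][102OH]\t\n"))
```

Atoms without a 10N tag pass through unchanged, so the function can be applied
safely to SMILES where only some atoms were tagged.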
Andrew
[email protected]
_______________________________________________
OpenBabel-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss