Bugs item #3310779, was opened at 2011-06-02 19:35
Message generated for change (Tracker Item Submitted) made by baoilleach
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=428740&aid=3310779&group_id=40728

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Noel O'Boyle (baoilleach)
Assigned to: Nobody/Anonymous (nobody)
Summary: Handling of Implicit H in Smarts

Initial Comment:
From Andrew Dalke on list:

One of the RDKit MACCS key definitions is

  [!#6;!#1]~[!#6;!#1;!H0]

I'm working on my test suite for those definitions, as mentioned in my
previous email. Here's a test case:

>>> import pybel
>>> mol = pybel.readstring("smi", "[U]S(C)C")
>>> matcher = pybel.Smarts("[!#6;H0]")
>>> matcher.findall(mol)
[(1,), (2,)]
>>> matcher = pybel.Smarts("[!#6;!#1]~[!#6;!#1;!H0]")
>>> matcher.findall(mol)
[]

RDKit, OEChem, and Daylight say that the pattern matches that structure.
That's because all three programs say that the "S" has an implicit
hydrogen on it. Daylight says that sulfur has valence levels of
"S (2,4,6)":

  http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html

Here the S has three single bonds (to U and to two C), so the next
allowed valence is 4, which leaves one implicit hydrogen.

This looks to be a bug in the code which calculates the implicit
hydrogen count.

Here's another case where the implicit h-count is wrong, this time
with P. Daylight says the valence levels for P in SMILES are (3,5).
Given

  N=PPCC

the second atom (the first P) has a double bond and a single bond, so its
valences are filled. It should have no implicit hydrogens. However,
here is the RDKit MACCS pattern, which unexpectedly matched in
OpenBabel:

>>> mol = pybel.readstring("smi", "N=PPCC")
>>> matcher = pybel.Smarts("[!#6;!#1;!H0]~[!#6;!#1;!H0]")
>>> matcher.findall(mol)
[(1, 2), (2, 3)]
>>> Hmatcher = pybel.Smarts("[!H0]")
>>> Hmatcher.findall(mol)
[(1,), (2,), (3,), (4,), (5,)]

You can see it's because the matcher thinks all of the atoms have at
least one implicit hydrogen. Compare this to RDKit, which correctly
has the P with no implicit hydrogens.

>>> from rdkit import Chem
>>> mol = Chem.MolFromSmiles("N=PPCC")
>>> pat = Chem.MolFromSmarts("[!#6;!#1;!H0]~[!#6;!#1;!H0]")
>>> mol.GetSubstructMatches(pat)
()
>>> Hpat = Chem.MolFromSmarts("[!H0]")
>>> mol.GetSubstructMatches(Hpat)
((0,), (2,), (3,), (4,))

----------------------------------------------------------------------

_______________________________________________
OpenBabel-Devel mailing list
OpenBabel-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-devel
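
The valence reasoning in the report can be sketched in plain Python. This is only an illustration of the rule as described above (pick the lowest allowed valence that accommodates the explicit bonds, then fill the remainder with hydrogens); it is not the algorithm used by Open Babel, RDKit, or OEChem. The names `DAYLIGHT_VALENCES` and `implicit_h_count` are hypothetical. The S and P entries are the values quoted in the report; the C and N entries are assumed from the same Daylight table. Charges, aromaticity, and bracket atoms such as [U] are ignored.

```python
# Allowed valences for a few organic-subset atoms.
# S and P are quoted in the report; C and N are assumed from the same Daylight table.
DAYLIGHT_VALENCES = {
    "C": (4,),
    "N": (3, 5),
    "P": (3, 5),
    "S": (2, 4, 6),
}


def implicit_h_count(symbol, bond_order_sum):
    """Return the implicit hydrogen count for a neutral, non-bracket atom.

    Uses the lowest allowed valence that is >= the sum of explicit bond
    orders. If the bonds exceed every allowed valence, no hydrogens are added.
    """
    for valence in DAYLIGHT_VALENCES[symbol]:
        if valence >= bond_order_sum:
            return valence - bond_order_sum
    return 0


# S in [U]S(C)C: three single bonds, so valence 4 applies and one H is implied.
print(implicit_h_count("S", 3))  # 1

# N=PPCC, atom by atom: (symbol, sum of explicit bond orders)
atoms = [("N", 2), ("P", 3), ("P", 2), ("C", 2), ("C", 1)]
print([implicit_h_count(sym, bonds) for sym, bonds in atoms])  # [1, 0, 1, 2, 3]
```

Under this sketch the first P in N=PPCC gets zero implicit hydrogens, which agrees with the RDKit result above, where "[!H0]" matches every atom except index 1.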