On Feb 7, 2017, at 22:26, Curt Fischer <curt.r.fisc...@gmail.com> wrote:
> def same_implicit_valence(mol_1, mol_2, atom_idx=1):
>     """Returns True if mol_1 and mol_2 have the same implicit valence for the
>     indexed atom"""
>     mol_1_implicitH = mol_1.GetAtomWithIdx(atom_idx).GetImplicitValence()
>     mol_2_implicitH = mol_2.GetAtomWithIdx(atom_idx).GetImplicitValence()
>     return mol_1_implicitH == mol_2_implicitH

They have the same implicit valence but they have different numbers of explicit hydrogens.

>>> mol_1.GetAtomWithIdx(1).GetNumExplicitHs()
0
>>> mol_2.GetAtomWithIdx(1).GetNumExplicitHs()
2

They also have different InChI strings, as expected:

InChI=1S/C2H4O/c1-2-3/h3H,1H3/i2+1
InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3

RDKit and SMILES have different views on what "explicit" and "implicit" hydrogens mean.

In SMILES:

  [C]([H])([H])([H])[H] - 4 explicit hydrogens
  C - 4 implicit hydrogens, specified via valence (i.e., "implicitly specified")
  [CH4] - 4 implicit hydrogens, specified via notation (i.e., "explicitly specified")

I give the "i.e." terms to show how the term "implicit" can be used in two different ways. This is the long-standing source of the ambiguity in what "implicit" and "explicit" mean.

In RDKit:

C - valence of 4, no explicit hydrogens, therefore 4 implicit hydrogens are needed to fill the valence:

>>> mol = Chem.MolFromSmiles("C")
>>> mol.GetAtomWithIdx(0).GetImplicitValence()
4
>>> mol.GetAtomWithIdx(0).GetNumExplicitHs()
0

[CH4] - 4 explicit hydrogens, nothing else needed to fill the valence:

>>> mol = Chem.MolFromSmiles("[CH4]")
>>> mol.GetAtomWithIdx(0).GetImplicitValence()
0
>>> mol.GetAtomWithIdx(0).GetNumExplicitHs()
4

[C]([H])([H])([H])[H] - the hydrogen atoms are by default turned into explicit hydrogens:

>>> mol = Chem.MolFromSmiles("[C]([H])([H])([H])[H]")
>>> mol.GetNumAtoms()
1
>>> mol.GetAtomWithIdx(0).GetImplicitValence()
0
>>> mol.GetAtomWithIdx(0).GetNumExplicitHs()
4

The conversion is part of input sanitization. This can be disabled:

>>> mol = Chem.MolFromSmiles("[C]([H])([H])([H])[H]", sanitize=False)
>>> mol.GetNumAtoms()
5

BTW, your convert_to_smiles_via_inchi() will lose all atom ordering information. It works for your example because the structure is so simple and there are two carbons.

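One rough way to check whether an InChI round trip keeps the atom order is to compare the element sequence before and after. This is only a sketch. The helper names atomic_numbers() and inchi_round_trip_keeps_order() are made up for illustration and are not part of RDKit, and it assumes an RDKit build with InChI support. Comparing elements can only detect a reordering between atoms of different elements, so a True result does not prove the atom indices are unchanged.

```python
from rdkit import Chem


def atomic_numbers(mol):
    """Return the atomic numbers of the atoms in mol, in atom-index order."""
    return [atom.GetAtomicNum() for atom in mol.GetAtoms()]


def inchi_round_trip_keeps_order(smiles):
    """Hypothetical helper: True if the element sequence of the heavy atoms
    is the same before and after a SMILES -> InChI -> Mol round trip."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % (smiles,))
    # MolFromInchi builds a new molecule; the original indices are not kept.
    round_tripped = Chem.MolFromInchi(Chem.MolToInchi(mol))
    if round_tripped is None:
        raise ValueError("InChI round trip failed for: %r" % (smiles,))
    return atomic_numbers(mol) == atomic_numbers(round_tripped)
```

For a two-carbon alcohol such as "CCO" the carbons come first in both orderings, so a check like this should pass.
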
However, consider:

>>> mol = Chem.MolFromSmiles("OCP")
>>> Chem.MolToInchi(mol)
'InChI=1S/CH5OP/c2-1-3/h2H,1,3H2'
>>> mol2 = Chem.MolFromInchi(Chem.MolToInchi(mol))
>>> [atom.GetAtomicNum() for atom in mol2.GetAtoms()]
[6, 8, 15]

As a SMILES this would be written C(O)P, where the atoms are re-ordered.

Andrew
da...@dalkescientific.com