On Feb 7, 2017, at 22:26, Curt Fischer <curt.r.fisc...@gmail.com> wrote:
> def same_implicit_valence(mol_1, mol_2, atom_idx=1):
>     """Returns True if mol_1 and mol_2 have the same implicit valence for the indexed atom"""
>     mol_1_implicitH = mol_1.GetAtomWithIdx(atom_idx).GetImplicitValence()
>     mol_2_implicitH = mol_2.GetAtomWithIdx(atom_idx).GetImplicitValence()
>     return mol_1_implicitH == mol_2_implicitH

They have the same implicit valence but different numbers of explicit 
hydrogens.

>>> mol_1.GetAtomWithIdx(1).GetNumExplicitHs()
0
>>> mol_2.GetAtomWithIdx(1).GetNumExplicitHs()
2

They also have different InChI strings, as expected:

  InChI=1S/C2H4O/c1-2-3/h3H,1H3/i2+1
  InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3

RDKit and SMILES have different views on what "explicit" and "implicit" 
hydrogens mean.

In SMILES:
  [C]([H])([H])([H])[H] - 4 explicit hydrogens
  C - 4 implicit hydrogens, specified via valence (i.e., "implicitly specified")
  [CH4] - 4 implicit hydrogens, specified via notation (i.e., "explicitly 
specified")

I give the "i.e." terms to show how the term "implicit" can be used in two 
different ways. This is the long-standing source of the ambiguity in what 
"implicit" and "explicit" mean.
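
To make the "specified via valence" case concrete, here is a minimal sketch of 
the SMILES rule for atoms written without brackets, as I understand it from 
the OpenSMILES description. It is not RDKit code. The names NORMAL_VALENCES 
and implicit_h_count are made up for this example, and it only covers 
aliphatic organic-subset atoms (no aromatic atoms, no bracket atoms).

```python
# Normal valences of the SMILES "organic subset" elements.
# Assumption: this table follows the OpenSMILES description; aromatic
# atoms and bracket atoms are deliberately left out of this sketch.
NORMAL_VALENCES = {
    "B": (3,), "C": (4,), "N": (3, 5), "O": (2,),
    "P": (3, 5), "S": (2, 4, 6),
    "F": (1,), "Cl": (1,), "Br": (1,), "I": (1,),
}


def implicit_h_count(symbol, bond_order_sum=0):
    """Hydrogens implied by valence for an unbracketed SMILES atom.

    bond_order_sum is the sum of the bond orders to the atom's neighbors.
    The lowest normal valence that can hold those bonds is filled up with
    hydrogens; if the bonds exceed every normal valence, no hydrogens are
    added.
    """
    for valence in NORMAL_VALENCES[symbol]:
        if valence >= bond_order_sum:
            return valence - bond_order_sum
    return 0


if __name__ == "__main__":
    print(implicit_h_count("C"))     # the lone carbon in "C"
    print(implicit_h_count("O", 1))  # the oxygen in "CO"
```

For a bracket atom such as [CH4] or [C] none of this applies: the hydrogen 
count is whatever is written inside the brackets, and zero if nothing is 
written.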



In RDKit:

   C - valence of 4, no explicit hydrogens, therefore 4 implicit hydrogens
       needed to fill the valence:

>>> mol = Chem.MolFromSmiles("C")
>>> mol.GetAtomWithIdx(0).GetImplicitValence()
4
>>> mol.GetAtomWithIdx(0).GetNumExplicitHs()
0

   [CH4] - 4 explicit hydrogens, nothing else needed to fill the valence

>>> mol = Chem.MolFromSmiles("[CH4]")
>>> mol.GetAtomWithIdx(0).GetImplicitValence()
0
>>> mol.GetAtomWithIdx(0).GetNumExplicitHs()
4

   [C]([H])([H])([H])[H] - by default the four hydrogen atoms are removed
       and recorded as 4 explicit hydrogens on the carbon

>>> mol = Chem.MolFromSmiles("[C]([H])([H])([H])[H]")
>>> mol.GetNumAtoms()
1
>>> mol.GetAtomWithIdx(0).GetImplicitValence()
0
>>> mol.GetAtomWithIdx(0).GetNumExplicitHs()
4
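
Since the explicit/implicit split depends on how the input was written, 
comparing the total hydrogen count is one way to sidestep the issue. The 
following is only a sketch, assuming RDKit is installed; h_counts is a 
hypothetical helper name, and it only looks at atom 0 of each molecule.

```python
from rdkit import Chem


def h_counts(smiles_list):
    """Map each SMILES to (explicit, implicit, total) H counts on atom 0.

    Hypothetical helper for illustration; uses the default (sanitizing)
    SMILES parser.
    """
    counts = {}
    for smiles in smiles_list:
        mol = Chem.MolFromSmiles(smiles)
        atom = mol.GetAtomWithIdx(0)
        counts[smiles] = (
            atom.GetNumExplicitHs(),
            atom.GetNumImplicitHs(),
            atom.GetTotalNumHs(),
        )
    return counts


if __name__ == "__main__":
    for smiles, triple in h_counts(["C", "[CH4]", "[C]([H])([H])([H])[H]"]).items():
        print(smiles, triple)
```

The first two numbers in each tuple differ between the three spellings of 
methane, but the total should agree.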


The conversion is part of input sanitization. This can be disabled:
>>> mol = Chem.MolFromSmiles("[C]([H])([H])([H])[H]", sanitize=False)
>>> mol.GetNumAtoms()
5

BTW, your convert_to_smiles_via_inchi() will lose all atom ordering 
information. It works for your example because the structure is so simple and 
there are two carbons. However, consider:

>>> mol = Chem.MolFromSmiles("OCP")
>>> Chem.MolToInchi(mol)
'InChI=1S/CH5OP/c2-1-3/h2H,1,3H2'
>>> mol2 = Chem.MolFromInchi(Chem.MolToInchi(mol))
>>> [atom.GetAtomicNum() for atom in mol2.GetAtoms()]
[6, 8, 15]

As a SMILES this would be written

  C(O)P

where the atoms are re-ordered.

                                Andrew
                                da...@dalkescientific.com


Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss