Hi Navid,

I think you have a few options. One is to loop over your molecule’s atoms, 
find the hydrogens without any neighbors (degree = 0), and then delete them. 
In Python, finding them would look something like the following:

from rdkit import Chem
from rdkit.Chem import rdmolops

# mol = Chem.MolFromSmiles("C#CC(O)C1CCN1.[HH]")
mol = Chem.MolFromSmiles("C#CC(O)C1CCN1.[H].[H]")
# Collect hydrogens (atomic number 1) that have no bonded neighbors.
disconnected_hydrogens = [
    atom for atom in mol.GetAtoms()
    if atom.GetAtomicNum() == 1 and atom.GetDegree() == 0
]
print([atom.GetIdx() for atom in disconnected_hydrogens])
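
The snippet above only lists the isolated hydrogens. A minimal sketch of the deletion step follows, assuming an editable copy (RWMol) is acceptable; the helper name remove_isolated_hydrogens is made up for illustration and is not part of RDKit.

```python
from rdkit import Chem


def remove_isolated_hydrogens(mol):
    """Return a copy of mol without hydrogens that have no neighbors.

    Hypothetical helper for illustration; not an RDKit function.
    """
    editable = Chem.RWMol(mol)  # works on a copy, the input is untouched
    isolated = [
        atom.GetIdx()
        for atom in editable.GetAtoms()
        if atom.GetAtomicNum() == 1 and atom.GetDegree() == 0
    ]
    # Remove from the highest index down, because removing an atom
    # renumbers the atoms that come after it.
    for idx in sorted(isolated, reverse=True):
        editable.RemoveAtom(idx)
    return editable.GetMol()


mol = Chem.MolFromSmiles("C#CC(O)C1CCN1.[H].[H]")
cleaned = remove_isolated_hydrogens(mol)
print(mol.GetNumAtoms(), cleaned.GetNumAtoms())
print(Chem.MolToSmiles(cleaned))
```

Hydrogens that are bonded to anything are left alone, so defective graphs with bonds to dummy atoms can still be counted afterwards.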

If you know that your dummy hydrogens aren’t connected to the rest of the 
graph, you could also split the molecule into its disconnected fragments:

disconnected_fragments = rdmolops.GetMolFrags(mol, asMols=True)
print([Chem.MolToSmiles(fragment) for fragment in disconnected_fragments])
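
As a sketch of how the fragment list might then be filtered, the following keeps only fragments that contain at least one heavy atom. It assumes every dummy hydrogen is its own fragment; a dummy that is bonded to the molecule by mistake would stay inside its fragment.

```python
from rdkit import Chem
from rdkit.Chem import rdmolops

mol = Chem.MolFromSmiles("C#CC(O)C1CCN1.[H].[H]")

# Each disconnected piece of the graph becomes its own Mol object.
fragments = rdmolops.GetMolFrags(mol, asMols=True)

# Drop fragments made only of hydrogens (no heavy atoms at all).
heavy_fragments = [frag for frag in fragments if frag.GetNumHeavyAtoms() > 0]

print(len(fragments), len(heavy_fragments))
print([Chem.MolToSmiles(frag) for frag in heavy_fragments])
```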

As for using dummy atoms, one thing that comes to mind is using atoms with an 
atomic number of 0. Depending on the molecular property you are calculating, 
this may be good enough. You can set the atomic number with the 
atom.SetAtomicNum(0) method.
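
A minimal sketch of that idea, assuming the graph is built atom by atom into an RWMol: padding nodes are added with atomic number 0, any dummy that ended up with a bond is recorded, and only the unbonded dummies are removed. The four-node toy graph (C, O, and two padding nodes) is invented for illustration.

```python
from rdkit import Chem

# Toy graph: carbon, oxygen, and two padding nodes as dummy atoms.
editable = Chem.RWMol()
for atomic_num in (6, 8, 0, 0):
    editable.AddAtom(Chem.Atom(atomic_num))
editable.AddBond(0, 1, Chem.BondType.SINGLE)

dummies = [a.GetIdx() for a in editable.GetAtoms() if a.GetAtomicNum() == 0]

# Dummies with a bond indicate a defective graph; keep them for statistics.
bonded_dummies = [
    idx for idx in dummies if editable.GetAtomWithIdx(idx).GetDegree() > 0
]

# Remove only the unbonded dummies, highest index first so that the
# remaining indices stay valid while atoms are being removed.
for idx in sorted(dummies, reverse=True):
    if editable.GetAtomWithIdx(idx).GetDegree() == 0:
        editable.RemoveAtom(idx)

padded_mol = editable.GetMol()
Chem.SanitizeMol(padded_mol)
print(dummies, bonded_dummies, padded_mol.GetNumAtoms())
```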

As a side note, I’m not sure the SMILES you provided is valid. Perhaps you 
should write each hydrogen as its own fragment (see the code above)?

Best regards,
Alan

From: Navid Shervani-Tabar<mailto:nshe...@gmail.com>
Sent: 09 June 2020 21:47
To: RDKit Discuss<mailto:rdkit-discuss@lists.sourceforge.net>
Subject: [Rdkit-discuss] Removing disconnected hydrogens

Hello RDKitters,

I'm using a function to convert a molecular graph to RDKit's mol object. Input 
molecules have a maximum size of N atoms. Molecules with fewer than N atoms 
have dummy atoms at the unused nodes. Currently, I use hydrogen as the dummy 
atom when building the editable RWMol object. This results in hydrogen atoms 
without neighbours. An example of such a molecule has the SMILES representation 
'C#CC(O)C1CCN1.[HH]'. I was wondering:

  1.  How can I remove the hydrogens without neighbours? These hydrogens are 
currently affecting the molecular properties.
  2.  Is there a better option to use as the dummy atom? Something that 
potentially would not affect the molecular properties.

PS: I can't skip the dummy atoms while building the mol object because some 
graphs mistakenly have bonds connected to these atoms and I need the statistics 
on the defective molecules.

Thanks,
Navid


_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
