Hi Greg,
On Sun, Aug 12, 2012 at 10:00 AM, Greg Landrum <[email protected]> wrote:
> Hi Fabian,
>
> On Fri, Aug 10, 2012 at 3:21 PM, Fabian Dey <[email protected]> wrote:
> >
> >
> > I am in the process of a scaffold analysis, printing the scaffolds in
> > SMILES format, and came across some unexpected behaviour:
> > Whenever I remove hydrogens with RemoveHs() and print out the SMILES
> > string, some hydrogens remain attached (irrespective of whether the
> > original input file type is SDF or SMILES):
> >
> >
> >
> > from rdkit import Chem
> > from rdkit.Chem.Scaffolds import MurckoScaffold
> >
> > # molecule from zinc:
> > suppl = Chem.SDMolSupplier("zinc_69443014.sdf")
> > for mol in suppl:
> >     mol = Chem.RemoveHs(mol)
> >     print 'Mol1: %s' %(Chem.MolToSmiles(mol))
> >
> > mol = Chem.MolFromSmiles("c1cc(C[NH2+]CC2CNc3ccnn3C2)[nH]n1")
> > mol = Chem.RemoveHs(mol)
> > print 'Mol2: %s' %(Chem.MolToSmiles(mol))
> >
> >
> > Output:
> > Mol1: Cn1nccc1C[NH2+]CC1CNc2ccnn2C1
> > Mol2: c1cc(C[NH2+]CC2CNc3ccnn3C2)[nH]n1
>
> The function RemoveHs() goes through the molecular graph and removes
> hydrogens that are present as explicit atoms. It doesn't affect the
> hydrogen count on any given atom though. Any other behavior would
> change the chemistry of the molecule, which is definitely not the
> intent.
>
> What are you trying to do? What output would you expect above?
>
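To illustrate the distinction Greg describes, here is a minimal sketch (written for Python 3, unlike the Python 2 snippets in this thread). It uses pyrrole as a stand-in molecule, which is my own choice and not taken from the thread, to show that RemoveHs() deletes hydrogens that exist as atoms in the graph while the hydrogen count stored on the aromatic nitrogen stays in place:

```python
from rdkit import Chem

# Pyrrole: an illustrative molecule, not one from the thread.
mol = Chem.MolFromSmiles("c1cc[nH]c1")
n_heavy = mol.GetNumAtoms()          # heavy atoms only

# AddHs() turns every hydrogen into a real atom in the molecular graph.
mol_h = Chem.AddHs(mol)
n_with_h = mol_h.GetNumAtoms()

# RemoveHs() removes those hydrogen atoms from the graph again ...
mol_back = Chem.RemoveHs(mol_h)
n_back = mol_back.GetNumAtoms()

# ... but the hydrogen count on the nitrogen is untouched, so the
# SMILES writer still has to emit [nH] to describe the same chemistry.
nitrogen = [a for a in mol_back.GetAtoms() if a.GetSymbol() == "N"][0]
nh = nitrogen.GetTotalNumHs()
smiles_back = Chem.MolToSmiles(mol_back)

print(n_heavy, n_with_h, n_back, nh, smiles_back)
```

The bracketed hydrogens in the SMILES are therefore a property of the atom rather than leftover hydrogen atoms.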
I was doing a scaffold analysis when I ran into the "persistent" hydrogens
that would not be removed, so I decided to also check what the function
does on a complete molecule. I understand that information on the hydrogen
count should not be deleted, but I was expecting that the hydrogens would
not be part of the SMILES output, e.g. from
print 'Scaffold2: %s' %(Chem.MolToSmiles(sc2[0])).
Would it make sense to add an argument to MolToSmiles(...) so that only
the heavy-atom graph is returned (the default could remain "false")?
Thanks for the help
Fabian
PS: For the Bemis & Murcko scaffold the output is similar:
>>> from rdkit import Chem
>>> from rdkit.Chem.Scaffolds import MurckoScaffold
>>> mol = Chem.MolFromSmiles("c1cc(C[NH2+]CC2CNc3ccnn3C2)[nH]n1")
>>> sc = [MurckoScaffold.GetScaffoldForMol(mol)]
>>> print '%s ' %(Chem.MolToSmiles(sc[0]))
c1cc(C[NH2+]CC2CNc3ccnn3C2)[nH]n1
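The heavy-atom-only output asked about above can be approximated by hand. The following is only a sketch under my own assumptions (Python 3; heavy_atom_graph_smiles and its to_carbon flag are hypothetical names, not part of RDKit): it flattens the molecule to single bonds, clears charges and explicit hydrogen counts, and then writes the SMILES. Bond orders and aromaticity are deliberately discarded, because simply dropping the H from an aromatic [nH] would leave a ring that can no longer be kekulized.

```python
from rdkit import Chem

def heavy_atom_graph_smiles(mol, to_carbon=False):
    """Return a SMILES for the bare heavy-atom graph of mol.

    Hypothetical helper, not an RDKit function. With to_carbon=True every
    atom is also replaced by carbon (a generic carbon skeleton); atoms
    with more than four neighbours are not handled in that mode.
    """
    # Drop hydrogens that are present as explicit atoms in the graph.
    rw = Chem.RWMol(Chem.RemoveHs(mol))
    for atom in rw.GetAtoms():
        atom.SetFormalCharge(0)
        atom.SetNumExplicitHs(0)      # forget the stored H count
        atom.SetNoImplicit(False)     # let valence be filled implicitly
        atom.SetNumRadicalElectrons(0)
        atom.SetIsAromatic(False)
        if to_carbon:
            atom.SetAtomicNum(6)
    for bond in rw.GetBonds():
        bond.SetBondType(Chem.BondType.SINGLE)
        bond.SetIsAromatic(False)
    # Recompute valences so the writer sees consistent implicit H counts.
    Chem.SanitizeMol(rw)
    return Chem.MolToSmiles(rw)

mol = Chem.MolFromSmiles("c1cc(C[NH2+]CC2CNc3ccnn3C2)[nH]n1")
smi = heavy_atom_graph_smiles(mol)
generic = heavy_atom_graph_smiles(mol, to_carbon=True)
print(smi)
print(generic)
```

Because every atom ends up neutral with a default valence, the writer has no reason to emit bracket atoms, so no hydrogens or charges appear in either string.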
>
> > This is also the case when extracting the carbon scaffold of a molecule
> > (even after resetting all formal charges to zero):
> >
> > mol = Chem.MolFromSmiles("c1cc(C[NH2+]CC2CNc3ccnn3C2)[nH]n1")
> > mol = Chem.RemoveHs(mol,implicitOnly=False)
> > sc = [MurckoScaffold.MakeScaffoldGeneric(mol)]
> > print 'Scaffold1: %s' %(Chem.MolToSmiles(sc[0]))
> >
> >
> > mol = Chem.MolFromSmiles("c1cc(C[NH2+]CC2CNc3ccnn3C2)[nH]n1")
> > sc2 = [MurckoScaffold.MakeScaffoldGeneric(mol)]
> > sc2[0] = Chem.RemoveHs(sc2[0],implicitOnly=False)
> > print 'Scaffold2: %s' %(Chem.MolToSmiles(sc2[0]))
> >
> >
> > mol = Chem.MolFromSmiles("c1cc(C[NH2+]CC2CNc3ccnn3C2)[nH]n1")
> > mol = Chem.RemoveHs(mol,implicitOnly=False)
> > for atom in mol.GetAtoms():
> >     atom.SetFormalCharge(0)
> > sc3 = [MurckoScaffold.MakeScaffoldGeneric(mol)]
> > sc3[0] = Chem.RemoveHs(sc3[0],implicitOnly=False)
> > print 'Scaffold3: %s' %(Chem.MolToSmiles(sc3[0]))
> >
> > Output:
> > Scaffold1: [CH]1CCCC1C[CH2+]CC1CCC2CCCC2C1
> > Scaffold2: [CH]1CCCC1C[CH2+]CC1CCC2CCCC2C1
> > Scaffold3: [CH]1CCCC1CCCC1CCC2CCCC2C1
>
> That looks like a bug in MurckoScaffold.MakeScaffoldGeneric(); it
> should be removing the explicit H counts and setting charges to zero.
> Thanks for reporting it.
>
> -greg
>
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss