Hi Greg,

The molecules I processed were already protonated (hence the charge on the
nitrogen), so it makes sense, as you suggested, to write out the hydrogens
in the SMILES string. Out of curiosity, to get the heavy-atom graph of an
initially protonated and potentially charged molecule (without intending to
retain that information), I first reset the formal charges to 0 and then
called RemoveHs(), which gave the heavy-atom graph:

from rdkit import Chem

suppl = Chem.SDMolSupplier("zinc_69443014.sdf")
for mol in suppl:
    # Neutralise every atom, then strip the hydrogens
    for atom in mol.GetAtoms():
        atom.SetFormalCharge(0)
    mol = Chem.RemoveHs(mol)
    print('Mol: %s' % Chem.MolToSmiles(mol))

Output:
Mol: Cn1nccc1CNCC1CNc2ccnn2C1
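
For anyone without the SD file at hand, the same idea can be sketched starting
from the charged SMILES quoted further down in this thread. This is only a
rough illustration, not a general neutraliser: the helper name
heavy_atom_graph_smiles is made up for this example, and it assumes that every
charged atom can simply be set to charge 0 and have its hydrogen count
recomputed. That assumption fails for cases such as quaternary nitrogens,
where sanitization will raise an error.

```python
from rdkit import Chem


def heavy_atom_graph_smiles(smiles):
    """Return a SMILES with all formal charges reset to 0.

    Hypothetical helper for illustration only. Atoms written in square
    brackets carry a fixed H count, so that count is cleared on the
    charged atoms and RDKit recomputes the implicit hydrogens.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %s" % smiles)
    for atom in mol.GetAtoms():
        if atom.GetFormalCharge() != 0:
            atom.SetFormalCharge(0)
            atom.SetNumExplicitHs(0)   # drop the bracketed H count
            atom.SetNoImplicit(False)  # allow implicit Hs to be recomputed
    Chem.SanitizeMol(mol)  # recomputes valences and implicit H counts
    return Chem.MolToSmiles(mol)


print(heavy_atom_graph_smiles("Cn1nccc1C[NH2+]CC1CNc2ccnn2C1"))
```

Only atoms that actually carry a charge are touched, so aromatic [nH] atoms
elsewhere in a molecule keep the hydrogen they need for kekulization.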


Thanks again & cheers

Fabian



On Mon, Aug 13, 2012 at 8:36 AM, Greg Landrum <[email protected]> wrote:

> Fabian,
>
> On Mon, Aug 13, 2012 at 8:27 AM, Fabian Dey <[email protected]> wrote:
> >
> > I was in the process of doing a scaffold analysis when I ran into the
> > "persistent" hydrogens that would not be removed, and thus decided to
> > also check what the function does on a complete molecule. I understand
> > that information on the hydrogen count should not be deleted, but I was
> > expecting that the hydrogens would not be part of the SMILES output
> > (print 'Scaffold2: %s' %(Chem.MolToSmiles(sc2[0]))).
> > Would it make sense to add an argument to "MolToSmiles(...)" so that
> > only the heavy-atom graph is returned (the default could still be
> > "false")?
>
> Are you asking to have the SMILES "Cn1nccc1C[NH2+]CC1CNc2ccnn2C1"
> printed as "Cn1nccc1C[N+]CC1CNc2ccnn2C1"?
> That would be a different molecule. The rule with SMILES is that as
> soon as an atom is written inside of square brackets (which you have
> to do when there's a charge present) the attached hydrogens also have
> to be explicitly present within the square brackets. If there are no
> hydrogens in the square brackets, the atom has no Hs attached. So the
> second form above, "Cn1nccc1C[N+]CC1CNc2ccnn2C1", is a di-radical.
>
> Hs also need to be present when there is ambiguity about what the H
> count might be. This happens frequently with aromatic N. For example,
> the molecule represented by "C1=CNC=C1" produces the SMILES
> "c1c[nH]cc1" because otherwise it would not be clear that the N
> carries an H atom.
>
> Does this help?
> -greg
>
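
The two rules Greg describes above can be checked directly in RDKit. The
snippet below is a small sketch: the helper describe_atom and the
dimethylammonium-like test SMILES are chosen for illustration and do not come
from the thread. It compares the hydrogen count and radical-electron count of
a bracketed nitrogen written with and without its Hs, and then round-trips
the Kekule form of pyrrole.

```python
from rdkit import Chem


def describe_atom(smiles, atom_idx):
    """Return (total H count, radical electrons) for one atom.

    Hypothetical helper for illustration only.
    """
    mol = Chem.MolFromSmiles(smiles)
    atom = mol.GetAtomWithIdx(atom_idx)
    return atom.GetTotalNumHs(), atom.GetNumRadicalElectrons()


# The nitrogen is atom index 1 in both SMILES.
protonated = describe_atom("C[NH2+]C", 1)  # Hs written inside the brackets
bare = describe_atom("C[N+]C", 1)          # no Hs inside the brackets

print("C[NH2+]C  Hs=%d radicals=%d" % protonated)
print("C[N+]C    Hs=%d radicals=%d" % bare)

# Aromatic nitrogen: the H has to be written out to keep the count unambiguous.
pyrrole_smiles = Chem.MolToSmiles(Chem.MolFromSmiles("C1=CNC=C1"))
print(pyrrole_smiles)
```

The bare "[N+]" form is read as having no attached hydrogens, which is why
dropping the Hs from the bracket changes the molecule rather than just its
notation.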
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss