Fabian,

On Mon, Aug 13, 2012 at 8:27 AM, Fabian Dey <[email protected]> wrote:
>
> I was in the process of doing a scaffold analysis when I ran into the
> "persistent" hydrogens that would not be removed and thus
> decided to check also what the function does on a complete molecule. I
> understand that information on the hydrogen count
> should not be deleted, but I was expecting that the hydrogens would not be
> part of the Smiles output (print 'Scaffold2: %s'
> %(Chem.MolToSmiles(sc2[0])).
> Would it make sense to include an argument for "MolToSmiles(...)" which
> results in that only the heavy-atom graph is returned
> (one could still keep the default as "false") ?

Are you asking to have the SMILES "Cn1nccc1C[NH2+]CC1CNc2ccnn2C1" printed as
"Cn1nccc1C[N+]CC1CNc2ccnn2C1"? That would be a different molecule.

The rule with SMILES is that as soon as an atom is written inside square
brackets (which you have to do when a charge is present), its attached
hydrogens must also be written explicitly inside the brackets. If there are
no hydrogens in the square brackets, the atom has no Hs attached. So the
second form above, "Cn1nccc1C[N+]CC1CNc2ccnn2C1", is a di-radical.

Hs also need to be present when the H count would otherwise be ambiguous.
This happens frequently with aromatic N. For example, the molecule
represented by "C1=CNC=C1" produces the SMILES "c1c[nH]cc1" because otherwise
it would not be clear that the N carries an H atom.

Does this help?

-greg
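
To make the bracket-atom rule above concrete, here is a minimal sketch in plain Python. It is not RDKit's parser: the `bracket_h_counts` helper and its regular expression are hypothetical names made up for this illustration, and the pattern only handles simple bracket atoms (element symbol, optional H count, optional charge), ignoring isotopes, chirality and atom maps. It just reads off the hydrogen count that each bracket atom states.

```python
import re

# Simplified pattern for one bracket atom such as [NH2+], [N+] or [nH].
# Assumption: no isotope, chirality or atom-map fields are present.
BRACKET_ATOM = re.compile(r"\[([A-Za-z][a-z]?)(?:H(\d*))?([+-]+\d*)?\]")


def bracket_h_counts(smiles):
    """Return a list of (symbol, h_count) for each bracket atom in `smiles`."""
    counts = []
    for match in BRACKET_ATOM.finditer(smiles):
        symbol, h_digits = match.group(1), match.group(2)
        if h_digits is None:
            h_count = 0  # no H inside the brackets: the atom has no Hs
        elif h_digits == "":
            h_count = 1  # a bare "H" means exactly one hydrogen
        else:
            h_count = int(h_digits)
        counts.append((symbol, h_count))
    return counts


print(bracket_h_counts("Cn1nccc1C[NH2+]CC1CNc2ccnn2C1"))  # [('N', 2)]
print(bracket_h_counts("Cn1nccc1C[N+]CC1CNc2ccnn2C1"))    # [('N', 0)]
print(bracket_h_counts("c1c[nH]cc1"))                     # [('n', 1)]
```

The first two strings differ only in the bracket atom, yet they state different hydrogen counts on the charged nitrogen, which is why dropping the "H2" changes the molecule and is not just a change in how it is printed.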

