[Rdkit-discuss] Canonicalisation with reaction labels

2016-12-16 Thread Stephen Pickett
Hi, With a 2013 RDKit install we get consistent canonicalization between reaction-labelled and unlabelled atoms.
>>> mol = Chem.MolFromSmiles('C1CC([*])CCN1')
>>> Chem.MolToSmiles(mol)
'[*]C1CCNCC1'
>>> mol = Chem.MolFromSmiles('C1CC([*:1])CCN1')
>>> Chem.MolToSmiles(mol)
'[*:1]C1CCNCC1'
In 2015-

Re: [Rdkit-discuss] Canonicalisation with reaction labels

2016-12-16 Thread Andrew Dalke
On Dec 16, 2016, at 1:55 PM, Stephen Pickett wrote:
> With a 2013 RDkit install we get consistent canonicalization between reaction
> labelled and unlabelled atoms.
> >>> mol = Chem.MolFromSmiles('C1CC([*])CCN1')
> >>> Chem.MolToSmiles(mol)
> '[*]C1CCNCC1'
> >>> mol = Chem.MolFromSmiles('C1CC([*:1

Re: [Rdkit-discuss] Canonicalisation with reaction labels

2016-12-16 Thread Greg Landrum
Hi Stephen, The new canonicalization algorithm intentionally takes the atom-mapping information into account. The logic is that the entire SMILES provided should be canonical, so if the SMILES includes atom maps, those atom maps should be considered while canonicalizing. If you have a molecule wi
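
To illustrate the behaviour described above, here is a minimal sketch of one way to obtain the same canonical SMILES for labelled and unlabelled input: clear the atom-map numbers before writing the SMILES. It assumes a reasonably recent RDKit; the helper name `canonical_smiles_ignoring_maps` is hypothetical and is not taken from the thread or from RDKit itself.

```python
from rdkit import Chem


def canonical_smiles_ignoring_maps(smiles):
    """Canonical SMILES for `smiles` with atom-map numbers removed first.

    Hypothetical helper for illustration; not part of RDKit.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    for atom in mol.GetAtoms():
        # Setting the map number to 0 clears the atom map on this atom.
        atom.SetAtomMapNum(0)
    return Chem.MolToSmiles(mol)


labelled = canonical_smiles_ignoring_maps('C1CC([*:1])CCN1')
unlabelled = canonical_smiles_ignoring_maps('C1CC([*])CCN1')
print(labelled == unlabelled)
```

With the map numbers cleared, the two strings are expected to match, because the canonicalizer no longer has any atom-map information to distinguish the inputs. The labels are lost in the output, so this only suits cases where they are not needed afterwards.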

Re: [Rdkit-discuss] Canonicalisation with reaction labels

2016-12-16 Thread Stephen Pickett
Thanks Greg, that’s clear. Stephen

Re: [Rdkit-discuss] Canonicalisation with reaction labels

2016-12-16 Thread Syed Asad Rahman
Interesting question. Not sure if it's relevant, but in ECBlast we do provide canonicalised reaction labels. I agree with Greg that atom-atom mapping (AAM) is important. https://github.com/asad/ReactionDecoder http://www.ebi.ac.uk/thornton-srv/software/rbl/ Regards, Asad

[Rdkit-discuss] SetAtomAlias

2016-12-16 Thread Jean-Marc Nuzillard
Hi all, I am trying to add labels to atoms in a molecule, so that lines like A1 C12 A2 C3 are written when the molecule is written to an SD file. Given an atom a and alias text txt, I expected the function call SetAtomAlias(a, txt) to do the job. I found this function in a documentation page a

Re: [Rdkit-discuss] SetAtomAlias

2016-12-16 Thread Paolo Tosco
Dear Jean-Marc, here: https://gist.github.com/ptosco/6e4468350f0fff183e4507ef24f092a1#file-pdb_atom_names-ipynb there's an example of how to use the atom aliases in RDKit. Cheers, p. On 12/16/2016 10:26 PM, Jean-Marc Nuzillard wrote: > Hi all, > > I try add labels to atoms in a molecule, so t

Re: [Rdkit-discuss] SetAtomAlias

2016-12-16 Thread Peter Gedeck
Hello, SetAtomAlias is available in Python as a function and not as an Atom method:
from rdkit import Chem
import sys
m = Chem.MolFromSmiles('CCC')
for i, atom in enumerate(m.GetAtoms()):
    Chem.SetAtomAlias(atom, 'C' + str(i + 1))
w = Chem.SDWriter(sys.stdout)
w.write(m)
w.close()
Best, Pete

[Rdkit-discuss] SDwriter

2016-12-16 Thread Milinda Samaraweera
Dear Users, I was using the SDWriter in RDKit to generate an SD file with multiple entries generated from SMILES, and later assigned SD tag data (e.g. pubchem_ID, IUPAC_name, etc). However, at the end of each tag header I noticed there is a number (bolded): ... > * (1) * N1-(2-ethylbutyl)he

Re: [Rdkit-discuss] SDwriter

2016-12-16 Thread Peter Gedeck
Hello, In cases like this, I know that Greg implemented this validly according to the standard. If you check the CTfile definition (http://c4.cabrillo.edu/404/ctfile.pdf#page41) you will see that the data header is pretty flexible. The only requirement is that it starts with a >. Usually we fin

Re: [Rdkit-discuss] SDwriter

2016-12-16 Thread Andrew Dalke
On Dec 17, 2016, at 1:45 AM, Milinda Samaraweera wrote: > However at the end of each tag header I noticed there is a number (bolded): > > ... > >(1) > N1-(2-ethylbutyl)hexane-1,3,6-triamine ... > What is this number and how you avoid printing this number when SDwriter is > used? As this n

Re: [Rdkit-discuss] SDwriter

2016-12-16 Thread Greg Landrum
It's easy enough to make this an option, but given that it is part of the SDF spec (as Andrew has pointed out) the only reason I can think of to do so would be because it causes problems for some other piece of (likely commonly used) software. Are the sequence numbers causing a problem for you?

Re: [Rdkit-discuss] SDwriter

2016-12-16 Thread Milinda Samaraweera
This SD file is then used as input for another program, and that program has problems reading the sequence numbers. Thanks, MAK On Fri, Dec 16, 2016 at 10:43 PM, Greg Landrum wrote: > It's easy enough to make this an option, but given that it is part of the > SDF spec (as Andrew has pointe
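
For a downstream reader that cannot handle the sequence numbers, one possible workaround is to post-process the SD text after writing. This is a minimal sketch using only the Python standard library. It assumes data headers of the form `> <name> (n)`; the function name `strip_sequence_numbers` is hypothetical and is not an RDKit API.

```python
import re

# Matches an SD data header such as ">  <IUPAC_name>  (1) " and captures
# everything up to and including the ">" that closes the field name.
_HEADER_WITH_SEQ = re.compile(r"^(>\s*<[^>]*>)\s*\(\d+\)\s*$")


def strip_sequence_numbers(sdf_text):
    """Return sdf_text with "(n)" sequence numbers removed from data headers.

    Hypothetical helper for illustration; all other lines are left unchanged.
    """
    cleaned = []
    for line in sdf_text.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        ending = line[len(body):]  # preserve the original line ending
        match = _HEADER_WITH_SEQ.match(body)
        cleaned.append((match.group(1) if match else body) + ending)
    return "".join(cleaned)


sample = (
    ">  <IUPAC_name>  (1) \n"
    "N1-(2-ethylbutyl)hexane-1,3,6-triamine\n"
    "\n"
    "$$$$\n"
)
print(strip_sequence_numbers(sample), end="")
```

Only lines that look like a data header followed by a parenthesised integer are rewritten, so the connection table and the data values pass through untouched. A data value that itself has exactly that shape would also be rewritten, which is a limitation of this simple line-based approach.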