Re: [Rdkit-discuss] Stereochemistry - Differences between RDKit & Indigo

2015-08-19 Thread Greg Landrum
Hi Rob,

The results below are quite strange. As John has already pointed out: there
really shouldn't be chirality present on either the N+ or the C that has
two methyls attached.

I tried to reproduce the problem by running corina myself using the same
command-line options you provided (from SMILES instead of SDF, but I don't
think that should make a difference), but I get sensible results:

In [5]: s = Chem.SDMolSupplier('sample.sdf')

In [6]: for m in s:
   ...:     Chem.AssignAtomChiralTagsFromStructure(m)
   ...:     Chem.AssignStereochemistry(m,cleanIt=True,force=True)
   ...:     print(Chem.MolToSmiles(m,True))
   ...:
CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1
CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1
CCN1CCC([N@H+]2CC[C@@H]2C(C)C)CC1
CCN1CCC([N@H+]2CC[C@@H]2C(C)C)CC1
CCN1CCC([N@@H+]2CC[C@H]2C(C)C)CC1
CCN1CCC([N@@H+]2CC[C@H]2C(C)C)CC1
CCN1CCC([N@H+]2CC[C@H]2C(C)C)CC1
CCN1CCC([N@H+]2CC[C@H]2C(C)C)CC1

In [7]: s = Chem.SDMolSupplier('sample.sdf')

In [8]: for m in s:
   ...:     Chem.AssignAtomChiralTagsFromStructure(m)
   ...:     print(Chem.MolToSmiles(m,True))
   ...:
CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1
CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1
CCN1CCC([N@H+]2CC[C@@H]2C(C)C)CC1
CCN1CCC([N@H+]2CC[C@@H]2C(C)C)CC1
CCN1CCC([N@@H+]2CC[C@H]2C(C)C)CC1
CCN1CCC([N@@H+]2CC[C@H]2C(C)C)CC1
CCN1CCC([N@H+]2CC[C@H]2C(C)C)CC1
CCN1CCC([N@H+]2CC[C@H]2C(C)C)CC1
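
A quick way to see why these eight records collapse to four distinct strings is to pull out the bracket atoms that carry a chirality tag. The sketch below is purely textual: it uses a regular expression rather than any RDKit call, and `stereo_atoms` is a name made up for this illustration. The SMILES are copied from the session output above.

```python
import re

# The eight SMILES printed by the session above.
smiles = [
    "CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1",
    "CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1",
    "CCN1CCC([N@H+]2CC[C@@H]2C(C)C)CC1",
    "CCN1CCC([N@H+]2CC[C@@H]2C(C)C)CC1",
    "CCN1CCC([N@@H+]2CC[C@H]2C(C)C)CC1",
    "CCN1CCC([N@@H+]2CC[C@H]2C(C)C)CC1",
    "CCN1CCC([N@H+]2CC[C@H]2C(C)C)CC1",
    "CCN1CCC([N@H+]2CC[C@H]2C(C)C)CC1",
]

# Bracket atoms containing at least one '@' (a textual match, not a parse).
TAGGED_ATOM = re.compile(r"\[[^\]]*@[^\]]*\]")


def stereo_atoms(smi):
    """Return the chirality-tagged bracket atoms in order of appearance."""
    return tuple(TAGGED_ATOM.findall(smi))


tag_sets = [stereo_atoms(s) for s in smiles]
distinct = sorted(set(tag_sets))
for tags in distinct:
    print(tags)
```

Each string carries exactly two tagged atoms (the N+ and the adjacent ring carbon), so only four tag combinations are possible and every combination shows up twice.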


Could you please send the SDF that corina generates so I can try to
reproduce the problem (or at least try to understand what's going on) from
that?

Thanks,
-greg

On Wed, Aug 19, 2015 at 3:00 PM, Rob Smith  wrote:

> Dear RDKit community,
>
> I'm trying to use RDKit to read in Corina generated stereoisomers (from a
> Mol file), assign chiral tags and stereochemistry to the structure and
> output the canonical smiles string for each isomer of a given molecule (in
> Python), when I do this, half the canonical smiles strings are not unique.

Re: [Rdkit-discuss] Stereochemistry - Differences between RDKit & Indigo

2015-08-19 Thread Peter Shenkin
Maybe when you have a toolkit as blazingly fast as RDKit, it captures the
chirality of the N center before it has time to interconvert.

-P.

On Wed, Aug 19, 2015 at 10:17 PM, John M wrote:

> More odd is the carbon stereocentre with two methyls...
>
> Generally trivalent nitrogens are not considered chiral due to inversion
> of the lone-pair. The two usual exceptions are when they are a bridgehead
> or in a tight ring (three-membered, e.g. aziridine). This is the same in
> most toolkits; the InChI technical documentation provides useful examples.

Re: [Rdkit-discuss] Stereochemistry - Differences between RDKit & Indigo

2015-08-19 Thread John M
More odd is the carbon stereocentre with two methyls...

Generally trivalent nitrogens are not considered chiral due to inversion of
the lone-pair. The two usual exceptions are when they are a bridgehead or
in a tight ring (three-membered, e.g. aziridine). This is the same in most
toolkits; the InChI technical documentation provides useful examples.

InChI actually only sees one stereo centre since it strips the proton off:
InChI=1S/C13H26N2/c1-4-14-8-5-12(6-9-14)15-10-7-13(15)11(2)3/h11-13H,4-10H2,1-3H3/p+1/t13-/m1/s1
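
The single stereocentre can be read straight off the string: the `/p+1` layer records the extra proton separately, and the `/t` layer lists the tetrahedral centres. The snippet below is only a rough sketch using plain string splitting. `inchi_layers` is a made-up helper, and it assumes each layer prefix occurs once, which holds for this InChI but not for every InChI.

```python
inchi = ("InChI=1S/C13H26N2/c1-4-14-8-5-12(6-9-14)15-10-7-13(15)11(2)3"
         "/h11-13H,4-10H2,1-3H3/p+1/t13-/m1/s1")


def inchi_layers(inchi_string):
    """Map each layer's prefix letter to its content.

    Simplification: assumes no prefix letter is repeated, so later
    layers with the same letter would overwrite earlier ones.
    """
    parts = inchi_string.split("/")
    layers = {"version": parts[0], "formula": parts[1]}
    for part in parts[2:]:
        layers[part[0]] = part[1:]
    return layers


layers = inchi_layers(inchi)
# Entries in the /t layer are comma-separated "<atom number><parity>" items.
centres = layers["t"].split(",") if "t" in layers else []

print(layers["p"])  # +1
print(centres)      # ['13-']
```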

The protonated nitrogen may well be chiral in this case, but if it is not
treated as a stereocentre, then strictly you should also remove the other
stereocentre in the para position to the nitrogen.

For the record just tested and ChemAxon/CDK/OpenBabel do the same.

John

Regards,
John W May
john.wilkinson...@gmail.com

On 19 August 2015 at 09:00, Rob Smith  wrote:

> Dear RDKit community,
>
> I'm trying to use RDKit to read in Corina generated stereoisomers (from a
> Mol file), assign chiral tags and stereochemistry to the structure and
> output the canonical smiles string for each isomer of a given molecule (in
> Python), when I do this, half the canonical smiles strings are not unique.

[Rdkit-discuss] Stereochemistry - Differences between RDKit & Indigo

2015-08-19 Thread Rob Smith
Dear RDKit community,

I'm trying to use RDKit to read in Corina-generated stereoisomers (from a
Mol file), assign chiral tags and stereochemistry to the structure, and
output the canonical SMILES string for each isomer of a given molecule (in
Python). When I do this, half the canonical SMILES strings are not unique.

When I read the output from Corina into an Indigo instance, then use the
canonical SMILES from Indigo to create an RDKit molecule, the canonical
SMILES strings generated from the molecule objects are all unique.

I may be missing an option to enable RDKit to 'visualise' the chiral centre
adjacent to the protonated nitrogen, so if someone can spot where I've made
a mistake, I'd really appreciate it. I've included the output and Python
script below. If you require any further information, please let me know.

Many thanks,
Rob

Output:

RDKit Read in of Molecule
RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1
RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1
RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@@H]2[C@H](C)C)CC1
RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@@H]2[C@H](C)C)CC1
RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@H]2[C@H](C)C)CC1
RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@H]2[C@H](C)C)CC1
RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@H]2[C@H](C)C)CC1
RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@H]2[C@H](C)C)CC1

INDIGO Read in of Molecule
RDKit Output -  CC[N@]1CC[C@@H]([N@@H+]2CC[C@@H]2C(C)C)CC1
RDKit Output -  CC[N@]1CC[C@H]([N@@H+]2CC[C@@H]2C(C)C)CC1
RDKit Output -  CC[N@]1CC[C@@H]([N@H+]2CC[C@@H]2C(C)C)CC1
RDKit Output -  CC[N@]1CC[C@H]([N@H+]2CC[C@@H]2C(C)C)CC1
RDKit Output -  CC[N@]1CC[C@@H]([N@@H+]2CC[C@H]2C(C)C)CC1
RDKit Output -  CC[N@]1CC[C@H]([N@@H+]2CC[C@H]2C(C)C)CC1
RDKit Output -  CC[N@]1CC[C@@H]([N@H+]2CC[C@H]2C(C)C)CC1
RDKit Output -  CC[N@]1CC[C@H]([N@H+]2CC[C@H]2C(C)C)CC1
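
The duplication can be confirmed without either toolkit by counting the strings listed above. This is just a sketch with `collections.Counter`; the list and function names are made up for this illustration.

```python
from collections import Counter

# SMILES copied from the "RDKit Read in of Molecule" output.
rdkit_direct = [
    "CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1",
    "CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1",
    "CCN1CC[C@@H]([N@H+]2CC[C@@H]2[C@H](C)C)CC1",
    "CCN1CC[C@@H]([N@H+]2CC[C@@H]2[C@H](C)C)CC1",
    "CCN1CC[C@@H]([N@@H+]2CC[C@H]2[C@H](C)C)CC1",
    "CCN1CC[C@@H]([N@@H+]2CC[C@H]2[C@H](C)C)CC1",
    "CCN1CC[C@@H]([N@H+]2CC[C@H]2[C@H](C)C)CC1",
    "CCN1CC[C@@H]([N@H+]2CC[C@H]2[C@H](C)C)CC1",
]

# SMILES copied from the "INDIGO Read in of Molecule" output.
via_indigo = [
    "CC[N@]1CC[C@@H]([N@@H+]2CC[C@@H]2C(C)C)CC1",
    "CC[N@]1CC[C@H]([N@@H+]2CC[C@@H]2C(C)C)CC1",
    "CC[N@]1CC[C@@H]([N@H+]2CC[C@@H]2C(C)C)CC1",
    "CC[N@]1CC[C@H]([N@H+]2CC[C@@H]2C(C)C)CC1",
    "CC[N@]1CC[C@@H]([N@@H+]2CC[C@H]2C(C)C)CC1",
    "CC[N@]1CC[C@H]([N@@H+]2CC[C@H]2C(C)C)CC1",
    "CC[N@]1CC[C@@H]([N@H+]2CC[C@H]2C(C)C)CC1",
    "CC[N@]1CC[C@H]([N@H+]2CC[C@H]2C(C)C)CC1",
]


def summarize(label, smiles_list):
    """Print record and unique-string counts; return the Counter."""
    counts = Counter(smiles_list)
    print("%s: %d records, %d unique" % (label, len(smiles_list), len(counts)))
    return counts


direct_counts = summarize("RDKit direct", rdkit_direct)
indigo_counts = summarize("via Indigo", via_indigo)
```

In the first list every string occurs exactly twice (four unique strings); in the second list all eight strings are different.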

Python script:

from rdkit import Chem
import subprocess # Used to run Corina
from indigo import *

def runCorinaTest(inputMol):
    indigo = Indigo()

    molFile = Chem.MolToMolBlock(inputMol)

    corinaCommand = "echo \'" + molFile + "\' | "
    # Then Corina - generate stereoisomers...
    corinaCommand = corinaCommand + "/apps/corina/corina -t n -d canon,stergen,preserve,names,wh,flapn,msc=7,msi=128 -i t=sdf"
    # Gives the stereoisomer species as an SDF string
    # (universal_newlines=True so the result is text rather than bytes)
    corinaResult = subprocess.check_output(corinaCommand, shell=True,
                                           universal_newlines=True)

    allMoleculeObjects = []
    # Separate Corina output into individual molecules. SDF records end
    # with a "$$$$" line; splitting on a bare "\n" would split every line.
    allMolecules = corinaResult.split("$$$$\n")
    # Drop the empty piece after the final "$$$$"
    allMolecules = allMolecules[0:len(allMolecules)-1]

    print("RDKit Read in of Molecule")

    for eachMolecule in allMolecules:
        eachMolecule = eachMolecule + "\n"
        mol = Chem.MolFromMolBlock(eachMolecule, sanitize=True,
                                   removeHs=True, strictParsing=False)
        Chem.rdmolops.AssignAtomChiralTagsFromStructure(mol,
                                                        replaceExistingTags=True)
        Chem.rdmolops.AssignStereochemistry(mol)
        print("RDKit Output -  " + Chem.MolToSmiles(mol,
                                                    isomericSmiles=True))

    print("INDIGO Read in of Molecule")
    for eachMolecule in allMolecules:
        eachMolecule = eachMolecule + "\n"
        mol = indigo.loadMolecule(eachMolecule)
        # print("Indigo Output - " + mol.canonicalSmiles())
        # Use Indigo Canonical Smiles to create RDKit molecule
        mol = Chem.MolFromSmiles(mol.canonicalSmiles())
        if mol is not None:
            print("RDKit Output -  " + Chem.MolToSmiles(mol,
                                                        isomericSmiles=True))

    return 0

mol = Chem.MolFromSmiles("CC(C)C1[NH+](C2CCN(CC)CC2)CC1")
z = runCorinaTest(mol)
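
As an aside, the `echo '...' |` construction depends on the mol block containing no single quotes. One possible alternative, sketched below with the standard library only, feeds the text to the child process on stdin and then splits the SDF output on its `$$$$` record terminators. The helper names are made up, and a small Python child process that echoes its input stands in for Corina, so nothing here says anything about how Corina itself behaves.

```python
import subprocess
import sys


def run_with_stdin(command, text):
    """Run `command` (an argument list), send `text` on stdin, return stdout."""
    result = subprocess.run(command, input=text, stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout


def split_sdf_records(sdf_text):
    """Split SDF text into records, one per '$$$$' terminator line.

    Any trailing text that is not followed by '$$$$' is dropped.
    """
    records, current = [], []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current) + "\n")
            current = []
        else:
            current.append(line)
    return records


# Stand-in for the Corina call: a child process that copies stdin to stdout.
echo_command = [sys.executable, "-c",
                "import sys; sys.stdout.write(sys.stdin.read())"]
fake_sdf = "mol A\nM  END\n$$$$\nmol B\nM  END\n$$$$\n"

records = split_sdf_records(run_with_stdin(echo_command, fake_sdf))
print(len(records))
```

In the script above, the Corina argument list would take the place of `echo_command`, and each returned record could be passed to `Chem.MolFromMolBlock` as before.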
--
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss