Hi Greg,

I've attached the SDF that Corina generates. I'm not convinced it is a
problem, more an observation that I'm trying to understand.

Looking at the results again today - it seems that from the Corina output
Indigo is interpreting the conformer (including whether the ethyl
substituent on the piperidine nitrogen is equatorial or axial) - and
outputting a canonical smiles string that has the conformer "encoded" in it
(using the chiral flags). Whereas RDKit is reading in the Corina output,
"discounting" whether the nitrogen is axial or equatorial (which due to
inversion I can understand) and interpreting it as having only two chiral
centers (which is correct).

What is confusing me, is that when I supply RDKit with the canonical smiles
string from Indigo (which has the conformer "encoded" in it), and then ask
for the isomeric canonical smiles, it supplies the canonical smiles with
the conformer still "encoded" within it.

For example, I read in the following canonical smiles string into
RDKit: CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1 (which was generated by
reading in one of the mols in the SD File into RDKit and output the
isomeric canonical smiles), running the FindMolChiralCenters on this
molecule, correctly reports the number of chiral centres to be 2 (6S, 9R),
and then asking it to output the canonical smiles string (with
isomericSmiles=True) gives CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1 (1).

If I take the same mol file, read it into Indigo, and ask it to output the
canonical smiles string, I get: CC(C)[C@H]1CC[N@H+]1[C@@H]1CC[N@@](CC1)CC,
if I read this smiles string into RDKit and run FindMolCenters on it, I get
(3R, 6S) - which is fine, if I then out the canonical smiles (again with
isomericSmiles=True) I get CC[N@]1CC[C@@H]([N@@H+]2CC[C@@H]2C(C)C)CC1. I
expected this isomeric canonical smiles to be the same as (1), however
RDKit appears to conserve the conformer representation given to it from an
isomeric smiles string, but when reading a Mol file doesn't keep all
conformer information (axial or equatorial substituents on a nitrogen).

Thanks to all for your quick (and quick witted) responses

Rob


On Thu, Aug 20, 2015 at 3:46 AM, Greg Landrum <greg.land...@gmail.com>
wrote:

> Hi Rob,
>
> The results below are quite strange. As John has already pointed out:
> there really shouldn't be chirality present on either the N+ or the C that
> has two methyls attached.
>
> I tried to reproduce the problem by running corina myself using the same
> command-line options you provided (from SMILES instead of SDF, but I don't
> think that should make a difference), but I get sensible results;
>
> In [5]: s = Chem.SDMolSupplier('sample.sdf')
>
> In [6]: for m in s:
>     Chem.AssignAtomChiralTagsFromStructure(m)
>     Chem.AssignStereochemistry(m,cleanIt=True,force=True)
>    ...:     print Chem.MolToSmiles(m,True)
>    ...:
> CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1
> CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1
> CCN1CCC([N@H+]2CC[C@@H]2C(C)C)CC1
> CCN1CCC([N@H+]2CC[C@@H]2C(C)C)CC1
> CCN1CCC([N@@H+]2CC[C@H]2C(C)C)CC1
> CCN1CCC([N@@H+]2CC[C@H]2C(C)C)CC1
> CCN1CCC([N@H+]2CC[C@H]2C(C)C)CC1
> CCN1CCC([N@H+]2CC[C@H]2C(C)C)CC1
>
> In [7]: s = Chem.SDMolSupplier('sample.sdf')
>
> In [8]: for m in s:
>     Chem.AssignAtomChiralTagsFromStructure(m)
>     print Chem.MolToSmiles(m,True)
>    ...:
> CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1
> CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1
> CCN1CCC([N@H+]2CC[C@@H]2C(C)C)CC1
> CCN1CCC([N@H+]2CC[C@@H]2C(C)C)CC1
> CCN1CCC([N@@H+]2CC[C@H]2C(C)C)CC1
> CCN1CCC([N@@H+]2CC[C@H]2C(C)C)CC1
> CCN1CCC([N@H+]2CC[C@H]2C(C)C)CC1
> CCN1CCC([N@H+]2CC[C@H]2C(C)C)CC1
>
>
> Could you please send the SDF that corina generates so I can try to
> reproduce the problem (or at least try to understand what's gong on) from
> that?
>
> Thanks,
> -greg
>
> On Wed, Aug 19, 2015 at 3:00 PM, Rob Smith <robtsm...@gmail.com> wrote:
>
>> Dear RDKit community,
>>
>> I'm trying to use RDKit to read in Corina generated stereoisomers (from a
>> Mol file), assign chiral tags and stereochemistry to the structure and
>> output the canonical smiles string for each isomer of a given molecule (in
>> Python), when I do this, half the canonical smiles strings are not unique.
>>
>> When I read in the output from Corina into an Indigo instance, then use
>> the canonical smiles from Indigo to create an RDKit molecule, canonical
>> smiles strings generated from the molecule objects are all unique.
>>
>> I may be missing an option to enable RDKit to 'visualise' the chiral
>> centre adjacent to the protonated nitrogen, so if someone can spot where
>> I've made a mistake, I'd really appreciate it. I've included the output and
>> Python script below. If you require any further information, please let me
>> know.
>>
>> Many thanks,
>> Rob
>>
>> Output:
>>
>> RDKit Read in of Molecule
>> RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1
>> RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1
>> RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@@H]2[C@H](C)C)CC1
>> RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@@H]2[C@H](C)C)CC1
>> RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@H]2[C@H](C)C)CC1
>> RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@H]2[C@H](C)C)CC1
>> RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@H]2[C@H](C)C)CC1
>> RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@H]2[C@H](C)C)CC1
>>
>> INDIGO Read in of Molecule
>> RDKit Output -  CC[N@]1CC[C@@H]([N@@H+]2CC[C@@H]2C(C)C)CC1
>> RDKit Output -  CC[N@]1CC[C@H]([N@@H+]2CC[C@@H]2C(C)C)CC1
>> RDKit Output -  CC[N@]1CC[C@@H]([N@H+]2CC[C@@H]2C(C)C)CC1
>> RDKit Output -  CC[N@]1CC[C@H]([N@H+]2CC[C@@H]2C(C)C)CC1
>> RDKit Output -  CC[N@]1CC[C@@H]([N@@H+]2CC[C@H]2C(C)C)CC1
>> RDKit Output -  CC[N@]1CC[C@H]([N@@H+]2CC[C@H]2C(C)C)CC1
>> RDKit Output -  CC[N@]1CC[C@@H]([N@H+]2CC[C@H]2C(C)C)CC1
>> RDKit Output -  CC[N@]1CC[C@H]([N@H+]2CC[C@H]2C(C)C)CC1
>>
>> Python script :
>>
>> from rdkit import Chem
>> import subprocess # Used to run Corina
>> from indigo import *
>>
>> def runCorinaTest(inputMol):
>>     indigo = Indigo()
>>
>>     molFile = Chem.MolToMolBlock(inputMol)
>>
>>     corinaCommand = "echo \'" + molFile + "\' | "
>>     # Then Corina - generate stereoisomers...
>>     corinaCommand = corinaCommand + "/apps/corina/corina -t n -d
>> canon,stergen,preserve,names,wh,flapn,msc=7,msi=128 -i t=sdf"
>>     corinaResult = subprocess.check_output([corinaCommand], shell=True) #
>> Gives the stereoisomer species as an SDF string
>>
>>     allMoleculeObjects = []
>>     allMolecules = corinaResult.split("$$$$\n") # Separate Corina output
>> into individual molecules
>>     allMolecules = allMolecules[0:len(allMolecules)-1]
>>
>>     print("RDKit Read in of Molecule")
>>
>>     for eachMolecule in allMolecules:
>>         eachMolecule = eachMolecule + "$$$$\n"
>>         mol = Chem.MolFromMolBlock(eachMolecule, sanitize=True,
>> removeHs=True, strictParsing=False)
>>         Chem.rdmolops.AssignAtomChiralTagsFromStructure(mol,
>> replaceExistingTags=True)
>>         Chem.rdmolops.AssignStereochemistry(mol)
>>         print("RDKit Output -  " + Chem.MolToSmiles(mol,
>> isomericSmiles=True))
>>
>>     print("INDIGO Read in of Molecule")
>>     for eachMolecule in allMolecules:
>>         eachMolecule = eachMolecule + "$$$$\n"
>>         mol = indigo.loadMolecule(eachMolecule)
>>         # print("Indigo Output - " + mol.canonicalSmiles())
>>         # Use Indigo Canonical Smiles to create RDKit molecule
>>         mol = Chem.MolFromSmiles(mol.canonicalSmiles())
>>         if mol is not None:
>>             print("RDKit Output -  " + Chem.MolToSmiles(mol,
>> isomericSmiles=True))
>>
>>     return 0
>>
>> mol = Chem.MolFromSmiles("CC(C)C1[NH+](C2CCN(CC)CC2)CC1")
>> z = runCorinaTest(mol)
>>
>>
>> ------------------------------------------------------------------------------
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>

Attachment: Corina Output.sdf
Description: Binary data

------------------------------------------------------------------------------
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to