Re: [Rdkit-discuss] Question on chirality
Thank you very much Jan, this in fact fixed the issue. I just had to add "AllChem.EmbedMolecule(mol)".

Best,
Navid

On Fri, Sep 13, 2019 at 5:49 AM Jan Holst Jensen wrote:
> You can generate a conformation that produces chiral information by 3D
> embedding the molecule.

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
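For readers applying the same fix, a minimal sketch of the loop body with the embedding step added might look like the following. The helper name `chiral_centers_after_embedding`, the explicit hydrogens, and the fixed `randomSeed` are assumptions for illustration and do not come from the thread. Because a non-isomeric SMILES does not specify the configuration, the R/S label obtained this way reflects whichever enantiomer the embedder happened to generate.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def chiral_centers_after_embedding(smiles, seed=42):
    """Return [(atom_idx, 'R' or 'S'), ...] after 3D embedding, or None on failure.

    Hypothetical helper: the name, the AddHs step and the seed are assumptions.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None  # SMILES could not be parsed

    # Explicit hydrogens generally give more reliable embeddings; they are
    # appended after the heavy atoms, so heavy-atom indices are unchanged.
    mol = Chem.AddHs(mol)

    # EmbedMolecule returns the new conformer id (0 here), or -1 on failure.
    if AllChem.EmbedMolecule(mol, randomSeed=seed) < 0:
        return None

    # Derive chiral tags from the 3D coordinates, then read off R/S labels.
    Chem.AssignAtomChiralTagsFromStructure(mol)
    return Chem.FindMolChiralCenters(mol)


centers = chiral_centers_after_embedding("CC(N)C(=O)O")
print(centers)
```

For the alanine SMILES from this thread, the result should be a single entry for atom 1 labelled either R or S; molecules without a stereocenter give an empty list.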
Re: [Rdkit-discuss] Question on chirality
Hi Navid,

I am not familiar with the paper you mention, but I believe that the problem is caused by non-isomeric input SMILES.

Below is an alanine read in from a molfile, with coordinates. It has a chiral center with "S" configuration. When you output it as non-isomeric SMILES and read it back in, the chiral information is lost because the SMILES carries no stereo markers and the molecule no longer has a conformation:

>>> from rdkit import Chem
>>> mol = Chem.MolFromMolBlock("""
...   BIOCHEMF09131911262D
...
...   7  6  0  0  1  0  0  0  0  0999 V2000
...     0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
...     0.7145    0.4125    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
...     1.4290    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
...     1.4209   -0.8208    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
...     0.7084    1.2417    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
...    -1.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
...     2.4290    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
...   2  3  1  0  0  0  0
...   1  2  1  0  0  0  0
...   3  4  2  0  0  0  0
...   2  5  1  1  0  0  0
...   1  6  1  0  0  0  0
...   3  7  1  0  0  0  0
... M  END
... """)
>>> Chem.AssignAtomChiralTagsFromStructure(mol)
>>> Chem.FindMolChiralCenters(mol)
[(1, 'S')]
>>> Chem.MolToSmiles(mol)
'CC(N)C(=O)O'
>>> mol = Chem.MolFromSmiles("CC(N)C(=O)O")
>>> Chem.AssignAtomChiralTagsFromStructure(mol)
>>> Chem.FindMolChiralCenters(mol)
[]
>>>

You can generate a conformation that produces chiral information by 3D embedding the molecule.

>>> from rdkit.Chem import AllChem
>>> AllChem.EmbedMolecule(mol)
0
>>> Chem.AssignAtomChiralTagsFromStructure(mol)
>>> Chem.FindMolChiralCenters(mol)
[(1, 'S')]
>>>

Another way would be to get isomeric SMILES as input, if you can. Then the chiral information is right there.

>>> Chem.MolToSmiles(mol, isomericSmiles=True)
'C[C@H](N)C(=O)O'
>>> mol = Chem.MolFromSmiles("C[C@H](N)C(=O)O")
>>> Chem.FindMolChiralCenters(mol)
[(1, 'S')]
>>>

Cheers
-- Jan Holst Jensen

On 2019-09-12 04:44, Navid Shervani-Tabar wrote:
> The output shows no Chiral centers for this dataset. When I use
> `includeUnassigned=True`, code gives a list of tuples, but instead of
> "R/S", I get "?".
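The difference between the two kinds of SMILES can be checked without any coordinates. The sketch below is only an illustration: it uses calls that already appear in this thread (`Chem.MolFromSmiles`, `Chem.MolToSmiles` with `isomericSmiles`, `Chem.FindMolChiralCenters` with `includeUnassigned`), and the variable names are assumptions.

```python
from rdkit import Chem

# Start from the isomeric SMILES of the alanine used in this thread.
mol = Chem.MolFromSmiles("C[C@H](N)C(=O)O")

# Write it out with and without stereo markers.
iso_smiles = Chem.MolToSmiles(mol, isomericSmiles=True)
flat_smiles = Chem.MolToSmiles(mol, isomericSmiles=False)

# Read both back in and look for chiral centers.
centers_iso = Chem.FindMolChiralCenters(Chem.MolFromSmiles(iso_smiles))
centers_flat = Chem.FindMolChiralCenters(Chem.MolFromSmiles(flat_smiles))

# With includeUnassigned=True the center is reported, but only as a
# possible stereocenter, because its configuration is not specified.
centers_flat_unassigned = Chem.FindMolChiralCenters(
    Chem.MolFromSmiles(flat_smiles), includeUnassigned=True
)

print(iso_smiles, centers_iso)
print(flat_smiles, centers_flat, centers_flat_unassigned)
```

The isomeric round trip keeps the "S" label, the non-isomeric one returns an empty list, and `includeUnassigned=True` reports the same atom with "?", which matches what was observed on the QM9 SMILES.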
[Rdkit-discuss] Question on chirality
Hello,

In the paper "Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals", the authors introduce chirality as an atom feature input to analyze the QM9 dataset. I was trying to recreate this atom feature, which is described as follows:

> Chirality: (categorical) R, S, or not a Chiral center (one-hot encoded).

The code I used is:

    from chainer_chemistry import datasets
    from chainer_chemistry.dataset.preprocessors.ggnn_preprocessor import GGNNPreprocessor
    from rdkit import Chem
    import numpy as np

    # Load QM9 together with the SMILES string of each molecule.
    dataset, dataset_smiles = datasets.get_qm9(GGNNPreprocessor(), return_smiles=True)

    for smiles in dataset_smiles:
        mol = Chem.MolFromSmiles(smiles)
        Chem.AssignAtomChiralTagsFromStructure(mol)
        chiral_cc = Chem.FindMolChiralCenters(mol)

        # Print only molecules for which chiral centers were found.
        if chiral_cc:
            print(chiral_cc)

The output shows no chiral centers for this dataset. When I use `includeUnassigned=True`, the code gives a list of tuples, but instead of "R/S", I get "?". I was wondering if there is a mistake in my implementation. If this is expected, any thoughts on how chirality was assigned in the above paper? Thanks.

Sincerely,
Navid
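The one-hot feature quoted above could be built from the `(atom_index, label)` tuples that `Chem.FindMolChiralCenters` returns. The sketch below is plain Python and only a guess at how such a feature might be laid out: the function name `one_hot_chirality`, the category order (R, S, not a chiral center), and the choice to treat "?" as "not a chiral center" are all assumptions, not taken from the paper.

```python
# Assumed category order: R, S, not a chiral center.
CATEGORIES = ("R", "S", "none")


def one_hot_chirality(num_atoms, chiral_centers):
    """Build one one-hot row per atom from (atom_index, label) tuples.

    Hypothetical helper: labels other than 'R' or 'S' (for example '?')
    are treated as "not a chiral center".
    """
    index_of = {label: i for i, label in enumerate(CATEGORIES)}

    # Every atom starts out in the "not a chiral center" category.
    features = [[0, 0, 1] for _ in range(num_atoms)]

    for atom_idx, label in chiral_centers:
        if label in ("R", "S"):
            row = [0] * len(CATEGORIES)
            row[index_of[label]] = 1
            features[atom_idx] = row
    return features


# Alanine has six heavy atoms; atom 1 is the "S" center from this thread.
features = one_hot_chirality(6, [(1, "S")])
for row in features:
    print(row)
```

Keeping "?" in the third category means unassigned centers are indistinguishable from achiral atoms; a fourth category would be needed to tell them apart.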