[Rdkit-discuss] identify and set formal charge of carboxylic acid
Hi all,

I have hit a wall on the smartest way to do this: I have a bunch of molecules and I need to set their charge state. One of the molecules:

    # Has a carboxylic acid
    m1 = Chem.MolFromSmiles('CC(=O)N[C@@H]1[C@H](C[C@@](O[C@H]1[C@H]([C@H](CO)O)O)(C(=O)O)O)O')

So my question is: what is the best way? Should I iterate through the molecule and identify the carbon that has =O and -O attached, or is there already functionality for this in RDKit? Is there a function that deprotonates all carboxylic acid groups?

Any advice on how to proceed is very much appreciated.

Thanks.
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
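One possible approach to the question above is a substructure search with a SMARTS pattern instead of a hand-written loop over atoms. The sketch below is only an illustration: the helper name deprotonate_carboxylic_acids and the SMARTS string are assumptions, not an existing RDKit function, and it assumes hydrogens are still implicit (Chem.AddHs has not been called yet).

```python
from rdkit import Chem

# SMARTS for a neutral carboxylic acid: C(=O) bonded to a hydroxyl oxygen.
# Query atom order: 0 = carbon, 1 = carbonyl O, 2 = hydroxyl O.
CARBOXYLIC_ACID = Chem.MolFromSmarts('[CX3](=O)[OX2H1]')


def deprotonate_carboxylic_acids(mol):
    """Return a copy of `mol` with each carboxylic acid -OH turned into -O(-).

    Hypothetical helper; assumes implicit hydrogens (no Chem.AddHs yet).
    """
    rw = Chem.RWMol(mol)
    for _c, _o_carbonyl, o_hydroxyl in rw.GetSubstructMatches(CARBOXYLIC_ACID):
        atom = rw.GetAtomWithIdx(o_hydroxyl)
        atom.SetFormalCharge(-1)
        atom.UpdatePropertyCache()  # refresh the implicit H count
    out = rw.GetMol()
    Chem.SanitizeMol(out)
    return out


m1 = Chem.MolFromSmiles(
    'CC(=O)N[C@@H]1[C@H](C[C@@](O[C@H]1[C@H]([C@H](CO)O)O)(C(=O)O)O)O')
m1_charged = deprotonate_carboxylic_acids(m1)
print(Chem.GetFormalCharge(m1_charged), Chem.MolToSmiles(m1_charged))
```

The pattern requires a hydroxyl oxygen, so amides and esters are left alone. Whether every carboxylic acid should be deprotonated is a chemistry decision that this sketch does not make.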
[Rdkit-discuss] change SMILES based on charge state of molecule
Hi all,

I am trying to change the SMILES of a bunch of small molecules that I have downloaded from PubChem. The problem is that the charge of the molecule in its physiological state differs from the one encoded in the PubChem string, e.g.

ATP, charge 0:

    SMILES = 'C1=NC(=C2C(=N1)N(C=N2)C3C(C(C(O3)COP(=O)(O)OP(=O)(O)OP(=O)(O)O)O)O)N'

ATP, charge -3:

    'Nc1ncnc2c1ncn2[C@@H]1O[C@H](COP(=O)([O-])OP(=O)([O-])OP(=O)([O-])O)[C@@H](O)[C@H]1O'

Normally I would convert these manually, but that is error-prone and does not scale to many compounds. Any suggestions or workarounds are highly appreciated.

Thanks!

Best
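A rule-based conversion along these lines could replace the manual editing. The following is a minimal sketch, not a pKa model: the helper name deprotonate_phosphates and the rule "remove one proton per phosphorus" are assumptions chosen only because they reproduce the charge -3 pattern of the ATP example.

```python
from rdkit import Chem

# SMARTS for an acidic phosphate hydroxyl: P(=O) bonded to an -OH oxygen.
# Query atom order: 0 = phosphorus, 1 = phosphoryl O, 2 = hydroxyl O.
PHOSPHATE_OH = Chem.MolFromSmarts('[PX4](=O)[OX2H1]')


def deprotonate_phosphates(smiles):
    """Return a SMILES in which each phosphorus has lost one acidic proton.

    Hypothetical helper. The one-proton-per-phosphorus rule is an assumption
    made to mirror the ATP example; it is not a pKa prediction.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f'could not parse SMILES: {smiles!r}')
    seen_phosphorus = set()
    for p_idx, _o_dbl, oh_idx in mol.GetSubstructMatches(PHOSPHATE_OH):
        if p_idx in seen_phosphorus:
            continue  # this phosphorus already lost a proton
        seen_phosphorus.add(p_idx)
        atom = mol.GetAtomWithIdx(oh_idx)
        atom.SetFormalCharge(-1)
        atom.UpdatePropertyCache()  # refresh the implicit H count
    Chem.SanitizeMol(mol)
    return Chem.MolToSmiles(mol)


atp_neutral = 'C1=NC(=C2C(=N1)N(C=N2)C3C(C(C(O3)COP(=O)(O)OP(=O)(O)OP(=O)(O)O)O)O)N'
atp_charged = deprotonate_phosphates(atp_neutral)
print(atp_charged)
```

The output is RDKit's canonical SMILES, so the atom order will differ from the PubChem string. Stereochemistry is only present in the result if it was present in the input.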
Re: [Rdkit-discuss] regarding hydrogens from SMILES
Dear Sereina and Paolo,

Thank you for your replies; both seem to correct the problem. I would really like to do it during the embedding of the molecule.

Cheers!

On Tue, Oct 8, 2019 at 12:34 PM Sereina wrote:

> Hi Jorgen,
>
> Which version of RDKit are you using? The ETKDG conformer generator (which
> will keep sp2 centers flat) has only recently become the default. If you
> are using an older RDKit version, the following code should give you a flat
> aromatic system for the SMILES you provided in your example.
>
>     m = Chem.MolFromSmiles('C1=NC(=C2C(=N1)N(C=N2)C3C(C(C(O3)COP(=O)(O)OP(=O)(O)OP(=O)(O)O)O)O)N')
>     mH = AllChem.AddHs(m)
>     AllChem.EmbedMolecule(mH, params=AllChem.ETKDGv2())
>
> Best regards,
> Sereina
>
> On 8 Oct 2019, at 18:18, Paolo Tosco wrote:
>
> Hi Jorgen,
>
> use the MMFF94s variant of the forcefield if you wish to force trigonal
> nitrogens to be planar:
>
>     AllChem.MMFFOptimizeMolecule(m2, mmffVariant="MMFF94s")
>
> More information here:
>
> https://doi.org/10.1002/(SICI)1096-987X(199905)20:7%3C720::AID-JCC7%3E3.0.CO;2-X
>
> Cheers,
> p.
>
> On 10/08/19 15:27, Jorgen Simonsen wrote:
>
> Cheers Paolo,
>
> It looks like it keeps sp3 as the optimal geometry and not sp2.
> The optimization did converge:
>
>     AllChem.MMFFOptimizeMolecule(m2,)
>     # returned 1
>
> I think it is getting the types wrong, or do I have to specify the types?
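The two suggestions in this thread (ETKDG embedding and the MMFF94s variant) can be combined in one step. This is a minimal sketch under stated assumptions: build_3d and its seed argument are hypothetical names, and the fixed random seed is there only to make the coordinates reproducible.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def build_3d(smiles, seed=42):
    """Embed `smiles` with ETKDG, then relax the geometry with MMFF94s.

    Hypothetical helper; `seed` only makes the embedding reproducible.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f'could not parse SMILES: {smiles!r}')
    mol = Chem.AddHs(mol)
    params = AllChem.ETKDGv2()
    params.randomSeed = seed
    if AllChem.EmbedMolecule(mol, params) == -1:
        raise RuntimeError('embedding failed')
    # The MMFF94s variant favours planar delocalized trigonal nitrogens.
    AllChem.MMFFOptimizeMolecule(mol, mmffVariant='MMFF94s')
    return mol


# Adenine is used here as a small stand-in for the ATP base.
adenine_3d = build_3d('Nc1ncnc2[nH]cnc12')
print(adenine_3d.GetNumAtoms(), adenine_3d.GetNumConformers())
```

Whether the embedding alone already gives a flat amine depends on the RDKit version, so the optimization step is kept as a second safeguard rather than a requirement.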
Re: [Rdkit-discuss] regarding hydrogens from SMILES
Cheers Paolo,

It looks like it keeps sp3 as the optimal geometry and not sp2. The optimization did converge:

    AllChem.MMFFOptimizeMolecule(m2,)
    # returned 1

I think it is getting the types wrong, or do I have to specify the types?

On Tue, Oct 8, 2019 at 10:10 AM Paolo Tosco wrote:

> Hi Jorgen,
>
> optimizing your molecule geometry with UFF or MMFF should fix the problem:
>
>     AllChem.UFFOptimizeMolecule(m2)
>
> or
>
>     AllChem.MMFFOptimizeMolecule(m2)
>
> see rdkit.Chem.rdForceFieldHelpers.UFFOptimizeMolecule
> <https://www.rdkit.org/docs/source/rdkit.Chem.rdForceFieldHelpers.html?highlight=optimize#rdkit.Chem.rdForceFieldHelpers.UFFOptimizeMolecule>
> or rdkit.Chem.rdForceFieldHelpers.MMFFOptimizeMolecule
> <https://www.rdkit.org/docs/source/rdkit.Chem.rdForceFieldHelpers.html?highlight=optimize#rdkit.Chem.rdForceFieldHelpers.MMFFOptimizeMolecule>.
>
> Cheers,
> p.
>
> On 10/08/19 14:41, Jorgen Simonsen wrote:
>
> Hi all,
>
> I am trying to build 3D structures from SMILES, which works fine for most
> of the molecules. I get the SMILES from PubChem ('canonical_smiles' and
> 'isomeric_smiles'), but for some of the molecules the hydrogens are not
> added correctly and end up out of plane, e.g. the amine group in ATP (see
> below for an example) or arginine in a peptide.
>
> I use the following code to generate the 3D structure:
>
>     from rdkit import Chem
>     from rdkit.Chem import AllChem
>     m1 = Chem.MolFromSmiles('C1=NC(=C2C(=N1)N(C=N2)C3C(C(C(O3)COP(=O)(O)OP(=O)(O)OP(=O)(O)O)O)O)N')
>
>     m2 = Chem.AddHs(m1)
>     AllChem.EmbedMolecule(m2)
>
>     w = Chem.SDWriter('foo.sdf')
>     w.write(m2)
>     w.close()  # flush the record to disk
>
>     # or to mol file
>     print(Chem.MolToMolBlock(m2), file=open('foo.mol', 'w+'))
>
> How can I ensure that the atom types are correct?
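A note on the return value quoted above: according to the RDKit documentation, MMFFOptimizeMolecule returns 0 when the optimization converged, 1 when more iterations are required, and -1 when the force field could not be set up. A result of 1 therefore suggests the minimization stopped early. The sketch below keeps optimizing until the status is 0; the function name and the max_rounds and iters_per_round arguments are hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def optimize_until_converged(mol, max_rounds=10, iters_per_round=200):
    """Run MMFF94s repeatedly; return True once RDKit reports convergence.

    Hypothetical helper; `mol` must already have a 3D conformer.
    """
    if not AllChem.MMFFHasAllMoleculeParams(mol):
        raise ValueError('MMFF has no parameters for this molecule')
    for _ in range(max_rounds):
        status = AllChem.MMFFOptimizeMolecule(
            mol, mmffVariant='MMFF94s', maxIters=iters_per_round)
        if status == 0:   # converged
            return True
        if status == -1:  # force field could not be set up
            raise RuntimeError('MMFF setup failed')
        # status == 1: more iterations needed, continue from the
        # current coordinates in the next round
    return False


ethanol = Chem.AddHs(Chem.MolFromSmiles('CCO'))
AllChem.EmbedMolecule(ethanol, randomSeed=42)
print(optimize_until_converged(ethanol))
```

Passing a larger maxIters in a single call would work as well; the loop is used here only to make the meaning of each status code explicit.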
[Rdkit-discuss] regarding hydrogens from SMILES
Hi all,

I am trying to build 3D structures from SMILES, which works fine for most of the molecules. I get the SMILES from PubChem ('canonical_smiles' and 'isomeric_smiles'), but for some of the molecules the hydrogens are not added correctly and end up out of plane, e.g. the amine group in ATP (see below for an example) or arginine in a peptide.

I use the following code to generate the 3D structure:

    from rdkit import Chem
    from rdkit.Chem import AllChem
    m1 = Chem.MolFromSmiles('C1=NC(=C2C(=N1)N(C=N2)C3C(C(C(O3)COP(=O)(O)OP(=O)(O)OP(=O)(O)O)O)O)N')

    m2 = Chem.AddHs(m1)
    AllChem.EmbedMolecule(m2)

    w = Chem.SDWriter('foo.sdf')
    w.write(m2)
    w.close()  # flush the record to disk

    # or to mol file
    print(Chem.MolToMolBlock(m2), file=open('foo.mol', 'w+'))

How can I ensure that the atom types are correct?

Thanks in advance

Best
Jorgen
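When checking hydrogen positions, it can help to confirm that the file on disk really contains the explicit hydrogens. The sketch below writes both formats and reads the SD file back; write_structure is a hypothetical helper, ethanol is only a small stand-in molecule, and the files go to a temporary directory rather than the working directory.

```python
import os
import tempfile

from rdkit import Chem
from rdkit.Chem import AllChem


def write_structure(mol, sdf_path, mol_path):
    """Write `mol` to an SD file and a MOL file (hypothetical helper)."""
    writer = Chem.SDWriter(sdf_path)
    try:
        writer.write(mol)
    finally:
        writer.close()  # make sure the record is flushed to disk
    Chem.MolToMolFile(mol, mol_path)


m2 = Chem.AddHs(Chem.MolFromSmiles('CCO'))  # small stand-in molecule
AllChem.EmbedMolecule(m2, randomSeed=42)

out_dir = tempfile.mkdtemp()
sdf_path = os.path.join(out_dir, 'foo.sdf')
mol_path = os.path.join(out_dir, 'foo.mol')
write_structure(m2, sdf_path, mol_path)

# removeHs=False keeps the explicit hydrogens when reading the file back.
back = Chem.SDMolSupplier(sdf_path, removeHs=False)[0]
print(back.GetNumAtoms() == m2.GetNumAtoms())
```

Reading with the default removeHs=True would strip the hydrogens again, which can make a correct file look as if the hydrogens were never written.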