While I figure out how to implement your test suggestion in the most realistic 
fashion, I just want to clarify that the test output in the previous email was 
from only one machine for testing purposes.

On one compute node with one thread, two reagents are combined in a synthesis 
function and then the tertNitrogenProt function is called and the current 
molecule is passed through to be searched for the tertiary nitrogen and 
protonated if found.

What I am not clear on is whether the properties are passed properly from the 
synthesis function to the tertNitrogenProt function.

Question: what are the outputs of UpdatePropertyCache()?  When I test for 
output, only the word _none_ is printed.  I don't know if that means there were 
no properties present or that they did not need to be updated.

Thank you for your suggestions.

Brian


________________________________
From: Greg Landrum [greg.land...@gmail.com]
Sent: Wednesday, August 31, 2016 11:01 AM
To: Bennion, Brian
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] protonating proper tertiary amines

hmm, perplexing.

How about we try something simple.
Instead of doing real molecules that may be proprietary, how about constructing 
a simple input that has 10 copies of CCN(CC)CC and running that.
Then you can safely send the output.
It would also help if you could also just run the function on one machine (not 
using the PP stuff) to see if you can reproduce the problem there.
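
A test input like that could be generated with a few lines of plain Python. 
This is only a sketch: the helper name write_test_smiles, the file name and 
the test_N molecule names are assumptions made for illustration, not part of 
the original script.

```python
import os
import tempfile


def write_test_smiles(path, smiles="CCN(CC)CC", copies=10):
    """Write `copies` lines of the form '<smiles> <name>' to `path`."""
    with open(path, "w") as handle:
        for i in range(copies):
            # One molecule per line: SMILES, a space, then a made-up name.
            handle.write("%s test_%d\n" % (smiles, i + 1))
    return copies


# Write into a scratch directory so nothing in the working tree is touched.
path = os.path.join(tempfile.mkdtemp(), "tert_amine_test.smi")
write_test_smiles(path)
```

The resulting file contains nothing proprietary, so both the input and the 
log it produces can be posted to the list.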

-greg



On Wed, Aug 31, 2016 at 6:24 AM, Bennion, Brian <benni...@llnl.gov> wrote:
Hello Greg,

The source that I am using is shown below.  Also, I need to clarify that all 
this code is wrapped around the ParallelPython job control code.  It allows me 
to send each reaction to a separate CPU on my large clusters.

I have been able to use the steps in your email to check my rdkit install from 
the python interpreter.
Next I manually input my compound as a smiles string and performed your set of 
commands, and things work as expected.
However, when wrapped within the PP code, UpdatePropertyCache() has no effect. 
My only thought is that I have not properly passed the molecule between python 
modules (not sure if that makes any sense).

This is the log output for one cycle of the code.  The smiles string has been 
clipped to not reveal proprietary data.  The important thing here is that the 
formal charge is correctly assigned but that the implicit hydrogen atoms are 
not updated.

LOG
Tertiary nitrogen found in oxime:  ((5, 6, 7, 8),)
This is the symbol and charge for the tertiary nitrogen before:  N 0 
C(=O)N([H])C([H])([H])C([H])([H])C1(C([H])([H])N(C([H])([H])[H])C([H])([H])
This is the symbol and charge for the tertiary nitrogen after:  N 1
test3-10:  SANITIZE_NONE C14H27N3O3 C14H27N3O3+ C14H27N3O3+ 3


 def tertNitrogenProt(molecule,molName1,w_sdf,w_smi):
      patt=rdkit.Chem.MolFromSmarts('[#6]-[#7]([#6])-[#6]')
      matches=molecule.GetSubstructMatches(patt)
      tertNHnum=0
      if matches:
        print "Tertiary nitrogen found in: ", matches
        for i in matches:
         moleculeStrings=rdkit.Chem.MolToSmiles(molecule,isomericSmiles=True)
         atomSymbol9=molecule.GetAtomWithIdx(i[1]).GetSymbol()
         formalCharge9=molecule.GetAtomWithIdx(i[1]).GetFormalCharge()
         print "This is the symbol and charge for the tertiary nitrogen before: ",atomSymbol9,formalCharge9,moleculeStrings
#set the formal charge on the tertiary nitrogen to +1 (protonated)
         test7=rdkit.Chem.AllChem.CalcMolFormula(molecule)
         molecule.GetAtomWithIdx(i[1]).SetFormalCharge(1)
         atomSymbol9=molecule.GetAtomWithIdx(i[1]).GetSymbol()
         formalCharge9=molecule.GetAtomWithIdx(i[1]).GetFormalCharge()
         test8=rdkit.Chem.AllChem.CalcMolFormula(molecule)
         print "This is the symbol and charge for the tertiary nitrogen after: ",atomSymbol9,formalCharge9
#update property cache and check for nonsense
         molecule.UpdatePropertyCache()
         moleculeH=rdkit.Chem.AddHs(molecule)
         test3=rdkit.Chem.SanitizeMol(moleculeH)
         test9=rdkit.Chem.AllChem.CalcMolFormula(moleculeH)
         test10=moleculeH.GetAtomWithIdx(i[1]).GetDegree()
         print "test3-10: ",test3,test7,test8,test9,test10
#start generating 3D coordinates and optimize the conformation
         rdkit.Chem.AllChem.EmbedMolecule(moleculeH)
         rdkit.Chem.AllChem.UFFOptimizeMolecule(moleculeH,1500)
         molName6=molName1+'NH+_'+str(tertNHnum)+'_XOH'
#find molecular formal charge
         moleculeCharge=rdkit.Chem.GetFormalCharge(moleculeH)
         moleculeH.SetProp('i_user_TOTAL_CHARGE',repr(moleculeCharge))
         moleculeH.SetProp('_Name',molName6)
         w_sdf.write(moleculeH)
         w_smi.write(moleculeH)
         molName3=molName1+'NH+_'+str(tertNHnum)+'_XO'
         totalMolecules=oximeSubStructSearch(moleculeH,molName3,w_sdf,w_smi)
         tertNHnum += 1
      else:
        print "No tertiary nitrogen matches"
        return(molecule,tertNHnum)
      return (moleculeH,tertNHnum)
######################################################################################################


________________________________
From: Greg Landrum [greg.land...@gmail.com]
Sent: Monday, August 29, 2016 10:41 PM
To: Bennion, Brian
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] protonating proper tertiary amines

Hi Brian,

On Tue, Aug 30, 2016 at 6:41 AM, Bennion, Brian <benni...@llnl.gov> wrote:

I seem to have hit a wall with what seems like a simple task.

First, I have ~9800 compounds that have a primary amine for a reaction that I 
am completing in rdkit.
About 250 of those compounds have a tertiary alkylamine that is most likely 
protonated at pH 7.4.

The dataset is a set of smiles strings for which the tertiary amine is not 
protonated.   I thought this would be easy enough to fix, just use a smarts 
substructure search, set the formal charge on any hits to one and then AddHs, 
sanitize, embed, and then minimize.

Well, what I get is [N+] with all the other carbons with explicit atoms in the 
resulting smiles files, and if output to sdf I get a positively charged  
diradical positioned at the tertiary nitrogen.

Yes, what's happening here is that AddHs() is using the implicit valence on the 
N atoms to determine how many Hs to add. Since the implicit valence is not 
recomputed when you set the formal charge, you end up with the wrong number of 
Hs attached to the N. A call to UpdatePropertyCache() will fix this:

In [16]: m = Chem.MolFromSmiles('CN')

In [17]: AllChem.CalcMolFormula(m)
Out[17]: 'CH5N'

In [18]: m.GetAtomWithIdx(1).SetFormalCharge(1)

In [19]: AllChem.CalcMolFormula(m)
Out[19]: 'CH5N+'

In [20]: m.UpdatePropertyCache()

In [21]: AllChem.CalcMolFormula(m)
Out[21]: 'CH6N+'

In [22]: mh = Chem.AddHs(m)

In [24]: mh.GetAtomWithIdx(1).GetDegree()
Out[24]: 4
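
The same sequence can be written as a small standalone script, here applied 
to triethylamine (CCN(CC)CC) rather than methylamine. This is a sketch that 
assumes RDKit is importable; the variable names are chosen for illustration. 
Note that UpdatePropertyCache() modifies the molecule in place and returns 
None, so printing its return value says nothing about whether anything 
changed. Inspecting the hydrogen count on the atom afterwards is more 
informative.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('CCN(CC)CC')

# Locate the single nitrogen atom and give it a +1 formal charge.
n_atom = next(a for a in mol.GetAtoms() if a.GetSymbol() == 'N')
n_atom.SetFormalCharge(1)

# Recompute valences in place; the call itself returns None.
result = mol.UpdatePropertyCache()

# After the update the charged nitrogen carries one hydrogen.
hs_on_n = n_atom.GetTotalNumHs()

# AddHs() turns that hydrogen into a real atom, so N has four neighbours.
mol_h = Chem.AddHs(mol)
degree = mol_h.GetAtomWithIdx(n_atom.GetIdx()).GetDegree()

print("return=%r  Hs on N=%d  degree after AddHs=%d" % (result, hs_on_n, degree))
```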

Thank you for such a great tool

You're welcome! Thanks for saying thanks. :-)

Hope this helps,
-greg


------------------------------------------------------------------------------
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
