I've been struggling to implement the SMARTS-based pKa prediction algorithm
outlined by Crippen here: http://pubs.acs.org/doi/abs/10.1021/ci8001815

This same method has been mentioned elsewhere on this forum:
https://sourceforge.net/p/rdkit/mailman/message/27318424/ ;
http://rdkit-discuss.narkive.com/jOHraNs8/crippen-pka-model-in-rdkit

Am I right in thinking that this method has never been successfully
implemented?

If that's right, I'm wondering whether anyone else has had a hard time
reproducing the values listed in that paper's SI using the provided decision
tree. For example, consider the compound O(C)c1cc(ccc1OC)C(=O)C(O)=O.

Running through the decision tree:

Node 2: Does contain [#G6H]C(=O)
Node 4: Does contain [OH][i](=O)*(~*)~*
Node 8: Does contain a[#X]
Node 16: Does contain *~*~*~*~*~*~*~*~*
Node 32: Does contain [i][#G6v2]
Node 64: Does contain [O][i]~[i]~[i]~[i]~[i]~[i]~[i]~[A]
Node 129: Does not contain [OH][i]~[i]~[i]~[i]-*
Node 258: Does contain [OH][i](=O)[i]~[i]~[i]~[i]-*
          Terminal node. 3.1849999 (2.79)

And yet the paper lists the decision-tree output for that molecule as 1.8.
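
One sanity check on the walk itself, as a minimal sketch rather than anything
taken from the paper: it assumes the nodes are numbered so that a match at
node n leads to node 2n and a non-match leads to node 2n + 1. That convention
is only a guess based on the first five steps of the trace (2, 4, 8, 16, 32,
64); the names TRACE, expected_child and inconsistent_steps are made up for
illustration.

```python
# Trace from the post: (node number, did the pattern match?)
TRACE = [
    (2, True),
    (4, True),
    (8, True),
    (16, True),
    (32, True),
    (64, True),
    (129, False),
    (258, True),
]


def expected_child(node, matched):
    """Child number under the ASSUMED convention: match -> 2n, no match -> 2n + 1."""
    return 2 * node if matched else 2 * node + 1


def inconsistent_steps(trace):
    """Return (node, matched, next_node, expected) for steps that break the convention."""
    problems = []
    # Pair each step with the one that follows it in the trace.
    for (node, matched), (next_node, _) in zip(trace, trace[1:]):
        expected = expected_child(node, matched)
        if next_node != expected:
            problems.append((node, matched, next_node, expected))
    return problems


for node, matched, next_node, expected in inconsistent_steps(TRACE):
    answer = "match" if matched else "no match"
    print(f"Node {node} ({answer}): went to {next_node}, convention gives {expected}")
```

If the assumed numbering holds, the steps out of nodes 64 and 129 are the ones
to recheck first. If the paper numbers its nodes differently, this check says
nothing about the trace.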

Am I missing something obvious? I'd appreciate any help the community could
offer. Having a basic pKa predictor in RDKit would be so useful.

Thanks!


