I've been struggling to implement the SMARTS-based pKa prediction algorithm
outlined by Crippen here: http://pubs.acs.org/doi/abs/10.1021/ci8001815

This same method has been mentioned elsewhere on this forum:
https://sourceforge.net/p/rdkit/mailman/message/27318424/

Am I right in thinking that this method has never been successfully
implemented in RDKit?

Assuming not, I'm wondering if anyone else has had a hard time reproducing
the values listed in that paper's SI using the provided decision tree. For
example, consider the compound O(C)c1cc(ccc1OC)C(=O)C(O)=O.

Running through the decision tree:

Node 2: Does contain [#G6H]C(=O)
Node 4: Does contain [OH][i](=O)*(~*)~*
Node 8: Does contain a[#X]
Node 16: Does contain *~*~*~*~*~*~*~*~*
Node 32: Does contain [i][#G6v2]
Node 64: Does contain [O][i]~[i]~[i]~[i]~[i]~[i]~[i]~[A]
Node 129: Does not contain [OH][i]~[i]~[i]~[i]-*
Node 258: Does contain [OH][i](=O)[i]~[i]~[i]~[i]-*
          Terminal node. 3.1849999 (2.79)
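
As a sanity check on my bookkeeping, here is a minimal sketch that recomputes
the node IDs from the match outcomes listed above. It assumes heap-style
numbering (from node n, a match leads to node 2n and a non-match to node
2n + 1). That convention is my inference from the IDs in the trace, not
something I can confirm from the paper, and the helper name path_node_ids is
made up for illustration. It only tests whether the IDs agree with the listed
outcomes, not whether each SMARTS match is itself correct.

```python
def path_node_ids(outcomes, root=1):
    """Return the node IDs visited for a sequence of match outcomes.

    Assumption (inferred from the trace, not taken from the paper):
    from node n, a match leads to node 2n and a non-match to node 2n + 1.
    """
    node = root
    visited = []
    for matched in outcomes:
        # Step to the child selected by this test's outcome.
        node = 2 * node if matched else 2 * node + 1
        visited.append(node)
    return visited


# Outcomes in the order listed in the trace: "does contain" is True,
# "does not contain" is False (the single non-match is the step to node 129).
trace_outcomes = [True, True, True, True, True, True, False, True]

print(path_node_ids(trace_outcomes))
```

If the printed IDs match the ones in my trace, then under that assumption the
path is at least self-consistent, and the discrepancy would have to come from
an individual pattern match or from the terminal-node value.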

And yet the paper lists the decision-tree output for that molecule as 1.8.

Am I missing something obvious? I'd appreciate any help the community could
offer. Having a basic pKa predictor in RDKit would be very useful.


