Hi Gregory (long time since we met)!

> Why did you use your own 11 smarts for hydrogen bond acceptors instead of
> rdkit's CalcNumHBA or NumHAcceptors
> ($([O,S;H1;v2]-[!$(*=[O,N,P,S])]),$([O,S;H0;v2]),$([O,S;-]),$([N;v3;!$(N-*=!@[O,N,P,S])]),$([nH0,o,s;+0]))?
The reason I used a different implementation for calculating hydrogen bond acceptors (HBA) is indeed as you already described: the HBA values in the (limited) examples of the original publication differ considerably from those returned by RDKit's NumHAcceptors method. Interestingly, the correspondence for the hydrogen bond donors seemed to be OK, but again, the supporting materials do not provide enough examples to really check this.

> I compared your implementation with PipelinePilot and rdkit, and it
> correlates better with PP (r^2=0.978) than rdkit (0.916). Maybe Greg you can
> comment on this?
> (In PipelinePilot, HBA is described as "number of heteroatoms (Oxygen,
> Nitrogen, Sulfur, or Phosphorus) with one or more lone pairs, excluding atoms
> with positive formal charges, amide and pyrrole-type Nitrogens, and aromatic
> Oxygen and Sulfur atoms in heterocyclic rings")
>
> On your website, you mention that "discrepancies can be noted in the results
> from the logP calculations"; I agree that the end result won't be much
> different using MolLogP vs. ALogP.

Interestingly, both PP and RDKit's MolLogP use the same method. With RDKit this can be validated (I didn't do this, though), but with PP it can't...

> But to be consistent I collected the structures of the 771 drugs mentioned in
> the original publication, calculated their MolLogP using rdkit, and fitted
> the binned distribution using the described approach.
> 700 compounds out of 771 gave the same ALogP as listed in the original paper
> (i.e. same structure), however for the 71 remaining drugs some discrepancies
> were observed, maybe due to different structures (I collected them from
> PubChem) or a different version of Pipeline Pilot (I used version 8.5).
> I ended up with a bin size of 0.97 and the following parameters:
> a 0.486849448
> b 186.2293718
> c 2.066177165
> d 3.902720615
> e 1.027025453
> f 0.913012565
> Dmax 145.43148

Thanks!
I'll add these to the implementation.

- Hans

> Feel free to use them!
>
> Best,
>
> Grégori

_______________________________________________
Rdkit-discuss mailing list
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
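
For anyone who wants to try the MolLogP parameters quoted above, here is a minimal sketch of how they might be plugged into a desirability function. It assumes the asymmetric double sigmoidal (ADS) form from the original QED publication, which the thread itself does not spell out; the names `ADSParams`, `MOLLOGP_PARAMS`, `ads` and `desirability` are made up for illustration and are not part of RDKit.

```python
import math
from typing import NamedTuple


class ADSParams(NamedTuple):
    """Hypothetical container for one fitted ADS parameter set."""
    a: float
    b: float
    c: float
    d: float
    e: float
    f: float
    dmax: float


# Values quoted in the thread for MolLogP (fitted with a bin size of 0.97).
MOLLOGP_PARAMS = ADSParams(
    a=0.486849448,
    b=186.2293718,
    c=2.066177165,
    d=3.902720615,
    e=1.027025453,
    f=0.913012565,
    dmax=145.43148,
)


def ads(x: float, p: ADSParams) -> float:
    """Asymmetric double sigmoidal function (assumed functional form)."""
    exp1 = 1.0 + math.exp(-(x - p.c + p.d / 2.0) / p.e)
    exp2 = 1.0 + math.exp(-(x - p.c - p.d / 2.0) / p.f)
    return p.a + p.b / exp1 * (1.0 - 1.0 / exp2)


def desirability(x: float, p: ADSParams = MOLLOGP_PARAMS) -> float:
    """ADS value scaled by Dmax, so the result is roughly in the range 0..1."""
    return ads(x, p) / p.dmax


if __name__ == "__main__":
    for logp in (-2.0, 0.0, 2.0, 4.0, 6.0):
        print(f"MolLogP {logp:+.1f} -> desirability {desirability(logp):.3f}")
```

The curve should peak for MolLogP values near `c` and fall off on both sides. Note that `math.exp` will overflow for inputs very far from `c`, so real descriptor values are assumed.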

