Dear all,

I'm setting up a small library of 2D pharmacophore fingerprints. Although I 
understand the theory behind 2D pharmacophores, this is the first time I've 
worked with them and therefore I would appreciate your wisdom/guidance.

I'm using the example code from the rdkit HTML documentation:

from rdkit import Chem
from rdkit.Chem import ChemicalFeatures
from rdkit.Chem.Pharm2D import Generate
from rdkit.Chem.Pharm2D.SigFactory import SigFactory

# Feature definitions (copied from the RDKit repository, see the note below)
fdefNameStr: str = "MinimalFeatures.fdef"
featFactory = ChemicalFeatures.BuildFeatureFactory(fdefNameStr)

# 2- and 3-point pharmacophores with three topological distance bins
sigFactory = SigFactory(featFactory, minPointCount=2, maxPointCount=3)
sigFactory.SetBins([(0,2),(2,5),(5,8)])
sigFactory.Init()
sigFactory.GetSignature()

Note: I've taken MinimalFeatures.fdef from the GitHub repository at
rdkit/Docs/Book/data/MinimalFeatures.fdef. I'm not sure if this was the right
thing to do, as I don't have my own set of pharmacophore definitions.

Using the following code I'm able to generate 2D pharmacophores for some of my 
compounds:
drug.setPharm2DFP(Generate.Gen2DFingerprint(drug.getRDKitMol(), sigFactory))

However, some compounds cause an exception (please see the traceback below the
body of this email). I suspect it's happening due to my lack of understanding of
pharmacophores and possibly the use of "MinimalFeatures.fdef".

This is the first compound to throw an exception:

abacavir
C1CC1NC2=C3C(=NC(=N2)N)N(C=N3)C4CC(C=C4)CO

Any thoughts/ideas are appreciated.

Thanks
Anthony


=================EXCEPTION=======================
ValueError                                Traceback (most recent call last)
~\anaconda3\lib\site-packages\rdkit\Chem\Pharm2D\SigFactory.py in GetBitIdx(self, featIndices, dists, sortIndices)
    248                 print('\tbins:', repr(self._bins), type(self._bins))
--> 249             bin_ = self._findBinIdx(dists, self._bins, self._scaffolds[len(dists)])
    250         except ValueError:

~\anaconda3\lib\site-packages\rdkit\Chem\Pharm2D\SigFactory.py in _findBinIdx(self, dists, bins, scaffolds)
    167             whichBins[i] = where
--> 168         res = scaffolds.index(tuple(whichBins))
    169         if _verbose:

ValueError: (2, 0, 0) is not in list

During handling of the above exception, another exception occurred:

IndexError                                Traceback (most recent call last)
<ipython-input-4-ab2c2d6e9026> in <module>
     29         drug.setTransitionMetalState(containsTransitionMetal(drug.getCanonicalSmiles()))
     30         drug.setDrugClass(drugClassStr)
---> 31         drug.setPharm2DFP(Generate.Gen2DFingerprint(drug.getRDKitMol(), sigFactory))
     32         drugDictionary.addCaseDrug(drug, drugNameStr)
     33

~\anaconda3\lib\site-packages\rdkit\Chem\Pharm2D\Generate.py in Gen2DFingerprint(mol, sigFactory, perms, dMat, bitInfo)
    160     for match in matchesToMap:
    161       if sigFactory.shortestPathsOnly:
--> 162         idx = _ShortestPathsMatch(match, perm, sig, dMat, sigFactory)
    163         if idx is not None and bitInfo is not None:
    164           l = bitInfo.get(idx, [])

~\anaconda3\lib\site-packages\rdkit\Chem\Pharm2D\Generate.py in _ShortestPathsMatch(match, featureSet, sig, dMat, sigFactory)
     71     dist[i] = d
     72
---> 73   idx = sigFactory.GetBitIdx(featureSet, dist, sortIndices=False)
     74   if _verbose:
     75     print('\t', dist, minD, maxD, idx)

~\anaconda3\lib\site-packages\rdkit\Chem\Pharm2D\SigFactory.py in GetBitIdx(self, featIndices, dists, sortIndices)
    251             fams = self.GetFeatFamilies()
    252             fams = [fams[x] for x in featIndices]
--> 253             raise IndexError('distance bin not found: feats: %s; dists=%s; bins=%s; scaffolds: %s' %
    254                              (fams, dists, self._bins, self._scaffolds))
    255

IndexError: distance bin not found: feats: ['Acceptor', 'Acceptor', 'Aromatic']; dists=[5, 1, 1]; bins=[(0, 2), (2, 5), (5, 8)]; scaffolds: [0, [(0,), (1,), (2,)], 0, [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (0, 1, 2), (0, 2, 1), (0, 2, 2), (1, 0, 0), (1, 0, 1), (1, 0, 2), (1, 1, 0), (1, 1, 1), (1, 1, 2), (1, 2, 0), (1, 2, 1), (1, 2, 2), (2, 0, 1), (2, 0, 2), (2, 1, 0), (2, 1, 1), (2, 1, 2), (2, 2, 0), (2, 2, 1), (2, 2, 2)], 0]
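
To make the failing lookup concrete, here is a minimal pure-Python sketch that
redoes it by hand, using only the values printed in the IndexError message
above. It does not call RDKit. The helper names (find_bin,
bins_can_form_triangle) are hypothetical, and both the half-open [low, high)
bin convention and the triangle-style pruning rule are assumptions, chosen
because they reproduce the numbers in the traceback. They may not match
RDKit's actual implementation.

```python
from itertools import product

# Values copied from the IndexError message in the traceback.
bins = [(0, 2), (2, 5), (5, 8)]
dists = [5, 1, 1]


def find_bin(dist, bins):
    """Return the index of the bin holding dist (assumed [low, high) intervals)."""
    for idx, (low, high) in enumerate(bins):
        if low <= dist < high:
            return idx
    raise ValueError(f"distance {dist} falls outside all bins")


def bins_can_form_triangle(b1, b2, b3):
    """Assumed pruning rule: keep a triple of bins only if the upper bounds of
    any two bins can reach the lower bound of the third."""
    return (b1[1] + b2[1] >= b3[0]
            and b2[1] + b3[1] >= b1[0]
            and b3[1] + b1[1] >= b2[0])


# Map each distance to its bin index.
which_bins = tuple(find_bin(d, bins) for d in dists)

# Enumerate every triple of bin indices that passes the assumed rule.
allowed = [combo for combo in product(range(len(bins)), repeat=3)
           if bins_can_form_triangle(*(bins[i] for i in combo))]

print(which_bins)             # (2, 0, 0)
print(which_bins in allowed)  # False
print(len(allowed))           # 24, same count as the 3-point scaffolds above
```

Under these assumptions the distances [5, 1, 1] map to the bin triple
(2, 0, 0), which matches the ValueError, and the enumeration gives the same 24
triples that the traceback lists as 3-point scaffolds. The only triples missing
are (0, 0, 2), (0, 2, 0) and (2, 0, 0). The distances 5, 1, 1 do not satisfy
the triangle inequality (5 > 1 + 1), which may be related to the failure,
though I have not confirmed this against the RDKit source.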
Kind regards
Dr Anthony Nash PhD MRSC

Senior Research Scientist
Nuffield Department of Clinical Neurosciences
RMCR Kellogg College
University of Oxford
http://www.kellogg.ox.ac.uk/