Hi Antoine,

|rdkit.Chem.Pharm2D.Generate.||Gen2DFingerprint()| expects a single molecule

https://www.rdkit.org/docs/source/rdkit.Chem.Pharm2D.Generate.html?highlight=gen2dfingerprint#rdkit.Chem.Pharm2D.Generate.Gen2DFingerprint

while you are passing a list of molecules:

pharmacophorefps = Generate.Gen2DFingerprint(list(df['ROMol']), sigFactory)

HTH, cheers
p.

On 21/02/2020 16:22, Antoine Dumas wrote:

Hello,

I am trying to generate a set of pharmacophore fingerprints in python using RDKIT from a list of SMILES (20k molecules)

No matter what I do the script keeps throwing an error saying my code doesn’t match the C++ signature.

Here  is a copy of my code as it stands right now

from __future__ import print_function

import os

import csv

import numpy as np

import pandas as pd

from rdkit import RDConfig, Chem, DataStructs, rdBase

from rdkit.Chem import rdFingerprintGenerator, rdMolDescriptors, AllChem, rdFMCS, MACCSkeys, Draw, PandasTools, ChemicalFeatures, rdDepictor

from rdkit.Chem.Fingerprints import FingerprintMols

from rdkit.Chem.Draw import IPythonConsole, MolDraw2D

from rdkit.Chem.Pharm2D import Gobbi_Pharm2D, Generate

from rdkit.Chem.Pharm2D.SigFactory import SigFactory

from IPython.display import SVG

from tabulate import tabulate

os.chdir(r'C:\Users\adumas\Desktop\')

input1 = r'C:\Users\adumas\Desktop\FILE.csv'

output = r'C:\Users\adumas\Desktop\FILE.csv'

df = pd.read_csv(input1, delimiter = ',', header = 0, index_col = [''], names = ['','row ID','MOL_ID', 'SMILES',])

PandasTools.AddMoleculeColumnToFrame(df, smilesCol = 'SMILES')

SMILES = []

SMILES = df.iloc[0:,4]

ID = df.iloc[0:,1]

molecules = [Chem.MolFromSmiles(x) for x in SMILES]

fingerprints = [FingerprintMols.FingerprintMol(x) for x in molecules]

morganfps = rdFingerprintGenerator.GetFPs(list(df['ROMol']))

df['Morgan Fingerprint'] = morganfps

mcassfps = [MACCSkeys.GenMACCSKeys(x) for x in list(df['ROMol'])]

df['MCASS Fingerprint'] = mcassfps

fdefName = 'BaseFeatures.fdef'

featFactory = ChemicalFeatures.BuildFeatureFactory(fdefName)

sigFactory = SigFactory(featFactory, minPointCount=2, maxPointCount=9)

sigFactory.SetBins([(0,2),(2,5),(5,8)])

sigFactory.Init()

sigFactory.GetSigSize()

pharmacophorefps = Generate.Gen2DFingerprint(list(df['ROMol']), sigFactory) ************* Line throwing error constantly no matter whether I specify (SMILES, sigFactory) or (molecules, sigFactory) or (df.iloc[0:,4], sigFactory)

df['Pharmacophore Fingerprints'] = pharmacophorefps

And the error it throws me every time no matter how I try to define the list of smiles

*ArgumentError*: Python argument types in

rdkit.Chem.rdmolops.GetDistanceMatrix(Series, bool)

did not match C++ signature:

    GetDistanceMatrix(class RDKit::ROMol {lvalue} mol, bool useBO=False, bool useAtomWts=False, bool force=False, char const * __ptr64 prefix='')



_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to