Hello all:

I am trying to use the BRICS algorithm to fragment my compounds, filter the
fragments, and then group the original compounds by the selected fragments.

As a test I used the cdk2 data set provided with RDKit.

Here is some sample code, partly cannibalized from Greg's and others' examples:

This part creates and displays the fragments:
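import pandas as pd
from rdkit import Chem
from rdkit.Chem import BRICS, PandasTools

df = PandasTools.LoadSDF('cdk2.sdf')

# collect the unique BRICS fragments (SMILES strings) over all compounds
allfrags = set()
for i, row in df.iterrows():
    mol = row['ROMol']
    pieces = BRICS.BRICSDecompose(mol)
    allfrags.update(pieces)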

from rdkit.Chem import Descriptors
from rdkit.Chem import rdMolDescriptors

fragList = list(allfrags)
df1 = pd.Series(fragList)
df2 = df1.to_frame()
df2.columns = ['smiles']
PandasTools.AddMoleculeColumnToFrame(df2,smilesCol='smiles', molCol='ROMol')

df2['NumRings'] = df2['ROMol'].map(rdMolDescriptors.CalcNumRings)
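# assumed lambda bodies (truncated in the post): aromatic ring count and heavy atom count
df2['RingAroms'] = df2['ROMol'].apply(lambda x: rdMolDescriptors.CalcNumAromaticRings(x))
df2['HeavyAtoms'] = df2['ROMol'].apply(lambda x: x.GetNumHeavyAtoms())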

df3 = df2[df2['HeavyAtoms']>6]
df4 = df3[df3['RingAroms'] > 0]
df5 = df4[df4['NumRings'] > 1]

PandasTools.FrameToGridImage(df5, column='ROMol')
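
For the grouping step, it may help to record which compound each fragment came from while decomposing. The following is only a minimal sketch under stated assumptions: the two SMILES and the names `cmpd1`/`cmpd2` are made-up stand-ins for the cdk2 molecules, and `frag_to_parents` and `shared` are hypothetical names, not part of RDKit.

```python
from collections import defaultdict

from rdkit import Chem
from rdkit.Chem import BRICS

# Made-up example compounds standing in for the rows of the cdk2 data frame
compounds = {
    'cmpd1': 'CCCOCCC(=O)c1ccccc1',
    'cmpd2': 'CC(=O)c1ccccc1',
}

# Map each BRICS fragment SMILES to the set of compounds that produced it
frag_to_parents = defaultdict(set)
for name, smi in compounds.items():
    mol = Chem.MolFromSmiles(smi)
    # BRICSDecompose returns a set of fragment SMILES by default
    for frag in BRICS.BRICSDecompose(mol):
        frag_to_parents[frag].add(name)

# Fragments that occur in more than one compound define the groups
shared = {frag: parents for frag, parents in frag_to_parents.items()
          if len(parents) > 1}

for frag, parents in sorted(frag_to_parents.items()):
    print(frag, sorted(parents))
```

Keeping the mapping keyed on the fragment SMILES means the descriptor filters can be applied to the fragments first and the parents looked up afterwards.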

This part removes the dummy atoms from the SMILES and tries to regenerate the mol:
import re
resultsList = pd.DataFrame()

with open('my_csv.csv', 'a') as f:

    for smi in df5['ROMol']:
        smi = Chem.MolToSmiles(smi)
        smi = re.sub(r"(\(\[\*\]\))", "", smi)
        smi = re.sub(r"(\[\*\])", "", smi)

        pattern = Chem.MolFromSmiles(smi)

This throws an error at that point, saying:
RDKIT Error: Can't kekulize mol

Do you know what is going on?
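
A minimal sketch of one possible cause, assuming the regexes really do strip every dummy atom (including isotope-labelled ones such as [9*]) and that one of the fragments is attached through an aromatic nitrogen: deleting the dummy from the text leaves an aromatic N with neither a substituent nor an H, and such a SMILES cannot be kekulized. The helper `strip_dummies` is a hypothetical name, not an RDKit function; it turns the dummies into hydrogens on the molecule instead of editing the string.

```python
from rdkit import Chem


def strip_dummies(mol):
    """Return a copy of mol with every dummy atom replaced by a hydrogen.

    Hypothetical helper for illustration; not part of RDKit.
    """
    rw = Chem.RWMol(mol)
    for atom in rw.GetAtoms():
        if atom.GetAtomicNum() == 0:   # BRICS attachment point, e.g. [9*]
            atom.SetAtomicNum(1)       # turn it into a hydrogen
            atom.SetIsotope(0)         # clear the BRICS label so RemoveHs treats it as a plain H
    out = rw.GetMol()
    Chem.SanitizeMol(out)
    # RemoveHs folds the explicit H back onto its heavy atom (giving [nH] here)
    return Chem.RemoveHs(out)


# Text-level removal: '[9*]n1cccc1' becomes 'n1cccc1', which RDKit cannot parse
broken = Chem.MolFromSmiles('n1cccc1')

# Molecule-level replacement keeps the hydrogen on the nitrogen
frag = Chem.MolFromSmiles('[9*]n1cccc1')
fixed = strip_dummies(frag)
fixed_smiles = Chem.MolToSmiles(fixed)
print(broken, fixed_smiles)
```

Whether this is what happens with the cdk2 fragments would need checking against the specific SMILES that fails.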

Many thanks in advance,
Rdkit-discuss mailing list
