[Rdkit-discuss] Missing atom indices in the last structure

Norwid Behrnd via Rdkit-discuss Tue, 22 Sep 2020 03:40:47 -0700

Starting with lists consisting of a compound identifier, an explicit space,
and a SMILES string, I would like to generate illustrations of these
structures that include RDKit's atom indices.  What puzzles me is that the
last entry of the compound list is consistently converted into a structure
representation where none of the atoms is labeled with its index.  But why?
Is it possible that I erred in putting the outer structure (looping over the
molecules) together with the inner one (assigning the atom indices)?


Below is the code, initially written with Python 3.8.6 and RDKit 2019.09.1
as available in Debian unstable, and exported from the Jupyter Notebook
(IPython 7.18.1).  It runs _as such_ from the CLI of python (the hard line
limit is set to 80 chars/line):

---- 8>< ---- start script ----
from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw

smiles = []
input_file = str("list_1.csv")


def draw_multiple_mol(smiles_list, mols_per_row=4, file_path=None):
    mols = []
    for i in smiles_list:

        for mol in mols:
            for atom in mol.GetAtoms():
                atom.SetAtomMapNum(atom.GetIdx())

        mols.append(Chem.MolFromSmiles(i))
    mols_per_row = min(len(smiles_list), mols_per_row)

    img = Draw.MolsToGridImage(mols,
                               molsPerRow=4,
                               subImgSize=(300, 300),
                               useSVG=True)
    if file_path:
        with open(file_path, 'w') as f_handle:
            f_handle.write(img.data)
    return img


with open(input_file, mode="r") as source:
    for line in source:
        smiles_entry = str(line).split()[1]
        smiles_entry = smiles_entry.strip()
        smiles.append(smiles_entry)

draw_multiple_mol(smiles, file_path='output.svg')
---- 8>< ---- end script ----
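
As far as can be read from the script above, the inner loop runs over `mols`
before the current molecule is appended.  Each pass therefore labels only the
molecules already in the list, and the molecule appended on the final pass is
never visited.  The following is a minimal sketch of one way to reorder this,
not a definitive fix: each molecule is labeled right after it is parsed.  The
helper names `parse_smiles_lines` and `labeled_mols` are hypothetical and
exist only for this illustration; the RDKit calls are the ones already used
in the script, plus a check for `None`, which `Chem.MolFromSmiles` returns
for a SMILES string it cannot parse.

```python
def parse_smiles_lines(lines):
    """Return the SMILES column from 'identifier SMILES' lines."""
    smiles = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:  # skip blank or incomplete lines
            smiles.append(fields[1])
    return smiles


def labeled_mols(smiles_list):
    """Parse each SMILES and label its atoms before storing the molecule."""
    # Imported here (an assumption of this sketch) so that the parsing
    # helper above can be used and checked without RDKit installed.
    from rdkit import Chem

    mols = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:  # unparsable SMILES: skip it
            continue
        # Label the atoms of *this* molecule, then append it, so the
        # last entry of the list is treated like all the others.
        for atom in mol.GetAtoms():
            atom.SetAtomMapNum(atom.GetIdx())
        mols.append(mol)
    return mols
```

The returned list can be passed to `Draw.MolsToGridImage` as in the original
function.  Note that an atom map number of 0 means "no map number" in RDKit,
so the atom with index 0 is expected to stay unlabeled with this approach.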

The input file read -- in both cases yielding an image with the missing
annotation -- is either list_1.csv:

---- 8>< ---- start list_1.csv ----
compound_1 C1=CSC=C1
compound_2 C1C=C(SC=1)C
compound_3 C1C=C(OC=1)C=O
compound_4 C1=C(C(=CC2)OC=2)OC=C1
compound_5 C1(C(OC2)=CC=2)OC(=CC=1)C1OC(=CC=1)C
compound_6 C1(C)=C(C)C(=C(O1)C)C
compound_7 C1(Br)=C(C)C(=C(O1)Br)C
---- 8>< ---- end list_1.csv ----

or list_2.csv:

---- 8>< ---- start list_2.csv ----
compound_1 C1=CSC=C1
compound_2 C1=C(C(=CC2)SC=2)SC=C1
compound_3 C2=CC1=C(C=CS1)S2
compound_4 C3=CC1=C(C2=C(S1)C=CS2)S3
compound_5 C3=CC1=C(C2=C(S1)C=C(S2)CC)S3
---- 8>< ---- end list_2.csv ----

Norwid


