Re: [Rdkit-discuss] Missing atom indices in the last structure

Ivan Tubert-Brohman Tue, 22 Sep 2020 05:03:48 -0700

Hi Norwid,

The inner loop over mols here:


    for i in smiles_list:

        for mol in mols:
            for atom in mol.GetAtoms():
                atom.SetAtomMapNum(atom.GetIdx())

        mols.append(Chem.MolFromSmiles(i))

is not in the right place. First, you go over the same mols multiple times
unnecessarily. Second, because the loop runs before the new mol is appended
to mols, the mol corresponding to the last item in smiles_list is never
labelled, as you noted. Either move the loop over mols out of, and after,
the loop over smiles_list or, better yet, get rid of the inner loop
altogether:

    for i in smiles_list:
        mol = Chem.MolFromSmiles(i)
        for atom in mol.GetAtoms():
            atom.SetAtomMapNum(atom.GetIdx())
        mols.append(mol)
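
To make the ordering problem concrete, here is a minimal sketch that does
not need RDKit at all. StandInMol is a hypothetical stand-in for a real
Mol, and label_passes is an invented counter that just records how often
the labelling loop visits each object; neither name is part of RDKit.

```python
class StandInMol:
    """Hypothetical stand-in for an RDKit Mol; only counts labelling passes."""

    def __init__(self, name):
        self.name = name
        self.label_passes = 0


def original_loop(names):
    """Same loop order as the posted script: label first, append afterwards."""
    mols = []
    for name in names:
        # Only the mols appended in earlier iterations are visited here.
        for mol in mols:
            mol.label_passes += 1
        mols.append(StandInMol(name))
    return [mol.label_passes for mol in mols]


def fixed_loop(names):
    """Label each mol exactly once, right after creating it."""
    mols = []
    for name in names:
        mol = StandInMol(name)
        mol.label_passes += 1
        mols.append(mol)
    return [mol.label_passes for mol in mols]


names = ["compound_1", "compound_2", "compound_3"]
original_counts = original_loop(names)
fixed_counts = fixed_loop(names)
print(original_counts)  # [2, 1, 0]
print(fixed_counts)     # [1, 1, 1]
```

With the original ordering the first entries are visited repeatedly and the
last one not at all, which matches the unlabelled last structure in the
grid image. With the second ordering every entry is visited exactly once.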

Best regards,
Ivan

On Tue, Sep 22, 2020 at 6:39 AM Norwid Behrnd via Rdkit-discuss <
rdkit-discuss@lists.sourceforge.net> wrote:

> Starting with lists consisting of a compound identifier, an explicit space,
> and a SMILES string, I would like to generate illustrations about these
> structures including RDKit's atom indices.  What is puzzling to me is that
> consistently the last entry / compound list is converted into a structure
> representation where none of the atoms is labeled by the index.  But why?
> Is it possible that I erred in putting the outer structure (looping over
> the molecules) together with the inner one (attributing the atom indices)?
>
> The code was initially written in Python 3.8.6 backed by RDKit 2019.09.1
> as available in Debian unstable.  Below I post the code exported from the
> Jupyter Notebook (IPython 7.18.1), which runs _as such_ from the CLI of
> python (the hard line limit is set to 80 chars/line):
>
> ---- 8>< ---- start script ----
> from rdkit import Chem
> from rdkit.Chem.Draw import IPythonConsole
> from rdkit.Chem import Draw
>
> smiles = []
> input_file = str("list_1.csv")
>
>
> def draw_multiple_mol(smiles_list, mols_per_row=4, file_path=None):
>     mols = []
>     for i in smiles_list:
>
>         for mol in mols:
>             for atom in mol.GetAtoms():
>                 atom.SetAtomMapNum(atom.GetIdx())
>
>         mols.append(Chem.MolFromSmiles(i))
>     mols_per_row = min(len(smiles_list), mols_per_row)
>
>     img = Draw.MolsToGridImage(mols,
>                                molsPerRow=4,
>                                subImgSize=(300, 300),
>                                useSVG=True)
>     if file_path:
>         with open(file_path, 'w') as f_handle:
>             f_handle.write(img.data)
>     return img
>
>
> with open(input_file, mode="r") as source:
>     for line in source:
>         smiles_entry = str(line).split()[1]
>         smiles_entry = smiles_entry.strip()
>         smiles.append(smiles_entry)
>
> draw_multiple_mol(smiles, file_path='output.svg')
> ---- 8>< ---- end script ----
>
> The input files read, both yielding an image with the missing annotation,
> are either the file list_1.csv:
>
> ---- 8>< ---- start list_1.csv ----
> compound_1 C1=CSC=C1
> compound_2 C1C=C(SC=1)C
> compound_3 C1C=C(OC=1)C=O
> compound_4 C1=C(C(=CC2)OC=2)OC=C1
> compound_5 C1(C(OC2)=CC=2)OC(=CC=1)C1OC(=CC=1)C
> compound_6 C1(C)=C(C)C(=C(O1)C)C
> compound_7 C1(Br)=C(C)C(=C(O1)Br)C
> ---- 8>< ---- end list_1.csv ----
>
> or the file list_2.csv:
>
> ---- 8>< ---- start list_2.csv ----
> compound_1 C1=CSC=C1
> compound_2 C1=C(C(=CC2)SC=2)SC=C1
> compound_3 C2=CC1=C(C=CS1)S2
> compound_4 C3=CC1=C(C2=C(S1)C=CS2)S3
> compound_5 C3=CC1=C(C2=C(S1)C=C(S2)CC)S3
> ---- 8>< ---- end list_2.csv ----
>
> Norwid
>
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>

