On May 17, 2023, at 02:31, Vincent Scalfani <vfscalf...@ua.edu> wrote:
> I thought that this might also be the case for bond indices, but that does 
> not appear to be correct (see example below). Is it possible to get a bond 
> index in the order of the SMILES? 

This may help you understand why that's a difficult question.

What does the bond index mean in something like

 C12.OC23.C3.C1

? Does the bond for closure 1 come first in the bond list, because that's where
it starts, or is it last, because that's where it ends? It looks like you think
it should be the closure position.

Here's your SMILES labelled by atom index:

        ┌                            1 1 1  1  1 1  1   1    1 1
   atoms│ 0 1 2  3 4  5   6   7 8 9  0 1 2  3  4 5  6   7    8 9
        └ | | |  | |  |   |   | | |  | | |  |  | |  |   |    | |
  SMILES[ C-C-c1:c:c:[nH]:c:1-C-C-C1-C-C-C(-c2:c:c:[nH]:c:2)-C-C-1

I used the program at the end of this email to print the information in bond 
list order:

In bondlist order
 i Bnd# a1 ~ a2   frag
 0   0   0 -  1   C-C
 1   1   1 -  2   C-c
 2   2   2 :  3   c:c
 3   3   3 :  4   c:c
 4   4   4 :  5  c:[nH]
 5   5   5 :  6  [nH]:c
 6   6   6 -  7   c-C
 7   7   7 -  8   C-C
 8   8   8 -  9   C-C
 9   9   9 - 10   C-C
10  10  10 - 11   C-C
11  11  11 - 12   C-C
12  12  12 - 13   C-c
13  13  13 : 14   c:c
14  14  14 : 15   c:c
15  15  15 : 16  c:[nH]
16  16  16 : 17  [nH]:c
17  17  12 - 18   C-C
18  18  18 - 19   C-C
19  19   6 :  2   c:c
20  20  19 -  9   C-C
21  21  17 : 13   c:c


If you step through them you'll see that the closure bonds (2-6, 9-19, and
13-17) are added to the bond list at the end, after processing the bonds which
make up the spanning tree.

It appears that the closure bonds store the larger atom index as the begin atom
and the smaller as the end atom, which makes it possible to tell that a given
bond is a closure bond.

In principle then it should be possible to reorder the bonds to get the order 
you want.

This proved trickier than I could manage in the time I have.
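
For the simple case, a rough sketch of the idea might look like the following.
It works on the (begin, end) pairs copied from the table above rather than on
an RDKit molecule, and it rests on two assumptions: that closure bonds are
exactly the ones with begin > end (which is only what the output above
suggests, not documented behavior), and that atom indices follow SMILES order.
The names is_closure and closure_position_order are made up for illustration.
It does not handle an atom that closes more than one ring, where the order of
the closure digits can't be recovered from the bond list alone.

```python
# (begin, end) atom indices copied from the table above, in bond-index order.
bonds = [
    (0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8),
    (8, 9), (9, 10), (10, 11), (11, 12), (12, 13), (13, 14), (14, 15),
    (15, 16), (16, 17), (12, 18), (18, 19), (6, 2), (19, 9), (17, 13),
]


def is_closure(begin, end):
    # Assumption: a closure bond is one stored with the larger index first.
    return begin > end


def closure_position_order(bonds):
    """Return bond indices ordered by where each bond is written in the SMILES.

    A spanning-tree bond is written just before its later atom; a closure
    digit is written just after that atom, so at the same atom the tree bond
    sorts first (False < True).
    """
    def key(idx):
        begin, end = bonds[idx]
        return (max(begin, end), is_closure(begin, end))

    return sorted(range(len(bonds)), key=key)


closures = [i for i, (b, e) in enumerate(bonds) if is_closure(b, e)]
order = closure_position_order(bonds)
print(closures)  # [19, 20, 21]
print(order)
# [0, 1, 2, 3, 4, 5, 19, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 21, 17, 18, 20]
```

Reading the result against the labelled SMILES: bond 19 (the ":1" closure)
lands right after bond 5, bond 21 (":2") right after bond 16, and bond 20
("-1") at the very end.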

Perhaps the better question is, why do you need the bond indices in a specific 
order?

Cheers,


                                Andrew
                                da...@dalkescientific.com


from rdkit import Chem

bond_symbols = {
   Chem.BondType.SINGLE: "-",
   Chem.BondType.DOUBLE: "=",
   Chem.BondType.TRIPLE: "#",
   Chem.BondType.AROMATIC: ":",
}

smi = "CCc1cc[nH]c1CCC1CCC(CC1)c1cc[nH]c1"
#smi = "[C@@](F)(Cl)(Br)O"
mol1 = Chem.MolFromSmiles(smi)
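# Write the SMILES with every bond symbol explicit, then re-parse it so that
# mol2's atom and bond order come from that explicit SMILES string.
smi_explicit = Chem.MolToSmiles(mol1, allBondsExplicit=True)
mol2 = Chem.MolFromSmiles(smi_explicit)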

def show(bonds):
   print(" i Bnd# a1 ~ a2   frag")
   for i, b in enumerate(bonds):
       a1, a2 = b.GetBeginAtomIdx(), b.GetEndAtomIdx()
       symbol = bond_symbols[b.GetBondType()]
       s = Chem.MolFragmentToSmiles(
           mol2, atomsToUse=[a1, a2], rootedAtAtom=a1, allBondsExplicit=True)
       print(f"{i:2d}  {b.GetIdx():2d}  {a1:2d} {symbol} {a2:2d} {s.center(8)}")

print(smi_explicit)
print("In bondlist order")
show(mol2.GetBonds())


