On May 17, 2023, at 02:31, Vincent Scalfani wrote:
> I thought that this might also be the case for bond indices, but that does
> not appear to be correct (see example below). Is it possible to get a bond
> index in the order of the SMILES?
This may help you understand why that's a difficult question.
What does the bond index mean in something like
C12.OC23.C3.C1
? Does the bond for closure 1 come first in the bond list, because that's where
it starts, or is it last, because that's where it ends? It looks like you think
it should be the closure position.
Here's your SMILES labelled by atom index:
      ┌                            1 1 1  1  1 1  1   1    1 1
atoms │ 0 1 2  3 4  5   6   7 8 9  0 1 2  3  4 5  6   7    8 9
      └ | | |  | |  |   |   | | |  | | |  |  | |  |   |    | |
SMILES  C-C-c1:c:c:[nH]:c:1-C-C-C1-C-C-C(-c2:c:c:[nH]:c:2)-C-C-1
I used the program at the end of this email to print the information in bond
list order:
In bondlist order
 i Bnd# a1 ~ a2 frag
 0    0  0 -  1   C-C
 1    1  1 -  2   C-c
 2    2  2 :  3   c:c
 3    3  3 :  4   c:c
 4    4  4 :  5  c:[nH]
 5    5  5 :  6  [nH]:c
 6    6  6 -  7   c-C
 7    7  7 -  8   C-C
 8    8  8 -  9   C-C
 9    9  9 - 10   C-C
10   10 10 - 11   C-C
11   11 11 - 12   C-C
12   12 12 - 13   C-c
13   13 13 : 14   c:c
14   14 14 : 15   c:c
15   15 15 : 16  c:[nH]
16   16 16 : 17  [nH]:c
17   17 12 - 18   C-C
18   18 18 - 19   C-C
19   19  6 :  2   c:c
20   20 19 -  9   C-C
21   21 17 : 13   c:c
If you step through them you'll see that the closure bonds (2-6, 9-19, and
13-17) are added to the bond list at the end, after the bonds which make up
the spanning tree.
It appears that each closure bond has the larger atom index as its begin atom
and the smaller as its end atom, the reverse of the spanning-tree bonds, which
makes it possible to tell that a given bond is a closure bond.
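
The begin-greater-than-end pattern can be checked directly. This is a minimal sketch that uses the (begin, end) pairs copied from the bond list above rather than a live RDKit molecule. The name closure_bond_indices is a hypothetical helper, and the rule itself is only an observation from this one example, not documented RDKit behavior.

```python
# (begin, end) atom index pairs copied from the bond list above, in bond
# index order. With RDKit these would come from b.GetBeginAtomIdx() and
# b.GetEndAtomIdx() for each bond b.
BONDS = [
    (0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8),
    (8, 9), (9, 10), (10, 11), (11, 12), (12, 13), (13, 14), (14, 15),
    (15, 16), (16, 17), (12, 18), (18, 19),
    (6, 2), (19, 9), (17, 13),
]


def closure_bond_indices(bonds):
    """Return indices of bonds whose begin atom index exceeds the end index.

    Assumption (hypothetical rule): such bonds are the ring closures.
    """
    return [i for i, (a1, a2) in enumerate(bonds) if a1 > a2]


print(closure_bond_indices(BONDS))  # [19, 20, 21]
```

For this molecule the three flagged bonds are exactly the ones at the end of the bond list.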
In principle then it should be possible to reorder the bonds to get the order
you want.
This proved trickier than I could manage in the time I have.
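
As a possible starting point, here is a minimal sketch of one reordering, checked only against the bond list above. It assumes that atom indices follow SMILES order, that a closure bond is one with begin index greater than end index, and that a closure bond belongs at the position where the ring closes. The name smiles_order is hypothetical. When several closures end on the same atom, the sketch falls back to bond index order, which may not match the order of the closure digits in the string.

```python
# (begin, end) atom index pairs copied from the bond list above, in bond
# index order.
BONDS = [
    (0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8),
    (8, 9), (9, 10), (10, 11), (11, 12), (12, 13), (13, 14), (14, 15),
    (15, 16), (16, 17), (12, 18), (18, 19),
    (6, 2), (19, 9), (17, 13),
]


def smiles_order(bonds):
    """Return bond indices sorted by where each bond is completed in the SMILES.

    Assumptions (not confirmed RDKit behavior): a spanning-tree bond is
    written just before its higher-numbered atom, and a closure bond
    (begin > end) is written just after the atom that closes the ring.
    """
    def key(i):
        a1, a2 = bonds[i]
        is_closure = a1 > a2
        # Tree bonds into an atom sort before closures that end on it;
        # the bond index breaks any remaining ties.
        return (max(a1, a2), is_closure, i)

    return sorted(range(len(bonds)), key=key)


print(smiles_order(BONDS))
```

For this example the closure bonds 19, 21, and 20 move to just after bonds 5, 16, and 18, which matches the left-to-right order of the bond symbols in the explicit SMILES.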
Perhaps the better question is, why do you need the bond indices in a specific
order?
Cheers,
Andrew
da...@dalkescientific.com

from rdkit import Chem

# Map RDKit bond types to their SMILES bond symbols.
bond_symbols = {
    Chem.BondType.SINGLE: "-",
    Chem.BondType.DOUBLE: "=",
    Chem.BondType.TRIPLE: "#",
    Chem.BondType.AROMATIC: ":",
}

smi = "CCc1cc[nH]c1CCC1CCC(CC1)c1cc[nH]c1"
#smi = "[C@@](F)(Cl)(Br)O"

# Round-trip through a SMILES with explicit bonds so that the atom order of
# mol2 follows the printed string.
mol1 = Chem.MolFromSmiles(smi)
smi_explicit = Chem.MolToSmiles(mol1, allBondsExplicit=True)
mol2 = Chem.MolFromSmiles(smi_explicit)

def show(bonds):
    print(" i Bnd# a1 ~ a2 frag")
    for i, b in enumerate(bonds):
        a1, a2 = b.GetBeginAtomIdx(), b.GetEndAtomIdx()
        symbol = bond_symbols[b.GetBondType()]
        # SMILES for just this bond, written starting from its begin atom.
        s = Chem.MolFragmentToSmiles(mol2, atomsToUse=[a1, a2], rootedAtAtom=a1,
                                     allBondsExplicit=True)
        # Field widths are chosen to line up with the header printed above.
        print(f"{i:2d} {b.GetIdx():4d} {a1:2d} {symbol} {a2:2d} {s.center(8)}")

print(smi_explicit)
print("In bondlist order")
show(mol2.GetBonds())