On May 17, 2023, at 02:31, Vincent Scalfani <vfscalf...@ua.edu> wrote:
> I thought that this might also be the case for bond indices, but that does
> not appear to be correct (see example below). Is it possible to get a bond
> index in the order of the SMILES?

This may help you understand why that's a difficult question.

What does the bond index mean in something like C12.OC23.C3.C1 ?

Does the bond for closure 1 come first in the bond list, because that's
where it starts, or is it last, because that's where it ends?

It looks like you think it should be the closure position.

Here's your SMILES labelled by atom index:

      ┌                            1 1 1  1  1 1  1   1    1 1
 atoms│ 0 1 2  3 4  5   6   7 8 9  0 1 2  3  4 5  6   7    8 9
      └ | | |  | |  |   |   | | |  | | |  |  | |  |   |    | |
SMILES[ C-C-c1:c:c:[nH]:c:1-C-C-C1-C-C-C(-c2:c:c:[nH]:c:2)-C-C-1

I used the program at the end of this email to print the information
in bond list order:

In bondlist order
 i  Bnd#  a1 ~ a2  frag
 0    0    0 -  1  C-C
 1    1    1 -  2  C-c
 2    2    2 :  3  c:c
 3    3    3 :  4  c:c
 4    4    4 :  5  c:[nH]
 5    5    5 :  6  [nH]:c
 6    6    6 -  7  c-C
 7    7    7 -  8  C-C
 8    8    8 -  9  C-C
 9    9    9 - 10  C-C
10   10   10 - 11  C-C
11   11   11 - 12  C-C
12   12   12 - 13  C-c
13   13   13 : 14  c:c
14   14   14 : 15  c:c
15   15   15 : 16  c:[nH]
16   16   16 : 17  [nH]:c
17   17   12 - 18  C-C
18   18   18 - 19  C-C
19   19    6 :  2  c:c
20   20   19 -  9  C-C
21   21   17 : 13  c:c

If you step through them you'll see that the closure bonds (2-6, 9-19,
and 13-17) are added to the bond list at the end, after processing the
atoms which make up the spanning tree.

It appears the closure bonds have their begin and end atom indices with
the larger index first, which makes it possible to tell that a given
bond is a closure bond.

In principle then it should be possible to reorder the bonds to get the
order you want. This proved trickier than I could manage in the time I
have.

Perhaps the better question is, why do you need the bond indices in a
specific order?

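As a rough illustration of the observation about closure bonds, and not a tested solution to the reordering problem, here is a small pure-Python sketch that works on the (begin, end) atom index pairs copied from the table above. The names is_closure and closing_position_order are hypothetical, and the sort key is only a heuristic: it assumes atom indices follow the SMILES reading order, that a closure bond belongs at its closing digit, and that no atom closes more than one ring. It reproduces the reading order for this one example and may well fail on other inputs.

```python
# (begin, end) atom indices in bond-list order, copied from the table above.
bonds = [
    (0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8),
    (8, 9), (9, 10), (10, 11), (11, 12), (12, 13), (13, 14), (14, 15),
    (15, 16), (16, 17), (12, 18), (18, 19), (6, 2), (19, 9), (17, 13),
]


def is_closure(bond):
    """Flag a bond as a ring closure when its begin index exceeds its end index.

    This relies on the pattern observed in the table, not on documented
    toolkit behavior.
    """
    begin, end = bond
    return begin > end


def closing_position_order(bonds):
    """Return bond-list indices sorted by a heuristic 'closing position' key.

    Key: the larger atom index of the bond, with closure bonds placed after
    the spanning-tree bond that reaches the same atom. Assumption: at most
    one closure per atom.
    """
    return sorted(
        range(len(bonds)),
        key=lambda i: (max(bonds[i]), is_closure(bonds[i])),
    )


closure_idx = [i for i, b in enumerate(bonds) if is_closure(b)]
order = closing_position_order(bonds)

print("closure bonds at bond-list positions:", closure_idx)
print("heuristic SMILES order:", order)
```

For this molecule the three closure bonds are moved next to the atoms where their closure digits appear. An atom that closes two rings at once would need the digit order from the SMILES string itself, which the atom indices alone do not provide.
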
Cheers,

                                Andrew
                                da...@dalkescientific.com


from rdkit import Chem

# Map RDKit bond types to their SMILES bond symbols.
bond_symbols = {
    Chem.BondType.SINGLE: "-",
    Chem.BondType.DOUBLE: "=",
    Chem.BondType.TRIPLE: "#",
    Chem.BondType.AROMATIC: ":",
}

smi = "CCc1cc[nH]c1CCC1CCC(CC1)c1cc[nH]c1"
#smi = "[C@@](F)(Cl)(Br)O"

# Round-trip through a SMILES with every bond written explicitly, so the
# atom order in mol2 matches the order of the printed SMILES.
mol1 = Chem.MolFromSmiles(smi)
smi_explicit = Chem.MolToSmiles(mol1, allBondsExplicit=True)
mol2 = Chem.MolFromSmiles(smi_explicit)

def show(bonds):
    print(" i Bnd# a1 ~ a2 frag")
    for i, b in enumerate(bonds):
        a1, a2 = b.GetBeginAtomIdx(), b.GetEndAtomIdx()
        symbol = bond_symbols[b.GetBondType()]
        # Two-atom fragment SMILES for this bond, rooted at the begin atom.
        s = Chem.MolFragmentToSmiles(mol2, atomsToUse=[a1, a2],
                                     rootedAtAtom=a1, allBondsExplicit=True)
        print(f"{i:2d} {b.GetIdx():2d} {a1:2d} {symbol} {a2:2d} {s.center(8)}")

print(smi_explicit)
print("In bondlist order")
show(mol2.GetBonds())

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss