Hi all,
I am using the RDKit PostgreSQL cartridge to perform some substructure
searches on a large number of molecules, as described here:
https://www.rdkit.org/docs/Cartridge.html
However, in addition to knowing which rows matched the query, I would also
like to know the atom indices of each match.
For now I am doing this in two consecutive steps, as shown below.
Is there a way to achieve this in a single step, directly from the PostgreSQL query?
Thanks!
Best regards,
Jose Manuel
from rdkit import Chem

# 0/ initialization (cur is assumed to be an open database cursor, e.g. psycopg2)
query_smiles = "c1ccccc1"
query_mol = Chem.MolFromSmiles(query_smiles)
# 1/ get substructure matches
cur.execute(f"select mol_send(m) from rdk.mols where m@>'{query_smiles}' LIMIT 1")
results = cur.fetchall()
# mol_send(m) returns the molecule in binary form, which Chem.Mol can rebuild
mols = [Chem.Mol(m[0].tobytes()) for m in results]
# 2/ get substructure atom indices
for m in mols:
    print(m.GetSubstructMatches(query_mol))
For instance, this prints the atom indices of each substructure match:
((5, 6, 7, 8, 16, 21), (9, 10, 11, 12, 14, 15), (22, 23, 24, 25, 33, 34))
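
To keep track of which row each set of indices belongs to, the two steps above could be wrapped roughly as follows. This is only a sketch of the same two-step approach, not a single-query solution. The names matches_by_id and mol_from_bytes are made up for illustration, and it assumes the query also selects a row identifier (here called id, which may be named differently in your table).

```python
def matches_by_id(rows, query_mol, mol_from_bytes):
    """Map each row id to the atom-index tuples of its substructure matches.

    rows           -- iterable of (row_id, binary_mol) pairs, e.g. cur.fetchall()
    query_mol      -- query molecule handed to GetSubstructMatches
    mol_from_bytes -- callable that rebuilds a molecule from bytes
                      (hypothetical parameter; Chem.Mol in the code above)
    """
    matches = {}
    for row_id, blob in rows:
        # bytes() accepts both bytes and the memoryview a driver may return
        mol = mol_from_bytes(bytes(blob))
        matches[row_id] = mol.GetSubstructMatches(query_mol)
    return matches
```

With the variables from the code above, and assuming the select list is changed to "select id, mol_send(m) ...", this would be called as matches_by_id(cur.fetchall(), query_mol, Chem.Mol) and would return a dict from row id to the match tuples.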