Hi all,

I am using the RDKit PostgreSQL cartridge to run substructure searches on a large number of molecules, as described here:

https://www.rdkit.org/docs/Cartridge.html

However, in addition to knowing which rows matched the query, I would also like to get the atom indices of each match.

For now I am doing this in 2 consecutive steps, as shown below.

Is there a way to achieve this in a single step, directly from the PostgreSQL query?

Thanks!

Best regards,

Jose Manuel


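# 0/ initialization
# Assumption: `cur` is an open psycopg2 cursor on the database that holds rdk.mols.
from rdkit import Chem

query_smiles = "c1ccccc1"
query_mol = Chem.MolFromSmiles(query_smiles)

# 1/ get substructure matches
# Pass the SMILES as a bound parameter rather than formatting it into the SQL string.
cur.execute("select mol_send(m) from rdk.mols where m@>%s LIMIT 1", (query_smiles,))
results = cur.fetchall()
# mol_send(m) returns the molecule in binary form; rebuild an RDKit Mol from those bytes.
mols = [Chem.Mol(row[0].tobytes()) for row in results]

# 2/ get substructure atom indices
for m in mols:
    print(m.GetSubstructMatches(query_mol))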


For example, this prints the atom indices of each substructure match:
((5, 6, 7, 8, 16, 21), (9, 10, 11, 12, 14, 15), (22, 23, 24, 25, 33, 34))



_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
