Re: [Rdkit-discuss] What is the most efficient way to check for exact match with RDKit?

Nils Weskamp Tue, 05 Oct 2021 12:01:17 -0700

Dear Theo,

it might be useful to describe your specific application scenario a bit more to provide some context. What do you want to do, and what would "efficient" look like?

One advantage of using InChIKeys is that they have a fixed length and can therefore be stored and indexed efficiently in a database. So, if you want e.g. to (frequently) compare one compound against a large compound collection, it might be a good idea to pre-compute these identifiers, store them in a database and do a lookup there. BTW: this is something the RDKit Cartridge can do for you: https://www.rdkit.org/docs/Cartridge.html#substructure-and-exact-structure-search
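
To make the pre-compute-and-look-up idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for a real database, not the RDKit Cartridge. The table layout and the names build_index, find_exact and inchikey_from_smiles are made up for illustration, and the three InChIKeys are hard-coded so that the database part runs without RDKit. In practice the keys would be computed once, e.g. with Chem.MolToInchiKey, as in the optional helper.

```python
import sqlite3


def build_index(records):
    """Store (name, inchikey) pairs in an in-memory table indexed by key."""
    conn = sqlite3.connect(":memory:")
    # Standard InChIKeys are always 27 characters long, so the column
    # can be treated as fixed-width and indexed cheaply.
    conn.execute(
        "CREATE TABLE compounds (name TEXT NOT NULL, inchikey CHAR(27) NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_inchikey ON compounds (inchikey)")
    conn.executemany(
        "INSERT INTO compounds (name, inchikey) VALUES (?, ?)", records
    )
    conn.commit()
    return conn


def find_exact(conn, inchikey):
    """Return the names of all stored compounds with exactly this key."""
    rows = conn.execute(
        "SELECT name FROM compounds WHERE inchikey = ? ORDER BY name",
        (inchikey,),
    ).fetchall()
    return [name for (name,) in rows]


def inchikey_from_smiles(smiles):
    """Optional helper: compute the key for a query structure with RDKit."""
    # Imported lazily so the database part above works without RDKit.
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return Chem.MolToInchiKey(mol)


# Pre-computed keys for a tiny example collection (hard-coded here).
COLLECTION = [
    ("ethanol", "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"),
    ("benzene", "UHOVQNZJYSORNB-UHFFFAOYSA-N"),
    ("water", "XLYOFNOQVPJJNP-UHFFFAOYSA-N"),
]
```

A lookup would then be find_exact(build_index(COLLECTION), inchikey_from_smiles("CCO")), assuming RDKit with InChI support is installed. The expensive step (generating the key) happens once per stored compound and once per query; the comparison itself is an indexed string match.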


In other scenarios, other approaches might work better.

Best wishes,
Nils

Am 05.10.2021 um 10:06 schrieb theozh:

Dear Giovanni,
thank you for your explanations and advice. So, I just wanted to rule out that I had missed a very basic function for checking identity.
You are suggesting using InChI-Keys (with the very low probability of having the same InChI-key for different molecules). Then, what would be the disadvantage of using InChI strings instead of InChI-keys? Computation time & power?
The response I got from StackOverflow was that the substructure approach was a little faster than the canonical SMILES approach. I would assume that a simple string comparison within a fixed set of structures is much faster than calculating the canonical SMILES again and again for each search.
So, I will check the InChI approach and compare it with the other approaches.
Thanks,
Theo.


_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



