Hi,

The short answer is: MCS will be quicker to reject pairs of
molecules, but if you don't mind the cost of pre-calculating the
InChI, string comparison will be faster for all-v-all searches.

I think that's right. My reasoning is that the UIT/SMSD/whatever can
quickly find reasons why two molecules don't match - a trivial example
being that they have different numbers of atoms - whereas if you
convert to an InChI (or canonical SMILES, or signature) the
canonicalisation step is at least as slow as the isomorphism test.
However, if you only do that once for each molecule in a set, then do
multiple comparisons, you win overall.
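
To make the "canonicalise once, compare many times" point concrete:
for n molecules an all-v-all search needs n(n-1)/2 pairwise
isomorphism tests, but only n canonicalisations followed by cheap
string comparisons. Below is a minimal sketch of that idea, not CDK
code. The names DuplicateFinder, groupByCanonicalKey and toyKey are
made up for illustration, and toyKey is only a placeholder that sorts
atom symbols; in practice you would plug in whatever produces your
InChI or canonical SMILES string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of CDK.
final class DuplicateFinder {

    // Computes the canonical key of each item exactly once, then groups
    // items whose keys are equal strings. This costs n key computations
    // instead of n(n-1)/2 pairwise structure comparisons.
    static <T> Map<String, List<T>> groupByCanonicalKey(
            List<T> items, Function<? super T, String> canonicalKey) {
        Map<String, List<T>> groups = new LinkedHashMap<>();
        for (T item : items) {
            String key = canonicalKey.apply(item); // the expensive step, done once per item
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(item);
        }
        return groups;
    }

    // ASSUMPTION: toy stand-in for a real canonicaliser (InChI, canonical
    // SMILES, signature). It only sorts the characters of the input, so it
    // compares atom counts and nothing else. Do not use it for chemistry.
    static String toyKey(String atoms) {
        char[] symbols = atoms.toCharArray();
        Arrays.sort(symbols);
        return new String(symbols);
    }

    public static void main(String[] args) {
        List<String> molecules = List.of("CCO", "OCC", "CCN");
        Map<String, List<String>> groups =
                groupByCanonicalKey(molecules, DuplicateFinder::toyKey);
        System.out.println(groups); // prints {CCO=[CCO, OCC], CCN=[CCN]}
    }
}
```

Any group with more than one member is a set of candidate duplicates.
If you only ever compare a single pair, this buys you nothing and the
isomorphism tester is probably the better choice.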

This may have come up on cdk-user before; at least, this thread may be relevant:

http://www.mail-archive.com/cdk-user@lists.sourceforge.net/msg01897.html

gilleain

On Mon, Feb 14, 2011 at 5:17 PM, Nick Vandewiele
<nick.vandewi...@ugent.be> wrote:
> Hi,
>
>
>
> When comparing two molecules to verify if they are identical or not, I now
> use the UniversalIsomorphismTester.isIsomorph method to test this.
>
> However, I noticed that this class is actually a complete maximum common
> subgraph (MCS) algorithm, and it offers far more than I need for this
> purpose.
>
>
>
> To minimize computational cost, would I be better off just generating
> InChIs for both structures and comparing the two strings, instead of
> running the UniversalIsomorphismTester method?
>
> Or does CDK use the full MCS algorithm because this method is more reliable
> (more robust) than the InChi generator?
>
>
>
> Thanks in advance,
>
>
>
> Nick VANDEWIELE
>
>
>
> ------------------------------------------------------------------------------
> The ultimate all-in-one performance toolkit: Intel(R) Parallel Studio XE:
> Pinpoint memory and threading errors before they happen.
> Find and fix more than 250 security defects in the development cycle.
> Locate bottlenecks in serial and parallel code that limit performance.
> http://p.sf.net/sfu/intel-dev2devfeb
> _______________________________________________
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>
>
