On 25 February 2011 12:30, <ma...@ebi.ac.uk> wrote:

> >
> > I wonder why I haven't asked this before. How is this done on OrChem?
> >
> > "These columns provide a quick way to materialize a basic CDK molecule to
> > be passed into the VF2 algorithm.  The data structures used are quite
> > straightforward, for instance with data in column atom "C O" interpreted
> > as: "atom 0 is Carbon, atom 1 is Oxygen" and bond column "0 1 D Y" then
> > implying "there is a bond between C (atom 0) and O (atom 1) that is
> > double (D) and aromatic is true (Y)". In this way, CDK molecules can be
> > generated very fast without the need for calculating
> > any properties during the search."
>
>
> Yes, 'properties' here means mainly aromaticity, which is relatively
> expensive to calculate. So the database molecules get a fingerprint but
> they also get stored in a simple format for quick re-assembly before going
> into VF2.
>


I think this was already mentioned in this thread: store precalculated
information for properties that take a long time to calculate, e.g.
aromaticity and rings, and just load them into the CDK structure before
doing the substructure match.
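
To make the quoted column format concrete, here is a minimal sketch in Python (not CDK code, and not the OrChem implementation) that rebuilds a bare graph from an atom column and a bond column. The quoted text only confirms the codes "D" (double) and "Y" (aromatic); the codes "S" and "T", the grouping of bonds into runs of four tokens, and the names Bond and parse_molecule are my assumptions for illustration.

```python
from dataclasses import dataclass

# Assumed letter codes: only "D" (double) appears in the quoted description;
# "S" (single) and "T" (triple) are guesses made for this sketch.
BOND_ORDERS = {"S": 1, "D": 2, "T": 3}


@dataclass(frozen=True)
class Bond:
    begin: int      # index of the first atom
    end: int        # index of the second atom
    order: int      # 1, 2 or 3
    aromatic: bool  # stored flag, so no aromaticity perception is needed


def parse_molecule(atom_column, bond_column):
    """Rebuild atoms and bonds from the two stored strings."""
    atoms = atom_column.split()
    tokens = bond_column.split()
    # Assumption: each bond is stored as four whitespace-separated tokens.
    if len(tokens) % 4 != 0:
        raise ValueError("bond column must hold groups of four tokens")
    bonds = []
    for i in range(0, len(tokens), 4):
        begin, end, order, aromatic = tokens[i:i + 4]
        bonds.append(Bond(int(begin), int(end), BOND_ORDERS[order], aromatic == "Y"))
    return atoms, bonds


atoms, bonds = parse_molecule("C O", "0 1 D Y")
print(atoms, bonds)
```

The point is that everything expensive is read back as a stored flag, so re-assembly is just string splitting.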



> >
> > Is this VF2 the turbo-substructure algorithm? Or a custom one?
>
> It's custom, I still have to see if I can replace it with SMSD and if that
> would be faster. At the time the CDK did not offer a VF2 algorithm.
>
>
The UIT (CDK's UniversalIsomorphismTester) would do as well.



> > Do you create real CDK molecules or just some kind of graph
> > representation that you put in your custom VF2?
>
> The latter, just what I need for VF2.
>
> >
> > Which properties do you need for VF2? Implicit hydrogens? Or is it enough
> > to assign an atom its symbol "C" and each bond an order?
>
> Element symbol, explicit hydrogens, bond orders, aromaticity. I don't
> include charge, though I probably should (so for now the search is
> charge-insensitive).
>
> It's also really beneficial to 'sort' the atom container, it's described
> somewhere in the paper. Try and let VF2 match the least common elements
> first (generally not the carbons or oxygens). I benchmarked that and
> performance shot up.
>
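
The reordering described above can be sketched roughly as follows, in Python rather than against the CDK API. This is only my reading of the idea, not the benchmarked implementation: the frequency table is made up, and the function names rarest_first_order and reorder are hypothetical. In practice the counts would presumably come from the database itself.

```python
from collections import Counter


def rarest_first_order(atoms, element_frequency):
    """Return atom indices sorted so the least common elements come first."""
    # Elements missing from the table are treated as rarest (frequency 0);
    # the original index breaks ties so the result is deterministic.
    return sorted(range(len(atoms)),
                  key=lambda i: (element_frequency.get(atoms[i], 0), i))


def reorder(atoms, bonds, order):
    """Renumber atoms and bonds according to the new atom order."""
    new_index = {old: new for new, old in enumerate(order)}
    new_atoms = [atoms[old] for old in order]
    new_bonds = [(new_index[a], new_index[b], bond_order)
                 for a, b, bond_order in bonds]
    return new_atoms, new_bonds


# Hypothetical element counts, for illustration only.
frequency = Counter({"C": 1000, "O": 300, "N": 200, "Cl": 10})

atoms = ["C", "C", "O", "Cl"]
bonds = [(0, 1, 1), (1, 2, 2), (1, 3, 1)]  # (begin, end, bond order)

order = rarest_first_order(atoms, frequency)
sorted_atoms, sorted_bonds = reorder(atoms, bonds, order)
print(sorted_atoms, sorted_bonds)
```

A matcher that starts from the first atom then begins at the chlorine, where far fewer candidate atoms exist in a typical target than for a carbon.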


Mark, is your implementation of the VF2 algorithm available for testing? We
are currently benchmarking isomorphism testers, and it would be interesting
to compare VF2 as well.

Nina



>
> cheers,
> Mark
>
>
>
>
>
> ------------------------------------------------------------------------------
> Free Software Download: Index, Search & Analyze Logs and other IT data in
> Real-Time with Splunk. Collect, index and harness all the fast moving IT
> data
> generated by your applications, servers and devices whether physical,
> virtual
> or in the cloud. Deliver compliance at lower cost and gain new business
> insights. http://p.sf.net/sfu/splunk-dev2dev
> _______________________________________________
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>
