Hi,

There is no convenient method or utility class in the CDK to do this,
although it is as 'simple' as:

import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.signature.AtomSignature;
import org.openscience.cdk.signature.MoleculeSignature;

IAtomContainer query = ... // given some query molecule
// the atom signature rooted at atom index 3 of the query
// (indices are zero-based, so this is the fourth atom)
AtomSignature queryAtomSignature = new AtomSignature(3, query);
String queryAtomSignatureString = queryAtomSignature.toCanonicalString();
IAtomContainer target = ... // given some target molecule
// the 'set' of atom signatures for the target
MoleculeSignature moleculeSignature = new MoleculeSignature(target);
for (int targetAtomIndex = 0; targetAtomIndex < target.getAtomCount(); targetAtomIndex++) {
    // signatureStringForVertex already returns the signature as a String
    String targetAtomSignatureString =
            moleculeSignature.signatureStringForVertex(targetAtomIndex, HEIGHT);
    if (targetAtomSignatureString.equals(queryAtomSignatureString)) {
        // Or whatever you want to do here
        System.out.println(queryAtomSignatureString + " found in " + target);
    }
}

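Now the first problem here is the HEIGHT variable - you need to know
what height to use when generating the signatures for the target. The
other problem is that the test only works in one direction: IF a query
signature string is found among the target signature strings THEN the
query atom container is a subgraph of the target. However, the converse
is not necessarily true: the query can be a subgraph of the target
without its signature string appearing among the target's signatures.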
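
On the HEIGHT question, one possible approach is to read the height off
the query signature string itself, by counting how deeply its
parentheses are nested. The sketch below is plain Java with no CDK
dependency; the class SignatureHeight and the method heightOf are
hypothetical names, not part of the CDK. It assumes that parentheses in
a signature string are used only to delimit the children of an atom, as
in C(C(CC)C), so treat it as an illustration rather than a tested
utility.

```java
final class SignatureHeight {

    /**
     * Returns the deepest level of parenthesis nesting in a signature
     * string. Assumption: '(' and ')' only ever delimit child lists.
     */
    static int heightOf(String signature) {
        int depth = 0;
        int maxDepth = 0;
        for (int i = 0; i < signature.length(); i++) {
            char c = signature.charAt(i);
            if (c == '(') {
                depth++;
                maxDepth = Math.max(maxDepth, depth);
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    throw new IllegalArgumentException("Unbalanced ')' in " + signature);
                }
            }
        }
        if (depth != 0) {
            throw new IllegalArgumentException("Unbalanced '(' in " + signature);
        }
        return maxDepth;
    }

    public static void main(String[] args) {
        // The signature from the original question nests two levels deep.
        System.out.println(heightOf("C(C(CC)C)")); // prints 2
    }
}
```

The returned value could then be used as HEIGHT in the loop above, so
that the target signatures are generated to the same depth as the query
signature.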

Consider cyclobutane and methyl-cyclobutane - clearly one is the
subgraph of the other, however the height-2 signature of cyclobutane
from any of its carbons is not a height-2 signature in
methyl-cyclobutane.

gilleain



On 2/4/15, 杨弘宾 <yanyangh...@163.com> wrote:
>
> Hi,
>
> Now I know how to use CDK to generate the AtomSignature of a molecule.
> The question is: if I have a certain signature (maybe an atom signature
> of a molecule or of its subgraph), is there any class or method that can
> quickly judge whether a molecule has the signature (i.e. the signature is
> a substructure of the molecule), so that I can count the number of
> molecules that have the signature?
>
> For example, when I found that C(C(CC)C) is an interesting signature, I
> wanted to know how many molecules in my database have the signature.
>
> Some papers show that atom signatures are a good way to represent
> substructures because they are fast to calculate and match. But it
> seems that there are few tutorials or documents about them.
>
> Thank you.
>
>
> Hongbin Yang 杨弘宾
>
> Research: Toxicophore and Chemoinformatics
> Pharmaceutical Science, School of Pharmacy
>
> East China University of Science and Technology
>
>
