Re: [Cdk-user] puzzles about how to compare two molecules to check whether they are the same in cdk

2015-07-08 Thread John M
Just a note on this: in CDK 1.5 the Pattern class is preferred, because the UniversalIsomorphismTester is actually doing MCS (maximum common substructure), which is a much harder problem. Pattern ptrn = Pattern.findIdentical(butane); for (Mol m : mols) ptrn.matches(m); In general, though, this is the slowest way to do it, (n^2) vs (n log n) with

Re: [Cdk-user] puzzles about how to compare two molecules to check whether they are the same in cdk

2015-07-08 Thread John M
Hi John, If the reader produces an IQueryAtomContainer with IQueryAtoms/IQueryBonds, then it's simply a matter of passing it into the Pattern class (or UIT). This will test the query against real molecules; if you want to check whether one query equals another, that's trickier (I think RDKit has

Re: [Cdk-user] puzzles about how to compare two molecules to check whether they are the same in cdk

2015-07-07 Thread John M
> So is there a reasonable way to do that? Can we just generate a canonical form for the molecules (such as SMILES) and then compare the SMILES? Is that reasonable, or are there other methods? Thanks.

Yep. Unique SMILES (canonical, no stereo), Absolute SMILES (canonical, with stereo), or InChI should do what you need. J
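The canonical-string approach in the reply above can be sketched in plain Java. This is only an illustration under stated assumptions: the keys are taken to be canonical strings (Unique SMILES, Absolute SMILES, or InChI) that a toolkit has already produced, and the class name `CanonicalKeyGroups`, its methods, and the placeholder strings are hypothetical, not part of CDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CanonicalKeyGroups {

    // Groups record names by canonical key. Each key is ASSUMED to be a
    // canonical string already generated by a toolkit (e.g. Unique SMILES
    // or InChI); this sketch does not compute it.
    public static Map<String, List<String>> groupByKey(Map<String, String> nameToKey) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : nameToKey.entrySet()) {
            groups.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return groups;
    }

    // Two records are treated as the same molecule when their keys are equal.
    public static boolean sameMolecule(String keyA, String keyB) {
        return keyA != null && keyA.equals(keyB);
    }

    public static void main(String[] args) {
        Map<String, String> nameToKey = new LinkedHashMap<>();
        nameToKey.put("B", "CCO");  // placeholder strings standing in for canonical output
        nameToKey.put("C", "CCO");
        nameToKey.put("D", "CCCC");
        System.out.println(groupByKey(nameToKey));
    }
}
```

Comparing keys this way needs one string per molecule and a hash lookup, rather than a structure-to-structure test for every pair. It is only as reliable as the canonicalisation and any normalisation applied before the keys were generated.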

Re: [Cdk-user] puzzles about how to compare two molecules to check whether they are the same in cdk

2015-07-07 Thread Zheng Shi
Thanks. I will see. Thank you very much. On Tue, Jul 7, 2015 at 5:03 PM, Egon Willighagen egon.willigha...@gmail.com wrote: Dear Zheng Shi, I think what you are looking for is isomorphism checking. If the chemical graph of two structures is the same, they are called isomorphic. The

Re: [Cdk-user] puzzles about how to compare two molecules to check whether they are the same in cdk

2015-07-07 Thread Egon Willighagen
Of course, please remember that if your input is not so clean (various charge states, unclearly defined stereochemistry, etc.), then you may want to do some normalization before you do isomorphism checking or SMILES/InChI generation. The InChI generation does some normalization, but not all. If

Re: [Cdk-user] puzzles about how to compare two molecules to check whether they are the same in cdk

2015-07-07 Thread Egon Willighagen
Dear Zheng Shi, I think what you are looking for is isomorphism checking. If the chemical graph of two structures is the same, they are called isomorphic. The following Groovy code shows a very basic example: butane = MoleculeFactory.makeAlkane(4); isomorphismTester = new
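To make the idea of isomorphism concrete, here is a small stand-alone Java sketch of a brute-force check on labelled graphs. It is not CDK's implementation and not the code from the message above; the class `TinyIsomorphism`, the array-based graph representation, and the example molecules are assumptions chosen for illustration. It assumes symmetric bond matrices and is only practical for very small graphs.

```java
import java.util.Arrays;

public class TinyIsomorphism {

    // Returns true if the two labelled graphs are isomorphic: same atom count
    // and a one-to-one atom mapping that preserves element labels and bond orders.
    // labels[i] is the element symbol of atom i; bonds[i][j] is the bond order
    // between atoms i and j (0 means no bond). Matrices are assumed symmetric.
    public static boolean isomorphic(String[] labelsA, int[][] bondsA,
                                     String[] labelsB, int[][] bondsB) {
        if (labelsA.length != labelsB.length) return false;
        int[] mapping = new int[labelsA.length];
        Arrays.fill(mapping, -1);
        return extend(0, mapping, new boolean[labelsB.length],
                      labelsA, bondsA, labelsB, bondsB);
    }

    // Backtracking: try to map atom i of A onto each unused atom j of B.
    private static boolean extend(int i, int[] mapping, boolean[] used,
                                  String[] labelsA, int[][] bondsA,
                                  String[] labelsB, int[][] bondsB) {
        if (i == labelsA.length) return true; // every atom mapped consistently
        for (int j = 0; j < labelsB.length; j++) {
            if (used[j] || !labelsA[i].equals(labelsB[j])) continue;
            boolean ok = true;
            // Bonds to all previously mapped atoms must agree in both graphs.
            for (int k = 0; k < i && ok; k++) {
                ok = bondsA[i][k] == bondsB[j][mapping[k]];
            }
            if (!ok) continue;
            mapping[i] = j;
            used[j] = true;
            if (extend(i + 1, mapping, used, labelsA, bondsA, labelsB, bondsB)) return true;
            used[j] = false; // undo and try the next candidate
            mapping[i] = -1;
        }
        return false;
    }

    public static void main(String[] args) {
        int[][] chain = {{0, 1, 0}, {1, 0, 1}, {0, 1, 0}}; // atoms 0-1-2 in a row
        String[] ethanol = {"C", "C", "O"};       // heavy atoms only
        String[] ethanolReordered = {"O", "C", "C"};
        String[] dimethylEther = {"C", "O", "C"};
        System.out.println(isomorphic(ethanol, chain, ethanolReordered, chain));
        System.out.println(isomorphic(ethanol, chain, dimethylEther, chain));
    }
}
```

The sketch ignores hydrogens, charges, and stereochemistry, so two inputs that differ only in those respects would be reported as the same graph.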

Re: [Cdk-user] puzzles about how to compare two molecules to check whether they are the same in cdk

2015-07-07 Thread John K. Sterling
Thanks, John. In the same vein: what is the appropriate way to handle these comparisons when the source is an MDL V3000 file with placeholder atoms/wildcards (nots, etc.)? Is there a CDK test that can handle this, or would I have to implement a custom comparison library? John On Jul 7,

Re: [Cdk-user] puzzles about how to compare two molecules to check whether they are the same in cdk

2015-07-06 Thread John M
Hi, The CDK has always been intended as a toolbox (Lego-like) rather than an application. There are a couple of reasons for this, but primarily there isn't always a single best way to accomplish a particular task. If you would like such functionality, I believe workflow tools (KNIME-CDK) offer a

Re: [Cdk-user] puzzles about how to compare two molecules to check whether they are the same in cdk

2015-07-06 Thread Zheng Shi
By "compare" I mean telling whether two molecules are equal. Suppose I have a molecule B, which is generated from molecule A in a reaction, A -> B, and a molecule C, which is also generated from molecule A in a reaction, A -> C. I want to tell whether B and C are the same. Usually we can visualize

[Cdk-user] puzzles about how to compare two molecules to check whether they are the same in cdk

2015-07-04 Thread Zheng Shi
Hi, I just wonder whether there are any functions in CDK that could be used to compare two molecules. Suppose I have two SDF files, and I want to check whether one molecule in the first file is the same as another molecule in the second file. Are there any functions in CDK that can achieve this? Or