SMILES ------>
ClC(Cl)(Cl)C(O)OCC(COC(O)C(Cl)(Cl)Cl)(COC(O)C(Cl)(Cl)Cl)COC(O)C(Cl)(Cl)Cl
SMARTS ------> 
ClC(Cl)(Cl)[CH]([O-,OH,OC])O[CH2]C([CH2]O[CH]([O-,OH,OC])C(Cl)(Cl)Cl)([CH2]O[CH]([O-,OH,OC])C(Cl)(Cl)Cl)[CH2]O[CH]([O-,OH,OC])C(Cl)(Cl)Cl
 
 



I tried matching this example (the SMILES) against itself using SMSD and 
got the answer in under 2 seconds.
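
For context on why this molecule is hard for the matchers: the report 
quoted below gives 31104 matches, "apparently (4*3*2)(3*2)^4". One reading 
of that product (my assumption, not stated in the report) is 4! orderings 
of the four equivalent arms around the central carbon, times 3! orderings 
of the three chlorines in each of the four CCl3 groups. A minimal sketch 
that checks the arithmetic; the class and method names are purely 
illustrative:

```java
public class SymmetryCount {

    // n! for small non-negative n.
    static long factorial(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    // arms! * (substituentsPerArm!)^arms
    static long matchCount(int arms, int substituentsPerArm) {
        long count = factorial(arms);
        long perArm = factorial(substituentsPerArm);
        for (int i = 0; i < arms; i++) {
            count *= perArm;
        }
        return count;
    }

    public static void main(String[] args) {
        // Four arms, three chlorines per arm: prints 31104.
        System.out.println(matchCount(4, 3));
    }
}
```

Every one of these mappings is a valid answer, so an algorithm that 
enumerates them all does far more work than one that stops at the first.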

It is very difficult to find "a single" best MCS software. Each MCS 
algorithm comes with some pros and cons.
Some are good at finding all possible cliques (which increases the 
runtime) and others are good at finding subgraphs (where you usually need 
only one solution). For example, as far as I can see, UIT and SMSD (which 
also uses a modified UIT in a few cases) belong to the former class, while 
VF2 and Ullmann belong to the latter.
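
To illustrate the difference between the two classes, here is a toy 
backtracking matcher. It is not the UIT, VF2 or Ullmann implementation, 
just a sketch under simple assumptions: graphs are plain adjacency 
matrices with no atom or bond types, and the names (MatchModes, 
countMappings, star) are made up for this example. The "limit" argument 
switches between "stop at the first solution" and "enumerate everything":

```java
public class MatchModes {

    // Counts mappings of every query node onto a distinct target node such
    // that each query edge is also a target edge; stops once 'limit' is hit.
    static int countMappings(boolean[][] query, boolean[][] target, int limit) {
        int[] map = new int[query.length];
        boolean[] used = new boolean[target.length];
        return extend(query, target, map, used, 0, limit);
    }

    private static int extend(boolean[][] q, boolean[][] t, int[] map,
                              boolean[] used, int depth, int limit) {
        if (depth == q.length) {
            return 1; // every query node is mapped: one complete solution
        }
        int found = 0;
        for (int cand = 0; cand < t.length && found < limit; cand++) {
            if (used[cand]) {
                continue;
            }
            // Edges from this query node to already-mapped nodes must exist
            // between the corresponding target nodes.
            boolean ok = true;
            for (int prev = 0; prev < depth && ok; prev++) {
                if (q[depth][prev] && !t[cand][map[prev]]) {
                    ok = false;
                }
            }
            if (!ok) {
                continue;
            }
            used[cand] = true;
            map[depth] = cand;
            found += extend(q, t, map, used, depth + 1, limit - found);
            used[cand] = false;
        }
        return found;
    }

    // Star graph: node 0 is the centre, nodes 1..leaves are attached to it.
    static boolean[][] star(int leaves) {
        boolean[][] g = new boolean[leaves + 1][leaves + 1];
        for (int i = 1; i <= leaves; i++) {
            g[0][i] = true;
            g[i][0] = true;
        }
        return g;
    }

    public static void main(String[] args) {
        boolean[][] g = star(3);
        System.out.println(countMappings(g, g, 1));                 // first solution only
        System.out.println(countMappings(g, g, Integer.MAX_VALUE)); // all solutions
    }
}
```

For the three-leaf star matched against itself, the first call returns 
after one mapping, while the second walks all 3! = 6 of them. On a highly 
symmetric molecule the same gap becomes much larger.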

I have tried to use an adaptive MCS approach in SMSD, but I think we could 
join hands to make MCS-based solutions more effective.

Thanks

Asad
>
> ------------------------------
>
> Message: 3
> Date: Thu, 30 Apr 2009 15:22:55 -0500
> From: Loren Lenzen <loren.len...@sial.com>
> Subject: [Cdk-user] Possible bug in SQT
> To: cdk-user@lists.sourceforge.net
> Message-ID:
>       <ofebcdec80.18caf729-on862575a8.006d4377-862575a8.006fd...@sial.com>
> Content-Type: text/plain; charset="us-ascii"
>
> I have a list of parent molecules in SMILES form, and I was running the 
> SQT against a list of SMARTS queries to make sure that all my queries 
> were valid.  It works great until it hits Petrichloral.  The SMILES and 
> SMARTS strings parse fine, but there is no more output once the SQT 
> reaches this query (currently 264/326), so I believe there might be a 
> recursion problem.  No CDKException was thrown even after an hour, 
> although the first 263 queries ran in 15 seconds.  Petrichloral is very 
> symmetric, and Daylight's depict.cgi runs the query fine with 
> 31104 matches: apparently (4*3*2)(3*2)^4.  I was purposely running a dot 
> product iteration here, so that's why there is no inner loop.  I tried 
> formatting the SMARTS with and without brackets. 
>
> SMILES ------>
> ClC(Cl)(Cl)C(O)OCC(COC(O)C(Cl)(Cl)Cl)(COC(O)C(Cl)(Cl)Cl)COC(O)C(Cl)(Cl)Cl
> SMARTS ------> 
> ClC(Cl)(Cl)[CH]([O-,OH,OC])O[CH2]C([CH2]O[CH]([O-,OH,OC])C(Cl)(Cl)Cl)([CH2]O[CH]([O-,OH,OC])C(Cl)(Cl)Cl)[CH2]O[CH]([O-,OH,OC])C(Cl)(Cl)Cl
>  
>  
>   

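> // Imports assume the CDK 1.2-era package layout; the class name is a
> // placeholder, since the original class declaration was not included.
> import java.io.BufferedReader;
> import java.io.BufferedWriter;
> import java.io.FileReader;
> import java.io.FileWriter;
> import java.io.IOException;
> import java.util.ArrayList;
> import java.util.List;
>
> import org.openscience.cdk.AtomContainerSet;
> import org.openscience.cdk.DefaultChemObjectBuilder;
> import org.openscience.cdk.exception.CDKException;
> import org.openscience.cdk.smiles.SmilesParser;
> import org.openscience.cdk.smiles.smarts.SMARTSQueryTool;
>
> public class SqtCheck {
>
>     public static void main(String[] args) throws CDKException, IOException {
>
>         SmilesParser sp = new SmilesParser(DefaultChemObjectBuilder.getInstance());
>         List<String> smarts = new ArrayList<String>();
>         List<String> mols = new ArrayList<String>();
>         AtomContainerSet acs = new AtomContainerSet();
>         String line;
>
>         // One SMARTS query per line.
>         BufferedReader br1 = new BufferedReader(new FileReader("smarts.txt"));
>         while ((line = br1.readLine()) != null) {
>             smarts.add(line);
>         }
>         br1.close();
>
>         // One parent molecule (SMILES) per line, parsed as it is read.
>         BufferedReader br2 = new BufferedReader(new FileReader("subStructures.txt"));
>         while ((line = br2.readLine()) != null) {
>             mols.add(line);
>             acs.addAtomContainer(sp.parseSmiles(line));
>         }
>         br2.close();
>
>         BufferedWriter stream = new BufferedWriter(new FileWriter("deaOut.txt", true));
>         // Dummy string for initialization; replaced by setSmarts() below.
>         SMARTSQueryTool sqt = new SMARTSQueryTool("c1ccccc1");
>         try {
>             // Query i is matched only against molecule i (no inner loop).
>             for (int ac = 0; ac != acs.getAtomContainerCount(); ac++) {
>                 sqt.setSmarts(smarts.get(ac));
>                 System.out.println("" + ac);     // for debugging purposes
>                 if (sqt.matches(acs.getAtomContainer(ac))) {
>                     stream.write(mols.get(ac) + " | " + smarts.get(ac));
>                     stream.newLine();
>                 }
>             }
>         } finally {
>             // Close the output even if a CDKException propagates.
>             stream.close();
>         }
>     }
> }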
>
>
> ------------------------------
>
> Message: 4
> Date: Thu, 30 Apr 2009 16:51:58 -0400
> From: Rajarshi Guha <rg...@indiana.edu>
> Subject: Re: [Cdk-user] Possible bug in SQT
> To: Loren Lenzen <loren.len...@sial.com>
> Cc: cdk-user@lists.sourceforge.net
> Message-ID: <05778aef-b22e-4f3b-99ea-20906e88a...@indiana.edu>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>
>
> On Apr 30, 2009, at 4:22 PM, Loren Lenzen wrote:
>
>   
>> I have a list of parent molecules in SMILES form, and I was running  
>> the SQT against a list of SMARTS queries to make sure that all my  
>> queries were valid.  It works great until it hits Petrichloral.  The  
>> SMILES and SMARTS strings parse fine, but there is no more output  
>> once the SQT reaches this query (currently 264/326), so I believe  
>> there might be a recursion problem.  No CDKException was thrown  
>> even after an hour, although the first 263 queries ran in  
>> 15 seconds.
>>     
>
> The problem seems to be in the isomorphism code. If one ignores the  
> SMARTS and just tries to match the SMILES against itself, UIT, VF2  
> and Ullmann all run forever (or at least for 30 seconds, after which I  
> stopped the run). Some symmetry-based optimization seems to be called  
> for here.
>
> -------------------------------------------------------------------
> Rajarshi Guha  <rg...@indiana.edu>
> GPG Fingerprint: D070 5427 CC5B 7938 929C  DD13 66A1 922C 51E7 9E84
> -------------------------------------------------------------------
> Q:  What's polite and works for the phone company?
> A:  A deferential operator.
>
>
>
>
>
> ------------------------------
>
> ------------------------------------------------------------------------------
> Register Now & Save for Velocity, the Web Performance & Operations 
> Conference from O'Reilly Media. Velocity features a full day of 
> expert-led, hands-on workshops and two days of sessions from industry 
> leaders in dedicated Performance & Operations tracks. Use code vel09scf 
> and Save an extra 15% before 5/3. http://p.sf.net/sfu/velocityconf
>
> ------------------------------
>
> _______________________________________________
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>
>
> End of Cdk-user Digest, Vol 36, Issue 1
> ***************************************
>   

-- 

****************************************************************

Dr. Syed Asad Rahman (B.Engg, PhD)
Research Scientist

EMBL-EBI                       Phone: +44-(0) 1223-49-2537
Wellcome Trust Genome Campus   Fax: +44-(0) 1223-49-4486
Hinxton CB10 1SD               E-mail: a...@ebi.ac.uk
Cambridge, UK                  Home Page: www.ebi.ac.uk/~asad

*****************************************************************


