No problem - it should say so in the error message, but I think it is being 
suppressed (wrapped exceptions). I’ll double-check that.

On 8 Nov 2013, at 04:29, Rajarshi Guha <rajarshi.g...@gmail.com> wrote:

> Thanks for the explanation - much clearer now!
> 
> 
> On Thu, Nov 7, 2013 at 1:05 PM, John May <john...@ebi.ac.uk> wrote:
> Hi all,
> 
> Yep, the SMILES parser changed on master and won’t accept invalid SMILES by 
> default. Notice how Daylight rejects it too. It should now be the case that 
> if CDK rejects it, Daylight also rejects it (if not, then it’s a bug). 
> The new parser automatically kekulises on load, verifying that bond orders can 
> be assigned to aromatic systems. This is much friendlier for the CDK, as you 
> don’t have molecules with all-single aromatic bonds floating about. When we 
> added this it fixed 2 failing unit tests.
> 
> In this molecule you're missing a hydrogen on one or more nitrogens; knowing 
> which ones is the problem.
> 
> The SMILES should be:
>> c4ccc2c(cc1=Nc3[nH]cccc3(Cn12))c4
>> 
> 
> 
> Some toolkits will fix this by default, but that’s making several assumptions 
> and it’s nothing more than a hack for broken SMILES input. To fix this you 
> need to change the formula of the molecule, which is never a good start. You 
> can still parse it with the CDK by turning on ‘preserve aromaticity’ (need to 
> rename), which disables electron checking, but I strongly discourage that. The 
> actual fix involves checking every possible combination of hydrogens on 
> aromatic nitrogens and phosphorus atoms; check out the fixarom code from 
> http://www.daylight.com/download/contrib/.
> 
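A minimal sketch of the "check every combination of hydrogens" idea, in plain Python rather than the CDK or fixarom. It only enumerates candidate SMILES by rewriting bare aromatic n and p atoms as [nH] and [pH]; the function names are made up for illustration, and each candidate would still have to be handed to a real parser to see which ones kekulise.

```python
from itertools import combinations

# Bare aromatic atoms that may be missing a hydrogen, and their bracketed form.
# Restricting this to n and p is an assumption taken from the thread.
FIXABLE = {"n": "[nH]", "p": "[pH]"}


def bare_aromatic_positions(smiles):
    """Return indices of bare n/p atoms, skipping anything inside [...]."""
    positions = []
    in_bracket = False
    for index, char in enumerate(smiles):
        if char == "[":
            in_bracket = True
        elif char == "]":
            in_bracket = False
        elif char in FIXABLE and not in_bracket:
            positions.append(index)
    return positions


def hydrogen_candidates(smiles):
    """Yield each variant with a hydrogen added to a non-empty subset of
    bare n/p atoms, smallest subsets first."""
    positions = bare_aromatic_positions(smiles)
    for size in range(1, len(positions) + 1):
        for subset in combinations(positions, size):
            chosen = set(subset)
            yield "".join(
                FIXABLE[char] if index in chosen else char
                for index, char in enumerate(smiles)
            )


if __name__ == "__main__":
    for candidate in hydrogen_candidates("c4ccc2c(cc1=Nc3ncccc3(Cn12))c4"):
        print(candidate)
```

For the SMILES from this thread the first candidate produced is c4ccc2c(cc1=Nc3[nH]cccc3(Cn12))c4, the corrected form given above. The number of candidates grows as 2^n - 1 with the number of bare n/p atoms, which is part of why guessing the fix is unattractive as a default.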
> Now, where these molecules come from is probably more interesting. Most likely 
> it’s people using the aromaticity models on formats which don’t support it. 
> The MDL model, for example, doesn’t allow lone pair contributions. If you have 
> MarvinSketch, try loading ‘[nH]1cccc1’ and then generating an MDL mol file. 
> You’ll notice they have their own non-portable workaround to ensure the 
> hydrogen is kept. Of course, everyone knows you should never store aromaticity 
> in the mol file :-).
> 
>   Mrv0541 11071317592D          
> 
>   5  5  0  0  0  0            999 V2000
>     1.2964    0.6723    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
>     1.9639    0.1874    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     1.7089   -0.5972    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     0.8839   -0.5972    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     0.6290    0.1874    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>   1  2  4  0  0  0  0
>   2  3  4  0  0  0  0
>   4  5  4  0  0  0  0
>   1  5  4  0  0  0  0
>   3  4  4  0  0  0  0
> M  STY  1   1 DAT
> M  SAL   1  1   1
> M  SDT   1 MRV_IMPLICIT_H                                        
> M  SDD   1     0.0000    0.0000    DR    ALL  0       0  
> M  SED   1 IMPL_H1
> M  END
> 
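To spot mol files like the one above before they reach a SMILES writer, one option is to scan the V2000 bond block for bond type 4 (aromatic). The following is a rough stdlib-only Python sketch, not CDK code; the function name is hypothetical, and the sample data is the atom and bond block from this message with the Sgroup lines left out.

```python
AROMATIC_BOND_TYPE = 4  # V2000 bond type code for "aromatic"

# Header, atom block and bond block as they appear in the thread
# (Sgroup lines omitted; they are not needed for this check).
PYRROLE_MOLFILE = "\n".join([
    "  Mrv0541 11071317592D",
    "",
    "  5  5  0  0  0  0            999 V2000",
    "    1.2964    0.6723    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.9639    0.1874    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.7089   -0.5972    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.8839   -0.5972    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.6290    0.1874    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  4  0  0  0  0",
    "  2  3  4  0  0  0  0",
    "  4  5  4  0  0  0  0",
    "  1  5  4  0  0  0  0",
    "  3  4  4  0  0  0  0",
    "M  END",
])


def aromatic_bonds(molfile):
    """Return (begin, end) atom numbers of every bond stored with type 4."""
    lines = molfile.splitlines()
    # Locate the counts line by its version tag instead of trusting the
    # number of header lines, which mail archives tend to mangle.
    counts_index = next(
        i for i, line in enumerate(lines) if line.rstrip().endswith("V2000")
    )
    counts = lines[counts_index]
    atom_count, bond_count = int(counts[0:3]), int(counts[3:6])
    first_bond = counts_index + 1 + atom_count
    flagged = []
    for line in lines[first_bond:first_bond + bond_count]:
        # V2000 bond lines are fixed width: three columns per field.
        begin, end, bond_type = int(line[0:3]), int(line[3:6]), int(line[6:9])
        if bond_type == AROMATIC_BOND_TYPE:
            flagged.append((begin, end))
    return flagged


if __name__ == "__main__":
    print(aromatic_bonds(PYRROLE_MOLFILE))
```

A non-empty result means the file stores aromaticity instead of a Kekulé structure, so the hydrogen on the nitrogen can no longer be derived from the bond orders alone.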
> 
> Oh, some more examples which are now correctly rejected:
> 
>> C/1.C/C=C/1
>> C-1.C/C=C=1
>> ccc
>> ccccc
>> p1cccc1                      <- generated by older CDK versions!
> 
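The thread does not say why each of these is rejected. One plausible reading of the first two (an assumption, not taken from the CDK source) is that the bond symbols written at the two ends of ring closure 1 disagree: "-" against "=", and "/" against "/" where the closing end would need "\" to describe the same direction. A small Python sketch of that check, with hypothetical function names:

```python
BOND_SYMBOLS = set("-=#$:/\\")
# A directional bond seen from the other end of the ring closure is inverted.
INVERSE_DIRECTION = {"/": "\\", "\\": "/"}


def closure_bonds_agree(opening, closing):
    """True if the bond symbols at both ends of a ring closure are compatible."""
    if not opening or not closing:
        return True  # a symbol on one end only is fine
    if opening in INVERSE_DIRECTION or closing in INVERSE_DIRECTION:
        return INVERSE_DIRECTION.get(opening) == closing
    return opening == closing


def mismatched_ring_closures(smiles):
    """Return the ring-closure numbers whose two bond symbols disagree."""
    open_rings = {}    # ring number -> bond symbol written at the opening
    mismatched = []
    pending_bond = ""  # bond symbol seen immediately before the current token
    index = 0
    while index < len(smiles):
        char = smiles[index]
        if char == "[":
            # Skip bracket atoms whole: digits and '-' inside are not closures.
            index = smiles.index("]", index) + 1
            pending_bond = ""
        elif char in BOND_SYMBOLS:
            pending_bond = char
            index += 1
        elif char.isdigit() or char == "%":
            if char == "%":  # two-digit closure such as %12
                label, index = int(smiles[index + 1:index + 3]), index + 3
            else:
                label, index = int(char), index + 1
            if label in open_rings:
                if not closure_bonds_agree(open_rings.pop(label), pending_bond):
                    mismatched.append(label)
            else:
                open_rings[label] = pending_bond
            pending_bond = ""
        else:
            pending_bond = ""
            index += 1
    return mismatched


if __name__ == "__main__":
    for smiles in ("C/1.C/C=C/1", "C-1.C/C=C=1", "C=1CCCCC=1"):
        print(smiles, mismatched_ring_closures(smiles))
```

This only covers ring-closure bookkeeping. The aromatic examples (ccc, ccccc, p1cccc1) fail for a different reason, since no valid assignment of bond orders exists for them, and that needs a real kekulisation step.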
> 
> Cheers,
> J
> 
> On 7 Nov 2013, at 16:33, Nina Jeliazkova <jeliazkova.n...@gmail.com> wrote:
> 
>> 
>> 
>> 
>> On 7 November 2013 18:26, Nina Jeliazkova <jeliazkova.n...@gmail.com> wrote:
>> 
>> 
>> 
>> On 7 November 2013 18:18, Rajarshi Guha <rajarshi.g...@gmail.com> wrote:
>> It seems 
>> c4ccc2c(cc1=Nc3ncccc3(Cn12))c4
>> 
>> does not parse using the latest CDK master, but does parse fine using 
>> http://apps.ideaconsult.net:8080/ambit2/depict?search=c4ccc2c%28cc1%3DNc3ncccc3%28Cn12%29%29c4&smarts=
>> 
>> I'm not sure what version ambit is using
>> 
>> 
>> cdk 1.4.11
>> 
>> There is also a test version using cdk 1.5.3 (Sep 2013), and it seems to parse 
>> fine:
>> http://apps.ideaconsult.net:8080/bioclipse/depict?search=c4ccc2c%28cc1%3DNc3ncccc3%28Cn12%29%29c4&smarts=
>>  
>> 
>> Nina
>> 
>> Regards,
>> Nina 
>> but could somebody confirm this issue with the latest master?
>> 
>> 
>> -- 
>> Rajarshi Guha | http://blog.rguha.net
>> NIH Center for Advancing Translational Science
>> 
>> ------------------------------------------------------------------------------
>> November Webinars for C, C++, Fortran Developers
>> Accelerate application performance with scalable programming models. Explore
>> techniques for threading, error checking, porting, and tuning. Get the most
>> from the latest Intel processors and coprocessors. See abstracts and register
>> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Cdk-user mailing list
>> Cdk-user@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/cdk-user
>> 
>> 
>> 
> 
> 
> 
> 
> -- 
> Rajarshi Guha | http://blog.rguha.net
> NIH Center for Advancing Translational Science

