Hi,

Here are the results of the shuffle test (10x) for the 5 million
compounds in the eMolecules database. Overall the results are good:
only 33 canonicalization errors remain, and they should be easy to
fix.

Process stops: 3429680, 3429701, 3429702, 3429717, 3429742, 3429767,
3429887, ...  (these are 1-based line numbers in the
eMolecules-2010-03-01.smi file)

3429680: [Li+251] 24639246
3429701: [ClH+276] 24639289
3429702: CCCC[n+251]1cccc(C)c1 24639291
...

I continued testing from 3500000. Any ideas on how to handle this?
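
One possible way to handle this, assuming the implausible charges
(+251, +276) are what stops the process, is to pre-screen the input and
skip or log such lines before they reach the parser. The sketch below is
plain Python, not Open Babel code; the names suspicious_charges and scan
and the limit of 15 are assumptions made for illustration.

```python
import re

# Matches the numeric charge of a bracket atom such as [Li+251] or [n+251];
# an optional atom class (":12") may follow the charge.
CHARGE = re.compile(r"\[[^\]]*?([+-])(\d+)(?::\d+)?\]")

MAX_CHARGE = 15  # assumed sanity limit, not taken from the test code


def suspicious_charges(smiles, limit=MAX_CHARGE):
    """Return the explicit numeric charges whose magnitude exceeds `limit`."""
    charges = (int(sign + digits) for sign, digits in CHARGE.findall(smiles))
    return [c for c in charges if abs(c) > limit]


def scan(lines, limit=MAX_CHARGE):
    """Yield (1-based line number, line) for lines with an out-of-range charge."""
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if fields and suspicious_charges(fields[0], limit):
            yield number, line.rstrip("\n")
```

Passing an open file handle for the .smi file to scan() would report
line numbers in the same 1-based convention used above.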

Segfaults: 1278211, 1278212

S=C1NCCCCCCNC(=S)S[Fe]2SC(=S)NCCCCCCNC(=S)S[Ni]SC(=S)NCCCCCCNC(=S)S[Fe](SC(=S)NCCCCCCNC(=S)S[Ni]S1)SC(=S)NCCCCCCNC(=S)S[Ni]SC(=S)NCCCCCCNC(=S)S2
4315482
S=C1NCCCCCCNC(=S)S[Cr]2SC(=S)NCCCCCCNC(=S)S[Ni]SC(=S)NCCCCCCNC(=S)S[Cr](SC(=S)NCCCCCCNC(=S)S[Ni]S1)SC(=S)NCCCCCCNC(=S)S[Ni]SC(=S)NCCCCCCNC(=S)S2
4315484

I think these have large rings which are not found. We should still be
able to detect ring membership correctly, though, since that is done
with a spanning tree before the SSSR/LSSR analysis. I'll take a look
at this.
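
For reference, the spanning-tree idea can be sketched as follows. This
is a generic illustration in Python, not the Open Babel implementation:
every bond left out of a spanning forest is a ring closure, and the tree
path between its two ends is a cycle, so ring membership does not depend
on ring size. The function name ring_atoms and the (atom count, bond
list) input are assumptions, and the bond list is assumed to contain no
duplicates.

```python
from collections import deque


def ring_atoms(num_atoms, bonds):
    """Return the set of atom indices that lie on at least one cycle."""
    neighbors = [[] for _ in range(num_atoms)]
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Build a BFS spanning forest, recording parent and depth of each atom.
    parent = [None] * num_atoms
    depth = [-1] * num_atoms
    tree_bonds = set()
    for root in range(num_atoms):
        if depth[root] != -1:
            continue
        depth[root] = 0
        queue = deque([root])
        while queue:
            atom = queue.popleft()
            for nbr in neighbors[atom]:
                if depth[nbr] == -1:
                    depth[nbr] = depth[atom] + 1
                    parent[nbr] = atom
                    tree_bonds.add(frozenset((atom, nbr)))
                    queue.append(nbr)

    in_ring = set()
    for a, b in bonds:
        if frozenset((a, b)) in tree_bonds:
            continue
        # Ring closure bond: walk both ends up to their common ancestor.
        while a != b:
            if depth[a] < depth[b]:
                a, b = b, a
            in_ring.add(a)
            a = parent[a]
        in_ring.add(a)
    return in_ring
```

The union of these fundamental cycles covers every ring bond, because
any cycle is a combination of fundamental cycles.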

Canonicalization errors: 33

All errors are the same problem AFAIK. The canonical code does not
consider the H atoms that are added when writing out the SMILES. I can
add this to the canonical code, but I'll probably copy some code for
this from the SMILES format.

Cc1cccc(c1)C(=O)Nc1nnc[nH]1.Cc1cccc(c1)C(=O)Nc1n[nH]cn1 8622926
Cc1cccc(c1)C(=O)Nc1n[nH]cn1.Cc1cccc(c1)C(=O)Nc1nnc[nH]1 8622926

This is not an aromaticity error: the two fragments get identical
canonical codes because the canonical code sees no difference between
n and [nH].
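
The effect can be illustrated with a toy ranking, shown here in Python.
It is a deliberately simplified sketch, not the actual canonical code:
atoms are (symbol, H count) pairs, the invariant is only (symbol,
degree), and ties fall back to input order. The names rank_atoms and
ordered are hypothetical. The test molecule is the C[CH2] / [CH2]C pair
from this report.

```python
def rank_atoms(atoms, bonds, use_hydrogens):
    """Order atom indices by an invariant; ties keep input order (stable sort).

    atoms: list of (symbol, implicit_h_count); bonds: list of (i, j) pairs.
    """
    degree = [0] * len(atoms)
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1

    def invariant(i):
        symbol, hcount = atoms[i]
        key = (symbol, degree[i])
        return key + (hcount,) if use_hydrogens else key

    return sorted(range(len(atoms)), key=invariant)


def ordered(atoms, bonds, use_hydrogens):
    """Return the atoms rearranged into ranked order."""
    return [atoms[i] for i in rank_atoms(atoms, bonds, use_hydrogens)]


# Ethyl radical in two input orders: C[CH2] and [CH2]C.
first = [("C", 3), ("C", 2)]
second = [("C", 2), ("C", 3)]
bonds = [(0, 1)]

# Without H counts both atoms tie, so the result follows the input order.
print(ordered(first, bonds, False) == ordered(second, bonds, False))
# With H counts the tie is broken the same way for both inputs.
print(ordered(first, bonds, True) == ordered(second, bonds, True))
```

A real canonicalizer refines the ranks further, but the same principle
applies: any property the writer emits has to be part of the invariant.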

CC1=CC(=O)c2c(C1=O)c(O)ccc2O.CCC=C(C)C.OC.[CH].C        19231703
CC1=CC(=O)c2c(C1=O)c(O)ccc2O.CCC=C(C)C.OC.C.[CH]        19231703

C[CH]   23745856
[CH]C   23745856

C[CH2]  23745858
[CH2]C  23745858

O[O]    23903986
[O]O    23903986

C1CC[CH][CH]CCC1.C1CCCCC[CH][CH]1.C1[CH][CH]CCCCC1.C1CC[CH][CH]CCC1.[Ir]Cl.[Ir]Cl
       23904497
[CH]1[CH]CCCCCC1.C1C[CH][CH]CCCC1.[CH]1CCCCCC[CH]1.[CH]1[CH]CCCCCC1.[Ir]Cl.[Ir]Cl
       23904497
[CH]1[CH]CCCCCC1.C1C[CH][CH]CCCC1.C1CCC[CH][CH]CC1.[CH]1[CH]CCCCCC1.[Ir]Cl.[Ir]Cl
       23904497
[CH]1CCCCCC[CH]1.C1CCC[CH][CH]CC1.[CH]1CCCCCC[CH]1.C1CCCC[CH][CH]C1.[Ir]Cl.[Ir]Cl
       23904497

C[C]([CH2])[CH2].[CH2][C]([CH2])C.[Pd]Cl.[Pd]Cl 23906874
C[C]([CH2])[CH2].C[C]([CH2])[CH2].[Pd]Cl.[Pd]Cl 23906874
[CH2][C]([CH2])C.C[C]([CH2])[CH2].[Pd]Cl.[Pd]Cl 23906874
C[C]([CH2])[CH2].[CH2][C]([CH2])C.[Pd]Cl.[Pd]Cl 23906874

[CH]1CC[CH][CH]CC[CH]1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.ClCCl.[Rh]
   24631596
[CH]1[CH]CC[CH][CH]CC1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.ClCCl.[Rh]
   24631596
C1C[CH][CH]CC[CH][CH]1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.ClCCl.[Rh]
   24631596
C1C[CH][CH]CC[CH][CH]1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.ClCCl.[Rh]
   24631596

[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)COc2ccccc2)cc(c1OC)OC    26965008
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)COc2ccccc2)cc(c1OC)OC    26965008
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)COc2ccccc2)cc(c1OC)OC    26965008
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)COc2ccccc2)cc(c1OC)OC    26965008
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)COc2ccccc2)cc(c1OC)OC    26965008

[CH]Oc1cc(/C=C/2\C(=O)N=c3n(C2=N)c(cs3)c2ccccc2)cc(c1OC)OC      26965122
[CH]Oc1cc(/C=C/2\C(=O)N=c3n(C2=N)c(cs3)c2ccccc2)cc(c1OC)OC      26965122
[CH]Oc1cc(/C=C/2\C(=O)N=c3n(C2=N)c(cs3)c2ccccc2)cc(c1OC)OC      26965122
[CH]Oc1cc(/C=C/2\C(=O)N=c3n(C2=N)c(cs3)c2ccccc2)cc(c1OC)OC      26965122
[CH]Oc1cc(/C=C/2\C(=O)N=c3n(C2=N)c(cs3)c2ccccc2)cc(c1OC)OC      26965122
[CH]Oc1cc(/C=C/2\C(=O)N=c3n(C2=N)c(cs3)c2ccccc2)cc(c1OC)OC      26965122

[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)C(C)C)cc(c1OC)OC 26965176
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)C(C)C)cc(c1OC)OC 26965176
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)C(C)C)cc(c1OC)OC 26965176
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)C(C)C)cc(c1OC)OC 26965176
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)C(C)C)cc(c1OC)OC 26965176

[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)c2ccccc2C)cc(c1OC)OC     26965734
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)c2ccccc2C)cc(c1OC)OC     26965734
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)c2ccccc2C)cc(c1OC)OC     26965734
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)c2ccccc2C)cc(c1OC)OC     26965734
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)c2ccccc2C)cc(c1OC)OC     26965734

*.FC1(F)Oc2c(O1)cc(c(c2)[N])N   27518948
*.FC1(F)Oc2c(O1)cc(c(c2)N)[N]   27518948
*.FC1(F)Oc2c(O1)cc(c(c2)N)[N]   27518948

CN1CCCC1c1cccnc1.OOOOOO.[CH2]C#CC       27522714
CN1CCCC1c1cccnc1.OOOOOO.CC#C[CH2]       27522714

CCCCC[CH]       29331055
[CH]CCCCC       29331055

CO[C]1[CH]C[C]([CH][CH]1)C.[C-]#[OH2+].[C-]#[OH2+].[C-]#[OH2+].[Fe]     29370482
CO[C]1[CH]C[C]([CH][CH]1)C.[C-]#[OH2+].[C-]#[OH2+].[C-]#[OH2+].[Fe]     29370482
CO[C]1[CH]C[C]([CH][CH]1)C.[C-]#[OH2+].[C-]#[OH2+].[C-]#[OH2+].[Fe]     29370482
CO[C]1[CH]C[C]([CH][CH]1)C.[C-]#[OH2+].[C-]#[OH2+].[C-]#[OH2+].[Fe]     29370482

[CH]1C[CH][CH][CH][CH][CH]1.[OH2+]#[C-].[OH2+]#[C-].[OH2+]#[C-].[Cr]    29371034
[CH]1[CH]C[CH][CH][CH][CH]1.[OH2+]#[C-].[OH2+]#[C-].[OH2+]#[C-].[Cr]    29371034
C1[CH][CH][CH][CH][CH][CH]1.[OH2+]#[C-].[OH2+]#[C-].[OH2+]#[C-].[Cr]    29371034
[CH]1[CH][CH][CH]C[CH][CH]1.[OH2+]#[C-].[OH2+]#[C-].[OH2+]#[C-].[Cr]    29371034

O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br     29450609
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br     29450609
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br     29450609
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br     29450609
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br     29450609
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br     29450609
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br     29450609
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br     29450609

[CH]1[CH]CC[CH][CH]CC1.C[C@@h]1c...@h](p1c1ccccc1p...@h](C)c...@h]1c)C.[Rh]     
29491188
C1C[CH][CH]CC[CH][CH]1.C[C@@h]1c...@h](p1c1ccccc1p...@h](C)c...@h]1c)C.[Rh]     
29491188
C1[CH][CH]CC[CH][CH]C1.C[C@@h]1c...@h](p1c1ccccc1p...@h](C)c...@h]1c)C.[Rh]     
29491188
C1[CH][CH]CC[CH][CH]C1.C[C@@h]1c...@h](p1c1ccccc1p...@h](C)c...@h]1c)C.[Rh]     
29491188

C1[CH][CH]CC[CH][CH]C1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
       29491195
C1C[CH][CH]CC[CH][CH]1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
       29491195
[CH]1CC[CH][CH]CC[CH]1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
       29491195
C1C[CH][CH]CC[CH][CH]1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
       29491195

C1[CH][CH]CC[CH][CH]C1.C[C@@h]1c...@h](P1C1=C(C(=O)OC1=O)p...@h](C)c...@h]1c)C.[Rh]
     29491197
[CH]1CC[CH][CH]CC[CH]1.C[C@@h]1c...@h](P1C1=C(C(=O)OC1=O)p...@h](C)c...@h]1c)C.[Rh]
     29491197
C1[CH][CH]CC[CH][CH]C1.C[C@@h]1c...@h](P1C1=C(C(=O)OC1=O)p...@h](C)c...@h]1c)C.[Rh]
     29491197
[CH]1[CH]CC[CH][CH]CC1.C[C@@h]1c...@h](P1C1=C(C(=O)OC1=O)p...@h](C)c...@h]1c)C.[Rh]
     29491197

[CH]CCCCCCCCCCCCCCC     29536355
CCCCCCCCCCCCCCC[CH]     29536355

C1CCC[CH]1      29538372
[CH]1CCCC1      29538372

[CH]=C  29538463
C=[CH]  29538463

C1[CH]CCCC1     29538482
C1CCC[CH]C1     29538482
[CH]1CCCCC1     29538482
C1C[CH]CCC1     29538482

C/C(=C(/[CH2])\C)/[CH2] 29550750
C/C(=C(/[CH2])\C)/[CH2] 29550750
C/C(=C(/[CH2])\C)/[CH2] 29550750
C/C(=C(/[CH2])\C)/[CH2] 29550750

*.CCN(CCOC(=O)C1(CCCCC1)C1CCCCC1)[C]C   29934806
*.CCN(CCOC(=O)C1(CCCCC1)C1CCCCC1)[C]C   29934806
*.CCN(CCOC(=O)C1(CCCCC1)C1CCCCC1)[C]C   29934806
*.CCN(CCOC(=O)C1(CCCCC1)C1CCCCC1)[C]C   29934806

*.CCN(N(N=O)O)CC.CCC.CC[CH2]    29934822
*.CCN(N(N=O)O)CC.CC[CH2].CCC    29934822
*.CCN(N(N=O)O)CC.CCC.[CH2]CC    29934822
*.CCN(N(N=O)O)CC.CCC.CC[CH2]    29934822

[CH2][C]([CH][C]([CH2])C)C.C[C]([CH][C]([CH2])C)[CH2].[Ru]      30155022
C[C]([CH][C]([CH2])C)[CH2].[CH2][C]([CH][C]([CH2])C)C.[Ru]      30155022

C[C]([CH]CC[CH][C](C)[CH2])[CH2].[CH2][C]([CH]CC[CH][C](C)[CH2])C.Cl[Ru]Cl.Cl[Ru]Cl
     30155024
[CH2][C]([CH]CC[CH][C](C)[CH2])C.C[C]([CH]CC[CH][C](C)[CH2])[CH2].Cl[Ru]Cl.Cl[Ru]Cl
     30155024

O=CNc1c(C)cccc1C.CCCCN1[CH]CCCC1.CC     30155687
O=CNc1c(C)cccc1C.CCCCN1CCCC[CH]1.CC     30155687

[CH]1CC[CH][CH]CC[CH]1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
       30177469
[CH]1CC[CH][CH]CC[CH]1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
       30177469
C1C[CH][CH]CC[CH][CH]1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
       30177469
[CH]1CC[CH][CH]CC[CH]1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
       30177469

C1[CH][CH]CC[CH][CH]C1.Cl[Ru]Cl 30424431
[CH]1CC[CH][CH]CC[CH]1.Cl[Ru]Cl 30424431
C1[CH][CH]CC[CH][CH]C1.Cl[Ru]Cl 30424431
[CH]1[CH]CC[CH][CH]CC1.Cl[Ru]Cl 30424431

Tim

_______________________________________________
OpenBabel-Devel mailing list
OpenBabel-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-devel
