Bugs item #3310783, was opened at 2011-06-02 19:42
Message generated for change (Tracker Item Submitted) made by baoilleach
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=428740&aid=3310783&group_id=40728

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Noel O'Boyle (baoilleach)
Assigned to: Nobody/Anonymous (nobody)
Summary: Aromatic P not recognised in SMILES

Initial Comment:
From Andrew Dalke on list:

Perhaps I'm missing something after staring at fingerprint SMARTS definitions 
for the last few days. I'm validating the MACCS substructure keys from RDKit, 
which are also used in OpenBabel and CDK.

I'm writing a test suite, which will be public when done. (Actually, the tests 
are public now, if you know where the version control repository is.)

I'm having a very difficult time generating an aromatic ring with a "P" in it 
in OpenBabel.

>>> import pybel
>>> pybel.readstring("smi", "c1cccp1").write()
'C1CCCP1\t\n'
>>> pybel.readstring("smi", "c1ccccp1").write()
'C1=CC=NC=P1\t\n'

Since P is in the same group and has the same valence levels as N, I expected 
the first of these to return "c1cccp1", similar to

>>> pybel.readstring("smi", "c1cccn1").write()
'c1ccc[nH]1\t\n'
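
Every string returned by write() in the sessions above ends in '\t\n'. A minimal sketch for separating the SMILES from that suffix, assuming the record layout is "SMILES, tab, title, newline" (which is what the output above looks like, with an empty title). The helper name split_smiles_record is made up for illustration and is not part of pybel:

```python
def split_smiles_record(record):
    """Split one SMILES output record into (smiles, title).

    Assumes the layout 'SMILES<tab>title<newline>'; the title is empty
    for molecules read from a bare SMILES string.
    """
    line = record.rstrip("\n")
    smiles, _, title = line.partition("\t")
    return smiles, title


if __name__ == "__main__":
    # Records copied from the interactive sessions in this report.
    for record in ("C1CCCP1\t\n", "c1ccc[nH]1\t\n"):
        print(split_smiles_record(record))
```

Comparing only the first element of the returned tuple keeps the tab and newline out of any string comparison between toolkits.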


Both RDKit and OEChem have no problem dealing with "c1cccp1" and interpreting it 
as an aromatic ring.

I processed about 50K structures from PubChem to find a number with aromatic 
"p" in them. Since PubChem doesn't have aromaticity information, what I did was 
use another program to perceive the aromaticity. Below I show the RDKit SMILES 
for a structure and the OpenBabel equivalent for it.

You can see that of the 53 structures where RDKit has no problems with a "p" in 
an aromatic ring, 51 of them are converted into aliphatic form by OpenBabel.

Does OpenBabel do this for a chemical reason or a design reason? Perhaps it's 
something subtle about aromaticity perception (which I sadly admit I still 
don't have a good grasp on).

This is with OpenBabel, OBReleaseVersion() '2.3.0', which I built a couple of 
days ago.





                               Andrew
                               da...@dalkescientific.com

Columns are
 column 1: True if an aromatic "p" appears in OpenBabel's SMILES, else False
 column 2: the SMILES string from RDKit
 column 3: the SMILES string from OpenBabel
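
For anyone who wants to recompute column 1 without a toolkit, here is a rough sketch of a check for aromatic phosphorus in a SMILES string. It is not the script used to produce the table. It assumes valid SMILES, treats a lowercase "p" outside brackets as aromatic P, and inside brackets looks only at the element symbol, so that "[Pd+2]" or "[PH-]" is not mistaken for aromatic P. The name has_aromatic_p is made up for illustration:

```python
import re

# Start of a bracket atom: optional isotope digits, then the element symbol.
_BRACKET_SYMBOL = re.compile(r"\d*([A-Za-z][a-z]?)")


def has_aromatic_p(smiles):
    """Return True if the SMILES string contains an aromatic phosphorus."""
    i = 0
    while i < len(smiles):
        if smiles[i] == "[":
            end = smiles.find("]", i)
            if end == -1:
                raise ValueError("unclosed bracket atom in %r" % smiles)
            # Only the symbol decides; charges and H counts are ignored.
            match = _BRACKET_SYMBOL.match(smiles, i + 1, end)
            if match and match.group(1) == "p":
                return True
            i = end + 1
        else:
            # Outside brackets a lowercase "p" can only be aromatic P.
            if smiles[i] == "p":
                return True
            i += 1
    return False


if __name__ == "__main__":
    # One row from the table: RDKit SMILES, then OpenBabel SMILES.
    print(has_aromatic_p("Cc1cp(-c2ccccc2)c(Br)c1C"))
    print(has_aromatic_p("CC1CP(C2CCCCC2)C(Br)C1C"))
```

Applying the function to column 3 of each row should give the flag in column 1, and applying it to column 2 should always give True, provided the trailing '\t\n' does not matter (it contains no "p").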

False 'CCc1c(CC)p(-c2ccccc2)c(-c2ccccc2)c1-c1ccccc1' 
'CCC1C(CC)P(C2CCCCC2)C(C2CCCCC2)C1C1CCCCC1\t\n'
True 
'[W].Cc1np(C([Si](C)(C)C)[Si](C)(C)C)nc1N1CCCCC1.[O+]#[C-].[C-]#[O+].[O+]#[C-].[C-]#[O+].[C-]#[O+]'
 
'[W].Cc1[nH]p(C([Si](C)(C)C)[Si](C)(C)C)nc1N1CCCCC1.[O+]#[C-].[C-]#[O+].[O+]#[C-].[C-]#[O+].[C-]#[O+]\t\n'
True 'Cc1np(C([Si](C)(C)C)[Si](C)(C)C)nc1N1CCCCC1' 
'Cc1[nH]p(C([Si](C)(C)C)[Si](C)(C)C)nc1N1CCCCC1\t\n'
False 
'c1ccc2c(c1)ccc1op(OC(C)CC(C)Op3oc4ccc5ccccc5c4c4c5ccccc5ccc4o3)oc3ccc4ccccc4c3c21'
 
'C1CCC2C(C1)CCC1OP(OC(C)CC(C)OP3OC4CCC5CCCCC5C4C4C5CCCCC5CCC4O3)OC3CCC4CCCCC4C3C21\t\n'
False 'Cc1cp(-c2ccccc2)c(Br)c1C' 'CC1CP(C2CCCCC2)C(Br)C1C\t\n'
False 'CCC(C)(C)c1c2c(pc(C(OC)=O)c1C(OC)=O)CCCCCC2' 
'CCC(C)(C)C1=C2C(=PC(=C1C(=O)OC)C(=O)OC)CCCCCC2\t\n'
False 
'[Zr+2].CCC(C)(C)[c-]1p2[c-](C(CC)(C)C)p12.[CH]1[CH][CH][CH][CH]1.[CH]1[CH][CH][CH][CH]1'
 
'[Zr+2].CCC(C)(C)[C-]1P2=P1[C-]2C(CC)(C)C.[CH]1[CH][CH][CH][CH]1.[CH]1[CH][CH][CH][CH]1\t\n'
False 'Cc1cccc2c1op(OC1COC3C(Op4oc5c(C)cccc5c5c(c(C)ccc5)o4)COC13)oc1c2cccc1C' 
'CC1CCCC2C1OP(OC1COC3C(OP4OC5C(C)CCCC5C5C(C(C)CCC5)O4)COC13)OC1C2CCCC1C\t\n'
False 'c1cc2c(cc1)c(=O)o[p+](=O)o2' 'c1cc2c(cc1)C(=O)O[P+](=O)O2\t\n'
False 'c1ccc(-c2cc(-c3ccccn3)cpc2)nc1' 'c1ccc(C2=CC(=CP=C2)c2ccccn2)nc1\t\n'
False 'c1csc(-c2psc(-c3ccccc3)c2)c1' 'c1csc(C2=PSC(=C2)c2ccccc2)c1\t\n'
False 'CC(Np1oc2ccc3c(cccc3)c2c2c(o1)ccc1c2cccc1)c1ccccc1' 
'CC(NP1OC2CCC3C(CCCC3)C2C2C(O1)CCC1C2CCCC1)C1CCCCC1\t\n'
False 
'[Zr+2].[CH]1[CH][CH][CH][CH]1.[CH]1[CH][CH][CH][CH]1.C1C2CC3CC1CC([c-]1p4[c-](C56CC7CC(CC(C7)C5)C6)p14)(C2)C3'
 
'[Zr+2].[CH]1[CH][CH][CH][CH]1.[CH]1[CH][CH][CH][CH]1.C1C2CC3CC1CC([C-]1P4=P1[C-]4C14CC5CC(CC(C5)C1)C4)(C2)C3\t\n'
False 'c1ccc(P(C2C(Op3oc4ccc5c(cccc5)c4c4c(o3)ccc3c4cccc3)COC2)c2ccccc2)cc1' 
'c1ccc(P(C2C(OP3OC4CCC5C(CCCC5)C4C4C(O3)CCC3C4CCCC3)COC2)C2CCCCC2)cc1\t\n'
False 'Cc1c(C)c(C)p(Cc2ccccc2Cp2c(C)c(C)c(C)c2C)c1C' 
'CC1C(C)C(C)P(CC2CCCCC2CP2C(C)C(C)C(C)C2C)C1C\t\n'
False 'CCCN(C)p1oc2ccc3c(c2c2c(ccc4c2CCCC4)o1)CCCC3' 
'CCCN(C)P1OC2CCC3C(C2C2C(CCC4C2CCCC4)O1)CCCC3\t\n'
False 'c1ccc2c(c1)cc(C)c1op(NN3CCCCC3)oc3c(C)cc4ccccc4c3c21' 
'C1CCC2C(C1)CC(C)C1OP(NN3CCCCC3)OC3C(C)CC4CCCCC4C3C21\t\n'
False 'CCOC(=O)C=C(C)Np1oc2ccc3c(c2c2c(ccc4c2CCCC4)o1)CCCC3' 
'CCOC(=O)C=C(C)NP1OC2CCC3C(C2C2C(CCC4C2CCCC4)O1)CCCC3\t\n'
False 'CCCCN(p1oc2ccc3c(c2c2c(o1)ccc1c2CCCC1)CCCC3)CCCC' 
'CCCCN(P1OC2CCC3C(C2C2C(O1)CCC1C2CCCC1)CCCC3)CCCC\t\n'
False 'c1ccc2c(c1)cccc2CNp1oc2ccc3c(c2c2c(o1)ccc1c2CCCC1)CCCC3' 
'c1ccc2c(c1)cccc2CNP1OC2CCC3C(C2C2C(O1)CCC1C2CCCC1)CCCC3\t\n'
False 'Cc1cc(C)c2op(N(C(C)c3ccccc3)C(C)c3ccccc3)oc3c(C)cc(C)cc3c2c1' 
'CC1CC(C)C2OP(N(C(C)C3CCCCC3)C(C)C3CCCCC3)OC3C(C)CC(C)CC3C2C1\t\n'
False 'COc1cc(C)cc2c1op(N(C(C)c1ccccc1)C(C)c1ccccc1)oc1c(OC)cc(C)cc12' 
'COC1CC(C)CC2C1OP(N(C(C)C1CCCCC1)C(C)C1CCCCC1)OC1C(OC)CC(C)CC21\t\n'
False 
'Cc1cc(C)cc(P(CCOp2oc3c(C(C)(C)C)cc(C)c(C)c3c3c(C)c(C)cc(C(C)(C)C)c3o2)c2cc(C)cc(C)c2)c1'
 
'Cc1cc(C)cc(P(CCOP2OC3C(C(C)(C)C)CC(C)C(C)C3C3C(C)C(C)CC(C(C)(C)C)C3O2)C2CC(C)CC(C)C2)c1\t\n'
False 
'CCN(CC)[p+]1c(P(=S)(c2ccccc2)c2ccccc2)c(-c2ccccc2)cc(-c2ccccc2)c1P(=S)(c1ccccc1)c1ccccc1'
 
'CCN(CC)[P+]1=C(P(=S)(c2ccccc2)c2ccccc2)C(=CC(=C1P(=S)(c1ccccc1)c1ccccc1)c1ccccc1)c1ccccc1\t\n'
False 'c1ccc(CCNp2oc3c(C)cc4ccccc4c3c3c(o2)c(C)cc2ccccc23)nc1' 
'c1ccc(CCNP2OC3C(C)CC4CCCCC4C3C3C(O2)C(C)CC2CCCCC32)nc1\t\n'
False 'CN(C)p1n(S(C)(=O)=O)c2ccc3ccccc3c2c2c(ccc3ccccc23)n1S(C)(=O)=O' 
'CN(C)P1N(S(=O)(=O)C)C2CCC3CCCCC3C2C2C(CCC3CCCCC23)N1S(=O)(=O)C\t\n'
False 'Cc1cc(C)c2op(N(C(C)c3ccccc3)C(C)c3ccccc3)oc3c(C)cc(C)c(C)c3c2c1C' 
'CC1CC(C)C2OP(N(C(C)C3CCCCC3)C(C)C3CCCCC3)OC3C(C)CC(C)C(C)C3C2C1C\t\n'
False 'CC(=C)Cc1cccc2c1op(N(C(C)c1ccccc1)C(C)c1ccccc1)oc1c(CC(C)=C)cccc12' 
'CC(=C)CC1CCCC2C1OP(N(C(C)C1CCCCC1)C(C)C1CCCCC1)OC1C(CC(=C)C)CCCC21\t\n'
False 
'COC1COC(c2ccccc2)OC1C1OC(c2ccccc2)OCC1Op1oc2ccc3ccccc3c2c2c3ccccc3ccc2o1' 
'COC1COC(c2ccccc2)OC1C1OC(c2ccccc2)OCC1OP1OC2CCC3CCCCC3C2C2C3CCCCC3CCC2O1\t\n'
False 
'CC(N(p1n(S(C)(=O)=O)c2ccc3ccccc3c2c2c(ccc3ccccc23)n1S(C)(=O)=O)C(C)c1ccccc1)c1ccccc1'
 
'CC(N(P1N(S(=O)(=O)C)C2CCC3CCCCC3C2C2C(CCC3CCCCC23)N1S(=O)(=O)C)C(C)c1ccccc1)c1ccccc1\t\n'
False 
'CC(C)N(C(C)C)p1n(S(c2ccc(C)cc2)(=O)=O)c2ccc3ccccc3c2c2c(ccc3ccccc23)n1S(c1ccc(C)cc1)(=O)=O'
 
'CC(C)N(C(C)C)P1N(S(=O)(=O)c2ccc(C)cc2)C2CCC3CCCCC3C2C2C(CCC3CCCCC23)N1S(=O)(=O)c1ccc(C)cc1\t\n'
False 
'[Pd+2].[CH2][CH][CH2].FC(F)(F)S([O-])(=O)=O.c1ccc(P(COp2oc3ccc4c(cccc4)c3c3c(o2)ccc2c3cccc2)c2ccccc2)cc1'
 
'[Pd+2].[CH2][CH][CH2].FC(F)(F)S(=O)(=O)[O-].c1ccc(P(COP2OC3CCC4C(CCCC4)C3C3C(O2)CCC2C3CCCC2)C2CCCCC2)cc1\t\n'
False 'c1ccc(P(COp2oc3ccc4c(cccc4)c3c3c(o2)ccc2c3cccc2)c2ccccc2)cc1' 
'c1ccc(P(COP2OC3CCC4C(CCCC4)C3C3C(O2)CCC2C3CCCC2)C2CCCCC2)cc1\t\n'
False 'c1ccc(C2C(Op3oc4ccccc4c4ccccc4o3)CCCC2)cc1' 
'c1ccc(C2C(OP3OC4CCCCC4C4CCCCC4O3)CCCC2)cc1\t\n'
False 'CC(C)(C)Np1oc2ccc3c(c2c2c(ccc4c2CCCC4)o1)CCCC3' 
'CC(C)(C)NP1OC2CCC3C(C2C2C(CCC4C2CCCC4)O1)CCCC3\t\n'
False 'COCCNp1oc2c(C)cc3ccccc3c2c2c(c(C)cc3ccccc32)o1' 
'COCCNP1OC2C(C)CC3CCCCC3C2C2C(C(C)CC3CCCCC23)O1\t\n'
False 
'[Li+].[W].Cc1c[p-]cc1C.[C-]#[O+].[O+]#[C-].[C-]#[O+].[O+]#[C-].[O+]#[C-]' 
'[Li+].[W].CC1C[PH-]CC1C.[C-]#[O+].[O+]#[C-].[C-]#[O+].[O+]#[C-].[O+]#[C-]\t\n'
False 'COCC1N(p2oc3c(C)cc4ccccc4c3c3c(c(C)cc4ccccc43)o2)CCC1' 
'COCC1N(P2OC3C(C)CC4CCCCC4C3C3C(C(C)CC4CCCCC34)O2)CCC1\t\n'

_______________________________________________
OpenBabel-Devel mailing list
OpenBabel-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-devel
