Hi RDKiters,

I recently started using this wonderful package and have been
familiarizing myself with its many capabilities.

The problem I am working on now is to count the bonds of each type in any
given molecule. I can classify the bonds as single, double, triple, or
aromatic; however, in some cases bonds seem to be counted more than once.
For example, with the minimum working example (MWE) given below, ethane
should give only 1 single bond but the output reports 2, and propane should
give 2 single bonds but the output reports 3; in each case one additional
bond is counted. Another example is 1,3-hexadien-5-yne, which has 2 double
bonds and 1 triple bond, but the output reports 3 double bonds.

I believe the problem is that the function I wrote does not account for the
uniqueness of the bonds. How can each bond be counted only once with RDKit?

*MWE*

 from rdkit import Chem
 def n_bonds(molecule):
     molecule = Chem.MolFromSmiles(molecule)
     fuel_bonds = {'single': 0, 'double': 0, 'triple': 0, 'aromatic': 0}
     for atom in molecule.GetAtoms():
         if str(atom.GetBonds()[0].GetBondType()) == 'SINGLE':
             fuel_bonds['single'] += 1
         elif str(atom.GetBonds()[0].GetBondType()) == 'DOUBLE':
             fuel_bonds['double'] += 1
         elif str(atom.GetBonds()[0].GetBondType()) == 'TRIPLE':
             fuel_bonds['triple'] += 1
         elif str(atom.GetBonds()[0].GetBondType()) == 'AROMATIC':
             fuel_bonds['aromatic'] += 1
     return fuel_bonds

 smiles_test = [ 'CC',                          # 1. ethane
                 'CCC',                          # 2. propane
                 'C1CC1',                        # 3. cyclopropane
                 'C1CCC1',                       # 4. cyclobutane
                 'CC(C)(C)C',                    # 5. neopentane
                 'CCCCC1=CC=CC=C1',              # 6. butylbenzene
                 'CCC(CC)CCC(C)C',               # 7. 5-ethyl-2-methylheptane
                 'CCCCCC=CC=CC=C',               # 8. 1,3,5-undecatriene
                 'C=CC=CC#C',                    # 9. 1,3-Hexadien-5-yne
                 'CC#CC#CC#CC=CC=CCCCC=C']       # 10. Hexadeca-1,6,8-triene-10,12,14-triyne

 for i in range(len(smiles_test)):
     f = n_bonds(smiles_test[i])
     print i+1, '-', f

*Output*

 1 - {'double': 0, 'single': 2, 'aromatic': 0, 'triple': 0}
 2 - {'double': 0, 'single': 3, 'aromatic': 0, 'triple': 0}
 3 - {'double': 0, 'single': 3, 'aromatic': 0, 'triple': 0}
 4 - {'double': 0, 'single': 4, 'aromatic': 0, 'triple': 0}
 5 - {'double': 0, 'single': 5, 'aromatic': 0, 'triple': 0}
 6 - {'double': 0, 'single': 5, 'aromatic': 5, 'triple': 0}
 7 - {'double': 0, 'single': 10, 'aromatic': 0, 'triple': 0}
 8 - {'double': 3, 'single': 8, 'aromatic': 0, 'triple': 0}
 9 - {'double': 3, 'single': 2, 'aromatic': 0, 'triple': 1}
 10 - {'double': 3, 'single': 10, 'aromatic': 0, 'triple': 3}
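
A minimal sketch of one way to count each bond exactly once: iterate over the molecule's bonds with GetBonds() instead of over its atoms. In the MWE above, each atom contributes only its first bond, so the totals add up to the number of atoms rather than the number of bonds. The names count_bonds and BOND_KEYS are hypothetical, and the sketch assumes that str(bond.GetBondType()) yields names such as 'SINGLE', as it does in the MWE.

```python
BOND_KEYS = ('single', 'double', 'triple', 'aromatic')  # hypothetical helper name


def count_bonds(mol):
    """Tally the bonds of an RDKit molecule by type, visiting each bond once."""
    counts = dict.fromkeys(BOND_KEYS, 0)
    # mol.GetBonds() yields every bond of the molecule exactly once,
    # so a bond shared by two atoms cannot be counted twice.
    for bond in mol.GetBonds():
        key = str(bond.GetBondType()).lower()  # e.g. 'SINGLE' -> 'single'
        if key in counts:  # bond types outside the four keys are ignored
            counts[key] += 1
    return counts


def n_bonds(smiles):
    """Parse a SMILES string and tally its bonds (requires RDKit)."""
    from rdkit import Chem  # imported here so count_bonds itself has no dependency
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError('could not parse SMILES: %r' % (smiles,))
    return count_bonds(mol)
```

With this version, n_bonds('CCC') should report 2 single bonds, and n_bonds('C=CC=CC#C') should report 2 double, 2 single, and 1 triple bond. Because MolFromSmiles perceives aromaticity by default, the ring bonds of butylbenzene are reported as 6 aromatic bonds rather than alternating single and double bonds.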

Regards,
Nimal