Dear Cheng,
On Sun, Feb 22, 2009 at 9:33 PM, Cheng Wang <[email protected]> wrote:
>
> Nowadays we use some large detailed mechanisms to study combustion
> behavior. These mechanisms normally involve hundreds (sometimes over
> 1000) species including a lot of large hydrocarbons (more than 6 Cs).
> Because some of these mechanisms are generated semi-automatically,
> they include reaction pathways of many isomers. So one way to make
> the simulation run faster is to reduce the mechanism by creating
> pseudo-species representing all isomers of the same species family.
> Then the reaction pathways involving these isomers are combined
> through lumping process. My plan is to use RDKit to identify the
> isomers among the species.

OK, I think I have it now. You have a set of molecules and you would
like to group together the ones that have the same chemical formula.
As it happens, the RDKit does not have a function to generate the
chemical formula for a molecule, so one would need to write it from
scratch. Here's a simple (and relatively untested) way of doing this:
#----------------------------
import collections
from rdkit import Chem

def ChemicalFormula(mol):
    """ Return a molecule's chemical formula as a string.

    Elements are listed alphabetically; charges are not shown.

    >>> ChemicalFormula(Chem.MolFromSmiles('CC'))
    'C2H6'
    >>> ChemicalFormula(Chem.MolFromSmiles('C(=O)O'))
    'CH2O2'
    >>> ChemicalFormula(Chem.MolFromSmiles('C(=O)[O-]'))
    'CHO2'
    >>> ChemicalFormula(Chem.MolFromSmiles('C(=O)'))
    'CH2O'
    """
    cnts = collections.defaultdict(int)
    for atom in mol.GetAtoms():
        # count the atom itself
        cnts[atom.GetSymbol()] += 1
        # add the Hs attached to it; skip zero so that H-free
        # molecules (e.g. CO2) do not pick up a spurious 'H'
        hs = atom.GetTotalNumHs()
        if hs:
            cnts['H'] += hs
    res = ''
    # sorted() works on Python 2 and 3; keys().sort() fails on 3
    for k in sorted(cnts):
        res += k
        if cnts[k] > 1:
            res += str(cnts[k])
    return res
#----------------------------
For your purposes, this could be simplified a bit since you don't
really need the result as a string, but assuming I understood what you
want to do correctly, this should get you started.
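
The simplification I mean might look something like the sketch below
(again, relatively untested). The names composition_key and
group_isomers are made up for illustration, and I am assuming that
"isomer" here just means "same element counts". It only relies on the
atom methods already used in ChemicalFormula:

```python
import collections

def composition_key(mol):
    """Hashable element-count key for a molecule.

    Uses only mol.GetAtoms(), atom.GetSymbol() and
    atom.GetTotalNumHs(), so no formula string is ever built.
    """
    cnts = collections.Counter()
    for atom in mol.GetAtoms():
        cnts[atom.GetSymbol()] += 1
        hs = atom.GetTotalNumHs()
        if hs:
            cnts['H'] += hs
    # a sorted tuple of (element, count) pairs can be a dict key
    return tuple(sorted(cnts.items()))

def group_isomers(named_mols):
    """Map each composition key to the names of the species sharing it.

    named_mols is an iterable of (name, mol) pairs (an assumed input
    format, pick whatever suits your mechanism files).
    """
    groups = collections.defaultdict(list)
    for name, mol in named_mols:
        groups[composition_key(mol)].append(name)
    return dict(groups)
```

With the RDKit you would feed it pairs such as
(smi, Chem.MolFromSmiles(smi)) for each SMILES in your species list;
any key that ends up with more than one name is a candidate group for
lumping.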
Regards,
-greg