Dear Cheng,

On Sun, Feb 22, 2009 at 9:33 PM, Cheng Wang <[email protected]> wrote:
>
> Nowadays we use some large detailed mechanisms to study combustion
> behavior.  These mechanisms normally involve hundreds (sometimes over
> 1000) species including a lot of large hydrocarbons (more than 6 Cs).
> Because some of these mechanisms are generated semi-automatically,
> they include reaction pathways of many isomers.  So one way to make
> the simulation run faster is to reduce the mechanism by creating
> pseudo-species representing all isomers of the same species family.
> Then the reaction pathways involving these isomers are combined
> through lumping process. My plan is to use RDKit to identify the
> isomers among the species.

Ok, I think I have it now. You have a set of molecules and you would
like to group together ones that have the same chemical formula.

Somehow it has happened that the RDKit does not have a function to
generate the chemical formula for a molecule, so one would need to
write it from scratch. Here's a simple (and relatively untested) way
of doing this:

#----------------------------
import collections
from rdkit import Chem
def ChemicalFormula(mol):
  """ A molecule's chemical formula

  >>> ChemicalFormula(Chem.MolFromSmiles('CC'))
  'C2H6'
  >>> ChemicalFormula(Chem.MolFromSmiles('C(=O)O'))
  'CH2O2'
  >>> ChemicalFormula(Chem.MolFromSmiles('C(=O)[O-]'))
  'CHO2'
  >>> ChemicalFormula(Chem.MolFromSmiles('C(=O)'))
  'CH2O'

  """
  cnts=collections.defaultdict(int)
  for atom in mol.GetAtoms():
    symb = atom.GetSymbol()
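    hs = atom.GetTotalNumHs()
    cnts[symb]+=1
    if hs:  # skip zero counts so H-free molecules don't get a stray 'H'
      cnts['H']+=hs
  ks = sorted(cnts)  # dict.keys() has no sort() method in Python 3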
  res=''
  for k in ks:
    res+=k
    if cnts[k]>1:
      res+=str(cnts[k])
  return res
#----------------------------

For your purposes, this could be simplified a bit since you don't
really need the result as a string, but assuming I understood what you
want to do correctly, this should get you started.
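
A minimal sketch of that simplification, for illustration only: instead
of building a string, the element counts are turned into a hashable
tuple and used directly as a dictionary key for grouping. The names
composition_key and group_by_composition, and the (name, mol) input
layout, are assumptions made for this sketch and are not part of the
RDKit. The molecule is only assumed to provide GetAtoms(), GetSymbol()
and GetTotalNumHs(), as in the function above.

```python
import collections

def composition_key(mol):
  """Hashable element-count key, e.g. (('C', 2), ('H', 6)) for ethane.

  `mol` is assumed to be an RDKit molecule; only GetAtoms(),
  GetSymbol() and GetTotalNumHs() are called on it.
  """
  cnts = collections.Counter()
  for atom in mol.GetAtoms():
    cnts[atom.GetSymbol()] += 1
    hs = atom.GetTotalNumHs()
    if hs:  # do not create an 'H' entry with a zero count
      cnts['H'] += hs
  # Sort so that the key does not depend on atom ordering.
  return tuple(sorted(cnts.items()))

def group_by_composition(named_mols):
  """Map each composition key to the names of the species sharing it.

  `named_mols` is an iterable of (name, mol) pairs (a hypothetical
  input layout chosen for this sketch).
  """
  groups = collections.defaultdict(list)
  for name, mol in named_mols:
    groups[composition_key(mol)].append(name)
  return dict(groups)
```

With the RDKit available, passing [('n-butane', Chem.MolFromSmiles('CCCC')),
('isobutane', Chem.MolFromSmiles('CC(C)C'))] should put both names under
the key (('C', 4), ('H', 10)). Note that the key ignores charge and
connectivity, so it groups everything with the same element counts; a
narrower definition of "species family" would need an extra criterion.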

Regards,
-greg
