Nicolas,

> For example, each pattern that matches the following is
> alanine: a C bonded to CH3, NH2 and COOH.
> The algorithm must also deal with peptide bonds and disulfide bonds.
> It must also work when the H atoms are not in the xyz file.

You're on the right track. In fact, most informatics software houses (including DELSCI) have invested considerable time and effort in developing code capable of recognizing such patterns and then acting upon them.

The most standards-compliant approach would be to encode all such rules as a set of SMARTS queries

<http://www.daylight.com/dayhtml_tutorials/languages/smarts/index.html>

that can detect and assign identifiers, chemical atom types, bond orders, formal charges, hydrogens, and atomic properties starting from a connected chiral graph of atomic types. An open-source library of code and patterns for doing this would certainly advance the industry as a whole!

CDK and JOELib should both be up to the task, but there may be considerable work to be done in constructing a workable pattern hierarchy to successfully perform such assignments based purely on X, Y, Z and element type.

PyMOL does something similar to this via an iterative scheme wherein it first detects bonds based on distance, sets bond orders based on identifiers (if recognized), assigns 2D chiralities based on 3D coordinates, and then uses residue-based chemical patterns (not SMARTS-compliant) for everything else in the above list. It isn't quite what you need, but the content linked below might help get you started...

<http://cvs.sourceforge.net/viewcvs.py/pymol/pymol/modules/chempy/champ/formal_charges.py?view=markup>

<http://cvs.sourceforge.net/viewcvs.py/pymol/pymol/modules/chempy/champ/amber99.py?view=markup>

Cheers,
Warren

--
Warren L. DeLano, Ph.D.
Principal Scientist

. DeLano Scientific LLC
. 400 Oyster Point Blvd., Suite 213
. South San Francisco, CA 94080
. Biz:(650)-872-0942  Tech:(650)-872-0834
. Fax:(650)-872-0273  Cell:(650)-346-1154
mailto:war...@delsci.com

> -----Original Message-----
> From: jmol-developers-ad...@lists.sourceforge.net
> [mailto:jmol-developers-ad...@lists.sourceforge.net] On
> Behalf Of Nicolas Vervelle
> Sent: Monday, February 07, 2005 12:13 PM
> To: jmol-develop...@lists.sourceforge.net
> Subject: Re: [Jmol-developers] structure detection?
>
> From: "Miguel" <mig...@jmol.org>
> > The structure detection that is needed for FAH is at a level below this.
> > The .xyz files do not have the amino acids identified. Therefore, we
> > would need code which identifies amino acids from individual atoms &
> > their bonds.
> >
> > I am sure that this is an interesting problem and that people have
> > worked on it. It is essentially a pattern match on sub-structures.
>
> My knowledge in chemistry is somewhat limited so I may be wrong, but it
> doesn't seem so much work to write code to identify amino acids from
> atoms and bonds.
> For example, each pattern that matches the following is
> alanine: a C bonded to CH3, NH2 and COOH.
> The algorithm must also deal with peptide bonds and disulfide bonds.
> It must also work when the H atoms are not in the xyz file.
>
> > It seems to me that it is the kind of thing that warrants a literature
> > search prior to starting in on an implementation. It also seems to me
> > that it would be a great project for a graduate student.
>
> If I have time this weekend, I will have a look at this and
> maybe make a few tests.
>
> Nicolas
>
> _______________________________________________
> Jmol-developers mailing list
> jmol-develop...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/jmol-developers
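
To make the first two steps discussed in this thread concrete (detect bonds from interatomic distances, then match a residue pattern that does not rely on hydrogens), here is a minimal Python sketch. It is not PyMOL's, CDK's, or JOELib's code. The function names, the radii table, the 0.45 A tolerance, and the sample coordinates are all assumptions chosen for illustration, and the pattern covers only an isolated alanine: peptide bonds, disulfide bonds, bond orders, and chirality are not handled.

```python
import math
from itertools import combinations

# Approximate single-bond covalent radii in angstroms (illustrative values).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "S": 1.05}
TOLERANCE = 0.45  # assumed slack added to the sum of the two radii

# Illustrative heavy-atom coordinates for one alanine (angstroms), no hydrogens.
ALANINE_HEAVY_ATOMS = [
    ("N", -0.966, 0.493, 1.500),
    ("C", 0.257, 0.418, 0.692),    # alpha carbon
    ("C", -0.094, 0.017, -0.716),  # carboxyl carbon
    ("O", -1.056, -0.682, -0.923),
    ("C", 1.204, -0.620, 1.296),   # methyl carbon
    ("O", 0.661, 0.439, -1.742),
]


def detect_bonds(atoms):
    """Return {atom index: set of bonded atom indices} from distances alone.

    `atoms` is a list of (element, x, y, z) tuples, as read from an .xyz file.
    """
    neighbors = {i: set() for i in range(len(atoms))}
    for (i, a), (j, b) in combinations(enumerate(atoms), 2):
        cutoff = COVALENT_RADII[a[0]] + COVALENT_RADII[b[0]] + TOLERANCE
        if math.dist(a[1:], b[1:]) <= cutoff:
            neighbors[i].add(j)
            neighbors[j].add(i)
    return neighbors


def heavy_neighbors(atoms, neighbors, i):
    """Bonded atoms of atom i, ignoring hydrogens (they may be absent)."""
    return [j for j in neighbors[i] if atoms[j][0] != "H"]


def find_alanine_alpha_carbons(atoms, neighbors):
    """Indices of carbons bonded to one N, one carboxyl C, and one methyl C."""
    hits = []
    for i, atom in enumerate(atoms):
        if atom[0] != "C":
            continue
        heavy = heavy_neighbors(atoms, neighbors, i)
        if len(heavy) != 3:
            continue
        nitrogens = [j for j in heavy if atoms[j][0] == "N"]
        carbons = [j for j in heavy if atoms[j][0] == "C"]
        if len(nitrogens) != 1 or len(carbons) != 2:
            continue
        # Carboxyl/carbonyl carbon: has at least one oxygen neighbor.
        carbonyl = [j for j in carbons
                    if any(atoms[k][0] == "O" for k in neighbors[j])]
        # Methyl carbon: the alpha carbon is its only heavy neighbor.
        methyl = [j for j in carbons
                  if heavy_neighbors(atoms, neighbors, j) == [i]]
        if len(carbonyl) == 1 and len(methyl) == 1:
            hits.append(i)
    return hits
```

Calling `find_alanine_alpha_carbons(ALANINE_HEAVY_ATOMS, detect_bonds(ALANINE_HEAVY_ATOMS))` should flag only the alpha carbon. A hand-written matcher like this grows quickly once more residues are added, which is the argument for expressing the rules as SMARTS queries and letting a toolkit do the matching.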