On Mon, Jun 21, 2010 at 6:43 PM, Craig A. James <[email protected]> wrote:
> The recent discussion of SSSR bugs prompted me to dig back through my emails
> to one I wrote on 20 November 2007 to the BlueObelisk mailing list. Here it
> is in its entirety.
>
> Craig
>
> --------------------------------
>
>
> Andrew suggested an algorithmic description of aromaticity. I think this is
> a really good idea.
>
> I propose, for argument's sake, that we should be using a LSSR, "Largest Set
> of Smallest Rings." It would go like this: A breadth-first search for rings,
> which terminates when all ring atoms have been included in at least one ring,
> but only after ALL cyclic paths of a given size have been enumerated. Since
> the LSSR is computed specifically for aromaticity detection, it would stop at
> some reasonable size, say 8-atom rings.
>
> For example, cubane's LSSR would have six (not five or four) 4-atom rings,
> because when you get the the four-atom-ring step of the algorithm, you don't
> stop until you've added all four-atom cyclic paths. Similarly, the LSSR for
> a prism (three square faces joined in a triangle) would have five rings (two
> 3-atom rings and three 4-atom rings, whereas the SSSR for that structure
> would only have four rings).
Implementing this shouldn't be to hard. Changing the last loop in
OBRingSearch should be all. It currently checks if a ring is contained
in the union of all other rings. Checking with a single ring should
give the behavior you describe (assuming we don't check frj). The
SSSR should still be available though. Is there a reference for the
LSSR or is it a suggestion for a name? If there is no reference there
might be some unforeseen cases. For example, in cubane there are 3
6-atom-rings not fully overlapping with a single 4-atom-ring. How do
we decide if the ring should not go in the set? (note: if you check
all 4-rings, the 6th 4-ring wouldn't be found) This may not be
important for aromaticity though.
> Unlike SSSR, this is a very easy algorithm to code, and easy to explain. It's
> computation time is O(R) where R is the number of rings, and there is no
> arbitrariness to the LSSR; no matter how you compute it, you always get the
> same answer.
>
> For most "ordinary" molecules, the LSSR is the same as the SSSR. The LSSR
> and SSSR only differ with "cage" structures, where the rings themselves form
> rings, i.e. when there exists a set of R rings that "covers" all atoms, but R
> is fewer than the number of bonds that must be broken to make the structure
> acyclic. And in these cases, the choice of rings for the SSSR is arbitrary,
> whereas the LSSR is not.
>
> For aromaticity detection, this gives you an algorithmic description of rings
> that is very suitable for the aromaticity definition. It would go something
> like this:
>
> Compute the LSSR
> Sort the LSSR list smallest-to-largest
> for each atom in the molecule that isn't marked "aromatic" yet {
> for each ring in the LSSR (ordered smallest to biggest) that atom is in {
> if that ring is aromatic {
> mark all of its atoms "aromatic"
> }
> else {
> for each fused neighbor ring in the LSSR {
> if the two rings together are aromatic {
> mark all atoms in the two rings aromatic
> }
> else {
> for each ring of the LSSR fused to either of the pair {
> if the three rings together are aromatic {
> mark all atoms in all three rings aromatic
> }
> }
> }
> }
> }
> }
>
>
> I think that's about it. If I understand the chemistry, you never have to go
> to four rings (more than 14 atoms) to discover aromaticity. Or we could just
> decide that was the OpenSMILES rule, even if there are obscure, rare cases
> for which it isn't true.
>
> In extreme cases (fullerenes), this algorithm is still fast, O(nrings),
> because you stop at three rings. Something like 760 steps per ring in a
> fullerene, but it's still linear, not polynomial or exponential. And it's
> only something like 760 steps if the thing turns out to be completely
> non-aromatic. In a real fullerene, it would decide all the atoms were
> aromatic in the first pass of the outer loop, and be finished.
>
> Craig
>
> ------------------------------------------------------------------------------
> ThinkGeek and WIRED's GeekDad team up for the Ultimate
> GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
> lucky parental unit. See the prize list and enter to win:
> http://p.sf.net/sfu/thinkgeek-promo
> _______________________________________________
> OpenBabel-Devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/openbabel-devel
>
------------------------------------------------------------------------------
ThinkGeek and WIRED's GeekDad team up for the Ultimate
GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
lucky parental unit. See the prize list and enter to win:
http://p.sf.net/sfu/thinkgeek-promo
_______________________________________________
OpenBabel-Devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openbabel-devel