OK, final (?) report. Always a good sign when code simplifies. I've
tightened up the algorithm, which passes all pertinent Chapter 9 tests,
plus several more. After removing unnecessary code, Jmol's CIP
implementation is now back to 970 lines, even with added (minimal) Kekule
considerations and full R/S, seqCis/seqTrans, and M/P mixing for Rules 4b
and 5. It would not take much more to add helicene identification.

I'm happy to report that the following simplified pseudocode is sufficient.
There is nothing magical here. The successful one-pass use of auxiliary
descriptors and finite digraphs should put to rest any concerns that CIP
determination of stereochemical descriptors has cyclical dependencies or
routinely blows up, except perhaps for structures with too many atoms and
too much symmetry. Every program will have its limits in this regard, of
course, and I think this algorithm could certainly be made more efficient.
In any case, the algorithm I have implemented demonstrates that a single
pass through all eight rules (1a, 1b, 2, 3, 4a, 4b, 4c, and 5) is enough;
once through does it.
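
To make "once through does it" concrete, here is a minimal Java sketch of a single ordered pass over the eight rules. It is a toy, not Jmol's code: the names OnePassRules, Sub, MODELLED, and rank are hypothetical, only Rule 1a (higher atomic number first) and Rule 2 (higher mass number first) are modelled, and the substituents are flat records rather than branches of a digraph. The other six rules are simply skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

final class OnePassRules {
    // The eight rules, in the order they are applied.
    enum Rule { RULE_1A, RULE_1B, RULE_2, RULE_3, RULE_4A, RULE_4B, RULE_4C, RULE_5 }

    /** Hypothetical toy substituent holding only what Rules 1a and 2 need. */
    static final class Sub {
        final String label;
        final int atomicNumber;
        final int massNumber;
        Sub(String label, int atomicNumber, int massNumber) {
            this.label = label;
            this.atomicNumber = atomicNumber;
            this.massNumber = massNumber;
        }
    }

    // Assumption: only two rules are modelled; the rest are skipped below.
    static final Map<Rule, Comparator<Sub>> MODELLED = new EnumMap<>(Rule.class);
    static {
        MODELLED.put(Rule.RULE_1A,
            Comparator.comparingInt((Sub s) -> s.atomicNumber).reversed());
        MODELLED.put(Rule.RULE_2,
            Comparator.comparingInt((Sub s) -> s.massNumber).reversed());
    }

    /**
     * Sorts subs in place, highest priority first. Returns the rule that
     * completed the ranking, or null if ties survive every modelled rule.
     */
    static Rule rank(List<Sub> subs) {
        Comparator<Sub> order = (a, b) -> 0;      // nothing decided yet
        for (Rule rule : Rule.values()) {         // one pass, fixed order
            Comparator<Sub> c = MODELLED.get(rule);
            if (c == null) continue;              // rule not modelled here
            order = order.thenComparing(c);       // later rules only break ties
            subs.sort(order);
            if (fullyRanked(subs, order)) return rule;
        }
        return null;
    }

    /** True if no two neighbours in the sorted list are still tied. */
    static boolean fullyRanked(List<Sub> sorted, Comparator<Sub> order) {
        for (int i = 1; i < sorted.size(); i++)
            if (order.compare(sorted.get(i - 1), sorted.get(i)) == 0) return false;
        return true;
    }

    public static void main(String[] args) {
        List<Sub> subs = new ArrayList<>(Arrays.asList(
            new Sub("H", 1, 1), new Sub("D", 1, 2),
            new Sub("C", 6, 12), new Sub("Cl", 17, 35)));
        System.out.println(rank(subs));
    }
}
```

The point of the sketch is only the control flow: each rule refines the order left by the rules before it, and the loop stops at the first rule that leaves no ties. If ties survive every rule, the center gets no descriptor.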

There are some nuances that are problematic -- a dependency on generating
multiple Kekule models, and a problem with Rule 1b as currently stated
actually introducing its own Kekule problems. But Rule 4b looks to me now
to be no major problem. A bit complicated, for sure, but not so bad in the
end. And I'm sure John will find some issues with what I have here. The
key, as Peter Murray-Rust mentioned, is the construction of a very good set
of test structures. There's more testing to do. I don't believe the
structures that are in papers or in the IUPAC 2013 Blue Book are nearly
enough to cover the bases. So I have no doubt that I missed something here,
with the limited number of examples available to me. But I'm confident that
the overall strategy is sound, and that additional issues will be minor.

Take a look; let me know what you think. Thanks again for all the great
comments and especially for great test cases.

Bob

  //  getChirality(molecule) {
  //    prefilterAtoms()
  //    checkForAlkenes()
  //    checkForSmallRings()
  //    checkForBridgeheadNitrogens()
  //    checkForKekuleIssues()
  //    checkForAtropisomerism()
  //    for(all filtered atoms) getAtomChirality(atom)
  //    if (haveAlkenes) {
  //      for(all double bonds) getBondChirality(a1, a2)
  //      removeUnnecessaryEZDesignations()
  //    }
  //  }
  //
  // getAtomChirality(atom) {
  //   for (each Rule){
  //     sortSubstituents()
  //     if (done) return checkHandedness();
  //   }
  //   return NO_CHIRALITY
  // }
  //
  //  getBondChirality(a1, a2) {
  //    atop = getAlkeneEndTopPriority(a1)
  //    btop = getAlkeneEndTopPriority(a2)
  //    return (atop >= 0 && btop >= 0 ? getEneChirality(atop, a1, a2, btop)
  //        : NO_CHIRALITY)
  //  }
  //
  // sortSubstituents() {
  //   for (all pairs of substituents a1 and a2) {
  //     score = a1.compareTo(a2, currentRule)
  //     if (score == TIED)
  //       score = breakTie(a1,a2)
  //   }
  // }
  //
  // breakTie(a,b) {
  //    score = compareShallowly(a, b)
  //    if (score != TIED) return score
  //    a.sortSubstituents(), b.sortSubstituents()
  //    return compareDeeply(a, b)
  // }
  //
  // compareShallowly(a, b) {
  //    for (each substituent pairing i in a and b) {
  //      score = applyCurrentRule(a_i, b_i)
  //      if (score != TIED) return score
  //    }
  //    return TIED
  // }
  //
  // compareDeeply(a, b) {
  //    bestScore = Integer.MAX_VALUE
  //    for (each substituent pairing i in a and b) {
  //      bestScore = min(bestScore, breakTie(a_i, b_i))
  //    }
  //    return bestScore
  // }
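
The breakTie / compareShallowly / compareDeeply recursion above can be sketched in runnable Java roughly as follows. This is a toy under stated assumptions, not the Jmol implementation: the TieBreakSketch and Node names are hypothetical, the "current rule" is reduced to comparing atomic number, and I am assuming that a score's sign says who wins while its magnitude records the depth at which the difference was found, so that taking the "minimum" means preferring the difference closest to the root.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class TieBreakSketch {
    static final int TIED = 0;

    /** Hypothetical toy node; z (atomic number) is all the current rule sees. */
    static final class Node {
        final int z;
        final List<Node> subs = new ArrayList<>();

        Node(int z, Node... children) {
            this.z = z;
            for (Node c : children) subs.add(c);
            // Shallow pre-sort, highest z first, so pairing by index is meaningful.
            subs.sort(Comparator.comparingInt((Node n) -> n.z).reversed());
        }

        /** Full sort: sibling ties are broken by looking into their branches. */
        void sortSubstituents() {
            subs.sort(TieBreakSketch::compare);
        }
    }

    static Node h() { return new Node(1); }

    /** Negative: a ranks higher. Positive: b ranks higher. Zero: tied. */
    static int applyCurrentRule(Node a, Node b) {
        return Integer.compare(b.z, a.z);
    }

    static int compare(Node a, Node b) {
        int score = applyCurrentRule(a, b);
        return score != TIED ? score : breakTie(a, b, 1);
    }

    static int breakTie(Node a, Node b, int depth) {
        int score = compareShallowly(a, b, depth);
        if (score != TIED) return score;
        a.sortSubstituents();
        b.sortSubstituents();
        return compareDeeply(a, b, depth);
    }

    static int compareShallowly(Node a, Node b, int depth) {
        int n = Math.min(a.subs.size(), b.subs.size());
        for (int i = 0; i < n; i++) {
            int score = applyCurrentRule(a.subs.get(i), b.subs.get(i));
            if (score != TIED) return score * depth;
        }
        // Assumption: a missing substituent loses, like a zero-weight placeholder.
        return Integer.compare(b.subs.size(), a.subs.size()) * depth;
    }

    static int compareDeeply(Node a, Node b, int depth) {
        int best = TIED;
        // Sizes are equal here, because a shallow tie implies equal counts.
        for (int i = 0; i < a.subs.size(); i++) {
            int score = breakTie(a.subs.get(i), b.subs.get(i), depth + 1);
            // Keep the difference found nearest the root; first found wins a draw.
            if (score != TIED && (best == TIED || Math.abs(score) < Math.abs(best)))
                best = score;
        }
        return best;
    }

    public static void main(String[] args) {
        Node a = new Node(6, new Node(6, new Node(17), h(), h()), h(), h());
        Node b = new Node(6, new Node(6, h(), h(), h()), h(), h());
        System.out.println(compare(a, b) < 0 ? "a ranks higher" : "a does not rank higher");
    }
}
```

In main, a is a toy -CH2-CH2Cl branch and b a toy -CH2-CH3 branch. They tie on their own atoms and on their immediate substituents, so the decision is made one level further out, where Cl beats H. The recursion terminates because each call works on strictly smaller subtrees of a finite tree.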