Re: [BlueObelisk-discuss] Fwd: Cahn-Ingold-Prelog rules into Jmol

John Mayfield Mon, 10 Apr 2017 13:57:43 -0700

>
> I'm pretty sure it's 1R 5R.

1) Firstly, there is only one stereocentre, so how do you name two?
2) What did you get for the other test case? That one checks that you have
the rank ordering for atomic masses right.

> CC[C@@](CO)([H])[14CH2]C

3) I'm aware of bugs in various SMILES readers; for example, ACD/ChemSketch
doesn't read SMILES stereo correctly on the first atom. People also mess up
the ring-closure semantics. To eliminate that possibility, can you confirm
these all give you the same structure:

> [C@H](C)(O)CC
> C[C@@H](O)CC
> C[C@@H]12.O1.C2C
> C[C@H]21.O1.C2C
> [C@H]312.O1.C2C.C3

I'll draw out the full digraph tomorrow if we can't work it out from these
tests.
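As a rough sanity check that doesn't depend on any SMILES reader, the five
strings above can be compared by permutation parity. This is only a sketch:
the neighbour orders below are transcribed by hand following the usual
SMILES conventions (an implicit H is counted right after the preceding atom,
or first if the stereocentre starts the string; ring-closure digits count in
the order they are written on the atom), and the labels "H", "Me", "O" and
"Et" are made up for illustration. Swapping two neighbours flips @ to @@, so
an odd permutation relative to a reference order inverts the winding.

```python
# Neighbour order around the stereocentre for each SMILES, transcribed by
# hand (an assumption of this sketch, not the output of a parser).
# "H" = implicit hydrogen, "Me" = methyl, "O" = hydroxyl oxygen, "Et" = ethyl.
CASES = {
    "[C@H](C)(O)CC":      ("@",  ["H", "Me", "O", "Et"]),
    "C[C@@H](O)CC":       ("@@", ["Me", "H", "O", "Et"]),
    "C[C@@H]12.O1.C2C":   ("@@", ["Me", "H", "O", "Et"]),
    "C[C@H]21.O1.C2C":    ("@",  ["Me", "H", "Et", "O"]),
    "[C@H]312.O1.C2C.C3": ("@",  ["H", "Me", "O", "Et"]),
}

# Arbitrary reference order used to compare all inputs.
REFERENCE = ["H", "Me", "O", "Et"]


def permutation_is_odd(order, reference):
    """True if `order` is an odd permutation of `reference`."""
    idx = [reference.index(x) for x in order]
    inversions = sum(
        1
        for i in range(len(idx))
        for j in range(i + 1, len(idx))
        if idx[i] > idx[j]
    )
    return inversions % 2 == 1


def winding_in_reference_order(winding, order, reference=REFERENCE):
    """Re-express a SMILES winding as if the neighbours were in `reference` order."""
    flip = {"@": "@@", "@@": "@"}
    return flip[winding] if permutation_is_odd(order, reference) else winding


normalised = {
    smi: winding_in_reference_order(winding, order)
    for smi, (winding, order) in CASES.items()
}
for smi, winding in normalised.items():
    print(f"{smi:20s} -> {winding}")
```

If every line prints the same winding, the strings describe the same
configuration; a reader that disagrees on any of them is probably mishandling
the first-atom or ring-closure case.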

To answer some other parts of the discussion.

> 1) Has anyone taken the CIP rules and rewritten them as formal logic (and
> machine readable) rules?

With regards to the formal logic encoding, the rules are well documented by
the original papers and a formal IUPAC document (Chapter 9 [1]). On top of
that, there is a paper by Paulina Mata [2] that provides a structured flow
chart of program logic. There are "holes" in the rules (see Handbook of
Cheminf, Chapter 6 [3]) but there have been additions; a new one, Rule 1b,
was added recently [1]. In general the algorithm has (bad) exponential run
time for even small cases... it really is quite poor based on current
computer science knowledge. If you want to exchange structures, try to
avoid CIP rules, i.e. InChI and SMILES are preferable for exchanging
information. Only use CIP if you really need it, which IMO means
name-to-structure or structure-to-name.

> 2) Does anyone involved with the development of the current InChI
> specification have any comments to make about its implementation of the CIP
> rules and how that will need to change in the current InChI Trust project
> to do a better job with encoding stereo centers?

The InChI doesn't and shouldn't need CIP. You can rank order ligands with
any method and use that as your canonical identifier. For example, if I
generate a canonical SMILES (yes, there are different implementations) the
windings are invariant, @ (anti-clockwise, left) and @@ (clockwise, right),
so I can just name them as that.
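A minimal sketch of that idea, assuming the ligands already carry distinct
ranks from some canonical labelling (the rank values and ligand names here
are invented for illustration, and `canonical_winding` is a hypothetical
helper, not part of InChI or any toolkit): sort the neighbours by rank, and
flip the winding if the sort is an odd permutation. The resulting label
depends only on the ranking, not on the order the atoms were written in.

```python
def canonical_winding(winding, neighbours, rank):
    """Return the winding as seen with neighbours sorted by ascending rank.

    `winding` is "@" or "@@" for the neighbours in their input order.
    `rank` maps each neighbour to a distinct integer; how the ranks were
    produced (CIP, canonical SMILES order, anything else) does not matter.
    """
    ranks = [rank[n] for n in neighbours]
    if len(set(ranks)) != len(ranks):
        raise ValueError("ranks must be distinct to define a winding")
    # The number of inversions has the same parity as the sorting permutation.
    inversions = sum(
        1
        for i in range(len(ranks))
        for j in range(i + 1, len(ranks))
        if ranks[i] > ranks[j]
    )
    if inversions % 2 == 0:
        return winding
    return "@@" if winding == "@" else "@"


# Hypothetical ranks for the four ligands of one stereocentre.
rank = {"H": 0, "Me": 1, "Et": 2, "O": 3}

# Three different input orderings of the same configuration.
inputs = [
    ("@",  ["H", "Me", "O", "Et"]),
    ("@@", ["Me", "H", "O", "Et"]),
    ("@",  ["Me", "H", "Et", "O"]),
]
labels = [canonical_winding(w, order, rank) for w, order in inputs]
print(labels)
```

A different ranking scheme may give the opposite label for the same centre,
which is fine as long as both sides of an exchange use the same scheme.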

> 3) Has there been any discussion here or in IUPAC about revising the CIP
> rules?

Yes, see [4,5], but could you imagine trying to get everyone to change their
definitions or use a new system! As with points 1/2, if you actually need to
exchange chemical information there are better ways of doing it.

Here are some open CIP implementations I could quickly find:
- JUMBO6 by Peter Murray-Rust
  <https://bitbucket.org/wwmm/jumbo6/src/e76bf83c1eaf6ec65d794b111676913c633f1112/src/main/java/org/xmlcml/cml/tools/StereochemistryTool.java?at=default&fileviewer=file-view-default>
  (Notice a bug report from me 5 years ago
  <https://bitbucket.org/wwmm/jumbo6/issues/1/incorrect-stereochemistry-determination>
  :D)
- OPSIN by Daniel Lowe
  <https://bitbucket.org/dan2097/opsin/src/343e6340a9ad85f68a08630f8b08de8df8f49557/opsin-core/src/main/java/uk/ac/cam/ch/wwmm/opsin/CipSequenceRules.java?at=default&fileviewer=file-view-default>
- CDK by Egon Willighagen
  <https://github.com/cdk/cdk/blob/master/descriptor/cip/src/main/java/org/openscience/cdk/geometry/cip/CIPTool.java>
- RDKit by Greg Landrum
  <https://github.com/rdkit/rdkit/blob/f2c1a95c6e1548457c1b4bf4f6f8fc7defc5f1a7/Code/GraphMol/Chirality.cpp>
- Centres by me
  <https://github.com/johnmay/centres>

I also compared some commercial tools in my thesis. When Daniel and I
did an investigation at NextMove we found OPSIN/Centres agreed the most.
Centres handles more complicated cases (e.g. decalin, para-cyclohexanes,
inositols); however, I know it's still incomplete/wrong - I never bothered to
implement the fractional bond orders for mancude rings, see [1].

Anyways, if anything this discussion has prompted me to submit the following
abstract to ACS Fall 2017. The main aim is to formalise the problems and
propose a way forward.

> *Comparing CIP Implementations: The Need for an Open CIP*
> *John Mayfield, Daniel Lowe and Roger Sayle*
> Session: Open Structures: Wither & Hence in the Digital Era (Oral)
>
> The Cahn-Ingold-Prelog (CIP) priority rules have been the cornerstone of
> written communication of stereochemical configuration for more than half a
> century. The rules rank ligands around a stereocentre, allowing an atom
> order and layout invariant stereo-descriptor to be assigned, for example R
> (right) or S (left) for tetrahedral atoms. Despite their widespread daily
> use, many chemists may be surprised to find that beyond trivial cases,
> different software may assign different labels to the same structure
> diagram.
> There have been several attempts to either replace or amend the CIP rules.
> This talk will highlight the more challenging aspects of the ranking and
> present a comparison of software that provides CIP labels and where they
> disagree. Providing an IUPAC verified free and open source CIP
> implementation would allow software maintainers and vendors to validate and
> improve their implementations. Ultimately this would improve the accuracy
> in exchange of written chemical information for all.

John

[1] http://old.iupac.org/reports/provisional/abstract04/BB-prs310305/Chapter9.pdf
    (There should be a final one somewhere...)
[2] http://pubs.acs.org/doi/abs/10.1021/ci00019a004
[3] http://onlinelibrary.wiley.com/book/10.1002/9783527618279
[4] http://www.sciencedirect.com/science/article/pii/S0957416600862370
[5] http://pubs.acs.org/doi/abs/10.1021/ci00012a003

On 10 April 2017 at 19:31, Robert Hanson <hans...@stolaf.edu> wrote:

> Structure is at https://chemapps.stolaf.edu/jmol/temp/cip-c13-test.png
>
> John  wrote back to say
>
> 1) "The SMILES should be [13C@@H]12C3C1.C2=CC3"
>
>  -- Thanks for that. Duh!
>
> 2) "The designation is S"
>
> I'm pretty sure it's 1R 5R.
>
> For the chirality at C1, the only question is whether C5 beats C2. The
> highest-priority path via C1-C5 is C1-C5-C6 rather than C1-C5-C2 because
> the duplicated atom C1 with mass 13 coming around the cyclopropane ring
> C1-C5-C6-(C1) beats the alternative pathway C1-C5-C4-C3 based on Rule 2
> (higher mass). And then that pathway beats C1-C2-C3-C4 for the same reason.
> So C5 has higher priority than C2.
>
> It is opposite when there is no isotope. In that case, C1-C2-C3-C4 beats
> C1-C5-C6-(C1) due to the lack of substituents on the duplicated atom C1
> compared to C4 in the *next round*, giving 1S 1R for the original model
> that Mikko sent me.
>
> Am I wrong?
>
> Bob
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Blueobelisk-discuss mailing list
> Blueobelisk-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss
>
>
