I think I agree but need to draw out the digraph to convince myself. The
whole reason for Rule 1b was to fix this case (I believe originally from the
WDI hashcode paper, IIRC):

CC(C(CCC1CC1)(CCC1CC1)CCC1CC1)C12CCC(CC1)CC2

I think splitting ties between things that are the same is undesirable, but
worse is naming two different things the same. As I said, you can fix the
first one with a different and better algorithm. For the second you have
these examples, which are different but get the same R/S labels:

C[C@H]1[C@@H](C)[C@@H](C)[C@@H](C)[C@H](C)[C@H](C)[C@H](C)[C@@H]1C
C[C@H]1[C@H](C)[C@@H](C)[C@H](C)[C@H](C)[C@H](C)[C@@H](C)[C@H]1C

Correct, the CDK implementation is rudimentary and only handles simple
cases, although in practice that is most cases :-). But as I've said
multiple times, Centres (https://github.com/johnmay/centres) is a more
complete implementation which does Rules 1-5 and auxiliary descriptors. It's
a little difficult to follow as I wrote it toolkit independent, so
downstream users plug in to certain interfaces saying how to access atomic
number, connected atoms, etc. The parts are spread out a bit but you can see
how the rules are configured here:
https://github.com/johnmay/centres/blob/develop/cdk/src/main/java/uk/ac/ebi/centres/cdk/CDKPerceptor.java#L78-L106
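
As a rough illustration of the plug-in idea described above (this is not the
Centres API; AtomAccessor, DictAccessor and rank_first_sphere are
hypothetical names made up for this sketch), the ranking code only talks to
a small accessor interface, and each toolkit supplies its own adapter. The
ranking shown is just the first-sphere atomic number comparison from Rule
1a, nothing deeper:

```python
from typing import Dict, Hashable, List, Protocol


class AtomAccessor(Protocol):
    """Hypothetical plug-in point: how to read atoms from some toolkit."""

    def atomic_number(self, atom: Hashable) -> int: ...

    def neighbours(self, atom: Hashable) -> List[Hashable]: ...


class DictAccessor:
    """Toy adapter over plain dicts, standing in for a real toolkit."""

    def __init__(self, numbers: Dict[Hashable, int],
                 adjacency: Dict[Hashable, List[Hashable]]) -> None:
        self._numbers = numbers
        self._adjacency = adjacency

    def atomic_number(self, atom: Hashable) -> int:
        return self._numbers[atom]

    def neighbours(self, atom: Hashable) -> List[Hashable]:
        return self._adjacency[atom]


def rank_first_sphere(accessor: AtomAccessor,
                      centre: Hashable) -> List[List[Hashable]]:
    """Group the centre's neighbours by descending atomic number.

    A group holding more than one atom is a tie that this sketch does not
    resolve; a full implementation would explore further spheres and
    apply the later rules.
    """
    groups: Dict[int, List[Hashable]] = {}
    for atom in accessor.neighbours(centre):
        groups.setdefault(accessor.atomic_number(atom), []).append(atom)
    return [groups[z] for z in sorted(groups, reverse=True)]


# Bromochlorofluoromethane, keyed by element symbol for readability.
numbers = {"C": 6, "H": 1, "F": 9, "Cl": 17, "Br": 35}
adjacency = {"C": ["H", "F", "Cl", "Br"],
             "H": ["C"], "F": ["C"], "Cl": ["C"], "Br": ["C"]}
ranking = rank_first_sphere(DictAccessor(numbers, adjacency), "C")
print(ranking)  # [['Br'], ['Cl'], ['F'], ['H']]
```

Swapping DictAccessor for an adapter over another toolkit's atom objects
would leave rank_first_sphere unchanged, which is the point of the design.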

I never spent enough time on it to do the fractional bond orders. I was
considering doing it for this ACS talk but we'll see; it's a lot of effort
for very little gain. The validation of Centres was done in my thesis but
there are a handful of examples here:
https://github.com/johnmay/centres/tree/develop/cdk/src/test/resources/uk/ac/ebi/centres/cdk
Again, I was planning on putting together a comprehensive validation set
for the talk.

John

On 17 May 2017 at 04:36, Robert Hanson <hans...@stolaf.edu> wrote:

> So you agree?  Any particular reason no one has published on this? Just
> too minor a detail?
>
> Any example with two similar, functionalized benzene rings (substituted
> biphenyl, for example), stands a 50% chance of failing this test. I'm quite
> surprised that it wasn't discovered very early. I guess they just never
> considered this Kekule issue. I think that is apparent from the Kekule fix
> for Rule 1a, where it is stated:
>
> [image: Inline image 2]
>
> Well, that certainly is not true, is it?!
>
> Questions for John:
>
> Q1: Does the CDK implement the Kekule considerations required for
> application of Rule 1a?
>
> Q2: Does the CDK implement Rule 1b?
>
> All I can find is a rudimentary atom number/mass consideration -- Rule 2
> and part of Rule 1a. I can't find Rules 3, 4, or 5. I know I must be
> missing something major here.
>
> Q3: Is the CDK validation suite for CIP on the GitHub site somewhere? I
> can't find it.
>
>
> Bob
>
Blueobelisk-discuss mailing list
Blueobelisk-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss