Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2016-03-30 Thread Peter S. Shenkin
Hi, Greg, This is a bit of a meta-comment, so I hope it's appropriate here. I might possibly make more comments on the github site. After this discussion went dead (but before I retired from Schrödinger), I looked into the origin of that bizarre multicyclic structure. It turned out to have

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2016-03-29 Thread Greg Landrum
This is reviving a long-dead thread because I just marked the associated issue as "won't fix" and some of you might be interested in the reasons. Here's the bug: https://github.com/rdkit/rdkit/issues/523 The comment explaining my thinking is here:

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-17 Thread Peter Shenkin
Hi, I do not insist on using kekule forms. In fact, I said that using a double bond between two aromatic atoms in a SMILES does not appear problematic to me. I was trying to say in the line you quoted that even if analysis of QM results leads to a verdict of non-aromaticity, such a verdict

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-17 Thread Dimitri Maziuk
On 06/17/2015 08:36 AM, Peter Shenkin wrote: We could consider some quantum-mechanical calculations Yes! for the question of the true nature of the molecule. But that need not affect the way canonicalization is done. Again, define canonical. If you insist on using kekule form in a

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-17 Thread Peter Shenkin
Hi, Greg, Within the SMILES framework, it seems to me that if you allow the atoms to be aromatic, then these are two Kekule structures of the same aromatic system, and however you do the canonicalization, they ought to canonicalize to the same structure, which the two examples did not do. I don't

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-17 Thread Markus Sitzmann
We could consider some quantum-mechanical calculations ... well, I always hated this discussion when I heard that, for my web service with millions of structures, I should consider quantum-mechanical calculations as part of the structure normalization/canonicalization ;-) On Wed, Jun 17, 2015 at 8:22

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-17 Thread Peter Shenkin
We could consider some quantum-mechanical calculations Yes! for the question of the true nature of the molecule. But that need not affect the way canonicalization is done. These are two different forms of entertainment -P. On Wed, Jun 17, 2015 at 3:24 AM, Markus Sitzmann

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-16 Thread Andrew Dalke
On Jun 16, 2015, at 10:20 PM, Peter Shenkin wrote:
[N-]=[N+]=NC(=O)N1C(=O)N([N+]([O-])=O)C2(C13C4=C56)C4=C5C2=C36
[N-]=[N+]=NC(=O)N(C(=O)N1[N+]([O-])=O)C(c23)(c4c56)C16c3c5c24
rdkit canonicalizes the two to the following, respectively:

[Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-16 Thread Peter Shenkin
[N-]=[N+]=NC(=O)N1C(=O)N([N+]([O-])=O)C2(C13C4=C56)C4=C5C2=C36
[N-]=[N+]=NC(=O)N(C(=O)N1[N+]([O-])=O)C(c23)(c4c56)C16c3c5c24
rdkit canonicalizes the two to the following, respectively:
[N-]=[N+]=NC(=O)N1C(=O)N([N+](=O)[O-])C23c4c5c2c2c-5c4C213

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-16 Thread Peter Shenkin
Thanks, Andrew... "BTW, to help it out, you can ask RDKit to include all of the bond information, as otherwise it will use the single-or-aromatic notation." That's a nice feature. I don't know how it is that RDKit adds a double bond to the second cubane, given only aromatic carbons and