Hi Greg -
Thank you very much for the clear and detailed explanation!
(and, now that I have a chance to say this, thank you for putting the project
together; being able to work with chemistry in the python notebook is great,
and having hooks into pandas is really cool)
In this case I was basically just going through the example code and ran into
some behaviors that I did not understand (and you kindly explained). So it's
all clear now. Uppercase aromatic atoms in MCS output do appear to be a bug;
Hs on aromatic nitrogens I'll need to fix manually or with a transform.
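
For the uppercase-atom part, a rough workaround sketch follows. It is plain
string processing, not an RDKit function: the helper name
lowercase_aromatic_atoms is made up for illustration, it assumes every aromatic
bond is written explicitly as ':', and it only handles B, C, N, O, P and S. It
does not add missing Hs on aromatic nitrogens, and I make no claim that every
rewritten string will parse.

```python
import re

# Tokens: bracket atom, two-letter halogen, %nn ring closure, or any single character.
_TOKEN = re.compile(r"\[[^\]]*\]|Br|Cl|%\d{2}|.")
_BRACKET = re.compile(r"\[(\d*)([A-Z][a-z]?)(.*)\]")
_AROMATIC_CAPABLE = {"B", "C", "N", "O", "P", "S"}


def _is_atom(tok):
    return tok[0] == "[" or tok == "*" or tok.isalpha()


def _lower(tok):
    """Lowercase the element symbol of a bare or bracketed atom token."""
    if tok in _AROMATIC_CAPABLE:
        return tok.lower()
    m = _BRACKET.fullmatch(tok)
    if m and m.group(2) in _AROMATIC_CAPABLE:
        return "[" + m.group(1) + m.group(2).lower() + m.group(3) + "]"
    return tok  # halogens, '*', already-lowercase atoms, etc. stay as they are


def lowercase_aromatic_atoms(smiles):
    """Hypothetical helper: lowercase every atom that touches a ':' bond."""
    tokens = _TOKEN.findall(smiles)
    aromatic = set()   # token positions of atoms attached to a ':' bond
    prev = None        # position of the atom the next bond attaches to
    stack = []         # atoms to return to when a branch closes
    bond = None        # explicit bond symbol waiting for its second atom
    open_rings = {}    # ring label -> (atom position, bond symbol at opening)
    for i, tok in enumerate(tokens):
        if _is_atom(tok):
            if bond == ":" and prev is not None:
                aromatic.update((prev, i))
            prev, bond = i, None
        elif tok == "(":
            stack.append(prev)
        elif tok == ")":
            prev, bond = stack.pop(), None
        elif tok.isdigit() or tok[0] == "%":
            label = int(tok.lstrip("%"))
            if label in open_rings:
                other, opening_bond = open_rings.pop(label)
                if ":" in (bond, opening_bond):
                    aromatic.update((prev, other))
            else:
                open_rings[label] = (prev, bond)
            bond = None
        elif tok == ".":
            prev, bond = None, None
        else:
            bond = tok  # '-', '=', '#', ':', '/', '\'
    return "".join(_lower(t) if i in aromatic else t
                   for i, t in enumerate(tokens))


print(lowercase_aromatic_atoms("O=C1:[NH]:C:N:N2:C:*:C:C:1:2"))
# O=c1:[nH]:c:n:n2:c:*:c:c:1:2
```

Applied to the simplified SMILES from your reply, this gives the same lowercase
string you showed parsing cleanly.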
==
Separately, on another thing that came up while I was working through that data:
I'd like to add my two cents in favor of fuller control over the
warnings produced by the C++ backend. In that example's data I was getting a
lot of (fully valid, I think) warnings about stereochemistry, but I could not
do anything to catch or hide them - and in an ipython notebook, it can get less
than tidy. I did see this mentioned in other threads, so I understand that
logging is a known issue somewhere on the stack. For now I just clean up
manually.
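
For reference, a minimal sketch of the kind of control I mean, built on the
RDLogger module's DisableLog/EnableLog calls. The context manager quiet_rdkit
is a made-up name, and the sketch assumes the 'rdApp.warning' channel was
enabled beforehand. Whether it catches every message shown in a notebook
probably depends on the RDKit version.

```python
from contextlib import contextmanager

try:
    from rdkit import RDLogger
except ImportError:  # lets the sketch load on a machine without RDKit
    RDLogger = None


@contextmanager
def quiet_rdkit(spec="rdApp.warning"):
    """Hypothetical helper: silence one RDKit log channel inside a with-block."""
    if RDLogger is not None:
        RDLogger.DisableLog(spec)
    try:
        yield
    finally:
        # Assumes the channel was on before; this does not restore a prior "off" state.
        if RDLogger is not None:
            RDLogger.EnableLog(spec)


with quiet_rdkit():
    pass  # e.g. parse a batch of SMILES here without the warning noise
```

Passing "rdApp.error" or "rdApp.*" instead would silence the error channel or
everything, at the cost of hiding messages like the valence error above.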
Thanks again!
Kind regards,
Dmitri
> On Jun 28, 2016, at 1:39 AM, Greg Landrum <greg.land...@gmail.com> wrote:
>
> Hi Dmitri,
>
> The results that come back from the MCS in that example really describe
> queries, not necessarily stable molecules or things that can be accurately
> translated into SMILES.
>
> I'll describe below what's going on to cause the error, but the more
> important question is: what are you trying to do?
>
> In this case there are two problems. One has to do with the aromatic bonds in
> the SMILES coming from C atoms that are written as capital letters. Here's a
> simplified version of your example:
>
> In [11]: Chem.MolFromSmiles('O=C1:[NH]:C:N:N2:C:*:C:C:1:2')
> [06:43:37] Explicit valence for atom # 1 C, 5, is greater than permitted
>
> If I rewrite the SMILES to have the atoms with aromatic bonds written with
> lower case letters everything is fine:
>
> In [12]: Chem.MolFromSmiles('O=c1:[nH]:c:n:n2:c:*:c:c:1:2')
> Out[12]: <rdkit.Chem.rdchem.Mol at 0x7f3204024440>
>
> This shouldn't make a difference in SMILES, so I'm inclined to think that
> it's a bug.
>
> The second problem was the missing hydrogen specification on the aromatic
> nitrogen that has an H (I fixed this in the SMILES above). Since the RDKit
> does not attempt to guess at chemistry, the general rule is that aromatic
> heteroatoms should have Hs specified if they have any. There have been a
> number of mailing list threads on this topic.
>
> Best,
> -greg
>
>
>
>
> On Mon, Jun 27, 2016 at 8:26 PM, DmitriR <xzf...@gmail.com> wrote:
> Dear RDKitters,
>
> I would appreciate any comments on the following:
>
> I am looking at the 'SureChEMBL iPython Notebook Tutorial'
> http://nbviewer.jupyter.org/github/rdkit/UGM_2014/blob/master/Notebooks/Vardenafil.ipynb
>
> following along with rdkit '2016.03.1' on OSX
>
> In Cell 142, there is this SMILES:
>
> MCS SMILES: O=C1:N:C(C2:C:C:C:C:C:2):N:N2:C:[*]:C:C:1:2
> This is a representation of a generalized structure, not any particular
> molecule.
>
> It was generated with Chem.MolToSmiles(mcsM,isomericSmiles=True)
>
> But when I try
> Chem.MolFromSmiles('O=C1:N:C(C2:C:C:C:C:C:2):N:N2:C:[*]:C:C:1:2')
>
> I get "RDKit ERROR: [14:11:32] Explicit valence for atom # 1 C, 5, is greater
> than permitted"
>
> So there is no "round-trip" possible here.
>
> Which behavior is "correct", given the aromaticity and structure as specified?
> Should this be rendering/creating molecule, or failing?
>
> Thanks!
>
> (MarvinSketch does display the SMILES without complaints;
> image is attached)
>
> Dmitri
>
>
>
>
> ------------------------------------------------------------------------------
> Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San
> Francisco, CA to explore cutting-edge tech and listen to tech luminaries
> present their vision of the future. This family event has something for
> everyone, including kids. Get more information and register today.
> http://sdm.link/attshape
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>