Hi Greg - 

Thank you very much for the clear and detailed explanation!

(and, now that I have a chance to say this, thank you for putting the project 
together; being able to work with chemistry in the python notebook is great, 
and having hooks into pandas is really cool)

In this case I was basically just going through the example code and ran into 
some behaviors that I did not understand (and you kindly explained). So it's 
all clear now. The uppercase aromatic atoms in the MCS output do appear to be 
a bug; the Hs on aromatic nitrogens I'll need to fix manually or with a 
transform. 
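
For reference, a minimal sketch of the round-trip check discussed in this 
thread. The two SMILES strings are copied from Greg's reply quoted below; the 
helper name `parses` is hypothetical. The uppercase form failed with 
2016.03.1 in the quoted session, but its behavior may differ in other RDKit 
versions, so no result is claimed for it here.

```python
from rdkit import Chem


def parses(smiles):
    """Return True if RDKit builds a sanitized molecule from the SMILES.

    Hypothetical helper: Chem.MolFromSmiles returns None when parsing or
    sanitization fails, so a None check is enough for a quick test.
    """
    return Chem.MolFromSmiles(smiles) is not None


# Strings taken from the quoted reply in this thread.
upper = 'O=C1:[NH]:C:N:N2:C:*:C:C:1:2'  # aromatic bonds, capital atoms
lower = 'O=c1:[nH]:c:n:n2:c:*:c:c:1:2'  # same skeleton, lowercase atoms

for label, smi in [('upper', upper), ('lower', lower)]:
    print(label, parses(smi))
```

Checking for None before using the result avoids confusing errors further 
down a notebook when one structure in a batch fails to parse.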

==

Separately, on another thing that came up in my working through that data:

I'd like to add my two cents' worth of a vote for somewhat fuller control of 
the warnings produced by the C++ backend. With that example's data I was 
getting a lot of (fully valid, I think) warnings about stereochemistry, but I 
could not do anything to catch or hide them, and in an IPython notebook the 
output can get less than tidy. I did see this mentioned in other threads, so I 
understand that logging is a known issue somewhere in the stack. For now I 
just clean up manually.
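
One possible way to quiet such messages is sketched below. It assumes the 
installed RDKit exposes `RDLogger.DisableLog` and `RDLogger.EnableLog` and 
that the messages go through the `rdApp.error` and `rdApp.warning` channels; 
`quiet_rdkit` is a hypothetical helper name, not part of RDKit. Whether this 
silences everything a notebook shows may depend on the RDKit version.

```python
from contextlib import contextmanager

from rdkit import Chem, RDLogger


@contextmanager
def quiet_rdkit(channels=('rdApp.error', 'rdApp.warning')):
    """Temporarily disable the given RDKit log channels (hypothetical helper)."""
    for name in channels:
        RDLogger.DisableLog(name)
    try:
        yield
    finally:
        # Re-enable on exit; assumes the channels were enabled beforehand.
        for name in channels:
            RDLogger.EnableLog(name)


with quiet_rdkit():
    # Pentavalent carbon: sanitization fails and MolFromSmiles returns None.
    # With the error channel disabled, the valence message should be hidden.
    mol = Chem.MolFromSmiles('C(C)(C)(C)(C)C')

print(mol is None)
```

Scoping the suppression to a `with` block keeps messages visible elsewhere, 
so genuine problems in other cells are not hidden.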

Thanks again!

Kind regards,
Dmitri



> On Jun 28, 2016, at 1:39 AM, Greg Landrum <greg.land...@gmail.com> wrote:
> 
> Hi Dmitri,
> 
> The results that come back from the MCS in that example really describe 
> queries, not necessarily stable molecules or things that can be accurately 
> translated into SMILES.
> 
> I'll describe below what's going on to cause the error, but the more 
> important question is: what are you trying to do?
> 
> In this case there are two problems. One has to do with the aromatic bonds in 
> the SMILES coming from C atoms that are written as capital letters. Here's a 
> simplified version of your example:
> 
> In [11]: Chem.MolFromSmiles('O=C1:[NH]:C:N:N2:C:*:C:C:1:2')
> [06:43:37] Explicit valence for atom # 1 C, 5, is greater than permitted
> 
> If I rewrite the SMILES to have the atoms with aromatic bonds written with 
> lower case letters everything is fine:
> 
> In [12]: Chem.MolFromSmiles('O=c1:[nH]:c:n:n2:c:*:c:c:1:2')
> Out[12]: <rdkit.Chem.rdchem.Mol at 0x7f3204024440>
> 
> This shouldn't make a difference in SMILES, so I'm inclined to think that 
> it's a bug.
> 
> The second problem was the missing hydrogen specification on the aromatic 
> nitrogen that has an H (I fixed this in the SMILES above). Since the RDKit 
> does not attempt to guess at chemistry, the general rule is that aromatic 
> heteroatoms should have Hs specified if they have any. There have been a 
> number of mailing list threads on this topic.
> 
> Best,
> -greg
> 
> 
> 
> 
> On Mon, Jun 27, 2016 at 8:26 PM, DmitriR <xzf...@gmail.com> wrote:
> Dear RDKitters, 
> 
> I would appreciate any comments on the following:
> 
> I am looking at the 'SureChEMBL iPython Notebook Tutorial' 
> http://nbviewer.jupyter.org/github/rdkit/UGM_2014/blob/master/Notebooks/Vardenafil.ipynb
> 
> following along with rdkit '2016.03.1' on OSX 
> 
> In Cell 142, there is this SMILES: 
> 
> MCS SMILES: O=C1:N:C(C2:C:C:C:C:C:2):N:N2:C:[*]:C:C:1:2
> This is a representation of a generalized structure, not any particular 
> molecule.
> 
> It was generated with Chem.MolToSmiles(mcsM,isomericSmiles=True) 
> 
> But when I try 
> Chem.MolFromSmiles('O=C1:N:C(C2:C:C:C:C:C:2):N:N2:C:[*]:C:C:1:2')
> 
> I get "RDKit ERROR: [14:11:32] Explicit valence for atom # 1 C, 5, is greater 
> than permitted"
> 
> So there is no "round-trip" possible here. 
> 
> Which behavior is "correct", given the aromaticity and structure as specified?
> Should this be rendering/creating a molecule, or failing?
> 
> Thanks!
> 
> (MarvinSketch does display the SMILES without complaints; 
> image is attached)
> 
> Dmitri
> 
> 
> 
> 
> ------------------------------------------------------------------------------
> Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San
> Francisco, CA to explore cutting-edge tech and listen to tech luminaries
> present their vision of the future. This family event has something for
> everyone, including kids. Get more information and register today.
> http://sdm.link/attshape
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
> 
> 
