Dmitri,
  Could you send me the notebook that displays these issues?  I can't
reproduce them.

Thanks,
 Brian



On Tue, Jun 28, 2016 at 6:25 PM, Brian Kelley <fustiga...@gmail.com> wrote:

> It looks like there may be an issue calling WrapLogs twice.  If you see
> the error messages in the notebook, it's already been called.  Importing
> IPythonConsole does this automatically.
>
> This may be the cause of our confusion.  I'll look into it.
>
> ----
> Brian Kelley
>
> On Jun 28, 2016, at 3:54 PM, DmitriR <xzf...@gmail.com> wrote:
>
> Hi Brian,
>
> First off, now I can capture the warnings, so for practical purposes my
> question has been addressed, thank you for helping me get to this point.
>
> Cool trick with StringIO. I can even just do:
>
> Python 3.5.1 :: Anaconda 2.4.0 (x86_64), OSX 10.11.5, jupyter 4.1.0,
> Firefox
>
> ```
> import io
> import sys
>
> from rdkit import Chem
>
> err = sys.stderr
> sys.stderr = io.StringIO()
>
> # capture errors/warnings
> Chem.MolFromSmiles('C1CC')
>
> msgs = sys.stderr.getvalue()
> sys.stderr = err
> print('Captured', msgs)
>
> # now errors show in the notebook again
> Chem.MolFromSmiles('C1CC')
> ```
>
> ==
>
> However, if you feel like digging a bit deeper, I'm a little confused too
> now :)
>
> What is the scope of WrapLogs() effects? (notebook-wide, or cell?) Or, by
> chance, does it set anything really persistent?
>
> In my prior notebook session, prior to trying WrapLogs() I could already
> see the warnings printed on red background (like in your screenshot, except
> that you have an ERROR msg, not WARNINGs as in my example).
>
> A call to WrapLogs() made warnings apparently disappear from the notebook.
>
> Upon reinitializing the session I could see the warnings on red background
> as before, wrote the code snippet in my prior email, and *without calling
> WrapLogs()* I could capture the warnings with it.
>
> So I assumed that RDKit messages went to the notebook's stderr by default,
> and WrapLogs() did something else.
>
> After getting your last email, I made a minimal test case (new notebook
> with just the RDKit call that generates warnings `dff['InChI'] =
> dff['ROMol'].map(Chem.MolToInchi)`, wrapped inside the stderr capture code
> snippet), killed all python instances, restarted the browser, loaded data
> from pickled dataframe.
>
> Now, *without ever having called WrapLogs()*, all RDKit warnings still go
> to stderr, and I can still capture them using the snippet.
> Calls to WrapLogs() now appear to have no effect whatsoever.
>
> If this indicates to you any potential issue, we can look more into it.
> Otherwise I'm good.
>
> ==
>
> The other strange behavior that I described below (the number of warnings
> alternating between successive calls to the same code using
> Chem.MolToInchi) remains though. Maybe it's the underlying InChi code, I
> did not investigate.
>
> Thanks again.
> Dmitri
>
>
>
> On Jun 28, 2016, at 2:14 PM, Brian Kelley <fustiga...@gmail.com> wrote:
>
> Dmitri,
>   I admit to being a bit confused.  What WrapLogs() does is simply
> redirect the C++ errors into Python's stderr. See the attached PNG.  I think
> you may have noticed that, as you are capturing with sys.stderr.
>
> These errors are output (at least for me) in the IPython notebook.  I'm
> not sure what is being hidden here.  Perhaps the notebook has changed
> somehow?  Here is my version:
>
> Python 2.7.11 |Anaconda 2.1.0 (x86_64)| (default, Dec  6 2015, 18:57:58)
> Type "copyright", "credits" or "license" for more information.
>
> IPython 4.0.0 -- An enhanced Interactive Python.
>
>
> btw - you can use StringIO as opposed to a file
>
> import sys
> from StringIO import StringIO  # Python 2; on Python 3 use io.StringIO
>
> err = sys.stderr
> io = sys.stderr = StringIO()
> ....
> sys.stderr = err
> print io.getvalue()
>
>
>
> On Tue, Jun 28, 2016 at 1:24 PM, DmitriR <xzf...@gmail.com> wrote:
> Brian - Thank you!
>
> (on OSX 10.11.5, jupyter 4.1.0)
>
> rdkit.Chem.WrapLogs() does hide the messages.
> I could not figure out how to access them though once they are hidden.
>
> To capture warnings, this mechanism seems to work - but it is ugly.
>
> ```
> import os
> import sys
>
> ## switch the streams
> stderr_fn = 'stderr.log'
> orig_stderr = sys.stderr
> sys.stderr = open(stderr_fn, 'w')
>
> ## RDKit code producing warnings goes here
>
> ## switch back stderr, process the warnings
> sys.stderr.close()  # flushes and closes the log file
> sys.stderr = orig_stderr
> with open(stderr_fn, 'r') as f:
>     err_data = f.read()
> os.remove(stderr_fn)
> print(len(err_data))
> ```
>
> Assuming it is all even necessary, this could be made much nicer by using
> a context manager/decorator to handle stderr capture and return the
> warnings text in an extra argument, along the lines of
>
> http://stackoverflow.com/questions/5136611/capture-stdout-from-a-script-in-python
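
A minimal sketch of the context manager/decorator idea described above, for Python 3. The names `captured_stderr`, `with_stderr_text` and `noisy_double` are made up for illustration, and `noisy_double` is only a stand-in that writes to `sys.stderr` the way the RDKit calls in this thread appear to. It captures only what goes through Python's `sys.stderr`.

```python
import contextlib
import functools
import io
import sys


@contextlib.contextmanager
def captured_stderr():
    """Temporarily replace sys.stderr with an in-memory buffer."""
    original = sys.stderr
    buffer = io.StringIO()
    sys.stderr = buffer
    try:
        yield buffer
    finally:
        # Restore the real stream even if the wrapped code raises.
        sys.stderr = original


def with_stderr_text(func):
    """Decorator: make func return (result, text written to sys.stderr)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with captured_stderr() as buffer:
            result = func(*args, **kwargs)
        return result, buffer.getvalue()
    return wrapper


# Hypothetical stand-in for an RDKit call; it just writes a made-up
# message to sys.stderr and returns a value.
@with_stderr_text
def noisy_double(x):
    print("WARNING: example message", file=sys.stderr)
    return 2 * x


value, messages = noisy_double(21)
```

With RDKit installed, the same `with captured_stderr() as buf:` block could presumably wrap a call such as `Chem.MolFromSmiles('C1CC')`, with `buf.getvalue()` read afterwards. The `try`/`finally` is the main gain over swapping `sys.stderr` by hand: the notebook's stream is restored even when the wrapped code fails.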
>
> ==
>
> But also I noticed something weird:
>
> If I re-run the notebook cell with code that produces warnings, I get *no
> warnings* every third or sometimes second invocation.
>
> And when I run this with data that produce a lot of warnings (hundreds), I
> get a different number of warnings between runs, at least with this call:
>
> ```
> #dff is a pandas dataframe
> dff['InChI'] = dff['ROMol'].map(Chem.MolToInchi)
> ```
>
> it cycles higher-number -> lower-number -> higher-number ... Not sure what
> to make of it. Something screwed up with my system?
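
One way to narrow down which messages come and go between runs is to count the captured lines with their timestamps removed and compare the counts. This is only a sketch: `summarize_messages` and `diff_runs` are hypothetical helpers, and the line format is assumed from the single message quoted in this thread (`RDKit ERROR: [14:11:32] ...`), with one message per line.

```python
import re
from collections import Counter

# Assumed timestamp shape, e.g. "[14:11:32] ", taken from the message
# quoted in this thread. Other RDKit builds may format lines differently.
TIMESTAMP = re.compile(r"\[\d{2}:\d{2}:\d{2}\]\s*")


def summarize_messages(captured_text):
    """Count captured messages, ignoring their timestamps."""
    counts = Counter()
    for line in captured_text.splitlines():
        line = line.strip()
        if line:
            counts[TIMESTAMP.sub("", line)] += 1
    return counts


def diff_runs(first, second):
    """Return {message: (count_in_first, count_in_second)} where they differ."""
    a, b = summarize_messages(first), summarize_messages(second)
    return {msg: (a[msg], b[msg]) for msg in set(a) | set(b) if a[msg] != b[msg]}


# Synthetic example text standing in for two captured runs.
run_1 = (
    "RDKit ERROR: [14:11:32] Explicit valence for atom # 1 C, 5, is greater than permitted\n"
    "RDKit ERROR: [14:11:33] Explicit valence for atom # 1 C, 5, is greater than permitted\n"
)
run_2 = "RDKit ERROR: [14:12:01] Explicit valence for atom # 1 C, 5, is greater than permitted\n"

print(diff_runs(run_1, run_2))
```

An empty result from `diff_runs` would mean both runs produced the same messages the same number of times.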
>
> Dmitri
>
>
>
> > On Jun 28, 2016, at 8:24 AM, Brian Kelley <fustiga...@gmail.com> wrote:
> >
> > Dmitri,  if you import rdkit.Chem.Draw.IPythonConsole the C++ errors and
> > warnings should be seen in IPython.  This doesn't appear to work on Windows
> > yet, sadly.
> >
> > This is enabled by the command
> > rdkit.Chem.WrapLogs()
> >
> > We are also doing a second pass soon to get better exception details in
> > Python, which has been a pet peeve of mine for a while.
> > ----
> > Brian Kelley
> >
> > On Jun 28, 2016, at 4:04 AM, DmitriR <xzf...@gmail.com> wrote:
> >
> >> Hi Greg -
> >>
> >> Thank you very much for the clear and detailed explanation!
> >>
> >> (and, now that I have a chance to say this, thank you for putting the
> >> project together; being able to work with chemistry in the Python notebook
> >> is great, and having hooks into pandas is really cool)
> >>
> >> In this case I was basically just going through the example code and
> >> ran into some behaviors that I did not understand (and you kindly
> >> explained). So it's all clear now. Uppercase aromatic atoms in MCS output
> >> do appear to be a bug; Hs on aromatic nitrogens I'll need to fix manually
> >> or with a transform.
> >>
> >> ==
> >>
> >> Separately, on another thing that came up in my working through that
> >> data:
> >>
> >> I'd like to add my two cents' worth of a vote for fuller
> >> control of warnings produced by the C++ backend. In that example's data I
> >> was getting a lot of (fully valid, I think) warnings about stereochemistry,
> >> but I could not do anything to catch or hide them - and in an IPython
> >> notebook, it can get less than tidy. I did see this mentioned in other
> >> threads, so I understand that logging is a known issue somewhere on the
> >> stack. For now I just clean up manually.
> >>
> >> Thanks again!
> >>
> >> Kind regards,
> >> Dmitri
> >>
> >>
> >>
> >>> On Jun 28, 2016, at 1:39 AM, Greg Landrum <greg.land...@gmail.com> wrote:
> >>>
> >>> Hi Dmitri,
> >>>
> >>> The results that come back from the MCS in that example really
> >>> describe queries, not necessarily stable molecules or things that can be
> >>> accurately translated into SMILES.
> >>>
> >>> I'll describe below what's going on to cause the error, but the more
> >>> important question is: what are you trying to do?
> >>>
> >>> In this case there are two problems. One has to do with the aromatic
> >>> bonds in the SMILES coming from C atoms that are written as capital
> >>> letters. Here's a simplified version of your example:
> >>>
> >>> In [11]: Chem.MolFromSmiles('O=C1:[NH]:C:N:N2:C:*:C:C:1:2')
> >>> [06:43:37] Explicit valence for atom # 1 C, 5, is greater than permitted
> >>>
> >>> If I rewrite the SMILES to have the atoms with aromatic bonds written
> >>> with lower case letters everything is fine:
> >>>
> >>> In [12]: Chem.MolFromSmiles('O=c1:[nH]:c:n:n2:c:*:c:c:1:2')
> >>> Out[12]: <rdkit.Chem.rdchem.Mol at 0x7f3204024440>
> >>>
> >>> This shouldn't make a difference in SMILES, so I'm inclined to think
> >>> that it's a bug.
> >>>
> >>> The second problem was the missing hydrogen specification on the
> >>> aromatic nitrogen that has an H (I fixed this in the SMILES above). Since
> >>> the RDKit does not attempt to guess at chemistry, the general rule is
> >>> that aromatic heteroatoms should have Hs specified if they have any. There
> >>> have been a number of mailing list threads on this topic.
> >>>
> >>> Best,
> >>> -greg
> >>>
> >>>
> >>>
> >>>
> >>> On Mon, Jun 27, 2016 at 8:26 PM, DmitriR <xzf...@gmail.com> wrote:
> >>> Dear RDKitters,
> >>>
> >>> I would appreciate any comments on the following:
> >>>
> >>> I am looking at the 'SureChEMBL iPython Notebook Tutorial'
> >>>
> >>> http://nbviewer.jupyter.org/github/rdkit/UGM_2014/blob/master/Notebooks/Vardenafil.ipynb
> >>>
> >>> following along with rdkit '2016.03.1' on OSX
> >>>
> >>> In Cell 142, there is this SMILES:
> >>>
> >>> MCS SMILES: O=C1:N:C(C2:C:C:C:C:C:2):N:N2:C:[*]:C:C:1:2
> >>> This is a representation of a generalized structure, not any
> >>> particular molecule.
> >>>
> >>> It was generated with Chem.MolToSmiles(mcsM,isomericSmiles=True)
> >>>
> >>> But when I try
> >>> Chem.MolFromSmiles('O=C1:N:C(C2:C:C:C:C:C:2):N:N2:C:[*]:C:C:1:2')
> >>>
> >>> I get "RDKit ERROR: [14:11:32] Explicit valence for atom # 1 C, 5, is greater than permitted"
> >>>
> >>> So there is no "round-trip" possible here.
> >>>
> >>> Which behavior is "correct", given the aromaticity and structure as
> >>> specified?
> >>> Should this be rendering/creating a molecule, or failing?
> >>>
> >>> Thanks!
> >>>
> >>> (MarvinSketch does display the SMILES without complaints;
> >>> image is attached)
> >>>
> >>> Dmitri
> >>>
> >>>
> >>> <PastedGraphic-3.png>
> >>>
> >>>
> ------------------------------------------------------------------------------
> >>> Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San
> >>> Francisco, CA to explore cutting-edge tech and listen to tech
> luminaries
> >>> present their vision of the future. This family event has something for
> >>> everyone, including kids. Get more information and register today.
> >>> http://sdm.link/attshape
> >>> _______________________________________________
> >>> Rdkit-discuss mailing list
> >>> Rdkit-discuss@lists.sourceforge.net
> >>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
> >>>
> >>>
> >>
> >>
>
>
> <IPythonError.png>
>
>
>