Re: [Rdkit-discuss] Programatic access to the mol sanitation process results
That worked! Although I was too lazy to modify the actual class, so I used
a Python package instead. In case anyone is interested, below is minimal
code that keeps stderr clean while retaining the error message in a
variable to work with. It uses Python streams and the wurlitzer package
(https://github.com/minrk/wurlitzer):

    from rdkit import Chem
    import io
    from wurlitzer import pipes

    mol = Chem.MolFromSmiles('CO(C)C', sanitize=False)

    # Collect whatever is written to stderr while sanitizing
    out_stream = io.BytesIO()
    with pipes(stderr=out_stream):
        sanitization_result = Chem.SanitizeMol(mol, catchErrors=True)

    error_msg = out_stream.getvalue().decode('utf-8')
    print(error_msg)

Lukas

From: Peter Gedeck
Date: Friday, 9 March 2018 at 15:02
To: Lukas Pravda
Cc: Greg Landrum
Subject: Re: [Rdkit-discuss] Programatic access to the mol sanitation process results

> Hello Lukas,
>
> The file rdkit/TestRunner.py contains a class/context manager called
> OutputRedirectC. If I remember correctly, it allowed capturing these
> messages. It's not used anywhere in the RDKit code base, so it may not
> work anymore. Anyway, give it a try, and if it works, you can modify it
> to redirect the output into a variable or StringIO.
>
> Best,
> Peter

--
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
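Once the message is in a variable such as error_msg, the offending atom can be pulled out with a regular expression. The following is only a minimal sketch and not an RDKit API: the names ValenceProblem and parse_valence_error are made up for illustration, and the pattern assumes the exact message wording quoted in this thread, which may differ between RDKit versions.

```python
import re
from typing import NamedTuple, Optional

# Pattern for the message wording quoted in this thread, e.g.
# "Explicit valence for atom # 1 O, 3, is greater than permitted".
# Assumption: other RDKit versions may phrase the message differently.
_VALENCE_RE = re.compile(
    r"Explicit valence for atom # (?P<idx>\d+) (?P<symbol>[A-Za-z]+), "
    r"(?P<valence>\d+), is greater than permitted"
)


class ValenceProblem(NamedTuple):
    """Hypothetical container for the pieces of a valence error message."""
    atom_idx: int
    symbol: str
    valence: int


def parse_valence_error(message: str) -> Optional[ValenceProblem]:
    """Return the atom index, element symbol and valence, or None if no match."""
    match = _VALENCE_RE.search(message)
    if match is None:
        return None
    return ValenceProblem(
        atom_idx=int(match.group("idx")),
        symbol=match.group("symbol"),
        valence=int(match.group("valence")),
    )


# The log line from this thread, including the timestamp prefix
line = "[14:31:37] Explicit valence for atom # 1 O, 3, is greater than permitted"
problem = parse_valence_error(line)
print(problem)  # ValenceProblem(atom_idx=1, symbol='O', valence=3)
```

Messages of another kind, such as the kekulization failure, simply yield None here, so the sanitization flag returned by Chem.SanitizeMol() is still needed to tell the cases apart.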
Re: [Rdkit-discuss] Programatic access to the mol sanitation process results
Hello Lukas,

The file rdkit/TestRunner.py contains a class/context manager called
OutputRedirectC. If I remember correctly, it allowed capturing these
messages. It's not used anywhere in the RDKit code base, so it may not work
anymore. Anyway, give it a try, and if it works, you can modify it to
redirect the output into a variable or StringIO.

Best,
Peter

> On 9 Mar 2018, at 9:34 AM, Lukas Pravda wrote:
>
> Hello Greg,
>
> I’m very sorry for the late reply. Thank you for the hint on disabling
> the log message; it works on my end. However, I was more interested in
> catching the other bit, i.e. which part of the structure is wrong, rather
> than which part of the sanitization process failed. That is, accessing
> the message ‘Explicit valence for atom # 1 O, 3, is greater than
> permitted’ in a form that lets me find out that it is the misbehaving
> oxygen which causes the sanitization to fail. Perhaps by piping the log
> information into a variable or something like that.
>
> Best,
> Lukas
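For reference, redirecting low-level stderr output into a variable can also be sketched with the standard library alone, by temporarily pointing file descriptor 2 at a temporary file. This is not the OutputRedirectC class from rdkit/TestRunner.py: capture_stderr_fd is a hypothetical helper, and the os.write() call below merely stands in for a library that writes to stderr from C/C++ code.

```python
import os
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def capture_stderr_fd():
    """Redirect file descriptor 2 into a temp file for the duration of the block.

    Yields a list; the captured text is appended to it when the block exits.
    Hypothetical helper for illustration, not part of RDKit.
    """
    captured = []
    sys.stderr.flush()           # do not mix pending Python-level output in
    saved_fd = os.dup(2)         # remember where stderr pointed
    with tempfile.TemporaryFile(mode="w+b") as tmp:
        os.dup2(tmp.fileno(), 2)  # fd 2 now writes into the temp file
        try:
            yield captured
        finally:
            sys.stderr.flush()
            os.dup2(saved_fd, 2)  # restore the original stderr
            os.close(saved_fd)
            tmp.seek(0)
            captured.append(tmp.read().decode("utf-8", errors="replace"))


with capture_stderr_fd() as captured:
    # Stand-in for a C/C++ library writing straight to file descriptor 2
    os.write(2, b"Explicit valence for atom # 1 O, 3, is greater than permitted\n")

error_msg = captured[0]
```

Working at the file-descriptor level matters because replacing sys.stderr in Python does not affect output written by compiled code.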
Re: [Rdkit-discuss] Programatic access to the mol sanitation process results
Hello Greg,

I’m very sorry for the late reply. Thank you for the hint on disabling the
log message; it works on my end. However, I was more interested in catching
the other bit, i.e. which part of the structure is wrong, rather than which
part of the sanitization process failed. That is, accessing the message
‘Explicit valence for atom # 1 O, 3, is greater than permitted’ in a form
that lets me find out that it is the misbehaving oxygen which causes the
sanitization to fail. Perhaps by piping the log information into a variable
or something like that.

Best,
Lukas

From: Greg Landrum
Date: Thursday, 22 February 2018 at 13:32
To: Lukas Pravda
Cc: RDKit Discuss
Subject: Re: [Rdkit-discuss] Programatic access to the mol sanitation process results

> Hi Lukas,
>
> At least part of this is pretty straightforward.
>
> There are two parts:
> - making it so error messages don't go to the console
> - capturing the failed operation.
>
> The first is a bit fragile (i.e. doesn't always work), so you will
> sometimes end up still seeing error messages (as here), but the second
> should be reliable:
>
>     In [30]: rdBase.DisableLog('rdApp.*')
>
>     In [31]: m = Chem.MolFromSmiles('c1cccc1',sanitize=False)
>
>     In [32]: Chem.SanitizeMol(m,catchErrors=True)
>     [14:29:37] Can't kekulize mol. Unkekulized atoms: 0 1 2 3 4
>
>     Out[32]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_KEKULIZE
>
>     In [35]: Chem.SanitizeMol(Chem.MolFromSmiles('CO(C)C',sanitize=False),catchErrors=True)
>     [14:31:37] Explicit valence for atom # 1 O, 3, is greater than permitted
>     Out[35]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_PROPERTIES
>
> You can see that the return value indicates what went wrong in the
> sanitization.
>
> I hope this helps,
> -greg
Re: [Rdkit-discuss] Programatic access to the mol sanitation process results
Hi Lukas,

On Thu, Feb 22, 2018 at 1:14 PM, Lukas Pravda wrote:

> Dear rdkiters,
>
> I’m constructing molecules from scratch using Python 3.5.4 and RDKit
> 2017.09.2, and for a variety of reasons some of them violate general
> principles of chemistry as implemented in RDKit, so I’m getting
> messages like:
>
> Explicit valence for atom # 14 N, 4, is greater than permitted
>
> I wonder if there is a way to retrieve this piece of information
> programmatically, in order to work with it. Presently, RDKit only prints
> it to the terminal, and Chem.SanitizeMol() only returns the first
> sanitization flag with the issue. Ideally, I’d like nothing to be
> printed to the console, while keeping the log info ‘Explicit valence for
> atom # 14 N, 4, is greater than permitted’, preferably in a structured
> way (in a property/method?), in order to deal further with those
> erroneous cases.

At least part of this is pretty straightforward.

There are two parts:
- making it so error messages don't go to the console
- capturing the failed operation.

The first is a bit fragile (i.e. doesn't always work), so you will
sometimes end up still seeing error messages (as here), but the second
should be reliable:

    In [30]: rdBase.DisableLog('rdApp.*')

    In [31]: m = Chem.MolFromSmiles('c1cccc1',sanitize=False)

    In [32]: Chem.SanitizeMol(m,catchErrors=True)
    [14:29:37] Can't kekulize mol. Unkekulized atoms: 0 1 2 3 4

    Out[32]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_KEKULIZE

    In [35]: Chem.SanitizeMol(Chem.MolFromSmiles('CO(C)C',sanitize=False),catchErrors=True)
    [14:31:37] Explicit valence for atom # 1 O, 3, is greater than permitted
    Out[35]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_PROPERTIES

You can see that the return value indicates what went wrong in the
sanitization.

I hope this helps,
-greg
[Rdkit-discuss] Programatic access to the mol sanitation process results
Dear rdkiters,

I’m constructing molecules from scratch using Python 3.5.4 and RDKit
2017.09.2, and for a variety of reasons some of them violate general
principles of chemistry as implemented in RDKit, so I’m getting messages
like:

Explicit valence for atom # 14 N, 4, is greater than permitted

I wonder if there is a way to retrieve this piece of information
programmatically, in order to work with it. Presently, RDKit only prints it
to the terminal, and Chem.SanitizeMol() only returns the first sanitization
flag with the issue. Ideally, I’d like nothing to be printed to the
console, while keeping the log info ‘Explicit valence for atom # 14 N, 4,
is greater than permitted’, preferably in a structured way (in a
property/method?), in order to deal further with those erroneous cases.

Thank you for your answer,
Lukas