Re: [Rdkit-discuss] RDKit installation problem
Hi Sebastian,

Looking quickly at the available builds in the rdkit conda channel (https://anaconda.org/rdkit/rdkit), it appears that you are pulling the windows 32-bit version of rdkit. Perhaps this is caused by the fact that you use a 32-bit version of conda? Try installing the 64-bit version of conda and pull again.

Best,
Lukas

From: "Sebastián J. Castro"
Date: Saturday, 1 August 2020 at 20:29
To:
Subject: [Rdkit-discuss] RDKit installation problem

I have tried the installation suggested at http://www.rdkit.org/docs/Install.html:

    $ conda create -c rdkit -n my-rdkit-env rdkit

But I get the 2017 version instead of the 2020 one (the latest release). I don't know how to install it. Can you help me? I have Ubuntu 20.04 LTS.

Thank you
Best regards!
--
Dr. Sebastián J. Castro
Departamento de Ciencias Farmacéuticas
Facultad de Ciencias Químicas
Universidad Nacional de Córdoba
UNITEFA-CONICET
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
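Before reinstalling conda, the 32-bit suspicion raised in the reply can be checked from the Python interpreter that the conda environment provides. This is only a minimal sketch using the standard library; `interpreter_bits` is a hypothetical helper name and nothing here comes from the thread itself.

```python
import platform
import struct
import sys


def interpreter_bits() -> int:
    """Return the pointer width of the running Python interpreter in bits."""
    # struct.calcsize("P") is the size of a C pointer in bytes.
    return struct.calcsize("P") * 8


print("Python executable:", sys.executable)
print("Machine type:", platform.machine())
print("Interpreter bitness:", interpreter_bits())
```

If this reports 32 inside the environment, conda will most likely resolve packages against its 32-bit channels, which would fit the explanation given above.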
[Rdkit-discuss] Manganese ion as a radical?
Dear rdkit community,

I’m not quite sure if this is more of an rdkit or a chemistry related question. I’d like to understand why a manganese ion has 3 radical electrons when interpreted by rdkit. I have not seen radicals in any other metal ion so far. The code to get the depiction looks like this:

    from rdkit import Chem
    from rdkit.Chem import Draw

    width = 500
    m = Chem.MolFromInchi('InChI=1S/Mn/q+2')
    drawer = Draw.rdMolDraw2D.MolDraw2DSVG(width, width)
    Draw.rdMolDraw2D.PrepareMolForDrawing(m, wedgeBonds=True, kekulize=True, addChiralHs=True)
    drawer.DrawMolecule(m)
    drawer.FinishDrawing()
    with open('2d_mol.svg', 'w') as f:
        svg = drawer.GetDrawingText()
        f.write(svg)
    print('done')

and the depiction you get looks like the one on the page: https://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/MN

Thank you in advance for clarification.

rdkit through python 2020.03.4 on mac 10.15.6

Best,
Lukas
Re: [Rdkit-discuss] Constructing a mol object from a PDB ligand
Hi IIllimar, I don’t really know what your use case is, so it may be completely useless. However, just to add my two cents, we've created a package that builds on the top of rdkit and parses PDB ligand definitions from cif files. You can find the package here: https://gitlab.ebi.ac.uk/pdbe/ccdutils and the documentation can be found here: https://pdbe.gitdocs.ebi.ac.uk/ccdutils/ Let me know if this is helpful or you need further help. Best, Lukas On 16/12/2019, 20:03, "Paolo Tosco" wrote: Hi IIllimar, The RDKit PDB reader only recognize standard amino acids and, after the PR I did on Saturday https://github.com/rdkit/rdkit/pull/2850 will be merged, nucleic acid bases. Anything else will not have the correct hybridization/bond orders perceived, as those are not encoded in the PDB format and the PDB reader does not have any functionality to do that. The 1ARJ case is peculiar, as it has an ARG residue which would be recognized if it were in the ATOM records, but not in the HETATM section, for which no attempt to perceive the correct hybridization/bond is made. My suggestion, if you are using standard PDB files, is to download the SDF file: https://www.rcsb.org/pdb/download/downloadLigandFiles.do?ligandIdList=A2F=3GOT=all=false=false and construct your RDKit molecule from that. You should be able to automate that without too much effort either constructing URLs using the template above or using the PDB REST API. Cheers, p. On 16/12/2019 18:24, Illimar Hugo Rekand wrote: > Thanks, Paolo, for a good and clear example. > > > I adapted your code into my workflow to calculate some Lipinski-properties of RNA pdb-structures, and ran into some issues. I'm not sure if I should make a new thread or throw this onto this one I already created? 
> I used the following code under
>
> from rdkit import Chem
> from rdkit.Chem import rdmolops, Lipinski
> from urllib.request import urlopen
> import gzip
> import pprint
> pp = pprint.PrettyPrinter(indent=4)
>
> Lipinski_dic = {
>     'FractionCSP3': Lipinski.FractionCSP3,
>     'HeavyAtomCount': Lipinski.HeavyAtomCount,
>     'NHOHCount': Lipinski.NHOHCount,
>     "NOCount": Lipinski.NOCount,
>     "NumAliphaticCarbocycles": Lipinski.NumAliphaticCarbocycles,
>     "NumAliphaticHeterocycles": Lipinski.NumAliphaticHeterocycles,
>     'NumAliphaticRings': Lipinski.NumAliphaticRings,
>     'NumAromaticCarbocycles': Lipinski.NumAromaticCarbocycles,
>     'NumAromaticHeterocycles': Lipinski.NumAromaticHeterocycles,
>     'NumAromaticRings': Lipinski.NumAromaticRings,
>     'NumHAcceptors': Lipinski.NumHAcceptors,
>     'NumHDonors': Lipinski.NumHDonors,
>     'NumHeteroatoms': Lipinski.NumHeteroatoms,
>     'NumRotatableBonds': Lipinski.NumRotatableBonds,
>     'NumSaturatedCarbocycles': Lipinski.NumSaturatedCarbocycles,
>     'NumSaturatedHeterocycles': Lipinski.NumSaturatedHeterocycles,
>     'NumSaturatedRings': Lipinski.NumSaturatedRings,
>     'RingCount': Lipinski.RingCount
> }
>
> url = "https://files.rcsb.org/download/1arj.pdb.gz"
> pdb_data = gzip.decompress(urlopen(url).read())
> mol = Chem.RWMol(Chem.MolFromPDBBlock(pdb_data))
> # bonds that connect a HETATM atom to a non-HETATM atom
> bonds_to_cleave = {(b.GetBeginAtomIdx(), b.GetEndAtomIdx()) for b in mol.GetBonds()
>                    if b.GetBeginAtom().GetPDBResidueInfo().GetIsHeteroAtom() ^ b.GetEndAtom().GetPDBResidueInfo().GetIsHeteroAtom()}
> [mol.RemoveBond(*b) for b in bonds_to_cleave]
> hetatm_frags = [f for f in rdmolops.GetMolFrags(mol, asMols=True, sanitizeFrags=True)
>                 if f.GetNumAtoms() and f.GetAtomWithIdx(0).GetPDBResidueInfo().GetIsHeteroAtom()]
> for hetatm in hetatm_frags:
>     res_name = hetatm.GetAtomWithIdx(0).GetPDBResidueInfo().GetResidueName()
>     calculated_props = {}
>     for prop in Lipinski_dic:
>         function = Lipinski_dic[prop]
>         x = function(hetatm)
>         calculated_props[prop] = x
>     pp.pprint(calculated_props)
>
> and
> as you can see, the properties of the ligand don't match up with what is expected (the number of SP3 atoms doesn't match up). When parsing the structure 3got, it fails to recognize the aromatic rings of the ligand A2F. I'm assuming this is caused by RDKit not assigning bond orders correctly when reading in RNA and DNA pdb files (something which I have reported earlier on this mailing list)?
>
> Running hetatm.UpdatePropertyCache(strict=True) does not remedy this problem.
Re: [Rdkit-discuss] Stereochemistry in rdkit
Hi Greg, Thank you for the answer. I used to use the stereochemistry assignment the way you describe, but someone complained that in one of the molecules they knew the stereochemistry was incorrect. It was suggested that we use the stereochemistry we have in db, so I changed that to setting atom tags (which randomly fixed those couple of issues, but apparently broke everything else down. I was wondering how does rdkit work out R/S from inchi string? Lukas From: Greg Landrum Date: Wednesday, 30 October 2019 at 04:28 To: Lukas Pravda Cc: RDKIT mailing list Subject: Re: [Rdkit-discuss] Stereochemistry in rdkit Hi Lukas, The stereochemistry tags that the RDKit uses in determining bond wedging (or for SMILES, generating 3D coordinates, etc.) are the ChiralTags on the atoms: CHI_TETRAHEDRAL_CW and CHI_TETRAHEDRAL_CCW. The current RDKit stereo representation is relative to the ordering of the bonds around an atom, not the ordering of neighboring atoms. So CHI_TETRAHEDRAL_CW means that when you look down the first bond towards the central atom you rotate clockwise to move from the second bond to the third. The CIP (R/S) atomic properties are set by AssignStereochemistry() using the ChiralTags. Note that the R/S assignments are only approximate, the actual CIP rules are quite complex (great paper on this here: https://pubs.acs.org/doi/abs/10.1021/acs.jcim.8b00324) and we've not made a serious attempt to get this right. It isn't currently possible to assign CIP R/S labels to atoms and use those to set the ChiralTags. It would be possible to put together a bit of Python that can do this, but it would only be as accurate as the RDKit's assignment of CIP priorities. I can put together a demo of how to do this, but I think/hope it's not actually what you need... If you have 3D coordinates, the absolute best way to set the ChiralTags (and thus have the chiral representation correct) is to use AssignStereochemistryFrom3D(). 
This will set the ChiralTags on the atoms as well as assigning the CIP codes (to the extent that those are correct). Here's a gist showing how this works: https://gist.github.com/greglandrum/aa802edd1bc49ac0452beff52d55 I hope this helps, -greg On Tue, Oct 29, 2019 at 12:13 PM Lukas Pravda wrote: Hi guys, I got completely puzzled by stereochemistry and the way to set it in rdkit. Among others we use rdkit to get 2D depictions. What I do in my code is that I construct molecule from scratch and set chiral tags to CHI_TETRAHEDRAL_CW for R, CHI_TETRAHEDRAL_CCW for S (this is the metadata we have for each atom, where applicable), otherwise CHI_UNSPECIFIED. Then I run sanitization on the molecule and generate images. That seems to be working incorrectly even for simple cases: e.g.: https://pdbe.org/chem/004 When constructing the molecule I set the stereocenter for the CA atom to CHI_TETRAHEDRAL_CCW (S), but when I then try to perceive the R/S by FindMolChiralCenters(force=false) it says ‘R’, so as the image. This is wrong. I can also directly set _CIPCODE for each atom where applicable to S/R directly (along with the chiral tags). Then the chiral atom is perceived as S by FindMolChiralCenters(force=false), but then again the image still says R. When I set neither the chiral tag nor the _CIPCODe and run AssignAtomChiralTagsFromStructure() and AssignStereochemistry() on the mol the atom under question gets atom tag CHI_TETRAHEDRAL_CW (I assume incorrectly), the _CIPCODE is correct (S) and the image is correct (why) as well (attached). So my question is, how do I set stereochemistry on individual atoms, so that it is perceived by rdkit and is not overwritten in any subsequent step. I hope the above mentioned description makes at least some sense. If not, I’ll try to distill a code sample for constructing this molecule from raw data. 
I also reproduced the same steps on http://pdbe.org/chem/THR, which also gives wrong results when I set the chiral tags manually (bond wedging should not be on the methyl group, I assume). Interestingly, setting the chiral tags from the structure by rdkit gives incorrect results here as well (attached). For the rdkit-set tags I get:

CA - CHI_TETRAHEDRAL_CCW (S) – (correct)
CB - CHI_TETRAHEDRAL_CCW (R) – (incorrect, should be CHI_TETRAHEDRAL_CW - R)

I’d be grateful for any piece of advice, because I have no idea what I have been doing wrong the whole time.

My settings:
Rdkit: 2019.09.1/2019.03.2
Conda: 4.7.12
Python: 3.7.4
OS: mac 10.15

Best,
Lukas
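Greg's description of the tag convention (look down the first bond towards the central atom, then see whether moving from the second bond to the third is a clockwise turn) can be illustrated with plain vector geometry. What follows is only a sketch under that reading of his explanation: `rotation_sense` is a hypothetical helper, it works on raw 3D coordinates rather than on RDKit atoms, and it is not how RDKit itself perceives stereochemistry.

```python
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def rotation_sense(center: Vec, first: Vec, second: Vec, third: Vec) -> str:
    """Viewed from `first` towards `center`, is the turn second -> third CW or CCW?"""
    view = _sub(center, first)              # direction the observer looks along
    turn = _cross(_sub(second, center), _sub(third, center))
    # The cross product points along the viewing direction for a clockwise turn
    # and against it for a counter-clockwise turn.
    signed = _dot(turn, view)
    if signed > 0:
        return "CW"
    if signed < 0:
        return "CCW"
    return "undefined"


# Observer sits on the +z axis and looks down at the origin.
center = (0.0, 0.0, 0.0)
first = (0.0, 0.0, 1.0)
second = (1.0, 0.0, -0.3)
third = (-0.5, 0.866, -0.3)
print(rotation_sense(center, first, second, third))   # second -> third
print(rotation_sense(center, first, third, second))   # same atoms, order swapped
```

The point of the example is that the result depends on the order in which the neighbours are listed, which is why a tag set from an R/S label alone, without regard to bond ordering, can come out inverted.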
Re: [Rdkit-discuss] Fwd: sending warnings to stderr/stdout to files
That’s true, but I always hated ‘do your homework’ kind of answers, especially when I was completely lost and did not know where to start from (which may not be your case).

    import sys
    from rdkit import Chem

    saved_std_err = sys.stderr
    log = sys.stderr = open('test_log.log', mode='w')
    Chem.WrapLogs()
    mol = Chem.MolFromSmiles('c1c1(C)(C)')
    log.close()
    sys.stderr = saved_std_err

works.

Lukas

From: Brian Lee
Date: Wednesday, 25 September 2019 at 20:55
To:
Cc: RDKIT mailing list , Lukas Pravda
Subject: Re: [Rdkit-discuss] Fwd: sending warnings to stderr/stdout to files

This isn't so much an RDKit question as it is a python IO question. I'd look into the docs at https://docs.python.org/3/library/io.html as a starting point.

On Wed, Sep 25, 2019 at 2:13 AM wrote:

Hi. Yes this works for me as per the RDKit docs. But I need to pipe it to a file, any suggestions? Thanks. Mike

On Tue, Sep 24, 2019 at 10:28 PM +0100, "Lukas Pravda" wrote:

Hi Mike, The following code works for me:

    import sys
    from io import StringIO
    from rdkit import Chem

    saved_std_err = sys.stderr
    log = sys.stderr = StringIO()
    Chem.WrapLogs()
    # do whatever you want with rdkit
    whatever_used_to_be_printed_by_rdkit_in_console_as_str = log.getvalue()
    sys.stderr = saved_std_err

This populates the stream in memory. If you replace StringIO with FileIO I think you can redirect it directly to a file, but I have not tested that. Let me know if this works for you.

Lukas

From:
Date: Tuesday, 24 September 2019 at 16:52
To: RDKIT mailing list
Subject: [Rdkit-discuss] Fwd: sending warnings to stderr/stdout to files

Hi Rdkit forum, Easy question, perhaps difficult to answer… I’ve been reading a lot of support messages, many of them very old, about how to get warnings (only visible at the cmd line) sent to files. I can get error messages sent, but not warnings. The types I’m trying to capture are the following, which I can see when I run code at the cmd line, but not in jupyter-notebook:
“charges were rearranged”
“Omitted undefined stereo”

Ideally I’d like to run the code in jupyter-notebook and also at the cmd line with python script.py etc and get the warnings into a file. Can you provide me with a simple example as to how to do this?

Thanks,
mike

Dr Mike Mazanetz, FRSC
Director
Honorary Lecturer
School of Natural and Computing Sciences
University of Aberdeen
+44 (0) 141 533 0930
+44 (0) 7780 672509
mi...@novadatasolutions.co.uk
www.novadatasolutions.co.uk
skype michael.mazanetz
NovaData Solutions Ltd.
PO Box 639
Abingdon-on-Thames
Oxfordshire OX14 9JD
United Kingdom
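The idea discussed in this thread (point `sys.stderr` at a file while the noisy code runs, then restore it) can also be sketched with the standard library's `contextlib.redirect_stderr`, which puts the original stream back even if the wrapped code raises. This is a sketch under assumptions, not code from the thread: `noisy_call` is a hypothetical stand-in for the RDKit call that emits a warning, and it presumes the messages already reach Python's `sys.stderr`, which is what `Chem.WrapLogs()` is used for in the snippets above.

```python
import contextlib
import os
import sys
import tempfile


def noisy_call() -> None:
    """Hypothetical stand-in for an RDKit call that logs a warning to sys.stderr."""
    print("WARNING: charges were rearranged", file=sys.stderr)


def run_with_stderr_log(func, log_path: str) -> None:
    """Run func() while everything written to sys.stderr is appended to log_path."""
    with open(log_path, mode="a", encoding="utf-8") as log:
        with contextlib.redirect_stderr(log):
            func()


log_file = os.path.join(tempfile.mkdtemp(), "test_log.log")
run_with_stderr_log(noisy_call, log_file)

# Read the log back to show what was captured.
with open(log_file, encoding="utf-8") as handle:
    captured = handle.read()
print(captured, end="")
```

Opening the file in append mode keeps earlier warnings when the helper is called repeatedly, for example once per notebook cell.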
Re: [Rdkit-discuss] GenerateDepictionMatching2DStructure question
Hi Pat,

From my experience rdkit uses more or less 1.5 units bond length for 2D depictions. So it makes sense to rescale your template so that its bond length is 1.5. This is the code snippet I use for the same thing, to upscale a template with bond lengths of 1.0 to 1.5:

    import numpy
    from rdkit import Chem
    from rdkit.Chem import AllChem

    factor = 1.5
    # src and dst are the paths of the input and output molfiles
    mol = Chem.MolFromMolFile(src, sanitize=True)
    # 4x4 homogeneous transformation matrix that scales x, y and z by factor
    matrix = numpy.zeros((4, 4), float)  # numpy.float was removed in NumPy 1.24
    for i in range(3):
        matrix[i, i] = factor
    matrix[3, 3] = 1
    AllChem.TransformMol(mol, matrix)
    Chem.MolToMolFile(mol, dst)

Let me know if this is what you were looking for.

Lukas

From: Patrick Walters
Date: Thursday, 23 May 2019 at 13:22
To: RDKIT mailing list
Subject: [Rdkit-discuss] GenerateDepictionMatching2DStructure question

Hi All, I'm trying to align a set of structures to a template that I have as a molfile. When I call GenerateDepictionMatching2DStructure it appears that the coordinates of the template are directly copied. This results in a structure like the one below, where the bond lengths for the template are different from those in the rest of the molecule. Is there a way around this so that all of the bond lengths will be the same? My code is below, thanks in advance, Pat

from rdkit import Chem from rdkit.Chem import rdDepictor mb = """ RDKit 2D 9 10 0 0 0 0 0 0 0 0999 V2000 2.18450.20000. C 0 0 0 0 0 0 0 0 0 0 0 0 1.4701 -0.21250. C 0 0 0 0 0 0 0 0 0 0 0 0 1.4701 -1.03750. C 0 0 0 0 0 0 0 0 0 0 0 0 2.1845 -1.45000. N 0 0 0 0 0 0 0 0 0 0 0 0 2.8990 -1.03750. C 0 0 0 0 0 0 0 0 0 0 0 0 2.8990 -0.21250. C 0 0 0 0 0 0 0 0 0 0 0 0 3.68360.04250. N 0 0 0 0 0 0 0 0 0 0 0 0 3.6836 -1.29240. N 0 0 0 0 0 0 0 0 0 0 0 0 4.1685 -0.62500.
N 0 0 0 0 0 0 0 0 0 0 0 0 5 6 1 0 7 9 1 0 6 7 2 0 8 9 1 0 1 6 1 0 1 2 2 0 2 3 1 0 3 4 2 0 4 5 1 0 5 8 2 0 M END""" tmplt = Chem.MolFromMolBlock(mb) smiles = "FC(F)(F)Oc1(-n2nnc3ccc(NC4CCOCC4)nc32)c1" mol = Chem.MolFromSmiles(smiles) rdDepictor.GenerateDepictionMatching2DStructure(mol, tmplt) ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
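The effect of the scaling matrix from the reply can be checked without RDKit. The sketch below is an illustration only: it builds the same kind of 4x4 homogeneous scaling matrix in plain Python and applies it to two made-up template coordinates that are one unit apart. The helper names and the coordinates are assumptions and do not come from the thread.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def scaling_matrix(factor: float) -> List[List[float]]:
    """4x4 homogeneous matrix that scales x, y and z by the same factor."""
    matrix = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        matrix[i][i] = factor
    matrix[3][3] = 1.0
    return matrix


def transform(point: Point, matrix: List[List[float]]) -> Point:
    """Apply a 4x4 homogeneous transformation to a 3D point."""
    homogeneous = (point[0], point[1], point[2], 1.0)
    x, y, z = (sum(matrix[row][col] * homogeneous[col] for col in range(4))
               for row in range(3))
    return (x, y, z)


# Two bonded template atoms with a bond length of 1.0 (made-up coordinates).
template = [(0.0, 0.0, 0.0), (0.6, 0.8, 0.0)]
scaled = [transform(atom, scaling_matrix(1.5)) for atom in template]

print("before:", math.dist(template[0], template[1]))
print("after: ", math.dist(scaled[0], scaled[1]))
```

Because the scaling is uniform and has no translation part, all angles in the template are preserved and only the bond lengths change.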
Re: [Rdkit-discuss] Get num of heavy atoms returns incorrect value
Hi Paolo,

It did. In fact I got confused by the documentation and completely ignored that function. I should update my pointer to the documentation, as I somehow landed here: http://www.rdkit.org/docs-beta/api/rdkit.Chem.rdchem.Mol-class.html#GetNumAtoms and that says that onlyHeavy (now onlyExplicit) returns heavy atoms.

Thanks
Lukas

From: Paolo Tosco
Date: Wednesday, 1 May 2019 at 15:32
To: Lukas Pravda , RDKIT mailing list
Subject: Re: [Rdkit-discuss] Get num of heavy atoms returns incorrect value

mol.GetNumHeavyAtoms
[Rdkit-discuss] Get num of heavy atoms returns incorrect value
Dear all,

I construct my own rdkit.Mol objects from mmcif files. I wanted to use mol.GetNumAtoms(onlyExplicit=True) to get the number of heavy atoms for that molecule; however, I have noticed that the function always returns the number of all atoms in the molecule, including hydrogens (47 vs. the expected 31). When I iterate over the atoms to get the number of implicit/explicit Hs for each atom, I get 0 for all the atoms in the molecule, although the element types are correct (C’s, O’s, H’s etc.). So I assume that I construct the molecule incorrectly, and I wonder if there’s a way to tag hydrogen atoms correctly when I construct them. Hydrogens are explicitly present in my input structures and I’d like to get the GetNumAtoms(onlyExplicit=True) function to work as expected. Attached is a python pickle of the ATP molecule with two conformations. Interestingly, rdkit.Chem.Descriptors.HeavyAtomCount(self.mol) returns the correct value as expected.

My configuration:
OS: MacOS 10.14.4
Rdkit: 2019.03.01
Python: 3.7.3

Best,
Lukas

ATP.pickle
Description: Binary data
Re: [Rdkit-discuss] Which method to prefer for computing 2D coordinates
Sorry guys for the late reply.

The code is available here: https://gitlab.ebi.ac.uk/pdbe/ccdutils/blob/master/pdbeccdutils/core/depictions.py I hope you will find it useful. The class you are after is DepictionValidator. Presently, I have a few changes on the development branch, which are soon to be merged to master (basically I started to calculate the angle between two bonds if they share a common atom, because of this: http://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/CPT). Let me know if you have any questions or suggestions.

Lukas

From: Thomas Evangelidis
Date: Tuesday, 9 April 2019 at 15:43
To: Lukas Pravda
Cc: Jose Manuel Gally , RDKIT mailing list
Subject: Re: [Rdkit-discuss] Which method to prefer for computing 2D coordinates

Hello Lukas, I have also been struggling with 2D coordinate generation for quite a long time, as well as with what criteria to use for choosing the most appropriate depiction. Therefore, I would be very interested in using your code for 2D coordinate selection. With best regards, Thomas

PS: very nice notebook Jose. I also wanted to write something similar but never really found enough time to finish it.

On Tue, 9 Apr 2019 at 16:31, Lukas Pravda wrote:

Hi Jose, As you have shown, there is no single method which would be perfect for everything. If you don’t care that much about speed, a possible solution could be to compute coordinates with all three approaches and then simply select the best conformer based on some criteria. The solution I use is to generate 2D coordinates using multiple approaches, and then I have a set of methods which compute the number of bond collisions and of atoms being close to each other using a KD-tree. Altogether this is expressed as a penalty score, where lower is better. Should you need any code, let me know. Lukas

On 09/04/2019, 14:35, "Jose Manuel Gally" wrote:

Dear all, This might sound naive, but I want to compute 2D coordinates for a set of molecules. For now I am considering the 3 methods below [1].
I was wondering if there was any recommendation to use one method over another in some cases? For instance, very large rings are not displayed round for CoordGen but sometimes this method performs worse than the default (AllChem). Computational time is not really an issue here as I generate those coordinates on the fly for a very small set of compounds. Here is a gist with a few examples: https://gist.github.com/jose-manuel/0f2a5e8eae8bf2a72c0faad7f2f2a263

Thanks in advance, any suggestion is welcome!

Cheers,
Jose Manuel

[1] Methods:
1) rdkit.Chem.AllChem.Compute2DCoords (equivalent to rdkit.Chem.rdDepictor.Compute2DCoords)
2) rdkit.Avalon.pyAvalonTools.Generate2DCoords
3) rdkit.Chem.rdCoordGen.AddCoords + rescale

--
Dr Thomas Evangelidis
Research Scientist
IOCB - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
& CEITEC - Central European Institute of Technology, Brno, Czech Republic
email: teva...@gmail.com
website: https://sites.google.com/site/thomasevangelidishomepage/
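The selection strategy described in this thread (generate several depictions, score each one by bond collisions and by atoms that sit too close together, keep the one with the lowest penalty) can be sketched in plain Python. This is not the DepictionValidator code from pdbeccdutils: the function names, the 0.5 distance threshold, the equal weighting of the two terms and the toy coordinates are all assumptions, and the close-atom search is a brute-force pair loop where the thread mentions a KD-tree.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Bond = Tuple[int, int]


def _orient(p: Point, q: Point, r: Point) -> float:
    """Signed area test: > 0 if r is left of the line p -> q, < 0 if right."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def bonds_cross(coords: Sequence[Point], b1: Bond, b2: Bond) -> bool:
    """True if two bonds that share no atom properly intersect in 2D."""
    if set(b1) & set(b2):
        return False
    p1, p2 = coords[b1[0]], coords[b1[1]]
    p3, p4 = coords[b2[0]], coords[b2[1]]
    return (_orient(p3, p4, p1) * _orient(p3, p4, p2) < 0
            and _orient(p1, p2, p3) * _orient(p1, p2, p4) < 0)


def penalty(coords: Sequence[Point], bonds: Sequence[Bond], min_dist: float = 0.5) -> int:
    """Number of crossing bond pairs plus number of atom pairs closer than min_dist."""
    crossings = sum(bonds_cross(coords, a, b) for a, b in combinations(bonds, 2))
    clashes = sum(math.dist(coords[i], coords[j]) < min_dist
                  for i, j in combinations(range(len(coords)), 2))
    return crossings + clashes


def pick_best(depictions: Dict[str, List[Point]], bonds: Sequence[Bond]) -> str:
    """Return the name of the depiction with the lowest penalty."""
    return min(depictions, key=lambda name: penalty(depictions[name], bonds))


# Toy four-membered ring drawn once cleanly and once with two atoms swapped.
ring_bonds = [(0, 1), (1, 2), (2, 3), (3, 0)]
candidates = {
    "clean": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "twisted": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
}
for name, coords in candidates.items():
    print(name, penalty(coords, ring_bonds))
print("best:", pick_best(candidates, ring_bonds))
```

In practice the candidate coordinate sets would come from the three generators listed in [1], and the brute-force pair loop could be replaced by a spatial index for larger molecules.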
[Rdkit-discuss] PDBe-KB ligand centric pages survey
Dear All,

We are in the process of redesigning the ligand pages of PDBe, and RDKit is playing a major role in it. We would be grateful if you could fill out a short survey to help us understand what information about small molecules / ligands you would find useful. The survey is available at https://bit.ly/2FFmHFG

Recently, we have introduced protein-specific aggregated views on the structural data (pdbe-kb.org/proteins) as a part of the Protein Data Bank in Europe Knowledge Base (PDBe-KB). We highlight the available information related to structures of specific proteins, including structural and functional annotations, domains, ligand-binding sites and interfaces. In the next step we would like to present a similar aggregated view from a small molecule / ligand perspective.

Thank you for your time,
Lukas
--
Lukas Pravda, Ph.D.
Bioinformatician/Scientific Programmer
Protein Data Bank in Europe (PDBe)
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD
United Kingdom
Re: [Rdkit-discuss] Bond tags in SVGs
Hi Greg, I was wondering If you managed to create the Mac build you were talking about some time ago. Also I wonder If this functionality is going to be part of the next RDKit release? Best, Lukas From: Greg Landrum Date: Tuesday, 5 February 2019 at 14:45 To: Lukas Pravda Cc: RDKIT mailing list Subject: Re: [Rdkit-discuss] Bond tags in SVGs Sure. I don't have my Mac with me, so that'll need to wait until I'm back in Basel on the weekend. -greg On Tue, Feb 5, 2019 at 2:39 PM Lukas Pravda wrote: If it is not too much trouble to ask, please build it for mac os (10.14.3) python 3.6.x. Thanks! Lukas From: Greg Landrum Date: Tuesday, 5 February 2019 at 13:40 To: Lukas Pravda Cc: RDKIT mailing list Subject: Re: [Rdkit-discuss] Bond tags in SVGs On Tue, Feb 5, 2019 at 12:23 PM Lukas Pravda wrote: Thanks for this. It looks excellent!! Is there a way how I can test this? Other than cloning and compiling the repository? So far I have been using rdkit solely from python and its conda builds, so don’t really know how to test it. At the moment you would need to get a copy of the repo and build it. I can do a build so that it's conda-installable though. Which OS are you using? If I understand this correctly, the atom and bond class ids are added only after TagAtoms() is called, or are they added at the ‘DrawMolecule()’ stage? Bond classes are added as the bonds are written. Atom classes can only be added at the TagAtoms() stage - there's not an object in the SVG for many atoms without TagAtoms() being called. I can imagine a lot of possible scenarios and use cases with this new functionality. However, in order to make the function TagAtoms() sufficiently general, a bit more control over the javascript used in the events would be needed. As a possible suggestions, I can imagine to pass as the third parameter a lambda selector, which would in turn feed the JS function with parameters to display names/charges/whatever. 
Also it would be nice to have a mean how to pass dict of key-val properties for both atoms and bonds so that you can incorporate related data into the svg. Having said that, in my opinion if svgs end up as a part of html/javascript application, it is the best to expose this interactivity directly from the client, rather than ‘pre-generating’ the behaviour on the server. So I’m not sure If it is worth investing time into mimicking this functionality in C++/python code, Whoever is in a need of generating interactive svgs, can directly consume the svg string and modify it according to their needs. Yeah, that's more or less what I was thinking. We want to write something that can be reasonably easily modified after the fact to produce something useful. To sum up, I think it should enough just to tag positions and identifiers of atoms/bonds exactly as you do and possibly further extend them with a mean how to pass some extra data to all of it. Then users can modify svgs whichever way they want, but others might think differently. Excellent! -greg Best, Lukas From: Greg Landrum Date: Sunday, 3 February 2019 at 17:49 To: Lukas Pravda Cc: RDKIT mailing list Subject: Re: [Rdkit-discuss] Bond tags in SVGs Hi Lukas, I had a chance to do a bit of work on this recently and I'd be interested to hear your feedback. Bonds are now tagged with their bond IDs (using classes) and the "TagAtoms()" function now adds clickable transparent circles above each atom. These are also tagged with atom IDs using classes. TagAtoms() also lets you add callback functions for events associated with the atom circles. At the moment these are simply called with the atom id, but there's almost certainly a better way to do that. Suggestions are very welcome. Here's a gist showing what's currently on the branch: https://gist.github.com/greglandrum/d23517cb449003252cf09b5bd14d8637 On Tue, Dec 4, 2018 at 6:46 PM Lukas Pravda wrote: Hi Greg, that’s what I have been thinking, unlucky. 
Essentially, I want to color the molecule in web-browser with various annotations and make it interactive. For that part I’m converting it internally to the d3.js internal representation (https://d3js.org/) and connecting it to its environment. For most of the parts I’m just fine with the position of atoms in svg using the tag property. What I wanted to avoid is to replicate rdkit svg drawing code in javascript so that I don’t want to consume the dump of rdkit.Mol object. What I wanted to do instead is to use existing svg images and parse them into d3.js, so I know which paths belong to which bond. At this point my only idea is to color bonds individually and based on the overlay/proximity use kd-tree to reverse-engineer which bonds the paths belong to, which is a bit overkill in my view. Lukas From: Greg Landrum Date: Tuesday, 4 December 2018 at 17:24
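Once bonds carry their IDs as SVG classes, as described earlier in this thread, mapping paths back to bonds no longer needs any proximity search. The sketch below shows one way to do that with the standard library. The sample SVG is hand-written for illustration, and the `bond-N` / `atom-N` class naming is an assumption about what the tagged output looks like; the thread only states that bonds and atom circles are tagged with their IDs using classes.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List

SVG_NS = "{http://www.w3.org/2000/svg}"
BOND_CLASS = re.compile(r"\bbond-(\d+)\b")

# Hand-written stand-in for a tagged drawing (class names are an assumption).
SAMPLE_SVG = """
<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <path class="bond-0" d="M 10,10 L 50,10"/>
  <path class="bond-1" d="M 50,10 L 70,40"/>
  <path class="bond-1" d="M 52,8 L 72,38"/>
  <circle class="atom-0" cx="10" cy="10" r="5"/>
</svg>
""".strip()


def paths_by_bond(svg_text: str) -> Dict[int, List[str]]:
    """Group the 'd' attribute of every <path> by the bond index in its class."""
    root = ET.fromstring(svg_text)
    grouped: Dict[int, List[str]] = defaultdict(list)
    for path in root.iter(SVG_NS + "path"):
        match = BOND_CLASS.search(path.get("class", ""))
        if match:
            grouped[int(match.group(1))].append(path.get("d"))
    return dict(grouped)


for bond_idx, segments in sorted(paths_by_bond(SAMPLE_SVG).items()):
    print(bond_idx, segments)
```

A double bond shows up as two paths with the same class, which is why the result maps each bond index to a list of path definitions rather than to a single one.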
Re: [Rdkit-discuss] Atom coordinates from PDB-file
Hi Illimar,

If you need to access coordinates without creating a conformer object, do you really need to use rdkit in the first place? The PDB file is a column-based format, so extracting coordinates for atoms, for example with python, is very straightforward.

Lukas
--
Lukas Pravda, Ph.D.
Bioinformatician/Scientific Programmer
Protein Data Bank in Europe (PDBe)
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD
United Kingdom

On 25/02/2019, 10:11, "Illimar Hugo Rekand" wrote:

Hello, I am currently trying to access the xyz-coordinates for specific atoms (in a loop) from a .PDB-file. Is there an easy way to do this without creating a conformer of the molecule?

all the best,
Illimar Rekand
Ph.D. Candidate
Department of Biomedicine
University of Bergen
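The column-based extraction suggested in the reply can be sketched as follows. The column positions are those of the ATOM/HETATM records in the PDB format (atom name in columns 13-16, residue name in 18-20, x, y and z in 31-38, 39-46 and 47-54); the two sample records and the `atom_coordinates` helper are made up for illustration and are not from the thread.

```python
from typing import Iterable, List, Optional, Tuple

Coordinate = Tuple[float, float, float]
AtomRecord = Tuple[str, str, Coordinate]

# Two made-up records laid out in the fixed PDB columns.
SAMPLE_PDB = "\n".join([
    "HEADER    EXAMPLE",
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00",
    "HETATM    2  C1  LIG A   2       1.500  -2.250   0.000  1.00  0.00",
    "END",
])


def atom_coordinates(pdb_text: str,
                     wanted_names: Optional[Iterable[str]] = None) -> List[AtomRecord]:
    """Return (atom name, residue name, (x, y, z)) for ATOM/HETATM records."""
    wanted = set(wanted_names) if wanted_names is not None else None
    records: List[AtomRecord] = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        name = line[12:16].strip()       # columns 13-16
        if wanted is not None and name not in wanted:
            continue
        res_name = line[17:20].strip()   # columns 18-20
        xyz = (float(line[30:38]),       # columns 31-38
               float(line[38:46]),       # columns 39-46
               float(line[46:54]))       # columns 47-54
        records.append((name, res_name, xyz))
    return records


for record in atom_coordinates(SAMPLE_PDB):
    print(record)
print(atom_coordinates(SAMPLE_PDB, wanted_names=["C1"]))
```

Slicing by column rather than splitting on whitespace matters because neighbouring fields in real PDB files can touch each other when the numbers are wide.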
Re: [Rdkit-discuss] Bond tags in SVGs
If it is not too much trouble to ask, please build it for mac os (10.14.3) python 3.6.x. Thanks! Lukas From: Greg Landrum Date: Tuesday, 5 February 2019 at 13:40 To: Lukas Pravda Cc: RDKIT mailing list Subject: Re: [Rdkit-discuss] Bond tags in SVGs On Tue, Feb 5, 2019 at 12:23 PM Lukas Pravda wrote: Thanks for this. It looks excellent!! Is there a way how I can test this? Other than cloning and compiling the repository? So far I have been using rdkit solely from python and its conda builds, so don’t really know how to test it. At the moment you would need to get a copy of the repo and build it. I can do a build so that it's conda-installable though. Which OS are you using? If I understand this correctly, the atom and bond class ids are added only after TagAtoms() is called, or are they added at the ‘DrawMolecule()’ stage? Bond classes are added as the bonds are written. Atom classes can only be added at the TagAtoms() stage - there's not an object in the SVG for many atoms without TagAtoms() being called. I can imagine a lot of possible scenarios and use cases with this new functionality. However, in order to make the function TagAtoms() sufficiently general, a bit more control over the javascript used in the events would be needed. As a possible suggestions, I can imagine to pass as the third parameter a lambda selector, which would in turn feed the JS function with parameters to display names/charges/whatever. Also it would be nice to have a mean how to pass dict of key-val properties for both atoms and bonds so that you can incorporate related data into the svg. Having said that, in my opinion if svgs end up as a part of html/javascript application, it is the best to expose this interactivity directly from the client, rather than ‘pre-generating’ the behaviour on the server. 
So I’m not sure If it is worth investing time into mimicking this functionality in C++/python code, Whoever is in a need of generating interactive svgs, can directly consume the svg string and modify it according to their needs. Yeah, that's more or less what I was thinking. We want to write something that can be reasonably easily modified after the fact to produce something useful. To sum up, I think it should enough just to tag positions and identifiers of atoms/bonds exactly as you do and possibly further extend them with a mean how to pass some extra data to all of it. Then users can modify svgs whichever way they want, but others might think differently. Excellent! -greg Best, Lukas From: Greg Landrum Date: Sunday, 3 February 2019 at 17:49 To: Lukas Pravda Cc: RDKIT mailing list Subject: Re: [Rdkit-discuss] Bond tags in SVGs Hi Lukas, I had a chance to do a bit of work on this recently and I'd be interested to hear your feedback. Bonds are now tagged with their bond IDs (using classes) and the "TagAtoms()" function now adds clickable transparent circles above each atom. These are also tagged with atom IDs using classes. TagAtoms() also lets you add callback functions for events associated with the atom circles. At the moment these are simply called with the atom id, but there's almost certainly a better way to do that. Suggestions are very welcome. Here's a gist showing what's currently on the branch: https://gist.github.com/greglandrum/d23517cb449003252cf09b5bd14d8637 On Tue, Dec 4, 2018 at 6:46 PM Lukas Pravda wrote: Hi Greg, that’s what I have been thinking, unlucky. Essentially, I want to color the molecule in web-browser with various annotations and make it interactive. For that part I’m converting it internally to the d3.js internal representation (https://d3js.org/) and connecting it to its environment. For most of the parts I’m just fine with the position of atoms in svg using the tag property. 
What I wanted to avoid is to replicate rdkit svg drawing code in javascript so that I don’t want to consume the dump of rdkit.Mol object. What I wanted to do instead is to use existing svg images and parse them into d3.js, so I know which paths belong to which bond. At this point my only idea is to color bonds individually and based on the overlay/proximity use kd-tree to reverse-engineer which bonds the paths belong to, which is a bit overkill in my view. Lukas From: Greg Landrum Date: Tuesday, 4 December 2018 at 17:24 To: Lukas Pravda Cc: RDKIT mailing list Subject: Re: [Rdkit-discuss] Bond tags in SVGs Hi Lukas, There's not currently a way to do this at the moment. The closest you can get is by calling AddMoleculeMetadata(): In [6]: d = Draw.MolDraw2DSVG(200,200) In [8]: d.DrawMolecule(nm) In [10]: d.AddMoleculeMetadata(nm) In [11]: d.FinishDrawing() In [12]: svg = d.GetDrawingText() In [14]: print(svg) http://www.rdkit.org/xml; version="0.9"> This gets you t
Re: [Rdkit-discuss] Bond tags in SVGs
Hi Greg, Thanks for this. It looks excellent!! Is there a way how I can test this? Other than cloning and compiling the repository? So far I have been using rdkit solely from python and its conda builds, so don’t really know how to test it. If I understand this correctly, the atom and bond class ids are added only after TagAtoms() is called, or are they added at the ‘DrawMolecule()’ stage? I can imagine a lot of possible scenarios and use cases with this new functionality. However, in order to make the function TagAtoms() sufficiently general, a bit more control over the javascript used in the events would be needed. As a possible suggestions, I can imagine to pass as the third parameter a lambda selector, which would in turn feed the JS function with parameters to display names/charges/whatever. Also it would be nice to have a mean how to pass dict of key-val properties for both atoms and bonds so that you can incorporate related data into the svg. Having said that, in my opinion if svgs end up as a part of html/javascript application, it is the best to expose this interactivity directly from the client, rather than ‘pre-generating’ the behaviour on the server. So I’m not sure If it is worth investing time into mimicking this functionality in C++/python code, Whoever is in a need of generating interactive svgs, can directly consume the svg string and modify it according to their needs. To sum up, I think it should enough just to tag positions and identifiers of atoms/bonds exactly as you do and possibly further extend them with a mean how to pass some extra data to all of it. Then users can modify svgs whichever way they want, but others might think differently. Best, Lukas From: Greg Landrum Date: Sunday, 3 February 2019 at 17:49 To: Lukas Pravda Cc: RDKIT mailing list Subject: Re: [Rdkit-discuss] Bond tags in SVGs Hi Lukas, I had a chance to do a bit of work on this recently and I'd be interested to hear your feedback. 
Bonds are now tagged with their bond IDs (using classes) and the "TagAtoms()" function now adds clickable transparent circles above each atom. These are also tagged with atom IDs using classes. TagAtoms() also lets you add callback functions for events associated with the atom circles. At the moment these are simply called with the atom id, but there's almost certainly a better way to do that. Suggestions are very welcome. Here's a gist showing what's currently on the branch: https://gist.github.com/greglandrum/d23517cb449003252cf09b5bd14d8637 On Tue, Dec 4, 2018 at 6:46 PM Lukas Pravda wrote: Hi Greg, that’s what I have been thinking, unlucky. Essentially, I want to color the molecule in web-browser with various annotations and make it interactive. For that part I’m converting it internally to the d3.js internal representation (https://d3js.org/) and connecting it to its environment. For most of the parts I’m just fine with the position of atoms in svg using the tag property. What I wanted to avoid is to replicate rdkit svg drawing code in javascript so that I don’t want to consume the dump of rdkit.Mol object. What I wanted to do instead is to use existing svg images and parse them into d3.js, so I know which paths belong to which bond. At this point my only idea is to color bonds individually and based on the overlay/proximity use kd-tree to reverse-engineer which bonds the paths belong to, which is a bit overkill in my view. Lukas From: Greg Landrum Date: Tuesday, 4 December 2018 at 17:24 To: Lukas Pravda Cc: RDKIT mailing list Subject: Re: [Rdkit-discuss] Bond tags in SVGs Hi Lukas, There's not currently a way to do this at the moment. 
The closest you can get is by calling AddMoleculeMetadata(): In [6]: d = Draw.MolDraw2DSVG(200,200) In [8]: d.DrawMolecule(nm) In [10]: d.AddMoleculeMetadata(nm) In [11]: d.FinishDrawing() In [12]: svg = d.GetDrawingText() In [14]: print(svg) http://www.rdkit.org/xml; version="0.9"> This gets you the information you need to connect bond indices to the atoms, but I suspect that's not what you're looking for. In general you are guaranteed that the order of the bonds in the output SVG is the same as the order in the input molecule, but you can have multiple paths for a given bond. For example here, where the end atoms have different colors: In [25]: print(svg) OH http://www.rdkit.org/xml; version="0.9"> What are you looking to be able to do? That may make it easier to either come up with a work around or figure out what a new feature addition might look like. -greg On Mon, Dec 3, 2018 at 6:57 PM Lukas Pravda wrote: Hi all, I was wondering if there is a way how you can tag elements (bonds) in the svg created by rdkit. i.e. transform something like this: Into:
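As a rough sketch of the "consume the svg string and modify it" idea discussed in this thread, the snippet below groups SVG paths by bond index using only the python standard library. It assumes that each bond path carries a class of the form `bond-N`; the thread only says that bonds are tagged with their IDs using classes, so the exact class format, the sample SVG and the function name are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

SVG_NS = "http://www.w3.org/2000/svg"

# Hand-written sample, not real RDKit output. Bond 0 is drawn as two paths,
# as can happen when the end atoms have different colors.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">
<path class="bond-0" d="M 10,10 L 50,50" style="stroke:#000000"/>
<path class="bond-0" d="M 50,50 L 90,90" style="stroke:#FF0000"/>
<path class="bond-1" d="M 90,90 L 130,50" style="stroke:#000000"/>
<path d="M 0,0 L 1,1"/>
</svg>"""


def paths_by_bond(svg_text):
    """Map bond index -> list of path 'd' strings (assumes 'bond-N' classes)."""
    root = ET.fromstring(svg_text)
    grouped = defaultdict(list)
    for path in root.iter(f"{{{SVG_NS}}}path"):
        match = re.search(r"\bbond-(\d+)\b", path.get("class", ""))
        if match:  # paths without a bond class are ignored
            grouped[int(match.group(1))].append(path.get("d"))
    return dict(grouped)


for bond_idx, segments in sorted(paths_by_bond(SAMPLE_SVG).items()):
    print(bond_idx, segments)
```

The resulting mapping could then be serialised and handed to the client-side code, instead of inferring bond membership from path proximity.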
Re: [Rdkit-discuss] Warning as error
Hi Jean-Marc, Just a thought, but SDMolSupplier evaluates lazily, if I am not mistaken. Technically you should get all the rdkit warnings and errors at the time that bit of the sdf file is processed. You can always read the stderr output, parse it, and throw an exception every time a 'funny' molecule comes in. I use a routine similar to this:

    from io import StringIO
    import sys
    from rdkit import Chem

    saved_std_err = sys.stderr
    log = sys.stderr = StringIO()
    Chem.WrapLogs()  # route RDKit log messages through python's sys.stderr
    reader = Chem.SDMolSupplier('my_file.sdf')
    for mol in reader:
        error_msgs = log.getvalue()
        # check error_msgs for any particular errors and act accordingly,
        # perhaps even flush the stream
    sys.stderr = saved_std_err

Lukas On 21/01/2019, 13:24, "Jean-Marc Nuzillard" wrote: Chem.SDMolSupplier ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
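The "throw an exception every time something is written to stderr" idea can be sketched with the standard library alone. This is only an illustration: `fake_parse` is a hypothetical stand-in for the per-record processing, `RecordWarning` and `strict_call` are names invented here, and the approach only sees messages that actually pass through python's sys.stderr (which is what the Chem.WrapLogs() call in the routine above is for).

```python
import io
import sys
from contextlib import redirect_stderr


class RecordWarning(Exception):
    """Raised when processing a record wrote anything to stderr."""


def strict_call(func, *args):
    """Call func(*args); turn any stderr output into a RecordWarning."""
    buffer = io.StringIO()
    with redirect_stderr(buffer):
        result = func(*args)
    messages = buffer.getvalue().strip()
    if messages:
        raise RecordWarning(messages)
    return result


def fake_parse(record):
    # Hypothetical stand-in for a parser that warns on stderr.
    if "bad" in record:
        print(f"WARNING: funny molecule in {record!r}", file=sys.stderr)
    return record.upper()


for record in ["ok", "bad one"]:
    try:
        print(strict_call(fake_parse, record))
    except RecordWarning as exc:
        print(f"rejected {record!r}: {exc}")
```

Using a fresh buffer per call avoids having to flush a shared stream between records.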
Re: [Rdkit-discuss] Bond tags in SVGs
Hi Greg, that’s what I have been thinking, unlucky. Essentially, I want to color the molecule in web-browser with various annotations and make it interactive. For that part I’m converting it internally to the d3.js internal representation (https://d3js.org/) and connecting it to its environment. For most of the parts I’m just fine with the position of atoms in svg using the tag property. What I wanted to avoid is to replicate rdkit svg drawing code in javascript so that I don’t want to consume the dump of rdkit.Mol object. What I wanted to do instead is to use existing svg images and parse them into d3.js, so I know which paths belong to which bond. At this point my only idea is to color bonds individually and based on the overlay/proximity use kd-tree to reverse-engineer which bonds the paths belong to, which is a bit overkill in my view. Lukas From: Greg Landrum Date: Tuesday, 4 December 2018 at 17:24 To: Lukas Pravda Cc: RDKIT mailing list Subject: Re: [Rdkit-discuss] Bond tags in SVGs Hi Lukas, There's not currently a way to do this at the moment. The closest you can get is by calling AddMoleculeMetadata(): In [6]: d = Draw.MolDraw2DSVG(200,200) In [8]: d.DrawMolecule(nm) In [10]: d.AddMoleculeMetadata(nm) In [11]: d.FinishDrawing() In [12]: svg = d.GetDrawingText() In [14]: print(svg) http://www.rdkit.org/xml; version="0.9"> This gets you the information you need to connect bond indices to the atoms, but I suspect that's not what you're looking for. In general you are guaranteed that the order of the bonds in the output SVG is the same as the order in the input molecule, but you can have multiple paths for a given bond. For example here, where the end atoms have different colors: In [25]: print(svg) OH http://www.rdkit.org/xml; version="0.9"> What are you looking to be able to do? That may make it easier to either come up with a work around or figure out what a new feature addition might look like. 
-greg On Mon, Dec 3, 2018 at 6:57 PM Lukas Pravda wrote: Hi all, I was wondering if there is a way how you can tag elements (bonds) in the svg created by rdkit. i.e. transform something like this: Into: Or similar. I’ve found possibility of tagging atoms in the SVG using Draw.rdMolDraw2D.MolDraw2DSVG.drawOptions() method that exposes property includeAtomTags. This then renders following additional elements into the SVG: rdkit:atom idx="4" label="O-" x="153.479" y="82.8259" /> But I have not seen anything like this for bonds (latest release of RDKIT and python). Thanks, in advance for any hints. I was wondering about using highlightBondLists and then based on the svg infer the bond annotation, but that seems to be a bit of an overkill. Cheers, Lukas ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
[Rdkit-discuss] Bond tags in SVGs
Hi all, I was wondering if there is a way to tag elements (bonds) in the svg created by rdkit, i.e. transform something like this: Into: Or similar. I’ve found that atoms can be tagged in the SVG through the Draw.rdMolDraw2D.MolDraw2DSVG.drawOptions() method, which exposes the property includeAtomTags. This then renders the following additional elements into the SVG: <rdkit:atom idx="4" label="O-" x="153.479" y="82.8259" /> But I have not seen anything like this for bonds (latest release of RDKIT and python). Thanks in advance for any hints. I was wondering about using highlightBondLists and then inferring the bond annotation from the svg, but that seems to be a bit of an overkill. Cheers, Lukas ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
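For reference, the atom tags mentioned above can be read back with the python standard library. The sketch below is an assumption-laden illustration: the sample SVG is hand-written around the single rdkit:atom element quoted in the message, the namespace URI is taken to be http://www.rdkit.org/xml, and the function name is invented.

```python
import xml.etree.ElementTree as ET

RDKIT_NS = "http://www.rdkit.org/xml"  # assumed namespace of the rdkit: prefix

# Hand-written sample; the first atom element mirrors the one quoted above.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:rdkit="http://www.rdkit.org/xml" width="200" height="200">
<rdkit:atom idx="4" label="O-" x="153.479" y="82.8259" />
<rdkit:atom idx="5" label="C" x="120.0" y="60.5" />
</svg>"""


def atom_positions(svg_text):
    """Map atom idx -> (label, x, y) from rdkit:atom elements."""
    root = ET.fromstring(svg_text)
    positions = {}
    for elem in root.iter(f"{{{RDKIT_NS}}}atom"):
        positions[int(elem.get("idx"))] = (
            elem.get("label"),
            float(elem.get("x")),
            float(elem.get("y")),
        )
    return positions


print(atom_positions(SAMPLE_SVG))
```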
[Rdkit-discuss] How to set bond width with use of MolDraw2DSVG
Hi all, First of all, my configuration is the following: macOS: 10.14 conda: 4.5.11 python: 3.6.6 rdkit: 2018.09.01 I just tried to set the bond width for 2D SVG images and ran into a situation I don’t quite understand (nothing surprising ☺). I’m aware of 2 ways to generate SVG images. A simple one:

    from rdkit import Chem
    from rdkit.Chem import Draw

    width = 100
    mol = Chem.MolFromSmiles('c1cc(CCCO)ccc1')
    Draw.DrawingOptions.bondLineWidth = 10
    Draw.MolToFile(mol, 'img.svg', size=(width, width))

Here I can set bondLineWidth and it works like a charm. But if I use the other approach I know, which allows a bit more configuration,

    from rdkit import Chem
    from rdkit.Chem import Draw

    width = 100
    mol = Chem.MolFromSmiles('c1cc(CCCO)ccc1')
    drawer = Draw.MolDraw2DSVG(width, width)
    options = drawer.drawOptions()
    options.bondLineWidth = 10  # does not work
    Draw.DrawingOptions.bondLineWidth = 10  # does not work either
    mol = Draw.PrepareMolForDrawing(mol)
    drawer.DrawMolecule(mol)
    drawer.FinishDrawing()
    with open('img.svg', 'w') as f:
        f.write(drawer.GetDrawingText())

the bondLineWidth property is not part of the object returned by drawOptions(), and setting it in the same fashion as in the previous case does not work. So I am a bit puzzled at this point whether I am doing something wrong, or whether it is possible to set it at all with the other approach. Ideally, I’d like to use the other approach, as it allows a bit more configuration, and I would like to avoid manipulating the svg text directly. I appreciate all your help. Thank you! Lukas ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
[Rdkit-discuss] Coordgen library questions
Hi all, I’m playing with the Coordgen library inside rdkit and I have a couple of questions I could not figure out by myself. Hopefully someone more experienced will know. [comment] The way one can pass a scaling factor to the bond size is very unintuitive. If I don’t provide any parameter a single bond length is 1.0. If I pass 1.5 as a scaling factor, I’d expect to get single bond of a length 1.5. But instead I get 33.3. (measured in pymol) Snippet: from rdkit import Chem from rdkit.Chem import rdCoordGen mol = Chem.MolFromSmiles('Cc1c1', sanitize=True) mol1 = Chem.MolFromSmiles('Cc1c1', sanitize=True) p = rdCoordGen.CoordGenParams() p.coordgenScaling = 1.5 rdCoordGen.AddCoords(mol) rdCoordGen.AddCoords(mol1, p) Chem.MolToMolFile(mol, 'default.sdf') # bond length 1 Chem.MolToMolFile(mol1, '1.5_scale.sdf') # bond length 33.3 Is that intended? Is there any way to modify templates, which can be passed as the ‘templateFileDir’ parameter to match general groups and bonds as described here: http://rdkit.blogspot.com/2016/07/tuning-substructure-queries-ii.html? By default, rdCoordGen module writes to stderr by putting ‘TEMPLATES: /path/to/templates’ line for each depiction generated. Is there any simple way of muting that piece of information without manually hijacking the stderr (rdkit.rdBase.DisableLog('rdApp.*') does not work)? Thanks for possible suggestions. Cheers, Lukas ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
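One possible reading of the 33.3 figure, offered as an assumption rather than a confirmed answer: if coordgen natively lays out single bonds with a length of 50 units and the RDKit wrapper divides the coordinates by coordgenScaling (with a default of 50), then a scaling of 1.5 would give 50 / 1.5 = 33.3, which matches the measurement above. The constant and both helper functions below are illustrative names, not RDKit API.

```python
# Assumption: coordgen's native single-bond length is 50 units and the
# wrapper divides coordinates by coordgenScaling (default assumed to be 50).
COORDGEN_NATIVE_BOND_LENGTH = 50.0


def expected_bond_length(coordgen_scaling=50.0):
    """Bond length one would expect for a given coordgenScaling value."""
    return COORDGEN_NATIVE_BOND_LENGTH / coordgen_scaling


def scaling_for_bond_length(target_length):
    """coordgenScaling value that would give the requested bond length."""
    return COORDGEN_NATIVE_BOND_LENGTH / target_length


print(round(expected_bond_length(), 3))        # default scaling
print(round(expected_bond_length(1.5), 3))     # scaling used in the snippet
print(round(scaling_for_bond_length(1.5), 3))  # scaling for a 1.5 bond
```

Under that assumption, asking for a single bond of length 1.5 would mean passing roughly 33.3 as coordgenScaling rather than 1.5.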
[Rdkit-discuss] FW: Protein Data Bank in Europe is looking for bioinformaticians
Hi everyone, I’ll deliberately abuse this mailing list, so apologies for spam. In the Protein Data Bank in Europe part of EMBL-EBI (Hinxton, Cambridgeshire, UK) we are looking for Bioinformaticians. Presently we have 3 vacancies. For more information please send me a message or take a look here: https://www.ebi.ac.uk/pdbe/about/jobs Cheers, Lukas Pravda, Ph.D. Bioinformatician/Scientific Programmer Protein Data Bank in Europe (PDBe) European Bioinformatics Institute (EMBL-EBI) Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD United Kingdom ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] [Rdkit-announce] RDKit 2018.03.2 release
Hi Drew, are you sure that you are installing the 64-bit version? https://anaconda.org/rdkit/rdkit says that the version you have installed is present in the win-32 build, if I am not mistaken. Best, Lukas From: Drew Gibson via Rdkit-discuss Reply-To: Drew Gibson Date: Saturday, 9 June 2018 at 11:40 To: , Subject: Re: [Rdkit-discuss] [Rdkit-announce] RDKit 2018.03.2 release Hi, I am just installing the conda build on Windows, and it looks like I am getting rdkit 2017.09.1 rather than 2018.03.2 Current conda install:

    platform : win-64
    conda version : 4.3.23
    conda is private : False
    conda-env version : 4.3.23
    conda-build version : not installed
    python version : 3.6.0.final.0
    requests version : 2.12.4

but when I run conda create -c rdkit -n rdkit rdkit I get

    The following NEW packages will be INSTALLED:
    ...
    rdkit: 2017.09.1-py36_1 rdkit
    ...

and sure enough it self-identifies as 2017.09.1:

    >>> from rdkit import rdBase
    >>> rdBase.rdkitVersion
    '2017.09.1'

Incorrect version? Or just an incorrect version number? Cheers! Drew On Wed, 6 Jun 2018 at 06:57, Greg Landrum wrote: Hi, The 2018.03.2 release of the RDKit is now available. This is a patch release, so it just contains bug fixes. I've uploaded conda builds for Linux, the Mac, and Windows (I'm still working on the python 3.5 build, but 3.6 is up), as well as Linux and Mac builds of the cartridge. NOTE that this is now called: rdkit-postgresql. There should be builds available that work with the conda postgresql packages for v9.5, 9.6, and 10.0. If you give the cartridge builds a try, I would love to hear feedback on how it goes. The release notes are here: https://github.com/rdkit/rdkit/releases/tag/Release_2018_03_2 Best, -greg ___ Rdkit-announce mailing list rdkit-annou...@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-announce ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
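A quick way to check the 32-bit versus 64-bit question from inside the environment is to ask the interpreter for its pointer size. This is a small standard-library sketch; the function name is invented, and it only reports what the running python was built as, not which conda channel build was selected.

```python
import platform
import struct


def interpreter_bits():
    """Return 32 or 64 depending on the pointer size of this interpreter."""
    return struct.calcsize("P") * 8


print(f"python {platform.python_version()} is a {interpreter_bits()}-bit build")
print(f"machine reported by the OS: {platform.machine()}")
```

If this prints 32 inside the environment, a win-32 package would be the expected match.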
Re: [Rdkit-discuss] Programatic access to the mol sanitation process results
That worked! Although I was too lazy to modify the actual class and used a python package for that instead. If anyone is interested, minimal code that keeps stderr clean while retaining the error message as a variable to work with is below. It uses python streams and the wurlitzer package https://github.com/minrk/wurlitzer

    import io
    from rdkit import Chem
    from wurlitzer import pipes

    mol = Chem.MolFromSmiles('CO(C)C', sanitize=False)
    out_stream = io.BytesIO()
    # capture everything written to the C-level stderr inside the block
    with pipes(stderr=out_stream):
        sanitization_result = Chem.SanitizeMol(mol, catchErrors=True)
    error_msg = out_stream.getvalue().decode('utf-8')
    print(error_msg)

Lukas From: Peter Gedeck <peter.ged...@gmail.com> Date: Friday, 9 March 2018 at 15:02 To: Lukas Pravda <lpra...@ebi.ac.uk> Cc: Greg Landrum <greg.land...@gmail.com>, <Rdkit-discuss@lists.sourceforge.net> Subject: Re: [Rdkit-discuss] Programatic access to the mol sanitation process results Hello Lukas, The file rdkit/TestRunner.py contains a class/context manager called OutputRedirectC. If I remember correctly, this allowed capturing these messages. It's not used anywhere in the RDKit code base, so it may not work anymore. Anyway, give it a try and if it works, you can modify it to redirect the output into a variable or StringIO. Best, Peter On 9 Mar 2018, at 9:34 AM, Lukas Pravda <lpra...@ebi.ac.uk> wrote: Hello Greg, I’m very sorry for the late reply. Thank you for the hint on disabling the log message, it works on my end. However, I was more interested in catching the other bit, i.e. which part of the structure is wrong, rather than which part of the sanitization process failed. That is, accessing the message ‘Explicit valence for atom # 1 O, 3, is greater than permitted’ in a form that lets me find out that it is the misbehaving oxygen which causes the failure of the sanitization process. Perhaps piping the log information into a variable or something like that.
Best, Lukas From: Greg Landrum <greg.land...@gmail.com> Date: Thursday, 22 February 2018 at 13:32 To: Lukas Pravda <lpra...@ebi.ac.uk> Cc: RDKit Discuss <Rdkit-discuss@lists.sourceforge.net> Subject: Re: [Rdkit-discuss] Programatic access to the mol sanitation process results Hi Lukas, On Thu, Feb 22, 2018 at 1:14 PM, Lukas Pravda <lpra...@ebi.ac.uk> wrote: Dear rdkiters, I’m constructing molecules from scratch using python 3.5.4 and RDKit 2017.09.2 and due to the variety of reasons some of them are violating general principles of chemistry in a way implemented in rdkit, so I’m getting information like: Explicit valence for atom # 14 N, 4, is greater than permitted etc. I wonder if there is a way how to retrieve this piece of information in a programmatic way. In order to work with it. Presently, rdkit only prints this out into terminal and Chem.SanitizeMol() only returns first sanitization flag with the issue. Ideally, I’d like no information to be printed into console, while keeping the log info ‘Explicit valence for atom # 14 N, 4, is greater than permitted’ preferably in a structured way (in a property/method?), in order to further deal with those erroneous cases. At last part of this is pretty straightforward. There are two parts: - making it so error messages don't go to the console - capturing the failed operation. The first is a bit fragile (i.e. doesn't always work), so you will sometimes end up still seeing error messages (as here), but the second should be reliable: In [30]: rdBase.DisableLog('rdApp.*') In [31]: m = Chem.MolFromSmiles('c11',sanitize=False) In [32]: Chem.SanitizeMol(m,catchErrors=True) [14:29:37] Can't kekulize mol. 
Unkekulized atoms: 0 1 2 3 4 Out[32]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_KEKULIZE In [35]: Chem.SanitizeMol(Chem.MolFromSmiles('CO(C)C',sanitize=False),catchErrors=True) [14:31:37] Explicit valence for atom # 1 O, 3, is greater than permitted Out[35]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_PROPERTIES You can see that the return value indicates what went wrong in the sanitization. I hope this helps, -greg -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
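Once the message text has been captured into a variable, as in the wurlitzer snippet earlier in this thread, the offending atom can be pulled out with a regular expression. The sketch below uses only the standard library and assumes the message wording stays exactly as quoted in this thread; the function name and the sample log string are made up for the example.

```python
import re

# Pattern follows the message wording quoted in this thread; other RDKit
# versions may phrase it differently (assumption).
VALENCE_ERROR = re.compile(
    r"Explicit valence for atom # (\d+) (\w+), (\d+), is greater than permitted"
)


def valence_problems(log_text):
    """Return a list of (atom_index, element, valence) found in log_text."""
    return [
        (int(idx), element, int(valence))
        for idx, element, valence in VALENCE_ERROR.findall(log_text)
    ]


sample_log = (
    "[14:31:37] Explicit valence for atom # 1 O, 3, is greater than permitted\n"
    "[14:31:38] Explicit valence for atom # 14 N, 4, is greater than permitted\n"
)
print(valence_problems(sample_log))
```

The atom index can then be used to look up the atom in the unsanitized molecule.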
Re: [Rdkit-discuss] Programatic access to the mol sanitation process results
Hello Greg, I’m very sorry for the late reply. Thank you for the hint on disabling the log message, it works on my end. However, I was more interested in catching the other bit i.e. which part of the structure is wrong, rather than which part of the sanitization process failed. That is accessing the message ‘Explicit valence for atom # 1 O, 3, is greater than permitted’ in form to find out that it is the misbehaving oxygen which causes failure of the sanitization process. Perhaps piping the log information into a variable or something like that. Best, Lukas From: Greg Landrum <greg.land...@gmail.com> Date: Thursday, 22 February 2018 at 13:32 To: Lukas Pravda <lpra...@ebi.ac.uk> Cc: RDKit Discuss <Rdkit-discuss@lists.sourceforge.net> Subject: Re: [Rdkit-discuss] Programatic access to the mol sanitation process results Hi Lukas, On Thu, Feb 22, 2018 at 1:14 PM, Lukas Pravda <lpra...@ebi.ac.uk> wrote: Dear rdkiters, I’m constructing molecules from scratch using python 3.5.4 and RDKit 2017.09.2 and due to the variety of reasons some of them are violating general principles of chemistry in a way implemented in rdkit, so I’m getting information like: Explicit valence for atom # 14 N, 4, is greater than permitted etc. I wonder if there is a way how to retrieve this piece of information in a programmatic way. In order to work with it. Presently, rdkit only prints this out into terminal and Chem.SanitizeMol() only returns first sanitization flag with the issue. Ideally, I’d like no information to be printed into console, while keeping the log info ‘Explicit valence for atom # 14 N, 4, is greater than permitted’ preferably in a structured way (in a property/method?), in order to further deal with those erroneous cases. At last part of this is pretty straightforward. There are two parts: - making it so error messages don't go to the console - capturing the failed operation. The first is a bit fragile (i.e. 
doesn't always work), so you will sometimes end up still seeing error messages (as here), but the second should be reliable: In [30]: rdBase.DisableLog('rdApp.*') In [31]: m = Chem.MolFromSmiles('c11',sanitize=False) In [32]: Chem.SanitizeMol(m,catchErrors=True) [14:29:37] Can't kekulize mol. Unkekulized atoms: 0 1 2 3 4 Out[32]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_KEKULIZE In [35]: Chem.SanitizeMol(Chem.MolFromSmiles('CO(C)C',sanitize=False),catchErrors=True) [14:31:37] Explicit valence for atom # 1 O, 3, is greater than permitted Out[35]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_PROPERTIES You can see that the return value indicates what went wrong in the sanitization. I hope this helps, -greg -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
[Rdkit-discuss] Programatic access to the mol sanitation process results
Dear rdkiters, I’m constructing molecules from scratch using python 3.5.4 and RDKit 2017.09.2, and for a variety of reasons some of them violate general principles of chemistry as implemented in rdkit, so I’m getting messages like: Explicit valence for atom # 14 N, 4, is greater than permitted etc. I wonder if there is a way to retrieve this piece of information programmatically, in order to work with it. Presently, rdkit only prints this out to the terminal, and Chem.SanitizeMol() only returns the first sanitization flag with an issue. Ideally, I’d like nothing to be printed to the console while keeping the log info ‘Explicit valence for atom # 14 N, 4, is greater than permitted’, preferably in a structured way (in a property/method?), in order to further deal with those erroneous cases. Thank you for your answer, Lukas ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
[Rdkit-discuss] Generate depiction matching 2D structure
Dear all, I’ve just recently started using rdkit in python. Btw, a very nice piece of work. First, I’d like to generate 2D depictions of molecules I’m constructing from 3D coordinate data. I’m aware of methods such as GenerateDepictionMatching2DStructure(…) and GenerateDepictionMatching3DStructure(…). In most cases this works like a charm, however there are bits and pieces which need improving. That is why I wonder if there is a way either a) to provide multiple templates or b) to provide a mapping between the model and the template at the atomic level. For instance, when I have a molecule containing two copies of the template, just one of them gets rendered correctly, while the other is still a mess. The documentation says something about AtomPairsParameters (http://www.rdkit.org/Python_Docs/rdkit.Chem.rdMolDescriptors.AtomPairsParameters-class.html), but given the description I have no clue whether or how to use it. The second question I have in mind is that I’m facing the same issues on macOS and python 3.6.x as described here: https://sourceforge.net/p/rdkit/mailman/message/36093960/ and here https://github.com/rdkit/rdkit/issues/1617 so I hope there will be a solution in the future, as I’d like to use python 3.6. My current setup is: macOS High Sierra Conda: 4.3.25 Rdkit 2017.9.2 Python: 3.5.3 I look forward to an answer. All the best, lukas ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss