Re: [Rdkit-discuss] RDKit installation problem

2020-08-02 Thread Lukas Pravda
Hi Sebastian,

 

Looking quickly at the builds available in the rdkit conda channel 
(https://anaconda.org/rdkit/rdkit), it appears that you are pulling the Windows 
32-bit version of rdkit. Perhaps this is because you are using a 32-bit 
version of conda? Try installing the 64-bit version of conda and pull again.
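
One quick way to check which flavour of interpreter an environment runs is to ask Python itself. This is only a minimal sketch using the standard library; the helper name interpreter_bits is made up for illustration and is not part of conda or RDKit.

```python
import platform
import struct


def interpreter_bits() -> int:
    """Return the pointer width, in bits, of the running Python interpreter."""
    # struct.calcsize("P") is the size of a C pointer in bytes.
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    # platform.machine() reports the hardware architecture string.
    print("machine:", platform.machine())
    print("interpreter bits:", interpreter_bits())
```

If this reports 32 inside the environment that conda created, the 32-bit explanation above becomes more plausible.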

 

Best,

Lukas

From: "Sebastián J. Castro" 
Date: Saturday, 1 August 2020 at 20:29
To: 
Subject: [Rdkit-discuss] RDKit installation problem

 

I have tried the installation suggested at http://www.rdkit.org/docs/Install.html:

 

$ conda create -c rdkit -n my-rdkit-env rdkit
But I get the 2017 version instead of the 2020 one (the latest release).

 

I don't know how to install it. Can you help me?

 

I have Ubuntu 20.04 LTS

 

Thank you

 

Best regards!

 

-- 

Dr. Sebastián J. Castro

Departamento de Ciencias Farmacéuticas

Facultad de Ciencias Químicas

Universidad Nacional de Córdoba

UNITEFA-CONICET

___ Rdkit-discuss mailing list 
Rdkit-discuss@lists.sourceforge.net 
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss 

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Manganese ion as a radical?

2020-07-27 Thread Lukas Pravda
Dear rdkit community,

 

I’m not quite sure if this is more of an rdkit or a chemistry related question. 
I’d like to understand why a manganese ion has 3 radical electrons when 
interpreted by rdkit. I have not seen radicals in any other metal ion so far.

 

The code to get the depiction looks like this:

from rdkit import Chem
from rdkit.Chem import Draw

width = 500

m = Chem.MolFromInchi('InChI=1S/Mn/q+2')
drawer = Draw.rdMolDraw2D.MolDraw2DSVG(width, width)

Draw.rdMolDraw2D.PrepareMolForDrawing(m, wedgeBonds=True, kekulize=True,
                                      addChiralHs=True)

drawer.DrawMolecule(m)
drawer.FinishDrawing()

with open('2d_mol.svg', 'w') as f:
    svg = drawer.GetDrawingText()
    f.write(svg)

print('done')

 

The depiction you get looks like the one on this page: 
https://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/MN

Thank you in advance for any clarification.
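
To see what RDKit has assigned without drawing anything, the radical count can be read straight from the atom. The sketch below is an assumption-laden illustration: it builds the ion from SMILES rather than InChI (so it does not need InChI support), radical_report is a made-up helper name, and the count you see may depend on the RDKit version.

```python
from rdkit import Chem


def radical_report(mol):
    """List (symbol, formal charge, radical electrons) for every atom."""
    return [
        (a.GetSymbol(), a.GetFormalCharge(), a.GetNumRadicalElectrons())
        for a in mol.GetAtoms()
    ]


mol = Chem.MolFromSmiles('[Mn+2]')
print('as parsed:', radical_report(mol))

# If the radical dots are not wanted in a depiction, the count can be
# overridden by hand. Re-sanitizing the molecule may assign it again.
atom = mol.GetAtomWithIdx(0)
atom.SetNumRadicalElectrons(0)
print('after reset:', radical_report(mol))
```

Whether zeroing the count is chemically appropriate is a separate question; this only shows where the number lives.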

 

 

rdkit through python 2020.03.4 on mac 10.15.6

 

Best,

Lukas



Re: [Rdkit-discuss] Constructing a mol object from a PDB ligand

2019-12-17 Thread Lukas Pravda
Hi Illimar,

I don’t really know what your use case is, so this may be completely useless. 
However, just to add my two cents: we've created a package that builds on top 
of RDKit and parses PDB ligand definitions from CIF files. You can find the 
package here: https://gitlab.ebi.ac.uk/pdbe/ccdutils and the documentation 
here: https://pdbe.gitdocs.ebi.ac.uk/ccdutils/ 

Let me know if this is helpful or you need further help.

Best,
Lukas

On 16/12/2019, 20:03, "Paolo Tosco"  wrote:

Hi Illimar,

The RDKit PDB reader only recognizes standard amino acids and, once the 
PR I opened on Saturday (https://github.com/rdkit/rdkit/pull/2850) is 
merged, nucleic acid bases.

For anything else, the correct hybridization/bond orders will not be 
perceived, as those are not encoded in the PDB format and the PDB reader 
has no functionality to infer them.

The 1ARJ case is peculiar, as it has an ARG residue which would be 
recognized if it were in the ATOM records, but not in the HETATM 
section, for which no attempt is made to perceive the correct 
hybridization/bond orders.

My suggestion, if you are using standard PDB files, is to download the 
SDF file:


https://www.rcsb.org/pdb/download/downloadLigandFiles.do?ligandIdList=A2F=3GOT=all=false=false

and construct your RDKit molecule from that.

You should be able to automate that without too much effort either 
constructing URLs using the template above or using the PDB REST API.
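
Once the SDF text is in hand (however it was downloaded), building the molecule could look roughly like the sketch below. The download step is deliberately left out; the demo text is generated locally from a SMILES as a stand-in, and mols_from_sdf_text is a hypothetical helper name.

```python
from rdkit import Chem


def mols_from_sdf_text(sdf_text):
    """Parse SDF text that is already in memory and return the valid molecules."""
    supplier = Chem.SDMolSupplier()
    supplier.SetData(sdf_text)
    # Entries that fail to parse come back as None and are skipped here.
    return [mol for mol in supplier if mol is not None]


# Stand-in for a downloaded ligand file: phenol written out as a one-record SDF.
demo_sdf = Chem.MolToMolBlock(Chem.MolFromSmiles('c1ccccc1O')) + '$$$$\n'
ligands = mols_from_sdf_text(demo_sdf)
print([Chem.MolToSmiles(m) for m in ligands])
```

Because the SDF carries explicit bond orders, the aromatic ring is perceived on reading, which is the point of preferring it over the PDB HETATM records.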

Cheers,
p.

On 16/12/2019 18:24, Illimar Hugo Rekand wrote:
> Thanks, Paolo, for a good and clear example.
>
>
> I adapted your code into my workflow to calculate some 
> Lipinski properties of RNA PDB structures, and ran into some issues. I'm not 
> sure whether I should start a new thread or add this to the one I already 
> created.
>
>
> I used the following code under
>
>
> from rdkit import Chem
> from rdkit.Chem import rdmolops, Lipinski
> from urllib.request import urlopen
> import gzip
> import pprint
> pp = pprint.PrettyPrinter(indent=4)
>
>
> Lipinski_dic = {'FractionCSP3':Lipinski.FractionCSP3,
>  'HeavyAtomCount':Lipinski.HeavyAtomCount,
>  'NHOHCount': Lipinski.NHOHCount,
>  "NOCount":Lipinski.NOCount,
>  "NumAliphaticCarbocycles": 
>      Lipinski.NumAliphaticCarbocycles,
>  "NumAliphaticHeterocycles": Lipinski.NumAliphaticHeterocycles,
>  'NumAliphaticRings': Lipinski.NumAliphaticRings,
>  'NumAromaticCarbocycles': Lipinski.NumAromaticCarbocycles,
>  'NumAromaticHeterocycles': Lipinski.NumAromaticHeterocycles,
>  'NumAromaticRings': Lipinski.NumAromaticRings,
>  'NumHAcceptors': Lipinski.NumHAcceptors,
>  'NumHDonors': Lipinski.NumHDonors,
>  'NumHeteroatoms': Lipinski.NumHeteroatoms,
>  'NumRotatableBonds': Lipinski.NumRotatableBonds,
>  'NumSaturatedCarbocycles': Lipinski.NumSaturatedCarbocycles,
>  'NumSaturatedHeterocycles': Lipinski.NumSaturatedHeterocycles,
>  'NumSaturatedRings': Lipinski.NumSaturatedRings,
>  'RingCount': Lipinski.RingCount
>  }
>
> url = "https://files.rcsb.org/download/1arj.pdb.gz"
> pdb_data = gzip.decompress(urlopen(url).read())
> mol = Chem.RWMol(Chem.MolFromPDBBlock(pdb_data))
> # bonds that connect a HETATM atom to a non-HETATM atom
> bonds_to_cleave = {(b.GetBeginAtomIdx(), b.GetEndAtomIdx()) for b in
>                    mol.GetBonds() if b.GetBeginAtom().GetPDBResidueInfo().GetIsHeteroAtom() ^
>                    b.GetEndAtom().GetPDBResidueInfo().GetIsHeteroAtom()}
> [mol.RemoveBond(*b) for b in bonds_to_cleave]
> hetatm_frags = [f for f in rdmolops.GetMolFrags(mol, asMols=True,
>                 sanitizeFrags=True) if f.GetNumAtoms() and
>                 f.GetAtomWithIdx(0).GetPDBResidueInfo().GetIsHeteroAtom()]
> for hetatm in hetatm_frags:
>     res_name = hetatm.GetAtomWithIdx(0).GetPDBResidueInfo().GetResidueName()
>     calculated_props = {}
>     for prop in Lipinski_dic:
>         function = Lipinski_dic[prop]
>         x = function(hetatm)
>         calculated_props[prop] = x
>     pp.pprint(calculated_props)
>
>
> and as you can see the properties of the ligand doesn't match up with 
> what is expected (the number of sp3 atoms doesn't match up). When parsing 
> the structure 3got, it fails to recognize the aromatic rings of the 
> ligand A2F. I'm assuming this is caused by RDKit not assigning bond orders 
> correctly when reading in RNA and DNA PDB files (something which I have 
> reported earlier on this mailing list)?
>
>
> Running hetatm.UpdatePropertyCache(strict=True) does not remedy this 
> problem.

Re: [Rdkit-discuss] Stereochemistry in rdkit

2019-10-30 Thread Lukas Pravda
Hi Greg,

 

Thank you for the answer. I used to assign stereochemistry the way 
you describe, but someone complained that the stereochemistry of one molecule 
they knew was incorrect. It was suggested that we use the stereochemistry 
we have in the db, so I switched to setting atom tags (which happened to fix 
those couple of issues, but apparently broke everything else).

 

I was wondering how RDKit works out R/S from an InChI string?

 

Lukas

 

From: Greg Landrum 
Date: Wednesday, 30 October 2019 at 04:28
To: Lukas Pravda 
Cc: RDKIT mailing list 
Subject: Re: [Rdkit-discuss] Stereochemistry in rdkit

 

Hi Lukas,

 

The stereochemistry tags that the RDKit uses in determining bond wedging (or 
for SMILES, generating 3D coordinates, etc.) are the ChiralTags on the atoms: 
CHI_TETRAHEDRAL_CW and CHI_TETRAHEDRAL_CCW. The current RDKit stereo 
representation is relative to the ordering of the bonds around an atom, not the 
ordering of neighboring atoms. So CHI_TETRAHEDRAL_CW means that when you look 
down the first bond towards the central atom you rotate clockwise to move from 
the second bond to the third.
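
To make the distinction concrete, here is a small sketch that prints the ChiralTag next to the CIP label RDKit derives from it. The SMILES (alanine with one specified stereocentre) is just an illustrative input, not a molecule from this thread.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('C[C@H](N)C(=O)O')

for atom in mol.GetAtoms():
    tag = atom.GetChiralTag()
    if tag != Chem.ChiralType.CHI_UNSPECIFIED:
        # _CIPCode is filled in by the stereo perception that runs on parsing.
        cip = atom.GetProp('_CIPCode') if atom.HasProp('_CIPCode') else '?'
        print(atom.GetIdx(), atom.GetSymbol(), tag, cip)

# The same information, collected as (atom index, label) pairs.
centers = Chem.FindMolChiralCenters(mol)
print(centers)
```

The tag depends on the order in which the bonds were added to the atom, so the same R or S centre can carry either CW or CCW depending on how the molecule was built.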

 

The CIP (R/S) atomic properties are set by AssignStereochemistry() using the 
ChiralTags. Note that the R/S assignments are only approximate, the actual CIP 
rules are quite complex (great paper on this here: 
https://pubs.acs.org/doi/abs/10.1021/acs.jcim.8b00324) and we've not made a 
serious attempt to get this right.

 

It isn't currently possible to assign CIP R/S labels to atoms and use those to 
set the ChiralTags. It would be possible to put together a bit of Python that 
can do this, but it would only be as accurate as the RDKit's assignment of CIP 
priorities. I can put together a demo of how to do this, but I think/hope it's 
not actually what you need...

 

If you have 3D coordinates, the absolute best way to set the ChiralTags (and 
thus have the chiral representation correct) is to use 
AssignStereochemistryFrom3D(). This will set the ChiralTags on the atoms as 
well as assigning the CIP codes (to the extent that those are correct). Here's 
a gist showing how this works:

https://gist.github.com/greglandrum/aa802edd1bc49ac0452beff52d55
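
Independently of the gist, the idea can be sketched in a few lines: generate coordinates, wipe the tags, and let AssignStereochemistryFrom3D() put them back. The molecule and the random seed are arbitrary choices for illustration; real input would bring its own 3D coordinates.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Build a molecule with one stereocentre and give it 3D coordinates.
mol = Chem.AddHs(Chem.MolFromSmiles('C[C@H](N)C(=O)O'))
conf_id = AllChem.EmbedMolecule(mol, randomSeed=42)
before = Chem.FindMolChiralCenters(mol)

# Throw the stereo information away, as if the molecule had been built by hand.
for atom in mol.GetAtoms():
    atom.SetChiralTag(Chem.ChiralType.CHI_UNSPECIFIED)

# Recover ChiralTags (and CIP labels) purely from the coordinates.
Chem.AssignStereochemistryFrom3D(mol)
after = Chem.FindMolChiralCenters(mol)
print(before, after)
```

The two printed lists should agree, since the embedding respects the stereocentre that was specified in the input.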

 

I hope this helps,

-greg

 

 

 

On Tue, Oct 29, 2019 at 12:13 PM Lukas Pravda  wrote:

Hi guys,

 

I am completely puzzled by stereochemistry and how to set it in RDKit. 
Among other things, we use RDKit to get 2D depictions. In my code I 
construct the molecule from scratch and set chiral tags to CHI_TETRAHEDRAL_CW for 
R and CHI_TETRAHEDRAL_CCW for S (this is the metadata we have for each atom, where 
applicable), otherwise CHI_UNSPECIFIED. Then I run sanitization on the molecule 
and generate images. That seems to work incorrectly even for simple 
cases, e.g. https://pdbe.org/chem/004

 

When constructing the molecule I set the stereocenter for the CA atom to 
CHI_TETRAHEDRAL_CCW (S), but when I then try to perceive the R/S with 
FindMolChiralCenters(force=False) it says ‘R’, and so does the image. This is wrong. 
I can also set _CIPCode directly to S/R for each atom where applicable 
(along with the chiral tags). Then the chiral atom is perceived as S by 
FindMolChiralCenters(force=False), but the image still says R.

 

When I set neither the chiral tag nor the _CIPCode and run 
AssignAtomChiralTagsFromStructure() and AssignStereochemistry() on the mol, the 
atom in question gets the tag CHI_TETRAHEDRAL_CW (I assume incorrectly), 
the _CIPCode is correct (S), and the image is correct as well (why?) 
(attached). So my question is: how do I set stereochemistry on individual 
atoms so that it is perceived by RDKit and not overwritten in any 
subsequent step?

 

I hope the description above makes at least some sense. If not, I’ll 
try to distill a code sample for constructing this molecule from raw data.

I also reproduced the same steps on http://pdbe.org/chem/THR, which also 
gives wrong results when I set the chiral tags manually (the bond wedging should not be 
on the methyl group, I assume). Interestingly, here setting the chiral tags from the 
structure with RDKit gives incorrect results as well (attached).

 

With the tags set by RDKit I get:

CA - CHI_TETRAHEDRAL_CCW (S) – (correct)

CB - CHI_TETRAHEDRAL_CCW (R) – (incorrect, should be CHI_TETRAHEDRAL_CW - R)

I’d be grateful for any advice, because I have no idea what I have 
been doing wrong the whole time.

 

My settings:

Rdkit: 2019.09.1/2019.03.2

Conda: 4.7.12

Python 3.7.4

os mac 10.15

 

Best,

Lukas

 

 



Re: [Rdkit-discuss] Fwd: sending warnings to stderr/stdout to files

2019-09-25 Thread Lukas Pravda
That’s true, but I always hated ‘do your homework’ kind of answers, especially 
when I was completely lost and did not know where to start from (which may not 
be your case).

 

import sys

from rdkit import Chem

saved_std_err = sys.stderr
log = sys.stderr = open('test_log.log', mode='w')
Chem.WrapLogs()

mol = Chem.MolFromSmiles('c1c1(C)(C)')

log.close()
sys.stderr = saved_std_err

 

works
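
The same save/replace/restore pattern can be wrapped so that stderr is restored even when the code in between raises. This is a plain-Python sketch, not an RDKit API; stderr_to_file is a made-up name, and with RDKit one would still call Chem.WrapLogs() as above so that its messages pass through sys.stderr.

```python
import sys
from contextlib import contextmanager


@contextmanager
def stderr_to_file(path):
    """Temporarily point sys.stderr at a text file, restoring it afterwards."""
    saved = sys.stderr
    with open(path, mode='w') as log:
        sys.stderr = log
        try:
            yield log
        finally:
            # Restore even if the body raised, so later errors stay visible.
            sys.stderr = saved


if __name__ == '__main__':
    with stderr_to_file('test_log.log'):
        # Anything written to sys.stderr in here ends up in the file.
        print('this line goes to test_log.log', file=sys.stderr)
```

The file is closed when the with block ends, so there is no separate log.close() to forget.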

 

Lukas

 

From: Brian Lee 
Date: Wednesday, 25 September 2019 at 20:55
To: 
Cc: RDKIT mailing list , Lukas Pravda 

Subject: Re: [Rdkit-discuss] Fwd: sending warnings to stderr/stdout to files

 

This isn't so much an RDKit question as it is a python IO question. I'd look 
into the docs at https://docs.python.org/3/library/io.html as a starting point.

 

On Wed, Sep 25, 2019 at 2:13 AM  wrote:

Hi.

Yes this works for me as per the RDKit docs.  But I need to pipe it to a file, 
any suggestions?

Thanks.

Mike


 



On Tue, Sep 24, 2019 at 10:28 PM +0100, "Lukas Pravda"  
wrote:

Hi Mike,

 

The following code works for me:

 

import sys
from io import StringIO

from rdkit import Chem

saved_std_err = sys.stderr
log = sys.stderr = StringIO()
Chem.WrapLogs()

# do whatever you want with rdkit

whatever_used_to_be_printed_by_rdkit_in_console_as_str = log.getvalue()
sys.stderr = saved_std_err

 

 

This populates an in-memory stream. If you replace StringIO with FileIO, I 
think you can redirect it directly to a file, but I have not tested that. Let me 
know if this works for you.

 

Lukas

From: 
Date: Tuesday, 24 September 2019 at 16:52
To: RDKIT mailing list 
Subject: [Rdkit-discuss] Fwd: sending warnings to stderr/stdout to files

 

 

Hi Rdkit forum,

 

Easy question, perhaps difficult to answer…  

 

I’ve been reading a lot of support messages, many of them very old, about how to get 
warnings (only visible at the cmd line) sent to files.

I can get error messages sent, but not warnings.

 

The warnings I’m trying to capture, which I can see when I run code at the cmd 
line but not in jupyter-notebook, are of this type:

“charges were rearranged”

“Omitted undefined stereo”

 

Ideally I’d like to run the code in jupyter-notebook and also at the cmd line 
with python script.py etc  and get the warnings into a file.

 

Can you provide me with a simple example as to how to do this?

 

Thanks,

mike

 

 


 

Dr Mike Mazanetz, FRSC

Director

 

Honorary Lecturer

School of Natural and Computing Sciences

University of Aberdeen

 

+44 (0) 141 533 0930

+44 (0) 7780 672509

mi...@novadatasolutions.co.uk

www.novadatasolutions.co.uk

skype michael.mazanetz

 

NovaData Solutions Ltd.

PO Box 639

Abingdon-on-Thames

Oxfordshire

OX14 9JD

United Kingdom

 

 



Re: [Rdkit-discuss] Fwd: sending warnings to stderr/stdout to files

2019-09-24 Thread Lukas Pravda
Hi Mike,

 

The following code works for me:

 

import sys
from io import StringIO

from rdkit import Chem

saved_std_err = sys.stderr
log = sys.stderr = StringIO()
Chem.WrapLogs()

# do whatever you want with rdkit

whatever_used_to_be_printed_by_rdkit_in_console_as_str = log.getvalue()
sys.stderr = saved_std_err

 

 

This populates an in-memory stream. If you replace StringIO with FileIO, I 
think you can redirect it directly to a file, but I have not tested that. Let me 
know if this works for you.

 

Lukas

From: 
Date: Tuesday, 24 September 2019 at 16:52
To: RDKIT mailing list 
Subject: [Rdkit-discuss] Fwd: sending warnings to stderr/stdout to files

 

 

Hi Rdkit forum,

 

Easy question, perhaps difficult to answer…  

 

I’ve been reading a lot of support messages, many of them very old, about how to get 
warnings (only visible at the cmd line) sent to files.

I can get error messages sent, but not warnings.

 

The warnings I’m trying to capture, which I can see when I run code at the cmd 
line but not in jupyter-notebook, are of this type:

“charges were rearranged”

“Omitted undefined stereo”

 

Ideally I’d like to run the code in jupyter-notebook and also at the cmd line 
with python script.py etc  and get the warnings into a file.

 

Can you provide me with a simple example as to how to do this?

 

Thanks,

mike

 

 

 

Dr Mike Mazanetz, FRSC

Director

 

Honorary Lecturer

School of Natural and Computing Sciences

University of Aberdeen

 

+44 (0) 141 533 0930

+44 (0) 7780 672509

mi...@novadatasolutions.co.uk

www.novadatasolutions.co.uk

skype michael.mazanetz

 

NovaData Solutions Ltd.

PO Box 639

Abingdon-on-Thames

Oxfordshire

OX14 9JD

United Kingdom

 

 



Re: [Rdkit-discuss] GenerateDepictionMatching2DStructure question

2019-05-23 Thread Lukas Pravda
Hi Pat,

 

>From my experience rdkit uses more or less 1.5 units bond length for 2D 
>depictions. So it makes sense if you rescale your template so that the bond 
>length is 1.5.

 

This is the code snippet I use for the same thing to upscale template with bond 
lengths 1.0 to 1.5

 

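import numpy
from rdkit import Chem
from rdkit.Chem import AllChem

factor = 1.5

# src and dst are the paths of the input and output molfiles
mol = Chem.MolFromMolFile(src, sanitize=True)

# 4x4 homogeneous transformation matrix with uniform scaling
matrix = numpy.zeros((4, 4), float)
for i in range(3):
    matrix[i, i] = factor
matrix[3, 3] = 1

AllChem.TransformMol(mol, matrix)
Chem.MolToMolFile(mol, dst)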

 

Let me know if this is what you were looking for.
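
The factor of 1.5 above assumes a template drawn with bond length 1.0. For a template with a different bond length, the factor can be derived from the coordinates instead. The sketch below is plain Python with made-up helper names; the coordinates are the six-membered ring of the template quoted further down in this thread, and the 1.5 target is the rule of thumb mentioned above rather than a documented constant.

```python
import math


def mean_bond_length(coords, bonds):
    """Average 2D distance over the given (i, j) atom index pairs."""
    total = sum(
        math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
        for i, j in bonds
    )
    return total / len(bonds)


def scale_factor(coords, bonds, target=1.5):
    """Factor that rescales the template so its mean bond length hits target."""
    return target / mean_bond_length(coords, bonds)


# Atoms 1-6 of the template molfile (x, y), zero-based bond indices.
ring = [(2.1845, 0.2000), (1.4701, -0.2125), (1.4701, -1.0375),
        (2.1845, -1.4500), (2.8990, -1.0375), (2.8990, -0.2125)]
ring_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]

print(round(mean_bond_length(ring, ring_bonds), 3), round(scale_factor(ring, ring_bonds), 3))
```

The resulting factor would then replace the hard-coded 1.5 on the diagonal of the transformation matrix.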

 

Lukas

From: Patrick Walters 
Date: Thursday, 23 May 2019 at 13:22
To: RDKIT mailing list 
Subject: [Rdkit-discuss] GenerateDepictionMatching2DStructure question

 

Hi All,

 

I'm trying to align a set of structures to a template that I have as a molfile.  
When I call GenerateDepictionMatching2DStructure it appears that the coordinates 
for the template are copied directly.  This results in a structure like the one 
below, where the bond lengths in the template differ from those in the 
rest of the molecule.  

 

Is there a way around this so that all of the bond lengths will be the same?

 

My code is below, thanks in advance,

 

Pat

 

from rdkit import Chem
from rdkit.Chem import rdDepictor

 

mb = """
     RDKit          2D

  9 10  0  0  0  0  0  0  0  0999 V2000
    2.1845    0.2000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.4701   -0.2125    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.4701   -1.0375    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.1845   -1.4500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    2.8990   -1.0375    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.8990   -0.2125    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.6836    0.0425    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    3.6836   -1.2924    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    4.1685   -0.6250    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
  5  6  1  0
  7  9  1  0
  6  7  2  0
  8  9  1  0
  1  6  1  0
  1  2  2  0
  2  3  1  0
  3  4  2  0
  4  5  1  0
  5  8  2  0
M  END"""


tmplt = Chem.MolFromMolBlock(mb)

smiles = "FC(F)(F)Oc1cccc(-n2nnc3ccc(NC4CCOCC4)nc32)c1"
mol = Chem.MolFromSmiles(smiles)
rdDepictor.GenerateDepictionMatching2DStructure(mol, tmplt)

 



Re: [Rdkit-discuss] Get num of heavy atoms returns incorrect value

2019-05-01 Thread Lukas Pravda
Hi Paolo,

 

It did. In fact, I got confused by the documentation and completely overlooked that 
function. I should update my documentation bookmark, as I somehow 
landed here: 
http://www.rdkit.org/docs-beta/api/rdkit.Chem.rdchem.Mol-class.html#GetNumAtoms 
and that says that onlyHeavy (now onlyExplicit) returns heavy atoms.

 

Thanks

Lukas

 

 

From: Paolo Tosco 
Date: Wednesday, 1 May 2019 at 15:32
To: Lukas Pravda , RDKIT mailing list 

Subject: Re: [Rdkit-discuss] Get num of heavy atoms returns incorrect value

 

mol.GetNumHeavyAtoms



[Rdkit-discuss] Get num of heavy atoms returns incorrect value

2019-05-01 Thread Lukas Pravda
Dear all,

 

I construct my own rdkit Mol objects from mmCIF files. I wanted to use 
mol.GetNumAtoms(onlyExplicit=True) to get the number of heavy atoms in the 
molecule; however, I have noticed that the function always returns the number 
of all atoms in the molecule, including hydrogens (47 vs. the expected 31). When I 
iterate over the atoms to get the number of implicit/explicit Hs for each 
atom, I get 0 for all the atoms in the molecule, although the element types are 
correct (C’s, O’s, H’s etc.).

 

So I assume that I construct the molecule incorrectly and wonder if there’s a 
way to tag hydrogen atoms correctly when I construct them. 

 

Hydrogens are explicitly present in my input structures and I’d like to get 
GetNumAtoms(onlyExplicit=True) function to work as expected. Attached is a 
python pickle of ATP molecule with two conformations.

 

Interestingly rdkit.Chem.Descriptors.HeavyAtomCount(self.mol) returns correct 
value as expected.
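
For reference, the difference between the counting functions shows up on a small molecule. This is only an illustrative sketch with ethanol, not the ATP structure from the attachment.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('CCO')   # hydrogens are implicit here
mol_h = Chem.AddHs(mol)           # hydrogens become real atoms in the graph

counts = {
    'implicit Hs, GetNumAtoms()': mol.GetNumAtoms(),
    'implicit Hs, GetNumAtoms(onlyExplicit=False)': mol.GetNumAtoms(onlyExplicit=False),
    'explicit Hs, GetNumAtoms()': mol_h.GetNumAtoms(),
    'explicit Hs, GetNumHeavyAtoms()': mol_h.GetNumHeavyAtoms(),
}
for label, value in counts.items():
    print(label, value)
```

Hydrogens that are present as atoms in the graph count as "explicit" atoms, so onlyExplicit does not filter them out; GetNumHeavyAtoms() is the call that does.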

 

My configuration:

OS: MacOS: 10.14.4

Rdkit: 2019.03.01

Python: 3.7.3

 

Best,

 

Lukas



ATP.pickle
Description: Binary data


Re: [Rdkit-discuss] Which method to prefer for computing 2D coordinates

2019-04-10 Thread Lukas Pravda
Sorry for the late reply, guys.

The code is available here: 
https://gitlab.ebi.ac.uk/pdbe/ccdutils/blob/master/pdbeccdutils/core/depictions.py
I hope you find it useful. The class you are after is 
DepictionValidator. At present I have a few changes on the development branch 
which will soon be merged to master (basically, I started to calculate the angle 
between two bonds that share a common atom, because of this: 
http://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/CPT).

 

Let me know if you have any questions or suggestions.

Lukas

 

 

From: Thomas Evangelidis 
Date: Tuesday, 9 April 2019 at 15:43
To: Lukas Pravda 
Cc: Jose Manuel Gally , RDKIT mailing list 

Subject: Re: [Rdkit-discuss] Which method to prefer for computing 2D coordinates

 

Hello Lukas,

 

I have also been struggling with 2D coordinate generation for quite a long time, as well as 
with what criteria to use for choosing the most appropriate result. Therefore, I would be 
very interested in using your code for 2D coordinate selection.

 

With best regards,

Thomas

 

PS: very nice notebook Jose. I also wanted to write something similar but never 
really found enough time to finish it.

 

 

 

On Tue, 9 Apr 2019 at 16:31, Lukas Pravda  wrote:

Hi Jose,

As you have shown, there is no single method that is perfect for 
everything. If you don’t care that much about speed, a possible solution 
is to compute coordinates with all three approaches and then simply 
select the best conformer based on some criteria.

The solution I use is to generate 2D coordinates using multiple approaches, and 
then I have a set of methods which compute the number of bond collisions and of atoms 
that are too close to each other, using a KD-tree. All of this is expressed as a 
penalty score, where lower is better.

Should you need any code, let me know.

Lukas

On 09/04/2019, 14:35, "Jose Manuel Gally"  wrote:

Dear all,

This might sound naive, but I want to compute 2D coordinates for a set 
of molecules.

For now I am considering the 3 methods below [1].

I was wondering if there was any recommendation to use one method over 
another in some cases?

For instance, very large rings are not displayed round for CoordGen but 
sometimes this method performs worse than the default (AllChem).

Computational time is not really an issue here as I generate those 
coordinates on the fly for a very small set of compounds.

Here is a gist with a few examples: 
https://gist.github.com/jose-manuel/0f2a5e8eae8bf2a72c0faad7f2f2a263

Thanks in advance, any suggestion is welcome!

Cheers,
Jose Manuel

[1] Methods:

1) rdkit.Chem.AllChem.Compute2DCoords (equivalent to 
rdkit.Chem.rdDepictor.Compute2DCoords)
2) rdkit.Avalon.pyAvalonTools.Generate2DCoords
3) rdkit.Chem.rdCoordGen.AddCoords + rescale












 

-- 

==

Dr Thomas Evangelidis

Research Scientist

IOCB - Institute of Organic Chemistry and Biochemistry of the Czech Academy of 
Sciences

Prague, Czech Republic

  & 

CEITEC - Central European Institute of Technology
Brno, Czech Republic 

 

email: teva...@gmail.com

website: https://sites.google.com/site/thomasevangelidishomepage/

 



Re: [Rdkit-discuss] Which method to prefer for computing 2D coordinates

2019-04-09 Thread Lukas Pravda
Hi Jose,

As you have shown, there is no single method that is perfect for 
everything. If you don’t care that much about speed, a possible solution 
is to compute coordinates with all three approaches and then simply 
select the best conformer based on some criteria.

The solution I use is to generate 2D coordinates using multiple approaches, and 
then I have a set of methods which compute the number of bond collisions and of atoms 
that are too close to each other, using a KD-tree. All of this is expressed as a 
penalty score, where lower is better.
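
To give a rough idea of the scoring, here is a simplified sketch. It is not the actual ccdutils code: it uses a brute-force pair check instead of a KD-tree, only counts atoms that are too close (no bond collisions), and the function names, the 0.5 threshold and the toy coordinates are all invented for illustration.

```python
import math
from itertools import combinations


def clash_penalty(coords, min_distance=0.5):
    """Count atom pairs closer than min_distance (brute force, O(n^2))."""
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(coords, 2)
        if math.hypot(x1 - x2, y1 - y2) < min_distance
    )


def pick_best_depiction(candidates, min_distance=0.5):
    """Return the (name, coords) pair with the lowest penalty; lower is better."""
    return min(candidates.items(),
               key=lambda item: clash_penalty(item[1], min_distance))


# Toy 2D coordinates standing in for the output of two depiction methods.
candidates = {
    'method_a': [(0.0, 0.0), (1.5, 0.0), (1.6, 0.1)],   # two atoms nearly overlap
    'method_b': [(0.0, 0.0), (1.5, 0.0), (3.0, 0.0)],
}
best_name, best_coords = pick_best_depiction(candidates)
print(best_name, clash_penalty(best_coords))
```

For molecules with many atoms a KD-tree avoids the quadratic pair loop, which is why the real implementation uses one.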

Should you need any code, let me know.

Lukas

On 09/04/2019, 14:35, "Jose Manuel Gally"  wrote:

Dear all,

This might sound naive, but I want to compute 2D coordinates for a set 
of molecules.

For now I am considering the 3 methods below [1].

I was wondering if there was any recommendation to use one method over 
another in some cases?

For instance, very large rings are not displayed round for CoordGen but 
sometimes this method performs worse than the default (AllChem).

Computational time is not really an issue here as I generate those 
coordinates on the fly for a very small set of compounds.

Here is a gist with a few examples: 
https://gist.github.com/jose-manuel/0f2a5e8eae8bf2a72c0faad7f2f2a263

Thanks in advance, any suggestion is welcome!

Cheers,
Jose Manuel

[1] Methods:

1) rdkit.Chem.AllChem.Compute2DCoords (equivalent to 
rdkit.Chem.rdDepictor.Compute2DCoords)
2) rdkit.Avalon.pyAvalonTools.Generate2DCoords
3) rdkit.Chem.rdCoordGen.AddCoords + rescale












[Rdkit-discuss] PDBe-KB ligand centric pages survey

2019-04-04 Thread Lukas Pravda
Dear All,

 

We are in the process of redesigning the ligand pages of PDBe, and RDKit is 
playing a major role in it. We would be grateful if you could fill out a short 
survey to help us understand what information about small molecules / ligands 
you would find useful. The survey is available at https://bit.ly/2FFmHFG

 

Recently, we have introduced protein-specific aggregated views on the 
structural data (pdbe-kb.org/proteins) as a part of Protein Data Bank in Europe 
Knowledge Base (PDBe-KB). We highlight the available information related to 
structures of specific proteins, including structural and functional 
annotations, domains, ligand-binding sites and interfaces. In the next step we 
would like to present a similar aggregated view from a small molecule / ligand 
perspective. 

 

Thank you for your time,

Lukas

 

--

Lukas Pravda, Ph.D.

Bioinformatician/Scientific Programmer

 

Protein Data Bank in Europe (PDBe)

European Bioinformatics Institute (EMBL-EBI)

Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD

United Kingdom

 

 



Re: [Rdkit-discuss] Bond tags in SVGs

2019-03-12 Thread Lukas Pravda
Hi Greg,

 

I was wondering if you managed to create the Mac build you were talking about 
some time ago. I also wonder if this functionality is going to be part of the 
next RDKit release?

 

Best,

Lukas

 

From: Greg Landrum 
Date: Tuesday, 5 February 2019 at 14:45
To: Lukas Pravda 
Cc: RDKIT mailing list 
Subject: Re: [Rdkit-discuss] Bond tags in SVGs

 

Sure. I don't have my Mac with me, so that'll need to wait until I'm back in 
Basel on the weekend.

 

-greg

 

 

 

On Tue, Feb 5, 2019 at 2:39 PM Lukas Pravda  wrote:

If it is not too much trouble to ask, please build it for mac os (10.14.3) 
python 3.6.x.

 

Thanks!

Lukas

 

From: Greg Landrum 
Date: Tuesday, 5 February 2019 at 13:40
To: Lukas Pravda 
Cc: RDKIT mailing list 
Subject: Re: [Rdkit-discuss] Bond tags in SVGs

 

 

 

On Tue, Feb 5, 2019 at 12:23 PM Lukas Pravda  wrote:

 

Thanks for this. It looks excellent!! Is there a way I can test this other 
than cloning and compiling the repository? So far I have been using rdkit 
solely from Python via its conda builds, so I don’t really know how to test it.

 

At the moment you would need to get a copy of the repo and build it. I can do a 
build so that it's conda-installable though. Which OS are you using?

 

If I understand this correctly, the atom and bond class ids are added only 
after TagAtoms() is called, or are they added at the ‘DrawMolecule()’ stage? 

 

Bond classes are added as the bonds are written. Atom classes can only be added 
at the TagAtoms() stage - there's not an object in the SVG for many atoms 
without TagAtoms() being called. 

 

I can imagine a lot of possible scenarios and use cases for this new 
functionality. However, to make TagAtoms() sufficiently 
general, a bit more control over the JavaScript used in the events would be 
needed. As a possible suggestion, I can imagine passing a lambda selector as 
the third parameter, which would in turn feed the JS function with parameters to 
display names/charges/whatever. It would also be nice to have a means of 
passing a dict of key-value properties for both atoms and bonds, so that you can 
incorporate related data into the SVG. 

 

Having said that, in my opinion, if SVGs end up as part of an HTML/JavaScript 
application, it is best to implement this interactivity directly on the 
client rather than ‘pre-generating’ the behaviour on the server. So I’m not 
sure if it is worth investing time in mimicking this functionality in 
C++/Python code. Whoever needs to generate interactive SVGs can 
consume the SVG string directly and modify it according to their needs.

 

Yeah, that's more or less what I was thinking. We want to write something that 
can be reasonably easily modified after the fact to produce something useful.

 

 

To sum up, I think it should be enough just to tag the positions and identifiers of 
atoms/bonds exactly as you do, and possibly extend them with a means of 
passing some extra data along. Then users can modify the SVGs whichever way 
they want, but others might think differently.

 

Excellent!

 

-greg

 

 

Best,

Lukas

From: Greg Landrum 
Date: Sunday, 3 February 2019 at 17:49
To: Lukas Pravda 
Cc: RDKIT mailing list 
Subject: Re: [Rdkit-discuss] Bond tags in SVGs

 

Hi Lukas,

 

I had a chance to do a bit of work on this recently and I'd be interested to 
hear your feedback.

 

Bonds are now tagged with their bond IDs (using classes) and the "TagAtoms()" 
function now adds clickable transparent circles above each atom. These are also 
tagged with atom IDs using classes. TagAtoms() also lets you add callback 
functions for events associated with the atom circles. At the moment these are 
simply called with the atom id, but there's almost certainly a better way to do 
that. Suggestions are very welcome.

 

Here's a gist showing what's currently on the branch: 
https://gist.github.com/greglandrum/d23517cb449003252cf09b5bd14d8637

 

 

On Tue, Dec 4, 2018 at 6:46 PM Lukas Pravda  wrote:

Hi Greg, 

 

that’s what I have been thinking, unfortunately. Essentially, I want to color the 
molecule in a web browser with various annotations and make it interactive. For 
that I’m converting it to the d3.js internal representation 
(https://d3js.org/) and connecting it to its environment. For the most part 
I’m fine with the positions of atoms in the SVG obtained using the tag property.

What I wanted to avoid is replicating the rdkit SVG drawing code in JavaScript, 
so I don’t want to consume a dump of the rdkit.Mol object. What I wanted to do 
instead is to take the existing SVG images and parse them into d3.js, for which 
I need to know which paths belong to which bond.

At this point my only idea is to color bonds individually and, based on the 
overlay/proximity, use a kd-tree to reverse-engineer which bonds the paths belong 
to, which is a bit of an overkill in my view.

 

Lukas  

 

 

From: Greg Landrum 
Date: Tuesday, 4 December 2018 at 17:24

Re: [Rdkit-discuss] Atom coordinates from PDB-file

2019-02-25 Thread Lukas Pravda
Hi Illimar,

If you need to access coordinates without creating a conformer object, do you 
really need to use rdkit in the first place? PDB is a column-based format, so 
extracting atom coordinates with plain Python, for example, is very 
straightforward.
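
A minimal sketch of that approach, assuming standard fixed-width ATOM/HETATM records with x, y and z in columns 31-38, 39-46 and 47-54. The helper name read_pdb_coords and the sample record are made up for illustration:

```python
def read_pdb_coords(lines):
    """Yield (serial, atom_name, (x, y, z)) for each ATOM/HETATM record."""
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            serial = int(line[6:11])        # columns 7-11
            name = line[12:16].strip()      # columns 13-16
            xyz = (
                float(line[30:38]),         # columns 31-38
                float(line[38:46]),         # columns 39-46
                float(line[46:54]),         # columns 47-54
            )
            yield serial, name, xyz


# Hypothetical record, built with a format string so the columns line up.
sample_line = (
    "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   {:8.3f}{:8.3f}{:8.3f}".format(
        "ATOM", 1, " N", "", "MET", "A", 1, "", 27.340, 24.430, 2.614
    )
)
atoms = list(read_pdb_coords([sample_line]))
```

For a real file, the same generator can be fed an open file handle, and the loop can filter on the atom serial or name to pick specific atoms.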

Lukas

--
Lukas Pravda, Ph.D.
Bioinformatician/Scientific Programmer
 
Protein Data Bank in Europe (PDBe)
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD
United Kingdom
 


On 25/02/2019, 10:11, "Illimar Hugo Rekand"  wrote:

Hello,


I am currently trying to access the xyz-coordinates for specific atoms (in 
a loop) from a .PDB-file. Is there an easy way to do this without creating a 
conformer of the molecule?


all the best,


Illimar Rekand

Ph.D. Candidate

Department of Biomedicine

University of Bergen


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss







Re: [Rdkit-discuss] Bond tags in SVGs

2019-02-05 Thread Lukas Pravda
If it is not too much trouble, please build it for macOS (10.14.3), 
Python 3.6.x.

 

Thanks!

Lukas

 

From: Greg Landrum 
Date: Tuesday, 5 February 2019 at 13:40
To: Lukas Pravda 
Cc: RDKIT mailing list 
Subject: Re: [Rdkit-discuss] Bond tags in SVGs

 

 

 

On Tue, Feb 5, 2019 at 12:23 PM Lukas Pravda  wrote:

 

Thanks for this. It looks excellent!! Is there a way I can test this other 
than cloning and compiling the repository? So far I have been using rdkit 
solely from Python via its conda builds, so I don’t really know how to test it.

 

At the moment you would need to get a copy of the repo and build it. I can do a 
build so that it's conda-installable though. Which OS are you using?

 

If I understand this correctly, the atom and bond class ids are added only 
after TagAtoms() is called, or are they added at the ‘DrawMolecule()’ stage? 

 

Bond classes are added as the bonds are written. Atom classes can only be added 
at the TagAtoms() stage - there's not an object in the SVG for many atoms 
without TagAtoms() being called. 

 

I can imagine a lot of possible scenarios and use cases for this new 
functionality. However, to make TagAtoms() sufficiently general, a bit more 
control over the JavaScript used in the events would be needed. As a possible 
suggestion, the third parameter could be a lambda selector, which would in turn 
feed the JS function with parameters to display names/charges/whatever. It would 
also be nice to have a means of passing a dict of key-value properties for both 
atoms and bonds so that you can incorporate related data into the SVG. 

 

Having said that, in my opinion, if SVGs end up as part of an HTML/JavaScript 
application, it is best to expose this interactivity directly from the client 
rather than ‘pre-generating’ the behaviour on the server. So I’m not sure 
whether it is worth investing time into mimicking this functionality in 
C++/Python code. Whoever needs to generate interactive SVGs can directly 
consume the SVG string and modify it according to their needs.

 

Yeah, that's more or less what I was thinking. We want to write something that 
can be reasonably easily modified after the fact to produce something useful.

 

 

To sum up, I think it should be enough just to tag positions and identifiers of 
atoms/bonds exactly as you do, and possibly extend them further with a means of 
passing some extra data along. Users can then modify the SVGs whichever way 
they want, but others might think differently.

 

Excellent!

 

-greg

 

 

Best,

Lukas

From: Greg Landrum 
Date: Sunday, 3 February 2019 at 17:49
To: Lukas Pravda 
Cc: RDKIT mailing list 
Subject: Re: [Rdkit-discuss] Bond tags in SVGs

 

Hi Lukas,

 

I had a chance to do a bit of work on this recently and I'd be interested to 
hear your feedback.

 

Bonds are now tagged with their bond IDs (using classes) and the "TagAtoms()" 
function now adds clickable transparent circles above each atom. These are also 
tagged with atom IDs using classes. TagAtoms() also lets you add callback 
functions for events associated with the atom circles. At the moment these are 
simply called with the atom id, but there's almost certainly a better way to do 
that. Suggestions are very welcome.

 

Here's a gist showing what's currently on the branch: 
https://gist.github.com/greglandrum/d23517cb449003252cf09b5bd14d8637

 

 

On Tue, Dec 4, 2018 at 6:46 PM Lukas Pravda  wrote:

Hi Greg, 

 

that’s what I have been thinking, unfortunately. Essentially, I want to color the 
molecule in a web browser with various annotations and make it interactive. For 
that I’m converting it to the d3.js internal representation 
(https://d3js.org/) and connecting it to its environment. For the most part 
I’m fine with the positions of atoms in the SVG obtained using the tag property.

What I wanted to avoid is replicating the rdkit SVG drawing code in JavaScript, 
so I don’t want to consume a dump of the rdkit.Mol object. What I wanted to do 
instead is to take the existing SVG images and parse them into d3.js, for which 
I need to know which paths belong to which bond.

At this point my only idea is to color bonds individually and, based on the 
overlay/proximity, use a kd-tree to reverse-engineer which bonds the paths belong 
to, which is a bit of an overkill in my view.

 

Lukas  

 

 

From: Greg Landrum 
Date: Tuesday, 4 December 2018 at 17:24
To: Lukas Pravda 
Cc: RDKIT mailing list 
Subject: Re: [Rdkit-discuss] Bond tags in SVGs

 

Hi Lukas,

 

There's not currently a way to do this at the moment. The closest you can get 
is by calling AddMoleculeMetadata():

 

In [6]: d = Draw.MolDraw2DSVG(200,200)

 

In [8]: d.DrawMolecule(nm)

 

In [10]: d.AddMoleculeMetadata(nm)

 

In [11]: d.FinishDrawing()

 

In [12]: svg = d.GetDrawingText()

 

In [14]: print(svg)

[SVG markup stripped by the archive; surviving fragment:]

http://www.rdkit.org/xml; version="0.9">
 

This gets you t

Re: [Rdkit-discuss] Bond tags in SVGs

2019-02-05 Thread Lukas Pravda
Hi Greg,

 

Thanks for this. It looks excellent!! Is there a way I can test this other 
than cloning and compiling the repository? So far I have been using rdkit 
solely from Python via its conda builds, so I don’t really know how to test it.

 

If I understand this correctly, the atom and bond class ids are added only 
after TagAtoms() is called, or are they added at the ‘DrawMolecule()’ stage? I 
can imagine a lot of possible scenarios and use cases for this new 
functionality. However, to make TagAtoms() sufficiently general, a bit more 
control over the JavaScript used in the events would be needed. As a possible 
suggestion, the third parameter could be a lambda selector, which would in turn 
feed the JS function with parameters to display names/charges/whatever. It would 
also be nice to have a means of passing a dict of key-value properties for both 
atoms and bonds so that you can incorporate related data into the SVG. 

 

Having said that, in my opinion, if SVGs end up as part of an HTML/JavaScript 
application, it is best to expose this interactivity directly from the client 
rather than ‘pre-generating’ the behaviour on the server. So I’m not sure 
whether it is worth investing time into mimicking this functionality in 
C++/Python code. Whoever needs to generate interactive SVGs can directly 
consume the SVG string and modify it according to their needs.
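
As a rough illustration of consuming the SVG string and modifying it afterwards, here is a minimal sketch using only the Python standard library. The class names bond-0, bond-1 and atom-0 and the hand-written SVG are assumptions standing in for real drawer output, and annotate is a hypothetical helper, not an RDKit function:

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # keep the default namespace on output

# Hypothetical, hand-written stand-in for drawer.GetDrawingText();
# real RDKit output has more elements and attributes.
svg_text = (
    f'<svg xmlns="{SVG_NS}" width="100" height="100">'
    '<path class="bond-0" d="M 10,10 L 50,50"/>'
    '<path class="bond-1" d="M 50,50 L 90,10"/>'
    '<circle class="atom-0" cx="10" cy="10" r="5"/>'
    "</svg>"
)


def annotate(svg, data):
    """Copy per-class key/value data onto matching elements as data-* attributes."""
    root = ET.fromstring(svg)
    for elem in root.iter():
        for cls in elem.get("class", "").split():
            for key, value in data.get(cls, {}).items():
                elem.set(f"data-{key}", str(value))
    return ET.tostring(root, encoding="unicode")


annotated = annotate(svg_text, {"bond-0": {"order": 1}, "atom-0": {"name": "C1"}})
```

The data-* attributes can then be read by whatever JavaScript runs on the client, which keeps the event handling out of the generated SVG itself.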

 

To sum up, I think it should be enough just to tag positions and identifiers of 
atoms/bonds exactly as you do, and possibly extend them further with a means of 
passing some extra data along. Users can then modify the SVGs whichever way 
they want, but others might think differently.

 

Best,

Lukas

From: Greg Landrum 
Date: Sunday, 3 February 2019 at 17:49
To: Lukas Pravda 
Cc: RDKIT mailing list 
Subject: Re: [Rdkit-discuss] Bond tags in SVGs

 

Hi Lukas,

 

I had a chance to do a bit of work on this recently and I'd be interested to 
hear your feedback.

 

Bonds are now tagged with their bond IDs (using classes) and the "TagAtoms()" 
function now adds clickable transparent circles above each atom. These are also 
tagged with atom IDs using classes. TagAtoms() also lets you add callback 
functions for events associated with the atom circles. At the moment these are 
simply called with the atom id, but there's almost certainly a better way to do 
that. Suggestions are very welcome.

 

Here's a gist showing what's currently on the branch: 
https://gist.github.com/greglandrum/d23517cb449003252cf09b5bd14d8637

 

 

On Tue, Dec 4, 2018 at 6:46 PM Lukas Pravda  wrote:

Hi Greg, 

 

that’s what I have been thinking, unfortunately. Essentially, I want to color the 
molecule in a web browser with various annotations and make it interactive. For 
that I’m converting it to the d3.js internal representation 
(https://d3js.org/) and connecting it to its environment. For the most part 
I’m fine with the positions of atoms in the SVG obtained using the tag property.

What I wanted to avoid is replicating the rdkit SVG drawing code in JavaScript, 
so I don’t want to consume a dump of the rdkit.Mol object. What I wanted to do 
instead is to take the existing SVG images and parse them into d3.js, for which 
I need to know which paths belong to which bond.

At this point my only idea is to color bonds individually and, based on the 
overlay/proximity, use a kd-tree to reverse-engineer which bonds the paths belong 
to, which is a bit of an overkill in my view.

 

Lukas  

 

 

From: Greg Landrum 
Date: Tuesday, 4 December 2018 at 17:24
To: Lukas Pravda 
Cc: RDKIT mailing list 
Subject: Re: [Rdkit-discuss] Bond tags in SVGs

 

Hi Lukas,

 

There's not currently a way to do this at the moment. The closest you can get 
is by calling AddMoleculeMetadata():

 

In [6]: d = Draw.MolDraw2DSVG(200,200)

 

In [8]: d.DrawMolecule(nm)

 

In [10]: d.AddMoleculeMetadata(nm)

 

In [11]: d.FinishDrawing()

 

In [12]: svg = d.GetDrawingText()

 

In [14]: print(svg)

[SVG markup stripped by the archive; surviving fragment:]

http://www.rdkit.org/xml; version="0.9">
 

This gets you the information you need to connect bond indices to the atoms, 
but I suspect that's not what you're looking for.

 

In general you are guaranteed that the order of the bonds in the output SVG is 
the same as the order in the input molecule, but you can have multiple paths 
for a given bond. For example here, where the end atoms have different colors:

 

In [25]: print(svg)

[SVG markup stripped by the archive; surviving fragments:]

OH

http://www.rdkit.org/xml; version="0.9">
 

What are you looking to be able to do? That may make it easier to either come 
up with a work around or figure out what a new feature addition might look like.

 

-greg

 

 

 

 

On Mon, Dec 3, 2018 at 6:57 PM Lukas Pravda  wrote:

Hi all,

 

I was wondering if there is a way to tag <path> elements (bonds) in 
the SVG created by rdkit.

 

i.e. transform something like this: 





 

Into:





 

Re: [Rdkit-discuss] Warning as error

2019-01-21 Thread Lukas Pravda
Hi Jean-Marc,

Just a thought, but SDMolSupplier evaluates lazily, if I am not mistaken. 
Technically you should get all the rdkit warnings and errors at the time that 
bit of the SDF file is processed. You can always read the stderr output, 
parse it and throw an exception every time a 'funny' molecule comes in.

I use a routine similar to this:

from io import StringIO
import sys

from rdkit import Chem

# Route the RDKit logs through Python's sys.stderr and capture that stream.
saved_std_err = sys.stderr
log = sys.stderr = StringIO()
Chem.WrapLogs()

reader = Chem.SDMolSupplier('my_file.sdf')
for mol in reader:
    error_msgs = log.getvalue()
    # Check error_msgs for any particular errors and act accordingly,
    # perhaps even flush the stream.

# Restore the original stream.
sys.stderr = saved_std_err
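
The "throw an exception" part could be sketched roughly as follows. This is not RDKit code: strict, LoggedProblem and fake_parse are hypothetical names, and contextlib.redirect_stderr only sees writes that go through Python's sys.stderr, so with rdkit it assumes the logs have already been routed there (which is what WrapLogs() is used for in the routine above):

```python
import io
import sys
from contextlib import redirect_stderr


class LoggedProblem(Exception):
    """Raised when processing a record wrote anything to stderr."""


def strict(records, process):
    """Yield process(record), raising LoggedProblem if stderr output appeared."""
    for index, record in enumerate(records):
        buffer = io.StringIO()
        with redirect_stderr(buffer):
            result = process(record)
        message = buffer.getvalue().strip()
        if message:
            raise LoggedProblem(f"record {index}: {message}")
        yield result


def fake_parse(record):
    # Stand-in for reading one molecule; complains on stderr like a logger would.
    if "bad" in record:
        print("Warning: funny molecule", file=sys.stderr)
    return record.upper()
```

Using a fresh buffer per record keeps each message tied to the record that produced it, which matters because the supplier only parses a molecule when it is actually requested.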

Lukas

On 21/01/2019, 13:24, "Jean-Marc Nuzillard"  wrote:

Chem.SDMolSupplier




___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Bond tags in SVGs

2018-12-04 Thread Lukas Pravda
Hi Greg, 

 

that’s what I have been thinking, unfortunately. Essentially, I want to color the 
molecule in a web browser with various annotations and make it interactive. For 
that I’m converting it to the d3.js internal representation 
(https://d3js.org/) and connecting it to its environment. For the most part 
I’m fine with the positions of atoms in the SVG obtained using the tag property.

What I wanted to avoid is replicating the rdkit SVG drawing code in JavaScript, 
so I don’t want to consume a dump of the rdkit.Mol object. What I wanted to do 
instead is to take the existing SVG images and parse them into d3.js, for which 
I need to know which paths belong to which bond.

At this point my only idea is to color bonds individually and, based on the 
overlay/proximity, use a kd-tree to reverse-engineer which bonds the paths belong 
to, which is a bit of an overkill in my view.

 

Lukas  

 

 

From: Greg Landrum 
Date: Tuesday, 4 December 2018 at 17:24
To: Lukas Pravda 
Cc: RDKIT mailing list 
Subject: Re: [Rdkit-discuss] Bond tags in SVGs

 

Hi Lukas,

 

There's not currently a way to do this at the moment. The closest you can get 
is by calling AddMoleculeMetadata():

 

In [6]: d = Draw.MolDraw2DSVG(200,200)

 

In [8]: d.DrawMolecule(nm)

 

In [10]: d.AddMoleculeMetadata(nm)

 

In [11]: d.FinishDrawing()

 

In [12]: svg = d.GetDrawingText()

 

In [14]: print(svg)

[SVG markup stripped by the archive; surviving fragment:]

http://www.rdkit.org/xml; version="0.9">
 

This gets you the information you need to connect bond indices to the atoms, 
but I suspect that's not what you're looking for.

 

In general you are guaranteed that the order of the bonds in the output SVG is 
the same as the order in the input molecule, but you can have multiple paths 
for a given bond. For example here, where the end atoms have different colors:

 

In [25]: print(svg)

[SVG markup stripped by the archive; surviving fragments:]

OH

http://www.rdkit.org/xml; version="0.9">
 

What are you looking to be able to do? That may make it easier to either come 
up with a work around or figure out what a new feature addition might look like.

 

-greg

 

 

 

 

On Mon, Dec 3, 2018 at 6:57 PM Lukas Pravda  wrote:

Hi all,

 

I was wondering if there is a way to tag <path> elements (bonds) in 
the SVG created by rdkit.

 

i.e. transform something like this: 





 

Into:





 

Or similar. I’ve found a way of tagging atoms in the SVG using the 
Draw.rdMolDraw2D.MolDraw2DSVG.drawOptions() method, which exposes the property 
includeAtomTags. This then renders the following additional elements into the SVG:

<rdkit:atom idx="4" label="O-" x="153.479" y="82.8259" />

 

But I have not seen anything like this for bonds (latest release of RDKIT and 
python). Thanks, in advance for any hints. I was wondering about using 
highlightBondLists and then based on the svg infer the bond annotation, but 
that seems to be a bit of an overkill.

 

Cheers,

Lukas

 

 

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



[Rdkit-discuss] Bond tags in SVGs

2018-12-03 Thread Lukas Pravda
Hi all,

 

I was wondering if there is a way to tag <path> elements (bonds) in 
the SVG created by rdkit.

 

i.e. transform something like this: 





 

Into:





 

Or similar. I’ve found a way of tagging atoms in the SVG using the 
Draw.rdMolDraw2D.MolDraw2DSVG.drawOptions() method, which exposes the property 
includeAtomTags. This then renders the following additional elements into the SVG:

<rdkit:atom idx="4" label="O-" x="153.479" y="82.8259" />

 

But I have not seen anything like this for bonds (latest release of RDKIT and 
python). Thanks, in advance for any hints. I was wondering about using 
highlightBondLists and then based on the svg infer the bond annotation, but 
that seems to be a bit of an overkill.
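
For reference, the atom tags can be read back out of the SVG text with the standard library alone. The sketch below assumes the rdkit prefix is bound to http://www.rdkit.org/xml on the enclosing element; the surrounding SVG is a made-up fragment and atom_positions is a hypothetical helper, with only the rdkit:atom line taken from the output above:

```python
import xml.etree.ElementTree as ET

RDKIT_NS = "http://www.rdkit.org/xml"  # assumed namespace URI for the rdkit: prefix

# Hypothetical fragment; only the rdkit:atom element comes from real output.
svg_text = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    f'xmlns:rdkit="{RDKIT_NS}">'
    '<rdkit:atom idx="4" label="O-" x="153.479" y="82.8259" />'
    "</svg>"
)


def atom_positions(svg):
    """Map atom index -> (label, x, y) from rdkit:atom elements."""
    root = ET.fromstring(svg)
    return {
        int(a.get("idx")): (a.get("label"), float(a.get("x")), float(a.get("y")))
        for a in root.iter(f"{{{RDKIT_NS}}}atom")
    }


positions = atom_positions(svg_text)
```

Something equivalent for bonds is what is missing: there is no element to look up that says which path belongs to which bond index.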

 

Cheers,

Lukas

 

 

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] How to set bond width with use of MolDraw2DSVG

2018-10-30 Thread Lukas Pravda
Hi all,

 

First of all, my configuration is following:

macOS: 10.14

conda: 4.5.11

python: 3.6.6

rdkit: 2018.09.01

 

I just tried to set the bond width for 2D SVG images and ran into a situation I 
don’t quite understand (nothing surprising ☺). I’m aware of two ways to 
generate SVG images.

 

A simple one

from rdkit import Chem

from rdkit.Chem import Draw

width=100

mol = Chem.MolFromSmiles('c1cc(CCCO)ccc1')

Draw.DrawingOptions.bondLineWidth = 10

Draw.MolToFile(mol, 'img.svg', size=(width, width))

 

Here I can set bondLineWidth, it works like a charm, but if I use the other 
approach I know, which allows a bit more configuration,

 

from rdkit import Chem

from rdkit.Chem import Draw

mol = Chem.MolFromSmiles('c1cc(CCCO)ccc1')

drawer = Draw.MolDraw2DSVG(width, width)

options = drawer.drawOptions()

options.bondLineWidth = 10 # does not work

Draw.DrawingOptions.bondLineWidth = 10 # does not work either

mol = Draw.PrepareMolForDrawing(mol)

drawer.DrawMolecule(mol)

drawer.FinishDrawing()

with open(f'img.svg', 'w') as f:

    f.write(drawer.GetDrawingText())

 

the bondLineWidth property is not part of the object returned by drawOptions(), 
and setting it in the same fashion as in the previous case does not work either. 
So I am a bit puzzled at this point whether I am doing something wrong, or 
whether it is possible to set it at all with the second approach.

 

Ideally, I’d like to use the second approach as it allows a bit more 
configuration, and I would like to avoid manipulating the SVG text directly. I 
appreciate all your help.

 

Thank you!

Lukas 



    

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Coordgen library questions

2018-10-09 Thread Lukas Pravda
Hi all,

 

I’m playing with the Coordgen library inside rdkit and I have a couple of 
questions I could not figure out by myself. Hopefully someone more experienced 
will know.

 
[comment] The way one passes a scaling factor for the bond length is very 
unintuitive. If I don’t provide any parameter, a single bond has a length of 
1.0. If I pass 1.5 as the scaling factor, I’d expect to get a single bond of 
length 1.5, but instead I get 33.3 (measured in PyMOL). 
 

Snippet:

from rdkit import Chem

from rdkit.Chem import rdCoordGen

 

mol = Chem.MolFromSmiles('Cc1ccccc1', sanitize=True)

mol1 = Chem.MolFromSmiles('Cc1ccccc1', sanitize=True)

 

p = rdCoordGen.CoordGenParams()

p.coordgenScaling = 1.5

 

rdCoordGen.AddCoords(mol)

rdCoordGen.AddCoords(mol1, p)

 

Chem.MolToMolFile(mol, 'default.sdf') # bond length 1

Chem.MolToMolFile(mol1, '1.5_scale.sdf') # bond length 33.3

 

 

Is that intended?
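
For what it is worth, the two observed values would be consistent with the parameter acting as a divisor of a fixed internal bond length of 50 rather than as a multiplier. That is only an inference from the numbers above (50 / 1.5 is about 33.3), not something confirmed here, and the constant and function names in this sketch are made up:

```python
# Assumed internal single-bond length, inferred from the observed 33.3.
INTERNAL_BOND_LENGTH = 50.0


def resulting_bond_length(scaling):
    """Bond length expected in the output if scaling acts as a divisor."""
    return INTERNAL_BOND_LENGTH / scaling


def scaling_for(target_length):
    """Scaling value that would give the requested bond length."""
    return INTERNAL_BOND_LENGTH / target_length


observed = round(resulting_bond_length(1.5), 1)
needed = scaling_for(1.5)
```

Under that assumption, a bond length of 1.5 would need a scaling value of roughly 33.3 rather than 1.5.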

 
Is there any way to modify the templates, which can be passed as the 
‘templateFileDir’ parameter, to match general groups and bonds as described 
here: http://rdkit.blogspot.com/2016/07/tuning-substructure-queries-ii.html? 

By default, the rdCoordGen module writes to stderr, emitting a ‘TEMPLATES: 
/path/to/templates’ line for each depiction generated. Is there any simple way 
of muting that output without manually hijacking stderr 
(rdkit.rdBase.DisableLog('rdApp.*') does not work)?
 

Thanks for possible suggestions.

 

 

Cheers, 

Lukas

 

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] FW: Protein Data Bank in Europe is looking for bioinformaticians

2018-08-07 Thread Lukas Pravda
Hi everyone, 

 

I’ll deliberately abuse this mailing list, so apologies for spam. In the 
Protein Data Bank in Europe part of EMBL-EBI (Hinxton, Cambridgeshire, UK) we 
are looking for Bioinformaticians. Presently we have 3 vacancies. For more 
information please send me a message or take a look here: 
https://www.ebi.ac.uk/pdbe/about/jobs

 

Cheers,

 

Lukas Pravda, Ph.D.

Bioinformatician/Scientific Programmer

 

Protein Data Bank in Europe (PDBe)

European Bioinformatics Institute (EMBL-EBI)

Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD

United Kingdom

 

 

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] [Rdkit-announce] RDKit 2018.03.2 release

2018-06-09 Thread Lukas Pravda
Hi Drew,

 

are you sure that you are installing the 64-bit version? According to 
https://anaconda.org/rdkit/rdkit, the version you have installed is the one 
present in the win-32 build, if I am not mistaken.
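
A quick way to check which kind of interpreter an environment actually runs, using only the standard library (interpreter_bits is a hypothetical helper name):

```python
import platform
import struct


def interpreter_bits():
    """Pointer size of the running Python, in bits (32 or 64)."""
    return struct.calcsize("P") * 8


report = f"{platform.system()} {platform.machine()}, {interpreter_bits()}-bit Python"
print(report)
```

Running this inside the activated environment shows whether Python itself is a 32-bit or 64-bit build, independent of what the operating system reports.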

 

Best,

Lukas

 

From: Drew Gibson via Rdkit-discuss 
Reply-To: Drew Gibson 
Date: Saturday, 9 June 2018 at 11:40
To: , 
Subject: Re: [Rdkit-discuss] [Rdkit-announce] RDKit 2018.03.2 release

 

Hi,

 

I am just installing the conda build on Windows, and it looks like I am getting 
rdkit 2017.09.1 rather than 2018.03.2

 

Current conda install:

 platform : win-64

 conda version : 4.3.23

conda is private : False

conda-env version : 4.3.23

conda-build version : not installed

python version : 3.6.0.final.0

   requests version : 2.12.4

 

but when I run conda create -c rdkit -n rdkit rdkit

 

I get

 

The following NEW packages will be INSTALLED:

...

rdkit:   2017.09.1-py36_1 rdkit

...

 

and sure enough it self identifies as 2017.09.1 -

 

>>> from rdkit import rdBase

>>> rdBase.rdkitVersion

'2017.09.1'

 

Incorrect version ?  Or just an incorrect version number ?

 

Cheers !

 

Drew

 

On Wed, 6 Jun 2018 at 06:57, Greg Landrum  wrote:

Hi,

 

The 2018.03.2 release of the RDKit is now available. This is a patch release, 
so it just contains bug fixes.

 

I've uploaded conda builds for Linux, the Mac, and Windows (I'm still working 
on the python 3.5 build, but 3.6 is up), as well as Linux and Mac builds of the 
cartridge. NOTE that this is now called: rdkit-postgresql. There should be 
builds available that work with the conda postgresql packages for v9.5, 9.6, 
and 10.0. If you give the cartridge builds a try, I would love to hear feedback 
on how it goes.

 

The release notes are here:

https://github.com/rdkit/rdkit/releases/tag/Release_2018_03_2

 

Best,

-greg

 

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Programatic access to the mol sanitation process results

2018-03-16 Thread Lukas Pravda
That worked! Although I was too lazy to modify the actual class and used a Python 
package for that. If anyone is interested, the minimal code for keeping stderr 
clean while retaining the error message as a variable to work with is below. 
It uses Python streams and the wurlitzer package 
https://github.com/minrk/wurlitzer

 

 

 

from rdkit import Chem
import io
from wurlitzer import pipes

mol = Chem.MolFromSmiles('CO(C)C', sanitize=False)
out_stream = io.BytesIO()

# Capture C-level stderr while sanitizing.
with pipes(stderr=out_stream):
    sanitization_result = Chem.SanitizeMol(mol, catchErrors=True)

# Read the buffer once the context has exited and the pipe has been flushed.
error_msg = out_stream.getvalue().decode('utf-8')
print(error_msg)

 

 

Lukas 

 

From: Peter Gedeck <peter.ged...@gmail.com>
Date: Friday, 9 March 2018 at 15:02
To: Lukas Pravda <lpra...@ebi.ac.uk>
Cc: Greg Landrum <greg.land...@gmail.com>, <Rdkit-discuss@lists.sourceforge.net>
Subject: Re: [Rdkit-discuss] Programatic access to the mol sanitation process 
results

 

Hello Lukas,

 

The file rdkit/TestRunner.py contains a class/context manager called 
OutputRedirectC. If I remember correctly, this allowed capturing these 
messages. It's not used anywhere in the RDKit code base, so it may not work 
anymore. Anyway, give it a try and if it works, you can modify it to redirect 
the output into a variable or StringIO. 

 

Best,

 

Peter

 

 

On 9 Mar 2018, at 9:34 AM, Lukas Pravda <lpra...@ebi.ac.uk> wrote:

 

Hello Greg, 

 

I’m very sorry for the late reply. Thank you for the hint on disabling the log 
messages; it works on my end. However, I was more interested in catching the 
other bit, i.e. which part of the structure is wrong rather than which part of 
the sanitization process failed. That is, accessing the message ‘Explicit 
valence for atom # 1 O, 3, is greater than permitted’ in a form that lets me 
find out that it is the misbehaving oxygen which causes the sanitization to 
fail. Perhaps by piping the log information into a variable or something like that.

 

Best,

Lukas

 

 

 

From: Greg Landrum <greg.land...@gmail.com>
Date: Thursday, 22 February 2018 at 13:32
To: Lukas Pravda <lpra...@ebi.ac.uk>
Cc: RDKit Discuss <Rdkit-discuss@lists.sourceforge.net>
Subject: Re: [Rdkit-discuss] Programatic access to the mol sanitation process 
results

 

Hi Lukas,

 

On Thu, Feb 22, 2018 at 1:14 PM, Lukas Pravda <lpra...@ebi.ac.uk> wrote:

Dear rdkiters,

 

I’m constructing molecules from scratch using Python 3.5.4 and RDKit 2017.09.2, 
and for a variety of reasons some of them violate general principles 
of chemistry as implemented in rdkit, so I’m getting messages like:

Explicit valence for atom # 14 N, 4, is greater than permitted etc.

I wonder if there is a way to retrieve this piece of information 
programmatically in order to work with it. Presently, rdkit only prints it 
to the terminal and Chem.SanitizeMol() only returns the first sanitization flag 
with an issue. Ideally, I’d like nothing to be printed to the console, 
while keeping the log info ‘Explicit valence for atom # 14 N, 4, is greater 
than permitted’, preferably in a structured way (in a property/method?), in 
order to further deal with those erroneous cases.

 

At least part of this is pretty straightforward.

 

There are two parts: 

- making it so error messages don't go to the console 

- capturing the failed operation.

 

The first is a bit fragile (i.e. doesn't always work), so you will sometimes 
end up still seeing error messages (as here), but the second should be reliable:

 

In [30]: rdBase.DisableLog('rdApp.*')

 

In [31]: m = Chem.MolFromSmiles('c1cccc1',sanitize=False)

 

In [32]: Chem.SanitizeMol(m,catchErrors=True)

[14:29:37] Can't kekulize mol.  Unkekulized atoms: 0 1 2 3 4

 

Out[32]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_KEKULIZE

 

In [35]: 
Chem.SanitizeMol(Chem.MolFromSmiles('CO(C)C',sanitize=False),catchErrors=True)

[14:31:37] Explicit valence for atom # 1 O, 3, is greater than permitted

Out[35]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_PROPERTIES

 

 

You can see that the return value indicates what went wrong in the sanitization.

 

I hope this helps,

-greg

 

 

 

 

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Programatic access to the mol sanitation process results

2018-03-09 Thread Lukas Pravda
Hello Greg, 

 

I’m very sorry for the late reply. Thank you for the hint on disabling the log 
messages; it works on my end. However, I was more interested in catching the 
other bit, i.e. which part of the structure is wrong rather than which part of 
the sanitization process failed. That is, accessing the message ‘Explicit 
valence for atom # 1 O, 3, is greater than permitted’ in a form that lets me 
find out that it is the misbehaving oxygen which causes the sanitization to 
fail. Perhaps by piping the log information into a variable or something like that.
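
Once the log text is available as a string, pulling the offending atom out of it is a small regular-expression job. This is a sketch that assumes the message wording stays exactly as quoted above; the pattern and the helper name parse_valence_errors are my own:

```python
import re

# Pattern written against the message text quoted in this thread.
VALENCE_ERROR = re.compile(
    r"Explicit valence for atom # (?P<idx>\d+) (?P<symbol>[A-Za-z]+), "
    r"(?P<valence>\d+), is greater than permitted"
)


def parse_valence_errors(log_text):
    """Return (atom_index, element_symbol, valence) for each matching message."""
    return [
        (int(m["idx"]), m["symbol"], int(m["valence"]))
        for m in VALENCE_ERROR.finditer(log_text)
    ]


log_text = "[14:31:37] Explicit valence for atom # 1 O, 3, is greater than permitted"
problems = parse_valence_errors(log_text)
```

The atom index can then be used to look up the misbehaving atom in the molecule being built.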

 

Best,

Lukas

 

 

 

From: Greg Landrum <greg.land...@gmail.com>
Date: Thursday, 22 February 2018 at 13:32
To: Lukas Pravda <lpra...@ebi.ac.uk>
Cc: RDKit Discuss <Rdkit-discuss@lists.sourceforge.net>
Subject: Re: [Rdkit-discuss] Programatic access to the mol sanitation process 
results

 

Hi Lukas,

 

On Thu, Feb 22, 2018 at 1:14 PM, Lukas Pravda <lpra...@ebi.ac.uk> wrote:

Dear rdkiters,

 

I’m constructing molecules from scratch using Python 3.5.4 and RDKit 2017.09.2, 
and for a variety of reasons some of them violate general principles 
of chemistry as implemented in rdkit, so I’m getting messages like:

Explicit valence for atom # 14 N, 4, is greater than permitted etc.

I wonder if there is a way to retrieve this piece of information 
programmatically in order to work with it. Presently, rdkit only prints it 
to the terminal and Chem.SanitizeMol() only returns the first sanitization flag 
with an issue. Ideally, I’d like nothing to be printed to the console, 
while keeping the log info ‘Explicit valence for atom # 14 N, 4, is greater 
than permitted’, preferably in a structured way (in a property/method?), in 
order to further deal with those erroneous cases.

 

At least part of this is pretty straightforward.

 

There are two parts: 

- making it so error messages don't go to the console 

- capturing the failed operation.

 

The first is a bit fragile (i.e. doesn't always work), so you will sometimes 
end up still seeing error messages (as here), but the second should be reliable:

 

In [30]: rdBase.DisableLog('rdApp.*')

 

In [31]: m = Chem.MolFromSmiles('c1cccc1',sanitize=False)

 

In [32]: Chem.SanitizeMol(m,catchErrors=True)

[14:29:37] Can't kekulize mol.  Unkekulized atoms: 0 1 2 3 4

 

Out[32]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_KEKULIZE

 

In [35]: Chem.SanitizeMol(Chem.MolFromSmiles('CO(C)C',sanitize=False),catchErrors=True)

[14:31:37] Explicit valence for atom # 1 O, 3, is greater than permitted

Out[35]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_PROPERTIES

 

 

You can see that the return value indicates what went wrong in the sanitization.
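
To also find out which atom is at fault, one option is to let SanitizeMol raise instead of passing catchErrors=True, and then parse the exception text. This is a minimal sketch, assuming the exception is a ValueError (or a subclass of it) whose message contains the same ‘Explicit valence for atom # ...’ text that is printed to the console. The names ValenceProblem, parse_valence_message and find_valence_problem are hypothetical, and the regular expression only covers this one message format.

```python
import re
from typing import NamedTuple, Optional


class ValenceProblem(NamedTuple):
    atom_index: int   # index of the offending atom in the molecule
    element: str      # element symbol reported in the message
    valence: int      # explicit valence that was rejected


# Matches e.g. "Explicit valence for atom # 1 O, 3, is greater than permitted"
_VALENCE_RE = re.compile(
    r"Explicit valence for atom # (\d+) ([A-Za-z]+), (\d+), is greater than permitted"
)


def parse_valence_message(message: str) -> Optional[ValenceProblem]:
    """Turn the valence error text into structured data, or None if no match."""
    match = _VALENCE_RE.search(message)
    if match is None:
        return None
    return ValenceProblem(int(match.group(1)), match.group(2), int(match.group(3)))


def find_valence_problem(mol) -> Optional[ValenceProblem]:
    """Sanitize mol in place and report the first valence problem, if any."""
    from rdkit import Chem, rdBase     # imported lazily; requires RDKit

    rdBase.DisableLog('rdApp.error')   # keep the console quiet
    try:
        Chem.SanitizeMol(mol)          # raises on failure (no catchErrors)
    except ValueError as exc:          # assumption: the error derives from ValueError
        return parse_valence_message(str(exc))
    finally:
        rdBase.EnableLog('rdApp.error')
    return None
```

With the 'CO(C)C' example above, find_valence_problem would be expected to point at atom 1, the oxygen. Failures of another kind, such as kekulization, return None here and would still need the return flag shown above. More recent RDKit releases also provide Chem.DetectChemistryProblems(), which reports such problems as exception objects, but it is not available in 2017.09.2.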

 

I hope this helps,

-greg

 

 

 

 



[Rdkit-discuss] Programatic access to the mol sanitation process results

2018-02-22 Thread Lukas Pravda
Dear rdkiters,

 

I’m constructing molecules from scratch using Python 3.5.4 and RDKit 2017.09.2,
and for a variety of reasons some of them violate general principles of
chemistry as implemented in RDKit, so I’m getting messages like:

 

Explicit valence for atom # 14 N, 4, is greater than permitted etc.

 

I wonder whether there is a way to retrieve this piece of information
programmatically so that I can work with it. Presently, RDKit only prints it
to the terminal, and Chem.SanitizeMol() only returns the first sanitization
flag with an issue. Ideally, I’d like nothing to be printed to the console,
while keeping the log info ‘Explicit valence for atom # 14 N, 4, is greater
than permitted’, preferably in a structured way (in a property/method?), in
order to deal further with those erroneous cases.

 

Thank you for your answer,

Lukas

 

 



[Rdkit-discuss] Generate depiction matching 2D structure

2018-01-11 Thread Lukas Pravda
Dear all, 

 

I’ve just recently started using RDKit in Python. By the way, it is a very nice
piece of work. First, I’d like to generate 2D depictions of molecules I’m
constructing from 3D coordinate data. I’m aware of methods such as
GenerateDepictionMatching2DStructure(…) and
GenerateDepictionMatching3DStructure(…). In most cases they work like a
charm, but some cases still need improving. That is why I wonder whether
there is a way either a) to provide multiple templates or b) to provide a
mapping between the model and the template at the atomic level. For
instance, I have a molecule containing two copies of the template; just one of
them gets rendered correctly, while the other is still a mess. The
documentation says something about AtomPairsParameters
(http://www.rdkit.org/Python_Docs/rdkit.Chem.rdMolDescriptors.AtomPairsParameters-class.html),
but given the description I have no clue whether or how to use it.

 

The second point is that I’m facing the same issues on macOS with Python 3.6.x
as described here:
https://sourceforge.net/p/rdkit/mailman/message/36093960/ and here
https://github.com/rdkit/rdkit/issues/1617 so I hope there will be a solution
in the future, as I’d like to use Python 3.6.

 

My current setup is:

macOS High Sierra

Conda: 4.3.25

Rdkit 2017.9.2

Python: 3.5.3

 

I look forward to an answer.

All the best,
lukas
