Thanks Paolo, this works brilliantly. Let's hope astatine inhibitors won't gain 
in popularity 😉


Best,

Jenke


________________________________
From: Paolo Tosco <paolo.tosco.m...@gmail.com>
Sent: 30 October 2019 13:25
To: SCHEEN Jenke <j.sch...@sms.ed.ac.uk>; RDKit Discuss 
<rdkit-discuss@lists.sourceforge.net>
Subject: Re: [Rdkit-discuss] fingerprint a molecule with pseudoatoms denoted by 
'Du'


Hi Jenke,


I have put together a small gist showing a slightly hacky way to round-trip a 
molecule containing dummy atoms through a PDB block (assuming that your 
molecules do not contain astatine). If your dummy atoms are called "DU" rather 
than " *", you can simply change the replace() expression to something that 
fits your needs.
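
The gist itself is not reproduced in this message. As a rough sketch of the idea only (not the gist's actual code), the trick could look like the following: temporarily rename the dummy element to astatine so the PDB parser accepts it, then turn every astatine atom back into a dummy (atomic number 0). The function names are hypothetical, and the sketch assumes the dummy symbol sits in the element field (columns 77-78) of the ATOM/HETATM records; if your file only carries it in the atom name, the replacement would need adapting.

```python
def swap_dummy_element(pdb_block, dummy="DU", placeholder="At"):
    """Rewrite the element field (PDB columns 77-78) of dummy atoms.

    Hypothetical helper: only ATOM/HETATM records whose element field
    matches `dummy` (case-insensitive) are changed.
    """
    out = []
    for line in pdb_block.splitlines():
        is_atom_record = line.startswith(("ATOM", "HETATM"))
        if is_atom_record and line[76:78].strip().upper() == dummy.upper():
            # Keep everything else on the line, swap only the element field.
            line = line[:76] + placeholder.rjust(2) + line[78:]
        out.append(line)
    return "\n".join(out)


def mol_from_pdb_with_dummies(pdb_block, dummy="DU"):
    """Parse a PDB block with RDKit, mapping the dummy element to atomic number 0.

    Assumes the molecule contains no genuine astatine atoms.
    """
    from rdkit import Chem  # imported here so the helper above works without RDKit

    mol = Chem.MolFromPDBBlock(
        swap_dummy_element(pdb_block, dummy), sanitize=False, removeHs=False
    )
    if mol is None:
        raise ValueError("RDKit could not parse the PDB block")
    for atom in mol.GetAtoms():
        if atom.GetAtomicNum() == 85:  # astatine placeholder
            atom.SetAtomicNum(0)       # 0 is RDKit's dummy atom
    return mol
```

To use it, read the file into a string (for example `open("test.pdb").read()`) and pass it to `mol_from_pdb_with_dummies`. Because sanitization is skipped, the resulting molecule may need further preparation before a fingerprint function will accept it.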


HTH, cheers

p.

On 10/30/19 12:06, SCHEEN Jenke wrote:
Hi RDKitters,

I'm trying to use rdkit to generate molecular fingerprints (such as AP or ECFP) 
on molecules that have non-interactive pseudoatoms ('dummy atoms', denoted by 
Du). I attached a sample PDB file containing the dummy atoms on positions 
21-24. Reading this file (Chem.rdmolfiles.MolFromPDBFile("test.pdb", 
sanitize=False)) throws a post-condition violation because the element 'Du' 
isn't recognised, which makes sense. I've been searching online and haven't 
been able to find any workarounds. Do you have any suggestions?

Some notes:

  *   I'm hoping that once rdkit is able to read in the pdb file the mol object 
can be parsed without the FP constructor (e.g. AllChem.GetMorganFingerprint) 
complaining.
  *   The term 'dummy atoms' here should not be confused with the dummy atoms 
rdkit uses when fragmenting molecules (where * is the SMILES notation).
  *   For this project all I aim to do is generate structural fingerprints for 
these types of ligands, so I won't have to worry about defining chemical 
properties for Du.
  *   The context for this issue is that we're aiming to featurise the ligands 
for an ML protocol where the dummy atoms are one of the major descriptors of 
the problem.

  *   I thought manually inserting a 119th element in atomic_data.cpp might 
resolve the issue but I've been unable to locate the file in my conda 
installation.
  *   The ODDT python API seems to parse the Du element without any issues but 
is limited in its FP generator diversity.


Best,

Jenke


The University of Edinburgh is a charitable body, registered in Scotland, with 
registration number SC005336.




_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

