Hi Lewis,

You can try PreparePDBMol in ODDT:
https://github.com/oddt/oddt/blob/master/oddt/toolkits/extras/rdkit/fixer.py#L623-L669
We used it for PLEC model training; PDBFixer didn't work for us
either. Note that once you have correct bonding, you can disable
automatic bonding in RDKit with proximityBonding=False.
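
A minimal sketch of that workflow, assuming RDKit and ODDT are installed.
The function names prepare_protein, has_conect_records and
read_with_explicit_bonds are hypothetical helpers, not part of either
library, and PreparePDBMol is called with its default arguments only. The
CONECT check is a necessary condition, not a sufficient one: it does not
prove that every bond is encoded.

```python
def has_conect_records(pdb_block: str) -> bool:
    """Return True if the PDB text has at least one explicit CONECT record."""
    return any(line.startswith("CONECT") for line in pdb_block.splitlines())


def prepare_protein(pdb_block: str):
    """Parse a PDB block unsanitized, then let ODDT repair the bonding."""
    # Local imports so the pure-Python helper above works without RDKit/ODDT.
    from rdkit import Chem
    from oddt.toolkits.extras.rdkit.fixer import PreparePDBMol

    # sanitize=False keeps a bad valence from aborting the parse.
    raw = Chem.MolFromPDBBlock(pdb_block, sanitize=False, removeHs=False)
    # Default arguments; see the linked source for the available options.
    return PreparePDBMol(raw)


def read_with_explicit_bonds(pdb_block: str):
    """Read a PDB block using only the bonds it encodes explicitly."""
    from rdkit import Chem

    if not has_conect_records(pdb_block):
        # Assumption: without CONECT records there is nothing to rely on.
        raise ValueError("no CONECT records; proximity bonding is still needed")
    return Chem.MolFromPDBBlock(pdb_block, proximityBonding=False)
```

If the PDB arrives as bytes (for example from requests), decode it to str
before passing it to these helpers.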
----
Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl


On Mon, 27 Sep 2021 at 12:25, Lewis Martin <lewis.marti...@gmail.com>
wrote:

> Very interesting - thank you Francois! PDB re-do does the trick:
>
> import requests
> from rdkit import Chem
>
> def getPDB(code):
>     out = requests.get(f'https://pdb-redo.eu/db/{code}/{code}_final.pdb')
>     return out.content
>
> pdb_string = getPDB('3udn')
> Chem.MolFromPDBBlock(pdb_string)
>
> I think this solves it for me, but if anyone knows how to infer correct
> bonding information without relying on distances, I'd love to hear it too!
> So far I've noticed that Parmed and PDBFixer infer correct bonds, but they
> don't determine bond orders, so it's difficult to port the molecule into
> RDKit.
>
> Cheers
> Lewis
>
>
>
> On Mon, Sep 27, 2021 at 5:55 PM Francois Berenger <mli...@ligand.eu>
> wrote:
>
>> Hi Lewis,
>>
>> Just an idea: you might try to load your PDB in UCSF Chimera, then
>> save it as a mol2 or sdf file.
>> Then, try to read this sdf file from rdkit.
>>
>> Another idea: try to get your pdb file through the pdbredo service.
>> https://pdb-redo.eu/
>> They might have fixed a few things; maybe this PDB will read better in
>> rdkit.
>>
>> Regards,
>> F.
>>
>> On 26/09/2021 17:02, Lewis Martin wrote:
>> > Hi RDKit,
>> > While parsing proteins from the PDB with RDKit, I've come across
>> > situations where the distance-based bond determination leads to
>> > 'incorrect' bonds between atoms that are erroneously too close
>> > together. PDB files have no bond information, so it's not really
>> > 'incorrect' (rather the model coordinates are off), but the bonds are
>> > nonphysical - and it means the Mol objects won't sanitize.
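
To illustrate the failure mode described above, here is a simplified,
pure-Python sketch of distance-based bonding. It is not RDKit's actual
algorithm: the covalent radii are approximate published values, the 0.45
angstrom tolerance is an assumed cutoff, and proximity_bonds is a
hypothetical helper. It shows how a neighbouring atom modelled too close to
a carbonyl oxygen picks up a spurious bond.

```python
import math

# Approximate single-bond covalent radii in angstroms (illustrative values).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "S": 1.05}


def proximity_bonds(atoms, tolerance=0.45):
    """Bond each pair closer than the sum of covalent radii plus a tolerance."""
    bonds = []
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            (el_i, pos_i), (el_j, pos_j) = atoms[i], atoms[j]
            cutoff = COVALENT_RADII[el_i] + COVALENT_RADII[el_j] + tolerance
            if math.dist(pos_i, pos_j) <= cutoff:
                bonds.append((i, j))
    return bonds


atoms = [
    ("C", (0.00, 0.0, 0.0)),  # backbone carbonyl carbon
    ("O", (1.23, 0.0, 0.0)),  # carbonyl oxygen, a normal C=O distance away
    ("C", (2.83, 0.0, 0.0)),  # neighbouring carbon modelled 1.6 A from the O
]
bonds = proximity_bonds(atoms)
print(bonds)
```

The oxygen ends up with two bonds. Once the C=O bond is assigned as a double
bond, its explicit valence is 3, which is the kind of error sanitization
reports.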
>> >
>> > Here's an example:
>> >
>> > import requests
>> > from io import BytesIO
>> > import gzip
>> > from rdkit import Chem
>> >
>> > def getPDB(code):
>> >     out = requests.get(f'https://files.rcsb.org/download/{code}.pdb1.gz')
>> >     binary_stream =  BytesIO(out.content)
>> >     return gzip.open(binary_stream).read()
>> >
>> > pdb_string = getPDB('3udn')
>> > Chem.MolFromPDBBlock(pdb_string)
>> >
>> > Error is:
>> >
>> > RDKit ERROR: [22:38:21] Explicit valence for atom # 573 O, 3, is
>> > greater than permitted
>> >
>> > This is caused by the threonine 72 sidechain being too close to the
>> > TYR71 backbone carbonyl oxygen (this can be visualized at
>> > https://www.rcsb.org/3d-view/3UDN?preset=ligandInteraction&sele=09B ,
>> > TYR71 is near the ligand).
>> >
>> > Does anyone know how to avoid this to create a Chem.Mol? I've tried
>> > using Parmed and PDBFixer, since they use residue templates to
>> > generate the correct bonding topology, but they don't write CONECT
>> > records or SDFs, so the bonds are still lost to RDKit.
>> >
>> > Thanks for your time!
>> > Lewis
>> > PS - why not just use PDBFixer? I'm trying to calculate atom
>> > invariants using RDKit's morgan fingerprinter implementation, so
>> > ultimately I want a sanitized Mol object
>> >
>> > _______________________________________________
>> > Rdkit-discuss mailing list
>> > Rdkit-discuss@lists.sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>