On 10/05/2018 10:39, carlo del moro wrote:
> I put an example for better explain my problem.
> starting from a PDB representing HPE, I use RDKIT/obabel for calculate the
> relative SMILES.

The three-letter code (chemical component ID) in a PDB file has meaning: it is a pointer to chemistry. The
chemistry description can be retrieved from the RCSB. Unless you know that the three-letter code doesn't
refer to a standard chemical (as might be the case, for example, with internal use of 'LIG'), you'd be well
advised to get the chemistry from the canonical source. The script below takes a three-letter code as its
command-line argument and displays the descriptor strings, SMILES among them, from the RCSB ligand file.

It seems to me that it would be better to go from PDB file -> RDKit molecule without the straitjacket of
SMILES.

Paul.
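
import urllib.request
import sys

# Usage: python <this script> HPE
if len(sys.argv) > 1:
    tlc = sys.argv[1]  # three-letter code (chemical component ID)
    url = 'http://files.rcsb.org/ligands/view/' + tlc + '.cif'
    with urllib.request.urlopen(url) as response:
        html = response.read()
        lines = html.split(b'\n')
        print_it = False
        for line in lines:
            # A '#' line closes the current CIF category: stop printing.
            if b'#' in line:
                print_it = False
            if print_it:
                print(line.decode('ascii'))
            # The last column header of the descriptor loop: the rows after it
            # hold the SMILES, InChI and InChIKey strings.
            if b'_pdbx_chem_comp_descriptor.descriptor' in line:
                print_it = True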
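
If you want only the SMILES rows rather than every descriptor line, the parsing step can be separated from
the download and made aware of the column layout. What follows is only a rough sketch, not a proper CIF
parser: it assumes the descriptor category is written as a loop_ with one row per line, as in the files
served by the URL above, and it ignores multi-line text fields. The function names and the SAMPLE text
(a made-up component 'XXX' with the chemistry of ethanol) are my own illustration, not anything from the
RCSB or RDKit.

```python
import re

PREFIX = "_pdbx_chem_comp_descriptor."
# A CIF value is either a quoted string or a run of non-blank characters.
_TOKEN = re.compile(r'"([^"]*)"|\'([^\']*)\'|(\S+)')


def split_cif_row(line):
    """Split one loop row into values, honouring simple CIF quoting."""
    return [a or b or c for a, b, c in _TOKEN.findall(line)]


def extract_descriptors(cif_text):
    """Return the descriptor loop as a list of {column name: value} dicts."""
    columns = []
    rows = []
    for line in cif_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(PREFIX):
            # Column header, e.g. '_pdbx_chem_comp_descriptor.type' -> 'type'
            columns.append(stripped[len(PREFIX):].split()[0])
        elif columns:
            # The loop ends at a blank line, a '#', or the next category.
            if not stripped or stripped.startswith(("#", "_", "loop_")):
                break
            values = split_cif_row(stripped)
            if len(values) == len(columns):
                rows.append(dict(zip(columns, values)))
    return rows


def smiles_only(rows):
    """Keep the descriptor string of rows whose type starts with 'SMILES'."""
    return [r["descriptor"] for r in rows if r["type"].startswith("SMILES")]


# Hypothetical sample text, for illustration only (not a real RCSB entry).
SAMPLE = """\
#
loop_
_pdbx_chem_comp_descriptor.comp_id
_pdbx_chem_comp_descriptor.type
_pdbx_chem_comp_descriptor.program
_pdbx_chem_comp_descriptor.program_version
_pdbx_chem_comp_descriptor.descriptor
XXX SMILES           "Example Toolkit" 1.0 CCO
XXX SMILES_CANONICAL "Example Toolkit" 1.0 "CCO"
XXX InChI            "Example Toolkit" 1.0 "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
#
"""

for smiles in smiles_only(extract_descriptors(SAMPLE)):
    print(smiles)
```

To use it on a real component, pass response.read().decode('ascii') from the script above to
extract_descriptors() in place of SAMPLE.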