On 10/05/2018 10:39, carlo del moro wrote:

Here is an example to better explain my problem.
Starting from a PDB file representing HPE, I use RDKit/obabel to calculate the 
corresponding SMILES.

The three-letter code (chemical component ID) in a PDB file is meaningful: it is a pointer to the chemistry. The chemical description can be retrieved from the RCSB. Unless you know that the three-letter code does not refer to a standard chemical component (as might be the case with, for example, an internal use of 'LIG'), you would be well advised to get the chemistry from the canonical source. Here's a script that displays the descriptor strings, SMILES among them. It seems to me that it would be better to go from PDB file -> RDKit molecule without the straitjacket of SMILES.

Paul.
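import sys
import urllib.request


if len(sys.argv) > 1:
    # Three-letter code (chemical component id), e.g. HPE
    tlc = sys.argv[1]

    url = 'http://files.rcsb.org/ligands/view/' + tlc + '.cif'

    with urllib.request.urlopen(url) as response:
        cif = response.read()

    print_it = False
    for line in cif.split(b'\n'):
        # A line starting with '#' closes the current mmCIF category.
        # Test only the start of the line: SMILES can contain '#'
        # (triple bond), and such rows must still be printed.
        if line.startswith(b'#'):
            print_it = False
        if print_it:
            print(line.decode('ascii'))
        # Data rows begin after the '.descriptor' column name, which is
        # assumed to be the last column listed in the loop header.
        if b'_pdbx_chem_comp_descriptor.descriptor' in line:
            print_it = True
else:
    print('Usage: ' + sys.argv[0] + ' <three-letter-code>', file=sys.stderr)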
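
If you want the descriptors as structured records rather than printed lines, something like the following might do. It is only a sketch: the function name parse_descriptors and the SAMPLE_CIF text are made up for illustration (the sample is hand-written to imitate the layout of a component file for ethanol, with placeholder program versions, and is not a download from the RCSB). It assumes each descriptor row fits on one line; values that mmCIF wraps into semicolon-delimited text blocks are not handled.

```python
import re

# Hand-written sample imitating the descriptor loop of a component file.
# Program names and version numbers here are placeholders for illustration.
SAMPLE_CIF = """\
data_EOH
#
loop_
_pdbx_chem_comp_descriptor.comp_id
_pdbx_chem_comp_descriptor.type
_pdbx_chem_comp_descriptor.program
_pdbx_chem_comp_descriptor.program_version
_pdbx_chem_comp_descriptor.descriptor
EOH SMILES           ACDLabs              1.0 OCC
EOH SMILES_CANONICAL "OpenEye OEToolkits" 1.0 CCO
EOH InChIKey         InChI                1.0 LFQSCWFLJHTTHZ-UHFFFAOYSA-N
#
"""

PREFIX = "_pdbx_chem_comp_descriptor."

# A token is a double-quoted string, a single-quoted string,
# or a run of non-whitespace characters.
TOKEN = re.compile(r'"[^"]*"|\'[^\']*\'|\S+')


def unquote(token):
    """Remove one pair of matching surrounding quotes, if present."""
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "\"'":
        return token[1:-1]
    return token


def parse_descriptors(cif_text):
    """Return the descriptor rows as a list of dicts keyed by column name."""
    columns = []
    rows = []
    in_loop = False
    for line in cif_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(PREFIX):
            # Header line: remember the column name without the prefix.
            columns.append(stripped[len(PREFIX):])
            in_loop = True
        elif in_loop:
            # A blank line, a '#' line or a new category ends the loop.
            # Only the start of the line is tested, so a '#' inside a
            # SMILES string does not end it.
            if not stripped or stripped.startswith(("#", "_", "loop_", "data_")):
                break
            values = [unquote(t) for t in TOKEN.findall(stripped)]
            # Skip rows that do not match the header (e.g. wrapped values).
            if len(values) == len(columns):
                rows.append(dict(zip(columns, values)))
    return rows
```

With the text fetched by the script above, decoded to str, you could then pick out just the SMILES with something like [r["descriptor"] for r in parse_descriptors(text) if r["type"].startswith("SMILES")].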
