Re: [Rdkit-discuss] another request for feedback on a new python API documentation format

2018-05-10 Thread Greg Landrum
Hi Maciek,

The old docs are all still online. Since there aren't links, you just need
to know the URL scheme:

http://rdkit.org/RDKit_Docs.2017_09_1.tgz
http://rdkit.org/RDKit_Docs.2016_03_1.tgz

Note that usually only the "_1.tgz" version is available; patch releases
don't normally affect the documentation.

-greg

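The URL scheme above can be written down as a small helper. This is only a
sketch inferred from the two example URLs; the function name
`docs_archive_url` and its parameters are made up for illustration, and the
suffix defaults to 1 because the "_1.tgz" archive is the one that normally
exists.

```python
def docs_archive_url(year, month, patch=1):
    """Build the download URL for an archived RDKit documentation tarball.

    The pattern is inferred from the example URLs in this thread; it is an
    assumption that every other release follows it as well.
    """
    return "http://rdkit.org/RDKit_Docs.{:04d}_{:02d}_{}.tgz".format(
        year, month, patch)


print(docs_archive_url(2017, 9))  # -> http://rdkit.org/RDKit_Docs.2017_09_1.tgz
print(docs_archive_url(2016, 3))  # -> http://rdkit.org/RDKit_Docs.2016_03_1.tgz
```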


On Tue, May 8, 2018 at 3:00 PM Maciek Wójcikowski 
wrote:

> Hi Greg,
>
> Speaking about the new docs - would it be possible to have documentation
> for a few stable releases back, like 2017.09, 2017.03, etc.? Recently I was
> trying to track down changes in RDKit's API and ended up using git
> blame, whereas I could have got that information by switching the release
> in the docs.
>
> 
> Pozdrawiam,  |  Best regards,
> Maciek Wójcikowski
> mac...@wojcikowski.pl
>
> 2018-05-02 11:17 GMT+02:00 David Cosgrove :
>
>> Hi Greg,
>> After a quick poke about, I think the new documentation looks great in
>> general.  If a change is forced on you, then I suggest you just do it in a
>> way that makes your life as easy as possible.  If people don't like it,
>> they can always put the effort in to do something different and then I
>> expect they'll quickly come round to realising that your way is perfectly
>> fine.  One way of fixing the docstring formatting would be to put
>> instructions and a couple of examples somewhere handy and ask people to fix
>> problems when they encounter them as they read the docs.  That should be a
>> small effort from each person that would hopefully fix the important ones
>> quickly in a self-prioritising manner.
>> Thanks for putting the time into this,
>> Dave
>>
>>
>> On Wed, May 2, 2018 at 8:40 AM, Greg Landrum 
>> wrote:
>>
>>> Dear all,
>>>
>>> Just over a year ago I asked for feedback on a new documentation format
>>> for the RDKit python API:
>>> https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg06688.html
>>> Some useful feedback came in on that thread (thanks to those who replied
>>> there and in private email), but I ran out of time/motivation to work on
>>> this.
>>>
>>> With my motivation recharged thanks to the "fun" of using epydoc to
>>> document the last release, I revisited the topic this weekend and actually
>>> made some progress.[1] I'd like to gather a second round of feedback on
>>> that.
>>>
>>> The documentation is here:
>>> http://rdkit.org/docs_temp/index.html
>>> The API docs (which are where the biggest changes are) are here:
>>> http://rdkit.org/docs_temp/api-docs.html
>>>
>>> To address some of the things raised last time:
>>> - This really isn't optional. It's been more than a decade since epydoc
>>> was updated and it requires python 2.7.
>>> - My previous attempt to auto-generate docs used pdoc (
>>> https://github.com/BurntSushi/pdoc). That project also seems to have
>>> died, so it's not really an option.
>>> - Based upon the two factors above I decided to use the autodoc
>>> functionality that's part of Sphinx. It's not perfect, but it's supported
>>> (and seems likely to continue to be so since it's part of Sphinx)
>>>
>>> - The docs now have a search box
>>>
>>> - We've lost the overview (list of classes/functions/etc) that epydoc
>>> provides. There is likely a way to do this with Sphinx, but I haven't
>>> managed to get it to work yet.
>>>
>>> - Formatting: Some of the docstrings end up looking pretty good, others
>>> are awful. Here's a module that demonstrates both sides of the coin:
>>> http://rdkit.org/docs_temp/source/rdkit.Chem.AtomPairs.Pairs.html#module-rdkit.Chem.AtomPairs.Pairs
>>> Fixing this is "just" a matter of editing the doc strings. This is
>>> reasonably mechanical, but unfortunately not automatable, work. It should
>>> be done, but in the meantime the broken docstrings aren't completely
>>> useless.
>>>
>>> There's also a github issue for this:
>>> https://github.com/rdkit/rdkit/issues/1656
>>> I'm doing the work on this branch:
>>> https://github.com/greglandrum/rdkit/tree/dev/usinx_sphinx_autodoc
>>>
>>> -greg
>>> [1] Remember how I said I was going to take a short break and do
>>> something fun? This isn't that.
>>>
>>>
>>>
>>> --
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> ___
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>
>>>
>>
>>
>> --
>> David Cosgrove
>> Freelance computational chemistry and chemoinformatics developer
>> http://cozchemix.co.uk
>>
>>
>>

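For the docstring clean-up discussed in this thread, a "fixed" docstring
might look like the sketch below. Sphinx autodoc treats docstrings as
reStructuredText, so field lists (`:param:`, `:returns:`) and a literal block
introduced by `::` render cleanly. The function `count_atom_pairs` is
hypothetical and not part of the RDKit API; it only gives the docstring
something to describe.

```python
def count_atom_pairs(num_atoms):
    """Return the number of distinct atom pairs in a molecule.

    :param num_atoms: number of atoms in the molecule
    :returns: the number of unordered pairs, ``n * (n - 1) / 2``

    Example usage::

        >>> count_atom_pairs(4)
        6
    """
    # Integer division keeps the result an int for any non-negative input.
    return num_atoms * (num_atoms - 1) // 2
```

The blank line after the summary and the indented block after `::` are what
keep the example from being reflowed into running text.
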
Re: [Rdkit-discuss] Atom mapping

2018-05-10 Thread Paul Emsley

On 10/05/2018 10:39, carlo del moro wrote:


> Here is an example to better explain my problem.
> Starting from a PDB file representing HPE, I use RDKit/obabel to calculate
> the corresponding SMILES.


The three-letter-code (chemical component id) in a PDB file has meaning - it is a pointer to chemistry. The 
chemistry description can be retrieved from the RCSB. Unless you know that the three-letter code doesn't 
refer to a standard chemical (as might be the case, for example, in internal use of 'LIG') you'd be well 
advised to get the chemistry from the canonical source. Here's a script that displays the SMILES strings. 
It seems to me that it would be better to go from PDB file -> RDKit molecule without the straitjacket of
SMILES.


Paul.
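import sys
import urllib.request

# Pass the three-letter code (chemical component id) as the first argument.
if len(sys.argv) > 1:
    tlc = sys.argv[1]

    url = 'http://files.rcsb.org/ligands/view/' + tlc + '.cif'

    with urllib.request.urlopen(url) as response:
        cif = response.read()

    print_it = False
    for line in cif.split(b'\n'):
        # A line starting with '#' closes the current CIF category. Testing
        # for '#' anywhere in the line would also stop at a SMILES triple bond.
        if line.startswith(b'#'):
            print_it = False
        if print_it:
            print(line.decode('ascii'))
        # Start printing after the descriptor header; the rows that follow
        # carry the descriptor strings (SMILES among them).
        if b'_pdbx_chem_comp_descriptor.descriptor' in line:
            print_it = True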

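If the SMILES strings are wanted as Python values rather than printed lines,
the same scan can be wrapped in a function that works on text already in
memory. This is a rough sketch, not a CIF parser: the name
`smiles_descriptors` is made up, the five-column order (comp_id, type,
program, program_version, descriptor) is an assumption about the component
files, and `shlex` only approximates CIF quoting (an unquoted value containing
a backslash would lose it, and multi-line values are not handled).

```python
import shlex


def smiles_descriptors(cif_text):
    """Return (program, smiles) pairs from the descriptor loop of a CIF text."""
    pairs = []
    in_rows = False
    for line in cif_text.splitlines():
        if line.startswith('#'):
            # '#' at the start of a line closes the category.
            in_rows = False
        elif in_rows:
            fields = shlex.split(line)
            # Assumed columns: comp_id, type, program, version, descriptor.
            if len(fields) == 5 and fields[1].startswith('SMILES'):
                pairs.append((fields[2], fields[4]))
        elif line.startswith('_pdbx_chem_comp_descriptor.descriptor'):
            # Last header of the loop; data rows start on the next line.
            in_rows = True
    return pairs
```

Decoding the downloaded bytes with `.decode('ascii')` before calling the
function keeps the network access and the parsing separate.
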


Re: [Rdkit-discuss] Atom mapping

2018-05-10 Thread Maciek Wójcikowski
Hi,

The SMILES atom order is saved in a private property,
'_smilesAtomOutputOrder'; see the discussion on GitHub:
https://github.com/rdkit/rdkit/issues/794

The order of atoms in the PDB file is the same as in RDKit's Mol object, so
it's fairly easy to find such a mapping.

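Once the SMILES output order is known, the lookup itself needs nothing beyond
the PDB records. The sketch below assumes standard fixed-column HETATM lines
and takes the order as a plain list in which entry k is the index (in PDB
order) of the k-th atom written in the SMILES. In RDKit that list would come
from '_smilesAtomOutputOrder' after generating the SMILES; how the property is
read back depends on the RDKit version, so that step is left out. The helper
names and the order `[3, 2, 1, 0]` are made up for illustration, and the
identity between PDB order and Mol order only holds if the molecule was read
from that PDB without adding or reordering atoms.

```python
def parse_hetatm(line):
    """Return (atom_name, (x, y, z)) from a fixed-column ATOM/HETATM record."""
    name = line[12:16].strip()          # columns 13-16: atom name
    xyz = (float(line[30:38]),          # columns 31-38: x
           float(line[38:46]),          # columns 39-46: y
           float(line[46:54]))          # columns 47-54: z
    return name, xyz


def smiles_order_to_pdb(pdb_lines, smiles_order):
    """List the PDB atoms in the order in which the SMILES writes them."""
    atoms = [parse_hetatm(line) for line in pdb_lines
             if line.startswith(("ATOM", "HETATM"))]
    return [atoms[index] for index in smiles_order]


# First four records of the HPE example from this thread.
PDB_LINES = [
    "HETATM 4176  N   HPE B   2       5.227  20.107  15.512  1.00 17.92           N",
    "HETATM 4177  CA  HPE B   2       4.065  20.646  16.205  1.00 16.87           C",
    "HETATM 4178  C   HPE B   2       2.784  20.702  15.373  1.00 18.59           C",
    "HETATM 4179  O   HPE B   2       2.806  21.092  14.215  1.00 17.45           O",
]

# Hypothetical output order, not the one RDKit would produce for HPE.
for name, xyz in smiles_order_to_pdb(PDB_LINES, [3, 2, 1, 0]):
    print(name, xyz)
```
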

Pozdrawiam,  |  Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl

2018-05-10 11:39 GMT+02:00 carlo del moro :

> Thanks to all for the replies,
>
> Here is an example to better explain my problem.
> Starting from a PDB file representing HPE, I use RDKit/obabel to calculate
> the corresponding SMILES. Next, using an RDKit function, I fragment the
> SMILES into substructures like "CC(=O)O"; now I need to map this
> substructure back onto the starting three-dimensional structure in order to
> get the atom coordinates. The task would be pretty easy if the atom
> numbering of the SMILES were the same as that of the starting PDB file. Do
> you know of any method to unify these two numberings, or to map the SMILES
> atom sequence onto the PDB one?
> This is the PDB for HPE.
>
>
> HETATM 4176  N   HPE B   2       5.227  20.107  15.512  1.00 17.92           N
> HETATM 4177  CA  HPE B   2       4.065  20.646  16.205  1.00 16.87           C
> HETATM 4178  C   HPE B   2       2.784  20.702  15.373  1.00 18.59           C
> HETATM 4179  O   HPE B   2       2.806  21.092  14.215  1.00 17.45           O
> HETATM 4180  CB  HPE B   2       4.377  22.085  16.699  1.00 17.52           C
> HETATM 4181  CG  HPE B   2       5.532  22.067  17.720  1.00 14.97           C
> HETATM 4182  CD  HPE B   2       5.886  23.416  18.279  1.00 17.87           C
> HETATM 4183  CE1 HPE B   2       6.717  24.309  17.627  1.00 17.26           C
> HETATM 4184  CE2 HPE B   2       5.385  23.752  19.520  1.00 19.20           C
> HETATM 4185  CZ1 HPE B   2       7.025  25.546  18.162  1.00 17.16           C
> HETATM 4186  CZ2 HPE B   2       5.698  24.993  20.061  1.00 22.45           C
> HETATM 4187  CH  HPE B   2       6.517  25.906  19.409  1.00 19.18           C
>
> Thanks to all.
>
> Carlo
>
> On Wed, May 9, 2018 at 8:37 PM, Dimitri Maziuk 
> wrote:
>
>> On 05/09/2018 10:27 AM, carlo del moro wrote:
>> > Dear All,
>> >
>> > we would like to know if it is possible to map the atom IDs of a
>> > substructure given as SMILES to the atom sequence of a ligand contained
>> > in a PDB file, in order to get the spatial coordinates of that
>> > substructure.
>>
>> http://alatis.nmrfam.wisc.edu/ will generate unique stable IDs from a 3D
>> structure, and output the old->new ID map. It'll take a PDB; you'll
>> have to convert your SMILES into a 3D .mol. ALATIS atom IDs should be
>> the same in the two maps, *provided both inputs describe the exact same
>> ligand*.
>>
>> (It's the *substructure* bit that I'm not entirely sure about.)
>> --
>> Dimitri Maziuk
>> Programmer/sysadmin
>> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>>
>>


Re: [Rdkit-discuss] Atom mapping

2018-05-10 Thread carlo del moro
Thanks to all for the replies,

Here is an example to better explain my problem.
Starting from a PDB file representing HPE, I use RDKit/obabel to calculate
the corresponding SMILES. Next, using an RDKit function, I fragment the
SMILES into substructures like "CC(=O)O"; now I need to map this
substructure back onto the starting three-dimensional structure in order to
get the atom coordinates. The task would be pretty easy if the atom
numbering of the SMILES were the same as that of the starting PDB file. Do
you know of any method to unify these two numberings, or to map the SMILES
atom sequence onto the PDB one?
This is the PDB for HPE.


HETATM 4176  N   HPE B   2       5.227  20.107  15.512  1.00 17.92           N
HETATM 4177  CA  HPE B   2       4.065  20.646  16.205  1.00 16.87           C
HETATM 4178  C   HPE B   2       2.784  20.702  15.373  1.00 18.59           C
HETATM 4179  O   HPE B   2       2.806  21.092  14.215  1.00 17.45           O
HETATM 4180  CB  HPE B   2       4.377  22.085  16.699  1.00 17.52           C
HETATM 4181  CG  HPE B   2       5.532  22.067  17.720  1.00 14.97           C
HETATM 4182  CD  HPE B   2       5.886  23.416  18.279  1.00 17.87           C
HETATM 4183  CE1 HPE B   2       6.717  24.309  17.627  1.00 17.26           C
HETATM 4184  CE2 HPE B   2       5.385  23.752  19.520  1.00 19.20           C
HETATM 4185  CZ1 HPE B   2       7.025  25.546  18.162  1.00 17.16           C
HETATM 4186  CZ2 HPE B   2       5.698  24.993  20.061  1.00 22.45           C
HETATM 4187  CH  HPE B   2       6.517  25.906  19.409  1.00 19.18           C

Thanks to all.

Carlo

On Wed, May 9, 2018 at 8:37 PM, Dimitri Maziuk 
wrote:

> On 05/09/2018 10:27 AM, carlo del moro wrote:
> > Dear All,
> >
> > we would like to know if it is possible to map the atom IDs of a
> > substructure given as SMILES to the atom sequence of a ligand contained
> > in a PDB file, in order to get the spatial coordinates of that
> > substructure.
>
> http://alatis.nmrfam.wisc.edu/ will generate unique stable IDs from a 3D
> structure, and output the old->new ID map. It'll take a PDB; you'll
> have to convert your SMILES into a 3D .mol. ALATIS atom IDs should be
> the same in the two maps, *provided both inputs describe the exact same
> ligand*.
>
> (It's the *substructure* bit that I'm not entirely sure about.)
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>
>