Re: [Rdkit-discuss] another request for feedback on a new python API documentation format
Hi Maciek,

The old docs are all still online. Since there aren't links, you just need to know the URL scheme:
http://rdkit.org/RDKit_Docs.2017_09_1.tgz
http://rdkit.org/RDKit_Docs.2016_03_1.tgz

Note that almost always only the "_1.tgz" version is available; the patch releases don't normally affect the documentation.

-greg

On Tue, May 8, 2018 at 3:00 PM Maciek Wójcikowski wrote:
> Hi Greg,
>
> Speaking about the new docs - would it be possible to have documentation
> for few stable releases back, like 2017.09, 2017.03, etc. Recently I was
> trying to establish the changes in RDKit's API and ended up using git
> blame, whereas I could be able to get that info from changing the release
> on the docs.
>
> Pozdrawiam, | Best regards,
> Maciek Wójcikowski
> mac...@wojcikowski.pl
>
> 2018-05-02 11:17 GMT+02:00 David Cosgrove:
>
>> Hi Greg,
>> After a quick poke about, I think the new documentation looks great in
>> general. If a change is forced on you, then I suggest you just do it in a
>> way that makes your life as easy as possible. If people don't like it,
>> they can always put the effort in to do something different and then I
>> expect they'll quickly come round to realising that your way is perfectly
>> fine. One way of fixing the docstring formatting would be to put
>> instructions and a couple of examples somewhere handy and ask people to fix
>> problems when they encounter them as they read the docs. That should be a
>> small effort from each person that would hopefully fix the important ones
>> quickly in a self-prioritising manner.
>> Thanks for putting the time into this,
>> Dave
>>
>> On Wed, May 2, 2018 at 8:40 AM, Greg Landrum wrote:
>>
>>> Dear all,
>>>
>>> Just over a year ago I asked for feedback on a new documentation format
>>> for the RDKit python API:
>>> https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg06688.html
>>> Some useful feedback came in on that thread (thanks to those who replied
>>> there and in private email), but I ran out of time/motivation to spend time
>>> on this.
>>>
>>> With my motivation recharged thanks to the "fun" of using epydoc to
>>> document the last release, I revisited the topic this weekend and actually
>>> made some progress.[1] I'd like to gather a second round of feedback on
>>> that.
>>>
>>> The documentation is here:
>>> http://rdkit.org/docs_temp/index.html
>>> The API docs (which are where the biggest changes are) are here:
>>> http://rdkit.org/docs_temp/api-docs.html
>>>
>>> To address some of the things raised last time:
>>> - This really isn't optional. It's been more than a decade since epydoc
>>> was updated and it requires python 2.7.
>>> - My previous attempt to auto-generate docs used pdoc (
>>> https://github.com/BurntSushi/pdoc). That project also seems to have
>>> died, so it's not really an option.
>>> - Based upon the two factors above I decided to use the autodoc
>>> functionality that's part of Sphinx. It's not perfect, but it's supported
>>> (and seems likely to continue to be so since it's part of Sphinx)
>>>
>>> - The docs now have a search box
>>>
>>> - We've lost the overview (list of classes/functions/etc) that epydoc
>>> provides. There likely is a way to do this with sphinx, but I haven't
>>> managed to get it to work yet
>>>
>>> - Formatting: Some of the docstrings end up looking pretty good, others
>>> are awful.
>>> Here's a module that demonstrates both sides of the coin:
>>> http://rdkit.org/docs_temp/source/rdkit.Chem.AtomPairs.Pairs.html#module-rdkit.Chem.AtomPairs.Pairs
>>> Fixing this is "just" a matter of editing the doc strings. This is
>>> reasonably mechanical, but unfortunately not automatable, work. It should
>>> be done, but in the meantime the broken docstrings aren't completely
>>> useless.
>>>
>>> There's also a github issue for this:
>>> https://github.com/rdkit/rdkit/issues/1656
>>> I'm doing the work on this branch:
>>> https://github.com/greglandrum/rdkit/tree/dev/usinx_sphinx_autodoc
>>>
>>> -greg
>>> [1] Remember how I said I was going to take a short break and do
>>> something fun? This isn't that.
>>>
>>> --
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> ___
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>> --
>> David Cosgrove
>> Freelance computational chemistry and chemoinformatics developer
>> http://cozchemix.co.uk
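
The archive URL scheme Greg gives at the top of this message can be written down as a small helper. This is only a sketch: the function name docs_archive_url is made up here, and the pattern is inferred from the two example URLs in the message, so it assumes other releases follow the same RDKit_Docs.<year>_<month>_<patch>.tgz naming. Whether a given release was actually uploaded is not checked.

```python
def docs_archive_url(release, patch=1):
    """Build the docs tarball URL for a release string such as '2017.09'.

    Hypothetical helper: the pattern is inferred from the two URLs quoted
    in the message, not from any published specification.
    """
    year, month = release.split(".")
    return "http://rdkit.org/RDKit_Docs.{}_{}_{}.tgz".format(year, month, patch)


# The two releases named in the message.
for rel in ("2017.09", "2016.03"):
    print(docs_archive_url(rel))
```

The patch argument defaults to 1 because, as the message notes, usually only the "_1.tgz" archive exists.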
Re: [Rdkit-discuss] Atom mapping
On 10/05/2018 10:39, carlo del moro wrote:
> I'll give an example to better explain my problem. Starting from a PDB file
> representing HPE, I use RDKit/obabel to calculate the corresponding SMILES.

The three-letter code (chemical component id) in a PDB file has meaning - it is a pointer to chemistry. The chemistry description can be retrieved from the RCSB. Unless you know that the three-letter code doesn't refer to a standard chemical (as might be the case, for example, in internal use of 'LIG'), you'd be well advised to get the chemistry from the canonical source. Here's a script that displays the SMILES strings.

It seems to me that it would be better to go from PDB file -> RDKit molecule without the straitjacket of SMILES.

Paul.

    import sys
    import urllib.request

    # Pass the three-letter code (e.g. HPE) as the first argument.
    if len(sys.argv) > 1:
        tlc = sys.argv[1]
        url = 'http://files.rcsb.org/ligands/view/' + tlc + '.cif'
        with urllib.request.urlopen(url) as response:
            html = response.read()
        lines = html.split(b'\n')
        print_it = False
        for line in lines:
            # A line starting with '#' closes the descriptor block. Testing
            # for '#' anywhere would also stop at a SMILES with a triple bond.
            if line.startswith(b'#'):
                print_it = False
            if print_it:
                print(line.decode('ascii'))
            # The descriptor rows follow this last column header of the loop.
            if b'_pdbx_chem_comp_descriptor.descriptor' in line:
                print_it = True
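
Paul's script needs network access, so here is a sketch of the same scan as a function over text that can be tried offline. The name descriptor_rows and the sample block are invented for illustration: the component id XXX and its descriptor rows are made up and are not real RCSB output. The layout (a loop whose last column header is _pdbx_chem_comp_descriptor.descriptor, closed by a line starting with #) is assumed to match what the script expects.

```python
# Made-up sample in the assumed layout; NOT real RCSB data.
SAMPLE_CIF = """\
#
loop_
_pdbx_chem_comp_descriptor.comp_id
_pdbx_chem_comp_descriptor.type
_pdbx_chem_comp_descriptor.program
_pdbx_chem_comp_descriptor.program_version
_pdbx_chem_comp_descriptor.descriptor
XXX SMILES demo 1.0 "CC#N"
XXX SMILES_CANONICAL demo 1.0 "CC#N"
#
_other.item value
"""


def descriptor_rows(cif_text):
    """Return the rows between the descriptor column header and the next '#' line."""
    rows = []
    in_block = False
    for line in cif_text.splitlines():
        if line.startswith("#"):
            in_block = False          # a '#' line ends the current category
        if in_block:
            rows.append(line)
        if "_pdbx_chem_comp_descriptor.descriptor" in line:
            in_block = True           # data rows start on the next line
    return rows


for row in descriptor_rows(SAMPLE_CIF):
    print(row)
```

The sketch ends the block only on a line that starts with #, so a SMILES containing a triple bond (written with #) is not mistaken for the terminator.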
Re: [Rdkit-discuss] Atom mapping
Hi,

The SMILES atom order is saved in a private property, '_smilesAtomOutputOrder'; see the discussion on GitHub: https://github.com/rdkit/rdkit/issues/794

The order of atoms in the PDB file is the same as in RDKit's Mol object, so it's fairly easy to find such a mapping.

Pozdrawiam, | Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl

2018-05-10 11:39 GMT+02:00 carlo del moro:
> Thanks to all for the replies.
>
> I'll give an example to better explain my problem. Starting from a PDB file
> representing HPE, I use RDKit/obabel to calculate the corresponding SMILES.
> Next, using an RDKit function, I fragment the SMILES into substructures
> like "CC(=O)O"; now I need to remap this substructure onto the starting
> three-dimensional structure in order to get the atom coordinates. The task
> would be pretty easy if the numbering of the atoms in the SMILES were the
> same as in the starting PDB file. Do you know any method to unify these two
> numberings, or to map the SMILES atom sequence onto the PDB's one?
> This is the PDB for HPE.
>
> HETATM 4176  N   HPE B   2       5.227  20.107  15.512  1.00 17.92           N
> HETATM 4177  CA  HPE B   2       4.065  20.646  16.205  1.00 16.87           C
> HETATM 4178  C   HPE B   2       2.784  20.702  15.373  1.00 18.59           C
> HETATM 4179  O   HPE B   2       2.806  21.092  14.215  1.00 17.45           O
> HETATM 4180  CB  HPE B   2       4.377  22.085  16.699  1.00 17.52           C
> HETATM 4181  CG  HPE B   2       5.532  22.067  17.720  1.00 14.97           C
> HETATM 4182  CD  HPE B   2       5.886  23.416  18.279  1.00 17.87           C
> HETATM 4183  CE1 HPE B   2       6.717  24.309  17.627  1.00 17.26           C
> HETATM 4184  CE2 HPE B   2       5.385  23.752  19.520  1.00 19.20           C
> HETATM 4185  CZ1 HPE B   2       7.025  25.546  18.162  1.00 17.16           C
> HETATM 4186  CZ2 HPE B   2       5.698  24.993  20.061  1.00 22.45           C
> HETATM 4187  CH  HPE B   2       6.517  25.906  19.409  1.00 19.18           C
>
> Thanks to all.
>
> Carlo
>
> On Wed, May 9, 2018 at 8:37 PM, Dimitri Maziuk wrote:
>
>> On 05/09/2018 10:27 AM, carlo del moro wrote:
>> > Dear All,
>> >
>> > we would like to know if it is possible to map the atom's ID of a SMILES
>> > represented substructure to the atom sequence of a ligand contained in a
>> > pdb file. This in order to get the spatial coordinates related to such
>> > substructure.
>>
>> http://alatis.nmrfam.wisc.edu/ will generate unique stable IDs from a 3D
>> structure, and output the old->new ID map. It'll take a PDB, you'll
>> have to convert your SMILES into a 3D .mol. ALATIS atom IDs should be
>> the same in the two maps, *provided both inputs describe the exact same
>> ligand*.
>>
>> (It's the *substructure* bit that I'm not entirely sure about.)
>> --
>> Dimitri Maziuk
>> Programmer/sysadmin
>> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
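
The bookkeeping behind Maciek's suggestion can be sketched without RDKit. The sketch relies on his statement that the Mol atom order equals the PDB record order, so the i-th HETATM record is atom index i. The names read_hetatm and smiles_order are invented here, and the smiles_order values are a hypothetical stand-in for what '_smilesAtomOutputOrder' might hold (assumed to give, for each SMILES position, the Mol atom index); they are not output from RDKit. The coordinates are the first four records of Carlo's example.

```python
PDB_BLOCK = """\
HETATM 4176  N   HPE B   2       5.227  20.107  15.512  1.00 17.92           N
HETATM 4177  CA  HPE B   2       4.065  20.646  16.205  1.00 16.87           C
HETATM 4178  C   HPE B   2       2.784  20.702  15.373  1.00 18.59           C
HETATM 4179  O   HPE B   2       2.806  21.092  14.215  1.00 17.45           O
"""


def read_hetatm(pdb_text):
    """Return (atom_name, (x, y, z)) tuples in file order, read from PDB fixed columns."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            name = line[12:16].strip()                      # columns 13-16
            xyz = (float(line[30:38]),                      # columns 31-38
                   float(line[38:46]),                      # columns 39-46
                   float(line[46:54]))                      # columns 47-54
            atoms.append((name, xyz))
    return atoms


atoms = read_hetatm(PDB_BLOCK)

# Hypothetical values: SMILES position -> Mol/PDB atom index.
smiles_order = [3, 2, 1, 0]

for pos, idx in enumerate(smiles_order):
    name, xyz = atoms[idx]
    print(pos, name, xyz)
```

With a real molecule the index list would come from RDKit after writing the SMILES, and the lookup into the PDB-ordered atom list would stay the same.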
Re: [Rdkit-discuss] Atom mapping
Thanks to all for the replies.

I'll give an example to better explain my problem. Starting from a PDB file representing HPE, I use RDKit/obabel to calculate the corresponding SMILES. Next, using an RDKit function, I fragment the SMILES into substructures like "CC(=O)O"; now I need to remap this substructure onto the starting three-dimensional structure in order to get the atom coordinates. The task would be pretty easy if the numbering of the atoms in the SMILES were the same as in the starting PDB file. Do you know any method to unify these two numberings, or to map the SMILES atom sequence onto the PDB's one?

This is the PDB for HPE.

HETATM 4176  N   HPE B   2       5.227  20.107  15.512  1.00 17.92           N
HETATM 4177  CA  HPE B   2       4.065  20.646  16.205  1.00 16.87           C
HETATM 4178  C   HPE B   2       2.784  20.702  15.373  1.00 18.59           C
HETATM 4179  O   HPE B   2       2.806  21.092  14.215  1.00 17.45           O
HETATM 4180  CB  HPE B   2       4.377  22.085  16.699  1.00 17.52           C
HETATM 4181  CG  HPE B   2       5.532  22.067  17.720  1.00 14.97           C
HETATM 4182  CD  HPE B   2       5.886  23.416  18.279  1.00 17.87           C
HETATM 4183  CE1 HPE B   2       6.717  24.309  17.627  1.00 17.26           C
HETATM 4184  CE2 HPE B   2       5.385  23.752  19.520  1.00 19.20           C
HETATM 4185  CZ1 HPE B   2       7.025  25.546  18.162  1.00 17.16           C
HETATM 4186  CZ2 HPE B   2       5.698  24.993  20.061  1.00 22.45           C
HETATM 4187  CH  HPE B   2       6.517  25.906  19.409  1.00 19.18           C

Thanks to all.

Carlo

On Wed, May 9, 2018 at 8:37 PM, Dimitri Maziuk wrote:
> On 05/09/2018 10:27 AM, carlo del moro wrote:
> > Dear All,
> >
> > we would like to know if it is possible to map the atom's ID of a SMILES
> > represented substructure to the atom sequence of a ligand contained in a
> > pdb file. This in order to get the spatial coordinates related to such
> > substructure.
>
> http://alatis.nmrfam.wisc.edu/ will generate unique stable IDs from a 3D
> structure, and output the old->new ID map. It'll take a PDB, you'll
> have to convert your SMILES into a 3D .mol. ALATIS atom IDs should be
> the same in the two maps, *provided both inputs describe the exact same
> ligand*.
>
> (It's the *substructure* bit that I'm not entirely sure about.)
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu