Re: [Rdkit-discuss] Beta of the 2022.03.1 release available
Hi Greg,

to give you some feedback: I switched my current research project to the beta version and haven't found any problems yet ;-)

Best,
Markus

On Fri, Mar 18, 2022 at 1:32 PM Greg Landrum wrote:
> Dear all,
>
> I tagged the first beta of the 2022.03 RDKit release this morning.
> Assuming nothing weird shows up during testing, we'll do the actual
> release on the 25th.
>
> You can find the new beta here:
> https://github.com/rdkit/rdkit/releases/tag/Release_2022_03_1b1
>
> Conda builds of the beta are available in the rdkit channel for python
> 3.8 on Mac and Linux:
> conda install -c rdkit/label/beta rdkit rdkit=2022.03
>
> Please try out the beta and let us know if you find any problems!
>
> Best regards,
> -greg

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] Cheminformatics Graduate School Recommendations?
Hi Patrick,

labs I would take a look at (in no particular order and, well, a bit heavy on European labs):

Irwin Lab, UCSF: https://profiles.ucsf.edu/john.irwin
Bajorath Group, Bonn, Germany: https://www.limes-institut-bonn.de/forschung/arbeitsgruppen/unit-4/abteilung-bajorath/abt-bajorath-startseite/
Reymond Group, Bern, Switzerland: https://www.gdb.unibe.ch/
Rarey Group, Hamburg, Germany: https://www.zbh.uni-hamburg.de/personen/amd/mrarey.html
Leach Team, Cambridge, UK: https://www.ebi.ac.uk/about/people/andrew-leach
Czodrowski Lab, Dortmund, Germany: https://www.czodrowskilab.org/team

Best,
Markus

On Mon, Jul 19, 2021 at 6:17 PM Patrick Neal wrote:
> Hi All,
>
> I apologize if this is too far off topic, but I got a recommendation to
> ask here since this community is the most likely to know!
>
> I'm about to graduate from my undergrad chemistry program and I'm looking
> for graduate schools. I started in traditional computational chemistry
> research, but have really loved the cheminformatics/datascience aspects of
> drug discovery. I'm hoping to ask the community if you all have any
> recommendations for academic labs (ideally US based) with interesting
> cheminformatics research?
>
> I'm specifically interested in fingerprinting methods (encoding
> 3D/conformational information), similarity search/clustering compounds at
> scale, and automation tools for QM calculations. But, I would be grateful
> to hear of any labs you think are doing great cheminformatics work!
>
> All the best,
>
> Patrick
Re: [Rdkit-discuss] Chembience Postgres RDKit extension
Hello,

I have added a Postgres 13/RDKit 2021.03 version to my Chembience Postgres RDKit project (https://github.com/chembience/docker-postgres-rdkit-compile).

The available Docker images are now:

chembience/postgres-rdkit:postgres-13.rdkit-2021.03 (new)
chembience/postgres-rdkit:postgres-13.rdkit-2020.09
chembience/postgres-rdkit:postgres-12.rdkit-2020.03
chembience/postgres-rdkit:postgres-11.rdkit-2019.09

Best,
Markus

On Mon, Mar 8, 2021 at 8:49 AM Greg Landrum wrote:
> That's really cool, thanks Markus!
>
> On Sat, Mar 6, 2021 at 7:34 PM Markus Sitzmann wrote:
>
>> Hello,
>>
>> I have reworked the Postgres RDKit extension module of Chembience and
>> made it a spin-off project of its own which is available at:
>>
>> https://github.com/chembience/docker-postgres-rdkit-compile
>>
>> It is now based on a fork of the Official Postgres Docker Image
>> repository at GitHub just adding the compilation of the RDKit extension
>> module to it. It allows for local compilation of the package, however, I
>> also provide ready-to-pull Docker images at DockerHub of it. Currently
>> available by docker pull are (they all are usable independently of any
>> Chembience setup) :
>>
>> chembience/postgres-rdkit:postgres-13.rdkit-2020.09
>> chembience/postgres-rdkit:postgres-12.rdkit-2020.03
>> chembience/postgres-rdkit:postgres-11.rdkit-2019.09
>>
>> My plan is to keep this project up-to-date if newer versions of RDKit or
>> Postgres are released.
>>
>> I also have updated Chembience https://github.com/chembience/chembience
>> to version 0.2.18 last week. This is mostly an upgrade to RDKit 2020.09
>> (before it becomes the "old" version) and Postgres 13 and relies already on
>> the project above.
>>
>> Best.
>> Markus
[Rdkit-discuss] Chembience Postgres RDKit extension
Hello,

I have reworked the Postgres RDKit extension module of Chembience and made it a spin-off project of its own, which is available at:

https://github.com/chembience/docker-postgres-rdkit-compile

It is now based on a fork of the Official Postgres Docker Image repository at GitHub and only adds the compilation of the RDKit extension module to it. It allows for local compilation of the package; however, I also provide ready-to-pull Docker images of it at DockerHub. Currently available by docker pull are (they are all usable independently of any Chembience setup):

chembience/postgres-rdkit:postgres-13.rdkit-2020.09
chembience/postgres-rdkit:postgres-12.rdkit-2020.03
chembience/postgres-rdkit:postgres-11.rdkit-2019.09

My plan is to keep this project up to date as newer versions of RDKit or Postgres are released.

I also updated Chembience (https://github.com/chembience/chembience) to version 0.2.18 last week. This is mostly an upgrade to RDKit 2020.09 (before it becomes the "old" version) and Postgres 13, and it already relies on the project above.

Best,
Markus
Re: [Rdkit-discuss] RDKit/tautomers
Hi Benny,

that is a pure InChI problem (not an RDKit one). Back when the Standard InChI was defined, the 15T and KET options for the InChI calculation were either not yet available or still experimental (I don't remember which :-)), so they didn't make it into the standard set of options for the Standard InChI calculation. Hence it isn't too surprising that this tautomer pair doesn't yield the same Standard InChI (InChI isn't/wasn't particularly strong regarding tautomerism outside rings).

You might use the (non-standard) InChI and switch the 15T and KET options on; that should fix your particular case.

In general, there are still ongoing efforts to make InChI stronger regarding tautomerism: https://pubmed.ncbi.nlm.nih.gov/32043883/

Markus

On Tue, Jul 21, 2020 at 12:11 PM Da'Adoosh Binyamin <daado...@tauex.tau.ac.il> wrote:
> Hi,
>
> I have a question about RDKit/tautomers.
>
> Let's say I have smiles input:
>
> C[CH]2CCC(=O)C1=C(O)[CH](O)C[CH](O)[CH]12
> C[CH]2CCC(O)=C1C(=O)[CH](O)C[CH](O)[CH]12
>
> Now, if I make this code for each input:
>
> m = Chem.MolFromSmiles(input)
> inchi = Chem.rdinchi.MolToInchi(m)
>
> I get different InChIs:
>
> InChI=1S/C11H16O4/c1-5-2-3-6(12)10-9(5)7(13)4-8(14)11(10)15/h5,7-9,13-15H,2-4H2,1H3
> InChI=1S/C11H16O4/c1-5-2-3-6(12)10-9(5)7(13)4-8(14)11(10)15/h5,7-9,12-14H,2-4H2,1H3
>
> My question is why is it happening. Usually if I enter two tautomers -
> they have the same InChI (like it is supposed to be, according to the
> literature ). What is the difference in this example?
>
> Thanks,
> Benny
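
To see where the two Standard InChIs quoted in this message actually diverge, it can help to split them into their "/"-separated layers and compare layer by layer. The following is only a minimal sketch using the Python standard library (no RDKit needed); the helper name differing_layers is made up for illustration.

```python
# The two Standard InChIs quoted in the question.
inchi_a = "InChI=1S/C11H16O4/c1-5-2-3-6(12)10-9(5)7(13)4-8(14)11(10)15/h5,7-9,13-15H,2-4H2,1H3"
inchi_b = "InChI=1S/C11H16O4/c1-5-2-3-6(12)10-9(5)7(13)4-8(14)11(10)15/h5,7-9,12-14H,2-4H2,1H3"


def differing_layers(a, b):
    """Return the pairs of '/'-separated layers that differ (hypothetical helper)."""
    layers_a = a.split("/")
    layers_b = b.split("/")
    return [(x, y) for x, y in zip(layers_a, layers_b) if x != y]


for layer_a, layer_b in differing_layers(inchi_a, inchi_b):
    print(layer_a, "vs", layer_b)
```

For this pair, the formula and connectivity (/c) layers are identical and only the hydrogen (/h) layer differs, which is consistent with the two inputs being treated as distinct structures rather than as one tautomeric system.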
Re: [Rdkit-discuss] Install RDKit (Docker, rapids, rdkit=2020.03.2) ?/
Hi Joey,

maybe the Dockerfile of my Chembience project helps:

https://github.com/chembience/chembience/blob/master/context/build/rdkit/Dockerfile

The chembience/python-base image it starts from doesn't actually do much except provide a very basic setup; its Dockerfile is here:

https://github.com/chembience/chembience/blob/master/context/build/base/Dockerfile

So it should be replaceable with the rapidsai/rapidsai:0.12-cuda10.1-runtime-ubuntu18.04 image you want to start from.

Markus

On Wed, Jun 3, 2020 at 10:43 PM Storer, Joey (J) wrote:
> Hi,
>
> I am trying to run the following Docker file and the container fails to
> install rdkit. Other incarnations install either the 2019 version or even
> the 2017 version.
>
> #
> FROM rapidsai/rapidsai:0.12-cuda10.1-runtime-ubuntu18.04
>
> ARG ENVNAME=rapids
> ENV ENVNAME=$ENVNAME
>
> RUN source activate $ENVNAME && \
> conda install boost>='1.72.0,<1.72.1.0a0' cairo>='1.16.0,<1.17.0a0'
> freetype>='2.9.1,<3.0a0' libgcc-ng>='7.3.0' libstdcxx-ng>='7.3.0'
> numpy>='1.14.6,<2.0a0' pandas pillow pycairo python>='3.7,<3.8.0a0'
> python_abi='3.7.* *_cp37m' six
>
> RUN source activate $ENVNAME && \
> conda install "rdkit=2020.03.2=py37hdd87690_0"
>
> #
>
> Any advice on getting RDKit into a Rapids/Ubuntu Docker container?
>
> Thanks!
> Joey Storer (Dow, Inc.)
Re: [Rdkit-discuss] RD Kit PostgreSQL in a container
Hi,

I am working on this for the next Chembience release (0.3.0, which I hope will be out in January). It adds RDKit to the official Postgres container repository at https://hub.docker.com/_/postgres

If you check out https://github.com/chembience/chembience-postgresql-rdkit.git *and use branch deploy*, it should be in working condition (images should be available from Docker Hub). It can be used via the docker-compose.yml script provided in the repository, i.e. it can be started with *docker-compose up*

I will add more documentation and some improvements for the January release :-). And it currently works only for Postgres 11.

Best,
Markus

https://chembience.com

On Wed, Dec 4, 2019 at 7:38 PM Webster Homer <webster.ho...@milliporesigma.com> wrote:
> I’m looking at running RD Kit Postgresql cartridge in a docker container.
> Has anyone done this? There are PostgreSQL containers available on line at
> https://hub.docker.com/_/postgres if there is an existing dockerfile
> with the RDKit extension, that would be great.
>
> If not has anyone built one? Ideally I’d start from one of the existing
> dockerfiles.
>
> RDKit Postgresql in the current distribution is version 11.2, the
> dockerfiles on the hub include an 11 and an 11.6 version. Any idea as to
> which one to use?
>
> I’m new to dockerfiles, I’d appreciate any suggestions
>
> Regards,
> Webster Homer
Re: [Rdkit-discuss] Inchi/smiles conversion issue
Yes, this is a well-known problem. First of all, if more than one chemist is present, you can always have a long discussion about what the most stable tautomeric form of a given compound (under certain conditions) might be. In the case of InChI, however, if you ask the algorithm for the tautomer-invariant representation of a compound, i.e., the canonical tautomer (and the Standard InChI does this inherently), everybody agrees that in quite a few cases the tautomer the InChI algorithm chooses as the canonical one is rather odd :-)

Markus

- | Markus Sitzmann | markus.sitzm...@gmail.com

> On 18. Jun 2019, at 18:41, Alexis Parenty wrote:
>
> Dear Jennifer,
> Many thanks for your response. Very useful tutorial on Inchi. I did not know
> about the FixedH option:
> inchi = Chem.MolToInchi(mol, options='/FixedH')
> Best,
> Alexis
>
>> On Tue, 18 Jun 2019 at 13:20, Jennifer Hemmerich wrote:
>> Dear Alexis,
>>
>> if you calculate the Standard Inchi it is invariant to tautomers (see here:
>> https://www.inchi-trust.org/technical-faq-2/#6.1). Therefore the information
>> which tautomer was converted is lost due to the Inchi conversion. If you
>> want to keep the tautomer information you need to use the fixedH attribute
>> for the inchi. But beware this makes it a non standard Inchi, and thus might
>> not be comparable to other Inchis.
>>
>> Hope this helps,
>>
>> Jennifer
>>
>>> On 18.06.19 12:59, Alexis Parenty wrote:
>>> Dear RdKiters,
>>>
>>> Why is it that the stable tautomer of the following structure is lost
>>> during inchi/smiles conversion?
>>>
>>> mol = Chem.MolFromSmiles("Cc1ccc([nH]nc2)c2c1")
>>> inchi = Chem.MolToInchi(mol)
>>> mol = Chem.MolFromInchi(inchi)
>>> smiles = Chem.MolToSmiles(mol)
>>> print(smiles)
>>>
>>> ==> Cc1ccc2n[nH]cc2c1
>>>
>>> The H has shifted on the wrong Nitrogen…
>>>
>>> Interestingly, if you remove the methyl, the shift no longer happens:
>>>
>>> mol = Chem.MolFromSmiles("c1([nH]nc2)c21")
>>> inchi = Chem.MolToInchi(mol)
>>> mol = Chem.MolFromInchi(inchi)
>>> smiles = Chem.MolToSmiles(mol)
>>> print(smiles)
>>> ==> c1([nH]nc2)c21
>>>
>>> Same issue for any secondary amides: if you pass the smiles of a secondary
>>> amide, you end-up with the following unstable tautomer:
>>>
>>> Thanks,
>>>
>>> Alexis
Re: [Rdkit-discuss] History of RDKit
Hi Paul,

maybe this is helpful, too:

https://cactus.nci.nih.gov/presentations/meeting-08-2011/Fri_Aft_Greg_Landrum_RDKit-PostgreSQL.pdf

Markus

On Tue, Apr 23, 2019 at 11:45 AM Czodrowski, Paul <paul.czodrow...@tu-dortmund.de> wrote:
> Dear RDKitters,
>
> I’m using RDKit (of course!) for my “Data Science for Chemistry and
> Chemical Biology” class.
>
> Is anyone aware of a historic RDKit overview which is a bit more
> non-historic like this wonderful slide deck:
>
> https://www.rdkit.org/UGM/2012/Landrum_RDKit_UGM.History%20and%20Status.Final.pptx.pdf
>
> Best regards,
> Paul
>
> Prof. Dr. Paul Czodrowski
> Computational Chemical Biology
>
> TU Dortmund University
> Faculty of Chemistry and Chemical Biology
> Otto-Hahn-Strasse 6
> 44227 Dortmund
>
> Twitter www.twitter.com/czodrowskipaul
> Lab page www.czodrowskilab.org
> Music www.czodrowskilab.org/music
Re: [Rdkit-discuss] 2019.03.1 RDKit Release
I appreciate this release and have updated all Chembience components to RDKit 2019.03:

https://github.com/chembience/chembience/releases/tag/0.2.10

Best,
Markus

On Tue, Apr 9, 2019 at 5:43 AM Greg Landrum wrote:
> Dear all,
>
> I'm pleased to announce that the next version of the RDKit - 2019.03 - is
> released. The release notes are below.
>
> The release files are on the github release page:
> https://github.com/rdkit/rdkit/releases/tag/Release_2019_03_1
>
> Binaries have been uploaded to anaconda.org (https://anaconda.org/rdkit).
> The available conda binaries for this release are:
> Linux 64bit: python 3.6, 3.7
> Mac OS 64bit: python 3.6, 3.7
> Windows 64bit: python 3.6, 3.7
>
> I believe that conda-forge will also switch to the new version in the near
> future.
>
> Please note that the RDKit no longer supports Python 2.7. More details on
> this here:
> https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg08354.html
>
> I plan to put conda builds of the PostgreSQL cartridge up in the near
> future.
>
> The online version of the documentation at rdkit.org
> (http://rdkit.org/docs/index.html) has been updated.
>
> Some things that will be finished over the next couple of days:
> - The conda build scripts will be updated to reflect the new version
> - The homebrew script
>
> Thanks to everyone who submitted code, bug reports, and suggestions for
> this release!
>
> Please let me know if you find any problems with the release or have
> suggestions for the next one, which is scheduled for October 2019.
>
> Best Regards,
> -greg
>
> # Release_2019.03.1
> (Changes relative to Release_2018.09.1)
>
> ## REALLY IMPORTANT ANNOUNCEMENT
> - As of this release (2019.03.1) the RDKit no longer supports Python 2.
> Please > read this rdkit-discuss post to learn what your options are if you need > to > keep using Python 2: > > https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg08354.html > > ## Backwards incompatible changes > - The fix for github #2245 means that the default behavior of the > MaxMinPicker > is now truly random. If you would like to reproduce the previous > behavior, > provide a seed value of 42. > - The uncharging method in the MolStandardizer now attempts to generate > canonical results for a given molecule. This may result in different > output > for some molecules. > > ## Highlights: > - There's now a Japanese translation of large parts of the RDKit > documentation > - SGroup data can now be read from and written to Mol/SDF files > - The enhanced stereo handling has been improved: the information is now > accessible from Python, EnumerateStereoisomers takes advantage of it, > and it > can be read from and written to CXSmiles > > ## Acknowledgements: > Michael Banck, Francois Berenger, Thomas Blaschke, Brian Cole, Andrew > Dalke, > Bakary N'tji Diallo, Guillaume Godin, Anne Hersey, Jan Holst Jensen, > Sunhwan Jo, > Brian Kelley, Petr Kubat, Karl Leswing, Susan Leung, John Mayfield, Adam > Moyer, > Dan Nealschneider, Noel O'Boyle, Stephen Roughley, Takayuki Serizawa, > Gianluca > Sforna, Ricardo Rodriguez Schmidt, Gianluca Sforna, Matt Swain, Paolo > Tosco, > Ricardo Vianello, 'John-Videogames', 'magattaca', 'msteijaert', > 'paconius', > 'sirbiscuit' > > ## Bug Fixes: > - PgSQL: fix boolean definitions for Postgresql 11 > (github pull #2129 from pkubatrh) > - update fingerprint tutorial notebook > (github pull #2130 from greglandrum) > - Fix typo in RecapHierarchyNode destructor > (github pull #2137 from iwatobipen) > - SMARTS roundtrip failure > (github issue #2142 from mcs07) > - Error thrown in rdMolStandardize.ChargeParent > (github issue #2144 from paconius) > - SMILES parsing inconsistency based on input order > (github issue #2148 
from coleb) > - MolDraw2D: line width not in python wrapper > (github issue #2149 from greglandrum) > - Missing Python API Documentation > (github issue #2158 from greglandrum) > - PgSQL: mol_to_svg() changes input molecule. > (github issue #2174 from janholstjensen) > - Remove Unicode From AcidBasePair Name > (github pull #2185 from lilleswing) > - Inconsistent treatment of `[as]` in SMILES and SMARTS > (github issue #2197 from greglandrum) > - RGroupDecomposition fixes, keep userLabels more robust > onlyMatchAtRGroups > (github pull #2202 from bp-kelley) > - Fix TautomerTransform in operator= > (github pull #2203 from bp-kelley) > - testEnumeration hangs/takes where long on 32bit architectures > (github issue #2209 from mbanck) > - Silencing some Python 3 warning messages > (github pull #2223 from coleb) > - removeHs shouldn't remove atom lists > (github issue #2224 from rvianello) > - failure round-tripping mol block with Q atom > (github issue #2225 from rvianello) > - problem round-tripping mol files that include bond topology info > (github issue #2229 from rvianello) > - aromatic main-group atoms written to SMARTS incorrectly > (github issue #2237 from g
Re: [Rdkit-discuss] Beta of the 2019.03 release available
Hi Greg,

the build of my Chembience RDKit image with version 2019.03-b1b went fine (well, I just pull it with conda; in case someone is interested, it is available with tag 0.2.10-beta-1 at Docker Hub).

For the Postgres extension (which I still compile myself against Postgres during the Docker build), your Python 3 enforcement uncovered some dark corners of my build process, but that is fixed. However, compiling 2019.03-b1b against Postgres 11 fails (am I too cheeky?).

Markus

On Wed, Apr 3, 2019 at 11:38 AM Greg Landrum wrote:
> Dear all,
>
> The beta of the 2019.03 RDKit release has been tagged in github:
> https://github.com/rdkit/rdkit/releases/tag/Release_2019_03_1b1
>
> There are a couple more bug fixes and maybe one more feature expected
> before the actual release, but I wanted to go ahead and get the beta out
> there.
>
> I've done conda builds for Python 3.6 and 3.7 for Windows, Mac, and Linux.
> These all use the beta label so that they do not install by default; you'll
> need to run "conda install" as follows:
>
> conda install -c rdkit/label/beta rdkit
>
> Be sure to confirm that it's installing the right version when you are
> prompted (if there's no build available, it will pick the current
> production release instead).
>
> The relevant section of the release notes is below, or you can see a
> nicely formatted version here:
> https://github.com/rdkit/rdkit/releases/tag/Release_2019_03_1b1
>
> As usual, if you have time to try out the new release I would love
> feedback. If nothing major comes up, I plan to do the actual release early
> next week.
>
> Best,
> -greg
>
> # Release_2019.03.1
> (Changes relative to Release_2018.09.1)
>
> ## REALLY IMPORTANT ANNOUNCEMENT
> - As of this release (2019.03.1) the RDKit no longer supports Python 2.
> Please read this rdkit-discuss post to learn what your options are if you > need to keep using Python 2: > > https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg08354.html > > ## Backwards incompatible changes > - The fix for github #2245 means that the default behavior of the MaxMinPicker > is now truly random. If you would like to reproduce the previous behavior, > provide a seed value of 42. > - The uncharging method in the MolStandardizer now attempts to generate > canonical results for a given molecule. This may result in different output > for some molecules. > > ## Highlights: > - There's now a Japanese translation of large parts of the RDKit documentation > - SGroup data can now be read from and written to Mol/SDF files > - The enhanced stereo handling has been improved: the information is now > accessible from Python, EnumerateStereoisomers takes advantage of it, and it > can be read from and written to CXSmiles > > ## Acknowledgements: > Michael Banck, Francois Berenger, Thomas Blaschke, Brian Cole, Andrew Dalke, > Bakary N'tji Diallo, Guillaume Godin, Jan Holst Jensen, Sunhwan Jo, Brian > Kelley, Petr Kubat, Karl Leswing, Susan Leung, John Mayfield, Adam Moyer, Dan > Nealschneider, Noel O'Boyle, Stephen Roughley, Takayuki Serizawa, Gianluca > Sforna, Ricardo Rodriguez Schmidt, Matt Swain, Paolo Tosco, Ricardo Vianello, > 'John-Videogames', 'magattaca', 'msteijaert', 'paconius', 'sirbiscuit' > > ## Bug Fixes: > - PgSQL: fix boolean definitions for Postgresql 11 > (github pull #2129 from pkubatrh) > - update fingerprint tutorial notebook > (github pull #2130 from greglandrum) > - Fix typo in RecapHierarchyNode destructor > (github pull #2137 from iwatobipen) > - SMARTS roundtrip failure > (github issue #2142 from mcs07) > - Error thrown in rdMolStandardize.ChargeParent > (github issue #2144 from paconius) > - SMILES parsing inconsistency based on input order > (github issue #2148 from coleb) > - MolDraw2D: line width not in python wrapper 
> (github issue #2149 from greglandrum) > - Missing Python API Documentation > (github issue #2158 from greglandrum) > - PgSQL: mol_to_svg() changes input molecule. > (github issue #2174 from janholstjensen) > - Remove Unicode From AcidBasePair Name > (github pull #2185 from lilleswing) > - Inconsistent treatment of `[as]` in SMILES and SMARTS > (github issue #2197 from greglandrum) > - RGroupDecomposition fixes, keep userLabels more robust onlyMatchAtRGroups > (github pull #2202 from bp-kelley) > - Fix TautomerTransform in operator= > (github pull #2203 from bp-kelley) > - testEnumeration hangs/takes where long on 32bit architectures > (github issue #2209 from mbanck) > - Silencing some Python 3 warning messages > (github pull #2223 from coleb) > - removeHs shouldn't remove atom lists > (github issue #2224 from rvianello) > - failure round-tripping mol block with Q atom > (github issue #2225 from rvianello) > - problem round-tripping mol files that include bond topology info > (github issue #2229 from rvianello) > - aromatic main-group atoms written to SMARTS incorrectly > (github issue #2237 from gregland
Re: [Rdkit-discuss] chemfp preprint
Yes, we all love ref 57.

- | Markus Sitzmann | markus.sitzm...@gmail.com

> On 22. Mar 2019, at 20:39, Andrew Dalke wrote:
>
> Hi RDKit users,
>
> This week I submitted a paper about chemfp for publication. I also submitted
> a preprint on ChemRxiv, which was just accepted.
>
> For those interested, it's at
> https://chemrxiv.org/articles/The_Chemfp_Project/7877846 .
>
> It's a rather long paper as it covers many aspects about the chemfp project,
> including the FPS and FPB formats, search algorithms, details about the
> different ways to compute a popcount, and memory bandwidth and latency
> bottlenecks. On a non-technical level I also describe some of the
> difficulties I ran into trying to run chemfp as "commercial free software."
>
> Let me know of any corrections or improvements, or any other feedback you
> might have.
>
> Cheers,
>
> Andrew
> da...@dalkescientific.com
Re: [Rdkit-discuss] RDKIT Build Problems
Thanks Greg.

I have the problem in CI, too: a 100% failure rate over the last two days. At home it only happens occasionally. At least it isn't only me :-)

Markus

- | Markus Sitzmann | markus.sitzm...@gmail.com

> On 6. Mar 2019, at 17:25, Greg Landrum wrote:
>
>> On Wed, Mar 6, 2019 at 1:15 PM Markus Sitzmann wrote:
>>
>> Does someone maybe has the same problem? And an explanation whats going on -
>> the Avalon tools seemed to be unchanged since quite a while.
>
> I've seen it a couple of times in the CI builds for the RDKit over the past
> couple of days. It's caused by downloads failing since sourceforge is
> unreliable.
>
> -greg
[Rdkit-discuss] RDKIT Build Problems
Hi everybody,

For the last couple of days, the build of one of my Docker images has stopped working:

https://github.com/chembience/chembience/blob/develop/context/build/rdkit-postgres-compile/Dockerfile

This has happened to me sporadically over the last few years and usually disappeared after a rebuild. Currently it is annoying enough to write an email. The build ends with:

"""
==
Using strict rotor definition
Downloading http://sourceforge.net/projects/avalontoolkit/files/AvalonToolkit_1.2/AvalonToolkit_1.2.0.source.tar. ..
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   178  100   178    0     0   1181      0 --:--:-- --:--:-- --:--:--  1186
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (35) Unknown SSL protocol error in connection to sourceforge.net:443
CMake Error at Code/cmake/Modules/RDKitUtils.cmake:215 (MESSAGE):
  The md5 checksum for
  /opt/rdkit/External/AvalonTools/AvalonToolkit_1.2.0.source.tar is
  incorrect; expected: 092a94f421873f038aa67d4a6cc8cb54, found:
  d41d8cd98f00b204e9800998ecf8427e
Call Stack (most recent call first):
  External/AvalonTools/CMakeLists.txt:29 (downloadAndCheckMD5)

-- Configuring incomplete, errors occurred!
See also "/opt/rdkit/build/CMakeFiles/CMakeOutput.log".
See also "/opt/rdkit/build/CMakeFiles/CMakeError.log".
ERROR: Service 'rdkit-postgres-compile' failed to build: The command '/bin/sh -c wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - && echo 'deb http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main' > /etc/apt/sources.list.d/pgdg.list && apt-get update && apt-get install -y --no-install-recommends postgresql-server-dev-all postgresql-client postgresql-plpython-${PG_VERSION} postgresql-plpython3-${PG_VERSION} python-numpy python-dev sqlite3 libsqlite3-dev libboost-dev libboost-system-dev libboost-thread-dev libboost-serialization-dev libboost-python-dev libboost-regex-dev libeigen3-dev && git clone -b $RDKIT_BRANCH --single-branch https://github.com/rdkit/rdkit.git && mkdir $RDBASE/build && cd $RDBASE/build && cmake -DRDK_BUILD_INCHI_SUPPORT=ON -DRDK_BUILD_PGSQL=ON -DRDK_BUILD_AVALON_SUPPORT=ON -DPostgreSQL_TYPE_INCLUDE_DIR="/usr/include/postgresql/${PG_VERSION}/server" -DPostgreSQL_ROOT="/usr/lib/postgresql/${PG_VERSION}" .. && make -j `nproc` && make install' returned a non-zero code: 1
Exited with code 1
"""

Does anyone else have the same problem, or an explanation of what's going on? The Avalon tools seem to have been unchanged for quite a while.

Thanks,
Markus
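
A side note on the log above, offered as a hedged sketch rather than a definitive diagnosis: the checksum that CMake reports as "found" is the MD5 digest of empty input, so the downloaded tarball was most likely a zero-byte file. That points to a failed download rather than a changed Avalon release. The snippet uses only Python's standard hashlib module; the helper name diagnose is made up for illustration and is not part of RDKit or CMake.

```python
import hashlib

# Values copied from the CMake error message in the build log.
EXPECTED_MD5 = "092a94f421873f038aa67d4a6cc8cb54"
FOUND_MD5 = "d41d8cd98f00b204e9800998ecf8427e"

# MD5 digest of zero bytes of input.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()


def diagnose(found, expected):
    """Classify a checksum result (hypothetical helper, for illustration only)."""
    if found == expected:
        return "ok"
    if found == EMPTY_MD5:
        return "empty download"
    return "checksum mismatch"


print(diagnose(FOUND_MD5, EXPECTED_MD5))
```

If a build reports this particular digest, retrying the download (or fetching the tarball once and caching it in the image) is probably a better first step than looking for changes in the upstream archive.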
Re: [Rdkit-discuss] RDKit Release 2018.09.2 available
Hi Greg,

in the meantime, I found a solution: I added a "conda update" as the final step when I build my base python/conda container. The RDKit container is built on top of that, and now it finds RDKit 2018.09.2 without my telling conda the version number explicitly. Why this worked before without this step, and why it is necessary even though I build the containers basically from scratch and with the newest version, I still don't know.

Best,
Markus

On Sat, Feb 23, 2019 at 12:29 AM Dimitri Maziuk via Rdkit-discuss <rdkit-discuss@lists.sourceforge.net> wrote:

> On 2/22/19 5:01 PM, Markus Sitzmann wrote:
>
> > It is odd, but one thing I learned from using conda is, sometimes it helps
> > to ignore problems and wait for a bit and they might go away ... well, I
> > have similar experiences with maven :-) ... but most likely I do something
> > stupid which I don't see right now :-)
>
> Simple test is to make a clean one and install only rdkit and nothing
> else and see what happens. It's pretty common for packagers to do
> something-that-may-or-may-not-be-stupid and have a dependency on a
> specific version of some other package that depends on a specific
> version of another package that depends on... turtles all the way down.
>
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
Re: [Rdkit-discuss] RDKit Release 2018.09.2 available
Hi Greg,

unfortunately, the problem persists. It isn't a big one, since if I explicitly ask for 2018.09.2, conda installs it. Only when I just ask to install rdkit does it still deliver version 2018.09.1 (and when I ask the very same conda instance to search for all available rdkit packages, it even finds 2018.09.2).

It is odd, but one thing I learned from using conda is, sometimes it helps to ignore problems and wait for a bit and they might go away ... well, I have similar experiences with maven :-) ... but most likely I do something stupid which I don't see right now :-)

Anyway, thanks for your reply.

Markus

On Fri, Feb 22, 2019 at 9:05 AM Greg Landrum wrote:

> Hi Markus,
>
> I can't reproduce that. Here's what I get when I create a new environment:
>
> (tmp) glandrum@otter:~/Code/rdkit_containers/docker$ conda install conda-forge::rdkit
> Collecting package metadata: done
> Solving environment: done
>
> ## Package Plan ##
>
>   environment location: /other_linux/home/glandrum/anaconda3/envs/tmp
>
>   added / updated specs:
>     - conda-forge::rdkit
>
> The following packages will be downloaded:
>
>     package          | build
>     -----------------|----------------
>     cairo-1.16.0     | ha4e643d_1000    1.5 MB  conda-forge
>
>     rdkit-2018.09.2  | py37h270f4b7_0  20.0 MB  conda-forge
>
>                                 Total:  32.6 MB
>
> The following NEW packages will be INSTALLED:
>
>     blas       pkgs/main/linux-64::blas-1.0-mkl
>
>     boost      conda-forge/linux-64::boost-1.68.0-py37h8619c78_1001
>
>     boost-cpp  conda-forge/linux-64::boost-cpp-1.68.0-h11c811c_1000
>
>     rdkit      conda-forge/linux-64::rdkit-2018.09.2-py37h270f4b7_0
>
> Maybe it was connected to the new version just having appeared? Do you
> still have the same problem?
> > -greg > > > > > > On Fri, Feb 22, 2019 at 12:43 AM Markus Sitzmann < > markus.sitzm...@gmail.com> wrote: > >> Hi Greg, >> >> I just saw it is available in the conda-forge channel (with a time stamp >> of 2 hours + a few minutes), however, if I install it from there (in a >> fresh container) I receive 2018_09_1 - only when I explicitly force version >> 2018_09_2 I receive it (and at a very fast glance it is running). >> >> But why do I have to request version _02 explicitly (right at the moment) >> ... this is one of the few things I never will get with conda? >> >> Markus >> >> >> On Thu, Feb 21, 2019 at 5:32 PM Greg Landrum >> wrote: >> >>> Dear all, >>> >>> I normally don't announce the patch releases, but there are a couple of >>> changes with the conda builds, so I figured I should probably mention it. >>> :-) >>> >>> This time I did builds for: >>> Python 3.7: Mac, Linux, Windows >>> Python 3.6: Mac, Linux, Windows >>> Python 2.7: Mac, Linux >>> >>> The boost and numpy dependencies have also been changed. >>> >>> The conda-forge channel should be updated in the near future as well. >>> >>> The release notes and source download are here: >>> https://github.com/rdkit/rdkit/releases/tag/Release_2018_09_2 >>> >>> Hopefully this all works smoothly, but I'm not 100% optimistic about >>> that; please let me know if you encounter any problems with the new builds! >>> -greg >>> >>> ___ >>> Rdkit-discuss mailing list >>> Rdkit-discuss@lists.sourceforge.net >>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >>> >> ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] RDKit Release 2018.09.2 available
Hi Greg, I just saw it is available in the conda-forge channel (with a time stamp of 2 hours + a few minutes), however, if I install it from there (in a fresh container) I receive 2018_09_1 - only when I explicitly force version 2018_09_2 I receive it (and at a very fast glance it is running). But why do I have to request version _02 explicitly (right at the moment) ... this is one of the few things I never will get with conda? Markus On Thu, Feb 21, 2019 at 5:32 PM Greg Landrum wrote: > Dear all, > > I normally don't announce the patch releases, but there are a couple of > changes with the conda builds, so I figured I should probably mention it. > :-) > > This time I did builds for: > Python 3.7: Mac, Linux, Windows > Python 3.6: Mac, Linux, Windows > Python 2.7: Mac, Linux > > The boost and numpy dependencies have also been changed. > > The conda-forge channel should be updated in the near future as well. > > The release notes and source download are here: > https://github.com/rdkit/rdkit/releases/tag/Release_2018_09_2 > > Hopefully this all works smoothly, but I'm not 100% optimistic about that; > please let me know if you encounter any problems with the new builds! > -greg > > ___ > Rdkit-discuss mailing list > Rdkit-discuss@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss > ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] Warning as error
Maybe this helps (at least, it is from Greg):

https://github.com/rdkit/rdkit/issues/642

Markus

On Mon, Jan 21, 2019 at 2:25 PM Jean-Marc Nuzillard <jm.nuzill...@univ-reims.fr> wrote:

> My problem is more to know which molecules cause problems
> than avoiding the printing of warning messages in the console window.
> I am looking for an option that would turn warnings into errors, if any.
>
> Jean-Marc
>
> On 21/01/2019 at 13:44, Stephen O'hagan wrote:
> > I've had similar problems; none of the claimed methods to switch off
> > RDKit logging of warnings has worked for me.
> >
> > I ended up just re-directing stderr when running the script like this:
> >
> > python myfile.py 2> myErrorLog.txt
> >
> > Dr. Steve O'Hagan,
> >
> > -----Original Message-----
> > From: Jean-Marc Nuzillard [mailto:jm.nuzill...@univ-reims.fr]
> > Sent: 21 January 2019 12:33
> > To: RDKit Discuss
> > Subject: [Rdkit-discuss] Warning as error
> >
> > Dear all,
> >
> > The minimalist python code:
> >     reader = Chem.SDMolSupplier('my_file.sdf')
> >     for mol in reader:
> >         pass
> >
> > gives me warning messages when run on a particular SD file.
> > How can I simply run a specific action for the molecules that cause
> > problems, possibly using try/catch statements?
> > Best,
> >
> > Jean-Marc
> >
> > --
> > Jean-Marc Nuzillard
> > Research Director at CNRS
> >
> > Institut de Chimie Moléculaire de Reims
> > CNRS UMR 7312
> > Moulin de la Housse
> > CPCBAI, Bâtiment 18
> > BP 1039
> > 51687 REIMS Cedex 2
> > France
> >
> > Tel : 03 26 91 82 10
> > Fax : 03 26 91 31 66
> > http://www.univ-reims.fr/ICMR
> > http://eos.univ-reims.fr/LSD/CSNteam.html
> >
> > http://www.univ-reims.fr/LSD/
> > http://www.univ-reims.fr/LSD/JmnSoft/
>
> --
> Jean-Marc Nuzillard
> Research Director at CNRS
>
> Institut de Chimie Moléculaire de Reims
> CNRS UMR 7312
> Moulin de la Housse
> CPCBAI, Bâtiment 18
> BP 1039
> 51687 REIMS Cedex 2
> France
>
> Tel : 03 26 91 82 10
> Fax : 03 26 91 31 66
> http://www.univ-reims.fr/ICMR
> http://eos.univ-reims.fr/LSD/CSNteam.html
>
> http://www.univ-reims.fr/LSD/
> http://www.univ-reims.fr/LSD/JmnSoft/
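For the original question in this thread (finding out which records in the SD file are problematic), one common pattern is to test each molecule for None: Chem.SDMolSupplier yields None for a record it cannot turn into a molecule. A minimal sketch; failed_records is a hypothetical helper name, and it only catches records that fail outright, not warnings that still produce a molecule:

```python
def failed_records(supplier):
    """Return the 0-based indices of records for which the supplier yielded None.

    Hypothetical helper: works with any iterable of molecules, for example an
    RDKit Chem.SDMolSupplier, which yields None for unreadable records.
    """
    return [index for index, mol in enumerate(supplier) if mol is None]


# Usage with RDKit (assumes RDKit is installed and 'my_file.sdf' exists):
#
#     from rdkit import Chem
#     bad = failed_records(Chem.SDMolSupplier('my_file.sdf'))
#     print(bad)
#
# Stand-in data so the sketch runs without RDKit: two "molecules", two failures.
demo_supplier = [object(), None, object(), None]
demo_failures = failed_records(demo_supplier)
print(demo_failures)
```

Records that only trigger a warning would still need the log output (for example the stderr redirect mentioned above) to be matched back to a record index.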
Re: [Rdkit-discuss] Dividing inputstream over threads
> SQLalchemy creates a fairly specific ecosystem that you have to buy > into for it to make sense. When you don't have objects, only a table > of properties, OR mapper is just bloat. There is no need for objects with SQLAlchemy, SQLAlchemy's Core and its expression language is pretty excellent without objects ... >With parallel processing your bottleneck is going to be database >inserts. One option is write out CSV file(s) from each thread/job, >concatenate them in the final node, and then bulk-import into the >database: typically CSV (or other such format) bulk import is orders >of magnitude faster than inserting one SQL statement at a time. ... and bulk-inserts of Python data types into the database. Markus On Sun, Jan 20, 2019 at 9:17 PM Dmitri Maziuk via Rdkit-discuss < rdkit-discuss@lists.sourceforge.net> wrote: > On Sun, 20 Jan 2019 12:03:50 +0100 > Shojiro Shibayama wrote: > > > ... I guess SQLalchemy > > in python might be good, but I'm not sure. Hope that you'll find out > > a good library of SQL OR mapper for python. > > SQLalchemy creates a fairly specific ecosystem that you have to buy > into for it to make sense. When you don't have objects, only a table > of properties, OR mapper is just bloat. > > With parallel processing your bottleneck is going to be database > inserts. One option is write out CSV file(s) from each thread/job, > concatenate them in the final node, and then bulk-import into the > database: typically CSV (or other such format) bulk import is orders > of magnitude faster than inserting one SQL statement at a time. > > -- > Dmitri Maziuk > > > ___ > Rdkit-discuss mailing list > Rdkit-discuss@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss > ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
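The CSV route described in the quoted text can be sketched as follows. This is an illustration with the standard library only, under several assumptions: sqlite3 stands in for the real database, executemany inside one transaction stands in for a proper bulk loader (such as a database's native CSV import), and the table name, column names, helper names and example rows are all made up.

```python
import csv
import os
import sqlite3
import tempfile


def write_part(directory, part_id, rows):
    """Each worker writes its own CSV part file; no database access needed."""
    path = os.path.join(directory, f"part_{part_id}.csv")
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return path


def bulk_import(connection, part_paths):
    """Load all part files in a single transaction instead of row-by-row commits."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS props (smiles TEXT, mol_weight REAL)"
    )
    with connection:  # commits once at the end, rolls back on error
        for path in part_paths:
            with open(path, newline="") as handle:
                connection.executemany(
                    "INSERT INTO props VALUES (?, ?)", csv.reader(handle)
                )


with tempfile.TemporaryDirectory() as workdir:
    # Two "workers" produce independent part files (example data).
    parts = [
        write_part(workdir, 0, [("CCO", 46.07), ("CCC", 44.10)]),
        write_part(workdir, 1, [("c1ccccc1", 78.11)]),
    ]
    db = sqlite3.connect(":memory:")
    bulk_import(db, parts)
    row_count = db.execute("SELECT COUNT(*) FROM props").fetchone()[0]
    db.close()

print(row_count)
```

The design point is that the workers never touch the database; only the final step does, and it does so in one batch.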
Re: [Rdkit-discuss] InChI to Mol to InChi
I vaguely remember that InChI gives precedence to 3D coordinates, if present, over anything else for the determination of stereochemistry. And I think that is what happens here: the AllChem embedding of the molecule adds 3D coordinates, which are not present for the original molecule created straight from the InChI. Probably the minimization of the structure during the embedding is “turning around” the stereochemistry (you could probably have a long discussion about whether this is a bug or a feature).

Markus

-
| Markus Sitzmann
| markus.sitzm...@gmail.com

> On 18. Dec 2018, at 19:43, Jason Biggs wrote:
>
> see https://github.com/rdkit/rdkit/issues/1852, and
> https://sourceforge.net/p/rdkit/mailman/message/36309813/
>
> You can see it in the smiles if you remove stereo after embedding, then
> re-detect stereo from the conformation.
>
> inchi1 = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
> m1 = Chem.MolFromInchi(inchi1)
> m1 = Chem.AddHs(m1)
> m2 = Chem.Mol(m1)
> AllChem.EmbedMolecule(m2)
> m3 = Chem.Mol(m2)
> Chem.rdmolops.RemoveStereochemistry(m3)
> Chem.rdmolops.AssignStereochemistryFrom3D(m3)
> sm1 = Chem.MolToSmiles(m1)
> sm2 = Chem.MolToSmiles(m2)
> sm3 = Chem.MolToSmiles(m3)
> print(sm1 == sm2) # returns true
> print(sm2 == sm3) # returns false
>
> The difference between sm2 and sm3 is just swapping a \ for a /, confirming
> what Christos was able to read from the InChI.
>
> Why does the inchi reflect the 3D bond stereo but the smiles doesn't until
> you remove and re-detect the stereo? Does the InChI code go to the 3D
> structure when present and ignore stereo information in the mol object?
>
> Jason Biggs
>
>> On Tue, Dec 18, 2018 at 12:14 PM Christos Kannas wrote:
>> Hi Jean-Marc,
>>
>> The difference is due to bond orientation (if my inchi analysis skills are
>> correct).
>> See the bold bond layer below (14-7+ vs 14-7-).
>>
>> m1 ->
>> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1
>>
>> m2 ->
>> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7-/t17-,19-/m1/s1
>>
>> Not sure why it happens, but I've seen it multiple times...
>>
>> Best,
>> Christos
>>
>> Christos Kannas
>>
>> Chem[o]informatics Researcher & Software Developer
>>
>>> On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD wrote:
>>> Thank you for your answer but alatis might not be adapted to my current
>>> problem.
>>>
>>> Attempting to understand what was changed by the embedding step I wrote:
>>>
>>> inchi1 = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
>>> m1 = Chem.MolFromInchi(inchi1)
>>> m1 = Chem.AddHs(m1)
>>> m2 = Chem.Mol(m1)
>>> AllChem.EmbedMolecule(m2)
>>> sm1 = Chem.MolToSmiles(m1)
>>> sm2 = Chem.MolToSmiles(m2)
>>> print(sm1)
>>> print(sm2)
>>> print(sm1 == sm2)
>>> inc1 = Chem.MolToInchi(m1)
>>> inc2 = Chem.MolToInchi(m2)
>>> print(inc1)
>>> print(inc2)
>>> print(inc1 == inc2)
>>>
>>> Molecules m1 and m2 have identical SMILES representations
>>> but different InChI representations, which I find odd.
>>>
>>> All the best,
>>>
>>> Jean-Marc
>>>
>>> On 18/12/2018 at 00:40, Dimitri Maziuk via Rdkit-discuss wrote:
>>> > On 12/17/18 4:50 PM, JEAN-MARC NUZILLARD wrote:
>>> >> Is there any more deterministic procedure than the one of trying until
>>> >> success is obtained?
>>> >>
>>> >> How do I determine the InChI string of a conformer obtained after
>>> >> multiple embedding?
>>> >
>>> > This representation keeps 3D config: http://alatis.nmrfam.wisc.edu/
>>> >
>>> > Generally speaking the problem with InChI is that the only *required*
Therefore *an* InChI string cannot be used to >>> > differentiate conformers, you need the InChI string with all the >>> > relevant layers and all the proton
Re: [Rdkit-discuss] Chembience
Hello,

I have released Chembience 0.2.6 - it switches Python from 3.6 to 3.7 and updates RDKit to 2018.09.1. Just to mention it, the Docker images of all previous releases are also still available from Docker Hub.

https://github.com/chembience/chembience/releases/tag/v0.2.6

https://twitter.com/markussitzmann/status/105216581521409

Markus

On Tue, Apr 24, 2018 at 10:44 AM Markus Sitzmann wrote:

> Hello,
>
> since it includes RDKit as one of its major components I am happy to
> announce the first release of my new open-source project Chembience:
>
> A Docker-based, cloudable platform for the development of
> chemoinformatics-centric web applications and microservices.
>
> https://github.com/chembience/chembience
>
> (unfortunately it is still on RDKit 2017.09_3, I failed releasing it
> before 2018.03 :-) ).
>
> Best,
> Markus
Re: [Rdkit-discuss] Compilation Errors on RHEL7
Re: [Rdkit-discuss] Chembience
Feedback so far has been kind words and Twitter likes :-). And looking at my GitHub stats I also see some clones. I am happy with it so far - I know it is still a bit heady and I have to improve the documentation a lot. I also want to build some easily distributable open chemoinformatics projects on top of it, which I hope will create more interest. From my Chemical Identifier Resolver days I know you have to be patient.

Markus

-
| Markus Sitzmann
| markus.sitzm...@gmail.com

> On 24. Oct 2018, at 17:56, Greg Landrum wrote:
>
> Glad that update went ok.
>
> Have you gotten any feedback about this yet?
>
>> On Wed, 24 Oct 2018 at 14:10, Markus Sitzmann wrote:
>>
>> Hello,
>>
>> I have released Chembience 0.2.5: it updates the Docker images of the
>> Django and the Jupyter notebook app in Chembience + the Postgres extension
>> of the Chembience database image to RDKit 2018.09 (and it went really
>> smoothly :-) )
>>
>> https://twitter.com/markussitzmann/status/1055047319660490753
>>
>> https://github.com/chembience/chembience/releases
>>
>> https://www.chembience.com
>>
>> Best,
>> Markus
>>
>>> On Fri, Aug 31, 2018 at 11:20 PM Markus Sitzmann wrote:
>>> Hello,
>>>
>>> I have put together another Chembience release (0.2.3): update of RDKit to
>>> version 2018.03.4, Postgres to version 10.5, and Django to 2.1
>>>
>>> https://github.com/chembience/chembience
>>>
>>> https://twitter.com/markussitzmann/status/1035629283736264704
>>>
>>> Best,
>>> Markus
>>>
>>>> On Sun, Jun 10, 2018 at 4:41 PM Markus Sitzmann wrote:
>>>> Hello,
>>>>
>>>> I have just released Chembience 0.2.1: it updates RDKit to version
>>>> 2018.03.2 and switches Postgres from the 9.x series to version 10.4
>>>>
>>>> https://github.com/chembience/chembience
>>>>
>>>> Best,
>>>> Markus
>>>>
>>>>> On Mon, May 14, 2018 at 1:49 AM Markus Sitzmann wrote:
>>>>> Hello,
>>>>>
>>>>> I have released Chembience 0.2.0: it includes an update to RDKit 2018.03
>>>>> and also provides Jupyter as new base App container type.
>>>>>
>>>>> https://github.com/chembience/chembience
>>>>>
>>>>> (so, assuming you have Docker and docker-compose installed on your
>>>>> computer, you are a few, easy commands away from your personal Jupyter
>>>>> notebook server with all RDKit 2018.03 goodness readily available).
>>>>>
>>>>> Best,
>>>>> Markus
>>>>>
>>>>>> On Tue, Apr 24, 2018 at 10:44 AM Markus Sitzmann wrote:
>>>>>> Hello,
>>>>>>
>>>>>> since it includes RDKit as one of its major components I am happy to
>>>>>> announce the first release of my new open-source project Chembience:
>>>>>>
>>>>>> A Docker-based, cloudable platform for the development of
>>>>>> chemoinformatics-centric web applications and microservices.
>>>>>>
>>>>>> https://github.com/chembience/chembience
>>>>>>
>>>>>> (unfortunately it is still on RDKit 2017.09_3, I failed releasing it
>>>>>> before 2018.03 :-) ).
>>>>>>
>>>>>> Best,
>>>>>> Markus
Re: [Rdkit-discuss] Chembience
Hello,

I have released *Chembience 0.2.5*: it updates the Docker images of the Django and the Jupyter notebook app in Chembience + the Postgres extension of the Chembience database image to *RDKit 2018.09* (and it went really smoothly :-) )

https://twitter.com/markussitzmann/status/1055047319660490753

https://github.com/chembience/chembience/releases

https://www.chembience.com

Best,
Markus

On Fri, Aug 31, 2018 at 11:20 PM Markus Sitzmann wrote:

> Hello,
>
> I have put together another Chembience release (0.2.3): update of RDKit to
> version 2018.03.4, Postgres to version 10.5, and Django to 2.1
>
> https://github.com/chembience/chembience
>
> https://twitter.com/markussitzmann/status/1035629283736264704
>
> Best,
> Markus
>
> On Sun, Jun 10, 2018 at 4:41 PM Markus Sitzmann wrote:
>
>> Hello,
>>
>> I have just released Chembience 0.2.1: it updates RDKit to version
>> 2018.03.2 and switches Postgres from the 9.x series to version 10.4
>>
>> https://github.com/chembience/chembience
>>
>> Best,
>> Markus
>>
>> On Mon, May 14, 2018 at 1:49 AM Markus Sitzmann <markus.sitzm...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I have released Chembience 0.2.0: it includes an update to RDKit 2018.03
>>> and also provides Jupyter as new base App container type.
>>>
>>> https://github.com/chembience/chembience
>>>
>>> (so, assuming you have Docker and docker-compose installed on your
>>> computer, you are a few, easy commands away from your personal Jupyter
>>> notebook server with all RDKit 2018.03 goodness readily available).
>>>
>>> Best,
>>> Markus
>>>
>>> On Tue, Apr 24, 2018 at 10:44 AM Markus Sitzmann <markus.sitzm...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> since it includes RDKit as one of its major components I am happy to
>>>> announce the first release of my new open-source project Chembience:
>>>>
>>>> A Docker-based, cloudable platform for the development of
>>>> chemoinformatics-centric web applications and microservices.
>>>>
>>>> https://github.com/chembience/chembience
>>>>
>>>> (unfortunately it is still on RDKit 2017.09_3, I failed releasing it
>>>> before 2018.03 :-) ).
>>>>
>>>> Best,
>>>> Markus
Re: [Rdkit-discuss] [Question] Ok to switch to conda-forge for RDKit builds?
I am happy with conda forge :-). And thanks for the great work. Markus On Thu, Oct 18, 2018 at 5:46 PM Greg Landrum wrote: > Um, guys, there are some interesting side conversations starting here, but > can we please keep this thread on the "is it ok for me to stop doing builds > on the RDKit channel" question? > This is important to me (and possibly the community), so I'd like the keep > the discussion as simple and uncluttered as possible. > > Thanks, > -greg > > > On Thu, Oct 18, 2018 at 5:26 PM Markus Sitzmann > wrote: > >> Hmm, isn't that the problem with any build/dependency automation tool >> and hard to fix in a generic way? If you are really very dependent on a >> specific version of a software you have to be very careful with the >> environment it sits in while you do "carefree" updates only in a carefree >> environment :-) (and environment management got a lot easier the recent >> years) >> >> Markus >> >> On Thu, Oct 18, 2018 at 4:43 PM Greg Landrum >> wrote: >> >>> >>> >>> On Thu, Oct 18, 2018 at 2:21 PM Eric Jonas wrote: >>> >>>> Greg, I'm all for anything that makes the release process on developers >>>> easier; my main question is : With conda-forge, how hard is it to install >>>> just _one_ package without having everything else (say numpy, pandas, etc) >>>> upgraded to the latest conda-forge version? I've had situations in the past >>>> where i'm like "oh I'd just like the latest ___" and suddenly everything in >>>> my conda env has been upgraded to the bleeding edge. >>>> >>> >>> That's a great question, and it's one I don't really know the answer to. 
>>>
>>> On my PC (I'm on the train, and this is what I have with me), here's
>>> what I did:
>>> - create a new conda environment that includes an rdkit-channel RDKit install
>>> - uninstall the RDKit from that
>>> - install the RDKit from the conda-forge channel
>>>
>>> Here's what ends up getting changed:
>>>
>>> ## Package Plan ##
>>>
>>>   environment location: C:\Users\glandrum\Anaconda3\envs\py36_tmp
>>>
>>>   added / updated specs:
>>>     - rdkit
>>>
>>> The following NEW packages will be INSTALLED:
>>>
>>>     boost:     1.67.0-py36_vc14_0           conda-forge [vc14]
>>>     boost-cpp: 1.67.0-vc14_0                conda-forge [vc14]
>>>     pycairo:   1.16.3-py36_vc14_0           conda-forge [vc14]
>>>     rdkit:     2018.03.4-py36h857267b_1000  conda-forge
>>>
>>> The following packages will be UPDATED:
>>>
>>>     certifi:   2018.10.15-py36_0     --> 2018.10.15-py36_1000 conda-forge
>>>     jpeg:      9b-hb83a4c4_2         --> 9b-vc14_2            conda-forge [vc14]
>>>     tk:        8.6.8-hfa6e2cd_0      --> 8.6.8-vc14_0         conda-forge [vc14]
>>>
>>> The following packages will be DOWNGRADED:
>>>
>>>     icu:       58.2-ha66f8fd_1       --> 58.2-vc14_0          conda-forge [vc14]
>>>     libpng:    1.6.35-h2a8f88b_0     --> 1.6.34-vc14_0        conda-forge [vc14]
>>>     libtiff:   4.0.9-h36446d0_2      --> 4.0.9-vc14_0         conda-forge [vc14]
>>>     pillow:    5.3.0-py36hdc69c19_0  --> 5.2.0-py36h08d_0
>>>     pixman:    0.34.0-hcef7cb0_3     --> 0.34.0-vc14_2        conda-forge [vc14]
>>>     vc:        14.1-h0510ff6_4       --> 14-0                 conda-forge
>>>     zlib:      1.2.11-h8395fce_2     --> 1.2.11-vc14_0        conda-forge [vc14]
>>>
>>> That's a fair amount of change, but is less than what I thought might
>>> happen (I was worried about numpy+pandas+... being updated).
>>> So that's one data point. What's your take?
>>>
>>> I will try the same thing on my Mac and Linux boxes tomorrow if no one
>>> else has done it by then.
>>> >>> -greg >>> >>> >>> ___ >>> Rdkit-discuss mailing list >>> Rdkit-discuss@lists.sourceforge.net >>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >>> >> ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] [Question] Ok to switch to conda-forge for RDKit builds?
Hmm, isn't that the problem with any build/dependency automation tool, and hard to fix in a generic way? If you really depend on a specific version of a piece of software, you have to be very careful with the environment it sits in, and do "carefree" updates only in a carefree environment :-) (and environment management has become a lot easier in recent years).

Markus

On Thu, Oct 18, 2018 at 4:43 PM Greg Landrum wrote:

> On Thu, Oct 18, 2018 at 2:21 PM Eric Jonas wrote:
>
>> Greg, I'm all for anything that makes the release process on developers
>> easier; my main question is : With conda-forge, how hard is it to install
>> just _one_ package without having everything else (say numpy, pandas, etc)
>> upgraded to the latest conda-forge version? I've had situations in the past
>> where i'm like "oh I'd just like the latest ___" and suddenly everything in
>> my conda env has been upgraded to the bleeding edge.
>>
>
> That's a great question, and it's one I don't really know the answer to.
>
> On my PC (I'm on the train, and this is what I have with me), here's what
> I did:
> - create a new conda environment that includes an rdkit-channel RDKit install
> - uninstall the RDKit from that
> - install the RDKit from the conda-forge channel
>
> Here's what ends up getting changed:
>
> ## Package Plan ##
>
>   environment location: C:\Users\glandrum\Anaconda3\envs\py36_tmp
>
>   added / updated specs:
>     - rdkit
>
> The following NEW packages will be INSTALLED:
>
>     boost:     1.67.0-py36_vc14_0           conda-forge [vc14]
>     boost-cpp: 1.67.0-vc14_0                conda-forge [vc14]
>     pycairo:   1.16.3-py36_vc14_0           conda-forge [vc14]
>     rdkit:     2018.03.4-py36h857267b_1000  conda-forge
>
> The following packages will be UPDATED:
>
>     certifi:   2018.10.15-py36_0     --> 2018.10.15-py36_1000 conda-forge
>     jpeg:      9b-hb83a4c4_2         --> 9b-vc14_2            conda-forge [vc14]
>     tk:        8.6.8-hfa6e2cd_0      --> 8.6.8-vc14_0         conda-forge [vc14]
>
> The following packages will be DOWNGRADED:
>
>     icu:       58.2-ha66f8fd_1       --> 58.2-vc14_0          conda-forge [vc14]
>     libpng:    1.6.35-h2a8f88b_0     --> 1.6.34-vc14_0        conda-forge [vc14]
>     libtiff:   4.0.9-h36446d0_2      --> 4.0.9-vc14_0         conda-forge [vc14]
>     pillow:    5.3.0-py36hdc69c19_0  --> 5.2.0-py36h08d_0
>     pixman:    0.34.0-hcef7cb0_3     --> 0.34.0-vc14_2        conda-forge [vc14]
>     vc:        14.1-h0510ff6_4       --> 14-0                 conda-forge
>     zlib:      1.2.11-h8395fce_2     --> 1.2.11-vc14_0        conda-forge [vc14]
>
> That's a fair amount of change, but is less than what I thought might
> happen (I was worried about numpy+pandas+... being updated).
> So that's one data point. What's your take?
>
> I will try the same thing on my Mac and Linux boxes tomorrow if no one
> else has done it by then.
>
> -greg
Re: [Rdkit-discuss] Chembience
Hello,

I have put together another Chembience release (0.2.3): update of RDKit to version 2018.03.4, Postgres to version 10.5, and Django to 2.1

https://github.com/chembience/chembience

https://twitter.com/markussitzmann/status/1035629283736264704

Best,
Markus

On Sun, Jun 10, 2018 at 4:41 PM Markus Sitzmann wrote:

> Hello,
>
> I have just released Chembience 0.2.1: it updates RDKit to version
> 2018.03.2 and switches Postgres from the 9.x series to version 10.4
>
> https://github.com/chembience/chembience
>
> Best,
> Markus
>
> On Mon, May 14, 2018 at 1:49 AM Markus Sitzmann wrote:
>
>> Hello,
>>
>> I have released Chembience 0.2.0: it includes an update to RDKit 2018.03
>> and also provides Jupyter as new base App container type.
>>
>> https://github.com/chembience/chembience
>>
>> (so, assuming you have Docker and docker-compose installed on your
>> computer, you are a few, easy commands away from your personal Jupyter
>> notebook server with all RDKit 2018.03 goodness readily available).
>>
>> Best,
>> Markus
>>
>> On Tue, Apr 24, 2018 at 10:44 AM Markus Sitzmann <markus.sitzm...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> since it includes RDKit as one of its major components I am happy to
>>> announce the first release of my new open-source project Chembience:
>>>
>>> A Docker-based, cloudable platform for the development of
>>> chemoinformatics-centric web applications and microservices.
>>>
>>> https://github.com/chembience/chembience
>>>
>>> (unfortunately it is still on RDKit 2017.09_3, I failed releasing it
>>> before 2018.03 :-) ).
>>>
>>> Best,
>>> Markus
Re: [Rdkit-discuss] enumeration of smiles question
O tempora, o mores! Didn't we try for ages to make our SMILES canonical, and now, all of a sudden, the opposite is hip :-)

On Mon, Aug 6, 2018 at 1:38 PM Chris Earnshaw wrote:

> Hi
>
> The question 'what do you mean by ALL?' springs to mind. None of the
> discussion includes dot-disconnected SMILES, which are also perfectly valid
> representations. For example C(C1C2)C.C12 is yet another SMILES (of many
> possible) for the example structure.
>
> I've no idea whether this is of any relevance to you, but you should
> probably consider these representations and decide whether they are
> important or not.
>
> Best regards,
> Chris
>
> On 6 August 2018 at 11:27, Jan Halborg Jensen wrote:
>
>> This blogpost links to two other ones that may have done that (haven’t
>> read them carefully):
>> https://baoilleach.blogspot.com/2018/06/cheminformatics-for-deep-learners.html
>>
>> Best regards, Jan
>>
>> On 06 Aug 2018, at 11:57, Guillaume GODIN wrote:
>>
>> Dear Greg,
>>
>> Fantastic, thank you to give both explanation and solution to this
>> “simple question”, I know this is not so simple & it’s fundamental for data
>> augmentation in deep learning.
>>
>> If I may, I have another question related, do you know if someone has
>> worked on a generator of all unique smiles independently of RDKit ?
>>
>> Thanks again,
>>
>> Guillaume
>>
>> *From: *Greg Landrum
>> *Date: *Monday, 6 August 2018 at 11:40
>> *To: *Guillaume GODIN
>> *Cc: *RDKit Discuss
>> *Subject: *Re: [Rdkit-discuss] enumeration of smiles question
>>
>> On Thu, Aug 2, 2018 at 8:59 AM Guillaume GODIN <guillaume.go...@firmenich.com> wrote:
>>
>> I have a simple question about generating all possible smiles of a given
>> molecule:
>>
>> It's a simple question, but the answer is somewhat complicated.
:-) >> >> >> >> RDKit provides only 4 differents smiles for my molecule “CCC1CC1“: >> C1C(CC)C1 >> CCC1CC1 >> C1(CC)CC1 >> C(C)C1CC1 >> >> While by hand we can write those 7 smiles: >> CCC1CC1 >> C(C)C1CC1 >> C(C1CC1)C >> C1CC(CC)1 >> C1C(CC)C1 >> C1CC1CC >> C(CC)1CC1 >> >> I use this function for the enumeration: >> >> def allsmiles(smil): >> m = Chem.MolFromSmiles(smil) # Construct a molecule from a SMILES >> string. >> if m is None: >> return smil >> N = m.GetNumAtoms() >> if N==0: >> return smil >> try: >> n= np.random.randint(0,high=N) >> t= Chem.MolToSmiles(m, rootedAtAtom=n, canonical=False) >> except : >> return smil >> return t >> >> n= 50 >> SMILES = [“CCC1CC1”] >> SMILES_mult = [allsmiles(S) for S in SMILES for i in range(n)] >> >> Why we cannot generate all the 7 smiles ? >> >> >> The RDKit has rules that it uses to decide which atom to branch to when >> generating a SMILES. These are used regardless of whether you are >> generating canonical SMILES or not. >> The upshot of this is that it will never generate a SMILES where there's >> a branch before a ring closure. >> The other important factor here is that atom rank is determined by the >> index of the atom in the molecule when you aren't using canonicalization. >> So changing the atom order on input can help: >> >> In [12]: set(allsmiles('CCC1CC1') for i in range(50)) >> Out[12]: {'C(C)C1CC1', 'C1(CC)CC1', 'C1C(CC)C1', 'CCC1CC1'} >> >> In [13]: set(allsmiles('C1CC1CC') for i in range(50)) >> Out[13]: {'C(C1CC1)C', 'C1(CC)CC1', 'C1CC1CC', 'CCC1CC1'} >> >> You can do this all at once as follows: >> >> ``` >> In [20]: def allsmiles(smil): >> ...: m = Chem.MolFromSmiles(smil) # Construct a molecule from a >> SMILES string. 
>> ...: if m is None: >> ...: return smil >> ...: N = m.GetNumAtoms() >> ...: if N==0: >> ...: return smil >> ...: aids = list(range(N)) >> ...: random.shuffle(aids) >> ...: m = Chem.RenumberAtoms(m,aids) >> ...: try: >> ...: n= random.randint(0,N-1) >> ...: t= Chem.MolToSmiles(m, rootedAtAtom=n, canonical=False) >> ...: except : >> ...: return smil >> ...: return t >> ...: >> ...: >> ...: >> >> In [21]: >> >> In [21]: set(allsmiles('C1CC1CC') for i in range(50)) >> Out[21]: {'C(C)C1CC1', 'C(C1CC1)C', 'C1(CC)CC1', 'C1C(CC)C1', 'C1CC1CC', >> 'CCC1CC1'} >> ``` >> Note that I switched to using python's built in random module instead of >> using the one in numpy. >> >> -greg >> >> >> >> >> >> Thanks guys, >> >> Best regards, >> >> Guillaume >> >> *** >> DISCLAIMER >> This email and any files transmitted with it, including replies and >> forwarded copies (which may contain alterations) subsequently transmitted >> from Firmenich, are confidential and solely for the use of the intended >> recipient. The contents do not represent the opinion of Firmenich except to >> the extent that it relates to their official business. >> >>
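The renumber-then-root idea from Greg's reply can be collected into a small helper. This is only a minimal sketch, not code from the thread: `enumerate_smiles` and its `tries` and `seed` parameters are hypothetical names, and it assumes RDKit is importable. Because it samples randomly it may miss some outputs, and, as explained in the reply, it will not produce forms that RDKit never writes (such as a branch before a ring closure).

```python
import random

from rdkit import Chem


def enumerate_smiles(smiles, tries=200, seed=0):
    """Collect distinct non-canonical SMILES for one molecule by random sampling."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None or mol.GetNumAtoms() == 0:
        return set()
    rng = random.Random(seed)  # hypothetical seed parameter, for repeatable runs
    n_atoms = mol.GetNumAtoms()
    found = set()
    for _ in range(tries):
        # Shuffle the atom order: without canonicalization, atom rank follows atom index.
        order = list(range(n_atoms))
        rng.shuffle(order)
        shuffled = Chem.RenumberAtoms(mol, order)
        # Start writing the SMILES at a randomly chosen atom.
        root = rng.randrange(n_atoms)
        found.add(Chem.MolToSmiles(shuffled, rootedAtAtom=root, canonical=False))
    return found


variants = enumerate_smiles("CCC1CC1")
print(sorted(variants))
```

Every string in `variants` should parse back to the same molecule, which is an easy sanity check before using the set for data augmentation.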
Re: [Rdkit-discuss] MolFromInchi with Amides
Hi Jeff,

That is because InChI is a structure identifier, not a structure representation. The difference between the two is that a structure identifier normalizes the structure to a form it regards as the standard representation of the molecule, so that the molecule is identifiable regardless of the state in which it arrives from an input resource (and hence the same identifier is calculated). For Standard InChI, the decision was made to make it insensitive to tautomers (within the limitations of the InChI algorithm). Rather unluckily, this normalizes most amides to a form that chemists regard as the incorrect one. The second unlucky thing is that you can convert the InChI back to a structure representation, which is then of course the normalized or standardized form of the molecule.

So if you want to make sure you keep the original representation of a molecule, don’t use InChI as your representation format (calculate InChI as an identifier field next to it). If your input resource only provides InChI or Standard InChI, then you are of course out of luck.

Best,
Markus

- | Markus Sitzmann | markus.sitzm...@gmail.com

> On 14. Jun 2018, at 23:33, Jeff van Santen wrote:
>
> Hi all,
>
> I have some questions about how rdkit handles amides. For context, I am
> working with a large set of molecules, many of which contain peptides. I have
> been running into a problem with using rdkit, in that when I try to
> load a molecule from the InChI, the wrong tautomer is loaded. As a simple
> example consider acetamide:
>
> """
> FromInchi = Chem.MolFromInchi('InChI=1S/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)')
> print(rdMolDescriptors.CalcNumAmideBonds(FromInchi))
> > 0
> print(Chem.MolToSmiles(FromInchi))
> > CC(=N)O
>
> FromSmiles = Chem.MolFromSmiles('CC(=O)N')
> print(rdMolDescriptors.CalcNumAmideBonds(FromSmiles))
> > 1
> print(Chem.MolToSmiles(FromSmiles))
> > CC(N)=O
> """
>
> I realize that Standard InChI does not have a mechanism for distinguishing
> between the two tautomers, so I am wondering why rdkit considers the iminol
> to be a better representation? Also, is there any way to get the amide
> instead? (Without using MolVS)
>
> Thanks,
> Jeff
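The advice above (keep the original structure as the representation and store the InChI next to it as an identifier) could look roughly like this. It is a minimal sketch, assuming an RDKit build with InChI support; `make_record` and the dictionary keys are hypothetical names chosen for illustration.

```python
from rdkit import Chem


def make_record(smiles):
    """Keep the input structure as SMILES and attach InChI only as an identifier."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    return {
        # Representation: preserves the tautomer that was actually supplied.
        "smiles": Chem.MolToSmiles(mol),
        # Identifiers: tautomer-insensitive, used for lookup and deduplication only.
        "inchi": Chem.MolToInchi(mol),
        "inchikey": Chem.MolToInchiKey(mol),
    }


record = make_record("CC(=O)N")  # acetamide, entered as the amide
print(record)
```

With this layout, the structure is always rebuilt from the `smiles` field, never from the `inchi` field, so the amide form survives a round trip.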
Re: [Rdkit-discuss] Chembience
Hello, I have just released Chembience 0.2.1: it updates RDKit to version 2018.03.2 and switches Postgres from the 9.x series to version 10.4 https://github.com/chembience/chembience Best, Markus On Mon, May 14, 2018 at 1:49 AM Markus Sitzmann wrote: > Hello, > > I have released Chembience 0.2.0: it includes an update to RDKit 2018.03 > and also provides Jupyter as new base App container type. > > https://github.com/chembience/chembience > > (so, assuming you have Docker and docker-compose installed on your > computer, you are a few, easy commands away from your personal Jupyter > notebook server with all RDKit 2018.03 goodness readily available). > > Best, > Markus > > > On Tue, Apr 24, 2018 at 10:44 AM Markus Sitzmann < > markus.sitzm...@gmail.com> wrote: > >> Hello, >> >> since it includes RDKit as one of its major components I am happy to >> announce the first release of my new open-source project Chembience: >> >> A Docker-based, cloudable platform for the development of >> chemoinformatics-centric web applications and microservices. >> >> https://github.com/chembience/chembience >> >> (unfortunately it is still on RDKit 2017.09_3, I failed releasing it >> before 2018.03 :-) ). >> >> Best, >> Markus >> >
Re: [Rdkit-discuss] RDKit postgres cartridge building
Hi Alfredo, My first guess would be that you have another, older Postgres version on your computer and you have built against that version. Take a look at the /usr/share/postgresql directory and check whether there is another directory besides 10/. Markus - | Markus Sitzmann | markus.sitzm...@gmail.com > On 24. May 2018, at 18:24, Alfredo Quevedo wrote: > > Good morning, > > I am trying to build RDKit from source, and succeeded with that following the > instructions provided in the documentation. However, I am trying to use the > postgres cartridge, which as far as I understand is built during the main > building process. > > but after trying to create the extension for a database with: > > psql -c 'create extension rdkit' molecules > > I am getting the following error > > ERROR: could not open extension control file > "/usr/share/postgresql/10/extension/rdkit.control": No such file or directory > > It seems that the building of the cartridge is not being applied to my local > postgres installation? > > Any hint is highly appreciated, > > thanks in advance
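The check suggested above can be scripted. This is a minimal sketch using only the standard library; `find_rdkit_control` is a hypothetical helper, and the `/usr/share/postgresql/<version>/extension/` layout is taken from the error message in the question, so it is an assumption that may not hold on other distributions.

```python
from pathlib import Path


def find_rdkit_control(share_root="/usr/share/postgresql"):
    """Map each PostgreSQL version directory to whether it holds rdkit.control."""
    root = Path(share_root)
    if not root.is_dir():
        return {}
    result = {}
    for version_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        control = version_dir / "extension" / "rdkit.control"
        result[version_dir.name] = control.is_file()
    return result


if __name__ == "__main__":
    for version, present in find_rdkit_control().items():
        print(version, "rdkit.control found" if present else "no rdkit.control")
```

If the control file shows up under a version directory other than the one the running server uses, the cartridge was most likely built against that other installation.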
Re: [Rdkit-discuss] convert a smiles file to a xyz file
In reminiscence of old times, you can do this with the Chemical Identifier Resolver, for instance with the SMILES string for ethanol, CCO: https://cactus.nci.nih.gov/chemical/structure/CCO/file?format=xyz On Wed, May 23, 2018 at 5:24 PM Chenyang Shi wrote: > Hi Everyone, > > I am seeking help with how to convert a SMILES file to a series of > coordinates for the molecule, in the format of xyz. > I saw some online services that can do the job (e.g. > http://www.cheminfo.org/Chemistry/Cheminformatics/FormatConverter/index.html), > but they are not convenient to use. > > I am wondering how we can do this by writing RDKit code. A separate > question: is the molecular structure converted from SMILES the same > as that taken from a crystal structure? > > Many thanks! > Chenyang
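For more than one molecule, the resolver URL shown above can be built programmatically. This is a minimal sketch using only the standard library; `cir_xyz_url` and `fetch_xyz` are hypothetical helper names, and the URL pattern is simply the one from the ethanol example. Percent-encoding matters because SMILES characters such as `#` and `/` have their own meaning inside a URL.

```python
from urllib.parse import quote
from urllib.request import urlopen

CIR_BASE = "https://cactus.nci.nih.gov/chemical/structure"


def cir_xyz_url(smiles):
    """Build the resolver URL that returns XYZ coordinates for a SMILES string."""
    # safe="" also escapes "/" so that stereo bonds do not split the URL path.
    return "%s/%s/file?format=xyz" % (CIR_BASE, quote(smiles, safe=""))


def fetch_xyz(smiles, timeout=30):
    """Download the XYZ text (needs network access; not called on import)."""
    with urlopen(cir_xyz_url(smiles), timeout=timeout) as response:
        return response.read().decode("utf-8")


print(cir_xyz_url("CCO"))
```

Note that the returned coordinates are a computed 3D model, so they should not be expected to match a crystal structure.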
Re: [Rdkit-discuss] Chembience
Hello, I have released Chembience 0.2.0: it includes an update to RDKit 2018.03 and also provides Jupyter as new base App container type. https://github.com/chembience/chembience (so, assuming you have Docker and docker-compose installed on your computer, you are a few, easy commands away from your personal Jupyter notebook server with all RDKit 2018.03 goodness readily available). Best, Markus On Tue, Apr 24, 2018 at 10:44 AM Markus Sitzmann wrote: > Hello, > > since it includes RDKit as one of its major components I am happy to > announce the first release of my new open-source project Chembience: > > A Docker-based, cloudable platform for the development of > chemoinformatics-centric web applications and microservices. > > https://github.com/chembience/chembience > > (unfortunately it is still on RDKit 2017.09_3, I failed releasing it > before 2018.03 :-) ). > > Best, > Markus >
Re: [Rdkit-discuss] RDKit-MolVS intergration: Google Summer of Code Project
Yes, great news. Matt has really started some very nice work there. I hope it can be turned into something like a well-documented, open standard for molecule standardization. Markus - | Markus Sitzmann | markus.sitzm...@gmail.com > On 26. Apr 2018, at 00:29, Paul Czodrowski > wrote: > > Susan, great news, looking forward to this project, enjoy GSoC! Paul > > From: Susan Leung [mailto:susan.le...@st-hildas.ox.ac.uk] > Sent: Wednesday, 25 April 2018 23:35 > To: rdkit-discuss@lists.sourceforge.net > Subject: [Rdkit-discuss] RDKit-MolVS intergration: Google Summer of Code > Project > > Hi all, > > I am really excited and happy to let you know that I will be working with > Greg on a RDKit-MolVS integration project as part of the Open Chemistry > Google Summer of Code. > > I have followed and used the RDKit mailing list since the start of my PhD and > have used both RDKit and MolVS in my workflow so I'm very excited to have the > opportunity to contribute to the code base. > > In this project we aim to expand the current capabilities of MolVS and > integrate it into RDKit so hopefully by the end of it, you will see > improvements in the molecular standardisation tools available in RDKit. > > Best wishes, > > Susan
[Rdkit-discuss] Chembience
Hello, since it includes RDKit as one of its major components I am happy to announce the first release of my new open-source project Chembience: A Docker-based, cloudable platform for the development of chemoinformatics-centric web applications and microservices. https://github.com/chembience/chembience (unfortunately it is still on RDKit 2017.09_3, I failed releasing it before 2018.03 :-) ). Best, Markus
Re: [Rdkit-discuss] Some larger-scale RDKit C++ code changes
Yes, looks good :-). And the good thing with git is that, if you are very uncertain about the outcome, you can always make a test run by copying the whole directory, testing everything with the copy, and, if it goes horribly wrong, just deleting the copy. Markus On Thu, Apr 5, 2018 at 8:46 AM, Greg Landrum wrote: > Thanks for raising this Markus. It had been on my list of things to look > into for a while and I had been kind of dreading it.[1] > > I did a bit of googling and experimentation and it looks like this > approach works well: > https://stackoverflow.com/questions/5956300/merging-two-very-divergent-branches-using-git > Given that it also (at least to me) makes sense, I think that this is how > I'll proceed. > > -greg > [1] this is where I usually point to this xkcd: https://xkcd.com/1597/ > and make a joke about no longer being able to just walk over and ask Nadine > how to solve the problem. :-) > > On Wed, Apr 4, 2018 at 1:20 PM, Markus Sitzmann > wrote: > >> Have you tried a merge (after branching the master to something like >> master-test-merge and then merging modern_cxx)? How horrible does it look? >> It might be quite okay. Or do you really have a lot of changes in the >> current master you don't have/want to have in modern_cxx and the future >> master? And well, it just was a concern of mine that avoiding "early" horrors >> might cause bigger horrors later :-). Renaming the master in a Git >> repository is something I wouldn't do easily - I would regard it more as >> a very, very last resort, because if the master is renamed (or replaced by >> another branch), any branch in any remote repository by anybody who ever >> branched from master (including the RDKit github repository) becomes >> potentially (very likely) invalid by this step. Only if this is a small >> concern would I do it (I doubt it is in the case of RDKit). >> >> Markus >> >> On Wed, Apr 4, 2018 at 11:56 AM, Greg Landrum >> wrote: >> >>> >>> >>> On Wed, Apr 4, 2018 at 11:27 AM, Markus Sitzmann < >>> markus.sitzm...@gmail.com> wrote: >>> >>>> Hi Greg, >>>> >>>> > Concretely what this means in github is that the current master >>>> branch will be renamed to legacy and the modern_cxx branch will be renamed >>>> to master. >>>> >>>> I hope you are not actually just renaming it - although I am not >>>> affected personally, that might be asking for trouble because it >>>> invalidates any remote repository of rdkit. >>>> >>> >>> If you have suggestions for how to do a large-delta change like that in >>> a non-horrible manner, I would love to hear them :-) >>> >>> -greg >>> >>> >> >> >
Re: [Rdkit-discuss] Some larger-scale RDKit C++ code changes
Have you tried a merge (after branching the master to something like master-test-merge and then merging modern_cxx)? How horrible does it look? It might be quite okay. Or do you really have a lot of changes in the current master you don't have/want to have in modern_cxx and the future master? And well, it just was a concern of mine that avoiding "early" horrors might cause bigger horrors later :-). Renaming the master in a Git repository is something I wouldn't do easily - I would regard it more as a very, very last resort, because if the master is renamed (or replaced by another branch), any branch in any remote repository by anybody who ever branched from master (including the RDKit github repository) becomes potentially (very likely) invalid by this step. Only if this is a small concern would I do it (I doubt it is in the case of RDKit). Markus On Wed, Apr 4, 2018 at 11:56 AM, Greg Landrum wrote: > > > On Wed, Apr 4, 2018 at 11:27 AM, Markus Sitzmann < > markus.sitzm...@gmail.com> wrote: > >> Hi Greg, >> >> > Concretely what this means in github is that the current master >> branch will be renamed to legacy and the modern_cxx branch will be renamed >> to master. >> >> I hope you are not actually just renaming it - although I am not affected >> personally, that might be asking for trouble because it invalidates any >> remote repository of rdkit. >> > > If you have suggestions for how to do a large-delta change like that in a > non-horrible manner, I would love to hear them :-) > > -greg > >
Re: [Rdkit-discuss] Some larger-scale RDKit C++ code changes
Hi Greg, > Concretely what this means in github is that the current master branch will be renamed to legacy and the modern_cxx branch will be renamed to master. I hope you are not actually just renaming it - although I am not affected personally, that might be asking for trouble because it invalidates any remote repository of rdkit. Markus On Wed, Apr 4, 2018 at 5:23 AM, Greg Landrum wrote: > > NOTE: If you don't work with the RDKit at the C++ level or build the code > yourself from source, you probably don't need to read this email. > > TL;DR: When we do the beta for the 2018.03.1 release we're going to switch > the C++ backend to use modern C++ (=C++11). For people who can't switch to > use that code, we will continue to provide bug fixes for the 2017.09 > release for at least another 6 months. > > -- > # What's happening? > > As part of the upcoming 2018.03 release, we will start using modern C++ > for the RDKit - this means C++11 at the moment, the goal is that you should > be able to build the code with g++ v4.8. I've been talking about this for a > while, blogged about it (https://medium.com/@greg.landrum_t5/the-rdkit-and-modern-c-48206b966218), > and posted to the rdkit-devel list > (https://sourceforge.net/p/rdkit/mailman/message/35811216/), now it's finally happening. > > Concretely what this means in github is that the current master branch > will be renamed to legacy and the modern_cxx branch will be renamed to > master. > > # Who does this affect? > > This should only affect people who need to build the RDKit C++ code > themselves. If you use a binary version of the RDKit like the ones > available inside of Anaconda Python or KNIME, this change should have no > impact upon you. > > # What about people who can't use up-to-date compilers? > > We realize that some people on older operating systems will not be able to > switch to start using a compiler that supports C++11. In order to continue > to support this subset of developers, we will continue to apply bug fixes > to the current Release_2017_09 branch and do occasional patch releases. > Since this is intended for people who need to build the code themselves > anyway, we won't do builds of these releases any more. > > We will keep doing these patch releases at least until the 2018.09 release. > Whether or not we continue past that date will depend on demand, so if you > are using these releases please let us know. > > # Why are you doing this? > > There's a long, rambling answer to this, but I'm not going to give it > here. :-) > The simplest explanation is that we think that the core of the RDKit > should be using a modern and (reasonably) up-to-date version of the > language that it's written in. The developer experience is better and, > happily, the code ends up being faster.
Re: [Rdkit-discuss] conda build instructions for OSX?
I am not 100% sure what your motivation is to build a certain rev or the master branch, but let me guess: you want to be independent of further changes in the development branch. Well, my suggestion for this: fork the conda-rdkit repo on github and use your fork for future builds, i.e. the repo is stable until you decide to merge future changes from the original repo. Having your own fork would also allow you to merge the development branch into the master branch of your fork if this is a requirement (although I don't see any difference between using the development or the master branch for builds).

On Tue, Jan 2, 2018 at 4:18 PM, Brian Cole wrote:
> Figured out by sleuthing around the conda-rdkit repo that the 'master'
> branch is really old. Looks like the 'development' branch is the branch
> that works. If you switch over to the 'development' branch then the 'conda
> build boost && conda build rdkit' works.
>
> Now the next trick I'm still stuck on is how to build RDKit's master
> branch using conda. Changing `git_rev` in rdkit/meta.yaml didn't have the
> desired effect.
>
> -Brian
>
> On Wed, Dec 27, 2017 at 5:08 PM, Brian Cole wrote:
>
>> Trying to 'conda build rdkit' as described in the
>> https://github.com/rdkit/conda-rdkit README to no success. Are there any
>> OSX 'conda build' instructions tucked away somewhere?
>>
>> It's currently failing on the cairo dependency:
>>
>> -- Checking for one of the modules 'cairo'
>> CMake Error at /Users/coleb/anaconda2/conda-bld/rdkit_1514412408124/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/share/cmake-3.9/Modules/FindPkgConfig.cmake:640 (message):
>>   None of the required 'cairo' found
>> Call Stack (most recent call first):
>>   Code/cmake/Modules/FindCairo.cmake:23 (PKG_SEARCH_MODULE)
>>   Code/GraphMol/MolDraw2D/CMakeLists.txt:31 (find_package)
>>
>> CMake Error at Code/cmake/Modules/FindCairo.cmake:38 (MESSAGE):
>>   Could not find Cairo
>> Call Stack (most recent call first):
>>   Code/GraphMol/MolDraw2D/CMakeLists.txt:31 (find_package)
>>
>> -- Boost version: 1.56.0
>> -- Found the following Boost libraries:
>> --   regex
>> CMake Error: The following variables are used in this project, but they
>> are set to NOTFOUND.
>> Please set them or make sure they are set and tested correctly in the
>> CMake files:
>> CAIRO_INCLUDE_DIRS (ADVANCED)
>>   used as include directory in directory /Users/coleb/anaconda2/conda-bld/rdkit_1514412408124/work/Code/GraphMol/MolDraw2D
>>   used as include directory in directory /Users/coleb/anaconda2/conda-bld/rdkit_1514412408124/work/Code/GraphMol/MolDraw2D/Wrap
Re: [Rdkit-discuss] Issue with the latest RDKit DB build
I have the problem, too, on Debian stretch - | Markus Sitzmann | markus.sitzm...@gmail.com > On 29. Dec 2017, at 20:01, Drew Gibson via Rdkit-discuss > wrote: > > Hello, and compliments of the season to you, RDKitters :) > > I'm having trouble getting the conda build of the DB package > (rdkit-postgresql95) working. > > The issue I'm having occurs when trying to initialise the rdkit DB extension > on a newly created DB, eg... > > createdb emolecules > psql -c 'create extension rdkit' emolecules which will give me the > error... > > psql: error while loading shared libraries: libncursesw.so.6: cannot open > shared object file: No such file or directory > > I am getting the same error on both Ubuntu 16.04 LTS and in CentOS7 (latest, > running in VirtualBox for now). > > I have successfully installed and used previous versions of the DB > (rdkit-postgresql), and currently have this working on Ubuntu. > > Any suggestions to getting the newest version working ? > > Cheers, Drew
Re: [Rdkit-discuss] Docker with (latest) rdkit+jupyter
Hi JP,

From the Docker log you posted it is obvious that the build starts from the latest miniconda version, which then uses Python 3.6 as the default; however, one of the Python packages still relies on Python 3.5. One thing you can try is to tell the conda install command in the Docker script to go back to Python 3.5, or to create a Python 3.5 based environment. Unfortunately I just don’t remember off the top of my head which option you have to use for this, but you will find it in the conda documentation. And as much as I like the idea of conda, it is unfortunately one of the biggest troublemakers in my personal projects. Another point: if you look for one of the recent posts from Greg here on the list, there is another problem with the latest conda version that you might run into.

Markus

- | Markus Sitzmann | markus.sitzm...@gmail.com

> On 21. Nov 2017, at 16:53, Tim Dudgeon wrote:
>
> I've got some dockerfiles that might be worth a look.
> https://github.com/InformaticsMatters/docker_jupyter
>
> Not sure if they will help.
>
> Tim
>
>> On 21/11/2017 15:25, JP wrote:
>> Yo RDKitters,
>>
>> I am running a CADD workshop for a group of MSc students and would like to
>> show them some RDKit awesomeness.
>>
>> I thought the best way to do this is to use an rdkit enabled docker image +
>> jupyter notebooks (they are comfortable with python).
>>
>> In preparation, I tried building the docker image from the docker file at
>> https://github.com/rdkit/rdkit_containers/tree/master/docker/run_conda3 but
>> this fails on Ubuntu 16.04.3 LTS with the following error:
>>
>> $ docker build -t run_rdkit_conda https://raw.githubusercontent.com/rdkit/rdkit_containers/master/docker/run_conda3/Dockerfile
>> Downloading build context from remote url: https://raw.githubusercontent.com/rdkit/rdkit_containers/master/docker/run_conda3/Dockerfile 357B
>> Sending build context to Docker daemon 2.048kB
>> Step 1/7 : FROM continuumio/miniconda3
>> latest: Pulling from continuumio/miniconda3
>> 85b1f47fba49: Pull complete
>> 6b3cb0c49789: Pull complete
>> fecb432dacf0: Pull complete
>> f461f7e3890d: Pull complete
>> Digest: sha256:604cda0c0be5d40cc26db31912d8b1b7276840a56544b846abef441b32d987fc
>> Status: Downloaded newer image for continuumio/miniconda3:latest
>> ---> f700f7f570c7
>> Step 2/7 : MAINTAINER Greg Landrum
>> ---> Running in ad6a648c18ba
>> ---> 18e6d6093d5b
>> Removing intermediate container ad6a648c18ba
>> Step 3/7 : ENV PATH /opt/conda/bin:$PATH
>> ---> Running in e21cf8e5332f
>> ---> ddef65292068
>> Removing intermediate container e21cf8e5332f
>> Step 4/7 : ENV LANG C
>> ---> Running in efa12ef17f37
>> ---> 137d7e20350d
>> Removing intermediate container efa12ef17f37
>> Step 5/7 : RUN conda config --add channels https://conda.anaconda.org/rdkit
>> ---> Running in 79566bf4b6e9
>> ---> 032965875391
>> Removing intermediate container 79566bf4b6e9
>> Step 6/7 : RUN conda install -y nomkl rdkit pandas cairo cairocffi jupyter
>> ---> Running in c5aa6417a63a
>> Fetching package metadata .
>> Solving package specifications: .
>>
>> UnsatisfiableError: The following specifications were found to be in
>> conflict:
>> - cairocffi -> python 3.5* -> xz 5.0.5
>> - python 3.6*
>> Use "conda info " to see the dependencies for each package.
>>
>> The command '/bin/sh -c conda install -y nomkl rdkit pandas cairo cairocffi
>> jupyter' returned a non-zero code: 1
>>
>> Any ideas?
>> JP
Re: [Rdkit-discuss] ImportError: No module named rdkit
Well, if you already have python 2.7 and 3.5 running, you can use (mini)conda for the RDKit installation (conda is like anaconda, but instead of one huge package you install only the packages you want, including RDKit). On Fri, Sep 15, 2017 at 9:12 AM, Loris Bennett wrote: > Hi Greg, > > Greg Landrum writes: > > > I'll provide a more detailed answer in a bit, but since you aren't > > using the system python anyway, is there any chance that you could > > switch to anaconda python on your machines? Anaconda is a great python > > distribution for scientific applications and it makes many things > > (including system administration) a ton easier. > > Anaconda might be a possibility. On the other hand we already have 3 > versions of Python in use: 2.6 from the OS, and 2.7 and 3.5 from the > Software Collections. In addition, the current cluster is nearing its > end-of-life, probably before the end of the year and so I am somewhat > loathe to install yet another one and add to my can of worms (or pit of > snakes). > > However, now I have a slight handle on the problem and know that there > is a responsive and helpful mailing list to back me up, I'm happy to > invest a little more time in trying another source build. > > Cheers, > > Loris > > -- > Dr. Loris Bennett (Mr.) > ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de
Re: [Rdkit-discuss] ImportError: No module named rdkit
BTW, python 3.6 is out since last Christmas ;-) (and made it to sub-release .2) On Fri, Sep 15, 2017 at 8:36 AM, Greg Landrum wrote: > I'll provide a more detailed answer in a bit, but since you aren't using > the system python anyway, is there any chance that you could switch to > anaconda python on your machines? Anaconda is a great python distribution > for scientific applications and it makes many things (including system > administration) a ton easier. > > -greg > > > On Fri, Sep 15, 2017 at 8:19 AM, Loris Bennett > wrote: > >> Hi Greg, >> >> Greg Landrum writes: >> >> > Hi Loris, >> > >> > On Thu, Sep 14, 2017 at 2:25 PM, Loris Bennett < >> loris.benn...@fu-berlin.de> wrote: >> > >> > I am trying to install RDKit on a university cluster running Linux from >> > source. The build seem to go OK and 'make install' copied the >> > directories >> > >> > lib >> > rdkit >> > >> > to the NFS share where the software should reside. I then do >> > >> > export RDBASE=/cm/shared/apps/rdkit/rdkit_2017_03_3 >> > export PYTHONPATH=$PYTHONPATH:$RDBASE >> > export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$RDBASE/lib >> > >> > However when I then run Python (2.6.6) and try >> > >> > Just to do some expectation management: python 2.6 is pretty ancient >> > and there's no guarantee that all of the RDKit code will work with >> > it. Python 2.7 is the minimum version that we "officially" >> > support. It's a very good idea to update. >> >> OK. I didn't notice that 2.6 was deprecated - maybe this could be >> explicitly mentioned in the install instructions. I'm running the >> RedHat clone Scientific Linux 6, so everything in this thread on >> RH/Python applies. So I can use either Python 2.7 or Python 3.5. 
I can >> ask the users what they prefer - although, as you seem know my users >> here in Berlin, maybe you know too ;-) >> >> > import rdkit >> > >> > I get >> > >> > ImportError: No module named rdkit >> > >> > I am not a Python person and my naive expectation was that there should >> > be a file called >> > >> > rdkit.py >> > >> > Based on the info provided so far, there should be a directory called >> > rdkit in the directory: /cm/shared/apps/rdkit/rdkit_2017_03_3 >> >> This directory exists. >> >> > That directory should contain a number of sub dirs, other files, and a >> > file called __init__.py (this is the one that tells Python that it can >> > import the directory as a package). What do you see there? >> >> The directory just contains >> >> lib >> rdkit >> >> an nothing else, in particular, no __init__.py. I have plenty of >> __init__.pys in the build directory, so I assume I must have done some >> thing wrong when running cmake and/or make install. >> >> I must admit that I found the installation instructions somewhat unclear >> on that point. I would find it clearer if things were couched in terms >> of 'source' and 'destination'. For me, as a make-guy rather than a >> cmake-guy, it would also be helpful if it were made clearer at which >> point the destination directory should be specified. I ended up with >> RDKit being installed under a very long path with included both my >> intended path and the original build path, so I had to move things >> around and may have goofed up at that point. >> >> > which has to be on my PYTHONPATH. However, since the unpacked sources >> > together with the build don't seem to contain such a file, either >> > something is broken or the rdkit module should be found by some other >> > mechanism. >> > >> > Again, based on the info above, I would expect that you want "make >> > install" to copy the "rdkit" and "lib" directories (as well as a >> > couple others) to /cm/shared/apps/rdkit/rdkit_2017_03_3. 
Once we >> > figure out what actually happened I can maybe help you figure out how >> > to fix it. >> >> This is what I did: >> >> module add boost # this just sets the boost stuff up >> >> export VERSION=2017_03_3 >> export RDBASE=/home/BUILD/rdkit/rdkit-rdkit-Release_${VERSION} >> export LD_LIBRARY_PATH=${RDBASE}:${LD_LIBRARY_PATH} >> export DESTDIR=/cm/shared/apps/rdkit/${VERSION} >> >> and then probably >> >> cmake -DCMAKE_INSTALL_PREFIX=/cm/shared/apps/rdkit/${VERSION} >> >> so I may have over-egged my install-path-cake. I started all the >> fiddling with DESTDIR and CMAKE_INSTALL_PREFIX, because my initial >> attempt resulted in the destination directory being the same as the >> build directory, which didn't work so well. >> >> Thanks for the help - I'll have another go Python 3.5 and try to keep my >> eye on __init__.py. >> >> Cheers, >> >> Loris >> >> -- >> Dr. Loris Bennett (Mr.) >> ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de >> > > > > -- > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > ___ > Rdkit-disc
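Greg's check above (is there an rdkit directory under the install prefix, and does it contain an __init__.py?) can be scripted. The following is only a minimal sketch using the Python standard library; the helper name find_package is made up for illustration and is not part of RDKit, and the RDBASE fallback is a placeholder.

```python
import os
import sys


def find_package(name, search_path=None):
    """Return (directory, has_init) for the first path entry that holds `name`.

    `find_package` is a hypothetical helper, not an RDKit function.
    """
    entries = search_path if search_path is not None else sys.path
    for base in entries:
        # An empty entry on sys.path means the current directory.
        candidate = os.path.join(base or ".", name)
        if os.path.isdir(candidate):
            has_init = os.path.isfile(os.path.join(candidate, "__init__.py"))
            return candidate, has_init
    return None, False


if __name__ == "__main__":
    # RDBASE as exported in the thread; "." is only a fallback for the sketch.
    rdbase = os.environ.get("RDBASE", ".")
    print(find_package("rdkit", [rdbase]))
```

A result of (None, False) means no rdkit directory was found under the given path at all; a directory with has_init False suggests that the install step did not copy the Python package itself.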
Re: [Rdkit-discuss] Non-redundant database of molecules
t;>> Type "help", "copyright", "credits" or "license" for more information. >>>>> Anaconda is brought to you by Continuum Analytics. >>>>> Please check out: http://continuum.io/thanks and https://anaconda.org >>>>> >>> import rdkit >>>>> >>> from rdkit import Chem >>>>> Traceback (most recent call last): >>>>> File "", line 1, in >>>>> File "/opt/rdkit-Release_2016_03_1/rdkit/Chem/__init__.py", line >>>>> 18, in >>>>> from rdkit import rdBase >>>>> ImportError: cannot import name rdBase >>>>> >>>>> >>>>> -- >>>>> Wandré Nunes de Pinho Veloso >>>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG >>>>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - >>>>> UFMG >>>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e >>>>> Inteligência Computacional - UNIFEI >>>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ >>>>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG >>>>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG >>>>> >>>>> 2017-09-14 9:17 GMT-03:00 Malitha Kabir : >>>>> >>>>>> Hi Wandré, >>>>>> >>>>>> Good day! It's malitha. >>>>>> >>>>>> Considering your first question I would say, the path variable NOT >>>>>> set correctly. To avoid having gymnastic with linux system you may >>>>>> consider >>>>>> the following steps: >>>>>> >>>>>>1. Install miniconda or andcona from >>>>>>https://conda.io/miniconda.html <https://conda.io/miniconda.html> >>>>>>and command yes (y) when it says to add path variable to python >>>>>> shipped >>>>>>with conda. I mean python within conda would be your default python. >>>>>> After >>>>>>installing it, when you run the command <<<<>>>>> from shell >>>>>> you >>>>>>will see something like <<>> at the screen >>>>>>2. Install rdkit from https://anaconda.org/rdkit/rdkit on top of >>>>>>conda >>>>>> >>>>>> >>>>>> For question regarding energy minimization, you may find the >>>>>> following link helpful. 
>>>>>> https://sourceforge.net/p/rdkit/mailman/message/28298074/ >>>>>> >>>>>> I hope, it helps! >>>>>> >>>>>> - malitha >>>>>> >>>>>> On Thu, Sep 14, 2017 at 4:22 PM, Wandré >>>>>> wrote: >>>>>> >>>>>>> So, >>>>>>> 1) I run all the commands in tutorial of installation of RDKit in >>>>>>> Conda (https://github.com/rdkit/conda-rdkit), but, when I run >>>>>>> python and try to import Chem ("from rdkit import Chem") appears an >>>>>>> error >>>>>>> message: >>>>>>> Traceback (most recent call last): >>>>>>> File "", line 1, in >>>>>>> File "/opt/rdkit-Release_2016_03_1/rdkit/Chem/__init__.py", line >>>>>>> 18, in >>>>>>> from rdkit import rdBase >>>>>>> ImportError: cannot import name rdBase >>>>>>> >>>>>>> 2) Thanks for all the references >>>>>>> >>>>>>> 3) Which function generate this "energy minimized molecule"? >>>>>>> >>>>>>> -- >>>>>>> Wandré Nunes de Pinho Veloso >>>>>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG >>>>>>> Doutorando em Bioinformática - Universidade Federal de Minas >>>>>>> Gerais - UFMG >>>>>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e >>>>>>> Inteligência Computacional - UNIFEI >>>>>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ >>>>>>> Membro do Grupo de Pesquisa Bioinformática Estrutu
Re: [Rdkit-discuss] ImportError: No module named rdkit
Not on Centos 6 - Docker requires Centos 7 for the host system. On Thu, Sep 14, 2017 at 10:01 PM, Dimitri Maziuk wrote: > On 09/14/2017 02:58 PM, Andrew Dalke wrote: > > > If only Greg got as much money for long term RDKit support as Red Hat > > gets for long term RHEL support. :) > > Yep. But an rdkit docker container might be feasible. > > -- > Dimitri Maziuk > Programmer/sysadmin > BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
Re: [Rdkit-discuss] bad inchi or parsing problem?
On Thu, Sep 14, 2017 at 8:09 PM, Jason Biggs wrote: > Okay, all three of these smiles strings resolve to the same inchi, > > "O=[N+](C1=NC2=CC=CC=C2N=C1)[N-](=O)C1=NC2=CC=CC=C2N=C1" > "C1=CC=C2C(=C1)N=CC(=N2)N(=N(=O)C3=NC4=CC=CC=C4N=C3)=O" > "[O-][N+](c1cnc2c2n1)=[N+]([O-])c3cnc4c4n3" > > even though to me they seem like different structures due to the specified > charges. Is this a limitation of inchi, or do I need to rethink my ideas > of what makes two chemical structures the same? Well, at least the first two I would regard as erroneous or unlikely (not stable) creatures, and that is exactly what John meant by saying InChI is an identifier, not a representation. InChI's main purpose (particularly that of Standard InChI) is to identify them as the same (corrected, normalized) molecule, not as three separate species (that would be the purpose of a representation). Of course, in many cases there might be a discussion about where sensible correction/normalization should end and separation of structures should start, but that is a long topic.
Re: [Rdkit-discuss] bad inchi or parsing problem?
On Thu, Sep 14, 2017 at 7:38 PM, John Mayfield wrote: > InChI is an identifier and not a representation, you should not read > InChIs... but we are beyond hope there so... > Wonderfully said - unfortunately one day they decided to make InChIs "readable" ... > The InChI string is correct and is the same if you roundtrip your > preferred one with charge separated bonds and the 5 valent one. > > All toolkits will use the InChI library to read/write InChIs and it > generates the representation with 5v nitrogens, cactus is either applying > normalisation after reading or in this case (since it's the name resolved) > doing a identifier lookup from an original SMILES used to generate this > InChI: > No, my "good old" cactus service doesn't do a lookup in this case; it is read from the string, which is of course in opposition to what I just said :-). We did quite a bit regarding normalization: first, the CACTVS toolkit behind the service is quite good in this regard, and I added a few things for the web service, too. Markus
Re: [Rdkit-discuss] Non-redundant database of molecules (Wandr?)
If you do nothing else (on purpose), SMILES *calculated* by RDKit from any input are canonical per se (BUT that is only true if you compare them to other SMILES also calculated by RDKit; you cannot compare SMILES between software packages, even if they are canonical within the domain of each software package). On Wed, Sep 13, 2017 at 9:16 PM, Wandré wrote: > Why don't use the InChI function on RDKit? > Canonical SMILES cannot be generated by RDKit, correct? > > -- > Wandré Nunes de Pinho Veloso > Professor Assistente - Unifei - Campus Avançado de Itabira-MG > Doutorando em Bioinformática - Universidade Federal de Minas Gerais - UFMG > Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e > Inteligência Computacional - UNIFEI > Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ > Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG > Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG > > 2017-09-13 15:57 GMT-03:00 Chris Swain : > >> Hi, >> >> I’d use a text based version of the structure InChiKey or canonical >> SMILES it then becomes a easy task to do the comparison in Python >> >> I wrote a script to do this in Vortex but it should be easy to modify. >> https://www.macinchem.org/reviews/vortex/tut28/scripting_vortex28.php >> >> >> Cheers >> >> Chris >> >> >> >> Today's Topics: >> >> 1. Non-redundant database of molecules (Wandré) >> >> >> -- >> >> Message: 1 >> Date: Wed, 13 Sep 2017 07:13:56 -0300 >> From: Wandré >> To: rdkit-discuss@lists.sourceforge.net >> Subject: [Rdkit-discuss] Non-redundant database of molecules >> Message-ID: >> >> Content-Type: text/plain; charset="utf-8" >> >> Hi, >> >> My name is Wandré and I'm from Brazil. >> I'm trying to do a big database of molecules, but, I want to eliminate all >> the redundant molecules before insert them in database. >> I want to know what is the best method to identify one molecule in RDKit. 
>> Is SMILES ("Chem.MolToSmiles(mol,isomericSmiles=True)") or I will need to >> compare all molecules, one by one, before insert them in database (using >> Tanimoto)? >> This can be hard to do because my database will have lot of millions of >> molecules, so, compare one by one before insert is the only answer? >> Compare if the SMILES as already inserted is easy (text compare), but, >> compare fingerprint of molecule... >> >> If I really need to compare the fingerprint of molecule, how to store this >> data in PostgreSQL without use cartridge? I will generate the fingerprint >> (Atompair, for example) and store this fingerprint in database and compare >> all the fingerprints, one by one, before insert a new molecule. This >> fingerprint (Atompair) have lot of features, so, store this in relational >> database is expensive. >> It is possible? >> >> Thanks! >> >> -- >> Wandré Nunes de Pinho Veloso >> Professor Assistente - Unifei - Campus Avançado de Itabira-MG >> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - UFMG >> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e >> Inteligência Computacional - UNIFEI >> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ >> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG >> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG >> -- next part -- >> An HTML attachment was scrubbed... >> >> -- >> >> >> -- >> Check out the vibrant tech community on one of the world's most >> engaging tech sites, Slashdot.org! http://sdm.link/slashdot >> >> -- >> >> Subject: Digest Footer >> >> ___ >> Rdkit-discuss mailing list >> Rdkit-discuss@lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >> >> >> -- >> >> End of Rdkit-discuss Digest, Vol 119, Issue 20 >> ** >> >> >> >> >> -- >> Check out the vibrant tech community on one of the world's most >> engaging tech sites, Slashdot.org! 
http://sdm.link/slashdot >> ___ >> Rdkit-discuss mailing list >> Rdkit-discuss@lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >> >> > > > -- > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > ___ > Rdkit-discuss mailing list > Rdkit-discuss@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss > > -- Check out the vibrant tech community on one
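The text-based comparison suggested in this thread (InChIKey or RDKit-calculated canonical SMILES as the identity of a molecule) could be sketched roughly as follows. This is an illustration under assumptions, not a recommended implementation: the function names rdkit_key and deduplicate are made up here, input is assumed to be SMILES strings, and RDKit is imported lazily so that the pure-Python part runs without it.

```python
def rdkit_key(smiles):
    """Text key for one structure; needs RDKit, which is imported lazily.

    Hypothetical helper: uses the InChIKey when the RDKit build has InChI
    support, otherwise falls back to RDKit's canonical SMILES.
    """
    from rdkit import Chem
    from rdkit.Chem import inchi

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None  # input could not be parsed
    if inchi.INCHI_AVAILABLE:
        return inchi.InchiToInchiKey(inchi.MolToInchi(mol))
    return Chem.MolToSmiles(mol, isomericSmiles=True)


def deduplicate(records, key_func):
    """Keep the first record per key; records whose key is None are skipped."""
    seen = {}
    for record in records:
        key = key_func(record)
        if key is not None and key not in seen:
            seen[key] = record
    return seen
```

With RDKit installed, deduplicate(list_of_smiles, rdkit_key) returns one entry per key. In a relational database the same idea would amount to a unique column holding the key, so the comparison stays a plain text lookup rather than a pairwise fingerprint search. As noted above, keys are only comparable when they were all calculated by the same software.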
Re: [Rdkit-discuss] Non-redundant database of molecules
PS. The conda version has InChI support On Wed, Sep 13, 2017 at 10:04 PM, Markus Sitzmann wrote: > Strong recommendation: use the conda version: > > http://www.rdkit.org/docs/Install.html > > On Wed, Sep 13, 2017 at 9:58 PM, Wandré wrote: > >> I just run sudo apt-get install python-rdkit librdkit1 rdkit-data 😁 >> I'm trying to solve this with this link: http://www.blopig.com/bl >> og/2013/02/how-to-install-rdkit-on-ubuntu-12-04/ >> >> -- >> Wandré Nunes de Pinho Veloso >> Professor Assistente - Unifei - Campus Avançado de Itabira-MG >> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - UFMG >> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e >> Inteligência Computacional - UNIFEI >> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ >> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG >> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG >> >> 2017-09-13 16:55 GMT-03:00 Markus Sitzmann : >> >>> How did you install rdkit so far? And where? Is it the conda/anaconda >>> version? >>> >>> On Wed, Sep 13, 2017 at 9:39 PM, Wandré wrote: >>> >>>> How to install RDKit with InChI? >>>> When I run Chem.inchi.INCHI_AVAILABLE, the result is False >>>> >>>> -- >>>> Wandré Nunes de Pinho Veloso >>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG >>>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - >>>> UFMG >>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e >>>> Inteligência Computacional - UNIFEI >>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ >>>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG >>>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG >>>> >>>> 2017-09-13 16:30 GMT-03:00 Wandré : >>>> >>>>> Thanks Malitha. >>>>> I choose this descriptors because I will store this on my database, >>>>> so, will be fast compare one molecule before insert them in database. 
>>>>> My worry now is if the RDKit will generate different SMILES or InChI >>>>> in same SDF molecule or equals in different molecules (molecules from RCSB >>>>> PDB, PubChem, ChemBL, for example). >>>>> >>>>> -- >>>>> Wandré Nunes de Pinho Veloso >>>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG >>>>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - >>>>> UFMG >>>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e >>>>> Inteligência Computacional - UNIFEI >>>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ >>>>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG >>>>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG >>>>> >>>>> 2017-09-13 16:22 GMT-03:00 Malitha Kabir : >>>>> >>>>>> Hi Wandré, >>>>>> >>>>>> It seems you already did intense research on it. Kindly accept my >>>>>> comments as an addition to your idea (not the answer you trying to find >>>>>> out). In my idea, categorizing molecules using it's descriptor should >>>>>> reduce computation time. RDKit currently offer calculation of about 200 >>>>>> descriptors! So, a careful look up at those makes a lot of sense to me. >>>>>> Conceptually, descriptor matching should follow a sequence (I don't know >>>>>> what sequence would be ideal) - for example MolWt should match first (H >>>>>> contribution and ions should be taken into consideration here) and then >>>>>> subsequent matching of other descriptors (might be different while >>>>>> writing >>>>>> programs). There are a few reading materials on molecular fingerprint and >>>>>> database schema. You may have a look at those. >>>>>> >>>>>> The links are from Daylight. I am neither involved with the company >>>>>> nor their product. >>>>>> http://www.daylight.com/dayhtml/doc/theory/theory.finger.html >>>>>> http://www.daylight.com/dayhtml/doc/theory/theory.thor.html >>>>>> >>>>>> Best regards, >>>>>> - malitha >>>>>> >&g
Re: [Rdkit-discuss] Non-redundant database of molecules
Strong recommendation: use the conda version: http://www.rdkit.org/docs/Install.html On Wed, Sep 13, 2017 at 9:58 PM, Wandré wrote: > I just run sudo apt-get install python-rdkit librdkit1 rdkit-data 😁 > I'm trying to solve this with this link: http://www.blopig.com/ > blog/2013/02/how-to-install-rdkit-on-ubuntu-12-04/ > > -- > Wandré Nunes de Pinho Veloso > Professor Assistente - Unifei - Campus Avançado de Itabira-MG > Doutorando em Bioinformática - Universidade Federal de Minas Gerais - UFMG > Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e > Inteligência Computacional - UNIFEI > Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ > Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG > Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG > > 2017-09-13 16:55 GMT-03:00 Markus Sitzmann : > >> How did you install rdkit so far? And where? Is it the conda/anaconda >> version? >> >> On Wed, Sep 13, 2017 at 9:39 PM, Wandré wrote: >> >>> How to install RDKit with InChI? >>> When I run Chem.inchi.INCHI_AVAILABLE, the result is False >>> >>> -- >>> Wandré Nunes de Pinho Veloso >>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG >>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - >>> UFMG >>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e >>> Inteligência Computacional - UNIFEI >>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ >>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG >>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG >>> >>> 2017-09-13 16:30 GMT-03:00 Wandré : >>> >>>> Thanks Malitha. >>>> I choose this descriptors because I will store this on my database, so, >>>> will be fast compare one molecule before insert them in database. >>>> My worry now is if the RDKit will generate different SMILES or InChI in >>>> same SDF molecule or equals in different molecules (molecules from RCSB >>>> PDB, PubChem, ChemBL, for example). 
>>>> >>>> -- >>>> Wandré Nunes de Pinho Veloso >>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG >>>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - >>>> UFMG >>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e >>>> Inteligência Computacional - UNIFEI >>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ >>>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG >>>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG >>>> >>>> 2017-09-13 16:22 GMT-03:00 Malitha Kabir : >>>> >>>>> Hi Wandré, >>>>> >>>>> It seems you already did intense research on it. Kindly accept my >>>>> comments as an addition to your idea (not the answer you trying to find >>>>> out). In my idea, categorizing molecules using it's descriptor should >>>>> reduce computation time. RDKit currently offer calculation of about 200 >>>>> descriptors! So, a careful look up at those makes a lot of sense to me. >>>>> Conceptually, descriptor matching should follow a sequence (I don't know >>>>> what sequence would be ideal) - for example MolWt should match first (H >>>>> contribution and ions should be taken into consideration here) and then >>>>> subsequent matching of other descriptors (might be different while writing >>>>> programs). There are a few reading materials on molecular fingerprint and >>>>> database schema. You may have a look at those. >>>>> >>>>> The links are from Daylight. I am neither involved with the company >>>>> nor their product. >>>>> http://www.daylight.com/dayhtml/doc/theory/theory.finger.html >>>>> http://www.daylight.com/dayhtml/doc/theory/theory.thor.html >>>>> >>>>> Best regards, >>>>> - malitha >>>>> >>>>> >>>>> On Thu, Sep 14, 2017 at 12:43 AM, Wandré >>>>> wrote: >>>>> >>>>>> Thanks for all the answers. >>>>>> >>>>>> Reading all answers, I think in something different... 
If the SMILES >>>>>> (Chem.MolToSmiles(mol,isomericSmiles=True)) and Inchi >>>>>> (Chem.MolToInchi(mol)) can generate the same value in different >>>>
Re: [Rdkit-discuss] Non-redundant database of molecules
How did you install rdkit so far? And where? Is it the conda/anaconda version? On Wed, Sep 13, 2017 at 9:39 PM, Wandré wrote: > How to install RDKit with InChI? > When I run Chem.inchi.INCHI_AVAILABLE, the result is False > > -- > Wandré Nunes de Pinho Veloso > Professor Assistente - Unifei - Campus Avançado de Itabira-MG > Doutorando em Bioinformática - Universidade Federal de Minas Gerais - UFMG > Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e > Inteligência Computacional - UNIFEI > Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ > Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG > Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG > > 2017-09-13 16:30 GMT-03:00 Wandré : > >> Thanks Malitha. >> I choose this descriptors because I will store this on my database, so, >> will be fast compare one molecule before insert them in database. >> My worry now is if the RDKit will generate different SMILES or InChI in >> same SDF molecule or equals in different molecules (molecules from RCSB >> PDB, PubChem, ChemBL, for example). >> >> -- >> Wandré Nunes de Pinho Veloso >> Professor Assistente - Unifei - Campus Avançado de Itabira-MG >> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - UFMG >> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e >> Inteligência Computacional - UNIFEI >> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ >> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG >> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG >> >> 2017-09-13 16:22 GMT-03:00 Malitha Kabir : >> >>> Hi Wandré, >>> >>> It seems you already did intense research on it. Kindly accept my >>> comments as an addition to your idea (not the answer you trying to find >>> out). In my idea, categorizing molecules using it's descriptor should >>> reduce computation time. RDKit currently offer calculation of about 200 >>> descriptors! 
So, a careful look up at those makes a lot of sense to me. >>> Conceptually, descriptor matching should follow a sequence (I don't know >>> what sequence would be ideal) - for example MolWt should match first (H >>> contribution and ions should be taken into consideration here) and then >>> subsequent matching of other descriptors (might be different while writing >>> programs). There are a few reading materials on molecular fingerprint and >>> database schema. You may have a look at those. >>> >>> The links are from Daylight. I am neither involved with the company nor >>> their product. >>> http://www.daylight.com/dayhtml/doc/theory/theory.finger.html >>> http://www.daylight.com/dayhtml/doc/theory/theory.thor.html >>> >>> Best regards, >>> - malitha >>> >>> >>> On Thu, Sep 14, 2017 at 12:43 AM, Wandré wrote: >>> >>>> Thanks for all the answers. >>>> >>>> Reading all answers, I think in something different... If the SMILES >>>> (Chem.MolToSmiles(mol,isomericSmiles=True)) and Inchi >>>> (Chem.MolToInchi(mol)) can generate the same value in different molecules, >>>> I will generate others descriptors (NumHDonors, NumHAcceptors, Ri >>>> ngCount, GetNumAtoms, TPSA, pyLabuteASA, MolWt, CalcNumRotatableBonds >>>> and MolLogP) to compare all the molecules that SMILES and Inchi are the >>>> same. >>>> If all this data are the same, I will generate the fingerprint >>>> (Atompair for exemple) and use Tanimoto coefficient and, if this value, >>>> when I compare two molecules, is 1, this molecules are the same. >>>> >>>> Where is my mistake (I think that is, one or more, mistakes)? >>>> >>>> Thanks! 
>>>> >>>> -- >>>> Wandré Nunes de Pinho Veloso >>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG >>>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - >>>> UFMG >>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e >>>> Inteligência Computacional - UNIFEI >>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ >>>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG >>>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG >>>> >>>> 2017-09-13 14:19 GMT-03:00 Dimitri Maziuk : >>>> >>>>> On 09/13/2017 11:46 AM, Markus Sitzmann wrote: >>>>> > The
Re: [Rdkit-discuss] Non-redundant database of molecules
Hi Wandré,

your problem is actually the opposite one. It is quite unlikely, practically impossible, that two different molecules produce the same InChI or SMILES. Your bigger problem is that what you regard as the same chemical may be regarded as different ones by SMILES or InChI. The risk of this is quite high with plain SMILES. It gets better with canonical SMILES (though in my opinion not by much); your best friend is InChI or Standard InChI.

Also, if two different molecules did produce the same InChI or SMILES, in all likelihood all your descriptors would be very similar, too, because SMILES, InChI etc. are just representations of the connection table, and the descriptor algorithms work on that same connection table (so the molecules also look the same to any of these algorithms). Calculating a Tanimoto coefficient doesn't solve this problem either, and a Tanimoto coefficient of 1 doesn't mean two molecules are identical (they are very similar, but not necessarily identical).

Markus

On Wed, Sep 13, 2017 at 8:43 PM, Wandré wrote:
> Thanks for all the answers.
>
> Reading all answers, I think in something different... If the SMILES
> (Chem.MolToSmiles(mol,isomericSmiles=True)) and Inchi
> (Chem.MolToInchi(mol)) can generate the same value in different molecules,
> I will generate others descriptors (NumHDonors, NumHAcceptors,
> RingCount, GetNumAtoms, TPSA, pyLabuteASA, MolWt, CalcNumRotatableBonds
> and MolLogP) to compare all the molecules that SMILES and Inchi are the
> same.
> If all this data are the same, I will generate the fingerprint (Atompair
> for example) and use Tanimoto coefficient and, if this value, when I
> compare two molecules, is 1, this molecules are the same.
>
> Where is my mistake (I think that is, one or more, mistakes)?
>
> Thanks!
>
> 2017-09-13 14:19 GMT-03:00 Dimitri Maziuk :
>
>> On 09/13/2017 11:46 AM, Markus Sitzmann wrote:
>> > The case that you have 3D information available for a molecule dataset
>> > is rare, if you want it trustworthy it gets even worse than that. And what
>> > is the point then to generate the configuration of a molecule first if you
>> > can not trust that either?
>>
>> Veering further off topic, do you even care in the first place? E.g. if
>> your molecule always exists as a mixture of isomers, except in some
>> megabuck-per-microgram painstakingly created reference samples, a
>> 3D-based system will represent it as two distinct molecules. Whereas you
>> want it represented as one.
>>
>> Last I looked PDB Ligand Expo had two different benzenes. Their software
>> doesn't (didn't?) do the circle version so they don't have the third one.
>>
>> --
>> Dimitri Maziuk
>> Programmer/sysadmin
>> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
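To illustrate the point that a Tanimoto coefficient of 1 does not prove identity, here is a minimal sketch in plain Python. The "fingerprint" is a deliberately simple toy stand-in (the set of linear fragments of up to three atoms in an unbranched chain written as a string of C's), not an RDKit fingerprint, and the helper names `toy_fingerprint` and `tanimoto` are made up for this example.

```python
def toy_fingerprint(chain, max_len=3):
    """Toy stand-in for a path fingerprint: the set of all linear
    fragments (substrings) of up to max_len atoms in an unbranched chain."""
    return {
        chain[i:i + n]
        for n in range(1, max_len + 1)
        for i in range(len(chain) - n + 1)
    }


def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient of two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


hexane = "CCCCCC"
heptane = "CCCCCCC"

# Both chains contain exactly the same fragments: C, CC and CCC.
similarity = tanimoto(toy_fingerprint(hexane), toy_fingerprint(heptane))
print(similarity)          # the feature sets are identical
print(hexane == heptane)   # but the molecules are not
```

Real bit-vector fingerprints can show the same effect, for example for homologous chains, which is why an exact identifier such as InChI is the safer key for a uniqueness check.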
Re: [Rdkit-discuss] Non-redundant database of molecules
The case that you have 3D information available for a molecule dataset is rare; if you want it to be trustworthy, it gets even rarer. And what is the point then of generating the configuration of a molecule first if you cannot trust that either?

- | Markus Sitzmann | markus.sitzm...@gmail.com

> On 13. Sep 2017, at 17:58, Dimitri Maziuk wrote:
>
>> On 2017-09-13 10:17, Markus Sitzmann wrote:
>> Canonical SMILES are only a very rough approximation for "unique molecule"
>> as they usually don't work well for tautomeric forms of compound.
>> InChI or Standard InChI is much better although also not perfect.
>
> ALATIS I linked to above does impose a stable consistent ordering for
> everything including hydrogens. The downside is it's garbage in - garbage
> out: you need to start with a 3D structure, otherwise it has an option to
> addHs and gen3D but no guarantee it'll generate the one you want.
>
> Dima
Re: [Rdkit-discuss] Non-redundant database of molecules
Canonical SMILES are only a very rough approximation of a "unique molecule" because they usually don't work well for tautomeric forms of a compound. InChI or Standard InChI is much better, although also not perfect. The "perfect solution" also depends on how uniqueness or redundancy of molecules is defined for the purpose of the database.

On Wed, Sep 13, 2017 at 4:56 PM, TJ O'Donnell wrote:
> Let the database do the work for you. Create a canonical SMILES column
> and/or InChI column and declare them to be unique. As you insert new
> rows, postgres will let you know if there is already a row with the same
> SMILES or InChI.
> Here's some help on how to handle that.
> https://www.postgresql.org/docs/9.5/static/sql-insert.html#SQL-ON-CONFLICT
>
> TJ O'Donnell
>
> On Wed, Sep 13, 2017 at 3:13 AM, Wandré wrote:
>
>> Hi,
>>
>> My name is Wandré and I'm from Brazil.
>> I'm trying to do a big database of molecules, but, I want to eliminate
>> all the redundant molecules before insert them in database.
>> I want to know what is the best method to identify one molecule in RDKit.
>> Is SMILES ("Chem.MolToSmiles(mol,isomericSmiles=True)") or I will need
>> to compare all molecules, one by one, before insert them in database (using
>> Tanimoto)?
>> This can be hard to do because my database will have lot of millions of
>> molecules, so, compare one by one before insert is the only answer?
>> Compare if the SMILES is already inserted is easy (text compare), but,
>> compare fingerprint of molecule...
>>
>> If I really need to compare the fingerprint of molecule, how to store
>> this data in PostgreSQL without use cartridge? I will generate the
>> fingerprint (Atompair, for example) and store this fingerprint in database
>> and compare all the fingerprints, one by one, before insert a new molecule.
>> This fingerprint (Atompair) have lot of features, so, store this in
>> relational database is expensive.
>> It is possible?
>>
>> Thanks!
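TJ's suggestion of letting the database enforce uniqueness could look roughly like the sketch below. To keep it self-contained, it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption of this example, not what the thread used); the table name, column names and the `insert_unique` helper are made up. The identifier string is passed in precomputed; in practice it would come from something like Chem.MolToInchi(mol).

```python
import sqlite3


def insert_unique(conn, inchi, name):
    """Insert a molecule. Return True if it was new, False if a row with
    the same InChI already existed (the UNIQUE constraint rejected it)."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO molecules (inchi, name) VALUES (?, ?)",
        (inchi, name),
    )
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE molecules ("
    " id INTEGER PRIMARY KEY,"
    " inchi TEXT NOT NULL UNIQUE,"  # the database rejects duplicates
    " name TEXT)"
)

ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
results = [
    insert_unique(conn, ethanol, "ethanol from source A"),
    insert_unique(conn, ethanol, "ethanol from source B"),
]
row_count = conn.execute("SELECT COUNT(*) FROM molecules").fetchone()[0]
print(results, row_count)
```

In PostgreSQL 9.5 or later the corresponding statement would be INSERT ... ON CONFLICT (inchi) DO NOTHING, as described on the page TJ linked. Either way the duplicate check is an index lookup on a text column, so no pairwise fingerprint comparison is needed.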
Re: [Rdkit-discuss] ETKDG conformation generation algorithm and fullerene-like structures.
Hmm, your choice of test molecules sounds like going from the poster child to the archenemy of conformation generation algorithms :-)

- | Markus Sitzmann | markus.sitzm...@gmail.com

> On 7. Sep 2017, at 18:59, Jason Biggs wrote:
>
> I've never had success using the ETKDG or KDG methods for fullerenes, when
> trying on C60 it goes for a long time and returns -1. The ETDG method works
> on C60, but fails on your C60H60.
>
> One thing you could try is to embed the hydrogen-suppressed structure, then
> add the hydrogens
>
> RDKit::DGeomHelpers::EmbedParameters params(RDKit::DGeomHelpers::ETDG);
> RDKit::DGeomHelpers::EmbedMolecule(*mol, params);
> bool explicitOnly = false;
> bool addCoords = true;
> RDKit::MolOps::addHs(*mol, explicitOnly, addCoords);
>
> seems to work.
>
> Jason Biggs
>
>> On Thu, Sep 7, 2017 at 10:49 AM, Dmitry Redkin wrote:
>> Hello all!
>> I've just started to use RDKit, and now I'm trying to generate some 3D
>> conformation for a molecule. ETKDG successfully optimized cyclohexane, so
>> I've tried some more complex example.
>> It was this fullerene-like structure (with all the single bonds and every C
>> atom having H atom attached). I'm attaching it to this email.
>>
>> But whatever I've tried to do with embedding parameters, RDKit either
>> stalls for several minutes trying to complete operation or just exits with
>> all zero coordinates.
>>
>> Is there any way to generate conformations for this structure? Maybe I did
>> something wrong or there is some flag that can be set to get some result
>> (any result, not necessarily the best one) in a reasonable time?
>>
>> My code is pretty simple, you can see it below.
>>
>> RWMol *mol = MolFileToMol("d:\\temp\\exe32\\full.mol", true, false, false);
>> MolOps::addHs(*mol);
>> DGeomHelpers::EmbedParameters p(DGeomHelpers::ETKDG);
>> p.maxIterations = 100; // if I left it -1, I could not wait long enough for EmbedMolecule to exit.
>> p.useRandomCoords = true;
>> int confid = DGeomHelpers::EmbedMolecule(*((ROMol*)mol), p);
>> MolToMolFile(*((ROMol*)mol), "d:\\temp\\exe32\\full1.mol", true, confid);
>> delete mol; // MolFileToMol allocates with new, so use delete rather than free()
>>
>> Dmitry Redkin, ACD Inc.
>> red...@acdlabs.ru
Re: [Rdkit-discuss] Using RDKit in PyCharm and Anaconda on Windows
I definitely have it working on Linux, too, but it may be that I also only tried it with PyCharm 2017.1.3 first. Before that, I did what Greg suggested: starting PyCharm from the activated environment. Unfortunately, I have no experience with Windows in this regard either.

On Thu, Jun 1, 2017 at 9:57 AM, Pavel Polishchuk wrote:
> I had some issues to run rdkit from Python console in PyCharm (4.5.5) on
> Linux. After recent installation of PyCharm 2017.1.3 it started to work.
> Maybe updating PyCharm will help on Win as well.
>
> Pavel.
>
> On 05/30/2017 10:10 PM, West, Richard wrote:
>
>> We're having trouble getting RDKit to work in a PyCharm project using an
>> Anaconda interpreter (Python 2.7), on Windows 8.1.
>> Has anyone had success with this and can guide us?
>> The trouble is we get an
>>
>>    ImportError: DLL load failed: The specified module could not be found.
>>
>> when trying to import rdkit (or rdBase).
>>
>> We have tried many variations of the following, but here is a basic
>> recipe of what does/doesn't work:
>> 1. Make a new conda environment (called 'eg1'), install rdkit ('conda
>>    install -c rdkit rdkit')
>> 2. From a cmd.exe prompt, use this environment ('activate eg1') load
>>    python ('python') and import rdkit ('import rdkit') it works fine.
>> 3. From PyCharm, create a Project Interpreter (pointing to
>>    'C:\Anaconda2\envs\eg1\python.exe'), and use this to run a script or
>>    create a new Python Console in which you 'import rdkit', leading to the
>>    "DLL load failed" message.
>> 4. We have tried manually adding a bunch of things to the "Interpreter
>>    Paths" in PyCharm, but without success (perhaps we just didn't add the
>>    right thing).
>>
>> Update: just before I hit "send" on this request for help, we stumbled
>> across this posting of the same problem, and solution, from Christian
>> Ribeaud:
>> https://intellij-support.jetbrains.com/hc/en-us/community/posts/115000244450-DLL-load-failed
>>
>> It seems that if we open cmd.exe, activate the environment, and then
>> launch PyCharm exe from there, it works.
>> I'm sharing this here because it took us a while to find the other post,
>> but also to ask: is there a "better" way?
>>
>> Cheers,
>> Richard
>>
>> --
>> Richard H. West, Ph.D.
>> Assistant Professor, Department of Chemical Engineering,
>> Northeastern University, 360 Huntington Ave, Boston, MA 02115
>> http://northeastern.edu/comocheng  Phone: 617-373-5163
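A plausible explanation for why launching PyCharm from an activated environment helps (an assumption, not something confirmed in this thread) is that activation puts the environment's directories on PATH, which is where Windows looks for the DLLs that the rdkit extension modules need. The sketch below checks which of those directories are missing from a given PATH value. The list of sub-directories is an assumption about a typical conda layout on Windows, and `missing_env_dirs` is a made-up helper name.

```python
import ntpath  # Windows path rules, so the sketch behaves the same on any OS

# Assumed sub-directories that conda activation typically adds to PATH on
# Windows; adjust this list to your installation.
ASSUMED_SUBDIRS = ["", "Library\\bin", "Scripts"]


def missing_env_dirs(env_root, path_value):
    """Return the assumed environment directories that are not on PATH."""
    on_path = {
        ntpath.normcase(ntpath.normpath(entry))
        for entry in path_value.split(";")
        if entry
    }
    wanted = [
        ntpath.normpath(ntpath.join(env_root, sub)) for sub in ASSUMED_SUBDIRS
    ]
    return [d for d in wanted if ntpath.normcase(d) not in on_path]


# Example with a PATH that contains only the environment root.
env_root = "C:\\Anaconda2\\envs\\eg1"
example_path = "C:\\Windows\\system32;" + env_root
for directory in missing_env_dirs(env_root, example_path):
    print("not on PATH:", directory)
```

Inside a PyCharm Python console one could call `missing_env_dirs(env_root, os.environ["PATH"])` (after `import os`) and compare the result with the same call made from a cmd.exe session in which the environment was activated.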
Re: [Rdkit-discuss] Trouble with conda build in Docker
Hi Riccardo,

just to let you know: my problem went away for whatever reason (I haven't touched anything since my last email). Did you change something?

Markus

On Thu, Apr 6, 2017 at 12:51 PM, Markus Sitzmann wrote:
> Thanks Riccardo for your reply
>
> I tried both master and development - both with your Dockerscript (which
> starts from centos) and mine (which starts from Debian:jessie). Same result
> everywhere. I haven't built it in a while, too, but since I updated to
> Docker CE, Version 17.03, it triggered this rebuild.
>
> I hope it isn't my setup (well, that is actually what I wanted to find out
> :-), if somebody else has problems). It isn't urgent, also :-)
>
> Markus
>
> On Thu, Apr 6, 2017 at 8:50 AM, Riccardo Vianello <riccardo.viane...@gmail.com> wrote:
>
>> Hi Markus,
>>
>> On Thu, Apr 6, 2017 at 12:03 AM, Markus Sitzmann <markus.sitzm...@gmail.com> wrote:
>>
>>> Hi (Riccardo).
>>>
>>> I have trouble with the conda build in Docker (I just updated to the
>>> most recent version which triggered the new build) - below is the error
>>> trace. I took the original Docker file and just edited out all non-Python35
>>> builds - so it does only the Python35 builds and ends somewhere when
>>> rdkit-postgres95 is built. Does somebody have the same problem?
>>
>> I couldn't work on this during the last few months so I didn't test any
>> recent builds. I might be able to have a closer look and run some tests
>> next week. What branch of the conda-rdkit repository are you using (master
>> or development)?
>>
>> Best,
>> Riccardo
>>
>>> make[3]: Entering directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/port'
>>> make -C ../backend submake-errcodes
>>> make[3]: Entering directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend/catalog'
>>> cd ../../../src/include/catalog && /bin/sh ../../../config/missing perl ./duplicate_oids
>>> make -C utils probes.h
>>> ***
>>> ERROR: Perl is missing on your system. It is needed unless you are building
>>> from an unmodified official distribution of PostgreSQL.
>>> ***
>>> make[3]: Leaving directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend/catalog'
Re: [Rdkit-discuss] Install rdkit with anaconda3
I think if you install conda freshly now, it automatically uses Python 3.6. If you don't specifically need 3.6, you have to run this:

conda install python=3.5

Then you should be able to install rdkit as described.

On Wed, Apr 12, 2017 at 12:29 PM, Greg Landrum wrote:
> On Wed, Apr 12, 2017 at 4:59 AM, Maciek Wójcikowski wrote:
>
>> There are no Python 3.6 packages of rdkit right now.
>>
>> I guess we can ask Greg or Riccardo to build them with the next release
>> of RDKit.
>
> That is the plan. When we do the next release (in about a week), we'll do
> python3.6 builds too.
>
> -greg
Re: [Rdkit-discuss] Trouble with conda build in Docker
Thanks, Riccardo, for your reply.

I tried both master and development, each with your Docker script (which starts from CentOS) and with mine (which starts from debian:jessie). Same result everywhere. I hadn't built it in a while either, but updating to Docker CE, version 17.03, triggered this rebuild.

I hope it isn't my setup (well, that is actually what I wanted to find out :-), i.e. whether somebody else has the same problem). It isn't urgent, either :-)

Markus

On Thu, Apr 6, 2017 at 8:50 AM, Riccardo Vianello <riccardo.viane...@gmail.com> wrote:
> Hi Markus,
>
> On Thu, Apr 6, 2017 at 12:03 AM, Markus Sitzmann <markus.sitzm...@gmail.com> wrote:
>
>> Hi (Riccardo).
>>
>> I have trouble with the conda build in Docker (I just updated to the most
>> recent version which triggered the new build) - below is the error trace. I
>> took the original Docker file and just edited out all non-Python35 builds -
>> so it does only the Python35 builds and ends somewhere when
>> rdkit-postgres95 is built. Does somebody have the same problem?
>
> I couldn't work on this during the last few months so I didn't test any
> recent builds. I might be able to have a closer look and run some tests
> next week. What branch of the conda-rdkit repository are you using (master
> or development)?
>
> Best,
> Riccardo
>
>> make[3]: Entering directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/port'
>> make -C ../backend submake-errcodes
>> make[3]: Entering directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend/catalog'
>> cd ../../../src/include/catalog && /bin/sh ../../../config/missing perl ./duplicate_oids
>> make -C utils probes.h
>> ***
>> ERROR: Perl is missing on your system. It is needed unless you are building
>> from an unmodified official distribution of PostgreSQL.
>> ***
>> make[3]: Leaving directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend/catalog'
[Rdkit-discuss] Trouble with conda build in Docker
Hi (Riccardo). I have trouble with the conda build in Docker (I just updated to the most recent version which triggered the new build) - below is the error trace. I took the original Docker file and just edited out all non-Python35 builds - so it does only the Python35 builds and ends somewhere when rdkit-postgres95 is built. Does somebody have the same problem? Thanks, Markus make[3]: Entering directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/port' make -C ../backend submake-errcodes make[3]: Entering directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend/catalog' cd ../../../src/include/catalog && /bin/sh ../../../config/missing perl ./duplicate_oids make -C utils probes.h *** ERROR: Perl is missing on your system. It is needed unless you are building from an unmodified official distribution of PostgreSQL. *** make[3]: Leaving directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend/catalog' make[3]: *** [postgres.bki] Error 1 make[2]: *** [submake-schemapg] Error 2 make[2]: *** Waiting for unfinished jobs make[3]: Entering directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend/utils' sed -f ./Gen_dummy_probes.sed probes.d >probes.h make[3]: Leaving directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend/utils' make[4]: Entering directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend' make[4]: Nothing to be done for `submake-errcodes'. 
make[4]: Leaving directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend' make[3]: Leaving directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/port' make -C ../../src/common all make[3]: Entering directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/common' make -C ../backend submake-errcodes make[4]: Entering directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend' make[4]: Nothing to be done for `submake-errcodes'. make[4]: Leaving directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend' make[3]: Leaving directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/common' make[2]: Leaving directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend' make[1]: *** [all-backend-recurse] Error 2 make[1]: Leaving directory `/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src' make: *** [all-src-recurse] Error 2 Traceback (most recent call last): File "/home/rdkit/miniconda/bin/conda-build", line 6, in path in binary file share/terminfo/w/wsvt25 Detected hard-coded path in binary file share/terminfo/w/wsvt25m Detected hard-coded path in binary file share/terminfo/x/x68k Detected hard-coded path in binary file share/terminfo/x/x68k-ite Detected hard-coded path in binary file share/terminfo/z/z29a Detected hard-coded path in binary file share/terminfo/z/z29a-kc-bc Detected hard-coded path in binary file share/terminfo/z/z29a-kc-uc Detected hard-coded path in binary file share/terminfo/z/z29a-nkc-bc Detected hard-coded path in binary file share/terminfo/z/z29a-nkc-uc Detected hard-coded path in binary file share/terminfo/z/z340 Detected hard-coded path in binary file share/terminfo/z/z340-nam Detected hard-coded path in text file bin/ncurses6-config Detected hard-coded path in text file share/man/man1/captoinfo.1m Detected hard-coded path in text file 
share/man/man1/infocmp.1m Detected hard-coded path in text file share/man/man1/infotocap.1m Detected hard-coded path in text file share/man/man1/ncurses6-config.1 Detected hard-coded path in text file share/man/man1/tic.1m Detected hard-coded path in text file share/man/man1/toe.1m Detected hard-coded path in text file share/man/man1/tput.1 Detected hard-coded path in text file share/man/man1/tset.1 Detected hard-coded path in text file share/man/man3/ncurses.3x Detected hard-coded path in text file share/man/man3/panel.3x Detected hard-coded path in text file share/man/man5/term.5 Detected hard-coded path in text file share/man/man5/terminfo.5 Detected hard-coded path in text file share/man/man7/term.7 /home/rdkit/bld/linux-64/ncurses-6.0-0.tar.bz2 Nothing to test for: /home/rdkit/bld/linux-64/ncurses-6.0-0.tar.bz2 BUILD START: postgresql95-9.5.2-py35_0 The following NEW packages will be INSTALLED: libiconv: 1.14-0 libxml2:2.9.4-0 libxslt:1.1.29-0 ncurses:6.0-0 local openssl:1.0.2k-1 pip:9.0.1-py35_1 python: 3.5.3-1 readline: 6.2-2 setuptools: 27.2.0-py35_0 sqlite: 3.13.0-0 tk: 8.5.18-0 wheel: 0.29.0-py35_0 xz: 5.2.2-1 zlib: 1.2.8-3 Source cache directory is: /home/rdkit/bld/src_cache Downloading source to cache: postgresql-9.5.2.tar.bz2 Downloading https://ftp.postgresql.org/pub/source/v9.5.2/pos
Re: [Rdkit-discuss] connecting to postgres in rdkit environment
Maybe this one helps, too, although it is basically the same as what TJ said:

https://devops.profitbricks.com/tutorials/install-postgresql-on-centos-7/

Markus

On Sat, Feb 25, 2017 at 11:29 PM, TJ O'Donnell wrote:
> The server itself must be told to allow remote connections.
> You might check these two things.
> 1. You can edit the postgresql.conf file (not sure where that is on your
>    system).
>    https://www.postgresql.org/docs/9.2/static/runtime-config-connection.html
>    Uncomment or add the line listen_addresses='*'. You can
>    tailor that to be more specific, but try this first.
>
> 2. The file pg_hba.conf also controls access. Look at this:
>    https://www.postgresql.org/docs/9.3/static/auth-pg-hba-conf.html
>
> Be sure to restart the server after you make changes to these files.
>
> Hope this helps,
> TJ O'Donnell
>
> On Sat, Feb 25, 2017 at 12:34 PM, wrote:
>
>> Hi,
>> I've installed rdkit on a CentOS machine using anaconda python and set up
>> a postgresql compound database in the rdkit environment. It works great on
>> the machine's console.
>> I now want to access it remotely and I'm trying to set up a jdbc postgres
>> driver to access it from a windows client but this is not working. If I
>> test the driver on the server it tells me that the connection is refused
>> and I should check that the machine is accepting TCP requests.
>>
>> I have opened the standard port that postgres uses
>> -A INPUT -m state --state NEW -m tcp -p tcp --dport 5432 -j ACCEPT
>>
>> iptables -L returns
>> ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:postgres
>>
>> this is where I don't know what to check next. A few things that might be
>> relevant. If I "ps -eaf | grep post" I see four postgres processes running
>> under my username (not postgres), so I think there is a server working.
>> There is also a "system" postgresql (version 9.2) which I have connected to
>> previously a long time ago. This connection no longer works either and I
>> don't really care about that but could be an interfering factor.
>>
>> If anyone has suggestions about what to check next or solve this I'd be
>> grateful
>>
>> thanks,
>> Neil
[Rdkit-discuss] If someone has build problems using conda currently ...
... I just ran into this: https://github.com/conda/conda/issues/4309

Going back to a previous conda version (4.2.12) helps.

Other than that: Happy New Year (a late one :-)
Re: [Rdkit-discuss] Extracting SMILES from text
Hi Alexis, you may find also so some "novel" compounds by this approach :-). Whether your tuple solution improves performance strongly depends on the content of your text documents and how often they repeat the same words again - but my guess would be it will help. Probably the best way is even to look at the distribution of words before you feed them to RDKit. You should also "memorize" those ones that successfully generated a structure, doesn't make sense to do it again, then. Markus On Fri, Dec 2, 2016 at 10:21 AM, Maciek Wójcikowski wrote: > Hi Alexis, > > You may want to filter with some regex strings containing not valid > characters (i.e. there is small subset of atoms that may be without > brackets). See "Atoms" section: http://www.daylight.com/ > dayhtml/doc/theory/theory.smiles.html > > The set might grow pretty quick and may be inefficient, so I'd parse all > strings passing above filter. Although there will be some false positives > like "CC" which may occur in text (emails especially). > > > Pozdrawiam, | Best regards, > Maciek Wójcikowski > mac...@wojcikowski.pl > > 2016-12-02 10:11 GMT+01:00 Alexis Parenty : > >> Dear all, >> >> >> I am looking for a way to extract SMILES scattered in many text documents >> (thousands documents of several pages each). >> >> At the moment, I am thinking to scan each words from the text and try to >> make a mol object from them using Chem.MolFromSmiles() then store the words >> if they return a mol object that is not None. >> >> Can anyone think of a better/quicker way? >> >> >> Would it be worth storing in a tuple any word that do not return a mol >> object from Chem.MolFromSmiles() and exclude them from subsequent search? 
>> Something along those lines:
>>
>> from rdkit import Chem
>>
>> excluded_set = set()   # words already known not to be SMILES
>> smiles_list = []
>> for each_word in text.split():   # whitespace-separated words
>>     if each_word not in excluded_set:
>>         each_word_mol = Chem.MolFromSmiles(each_word)
>>         if each_word_mol is not None:
>>             smiles_list.append(each_word)
>>         else:
>>             excluded_set.add(each_word)   # store the word, not the (None) mol
>>
>> Would searching that growing set not actually take more time than
>> blindly trying to make a mol object for every word?
>>
>> Any suggestions?
>>
>> Many thanks and regards,
>>
>> Alexis
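The suggestions in this thread (pre-filter with a regex, remember failures as well as successes) can be combined into a short sketch. The character class, the function names and the injectable `parse` argument are assumptions made for illustration; by default the sketch calls Chem.MolFromSmiles(), as in Alexis's snippet.

```python
import re

# Rough pre-filter (an assumption, not a full SMILES grammar): accept only
# words made entirely of characters that can occur in a SMILES string.
SMILES_CHARS = re.compile(r"^[A-Za-z0-9@+\-\[\]\(\)\\/%=#$.:*]+$")


def rdkit_parse(word):
    """Default parser: an RDKit Mol, or None if the word is not valid SMILES."""
    from rdkit import Chem  # imported lazily so the filtering logic can be used alone
    return Chem.MolFromSmiles(word)


def extract_smiles(words, parse=rdkit_parse):
    """Return the distinct words that parse as SMILES, in order of first appearance."""
    accepted = {}     # word -> True; a dict keeps insertion order
    rejected = set()  # words that failed the filter or the parser
    for word in words:
        if word in accepted or word in rejected:
            continue  # each distinct word is parsed at most once
        if SMILES_CHARS.match(word) and parse(word) is not None:
            accepted[word] = True
        else:
            rejected.add(word)
    return list(accepted)
```

Membership tests on a set or dict take constant time on average, so the lookup cost does not grow with the number of stored words, whereas searching a tuple or list would. As Maciek notes, short words such as "CC" will still come through as false positives.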
Re: [Rdkit-discuss] comparing two or more tables of molecules
Well, since George mentioned a talk by me: I wish we had implemented our tool back then using an open-source toolkit like RDKit (which wasn't very well known at the time), and had also been smart enough to use SMARTS for the transformation rules (they are partially implemented as SMARTS, but big parts rely on other CACTVS script functionality). I still intend to continue/advance (whatever) this work and make it openly available, but I must admit it is a quite vague intention currently. Markus
Re: [Rdkit-discuss] smarts vs smiles database queries and explicit hydrogens
If I understood Greg correctly, it will be in 2016.09, which isn't in conda just yet; they are currently working on putting it there. Markus - | Markus Sitzmann | markus.sitzm...@gmail.com > On 23 Nov 2016, at 15:29, Alexander Klenner-Bajaja wrote: > > Dear Greg, > > Thank you very much, looking at the results that function was exactly what I > was looking for – only I can’t find it in my updated anaconda installation. > > “conda update rdkit” tells me I have the latest version 2016.03.4 and > postgres tells me I have the 3.4 version of the RDKit extension > > If I understand your blog post correctly it should be in 2016.03 version? > What am I missing? > > > Best, > > Alex > > > > From: Greg Landrum [mailto:greg.land...@gmail.com] > Sent: Wednesday, November 23, 2016 11:42 AM > To: Alexander Klenner-Bajaja > Cc: rdkit-discuss@lists.sourceforge.net > Subject: Re: [Rdkit-discuss] smarts vs smiles database queries and explicit > hydrogens > > Hi Alex, > > The new version of the cartridge has some capabilities that, I think, address > this. 
>
> There's a blog post about this:
> http://rdkit.blogspot.com/2016/07/tuning-substructure-queries-ii.html
> but the short version is that you can do the kind of queries it seems like
> you want to do quite simply:
>
> chembl_21=# select * from rdk.mols where m@>mol_adjust_query_properties('*c1ncccn1') limit 3;
>  molregno | m
> ----------+---
>    601707 | CCCc1nc(-c2ccc(F)cc2)oc1C(=O)NC(CC)CN1CCN(c2ncccn2)CC1
>    289103 | CC1C(=N)/C(=N/Nc2ccc(S(=O)(=O)Nc3ncccn3)cc2)C(=O)C(C)C1=O
>    607646 | CCNC(=O)[C@@H]1OC(n2cnc3c(NC(=O)Nc4ccc(S(=O)(=O)Nc5ncccn5)cc4)ncnc32)[C@@H](O)[C@H]1O
> (3 rows)
>
> chembl_21=# select * from rdk.mols where m@>mol_adjust_query_properties('*c1nc(*)ccn1') limit 3;
>  molregno | m
> ----------+---
>    158659 | CCNc1nccc(-c2c(-c3ccc(F)cc3)ncn2C2CCN(C)CC2)n1
>    158743 | Nc1nccc(-c2c(-c3ccc(F)cc3)ncn2C2CCN(Cc3c3)CC2)n1
>    158843 | CC1(C)CC(n2cnc(-c3ccc(F)cc3)c2-c2ccnc(N)n2)CC(C)(C)N1
> (3 rows)
>
> chembl_21=# select * from rdk.mols where m@>mol_adjust_query_properties('*c1nc(*)cc(*)n1') limit 3;
>  molregno | m
> ----------+---
>    726443 | CN=C(S)NNc1nc(C)cc(C)n1
>    561136 | C[C@H](Nc1cc(NC2CC2)nc(C(F)(F)F)n1)[C@@H](Cc1ccc(Cl)cc1)c1(Br)c1
>    205784 | CCN(CC)C(=O)CSc1nc(N)cc(Cl)n1
> (3 rows)
>
> There's more detail in the blog post, but the default behavior is to convert
> dummies into generic query atoms and to constrain the substitution at any
> other *ring* position.
>
> Best Regards,
> -greg
>
>
> On Wed, Nov 23, 2016 at 9:20 AM, Alexander Klenner-Bajaja wrote:
> Hi all,
>
> I am currently exploring the possibilities of the RDKit database cartridge
> for substructure search - I installed everything following the tutorial from
> http://www.rdkit.org/docs/Install.html
>
> Very nice tutorial - worked perfectly fine.
>
> Since we are exploring solutions for browser based gui searches I created a
> test page using Ketcher (http://lifescience.opensource.epam.com/ketcher/)
> which communicates with the database through PHP.
> > Ketcher returns a SMILES representation from the drawn molecule. The raw data > of the molecules in the database are canonical SMILES created from RDKIT > canonical SMILES from the rdkit KNIME node (they are text-mined from patents). > > When doing substructure searches, as long as we query for well-defined > compounds the results make sense – however looking at R1,…-groups things get > a little odd. > > I found a very old discussion on the mailing list from 2009 where this has > been discussed and I understood from that dialog that when looking at SMILES > with a “*” representation this is interpreted as a dummy atom and the same > dummy atom is expected in the search space to produce a hit. While a SMARTS > representation of the same string actually leads to the behaviour that “any > atom” is matched at that position. > > I ended up with the very cumbersome query, I am sure there are more elegant > ways of doing this using ::qmol notation, but as I said I am currently > explori
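For readers who want to try the same query adjustment outside the database, the following is a minimal Python sketch. It assumes an RDKit release that provides Chem.AdjustQueryProperties() and that this function corresponds to the cartridge's mol_adjust_query_properties(); the parameters are set explicitly rather than relying on defaults, and the helper name and test molecules are made up for illustration.

```python
from rdkit import Chem


def ring_constrained_query(smiles):
    """Turn a SMILES with '*' attachment points into a substructure query."""
    query = Chem.MolFromSmiles(smiles)
    params = Chem.AdjustQueryParameters()
    params.makeDummiesQueries = True   # '*' matches any atom
    params.adjustDegree = True         # fix the substitution pattern ...
    params.adjustDegreeFlags = (       # ... except on dummies and chain atoms
        Chem.AdjustQueryWhichFlags.ADJUST_IGNOREDUMMIES
        | Chem.AdjustQueryWhichFlags.ADJUST_IGNORECHAINS
    )
    return Chem.AdjustQueryProperties(query, params)


query = ring_constrained_query("*c1ncccn1")
mono = Chem.MolFromSmiles("Cc1ncccn1")    # substituted only at the marked position
di = Chem.MolFromSmiles("Cc1nc(C)ccn1")   # one extra ring substituent

print(mono.HasSubstructMatch(query))
print(di.HasSubstructMatch(query))
```

With these settings the first molecule should match and the second should not, because the extra methyl group changes the degree of a ring atom that the query constrains.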
Re: [Rdkit-discuss] reading multiple conformers from file
+1 for a json format ... hmm, how about a general json-based molecular structure format ... let us call it "cson" (that is an homage to Google gson and Chemical Markup Language CML :-) Markus On Mon, Oct 31, 2016 at 11:18 AM, Brian Cole wrote: > I would 2nd the suggestion of continuing to push a JSON format forward > that natively supports multiple conformers. > > I've never seen automatic recombination of an SDF work 100% of the time, > it's fraught with corner cases. It's also abysmally slow and takes a huge > amount of disk space. > > -Bruce > > On Oct 30, 2016, at 5:21 PM, Brian Kelley wrote: > > Rdkit already has a way to serialize conformers, the binary pickle format! > > Perhaps we should make a file extension for multiple molecules. Say > ".rdk" and call it a day. Like inchi the source code is the reference :) > > > Brian Kelley > > On Oct 27, 2016, at 2:05 AM, Greg Landrum wrote: > > The RDKit has support for the TPL format, an old BioCad/MSI/Accelrys > format. > It's easy to imagine something better, but this is at least already there > and there could be other software that speaks it: > https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/FileParsers/test_data/cmpd2.tpl > > I'd still like to do a decent JSON format and adding multi-confs to that > would be logical > > On Thu, Oct 27, 2016 at 6:58 AM, David Cosgrove < > davidacosgrov...@gmail.com> wrote: > >> I've been wondering if, now that you can get decent conformations from >> RDKit, it would be worth devising a multi-conformation file format to make >> reading multi-conf molecules faster for vs purposes. In my experience, >> pulling all the conformers out of an ascii file such as an sdf can become >> the RDS for pharmacophore searching. Something to think about at the >> hackathon maybe and certainly something that deserves a new email >> thread. 
>> >> Dave >> >> >> On Thursday, 27 October 2016, Greg Landrum >> wrote: >> >>> Hi Thomas, >>> >>> You're right, reading multiple conformations out of an SDF does seem >>> like one of those common operations. Unfortunately the RDKit does not >>> currently support it in an easy way. >>> >>> A python implementation of this would be a good topic for Friday's UGM >>> hackathon, we can see if anyone finds it interesting enough to work on. >>> >>> -greg >>> >>> >>> On Tue, Oct 25, 2016 at 2:16 AM, Thomas Evangelidis >>> wrote: >>> Hello everyone, I am a new user of RDkit and I was looking in the documentation for an easy way to load multiple conformers from a structure file like .sdf. The code must 1) distinguish between different protonation states of the same molecule, 2) create a new Mol() object for each protonation state and load into it the respective conformers. Apparently I can work out a solution for 1) using mol.GetProp('_Name'), mol.GetNumAtoms, mol.GetNumBonds and other properties, but I was wondering if there is any more straight forward way to do it. For 2) I guess I must iterate over all molecules in the input file, create new Mol() objects (one for each protonation state of each ligand) and add conformers to these new Mol() objects. Again this sounds easily programmable, but sounds like a very common operation, thus I was wondering if it has been implemented in a function. thanks in advance Thomas -- == Thomas Evangelidis Research Specialist CEITEC - Central European Institute of Technology Masaryk University Kamenice 5/A35/1S081, 62500 Brno, Czech Republic email: tev...@pharm.uoa.gr teva...@gmail.com website: https://sites.google.com/site/thomasevangelidishomepage/ -- The Command Line: Reinvented for Modern Developers Did the resurgence of CLI tooling catch you by surprise? Reconnect with the command line and become more productive. Learn the new .NET and ASP.NET CLI. Get your free copy! 
http://sdm.link/telerik ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >>> > > -- > The Command Line: Reinvented for Modern Developers > Did the resurgence of CLI tooling catch you by surprise? > Reconnect with the command line and become more productive. > Learn the new .NET and ASP.NET CLI. Get your free copy! > http://sdm.link/telerik > > ___ > Rdkit-discuss mailing list > Rdkit-discuss@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss > >
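Since the thread concludes that RDKit had no ready-made function for this at the time, here is a rough Python sketch of the approach Thomas describes: one Mol per distinct structure (protonation states give different SMILES), with every further SDF record added as a conformer. The function name is hypothetical, and using the canonical SMILES as the grouping key is an assumption, not an established RDKit convention.

```python
from rdkit import Chem


def group_conformers(mols):
    """Merge single-conformer molecules with identical SMILES into multi-conformer Mols."""
    grouped = {}
    for mol in mols:
        if mol is None:  # record that RDKit could not parse
            continue
        key = Chem.MolToSmiles(mol)  # differs between protonation states
        base = grouped.get(key)
        if base is None:
            grouped[key] = Chem.Mol(mol)  # first record becomes the reference
            continue
        # match[i] is the atom in `mol` that corresponds to atom i of `base`,
        # so the records may list their atoms in different orders.
        match = mol.GetSubstructMatch(base)
        if len(match) != mol.GetNumAtoms():
            raise ValueError("could not align atoms for " + key)
        aligned = Chem.RenumberAtoms(mol, list(match))
        base.AddConformer(aligned.GetConformer(), assignId=True)
    return grouped


# Typical use (file name is a placeholder); keep hydrogens so that
# protonation states stay distinguishable:
# grouped = group_conformers(Chem.SDMolSupplier("ligands.sdf", removeHs=False))
```

One caveat: for symmetric molecules the substructure match may swap equivalent atoms, which is harmless for storage but worth remembering before computing RMSD values between the grouped conformers.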
Re: [Rdkit-discuss] The RDKit and modern C++
I get the feeling that RH/CentOS 6 is becoming the next XP kind of story: too much legacy that makes the update impossible or very hard. Also Docker, a great technology that could mitigate this problem, is very painful under RH/CentOS 6. --- Markus Sitzmann > On 29 Sep 2016, at 07:31, Greg Landrum wrote: > > >> On Thu, Sep 29, 2016 at 7:06 AM, Peter S. Shenkin wrote: >> >> Thanks... so it sounds like the main effort (aside from what you delicately >> called "professional development" ;-) ) will be to introduce features that >> improve robustness or performance when writing new code and possibly when >> maintaining (fixing, extending) existing code. > > Yes, I think that's about right with the one refinement that we'll be using > some automated tools to convert the existing code to use some of those new > features. > > -greg
Re: [Rdkit-discuss] Conda installation of RDKit on W8
Hi Gonzalo, after you have activated my-rdkit-env, try installing rdkit with conda install -c https://conda.anaconda.org/rdkit rdkit Alternatively, if you go a step back, you can also start with conda create -c https://conda.anaconda.org/rdkit -n give-your-environment-whatever-name-you-want rdkit and then activate "give-your-environment-whatever-name-you-want". The error message above just says that you are trying to create an environment with a name that already exists. Markus On Mon, Sep 26, 2016 at 2:11 PM, Gonzalo Colmenarejo < colmenarejo.gonz...@gmail.com> wrote: > Thanks a lot, Marta. Still, after activating the environment, I get in > jupyter the "ImportError: No module named rdkit". > > This is confusing... > > > On Mon, Sep 26, 2016 at 1:56 PM, Marta Stępniewska-Dziubińska < > mart...@ibb.waw.pl> wrote: > >> Hi Gonzalo, >> You need to activate your environment: >> activate my-rdkit-env >> >> See: http://conda.pydata.org/docs/using/envs.html#change-environments-activate-deactivate >> >> Best, >> Marta >> >> >> 2016-09-26 13:45 GMT+02:00 Gonzalo Colmenarejo < >> colmenarejo.gonz...@gmail.com>: >> > rdkit is not shown within the package list. However, if I run conda >> create >> > -c https://conda.anaconda.org/rdkit -n my-rdkit-env rdkit I get this >> > message: >> > >> > Error: prefix already exists: C:\Users\Dell\Anaconda\envs\my-rdkit-env >> > >> > Any idea on how this could be fixed? >> > >> > Thanks >> > >> > On Fri, Sep 23, 2016 at 9:06 PM, Greg Landrum >> > wrote: >> >> >> >> I think anaconda is fine, but it looks like either the RDKit isn't >> >> installed correctly or you aren't running the anaconda Python. 
>> >> >> >> Please check that the python you are running is the one from anaconda >> and >> >> that the RDKit is installed (that last one is "conda list") >> >> >> >> >> >> >> >> >> >> >> >> >> >> On Fri, Sep 23, 2016 at 8:31 PM +0200, "Gonzalo Colmenarejo" >> >> wrote: >> >> >> >>> Hi Greg, >> >>> >> >>> It shows: >> >>> >> >>> ImportError: No module named rdkit >> >>> >> >>> Should I reinstall anaconda? >> >>> >> >>> Thanks >> >>> >> >>> Gonzalo >> >>> >> >>> On Fri, Sep 23, 2016 at 2:54 PM, Greg Landrum > > >> >>> wrote: >> >> Hi Gonzalo, >> >> Are you sure that the jupyter you are running is the same one that >> came >> with your conda installation? >> Can you do, from the command line: >> python -c "from rdkit import Chem" >> >> On Fri, Sep 23, 2016 at 10:49 AM, Gonzalo Colmenarejo >> wrote: >> > >> > Hi, >> > I had a previous release of RDKit (2015_03_1) in my Windows 8 PC >> > installed in the old fashioned mode and it worked OK. I renamed the >> > corresponding folder and installed the latest version of RDKit >> through >> > conda. Now I get the following error message when trying to run my >> previous >> > code in Jupyter: ImportError: No module named rdkit >> > >> > Any advice on how to fix this would be appreciated. >> > >> > Thanks a lot >> > >> > Gonzalo >> > >> > >> > >> -- >> > >> > ___ >> > Rdkit-discuss mailing list >> > Rdkit-discuss@lists.sourceforge.net >> > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >> > >> >> >>> >> > >> > >> > >> -- >> > >> > ___ >> > Rdkit-discuss mailing list >> > Rdkit-discuss@lists.sourceforge.net >> > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >> > >> > > > > -- > > ___ > Rdkit-discuss mailing list > Rdkit-discuss@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss > > -- ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
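The two checks Greg asks for (which Python is running, and whether that Python can see the RDKit) can be run from inside the Jupyter kernel itself. This is a small sketch assuming Python 3; the function name is made up, and the same check works for any other package that seems to be missing from an environment, such as matplotlib.

```python
import importlib.util
import sys


def describe_environment(modules=("rdkit",)):
    """Report which interpreter is running and whether each module can be found."""
    report = {"executable": sys.executable, "prefix": sys.prefix}
    for name in modules:
        # find_spec() returns None when a top-level package is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in describe_environment(("rdkit", "matplotlib")).items():
        print(key, "->", value)
```

If the reported prefix is not the directory of the activated conda environment, the notebook is running a different Python than the one the package was installed into.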
Re: [Rdkit-discuss] cannot import matplotlib in my-rdkit-env
Hi Chris, You have to install it explicitly in your my-rdkit-env, too, just as you did in the environment where matplotlib is already available. After you have activated my-rdkit-env, you probably just have to run conda install matplotlib (You have to do this for any other package, too.) Markus - | Markus Sitzmann | markus.sitzm...@gmail.com > On 21.08.2016, at 16:24, chris dalton wrote: > > Hi, > I have installed Rdkit on a windows laptop with conda and I can activate the > rdkit environment OK and if I start IDLE up, rdkit works. However, I can no > longer import some other packages, such as matplotlib from that IDLE > interpreter. It tells me the package isn't there. > > If I just start up python without activating the rdkit environment, I can > import matplotlib so it is there; there is something about the rdkit > environment that is not looking in the right place. Looking in environment > variables, I cannot see anything rdkit-specific. > > How can I use matplotlib within my-rdkit-env? > > thanks, > > Chris.
Re: [Rdkit-discuss] writing c++ programs using rdkit
Hi Hitesh, there is nothing particularly special about an RDKit installation obtained from or built by conda. The command you used in your email created a conda environment. If you go to the directory where you initially *installed* conda there is an envs directory; inside this directory there should be a "my-rdkit-env" directory which contains all components of RDKit. If you activate your conda environment ("source activate my-rdkit-env" on Linux), conda doesn't do much more than add the needed paths from "{CONDA INSTALLATION DIR}/envs/my-rdkit-env" to your shell environment. So just delve into the conda envs/ directory and look at the environment changes conda makes to your shell, and you should get an idea of what to do and how to link to your project. Best, Markus On Fri, Aug 5, 2016 at 11:40 PM, Hitesh Patel wrote: > Hi all, > I have used rdkit from python. But, now I would like to write c++ programs > using rdkit. > I have scientific linux 6.8. Which has python 2.6. I installed python 2.7 > but that doesn't work with sudo privileges. So, I could not install numpy > for python 2.7. Numpy was installed for python 2.6 instead. I tried a lot > in that. But, doesn't look convenient. > Then, I installed anaconda and installed rdkit using > > $ conda create -c https://conda.anaconda.org/rdkit -n my-rdkit-env rdkit > > Is it possible to use this installation to write c++ code? > > I have also installed rdkit in Macbook Pro, OS X 10.10.5 using homebrew. I > haven't tried anything in that to write c++ code. If I get some > instructions for that too, It will be helpful in exploring. > > Thanks > Hitesh Patel > > > > > -- > > Regards, > > Dr. 
Hitesh Patel > Post-Doctoral Fellow, > CADD Group, > National Cancer Institute, > National Institute of Health, > 21702, Frederick, MD > USA > Building 376, Room: 205A > Work: +1 301 846 5993 > Mob.: +1 240 367 5208 > Website: http://www.hiteshpatel379.com/ > Email: hitesh.pa...@nih.gov > > > -- > > ___ > Rdkit-discuss mailing list > Rdkit-discuss@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss > > -- What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic patterns at an interface-level. Reveals which users, apps, and protocols are consuming the most bandwidth. Provides multi-vendor support for NetFlow, J-Flow, sFlow and other flows. Make informed decisions using capacity planning reports. http://sdm.link/zohodev2dev___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
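To make "delve into the envs/ directory" a little more concrete, here is a small Python sketch that lists the directories a C++ build would most likely need from an activated environment. The layout (include/, include/rdkit, lib/ under the environment prefix) is an assumption based on a typical Linux conda environment, and the function name is hypothetical; check the actual directory contents before relying on it.

```python
import os
import sys


def conda_build_paths(prefix=None):
    """Return likely header and library directories of a conda environment."""
    # CONDA_PREFIX is set by "source activate"; fall back to the running Python.
    prefix = prefix or os.environ.get("CONDA_PREFIX", sys.prefix)
    return {
        "include": os.path.join(prefix, "include"),
        "rdkit_include": os.path.join(prefix, "include", "rdkit"),
        "lib": os.path.join(prefix, "lib"),
    }


if __name__ == "__main__":
    for name, path in conda_build_paths().items():
        print(name, path, "(exists)" if os.path.isdir(path) else "(missing)")
```

The include directories would go to the compiler as -I options and the lib directory as a -L option; which RDKit libraries to link against depends on the release and on the parts of the toolkit your program uses.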
Re: [Rdkit-discuss] Some feedback from the Sheffield Cheminformatics Conference
Well, first thing I saw on the lock screen of my alarm clock-ringing iPad the morning after a long night at the Sheffield conference dinner was a reply by Greg on this list sent at 6:48am (it even contained some code). Thanks a lot for your dedication and for building RDKit and its community, Greg. Cheers, Markus - | Markus Sitzmann | markus.sitzm...@gmail.com > On 07.07.2016, at 08:20, Greg Landrum wrote: > > Dear all, > > I was at the Sheffield Cheminformatics conference earlier this week (along > with several people from this list) and I was really struck by the number of > talks and posters that are using the RDKit. By my rough count the RDKit was > used for about 1/3 of the talks and a similar fraction of the posters. > > This of course, makes me smile rather broadly (Christian, Nadine, and Sereina > had to suffer through this while we were waiting at the airport ;-) ) but a > big part of the reason for this success is the engagement and activity of the > RDKit community. So I figured I'd share so that those of you who weren't in > Sheffield also get the chance to grin about it. > > We're having an impact... that's really cool. Thanks! and congrats! :-) > > -greg > > -- > Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San > Francisco, CA to explore cutting-edge tech and listen to tech luminaries > present their vision of the future. This family event has something for > everyone, including kids. Get more information and register today. > http://sdm.link/attshape > ___ > Rdkit-discuss mailing list > Rdkit-discuss@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss -- Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San Francisco, CA to explore cutting-edge tech and listen to tech luminaries present their vision of the future. This family event has something for everyone, including kids. Get more information and register today. 
http://sdm.link/attshape___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] conda build of Release_2016_03_2 failed on Ubuntu 16.04.
Hi Riccardo, Yes, it builds again - thanks a lot for your efforts. Best, Markus - | Markus Sitzmann | markus.sitzm...@gmail.com > On 30.06.2016, at 20:55, Riccardo Vianello > wrote: > > Hi Markus, > > I think the problem should be fixed now. The recipes were building the > cartridge using the earlier release tag, and therefore executing tests that > were not fully up-to-date. Please try again and let me know in case the > problem persisted. > > Best, > Riccardo > > -- Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San Francisco, CA to explore cutting-edge tech and listen to tech luminaries present their vision of the future. This family event has something for everyone, including kids. Get more information and register today. http://sdm.link/attshape___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] conda build of Release_2016_03_2 failed on Ubuntu 16.04.
Hi Riccardo, Thanks for your efforts and sorry that I didn't reply earlier. I am not sure about all the conditions required for this error to occur, but I am glad you can reproduce it. Markus - | Markus Sitzmann | markus.sitzm...@gmail.com > On 30.06.2016, at 08:30, Riccardo Vianello > wrote: > >> On Tue, Jun 28, 2016 at 11:40 PM, Markus Sitzmann >> wrote: >> unfortunately I have another problem - rdkit-postgres isn't building for me >> since the change to Release_2016_03_2. Is that a known problem? > > I tested a couple of full builds and the master branch looks ok, but I could > reproduce this error with the tagged release. I will try to identify the > exact cause. > > Best, > Riccardo >
Re: [Rdkit-discuss] conda build of Release_2016_03_2 failed on Ubuntu 16.04.
Hi, unfortunately I have another problem - rdkit-postgres isn't building for me since the change to Release_2016_03_2. Is that a known problem? Below is the end of the build log. I only let build the py35-part (+ncurses) of the Dockerscript. Thanks & Best, Markus BUILD START: rdkit-postgresql-__conda_version__-py35_1 Fetching package metadata . Solving package specifications: .. + source activate /home/rdkit/miniconda/envs/_build ++ [[ -n 4.1.2(1)-release ]] ++ _SCRIPT_LOCATION=/home/rdkit/miniconda/envs/_build/bin/activate ++ SHELL=bash +++ dirname /home/rdkit/miniconda/envs/_build/bin/activate ++ _CONDA_DIR=/home/rdkit/miniconda/envs/_build/bin ++ '[' 1 -gt 1 ']' ++ case "$(uname -s)" in +++ uname -s ++ EXT= ++ [[ -n 4.1.2(1)-release ]] +++ basename /home/rdkit/miniconda/conda-bld/work/conda_build.sh ++ [[ conda_build.sh == \a\c\t\i\v\a\t\e ]] ++ '[' 1 -eq 0 ']' ++ args=/home/rdkit/miniconda/envs/_build ++ /home/rdkit/miniconda/envs/_build/bin/conda ..checkenv bash /home/rdkit/miniconda/envs/_build ++ (( 0 != 0 )) ++ source /home/rdkit/miniconda/envs/_build/bin/deactivate +++ [[ -n 4.1.2(1)-release ]] +++ _SCRIPT_LOCATION=/home/rdkit/miniconda/envs/_build/bin/deactivate +++ SHELL=bash dirname /home/rdkit/miniconda/envs/_build/bin/deactivate +++ _CONDA_DIR=/home/rdkit/miniconda/envs/_build/bin +++ case "$(uname -s)" in uname -s +++ EXT= +++ [[ 1 > 0 ]] +++ key=/home/rdkit/miniconda/envs/_build +++ case $key in +++ shift +++ [[ 0 > 0 ]] +++ [[ -n 4.1.2(1)-release ]] basename /home/rdkit/miniconda/conda-bld/work/conda_build.sh +++ [[ conda_build.sh == \d\e\a\c\t\i\v\a\t\e ]] +++ [[ -z '' ]] +++ [[ -n 4.1.2(1)-release ]] basename /home/rdkit/miniconda/conda-bld/work/conda_build.sh +++ [[ conda_build.sh == \d\e\a\c\t\i\v\a\t\e ]] +++ return 0 +++ /home/rdkit/miniconda/envs/_build/bin/conda ..activate bash /home/rdkit/miniconda/envs/_build prepending /home/rdkit/miniconda/envs/_build/bin to PATH ++ _NEW_PART=/home/rdkit/miniconda/envs/_build/bin ++ (( 0 == 0 )) ++ 
export CONDA_PATH_BACKUP=/home/rdkit/miniconda/envs/_build/bin:/home/rdkit/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ++ CONDA_PATH_BACKUP=/home/rdkit/miniconda/envs/_build/bin:/home/rdkit/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ++ export CONDA_PS1_BACKUP= ++ CONDA_PS1_BACKUP= ++ export PATH=/home/rdkit/miniconda/envs/_build/bin:/home/rdkit/miniconda/envs/_build/bin:/home/rdkit/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ++ PATH=/home/rdkit/miniconda/envs/_build/bin:/home/rdkit/miniconda/envs/_build/bin:/home/rdkit/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ++ [[ '' == */* ]] ++ export CONDA_DEFAULT_ENV=/home/rdkit/miniconda/envs/_build ++ CONDA_DEFAULT_ENV=/home/rdkit/miniconda/envs/_build ++ firstpath=/home/rdkit/miniconda/envs/_build/bin +++ echo /home/rdkit/miniconda/envs/_build/bin +++ sed 's|/bin$||' ++ export CONDA_PREFIX=/home/rdkit/miniconda/envs/_build +++ /home/rdkit/miniconda/envs/_build/bin/conda ..changeps1 ++ '[' 1 = 1 ']' +++ grep -q CONDA_DEFAULT_ENV ++ export 'PS1=(/home/rdkit/miniconda/envs/_build) ' ++ PS1='(/home/rdkit/miniconda/envs/_build) ' ++ _CONDA_D=/home/rdkit/miniconda/envs/_build/etc/conda/activate.d ++ [[ -d /home/rdkit/miniconda/envs/_build/etc/conda/activate.d ]] ++ unset CONDA_PATH ++ [[ -n 4.1.2(1)-release ]] ++ hash -r + /home/rdkit/miniconda/envs/_build/bin/python /home/rdkit/conda-rdkit/rdkit-postgresql/pkg_version.py + cd /home/rdkit/miniconda/conda-bld/work/Code/PgSQL/rdkit + make gcc -I/home/rdkit/miniconda/envs/_build/include -I/home/rdkit/miniconda/envs/_build/include/rdkit -DRDKITVER='"007300"' -DBUILD_AVALON_SUPPORT -DBUILD_INCHI_SUPPORT -mpopcnt -I. 
-I./ -I/home/rdkit/miniconda/envs/_build/include/postgresql/server -I/home/rdkit/miniconda/envs/_build/include/postgresql/internal -D_GNU_SOURCE -I/home/rdkit/miniconda/envs/_build/include/libxml2 -I/home/rdkit/miniconda/envs/_build/include -fPIC -c -o rdkit_io.o rdkit_io.c gcc -I/home/rdkit/miniconda/envs/_build/include -I/home/rdkit/miniconda/envs/_build/include/rdkit -DRDKITVER='"007300"' -DBUILD_AVALON_SUPPORT -DBUILD_INCHI_SUPPORT -mpopcnt -I. -I./ -I/home/rdkit/miniconda/envs/_build/include/postgresql/server -I/home/rdkit/miniconda/envs/_build/include/postgresql/internal -D_GNU_SOURCE -I/home/rdkit/miniconda/envs/_build/include/libxml2 -I/home/rdkit/miniconda/envs/_build/include -fPIC -c -o mol_op.o mol_op.c mol_op.c: In function 'fmcs_mol2s_transition': mol_op.c:334: warning: initialization makes pointer from integer without a cast mol_op.c:363: warning: initialization makes pointer from integer without a cast mol_op.c: In function 'fmcs_mol_transition': mol_op.c:432: warning: initialization makes pointer from integer without a cast mol_op.c:439: warning: cast from pointer to integer of different size mol_op.c:443: warning: initialization makes pointer from integer without a cast
Re: [Rdkit-discuss] Struggling with apache + rdkit + django
Hi Stephane, Add some Python code to your uwsgi.py file that prints out the environment the Python interpreter sees when it is called by Apache (maybe comment out everything else). It is very likely that Apache calls a different Python interpreter than the one you expect. What Paolo writes is probably the solution to your problem. Markus - | Markus Sitzmann | markus.sitzm...@gmail.com > On 21.06.2016, at 21:24, Michał Nowotka wrote: > > Hi Stéphane, > > Just to let you know about two things: > > 1. ChEMBL web services are a Django application written using RDKit. > We deploy it using gunicorn and Apache through Reverse Proxy and put > on a Virtual Machine named myChEMBL that you can download. Here are > some example configuration files: > https://github.com/chembl/mychembl/tree/master/webservices/conf but > I'm happy to explain more if you want. > > 2. There is a project called Beaker that exposes most of RDKit methods > as RESTful API. The source code is here: > https://github.com/chembl/chembl_beaker and a live instance here: > https://www.ebi.ac.uk/chembl/api/utils/docs > > Kind regards, > > Michał Nowotka > > On Tue, Jun 21, 2016 at 7:46 PM, Téletchéa Stéphane > wrote: >> Le 21/06/2016 20:18, TJ O'Donnell a écrit : >>> I would suggest setting PYTHONPATH in >>> config or ini files for >>> Apache or Django or uwsgi >>> Not sure which is required. 
>> >> Dear all, >> >> This is already indicated using a WSGIProcessGroup: >> >> WSGIDaemonProcess manageLibrary >> python-path=/path/to/project/projets/manageLibrary:/path/to/project/projets/manageLibrary/tools/django1.8/lib/python2.7/site-packages:/path/to/project/projets/manageLibrary/tools/rdkit/lib:/path/to/project/projets/manageLibrary/tools/rdkit/lib/python2.7/site-packages >> display-name=manageLibrary >> WSGIProcessGroup manageLibrary >> WSGIScriptAlias /tools/manageLibrary >> '/path/to/project/projets/manageLibrary/manageLibrary/wsgi.py' >> >> >> See more in detail here: >> https://www.digitalocean.com/community/tutorials/how-to-serve-django-applications-with-apache-and-mod_wsgi-on-ubuntu-14-04 >> >> I have also checked permissions and files with no luck (and no output in >> logs ...). >> >> I may start from scratch with a simple django project to see whether it >> already works there ... >> >> Many thanks, if you have any direction I'll be happy to test, >> >> Stéphane >> >> -- >> Assistant Professor in BioInformatics, UFIP, UMR 6286 CNRS, Team Protein >> Design In Silico >> UFR Sciences et Techniques, 2, rue de la Houssinière, Bât. 25, 44322 Nantes >> cedex 03, France >> Tél : +33 251 125 636 / Fax : +33 251 125 632 >> http://www.ufip.univ-nantes.fr/ - http://www.steletch.org
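The suggestion at the top of this message (print what the interpreter sees when Apache calls it) could look roughly like the sketch below. It is only an assumption of how one might do it: the function name interpreter_report is made up for illustration, and the application callable is a temporary stand-in for the real Django WSGI entry point, not part of anyone's actual configuration.

```python
import os
import sys


def interpreter_report():
    """Return a plain-text summary of the interpreter that is actually running."""
    lines = [
        "executable: %s" % sys.executable,
        "version: %s" % sys.version.replace("\n", " "),
        "prefix: %s" % sys.prefix,
        "sys.path:",
    ]
    lines.extend("  %s" % entry for entry in sys.path)
    lines.append("environment:")
    for key in sorted(os.environ):
        lines.append("  %s=%s" % (key, os.environ[key]))
    return "\n".join(lines) + "\n"


def application(environ, start_response):
    """Hypothetical stand-in WSGI entry point that only serves the report."""
    body = interpreter_report().encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Requesting the WSGI URL once with this in place should show whether the executable and sys.path are the ones you expect (for example, whether the rdkit site-packages directory is really there). The report includes all environment variables, so it is best removed again right after debugging.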
Re: [Rdkit-discuss] Conda and Rdkit 2016-03 Pains
Hi Riccardo, thanks for your reply and all your work. I actually tried over the course of the last few days and again just before I wrote my first email. I do all these builds in a virtual machine (VMware) with just an (almost) up-to-date Docker installation, so it should work (at least I hope so; otherwise it would defeat the purpose of Docker :-) ). Okay, I will stay patient and keep you posted. Best, Markus On Tue, May 3, 2016 at 9:02 AM, Riccardo Vianello wrote: > Hi Markus, > > On Tue, May 3, 2016 at 1:41 AM, Markus Sitzmann > wrote: >> >> thanks for your great software - unfortunately, I have some building >> pains. I recently decided to go from RDKit 2015-03 to 2015-09 (yes I >> was late) , everything still on python 2.7. >> >> As part of this migration I decided to give Conda a try and it worked >> nicely in my Docker container (which is very similar to the >> official Conda RDKit container at >> >> https://github.com/rdkit/conda-rdkit >> >> but starts from Debian Jessie instead of Centos6 - however it still >> clones from this repository). >> >> >> Unfortunately, since you switched to RDKit 2016-03 my troubles began. > > > A set of changes has recently been merged into the conda-rdkit development > branch in order to re-sync it with the rdkit master branch. If your tests > with the development branch are older than just a few days, then you might > want to try that again (and I would actually be interested to know in case > the problems persist). Please note that the current tip of the rdkit > master branch already includes a few additions/changes compared to the > latest release. > > I am also preparing a PR that will fully update the conda-rdkit master > branch to the current 2016.03.1 release. I am about to run some final tests, > but I think it should hopefully be ready between today and tomorrow. 
> > Best, > Riccardo
[Rdkit-discuss] Conda and Rdkit 2016-03 Pains
Hi Greg and everybody involved, thanks for your great software - unfortunately, I have some building pains. I recently decided to go from RDKit 2015-03 to 2015-09 (yes, I was late), everything still on Python 2.7. As part of this migration I decided to give Conda a try, and it worked nicely in my Docker container (which is very similar to the official Conda RDKit container at https://github.com/rdkit/conda-rdkit but starts from Debian Jessie instead of CentOS 6 - however, it still clones from this repository). Unfortunately, since you switched to RDKit 2016-03 my troubles began. I know you are still working on this, but as soon as RDKit starts to build in the container, the build process breaks. If I go back to revision 56c3a779f873c4e6f6dbbdc87d67d106f04c140d (the last one before RDKit 2016-03 occurs) it at least builds the Python 2.7 part again but breaks later for Python 3.4 and 3.5. Just to maybe get a clue about what's wrong, I started playing around with the original Docker build on CentOS 6 (i.e. the original Dockerfile), but I observe the same behavior - the build breaks somewhere. And even there, when I go back to revision 56c3a779f873c4e6f6dbbdc87d67d106f04c140d (i.e. replace the word "development" by this revision number in line 26 of the Dockerfile and uncomment the line - otherwise it is unchanged), the build breaks after the Python 2.7 part is finished (I attach the end of the build log below). Is that something you are aware of? Or is this a problem only I observe? I can also give more documentation if this is needed; however, I just wanted to get a first opinion. I also already tried builds with the development branch (besides the master branch, of course); unfortunately, they break too. 
Thanks a lot, Markus BUILD END: rdkit-postgresql-2015.09.2-py27_1 Nothing to test for: rdkit-postgresql-2015.09.2-py27_1 # If you want to upload this package to anaconda.org later, type: # # $ anaconda upload /home/rdkit/miniconda/conda-bld/linux-64/rdkit-postgresql-2015.09.2-py27_1.tar.bz2 # # To have conda build upload to anaconda.org automatically, use # $ conda config --set anaconda_upload yes ---> 5898063548d9 Removing intermediate container 8af783478ecc Step 24 : RUN CONDA_PY=34 conda build boost --quiet --no-anaconda-upload ---> Running in 01043f80baa4 Using Anaconda Cloud api site https://api.anaconda.org Removing old build environment Removing old work directory BUILD START: boost-1.56.0-py34_3 Fetching package metadata: .. Solving package specifications: The following specifications were found to be in conflict: - rdkit (target=rdkit-2015.09.2-np110py27_0.tar.bz2) -> boost ==1.56.0 - rdkit (target=rdkit-2015.09.2-np110py27_0.tar.bz2) -> python 2.7* - zlib Use "conda info " to see the dependencies for each package. Missing dependency boost, but found recipe directory, so building boost first Error: The following specifications were found to be in conflict: - rdkit (target=rdkit-2015.09.2-np110py27_0.tar.bz2) -> boost ==1.56.0 - rdkit (target=rdkit-2015.09.2-np110py27_0.tar.bz2) -> python 2.7* - zlib Use "conda info " to see the dependencies for each package. The command '/bin/sh -c CONDA_PY=34 conda build boost --quiet --no-anaconda-upload' returned a non-zero code: 1
Re: [Rdkit-discuss] Stereochemistry - Differences between RDKit & Indigo
Hi James, I know that my opinion might sound extreme, but I have had this discussion many times (mostly regarding tautomerism, which is, however, similar in some ways). The problem is that you can look at a chemical structure in many different ways - two scenarios are: 1. What can I perceive from a chemical structure if all I have is the pure connection table and nothing else (and maybe millions of them)? 2. What can I find out about a particular structure if I can run fully fledged quantum-mechanical calculations, do an extensive literature search, and/or have carefully measured experimental data and conditions (rarely in the millions :-))? So, if I deal with something like implementing RDKit, things are probably always quite close to scenario 1, hence my suggestion to disregard stereochemistry on these types of N atoms (you need a lot of information from scenario 2 to even decide whether there is stereochemistry or not). The ideal solution, of course, would be to offer three different modes for stereo perception: "disregard", "keep", and "perceive" from 3D (I am not sure if Greg likes that :-)). If these three modes were available, I would still suggest setting the default to "disregard" for 3-coordinated N, because the other two modes require that you know what you are doing and/or have full trust in your data - otherwise you probably do more harm than good. Best, Markus On Fri, Aug 21, 2015 at 3:10 PM, James Davidson wrote: > Hi Greg (and Markus, Peter, et al.), > > > > Personal opinion – my vote would be to always keep the chiral information at > 3-valent nitrogen centres… > > As Peter pointed-out, there are bridgehead examples (most of which, I guess, > will have additional carbon chiral centres – and offer diastereomeric > considerations). 
> > There are also, I believe, some nice oxaziridine examples where the > oxaziridine N is the only chiral centre present (interpreted from abstract > here: http://dx.doi.org/10.1039/C3985998): > > > > 3,3-dimethyl (2S)-2-tert-butyloxaziridine-3,3-dicarboxylate > > COC(=O)C1(O[N@]1C(C)(C)C)C(=O)OC > > > > and many other examples of diastereomeric oxaziridines – where the N is a > chiral centre – eg see http://dx.doi.org/10.1016/j.tetasy.2008.09.016 > > > > > > Kind regards > > > > James
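The "disregard" mode suggested in this message can be sketched with the RDKit Python API as below. This is a hedged illustration only: the helper name strip_trivalent_n_stereo is invented here and is not an RDKit function, it needs RDKit installed, and how much nitrogen stereo survives SMILES parsing in the first place may differ between RDKit versions. A "keep" mode would simply skip the call.

```python
from rdkit import Chem


def strip_trivalent_n_stereo(mol):
    """Clear chiral tags on three-coordinate nitrogen atoms, in place.

    Hypothetical helper: returns the number of atoms whose tag was cleared.
    """
    cleared = 0
    for atom in mol.GetAtoms():
        # GetTotalDegree() counts hydrogens too, so a protonated [N@@H+]
        # with three heavy neighbours (four-coordinate) is left untouched.
        if atom.GetAtomicNum() == 7 and atom.GetTotalDegree() == 3:
            if atom.GetChiralTag() != Chem.ChiralType.CHI_UNSPECIFIED:
                atom.SetChiralTag(Chem.ChiralType.CHI_UNSPECIFIED)
                cleared += 1
    return cleared


# Ring example taken from Greg's message in this thread.
mol = Chem.MolFromSmiles("C[N@@]1CC[C@@H](C)CC1")
strip_trivalent_n_stereo(mol)
smiles = Chem.MolToSmiles(mol)
```

Note that this blanket rule would also erase the oxaziridine and bridgehead cases James and Peter mention, which is exactly the trade-off discussed above.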
Re: [Rdkit-discuss] Stereochemistry - Differences between RDKit & Indigo
Hmm, well - probably not, you mention the always present exception in chemistry, Peter (Sulfoxides have a similar situation, stereochemistry from lone pairs). But generally I still think it is more dangerous to keep or even perceive (from 3D) stereochemistry on three-coordinated N - you will do more harm with this than fix things. On Thu, Aug 20, 2015 at 6:40 PM, Peter Shenkin wrote: > "My initial answer, and I would love input on this, is that three-coordinate > N should always have stereochemistry removed." > > Umm... even if it's a bridgehead? > > -P. > > On Thu, Aug 20, 2015 at 10:30 AM, Greg Landrum > wrote: >> >> This isn't a simple one, so it may take a bit to get to an answer that's >> comprehensible. >> >> There are two things going on here in the RDKit: >> 1) Ring stereochemistry >> 2) stereochemistry about nitrogen centers >> >> Let's start with the second, because it's easier: RDKit does not generally >> "believe in" stereochemistry around three coordinate nitrogens. Here's a >> very simple example: >> In [45]: m3 = Chem.MolFromSmiles('Br[N@](F)Cl') >> >> In [46]: Chem.MolToSmiles(m3,isomericSmiles=True) >> Out[46]: 'FN(Cl)Br' >> >> >> The 3D equivalent of that: >> In [41]: m = Chem.MolFromSmiles('BrN(F)Cl') >> >> In [42]: AllChem.EmbedMolecule(m) >> Out[42]: 0 >> >> In [43]: Chem.AssignAtomChiralTagsFromStructure(m) >> >> In [44]: Chem.MolToSmiles(m,isomericSmiles=True) >> Out[44]: 'FN(Cl)Br' >> >> Contrast this with what you get for a carbon: >> >> In [34]: m2 = Chem.MolFromSmiles('FC(Br)(Cl)I') >> >> In [35]: AllChem.EmbedMolecule(m2) >> Out[35]: 0 >> >> In [36]: Chem.AssignAtomChiralTagsFromStructure(m2) >> >> In [37]: Chem.MolToSmiles(m2,isomericSmiles=True) >> Out[37]: 'F[C@](Cl)(Br)I' >> >> >> Back to the first: ring stereochemistry. By this I mean things like >> C[C@H]1CC[C@@H](C)CC1 - molecules where the stereochemistry information is >> really about whether the substituents of the ring are cis or trans relative >> to the ring plane. 
>> >> The way the RDKit handles this is something of a hack: it doesn't identify >> those atoms as chiral centers, but it does preserve the chiral tags when >> generating a canonical SMILES: >> >> In [47]: m = Chem.MolFromSmiles('C[C@H]1CC[C@@H](C)CC1') >> >> In [48]: Chem.FindMolChiralCenters(m) >> Out[48]: [] >> >> In [49]: Chem.MolToSmiles(m,isomericSmiles=True) >> Out[49]: 'C[C@H]1CC[C@@H](C)CC1' >> >> Curiously, to me at least, it does the same thing with nitrogens; >> >> In [52]: m2 = Chem.MolFromSmiles('C[N@@]1CC[C@@H](C)CC1') >> >> In [53]: Chem.MolToSmiles(m2,isomericSmiles=True) >> Out[53]: 'C[C@H]1CC[N@](C)CC1' >> >> Lest anyone think that this might make sense because being a ring makes >> inversion more difficult, that's not what is going on here. If I make the >> ring truly chiral, then the stereochemistry of the N is removed: >> >> In [54]: m3 = Chem.MolFromSmiles('C[N@@]1CO[C@@H](C)CC1') >> >> In [55]: Chem.MolToSmiles(m3,isomericSmiles=True) >> Out[55]: 'C[C@H]1CCN(C)CO1' >> >> I believe that this inconsistent behavior is a bug: either N should always >> have the input stereochemistry preserved (and that should be perceived from >> the 3D coordinates) or it should never have the input stereochemistry >> preserved. My initial answer, and I would love input on this, is that >> three-coordinate N should always have stereochemistry removed. >> >> -greg >> >> >> >> On Thu, Aug 20, 2015 at 2:22 PM, Rob Smith wrote: >>> >>> Hi Greg, >>> >>> I've attached the SDF that Corina generates. I'm not convinced it is a >>> problem, more an observation that I'm trying to understand. >>> >>> Looking at the results again today - it seems that from the Corina output >>> Indigo is interpreting the conformer (including whether the ethyl >>> substituent on the piperidine nitrogen is equatorial or axial) - and >>> outputting a canonical smiles string that has the conformer "encoded" in it >>> (using the chiral flags). 
Whereas RDKit is reading in the Corina output, >>> "discounting" whether the nitrogen is axial or equatorial (which due to >>> inversion I can understand) and interpreting it as having only two chiral >>> centers (which is correct). >>> >>> What is confusing me, is that when I supply RDKit with the canonical >>> smiles string from Indigo (which has the conformer "encoded" in it), and >>> then ask for the isomeric canonical smiles, it supplies the canonical smiles >>> with the conformer still "encoded" within it. >>> >>> For example, I read in the following canonical smiles string into RDKit: >>> CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1 (which was generated by reading >>> in one of the mols in the SD File into RDKit and output the isomeric >>> canonical smiles), running the FindMolChiralCenters on this molecule, >>> correctly reports the number of chiral centres to be 2 (6S, 9R), and then >>> asking it to output the canonical smiles string (with isomericSmiles=True) >>> gives CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1 (1). >>> >>> If I take the same
Re: [Rdkit-discuss] Stereochemistry - Differences between RDKit & Indigo
I agree with removing it - the chance that you destroy actual information by this is low - or in other words, I would expect the chance that stereo information on three-coordinate N is spurious to be high. Markus On Thu, Aug 20, 2015 at 4:30 PM, Greg Landrum wrote: > This isn't a simple one, so it may take a bit to get to an answer that's > comprehensible. > > There are two things going on here in the RDKit: > 1) Ring stereochemistry > 2) stereochemistry about nitrogen centers > > Let's start with the second, because it's easier: RDKit does not generally > "believe in" stereochemistry around three coordinate nitrogens. Here's a > very simple example: > In [45]: m3 = Chem.MolFromSmiles('Br[N@](F)Cl') > > In [46]: Chem.MolToSmiles(m3,isomericSmiles=True) > Out[46]: 'FN(Cl)Br' > > > The 3D equivalent of that: > In [41]: m = Chem.MolFromSmiles('BrN(F)Cl') > > In [42]: AllChem.EmbedMolecule(m) > Out[42]: 0 > > In [43]: Chem.AssignAtomChiralTagsFromStructure(m) > > In [44]: Chem.MolToSmiles(m,isomericSmiles=True) > Out[44]: 'FN(Cl)Br' > > Contrast this with what you get for a carbon: > > In [34]: m2 = Chem.MolFromSmiles('FC(Br)(Cl)I') > > In [35]: AllChem.EmbedMolecule(m2) > Out[35]: 0 > > In [36]: Chem.AssignAtomChiralTagsFromStructure(m2) > > In [37]: Chem.MolToSmiles(m2,isomericSmiles=True) > Out[37]: 'F[C@](Cl)(Br)I' > > > Back to the first: ring stereochemistry. By this I mean things like > C[C@H]1CC[C@@H](C)CC1 - molecules where the stereochemistry information is > really about whether the substituents of the ring are cis or trans relative > to the ring plane. 
> > The way the RDKit handles this is something of a hack: it doesn't identify > those atoms as chiral centers, but it does preserve the chiral tags when > generating a canonical SMILES: > > In [47]: m = Chem.MolFromSmiles('C[C@H]1CC[C@@H](C)CC1') > > In [48]: Chem.FindMolChiralCenters(m) > Out[48]: [] > > In [49]: Chem.MolToSmiles(m,isomericSmiles=True) > Out[49]: 'C[C@H]1CC[C@@H](C)CC1' > > Curiously, to me at least, it does the same thing with nitrogens; > > In [52]: m2 = Chem.MolFromSmiles('C[N@@]1CC[C@@H](C)CC1') > > In [53]: Chem.MolToSmiles(m2,isomericSmiles=True) > Out[53]: 'C[C@H]1CC[N@](C)CC1' > > Lest anyone think that this might make sense because being a ring makes > inversion more difficult, that's not what is going on here. If I make the > ring truly chiral, then the stereochemistry of the N is removed: > > In [54]: m3 = Chem.MolFromSmiles('C[N@@]1CO[C@@H](C)CC1') > > In [55]: Chem.MolToSmiles(m3,isomericSmiles=True) > Out[55]: 'C[C@H]1CCN(C)CO1' > > I believe that this inconsistent behavior is a bug: either N should always > have the input stereochemistry preserved (and that should be perceived from > the 3D coordinates) or it should never have the input stereochemistry > preserved. My initial answer, and I would love input on this, is that > three-coordinate N should always have stereochemistry removed. > > -greg > > > > On Thu, Aug 20, 2015 at 2:22 PM, Rob Smith wrote: >> >> Hi Greg, >> >> I've attached the SDF that Corina generates. I'm not convinced it is a >> problem, more an observation that I'm trying to understand. >> >> Looking at the results again today - it seems that from the Corina output >> Indigo is interpreting the conformer (including whether the ethyl >> substituent on the piperidine nitrogen is equatorial or axial) - and >> outputting a canonical smiles string that has the conformer "encoded" in it >> (using the chiral flags). 
Whereas RDKit is reading in the Corina output, >> "discounting" whether the nitrogen is axial or equatorial (which due to >> inversion I can understand) and interpreting it as having only two chiral >> centers (which is correct). >> >> What is confusing me, is that when I supply RDKit with the canonical >> smiles string from Indigo (which has the conformer "encoded" in it), and >> then ask for the isomeric canonical smiles, it supplies the canonical smiles >> with the conformer still "encoded" within it. >> >> For example, I read in the following canonical smiles string into RDKit: >> CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1 (which was generated by reading >> in one of the mols in the SD File into RDKit and output the isomeric >> canonical smiles), running the FindMolChiralCenters on this molecule, >> correctly reports the number of chiral centres to be 2 (6S, 9R), and then >> asking it to output the canonical smiles string (with isomericSmiles=True) >> gives CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1 (1). >> >> If I take the same mol file, read it into Indigo, and ask it to output the >> canonical smiles string, I get: CC(C)[C@H]1CC[N@H+]1[C@@H]1CC[N@@](CC1)CC, >> if I read this smiles string into RDKit and run FindMolCenters on it, I get >> (3R, 6S) - which is fine, if I then out the canonical smiles (again with >> isomericSmiles=True) I get CC[N@]1CC[C@@H]([N@@H+]2CC[C@@H]2C(C)C)CC1. I >> expected this isomeric canonical smiles to be the same as (1), however RDKit >> appears to conserve the conformer
Re: [Rdkit-discuss] Stereochemistry - Differences between RDKit & Indigo
Hehe, that is why I keep my computers always really cold when I run RDKit ... - | Markus Sitzmann | markus.sitzm...@gmail.com > On 20.08.2015, at 04:33, Peter Shenkin wrote: > > Maybe when you have a toolkit as blazingly fast as RDKit it captures the > chirality of N center before it has time to interconvert > > -P. > >> On Wed, Aug 19, 2015 at 10:17 PM, John M wrote: >> More odd is the carbon stereocentre with two methyls... >> >> Generally trivalent nitrogens are not considered chiral due to inversion of >> the lone-pair. The two usual exceptions are when they are a bridgehead or in >> a tight ring (cyclopropane). This is the same in most toolkits, the InChI >> technical documentation provides useful examples. >> >> InChI actually only sees one stereo centre since it strips the proton off: >> InChI=1S/C13H26N2/c1-4-14-8-5-12(6-9-14)15-10-7-13(15)11(2)3/h11-13H,4-10H2,1-3H3/p+1/t13-/m1/s1 >> >> It may well be chiral in this case but since it's not you should also >> strictly remove the other stereocentre in the para position to the nitrogen >> >> For the record just tested and ChemAxon/CDK/OpenBabel do the same. >> >> John >> >> Regards, >> John W May >> john.wilkinson...@gmail.com >> >>> On 19 August 2015 at 09:00, Rob Smith wrote: >>> Dear RDKit community, >>> >>> I'm trying to use RDKit to read in Corina generated stereoisomers (from a >>> Mol file), assign chiral tags and stereochemistry to the structure and >>> output the canonical smiles string for each isomer of a given molecule (in >>> Python), when I do this, half the canonical smiles strings are not unique. >>> >>> When I read in the output from Corina into an Indigo instance, then use the >>> canonical smiles from Indigo to create an RDKit molecule, canonical smiles >>> strings generated from the molecule objects are all unique. 
>>> >>> I may be missing an option to enable RDKit to 'visualise' the chiral centre >>> adjacent to the protonated nitrogen, so if someone can spot where I've made >>> a mistake, I'd really appreciate it. I've included the output and Python >>> script below. If you require any further information, please let me know. >>> >>> Many thanks, >>> Rob >>> >>> Output: >>> >>> RDKit Read in of Molecule >>> RDKit Output - CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1 >>> RDKit Output - CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1 >>> RDKit Output - CCN1CC[C@@H]([N@H+]2CC[C@@H]2[C@H](C)C)CC1 >>> RDKit Output - CCN1CC[C@@H]([N@H+]2CC[C@@H]2[C@H](C)C)CC1 >>> RDKit Output - CCN1CC[C@@H]([N@@H+]2CC[C@H]2[C@H](C)C)CC1 >>> RDKit Output - CCN1CC[C@@H]([N@@H+]2CC[C@H]2[C@H](C)C)CC1 >>> RDKit Output - CCN1CC[C@@H]([N@H+]2CC[C@H]2[C@H](C)C)CC1 >>> RDKit Output - CCN1CC[C@@H]([N@H+]2CC[C@H]2[C@H](C)C)CC1 >>> >>> INDIGO Read in of Molecule >>> RDKit Output - CC[N@]1CC[C@@H]([N@@H+]2CC[C@@H]2C(C)C)CC1 >>> RDKit Output - CC[N@]1CC[C@H]([N@@H+]2CC[C@@H]2C(C)C)CC1 >>> RDKit Output - CC[N@]1CC[C@@H]([N@H+]2CC[C@@H]2C(C)C)CC1 >>> RDKit Output - CC[N@]1CC[C@H]([N@H+]2CC[C@@H]2C(C)C)CC1 >>> RDKit Output - CC[N@]1CC[C@@H]([N@@H+]2CC[C@H]2C(C)C)CC1 >>> RDKit Output - CC[N@]1CC[C@H]([N@@H+]2CC[C@H]2C(C)C)CC1 >>> RDKit Output - CC[N@]1CC[C@@H]([N@H+]2CC[C@H]2C(C)C)CC1 >>> RDKit Output - CC[N@]1CC[C@H]([N@H+]2CC[C@H]2C(C)C)CC1 >>> >>> Python script : >>> >>> from rdkit import Chem >>> import subprocess # Used to run Corina >>> from indigo import * >>> >>> def runCorinaTest(inputMol): >>> indigo = Indigo() >>> >>> molFile = Chem.MolToMolBlock(inputMol) >>> >>> corinaCommand = "echo \'" + molFile + "\' | " >>> # Then Corina - generate stereoisomers... 
>>> corinaCommand = corinaCommand + "/apps/corina/corina -t n -d >>> canon,stergen,preserve,names,wh,flapn,msc=7,msi=128 -i t=sdf" >>> corinaResult = subprocess.check_output([corinaCommand], shell=True) # >>> Gives the stereoisomer species as an SDF string >>> >>> allMoleculeObjects = [] >>> allMolecules = corinaResult.split("\n") # Separate Corina output >>> into individual molecules >>> allMolecules = allMolecules[0:len(allMolecules)-1] >>> >>
Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't
We could consider some quantum-mechanical calculations ... well, I always hated this discussion when I heard for my web service with millions of structures, I should consider quantum-mechanical calculations as part of the structure normalization/canonicalization ;-) On Wed, Jun 17, 2015 at 8:22 AM, Peter Shenkin wrote: > Hi, Greg, > > Within the SMILES framework, it seems to me that if you allow the atoms to > be aromatic, then these are two Kekule structures of the same aromatic > system, and however you do the canonicalization, they ought to canonicalize > to the same structure, which the two examples did not do. I don't think you > addressed this. > > I think now that there is no issue with having a double bond between two > aromatic atoms beyond our preconceptions. If that is a problem, you could > Kekulize it per your first picture, (though perhaps that is inconvenient in > the context of the implementation). > > I actually didn't realize why aromaticity (particularly the double bond) > made sense when I originally wrote, so the above is with the benefit of > hindsight, and your comments. > > I think the molecule is entertaining in several ways. In the cubane > geometry, the molecule cannot be conventionally aromatic. Might it actually > be antiaromatic? Could there be two forms? > > Dunno > -P. > > > On Wed, Jun 17, 2015 at 1:25 AM, Greg Landrum > wrote: >> >> >> The problematic part of your two molecules can be reduced to: >> [image: Inline image 3] >> and >> [image: Inline image 4] >> That second one shows the kekulized form that the RDKit ends up using. 
>> >> These produce the following canonical SMILES: >> >> In [31]: Chem.CanonSmiles('C1=CC2=CC=C12') >> Out[31]: 'c1cc2ccc1-2' >> >> In [32]: Chem.CanonSmiles('C1=CC2=C1C=C2') >> Out[32]: 'c1cc2ccc1=2' >> >>
Re: [Rdkit-discuss] Test #91 failing for RDkit 2015_03_1
The same happened to me for the previous version of RDKit when I compiled it in a Docker container. I had to install Pillow first, too - probably they are trying to keep those Ubuntu or CentOS versions more minimal when they are used in VMs or Docker. Markus - | Markus Sitzmann | markus.sitzm...@gmail.com > On 26.05.2015, at 18:08, Michał Nowotka wrote: > > Sorry, for the hassle, this has now been fixed. After running 'ctest > -R pythonTestDirChem -V' I've noticed that Pillow/PIL is missing. > >> On Tue, May 26, 2015 at 4:51 PM, Michał Nowotka wrote: >> Hi, >> >> We are trying to compile latest (2015_03_1) RDKit version on myChEMBL VMs. >> Unfortunately when running tests, the last one fails: >> >> - >> >> 91/91 Test #91: pythonTestDirChem ***Failed 36.43 sec >> >> 99% tests passed, 1 tests failed out of 91 >> >> Total Test time (real) = 119.76 sec >> >> The following tests FAILED: >> 91 - pythonTestDirChem (Failed) >> Errors while running CTest >> >> - >> >> This happens on Ubuntu 14.04 LTS and CentOS 7. >> Is it something serious, can this be fixed? >> >> Kind regards, >> >> Michał Nowotka
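Since the failure in this thread came down to a missing Pillow, a quick pre-flight check before running ctest on a minimal VM or Docker image might save a round trip. The sketch below is just one way to do that with the standard library; the function name missing_modules is made up, and the list of modules to check is an assumption (only PIL is confirmed by this thread).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that the import system cannot find."""
    # Only top-level names are intended here; dotted names would import
    # their parent package as a side effect.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "PIL" is the import name provided by the Pillow package.
    missing = missing_modules(["PIL"])
    if missing:
        print("Missing before running the RDKit tests: " + ", ".join(missing))
    else:
        print("All checked modules are importable.")
```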
Re: [Rdkit-discuss] SDF properties in case of error
If you (ab)use ErrorMolecule to keep or add garbage into your future blockbuster drug molecule set, it is your own problem. And if you rely on the correctness of the SD file reader of any software as part of the quality assurance in your drug pipeline process, I am quite positive you are doing something wrong. On Sun, May 3, 2015 at 7:04 PM, Dimitri Maziuk wrote: > On 2015-05-03 03:56, Markus Sitzmann wrote: >> No, "cutting out a chunk of lines from a file" might be simple, but >> can become an expensive operation if you want to deal with thousands >> of files and million of records. > > *If you have the line numbers* it's something like "head | tail" or a > 2-line for loop w/ line counter. > > If it's not a one-off and your upstream keeps generating junk, the > proper solution is to "have a talk" with them. > > The worst possible solution is to happily generate a garbage molecule > that will blow up user's entire downstream pipeline. *If they're lucky* > -- most likely it'll be garbage in - garbage out and crap happily flows > on to the next stage. If ErrorMolecule "is a" Molecule that will happen. > > I most emphatically do not want to take any drug developed using that > kind of software quality assurance and error control procedures. Or have > any new material developed like that anywhere near my bike, car, or > diving gear. And so on. > > Dimitri
Re: [Rdkit-discuss] SDF properties in case of error
No, "cutting out a chunk of lines from a file" might be simple, but it can become an expensive operation if you have to deal with thousands of files and millions of records. That is one of the reasons why I (unfortunately) couldn't consider RDKit any further for one of my projects a few years ago. So, I support Michael's idea :-) On Sat, May 2, 2015 at 12:17 AM, Dimitri Maziuk wrote: > On 04/30/2015 05:01 PM, Michael Reutlinger wrote: > >> However, in some cases this does not help. E.g. when an unknown atom (most >> of the time this is X) is found in the MolBlock the import fails with an >> Post-condition Violation and None is yielded. This is fine to detect the >> problem BUT it is impossible to get any information about the molecule >> which failed. > > I'd say the best you can do skip over to the next molecule and report > "molecule in lines X to Y is corrupt". Cutting out a chunk of lines from > a file is trivial, and if you're reading from a stream rather than a > file then, well, don't. Without a valid mol block you don't have a > molecule and you shouldn't be making one up. As in "conservative in what > you produce". > > -- > Dimitri Maziuk > Programmer/sysadmin > BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
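The "skip to the next molecule and report lines X to Y" idea from the quoted message can be sketched in a single streaming pass, which also keeps the raw record text so that its data fields stay available when parsing fails. This is only a minimal sketch under stated assumptions: `iter_sdf_records`, `failed_records`, and `sdf_properties` are hypothetical helper names (not RDKit functions), and `parse` is any callable that returns None or raises on a bad record; with RDKit one could presumably pass `Chem.MolFromMolBlock`.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Record = Tuple[int, int, str]  # (first line, last line, raw text), 1-based


def iter_sdf_records(lines: Iterable[str]) -> Iterator[Record]:
    """Yield one record per '$$$$' terminator, with its line range."""
    buf: List[str] = []
    start, lineno = 1, 0
    for lineno, line in enumerate(lines, start=1):
        buf.append(line.rstrip("\n"))
        if line.strip() == "$$$$":
            yield start, lineno, "\n".join(buf)
            buf, start = [], lineno + 1
    if any(s.strip() for s in buf):  # trailing record without terminator
        yield start, lineno, "\n".join(buf)


def failed_records(lines: Iterable[str],
                   parse: Callable[[str], object]) -> List[Record]:
    """Return the records for which parse() returns None or raises."""
    bad: List[Record] = []
    for start, end, text in iter_sdf_records(lines):
        try:
            mol = parse(text)
        except Exception:
            mol = None
        if mol is None:
            bad.append((start, end, text))
    return bad


def sdf_properties(text: str) -> Dict[str, str]:
    """Read '> <NAME>' data items from the raw text of one record."""
    lines = text.split("\n")
    # Only look after the mol block, if its end marker is present.
    first = next((i + 1 for i, s in enumerate(lines)
                  if s.startswith("M  END")), 0)
    props: Dict[str, str] = {}
    i = first
    while i < len(lines):
        line = lines[i]
        if line.startswith(">") and "<" in line:
            name = line[line.index("<") + 1:line.rindex(">")]
            j, values = i + 1, []
            while j < len(lines) and lines[j].strip() not in ("", "$$$$"):
                values.append(lines[j])
                j += 1
            props[name] = "\n".join(values)
            i = j
        else:
            i += 1
    return props
```

Because the records are produced by a generator, the same code should work on an open file handle or a gzip stream without loading the whole file, which is the cost concern raised above.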
Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !
Sorry, looks like my baby is getting old ... :-) Markus On Wed, Feb 25, 2015 at 7:26 AM, Greg Landrum wrote: > To close the loop here: after an email exchange with Marc Nicklaus and Wolf > Ihlenfeldt, it looks like the problem is that the NCI website is using an > older version of the CACTVS toolkit to do the SMILES->InChI conversion. That > older version contains a bug that has since been fixed. Marc is now aware of > the problem. > > The RDKit was, at least in this case, not responsible for the bad InChIs. > :-) > > Best, > -greg > > > > > On Tue, Feb 24, 2015 at 8:27 AM, Greg Landrum > wrote: >> >> >> The InChIs have me confused. >> >> I'm going to simplify the below by just showing the input SMILES, the >> current (=master) RDKit InChI and the PubChem InChI >> >> On Mon, Feb 23, 2015 at 10:54 AM, JP wrote: >>> >>> >>> Here is the list (first inchi is the 2014_09_2, second one is the >>> 2015.03.1pre generated one, third inchi is the cactus.nci.nih.gov): >>> >>> O=C(/N=c1/[nH]ncs1)[C@H]1CC[C@H](Cn2cnc3c3c2=O)CC1 >>> >>> InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)/t12-,13- >>> # RDKit 2015.03.1pre >>> >>> InChI=1S/C18H29N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h12-15,19-20H,1-11H2,(H,21,22,24)/t12-,13-,14?,15? >>> # cactus.nci.nih.gov >>> >>> O=C(/N=c1\[nH]c(-c2n2)cs1)[C@H]1CC[C@H](Cn2cnc3c3c2=O)CC1 >>> InChI=1S/C24H23N5O2S/c30-22(28-24-27-21(14-32-24)20-7-3-4-12-25-20)17-10-8-16(9-11-17)13-29-15-26-19-6-2-1-5-18(19)23(29)31/h1-7,12,14-17H,8-11,13H2,(H,27,28,30)/t16-,17- >>> >>> InChI=1S/C24H39N5O2S/c30-22(28-24-27-21(14-32-24)20-7-3-4-12-25-20)17-10-8-16(9-11-17)13-29-15-26-19-6-2-1-5-18(19)23(29)31/h16-21,25-26H,1-15H2,(H,27,28,30)/t16-,17-,18?,19?,20?,21? 
>>> >>> CCOC(=O)Cc1cs/c(=N/C(=O)[C@H]2CC[C@H](Cn3cnc4c4c3=O)CC2)[nH]1 >>> InChI=1S/C23H26N4O4S/c1-2-31-20(28)11-17-13-32-23(25-17)26-21(29)16-9-7-15(8-10-16)12-27-14-24-19-6-4-3-5-18(19)22(27)30/h3-6,13-16H,2,7-12H2,1H3,(H,25,26,29)/t15-,16- >>> >>> InChI=1S/C23H36N4O4S/c1-2-31-20(28)11-17-13-32-23(25-17)26-21(29)16-9-7-15(8-10-16)12-27-14-24-19-6-4-3-5-18(19)22(27)30/h15-19,24H,2-14H2,1H3,(H,25,26,29)/t15-,16-,17?,18?,19? >>> >>> COCc1n[nH]/c(=N/C(=O)[C@H]2CC[C@H](Cn3cnc4c4c3=O)CC2)s1 >>> InChI=1S/C20H23N5O3S/c1-28-11-17-23-24-20(29-17)22-18(26)14-8-6-13(7-9-14)10-25-12-21-16-5-3-2-4-15(16)19(25)27/h2-5,12-14H,6-11H2,1H3,(H,22,24,26)/t13-,14- >>> >>> InChI=1S/C20H33N5O3S/c1-28-11-17-23-24-20(29-17)22-18(26)14-8-6-13(7-9-14)10-25-12-21-16-5-3-2-4-15(16)19(25)27/h13-17,21,23H,2-12H2,1H3,(H,22,24,26)/t13-,14-,15?,16?,17? >>> >>> COC(=O)c1[nH]/c(=N\C(=O)[C@H]2CC[C@H](Cn3cnc4c4c3=O)CC2)sc1C(C)C >>> InChI=1S/C24H28N4O4S/c1-14(2)20-19(23(31)32-3)26-24(33-20)27-21(29)16-10-8-15(9-11-16)12-28-13-25-18-7-5-4-6-17(18)22(28)30/h4-7,13-16H,8-12H2,1-3H3,(H,26,27,29)/t15-,16- >>> >>> InChI=1S/C24H38N4O4S/c1-14(2)20-19(23(31)32-3)26-24(33-20)27-21(29)16-10-8-15(9-11-16)12-28-13-25-18-7-5-4-6-17(18)22(28)30/h14-20,25H,4-13H2,1-3H3,(H,26,27,29)/t15-,16-,17?,18?,19?,20? >>> >>> CC(C)[C@H]1CC[C@H](C(=O)N[C@H](Cc2c2)C(=O)/N=c2\[nH]ncs2)CC1 >>> InChI=1S/C21H28N4O2S/c1-14(2)16-8-10-17(11-9-16)19(26)23-18(12-15-6-4-3-5-7-15)20(27)24-21-25-22-13-28-21/h3-7,13-14,16-18H,8-12H2,1-2H3,(H,23,26)(H,24,25,27)/t16-,17-,18-/m1/s1 >>> >>> InChI=1S/C21H36N4O2S/c1-14(2)16-8-10-17(11-9-16)19(26)23-18(12-15-6-4-3-5-7-15)20(27)24-21-25-22-13-28-21/h14-18,22H,3-13H2,1-2H3,(H,23,26)(H,24,25,27)/t16-,17-,18-/m1/s1 >> >> >> If you look in the formula layer for the InChIs from PubChem, you will see >> that they all have *way* too many H atoms. I think there's something about >> the structures that is confusing the pubchem/cactvs conversion code. >> >> Compare these two outputs. 
>> Aromatic form: >> >> http://cactus.nci.nih.gov/chemical/structure/O=C(N=c1[nH]ncs1)C1CCC(Cn2cnc3c3c2=O)CC1/stdinchi >> produces: >> >> InChI=1S/C18H29N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h12-15,19-20H,1-11H2,(H,21,22,24) >> >> Kekule form: >> >> http://cactus.nci.nih.gov/chemical/structure/O=C(/N=C1/[NH]N=CS1)[C@H]1CC[C@H](CN2C=NC3=CC=CC=C3C2=O)CC1/stdinchi >> produces: >> >> InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)/t12-,13- >> >> In fact, converting the 5 membered ring to kekule form is enough: >> >> http://cactus.nci.nih.gov/chemical/structure/O=C(N=C1[NH]N=CS1)C1CCC(Cn2cnc3c3c2=O)CC1/stdinchi >> produces: >> >> InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24) >> >> This can't be true. >> >> We can further simplify things to track down the problem: >> >> http://cactus.nci.nih.gov/chemical/structure/N=c1[nH]ncs1/stdinchi >> InChI=1S/C2H5N3S/c3-2-5-4-1-6-2/h4H,1H2,(H2,3,5) >> >> vs >> >> http://cactus.nci.nih.gov/ch
Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !
You can report it to Marc Nicklaus ... who will probably sent it to me ... I will take a look. Whether I can fix any misbehavior is another question. On Tue, Feb 24, 2015 at 8:27 AM, Greg Landrum wrote: > > The InChIs have me confused. > > I'm going to simplify the below by just showing the input SMILES, the > current (=master) RDKit InChI and the PubChem InChI > > On Mon, Feb 23, 2015 at 10:54 AM, JP wrote: >> >> >> Here is the list (first inchi is the 2014_09_2, second one is the >> 2015.03.1pre generated one, third inchi is the cactus.nci.nih.gov): >> >> O=C(/N=c1/[nH]ncs1)[C@H]1CC[C@H](Cn2cnc3c3c2=O)CC1 >> >> InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)/t12-,13- >> # RDKit 2015.03.1pre >> >> InChI=1S/C18H29N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h12-15,19-20H,1-11H2,(H,21,22,24)/t12-,13-,14?,15? >> # cactus.nci.nih.gov >> >> O=C(/N=c1\[nH]c(-c2n2)cs1)[C@H]1CC[C@H](Cn2cnc3c3c2=O)CC1 >> InChI=1S/C24H23N5O2S/c30-22(28-24-27-21(14-32-24)20-7-3-4-12-25-20)17-10-8-16(9-11-17)13-29-15-26-19-6-2-1-5-18(19)23(29)31/h1-7,12,14-17H,8-11,13H2,(H,27,28,30)/t16-,17- >> >> InChI=1S/C24H39N5O2S/c30-22(28-24-27-21(14-32-24)20-7-3-4-12-25-20)17-10-8-16(9-11-17)13-29-15-26-19-6-2-1-5-18(19)23(29)31/h16-21,25-26H,1-15H2,(H,27,28,30)/t16-,17-,18?,19?,20?,21? >> >> CCOC(=O)Cc1cs/c(=N/C(=O)[C@H]2CC[C@H](Cn3cnc4c4c3=O)CC2)[nH]1 >> InChI=1S/C23H26N4O4S/c1-2-31-20(28)11-17-13-32-23(25-17)26-21(29)16-9-7-15(8-10-16)12-27-14-24-19-6-4-3-5-18(19)22(27)30/h3-6,13-16H,2,7-12H2,1H3,(H,25,26,29)/t15-,16- >> >> InChI=1S/C23H36N4O4S/c1-2-31-20(28)11-17-13-32-23(25-17)26-21(29)16-9-7-15(8-10-16)12-27-14-24-19-6-4-3-5-18(19)22(27)30/h15-19,24H,2-14H2,1H3,(H,25,26,29)/t15-,16-,17?,18?,19? 
>> >> COCc1n[nH]/c(=N/C(=O)[C@H]2CC[C@H](Cn3cnc4c4c3=O)CC2)s1 >> InChI=1S/C20H23N5O3S/c1-28-11-17-23-24-20(29-17)22-18(26)14-8-6-13(7-9-14)10-25-12-21-16-5-3-2-4-15(16)19(25)27/h2-5,12-14H,6-11H2,1H3,(H,22,24,26)/t13-,14- >> >> InChI=1S/C20H33N5O3S/c1-28-11-17-23-24-20(29-17)22-18(26)14-8-6-13(7-9-14)10-25-12-21-16-5-3-2-4-15(16)19(25)27/h13-17,21,23H,2-12H2,1H3,(H,22,24,26)/t13-,14-,15?,16?,17? >> >> COC(=O)c1[nH]/c(=N\C(=O)[C@H]2CC[C@H](Cn3cnc4c4c3=O)CC2)sc1C(C)C >> InChI=1S/C24H28N4O4S/c1-14(2)20-19(23(31)32-3)26-24(33-20)27-21(29)16-10-8-15(9-11-16)12-28-13-25-18-7-5-4-6-17(18)22(28)30/h4-7,13-16H,8-12H2,1-3H3,(H,26,27,29)/t15-,16- >> >> InChI=1S/C24H38N4O4S/c1-14(2)20-19(23(31)32-3)26-24(33-20)27-21(29)16-10-8-15(9-11-16)12-28-13-25-18-7-5-4-6-17(18)22(28)30/h14-20,25H,4-13H2,1-3H3,(H,26,27,29)/t15-,16-,17?,18?,19?,20? >> >> CC(C)[C@H]1CC[C@H](C(=O)N[C@H](Cc2c2)C(=O)/N=c2\[nH]ncs2)CC1 >> InChI=1S/C21H28N4O2S/c1-14(2)16-8-10-17(11-9-16)19(26)23-18(12-15-6-4-3-5-7-15)20(27)24-21-25-22-13-28-21/h3-7,13-14,16-18H,8-12H2,1-2H3,(H,23,26)(H,24,25,27)/t16-,17-,18-/m1/s1 >> >> InChI=1S/C21H36N4O2S/c1-14(2)16-8-10-17(11-9-16)19(26)23-18(12-15-6-4-3-5-7-15)20(27)24-21-25-22-13-28-21/h14-18,22H,3-13H2,1-2H3,(H,23,26)(H,24,25,27)/t16-,17-,18-/m1/s1 > > > If you look in the formula layer for the InChIs from PubChem, you will see > that they all have *way* too many H atoms. I think there's something about > the structures that is confusing the pubchem/cactvs conversion code. > > Compare these two outputs. 
> Aromatic form: > http://cactus.nci.nih.gov/chemical/structure/O=C(N=c1[nH]ncs1)C1CCC(Cn2cnc3c3c2=O)CC1/stdinchi > produces: > InChI=1S/C18H29N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h12-15,19-20H,1-11H2,(H,21,22,24) > > Kekule form: > http://cactus.nci.nih.gov/chemical/structure/O=C(/N=C1/[NH]N=CS1)[C@H]1CC[C@H](CN2C=NC3=CC=CC=C3C2=O)CC1/stdinchi > produces: > InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)/t12-,13- > > In fact, converting the 5 membered ring to kekule form is enough: > http://cactus.nci.nih.gov/chemical/structure/O=C(N=C1[NH]N=CS1)C1CCC(Cn2cnc3c3c2=O)CC1/stdinchi > produces: > InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24) > > This can't be true. > > We can further simplify things to track down the problem: > > http://cactus.nci.nih.gov/chemical/structure/N=c1[nH]ncs1/stdinchi > InChI=1S/C2H5N3S/c3-2-5-4-1-6-2/h4H,1H2,(H2,3,5) > > vs > > http://cactus.nci.nih.gov/chemical/structure/O=c1[nH]ncs1/stdinchi > InChI=1S/C2H2N2OS/c5-2-4-3-1-6-2/h1H,(H,4,5) > > It seems to be the exocyclic bond to an atom with Hs. This is ok: > http://cactus.nci.nih.gov/chemical/structure/O=c1occo1/stdinchi > InChI=1S/C3H2O3/c4-3-5-1-2-6-3/h1-2H > > but both of these are wrong: > http://cactus.nci.nih.gov/chemical/structure/N=c1occo1/stdinchi > InChI=1S/C3H5NO2/c4-3-5-1-2-6-3/h4H,1-2H2 > > http://cactus.nci.nih.gov/chemical/structure/C=c1occo1/stdinchi > InChI=1S/C4H6O2/c1-4-5-2-3
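The "way too many H atoms" observation in the quoted message can be checked directly from the formula layer, without any chemistry toolkit. The following is a minimal sketch: `formula_layer` and `element_counts` are hypothetical helper names, the two InChIs are the ones quoted above for the aromatic input and for the input with the Kekulized five-membered ring, and the simple regular expression is assumed to be sufficient only for single-component formulas like these.

```python
import re
from typing import Dict


def formula_layer(inchi: str) -> str:
    """Return the formula layer, i.e. the text after the version prefix."""
    parts = inchi.split("/")
    if len(parts) < 2 or not parts[0].startswith("InChI="):
        raise ValueError("not an InChI string: %r" % inchi)
    return parts[1]


def element_counts(formula: str) -> Dict[str, int]:
    """Count atoms per element in a simple formula such as 'C18H29N5O2S'."""
    counts: Dict[str, int] = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


# InChIs returned by the cactus service, copied from the message above.
aromatic = ("InChI=1S/C18H29N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12"
            "(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25"
            "/h12-15,19-20H,1-11H2,(H,21,22,24)")
kekule = ("InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12"
          "(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25"
          "/h1-4,10-13H,5-9H2,(H,21,22,24)")

h_aromatic = element_counts(formula_layer(aromatic))["H"]
h_kekule = element_counts(formula_layer(kekule))["H"]
extra_h = h_aromatic - h_kekule
print("extra hydrogens in the aromatic-input InChI:", extra_h)
```

The connectivity layer of the two strings is identical, so the difference is confined to the hydrogen count, which is consistent with the aromatic ring being treated as saturated.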
Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !
Well, the http://cactus.nci.nih.gov/chemical/structure/ site is my baby, which I had to leave behind 1 1/2 years ago (I am not with NIH anymore). Igor, who replied in this thread, was also involved in some parts of it. Traffic on this cactus service is between 5 and 10 million requests per month - so I think the service survived your attack ;-) And I am not saying it is perfect, it just provides another implementation to double-check things in question. It has the CACTVS cheminformatics toolkit as its chemistry backend, which I think is well-tested. Markus On Mon, Feb 23, 2015 at 10:54 AM, JP wrote: > Ok so I got out my test set of 6,940,083 molecules. First, I generated the > inchi using 2014_09_2. I then checked out (and built) the master (with > Greg's latest commits) from github and regenerated the inchis for all these > molecules. > > 3,257 molecules (of 6,940,083) gave me a different inchis between the > current production version and the development (github) one. > > For these 3,257 molecules I hammered the > http://cactus.nci.nih.gov/chemical/structure/%s/stdinchi site and assumed > this to be the 'correct' inchi (those great guys will have an interesting > spike in their web traffic last Fri evening). In 6 (out of 3,257) cases we > get different Inchis from cactus.nci.nih.gov vs RDKit github development > version (2015.03.1pre). 
> > Here is the list (first inchi is the 2014_09_2, second one is the > 2015.03.1pre generated one, third inchi is the cactus.nci.nih.gov): > > O=C(/N=c1/[nH]ncs1)[C@H]1CC[C@H](Cn2cnc3c3c2=O)CC1 > MPQBIWRBISQCLJ-BETUJISGSA-N MPQBIWRBISQCLJ-JOCQHMNTSA-N > InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)/t12-,13+ > # RDKit 2014_09_2 > InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)/t12-,13- > # RDKit 2015.03.1pre > InChI=1S/C18H29N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h12-15,19-20H,1-11H2,(H,21,22,24)/t12-,13-,14?,15? > # cactus.nci.nih.gov > > O=C(/N=c1\[nH]c(-c2n2)cs1)[C@H]1CC[C@H](Cn2cnc3c3c2=O)CC1 > CZKXHWCYFFXKGH-CALCHBBNSA-N CZKXHWCYFFXKGH-QAQDUYKDSA-N > InChI=1S/C24H23N5O2S/c30-22(28-24-27-21(14-32-24)20-7-3-4-12-25-20)17-10-8-16(9-11-17)13-29-15-26-19-6-2-1-5-18(19)23(29)31/h1-7,12,14-17H,8-11,13H2,(H,27,28,30)/t16-,17+ > InChI=1S/C24H23N5O2S/c30-22(28-24-27-21(14-32-24)20-7-3-4-12-25-20)17-10-8-16(9-11-17)13-29-15-26-19-6-2-1-5-18(19)23(29)31/h1-7,12,14-17H,8-11,13H2,(H,27,28,30)/t16-,17- > InChI=1S/C24H39N5O2S/c30-22(28-24-27-21(14-32-24)20-7-3-4-12-25-20)17-10-8-16(9-11-17)13-29-15-26-19-6-2-1-5-18(19)23(29)31/h16-21,25-26H,1-15H2,(H,27,28,30)/t16-,17-,18?,19?,20?,21? 
> > CCOC(=O)Cc1cs/c(=N/C(=O)[C@H]2CC[C@H](Cn3cnc4c4c3=O)CC2)[nH]1 > GAXCPQSXDNGSQV-IYBDPMFKSA-N GAXCPQSXDNGSQV-WKILWMFISA-N > InChI=1S/C23H26N4O4S/c1-2-31-20(28)11-17-13-32-23(25-17)26-21(29)16-9-7-15(8-10-16)12-27-14-24-19-6-4-3-5-18(19)22(27)30/h3-6,13-16H,2,7-12H2,1H3,(H,25,26,29)/t15-,16+ > InChI=1S/C23H26N4O4S/c1-2-31-20(28)11-17-13-32-23(25-17)26-21(29)16-9-7-15(8-10-16)12-27-14-24-19-6-4-3-5-18(19)22(27)30/h3-6,13-16H,2,7-12H2,1H3,(H,25,26,29)/t15-,16- > InChI=1S/C23H36N4O4S/c1-2-31-20(28)11-17-13-32-23(25-17)26-21(29)16-9-7-15(8-10-16)12-27-14-24-19-6-4-3-5-18(19)22(27)30/h15-19,24H,2-14H2,1H3,(H,25,26,29)/t15-,16-,17?,18?,19? > > COCc1n[nH]/c(=N/C(=O)[C@H]2CC[C@H](Cn3cnc4c4c3=O)CC2)s1 > YVZJPKUMKXPZTK-OKILXGFUSA-N YVZJPKUMKXPZTK-HDJSIYSDSA-N > InChI=1S/C20H23N5O3S/c1-28-11-17-23-24-20(29-17)22-18(26)14-8-6-13(7-9-14)10-25-12-21-16-5-3-2-4-15(16)19(25)27/h2-5,12-14H,6-11H2,1H3,(H,22,24,26)/t13-,14+ > InChI=1S/C20H23N5O3S/c1-28-11-17-23-24-20(29-17)22-18(26)14-8-6-13(7-9-14)10-25-12-21-16-5-3-2-4-15(16)19(25)27/h2-5,12-14H,6-11H2,1H3,(H,22,24,26)/t13-,14- > InChI=1S/C20H33N5O3S/c1-28-11-17-23-24-20(29-17)22-18(26)14-8-6-13(7-9-14)10-25-12-21-16-5-3-2-4-15(16)19(25)27/h13-17,21,23H,2-12H2,1H3,(H,22,24,26)/t13-,14-,15?,16?,17? > > COC(=O)c1[nH]/c(=N\C(=O)[C@H]2CC[C@H](Cn3cnc4c4c3=O)CC2)sc1C(C)C > KNDSLDLCZNAXPK-IYBDPMFKSA-N KNDSLDLCZNAXPK-WKILWMFISA-N > InChI=1S/C24H28N4O4S/c1-14(2)20-19(23(31)32-3)26-24(33-20)27-21(29)16-10-8-15(9-11-16)12-28-13-25-18-7-5-4-6-17(18)22(28)30/h4-7,13-16H,8-12H2,1-3H3,(H,26,27,29)/t15-,16+ > InChI=1S/C24H28N4O4S/c1-14(2)20-19(23(31)32-3)26-24(33-20)27-21(29)16-10-8-15(9-11-16)12-28-13-25-18-7-5-4-6-17(18)22(28)30/h4-7,13-16H,8-12H2,1-3H3,(H,26,27,29)/t15-,16- > InChI=1S/C24H38N4O4S/c1-14(2)20-19(23(31)32-3)26-24(33-20)27-21(29)16-10-8-15(9-11-16)12-28-13-25-18-7-5-4-6-17(18)22(28)30/h14-20,25H,4-13H2,1-3H3,(H,26,27,29)/t15-,16-,17?,18?,19?,20? 
> > CC(C)[C@H]1CC[C@H](C(=O)N[C@H](Cc2c2)C(=O)/N=c2\[nH]ncs2)CC1 > OKTRHZCAACPPLC-FGTMMUONSA-N OKTRHZCAACPPLC-KZNAEPCWSA-N > InChI=1S/C21H28N4O2S/c1-14(2)16-8-10-17(11-9-16)19(26)23-18(12-15-6-4-3-5-7-15)20(27)24-21-25-22-13-28-21/h3-7,13-14,16-18H,8-12H2,1-2H3,(H,23,26)(H,24,25,27)/
Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !
A database can have several definitions of unique for anything - a structure database can have this, too. If you have a chemical compound which can form 10 different tautomers, you can represent the compound by 10 chemical structures (it is still the same compound, though). So, if you define uniqueness on the basis of chemical compound, you have one db entry and this one entry has a single (tautomer-sensitive) InChI covering 10 chemical structures; if you define uniqueness on the basis of tautomers/chemical structures (because all are relevant, for instance, in NMR spectroscopy) you have (and want) 10 database entries, each with a single (tautomer-sensitive) InChI. Two definitions of unique. So my sentence still stands: a chemical structure must calculate a unique InChI, but an InChI might cover more than one chemical structure. On Thu, Feb 19, 2015 at 3:37 PM, Dimitri Maziuk wrote: > On 2015-02-19 07:27, Markus Sitzmann wrote: >> >> No, a chemical structure must calculate a unique InChI, but a InChI >> might cover more then one chemical structure > > > Heh. I could swear last time I read the description it specifically > mentioned databases. In the database context 'unique' has a specific > well-defined meaning and that is *not* 'more than one'. Now I don't see it > in the official blurbs, only pikiwedia mentions databases. > >> ... there is no precise, universally valid >> definition for "unique molecule". > > > "On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the > machine wrong figures, will the right answers come out?' I am not able > rightly to apprehend the kind of confusion of ideas that could provoke such > a question." > > Works for 'undefined figures', too. > > Dimitri
Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !
No, a chemical structure must calculate a unique InChI, but an InChI might cover more than one chemical structure (because there are molecules that can be described by more than one chemical structure). And a chemical formula might be the most accurate (unique) description you have for a molecule (admittedly, unlikely today); that is why the InChI is layered. By adding and removing layers, InChI lets you choose how precisely you want to define uniqueness - that is important with molecules because there is no precise, universally valid definition for "unique molecule". On Thu, Feb 19, 2015 at 2:06 PM, Dimitri Maziuk wrote: > On 2015-02-19 05:58, Greg Landrum wrote: >> >> On Thu, Feb 19, 2015 at 10:11 AM, Markus Sitzmann >> mailto:markus.sitzm...@gmail.com>> wrote: >> >> Well, at least you said something important: "conversion of InChI to >> molecules is something that's not in general guaranteed to work >> perfectly" - and this is by design like this because InChI is an >> identifier, not a molecule representation. Unfortunately, many people >> seemed to forget about this :-) >> >> >> Yes, yes they do. > > Well unfortunately inchi states they're a 'unique identifier' which > means there must be 1 inchi for 1 molecule and it *should* work > perfectly. And then they say the only required 'layer' is the formula > which means a) it's not unique and b) how is "InChi=formula" better than > just "formula"? D'uh. > > Dimitri
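The point about layers controlling how strictly "unique" is defined can be illustrated with plain string handling. This is only a sketch under stated assumptions: `identity_key` and the level names in `LEVELS` are hypothetical, the truncated strings are meant as comparison keys rather than as valid InChIs, and charge and isotope layers are ignored for simplicity. The InChIs are taken from elsewhere in this thread.

```python
from typing import Dict, Optional

# Which layer prefixes to keep at each (hypothetical) level of strictness.
LEVELS: Dict[str, Optional[str]] = {
    "formula": "",         # version prefix and formula only
    "connectivity": "ch",  # plus atom connections and hydrogens
    "full": None,          # keep the identifier unchanged
}


def identity_key(inchi: str, level: str = "full") -> str:
    """Return a comparison key that keeps only the layers of the given level."""
    keep = LEVELS[level]
    if keep is None:
        return inchi
    head, *rest = inchi.split("/")
    kept = rest[:1] + [layer for layer in rest[1:] if layer and layer[0] in keep]
    return "/".join([head] + kept)


BASE = ("InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)"
        "9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)")
cis_like = BASE + "/t12-,13+"    # as generated by RDKit 2014_09_2
trans_like = BASE + "/t12-,13-"  # as generated by RDKit 2015.03.1pre

for level in ("formula", "connectivity", "full"):
    same = identity_key(cis_like, level) == identity_key(trans_like, level)
    print(level, "->", "same entry" if same else "different entries")
```

A database that keys on the "connectivity" level would store these two structures as one entry, while one that keys on the full identifier would store two, which is the "two definitions of unique" situation described in this thread.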
Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !
Well, at least you said something important: "conversion of InChI to molecules is something that's not in general guaranteed to work perfectly" - and this is by design, because InChI is an identifier, not a molecule representation. Unfortunately, many people seem to forget about this :-) On Thu, Feb 19, 2015 at 6:59 AM, Greg Landrum wrote: > > On Wed, Feb 18, 2015 at 7:01 PM, Igor Filippov > wrote: >> >> > update the bug report and work on tracking down the wrong problem >> >> That's how I sometimes do it too... ;) > > > I'll leave it as an exercise to the reader to decide if that was > intentional, the fault of auto-correct, or just because it had been a long > day. ;-) > > -greg
Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !
I agree with John, the InChI for mol1 and mol2 should be http://cactus.nci.nih.gov/chemical/structure/O=C(NCCc1c1)[C@H]1CC[C@H](Cn2c(O)nc3c3c2=O)CC1/stdinchi InChI=1S/C24H27N3O3/c28-22(25-15-14-17-6-2-1-3-7-17)19-12-10-18(11-13-19)16-27-23(29)20-8-4-5-9-21(20)26-24(27)30/h1-9,18-19H,10-16H2,(H,25,28)(H,26,30)/t18-,19- So the + at the end should be a - Markus On Wed, Feb 18, 2015 at 2:53 PM, John M wrote: > Hi Greg, > > I believe it's an RDKitMol -> InChI issue rather than InChI -> RDKitMol. The > correct InChI (below) is different from that in the iPython listing. > > InChI=1S/C24H27N3O3/c28-22(25-15-14-17-6-2-1-3-7-17)19-12-10-18(11-13-19)16-27-23(29)20-8-4-5-9-21(20)26-24(27)30/h1-9,18-19H,10-16H2,(H,25,28)(H,26,30)/t18-,19- > > J > > > Regards, > John W May > john.wilkinson...@gmail.com > > On 18 February 2015 at 04:57, Greg Landrum wrote: >> >> JP, >> >> Looks like that's a bug in the way ring stereochemistry is handled while >> translating the InChI back into an molecule. >> >> It's reproducible with a small example: >> In [1]: from rdkit import Chem >> >> In [2]: mol1 = Chem.MolFromSmiles("C[C@H]1CC[C@H](O)CC1") >> >> In [3]: Chem.MolToSmiles(mol1,True) >> Out[3]: 'C[C@H]1CC[C@H](O)CC1' >> >> In [4]: inchi = Chem.MolToInchi(mol1) >> >> In [5]: mol2 = Chem.MolFromInchi(inchi) >> >> In [6]: Chem.MolToSmiles(mol2,True) >> Out[6]: 'C[C@H]1CC[C@@H](O)CC1' >> >> Conversion of InChI to molecules is something that's not in general >> guaranteed to work perfectly, but I will go ahead and create a bug report. >> >> -greg >> >> >> >> On Tue, Feb 17, 2015 at 2:50 PM, JP wrote: >>> >>> Hi there, >>> >>> I have a question for the 3D enabled of you (I wish the world looked like >>> GTA2 !) >>> >>> I am seeing a case of an RDKit mol -> Inchi -> RDKit mol, that I think is >>> changing the stereochemistry of the molecule. I have 12 example-pairs >>> where this happens (but all very structurally similar). 
I don't care much >>> that the last rdkit molecule is a different tautomer than the starting one - >>> but if this is the case the stereochemistry should still be conserved, no? >>> >>> I did an ipython notebook (most useful tool of the decade after RDKit?) >>> gist here: >>> >>> >>> http://nbviewer.ipython.org/urls/gist.githubusercontent.com/anonymous/7c158926a0f3bf9a4978/raw/d91cc808ac91eccc8bf0e45d9eacd2af382e5105/gistfile1.txt >>> >>> I appreciate if anyone could shed some light. I'd just like to >>> understand. >>> >>> Thank you for your time! >>> >>> - >>> JP
Re: [Rdkit-discuss] New RDKit drawing code
Hi Greg and all the others involved, That looks really nice! And don't give any code to Noel anymore, it all ends up in JavaScript :-) (who would have thought 10 years ago that would make any sense). Best, Markus - | Markus Sitzmann | markus.sitzm...@gmail.com > On 14.02.2015, at 08:01, Greg Landrum wrote: > > Dear all, > > Noel's great blog post on using the RDKit from emscripten > (http://baoilleach.blogspot.ch/2015/02/cheminformaticsjs-rdkit.html) made me > realize that I should post something here about the new RDKit drawing code > that's currently available in github. > > Rather than do a long email message, I did a quick blog post that > demonstrates some of the functionality: > http://rdkit.blogspot.com/2015/02/new-drawing-code.html > > I'm still actively working on this, but I think what's there is already worth > showing off a bit. :-) > > Many thanks are due to Dave Cosgrove, who did the initial work that makes > this all possible. > > -greg
Re: [Rdkit-discuss] MaxMin Picker and Python
Hi Matt,

maybe squeeze these two lines

zims = [x for x in Chem.ForwardSDMolSupplier(gzip.open('a.sdf.gz')) if x is not None]
zims_fps=[AllChem.GetMorganFingerprintAsBitVect(x,2) for x in zims]

into one:

zims_fps=[AllChem.GetMorganFingerprintAsBitVect(x,2) for x in Chem.ForwardSDMolSupplier(gzip.open('a.sdf.gz')) if x is not None]

because zims keeps the whole file in memory for no good reason :-) (is that sdf.gz big?)

Markus

On Thu, Jul 17, 2014 at 12:43 AM, Matthew Lardy wrote:
> Hi Igor,
>
> Thanks! Maybe I am a throwback, but I prefer the command line to a GUI.
> Still I'll give it a whirl! :)
>
> If you are handling millions of molecules without issue; then my Python
> skills are really, really, rusty. Or, I shouldn't be using Python to handle
> this much data. :)
>
> Thanks for the info!
> Matt
>
> On Wed, Jul 16, 2014 at 3:31 PM, Igor Filippov wrote:
>>
>> Matthew,
>>
>> Two lines of shameless self-promotion:
>> This is exactly the kind of problem for Diversity Genie -
>> http://www.diversitygenie.com/
>> It is using RDKit library underneath, but wraps it in a simple, easy to
>> use GUI front-end.
>>
>> Best regards,
>> Igor
>>
>> On Wed, Jul 16, 2014 at 6:18 PM, Matthew Lardy wrote:
>>>
>>> Hi all,
>>>
>>> I have been playing with the diversity selection in RDKit. I am running
>>> through a set of ~26,000 molecules to pick a set of 200 diverse molecules.
>>> I saw some examples of how to do this in Python (my variant of their script
>>> below), but the memory consumption is massive. I burned through ~15GB of
>>> memory before I killed it off. Is this about what others have seen, or
>>> should I move to doing this in C++ or Java (assuming that others have seen a
>>> significantly lower level of memory consumption)?
>>>
>>> Here is the script:
>>>
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> from rdkit import DataStructs
>>> import gzip
>>> from rdkit.Chem import Draw
>>> from rdkit.SimDivFilters import rdSimDivPickers
>>>
>>> zims = [x for x in Chem.ForwardSDMolSupplier(gzip.open('a.sdf.gz')) if x is not None]
>>>
>>> zims_fps=[AllChem.GetMorganFingerprintAsBitVect(x,2) for x in zims]
>>>
>>> dm=[]
>>> for i,fp in enumerate(zims_fps[:26000]): # only 1000 in the demo (in the interest of time)
>>>     dm.extend(DataStructs.BulkTanimotoSimilarity(fp,zims_fps[1+1:26000],returnDistance=True))
>>> dm = array(dm)
>>> picker = rdSimDivPickers.MaxMinPicker()
>>> ids = picker.Pick(dm,26000,200)
>>> list(ids[:200])
>>>
>>> Thanks in advance!
>>> Matt
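A rough back-of-the-envelope check on the memory figures in this thread, as a sketch rather than a measurement. It counts how many distances the quoted script accumulates in `dm` for a pool of 26,000 fingerprints, both for the intended lower triangle (slice starting at `i+1`) and for the slice exactly as posted (`1+1`). The per-entry cost assumes 64-bit CPython, where every list element is a separate float object plus a pointer slot; the variable names are illustrative and not from the original script.

```python
import struct
import sys

n = 26000  # pool size quoted in the thread

# Intended lower triangle: row i is compared with fingerprints i+1 .. n-1.
n_triangle = n * (n - 1) // 2

# The script as posted slices zims_fps[1+1:26000] on every row,
# i.e. n rows of n - 2 distances each.
n_as_posted = n * (n - 2)

# Assumption: CPython, one float object plus one pointer slot per list entry.
bytes_per_entry = sys.getsizeof(1.0) + struct.calcsize("P")

for label, count in [("triangle", n_triangle), ("as posted", n_as_posted)]:
    as_list_gb = count * bytes_per_entry / 1e9
    as_array_gb = count * 8 / 1e9  # packed float64 values
    print(f"{label}: {count:,} distances, "
          f"~{as_list_gb:.1f} GB as a Python list, "
          f"~{as_array_gb:.1f} GB as a float64 array")
```

If these estimates are in the right range, the list of distances, not the list of molecules, is the dominant cost. RDKit's `MaxMinPicker` also has `LazyPick` and `LazyBitVectorPick` methods that compute distances on demand instead of taking a precomputed matrix; whether they are available depends on the RDKit version installed.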
[Rdkit-discuss] Fwd: Tautomeric InChIs
---------- Forwarded message ----------
From: Markus Sitzmann
Date: Thu, May 8, 2014 at 3:27 PM
Subject: Re: [Rdkit-discuss] Tautomeric InChIs
To: Edward Pyzer-Knapp

Hi Edward,

since your InChI is a Standard InChI ("1S/"): tautomeric forms are purposely *not* preserved by Standard InChI - that's why we created Standard InChI (with non-standard InChIs it is another story, those you can make tautomer-sensitive or insensitive). And actually many people complain that Standard InChI falls short in some cases regarding tautomer normalization :-).

Best,
Markus

On Thu, May 8, 2014 at 3:16 PM, Edward Pyzer-Knapp wrote:
> Hi all,
>
> I have been playing around with RDKIT for a while now - great work guys!
>
> I have recently hit an issue when using InChIs:
>
> When generating both inchi and smiles from a rdkit Mol, I get two different
> structures, even if I use the smiles as an input for the inchi generation.
>
> An example:
>
> smiles = "[H]N1C(=O)C(=C2C(=O)c3c(Cl)sc(F)c3N2[H])c2sc(F)c(Cl)c21" (I should
> add this smiles was generated by RDKIT, from a Mol file)
>
> mol = MolFromSmiles(smiles)
> inchi = MolToInchi(mol)
>
> print inchi
> InChI=1S/C12H2Cl2F2N2O2S2/c13-3-6-8(21-10(3)15)2(12(20)18-6)4-7(19)1-5(17-4)11(16)22-9(1)14/h17H,(H,18,20)
>
> when comparing the smiles and the inchi, the C=O has changed to an OH and a
> C-N-H has changed to a C=N. I realise that these are tautomers of each
> other, but surely the tautomeric form should be preserved when interchanging
> smiles to inchi? Since at the moment, going Smiles->Inchi->Smiles does NOT
> result in the original smiles...
>
> There is a layer in the INCHI standard which would allow description of
> this, is there a way to turn that on?
>
> Many Thanks,
>
> Ed Pyzer-Knapp
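To make the point about Standard InChI concrete, here is a small string-level sketch that needs no RDKit. It checks for the "1S" prefix and pulls the mobile-hydrogen groups out of the /h layer of the InChI quoted in this thread. The helper names are made up for illustration, and the regular expression only covers the simple "(H,18,20)" style of group seen here; it is not a full InChI parser.

```python
import re

def is_standard_inchi(inchi):
    """Standard InChI identifiers carry an 'S' after the version number."""
    return inchi.startswith("InChI=1S/")

def mobile_h_groups(inchi):
    """Return the atom numbers of each mobile-hydrogen group in the /h layer.

    Illustrative helper: handles groups written like (H,18,20) or (H2,1,2,3).
    """
    layers = inchi.split("/")
    h_layer = next((layer for layer in layers if layer.startswith("h")), "")
    groups = re.findall(r"\(H\d*-?,([\d,]+)\)", h_layer)
    return [tuple(int(atom) for atom in group.split(",")) for group in groups]

# The InChI from the quoted message.
inchi = ("InChI=1S/C12H2Cl2F2N2O2S2/c13-3-6-8(21-10(3)15)2(12(20)18-6)"
         "4-7(19)1-5(17-4)11(16)22-9(1)14/h17H,(H,18,20)")

print(is_standard_inchi(inchi))
print(mobile_h_groups(inchi))
```

Assuming the usual InChI atom numbering in Hill-formula order (C 1-12, Cl 13-14, F 15-16, N 17-18, O 19-20, S 21-22), the group covers one nitrogen and one oxygen, which would be the amide hydrogen that the identifier deliberately leaves unassigned. For a tautomer-sensitive identifier, the InChI software has a FixedH option that produces a non-standard InChI; as far as I know it can be passed through the `options` argument of RDKit's `MolToInchi`.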
[Rdkit-discuss] Fwd: RDKit cartridge similarity search speeds(?)
---------- Forwarded message ----------
From: Markus Sitzmann
Date: Thu, May 8, 2014 at 3:14 PM
Subject: Re: [Rdkit-discuss] RDKit cartridge similarity search speeds(?)
To: James Davidson

Hi James,

I would guess, in your second query, "morganbv_fp('c1nnccc1'::mol, 2)" has to be calculated for each row you are scanning because from the database's perspective the result is unpredictable (although it is not), so it can not be optimized so easily. All of this is avoided in your first query, the calculation is done once before the table scan and then the actual index/table scan is a rather simple one.

Markus

On Thu, May 8, 2014 at 2:35 PM, James Davidson wrote:
> Dear All,
>
> I have recently been spending a bit more time with the RDKit cartridge, and
> have what is probably a very naïve question…
>
> Having built some RDKit fingerprints for ChEMBL_18, I see the following
> behaviour (for clarification – ‘ecfp4_bv’ is the column in my rdk.fps table
> that has been generated using morganbv_fp(mol, 2)):
>
> chembl_18=# \timing on
> Timing is on.
>
> chembl_18=# set rdkit.tanimoto_threshold=0.5;
> SET
> Time: 0.167 ms
>
> chembl_18=# select chembl_id from rdk.fps where ecfp4_bv % morganbv_fp('c1nnccc1'::mol,2);
>   chembl_id
> -------------
>  CHEMBL15719
> (1 row)
>
> Time: 2033.348 ms
>
> chembl_18=# select chembl_id from rdk.fps where tanimoto_sml(ecfp4_bv, morganbv_fp('c1nnccc1'::mol, 2)) > 0.5;
>   chembl_id
> -------------
>  CHEMBL15719
> (1 row)
>
> Time: 6843.605 ms
>
> I can see that the query plans are different in the two cases, but I don’t
> fully understand why – see below:
>
> QUERY 1 (with explain analyze)
>
> chembl_18=# explain analyze select chembl_id from rdk.fps where ecfp4_bv % morganbv_fp('c1nnccc1'::mol,2);
>
> QUERY PLAN
> ------------------------------------------------------------
> Bitmap Heap Scan on fps (cost=106.91..5298.31 rows=1352 width=13) (actual time=1774.986..1774.987 rows=1 loops=1)
>   Recheck Cond: (ecfp4_bv % '\x0100084200048204'::bfp)
>   -> Bitmap Index Scan on fps_ecfp4bv_idx (cost=0.00..106.57 rows=1352 width=0) (actual time=1774.969..1774.969 rows=1 loops=1)
>        Index Cond: (ecfp4_bv % '\x0100084200048204'::bfp)
> Total runtime: 1775.035 ms
> (5 rows)
>
> Time: 1776.133 ms
>
> QUERY 2 (with explain analyze)
>
> chembl_18=# explain analyze select chembl_id from rdk.fps where tanimoto_sml(ecfp4_bv, morganbv_fp('c1nnccc1'::mol, 2)) > 0.5;
>
> QUERY PLAN
> ------------------------------------------------------------
> Seq Scan on fps (cost=0.00..388808.17 rows=450793 width=13) (actual time=1278.115..6953.977 rows=1 loops=1)
>   Filter: (tanimoto_sml(ecfp4_bv, '\x0100084200048204'::bfp) > 0.5::double precision)
>   Rows Removed by Filter: 1352377
> Total runtime: 6954.010 ms
> (4 rows)
>
> Time: 6955.103 ms
>
> It seems conceptually ‘easier’ to add the similarity value as part of the
> query, rather than setting it as a variable ahead of the query; but clearly
> I should be doing it the latter way for performance reasons. So even if I
> don’t fully understand why at the moment, am I correct in thinking that
> queries of this sort should always be run with the similarity operators (%,
> #)? And if so, is the rdkit.tanimoto_threshold variable set at the level of
> the session, the user, or the database?
>
> Kind regards
>
> James
Re: [Rdkit-discuss] implementation of tautomer enumeration/canonicalization
Hello,

I was about to ask the same (I am one of the authors of the mentioned paper). I had seen this post (gosh, a year ago) but had no time back then to answer (job search and a move from the US to Europe). I was digging into this a bit last week; however, I cannot say much yet, as it is very initial work. If something comes out of it, I would contribute it to RDKit. Well, if somebody has already done it, I am happy, too. Or we join forces (however, for me it is only some private hacking with not so much time).

Markus

On Wed, Apr 2, 2014 at 11:32 AM, Dave W wrote:
> Hi Markus and all,
>
> Did you or anyone else end up coding this up? I am looking into doing it
> myself, but if it's already been done...
>
> Many thanks,
> Dave
Re: [Rdkit-discuss] Three tests failing on CentOS 5.3 - important ?
I think the syntax "except Exception as e:" didn't exist before Python 2.6 ... are you running this on an older version? :-)

Cheers,
Markus

On Wed, Mar 19, 2014 at 7:54 AM, Jan Holst Jensen wrote:
> On 2014-03-19 05:54, Greg Landrum wrote:
>
> On Tue, Mar 18, 2014 at 4:59 PM, Jan Holst Jensen wrote:
>>
>> Hi RDKitters,
>>
>> I managed to get RDKit 2013_09_2 built on CentOS 5.3. Will post a short
>> recipe later.
>
> Wow; that's an ancient version.
>
> Yup. Approaching archaeology-region here.
>
>> Right now, I am still left with three tests that fail, but I think (hope)
>> that I can live with that ? Failing tests are:
>>
>> 72:pythonTestDbCLI
>> 73:pythonTestDirML
>> 78:pythonTestDirChem
>>
>> The test log shows that test 72 won't run because of missing SQLite
>> support.
>>
>> 72/78 Testing: pythonTestDbCLI
>> 72/78 Test: pythonTestDbCLI
>> Command: "/usr/bin/python26" "/u01/software/RDKit_2013_09_2/Projects/test_list.py" "--testDir" "/u01/software/RDKit_2013_09_2/Projects"
>> Directory: /u01/software/RDKit_2013_09_2/build/Projects
>> "pythonTestDbCLI" start time: Mar 05 11:07 CET
>> Output:
>> ----------------------------------------------------------
>> Traceback (most recent call last):
>>   File "TestDbCLI.py", line 9, in ?
>>     from rdkit.Dbase.DbConnection import DbConnect
>>   File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbConnection.py", line 21, in ?
>>     from rdkit.Dbase import DbUtils,DbInfo
>>   File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbUtils.py", line 17, in ?
>>     from rdkit.Dbase.DbResultSet import DbResultSet,RandomAccessDbResultSet
>>   File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbResultSet.py", line 12, in ?
>>     from rdkit.Dbase import DbInfo
>>   File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbInfo.py", line 12, in ?
>>     import DbModule
>>   File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbModule.py", line 61, in ?
>>     raise ImportError,"Neither sqlite nor PgSQL support found."
>> ImportError: Neither sqlite nor PgSQL support found.
>>
>> A bit puzzling, since I can "import sqlite3" just fine from Python when
>> run interactively. As far as I can understand RDConfig.py a successful
>> import of sqlite3 should make it report that SQLite support is available ?
>> For my purposes, failing this test is probably fine - I don't expect I need
>> sqlite support on this machine.
>
> It's probably not important unless you are planning on using the DbCLI code.
> If you want to try and track it down: can you do "from rdkit.Dbase import
> DbModule"?
>
> Goes just fine interactively.
>
> [oracle@localhost ~]$ python26
> Python 2.6.8 (unknown, Nov 7 2012, 14:47:45)
> [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from rdkit.Dbase import DbModule
> >>> quit()
> [oracle@localhost ~]$
>
> Well, let's leave it at that. Since I am not planning to use the DbCLI code
> at the moment I am OK with it.
>
>> Test 73 has this in the log:
>>
>> Traceback (most recent call last):
>>   File "UnitTestBuildComposite.py", line 16, in ?
>>     from rdkit.ML import BuildComposite
>>   File "/u01/software/RDKit_2013_09_2/rdkit/ML/BuildComposite.py", line 203, in ?
>>     from rdkit.ML.Composite import Composite,BayesComposite
>>   File "/u01/software/RDKit_2013_09_2/rdkit/ML/Composite/Composite.py", line 25, in ?
>>     from rdkit.ML.Data import DataUtils
>>   File "/u01/software/RDKit_2013_09_2/rdkit/ML/Data/DataUtils.py", line 57, in ?
>>     from rdkit.ML.Data import MLData
>>   File "/u01/software/RDKit_2013_09_2/rdkit/ML/Data/MLData.py", line 8, in ?
>>     import numpy
>> ImportError: No module named numpy
>> ...
>>
>> and test 78:
>>
>> Output:
>> ----------------------------------------------------------
>>   File "UnitTestInchi.py", line 187
>>     except InchiReadWriteError as inst:
>>                                 ^
>> SyntaxError: invalid syntax
>>   File "PandasTools.py", line 100
>>     except Exception as e:
>>                       ^
>> SyntaxError: invalid syntax
>> Traceback (most recent call last):
>>   File "UnitTestEState.py", line 17, in ?
>>     import numpy
>> ImportError: No module named numpy
>> Traceback (most recent call last):
>>   File "UnitTestFingerprints.py", line 17, in ?
>>     import numpy
>> ImportError: No module named numpy
>> ...
>>
>> The numpy module loads fine when run interactively. So maybe it is
>> something else that is wrong - just that the error reported from Python is a
>> bit misleading (?).
>
> That's a strange one, but if you can import numpy and rdkit.Chem, then I
> wouldn't be concerned about it.
> Again, if you're interested in trying to track it down, there are some
> experiments we can do.
>
>> I haven't run into stuff that doesn't work yet because of these test
>> failures, so I think that I can get by without them passing. But it would be
>> nice to know
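One quick way to test the "older version" suspicion is to compare the interpreter that runs a script with whatever a bare `python` resolves to on the PATH. The sketch below does that. It is only a diagnostic idea and assumes, without having checked the RDKit test driver, that the failing tests may have been launched through a different interpreter than the one used interactively. As far as I know, traceback frames printed as `in ?` (rather than `in <module>`) also point to an interpreter older than Python 2.5. The function name is illustrative.

```python
import shutil
import subprocess
import sys

def interpreter_version(executable):
    """Ask an interpreter for its (major, minor) version by running it."""
    out = subprocess.check_output(
        [executable, "-c", "import sys; print('%d.%d' % sys.version_info[:2])"]
    )
    return tuple(int(part) for part in out.decode().strip().split("."))

# Version of the interpreter running this script.
running = interpreter_version(sys.executable)
print("this interpreter:", sys.executable, running)

# A driver that spawns plain "python" gets whatever is first on PATH,
# which may not be the interpreter that started the driver.
on_path = shutil.which("python")
if on_path is None:
    print("no bare 'python' on PATH")
else:
    try:
        print("bare 'python' on PATH:", on_path, interpreter_version(on_path))
    except (OSError, subprocess.CalledProcessError, ValueError) as err:
        print("could not query", on_path, "-", err)
```

If the two versions differ, modules installed for one interpreter (numpy, for instance) would not necessarily be importable from the other, which would match the mixed symptoms in the quoted logs.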
Re: [Rdkit-discuss] docker.io - container for fully fledged rdkit installation on linux?
It is basically a VM that can be scripted from the host system. The VM client can be preconfigured with anything your software depends on (including databases etc.) and can be based on arbitrary Linux distributions, independent of the Linux distribution of the host.

On Wed, Nov 27, 2013 at 4:20 PM, Igor Filippov wrote:
> Not to criticize or anything, but I've seen this issue quite a few times -
> perhaps the problem is actually with me and everybody else is "in the know"?
>
> I've spent the last few minutes clicking around the Docker website, I still
> cannot figure out what it is and what it does?
> I found that it runs on all Linux builds, that the latest release is a work
> of 130 people, that there are Trusted Builds and Docker Hack Days.
> But I still cannot puzzle out what it does!!!
>
> Would it kill the project maintainers to put a few words somewhere on the
> top of the website what the software is actually all about?
>
> Igor
>
> P.S. I finally found some clues under "Learn More" link. I guess the point
> is only those who already know or the really persistent ones or the ones
> with time to spare need to bother.
>
> On Wed, Nov 27, 2013 at 8:09 AM, Samo Turk wrote:
>>
>> Hi rdkitters,
>>
>> New release of Docker is available and it brings one very important
>> improvement - it runs on any linux distribution (as long as the kernel is
>> 3.8 or later). I updated "RDKit Dockerfile" so it builds everything on top
>> of Ubuntu 13.10 base image. To build the container do:
>> "git clone https://gist.github.com/6669650.git ."
>> "mv Dockerfile-rdkit Dockerfile"
>> "sudo docker build -t rdkit ."
>>
>> Run it with:
>> "sudo docker run -p 127.0.0.1:8889: rdkit"
>> and IPython notebook will be available on http://127.0.0.1:8889/
>>
>> Regards,
>> Samo
>>
>> On Tue, Sep 24, 2013 at 9:08 AM, wrote:
>>>
>>> I also highly appreciate your efforts!
>>>
>>> Cheers,
>>> Paul
>>>
>>> > Stuff like this that makes it easier for people to access/use the
>>> > RDKit is great!
>>> >
>>> > The more options we have the better.
>>> >
>>> > Many thanks to you guys for looking into this stuff. :-)
>>> >
>>> > -greg
>>> >
>>> > Interesting stuff, looks promising!
>>> > Got pulled in so I created a Dockerfile that builds an image with
>>> > rdkit, ipython and matplotlib. Once the image is built it runs
>>> > ipython notebook server. You can find the source here:
>>> > https://gist.github.com/samoturk/6669650
>>> > Just follow instructions in the first few lines of the Dockerfile to
>>> > build and run it..
>>> >
>>> > Regards,
>>> > Samo