Re: [Rdkit-discuss] Beta of the 2022.03.1 release available

2022-03-18 Thread Markus Sitzmann
Hi Greg,

to give you some feedback: I switched my current research project to the
beta version and haven't found any problems so far ;-)

Best,
Markus

On Fri, Mar 18, 2022 at 1:32 PM Greg Landrum  wrote:

> Dear all,
>
> I tagged the first beta of the 2022.03 RDKit release this morning.
> Assuming nothing weird shows up during testing, we'll do the actual
> release on the 25th.
>
> You can find the new beta here:
> https://github.com/rdkit/rdkit/releases/tag/Release_2022_03_1b1
>
> Conda builds of the beta are available in the rdkit channel for python
> 3.8 on Mac and Linux:
> conda install -c rdkit/label/beta rdkit rdkit=2022.03
>
> Please try out the beta and let us know if you find any problems!
>
> Best regards,
> -greg
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Cheminformatics Graduate School Recommendations?

2021-07-19 Thread Markus Sitzmann
Hi Patrick,

labs I would take a look at (in no particular order and well, a bit heavy
on European labs):

Irwin Lab, UCSF: https://profiles.ucsf.edu/john.irwin
Bajorath Group, Bonn, Germany:
https://www.limes-institut-bonn.de/forschung/arbeitsgruppen/unit-4/abteilung-bajorath/abt-bajorath-startseite/
Reymond Group, Bern, Switzerland: https://www.gdb.unibe.ch/
Rarey Group, Hamburg, Germany:
https://www.zbh.uni-hamburg.de/personen/amd/mrarey.html
Leach Team, Cambridge, UK: https://www.ebi.ac.uk/about/people/andrew-leach
Czodrowski Lab, Dortmund, Germany: https://www.czodrowskilab.org/team

Best,
Markus


On Mon, Jul 19, 2021 at 6:17 PM Patrick Neal  wrote:

> Hi All,
>
> I apologize if this is too far off topic, but I got a recommendation to
> ask here since this community is the most likely to know!
>
> I'm about to graduate from my undergrad chemistry program and I'm looking
> for graduate schools. I started in traditional computational chemistry
> research, but have really loved the cheminformatics/datascience aspects of
> drug discovery. I'm hoping to ask the community if you all have any
> recommendations for academic labs (ideally US based) with interesting
> cheminformatics research?
>
> I'm specifically interested in fingerprinting methods (encoding
> 3D/conformational information), similarity search/clustering compounds at
> scale, and automation tools for QM calculations. But, I would be grateful
> to hear of any labs you think are doing great cheminformatics work!
>
> All the best,
>
> Patrick


Re: [Rdkit-discuss] Chembience Postgres RDKit extension

2021-05-09 Thread Markus Sitzmann
Hello,

I have added a Postgres 13 / RDKit 2021.03 version to my Chembience Postgres
RDKit project (https://github.com/chembience/docker-postgres-rdkit-compile).

Available Docker images are now

*chembience/postgres-rdkit:postgres-13.rdkit-2021.03 (new)*
chembience/postgres-rdkit:postgres-13.rdkit-2020.09
chembience/postgres-rdkit:postgres-12.rdkit-2020.03
chembience/postgres-rdkit:postgres-11.rdkit-2019.09

Best,
Markus


On Mon, Mar 8, 2021 at 8:49 AM Greg Landrum  wrote:

> That's really cool, thanks Markus!
>
> On Sat, Mar 6, 2021 at 7:34 PM Markus Sitzmann 
> wrote:
>
>> Hello,
>>
>> I have reworked the Postgres RDKit extension module of Chembience and
>> made it a spin-off project of its own which is available at:
>>
>> https://github.com/chembience/docker-postgres-rdkit-compile
>>
>> It is now based on a fork of the official Postgres Docker image
>> repository on GitHub and just adds the compilation of the RDKit extension
>> module to it. It allows for local compilation of the package; however, I
>> also provide ready-to-pull Docker images on Docker Hub. Currently
>> available via docker pull are (all are usable independently of any
>> Chembience setup):
>>
>> chembience/postgres-rdkit:postgres-13.rdkit-2020.09
>> chembience/postgres-rdkit:postgres-12.rdkit-2020.03
>> chembience/postgres-rdkit:postgres-11.rdkit-2019.09
>>
>> My plan is to keep this project up-to-date if newer versions of RDKit or
>> Postgres are released.
>>
>> I also have updated Chembience  https://github.com/chembience/chembience
>> to version 0.2.18 last week. This is mostly an upgrade to RDKit 2020.09
>> (before it becomes the "old" version) and Postgres 13 and relies already on
>> the project above.
>>
>> Best.
>> Markus
>>


[Rdkit-discuss] Chembience Postgres RDKit extension

2021-03-06 Thread Markus Sitzmann
Hello,

I have reworked the Postgres RDKit extension module of Chembience and made
it a spin-off project of its own which is available at:

https://github.com/chembience/docker-postgres-rdkit-compile

It is now based on a fork of the official Postgres Docker image repository
on GitHub and just adds the compilation of the RDKit extension module to it.
It allows for local compilation of the package; however, I also provide
ready-to-pull Docker images on Docker Hub. Currently available via
docker pull are (all are usable independently of any Chembience setup):

chembience/postgres-rdkit:postgres-13.rdkit-2020.09
chembience/postgres-rdkit:postgres-12.rdkit-2020.03
chembience/postgres-rdkit:postgres-11.rdkit-2019.09

My plan is to keep this project up-to-date if newer versions of RDKit or
Postgres are released.

I also have updated Chembience  https://github.com/chembience/chembience to
version 0.2.18 last week. This is mostly an upgrade to RDKit 2020.09
(before it becomes the "old" version) and Postgres 13 and relies already on
the project above.

Best.
Markus


Re: [Rdkit-discuss] RDKit/tautomers

2020-07-21 Thread Markus Sitzmann
Hi Benny,

that is purely an InChI problem (not an RDKit one). Back when the Standard
InChI was defined, the 15T and KET options for the InChI calculation
were either not yet available or still experimental (I don't remember :-)), so
they didn't make it into the standard set of options for the Standard InChI
calculation. Hence it isn't too surprising that this tautomer pair doesn't
yield the same Standard InChI (InChI isn't/wasn't particularly strong
regarding tautomerism outside rings). You might use a (non-standard) InChI
and switch the 15T and KET options on; that should fix your particular case.

In general there are still ongoing efforts to make InChI stronger regarding
tautomerism: https://pubmed.ncbi.nlm.nih.gov/32043883/

Markus
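
A minimal sketch of the suggestion above, using the two SMILES from Benny's
question. It assumes an RDKit build with InChI support and that the option
string is handed through to the InChI library; the helper name inchi_pair is
only illustrative. Whether the two non-standard InChIs actually come out
identical depends on the InChI version bundled with RDKit, so no expected
output is shown.

```python
from rdkit import Chem

# The two tautomers from the question.
SMILES_A = "C[CH]2CCC(=O)C1=C(O)[CH](O)C[CH](O)[CH]12"
SMILES_B = "C[CH]2CCC(O)=C1C(=O)[CH](O)C[CH](O)[CH]12"


def inchi_pair(options=""):
    """Return the InChIs of both tautomers for a given InChI option string.

    Hypothetical helper for illustration; not part of RDKit.
    """
    mols = [Chem.MolFromSmiles(smi) for smi in (SMILES_A, SMILES_B)]
    return tuple(Chem.MolToInchi(mol, options=options) for mol in mols)


# Standard InChI (no extra options), as in the original question.
std_a, std_b = inchi_pair()
print("Standard InChIs equal:", std_a == std_b)

# Non-standard InChI with keto-enol (KET) and 1,5-tautomerism (15T) enabled.
ket_a, ket_b = inchi_pair("-KET -15T")
print("KET/15T InChIs equal: ", ket_a == ket_b)
```

Note that an InChI generated with these options is no longer a Standard InChI,
so it should only be compared with InChIs generated with the same options.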


On Tue, Jul 21, 2020 at 12:11 PM Da'Adoosh Binyamin <
daado...@tauex.tau.ac.il> wrote:

> Hi,
>
>
>
> I have a question about RDKit/tautomers.
>
>
>
> Let's say I have smiles input:
>
>
>
> C[CH]2CCC(=O)C1=C(O)[CH](O)C[CH](O)[CH]12
>
> C[CH]2CCC(O)=C1C(=O)[CH](O)C[CH](O)[CH]12
>
>
>
> Now, if I run this code for each input:
>
>
>
> m = Chem.MolFromSmiles(input)
>
> inchi = Chem.rdinchi.MolToInchi(m)
>
>
>
> I get different InChIs:
>
>
>
>
> InChI=1S/C11H16O4/c1-5-2-3-6(12)10-9(5)7(13)4-8(14)11(10)15/h5,7-9,13-15H,2-4H2,1H3
>
>
> InChI=1S/C11H16O4/c1-5-2-3-6(12)10-9(5)7(13)4-8(14)11(10)15/h5,7-9,12-14H,2-4H2,1H3
>
>
>
> My question is why is it happening. Usually if I enter two tautomers -
> they have the same InChI (like it is supposed to be, according to the
> literature ). What is the difference in this example?
>
>
>
> Thanks,
>
> Benny
>
>


Re: [Rdkit-discuss] Install RDKit (Docker, rapids, rdkit=2020.03.2) ?/

2020-06-03 Thread Markus Sitzmann
Hi Joey,

maybe the Dockerfile of my Chembience project helps:

https://github.com/chembience/chembience/blob/master/context/build/rdkit/Dockerfile


The chembience/python-base image it starts from actually doesn't do much
except providing a very basic setup, its Dockerfile is here:

https://github.com/chembience/chembience/blob/master/context/build/base/Dockerfile


So it should be replaceable with the
rapidsai/rapidsai:0.12-cuda10.1-runtime-ubuntu18.04 image you want to
start from.

Markus

On Wed, Jun 3, 2020 at 10:43 PM Storer, Joey (J)  wrote:

> Hi,
>
>
>
> I am trying to run the following Docker file and the container fails to
> install rdkit.  Other incarnations install either the 2019 version or even
> the 2017 version.
>
>
>
> *#*
>
> *FROM rapidsai/rapidsai:0.12-cuda10.1-runtime-ubuntu18.04*
>
>
>
> *ARG ENVNAME=rapids*
>
> *ENV ENVNAME=$ENVNAME*
>
>
>
> *RUN source activate $ENVNAME && \*
>
> *conda install boost>='1.72.0,<1.72.1.0a0' cairo>='1.16.0,<1.17.0a0'
> freetype>='2.9.1,<3.0a0' libgcc-ng>='7.3.0' libstdcxx-ng>='7.3.0'
> numpy>='1.14.6,<2.0a0' pandas pillow pycairo python>='3.7,<3.8.0a0'
> python_abi='3.7.* *_cp37m' six*
>
>
>
> *RUN source activate $ENVNAME && \*
>
> *conda install "rdkit=2020.03.2=py37hdd87690_0"*
>
>
>
> *#*
>
>
>
> Any advice on getting RDKit into a Rapids/Ubuntu Docker container?
>
>
>
> Thanks!
>
> Joey Storer (Dow, Inc.)


Re: [Rdkit-discuss] RD Kit PostgreSQL in a container

2019-12-04 Thread Markus Sitzmann
Hi,

I am working on this for the next Chembience release (0.3.0, which I
hope will be out in January). It adds RDKit to the official Postgres
container repository at https://hub.docker.com/_/postgres

If you check out
https://github.com/chembience/chembience-postgresql-rdkit.git *and use
branch deploy*, it should be in working condition (images should be
available from Docker Hub).

It can be used by the provided docker-compose.yml script in the repository,
i.e. it can be started with *docker-compose up*

I will add more documentation and some improvements for the January release
:-). And it currently works only for Postgres 11.

Best,
Markus
https://chembience.com

On Wed, Dec 4, 2019 at 7:38 PM Webster Homer <
webster.ho...@milliporesigma.com> wrote:

> I’m looking at running  RD Kit Postgresql cartridge in a docker container.
> Has anyone done this? There are PostgreSQL containers available on line at
> https://hub.docker.com/_/postgres  if there is an existing dockerfile
> with the RDKit extension, that would be great.
>
>
>
> If not has anyone built one? Ideally I’d start from one of the existing
> dockerfiles.
>
>
>
> RDKit Postgresql in the current distribution is version 11.2, the
> dockerfiles on the hub include an 11 and an 11.6 version. Any idea as to
> which one to use?
>
>
>
> I’m new to dockerfiles, I’d appreciate any suggestions
>
>
>
> Regards,
>
> Webster Homer


Re: [Rdkit-discuss] Inchi/smiles conversion issue

2019-06-18 Thread Markus Sitzmann
Yes, this is a well-known problem. First of all, if more than one chemist is
present, you can always have long discussions about what the most stable
tautomeric form of a given compound (under certain conditions) might be.
However, in the case of InChI, if you ask the algorithm for the
tautomer-invariant representation of a compound, i.e., the canonical tautomer
(and the Standard InChI does this inherently), everybody agrees that in quite
a few cases the tautomer the InChI algorithm chooses as the canonical one is
rather odd
 :-)

Markus

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com
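
The round trip discussed in this thread can be sketched as follows, once via
the Standard InChI and once with the FixedH option mentioned below. This
assumes an RDKit build with InChI support; the helper name
roundtrip_via_inchi is only illustrative. Which tautomer comes back depends
on the InChI and RDKit versions, so no expected output is shown.

```python
from rdkit import Chem


def roundtrip_via_inchi(smiles, options=""):
    """Convert SMILES -> InChI -> SMILES and return (inchi, smiles_out).

    Hypothetical helper for illustration; not part of RDKit.
    """
    mol = Chem.MolFromSmiles(smiles)
    inchi = Chem.MolToInchi(mol, options=options)
    mol_back = Chem.MolFromInchi(inchi)
    return inchi, Chem.MolToSmiles(mol_back)


smiles_in = "Cc1ccc([nH]nc2)c2c1"  # first SMILES from the question

# Standard InChI: tautomer-invariant, so the position of the H may change.
std_inchi, std_smiles = roundtrip_via_inchi(smiles_in)

# Non-standard InChI with the fixed-H layer, as suggested in the thread.
fixed_inchi, fixed_smiles = roundtrip_via_inchi(smiles_in, options="/FixedH")

print(std_inchi, std_smiles)
print(fixed_inchi, fixed_smiles)
```

Comparing the two printed SMILES with the input shows whether the fixed-H
layer preserved the tautomer that was put in.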

> On 18. Jun 2019, at 18:41, Alexis Parenty  
> wrote:
> 
> Dear Jennifer,
> Many thanks for your response. Very useful tutorial on Inchi. I did not know 
> about the FixedH option:
> inchi = Chem.MolToInchi(mol, options='/FixedH')
> Best,
> Alexis
> 
>> On Tue, 18 Jun 2019 at 13:20, Jennifer Hemmerich 
>>  wrote:
>> Dear Alexis,
>> 
>> if you calculate the Standard Inchi it is invariant to tautomers (see here: 
>> https://www.inchi-trust.org/technical-faq-2/#6.1). Therefore the information 
>> which tautomer was converted is lost due to the Inchi conversion. If you 
>> want to keep the tautomer information you need to use the fixedH attribute 
>> for the inchi. But beware this makes it a non standard Inchi, and thus might 
>> not be comparable to other Inchis.
>> 
>> Hope this helps,
>> 
>> Jennifer
>> 
>>> On 18.06.19 12:59, Alexis Parenty wrote:
>>> Dear RdKiters,
>>> 
>>> Why is it that the stable tautomer of the following structure is lost 
>>> during inchi/smiles conversion?
>>> 
>>>  
>>> 
>>> 
>>> mol = Chem.MolFromSmiles("Cc1ccc([nH]nc2)c2c1")
>>> inchi = Chem.MolToInchi(mol)
>>> mol = Chem.MolFromInchi(inchi)
>>> smiles = Chem.MolToSmiles(mol)
>>> print(smiles)
>>> 
>>> ==> Cc1ccc2n[nH]cc2c1
>>>  
>>> 
>>> The H has shifted on the wrong Nitrogen…
>>> 
>>> Interestingly, if you remove the methyl, the shift no longer happens:
>>> 
>>> mol = Chem.MolFromSmiles("c1([nH]nc2)c21")
>>> inchi = Chem.MolToInchi(mol)
>>> mol = Chem.MolFromInchi(inchi)
>>> smiles = Chem.MolToSmiles(mol)
>>> print(smiles)
>>> ==>  c1([nH]nc2)c21
>>>  
>>> 
>>> Same issue for any secondary amides: if you pass the smiles of a secondary 
>>> amide, you end-up with the following unstable tautomer:
>>> 
>>>  
>>> 
>>> 
>>> Thanks,
>>> 
>>>  
>>> 
>>> Alexis
>>> 
>>> 
>>> 
>>> 


Re: [Rdkit-discuss] History of RDKit

2019-04-23 Thread Markus Sitzmann
Hi Paul,

maybe this is helpful, too:

https://cactus.nci.nih.gov/presentations/meeting-08-2011/Fri_Aft_Greg_Landrum_RDKit-PostgreSQL.pdf


Markus

On Tue, Apr 23, 2019 at 11:45 AM Czodrowski, Paul <
paul.czodrow...@tu-dortmund.de> wrote:

> Dear RDKitters,
>
>
>
> I’m using RDKit (of course!) for my “Data Science for Chemistry and
> Chemical Biology” class.
>
>
>
> Is anyone aware of a historic RDKit overview which is a bit more
> non-historic like this wonderful slide deck:
>
>
> https://www.rdkit.org/UGM/2012/Landrum_RDKit_UGM.History%20and%20Status.Final.pptx.pdf
>
>
>
>
>
> Best regards,
>
> Paul
>
>
>
>
>
>
>
> Prof. Dr. Paul Czodrowski
>
> Computational Chemical Biology
>
>
>
> *TU Dortmund University*
>
> Faculty of Chemistry and Chemical Biology
>
> Otto-Hahn-Strasse 6
>
> 44227 Dortmund
>
>
>
> Twitter www.twitter.com/czodrowskipaul
>
> Lab page www.czodrowskilab.org
>
> Music  www.czodrowskilab.org/music
>
>
>


Re: [Rdkit-discuss] 2019.03.1 RDKit Release

2019-04-09 Thread Markus Sitzmann
I appreciate this release and updated all Chembience components to RDKit
2019.03:

https://github.com/chembience/chembience/releases/tag/0.2.10

Best,
Markus

On Tue, Apr 9, 2019 at 5:43 AM Greg Landrum  wrote:

> Dear all,
>
> I'm pleased to announce that the next version of the RDKit - 2019.03 - is
> released. The release notes are below.
>
> The release files are on the github release page:
> https://github.com/rdkit/rdkit/releases/tag/Release_2019_03_1
>
> Binaries have been uploaded to anaconda.org (https://anaconda.org/rdkit).
> The available conda binaries for this release are:
> Linux 64bit: python 3.6, 3.7
> Mac OS 64bit: python 3.6, 3.7
> Windows 64bit: python 3.6, 3.7
>
> I believe that conda-forge will also switch to the new version in the near
> future.
>
> Please note that the RDKit no longer supports Python 2.7. More details on
> this here:
>
> https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg08354.html
>
> I plan to put conda builds of the PostgreSQL cartridge up in the near
> future.
>
> The online version of the documentation at rdkit.org (
> http://rdkit.org/docs/index.html) has been updated.
>
> Some things that will be finished over the next couple of days:
> - The conda build scripts will be updated to reflect the new version
> - The homebrew script
>
> Thanks to everyone who submitted code, bug reports, and suggestions for
> this release!
>
> Please let me know if you find any problems with the release or have
> suggestions for the next one, which is scheduled for October 2019.
>
> Best Regards,
> -greg
>
> # Release_2019.03.1
> (Changes relative to Release_2018.09.1)
>
> ## REALLY IMPORTANT ANNOUNCEMENT
> - As of this release (2019.03.1) the RDKit no longer supports Python 2.
>   Please read this rdkit-discuss post to learn what your options are if you
>   need to keep using Python 2:
>
> https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg08354.html
>
> ## Backwards incompatible changes
> - The fix for github #2245 means that the default behavior of the
>   MaxMinPicker is now truly random. If you would like to reproduce the
>   previous behavior, provide a seed value of 42.
> - The uncharging method in the MolStandardizer now attempts to generate
>   canonical results for a given molecule. This may result in different
>   output for some molecules.
>
> ## Highlights:
> - There's now a Japanese translation of large parts of the RDKit
>   documentation
> - SGroup data can now be read from and written to Mol/SDF files
> - The enhanced stereo handling has been improved: the information is now
>   accessible from Python, EnumerateStereoisomers takes advantage of it,
>   and it can be read from and written to CXSmiles
>
> ## Acknowledgements:
> Michael Banck, Francois Berenger, Thomas Blaschke, Brian Cole, Andrew Dalke,
> Bakary N'tji Diallo, Guillaume Godin, Anne Hersey, Jan Holst Jensen,
> Sunhwan Jo, Brian Kelley, Petr Kubat, Karl Leswing, Susan Leung, John
> Mayfield, Adam Moyer, Dan Nealschneider, Noel O'Boyle, Stephen Roughley,
> Takayuki Serizawa, Gianluca Sforna, Ricardo Rodriguez Schmidt, Matt Swain,
> Paolo Tosco, Ricardo Vianello, 'John-Videogames', 'magattaca', 'msteijaert',
> 'paconius', 'sirbiscuit'
>
> ## Bug Fixes:
>   - PgSQL: fix boolean definitions for Postgresql 11
>  (github pull #2129 from pkubatrh)
>   - update fingerprint tutorial notebook
>  (github pull #2130 from greglandrum)
>   - Fix typo in RecapHierarchyNode destructor
>  (github pull #2137 from iwatobipen)
>   - SMARTS roundtrip failure
>  (github issue #2142 from mcs07)
>   - Error thrown in rdMolStandardize.ChargeParent
>  (github issue #2144 from paconius)
>   - SMILES parsing inconsistency based on input order
>  (github issue #2148 from coleb)
>   - MolDraw2D: line width not in python wrapper
>  (github issue #2149 from greglandrum)
>   - Missing Python API Documentation
>  (github issue #2158 from greglandrum)
>   - PgSQL: mol_to_svg() changes input molecule.
>  (github issue #2174 from janholstjensen)
>   - Remove Unicode From AcidBasePair Name
>  (github pull #2185 from lilleswing)
>   - Inconsistent treatment of `[as]` in SMILES and SMARTS
>  (github issue #2197 from greglandrum)
>   - RGroupDecomposition fixes, keep userLabels more robust onlyMatchAtRGroups
>  (github pull #2202 from bp-kelley)
>   - Fix TautomerTransform in operator=
>  (github pull #2203 from bp-kelley)
>   - testEnumeration hangs/takes where long on 32bit architectures
>  (github issue #2209 from mbanck)
>   - Silencing some Python 3 warning messages
>  (github pull #2223 from coleb)
>   - removeHs shouldn't remove atom lists
>  (github issue #2224 from rvianello)
>   - failure round-tripping mol block with Q atom
>  (github issue #2225 from rvianello)
>   - problem round-tripping mol files that include bond topology info
>  (github issue #2229 from rvianello)
>   - aromatic main-group atoms written to SMARTS incorrectly
>  (github issue #2237 from g

Re: [Rdkit-discuss] Beta of the 2019.03 release available

2019-04-05 Thread Markus Sitzmann
Hi Greg,

my Chembience RDKit image build with version 2019.03-b1b went fine (well, I
just pull it with conda; in case someone is interested it is available with
tag 0.2.10-beta-1 at Dockerhub).

For the Postgres extension (which I still compile myself during the Docker
build against Postgres), your Python 3 enforcement uncovered some dark
corners of my build process, but that is fixed. However, compiling
2019.03-b1b against Postgres 11 fails (am I too cheeky?).

Markus

On Wed, Apr 3, 2019 at 11:38 AM Greg Landrum  wrote:

> Dear all,
>
> The beta of the 2019.03 RDKit release has been tagged in github:
> https://github.com/rdkit/rdkit/releases/tag/Release_2019_03_1b1
>
> There are a couple more bug fixes and maybe one more feature expected
> before the actual release, but I wanted to go ahead and get the beta out
> there.
>
> I've done conda builds for Python 3.6 and 3.7 for Windows, Mac, and Linux.
> These all use the beta label so that they do not install by default; you'll
> need to run "conda install" as follows:
>
> conda install -c rdkit/label/beta rdkit
>
> Be sure to confirm that it's installing the right version when you are
> prompted (if there's no build available, it will pick the current
> production release instead).
>
> The relevant section of the release notes is below, or you can see a
> nicely formatted version here:
> https://github.com/rdkit/rdkit/releases/tag/Release_2019_03_1b1
>
> As usual, if you have time to try out the new release I would love
> feedback. If nothing major comes up, I plan to do the actual release early
> next week.
>
> Best,
> -greg
>
> # Release_2019.03.1
> (Changes relative to Release_2018.09.1)
>
> ## REALLY IMPORTANT ANNOUNCEMENT
> - As of this release (2019.03.1) the RDKit no longer supports Python 2.
> Please read this rdkit-discuss post to learn what your options are if you
> need to keep using Python 2:
>   
> https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg08354.html
>
> ## Backwards incompatible changes
> - The fix for github #2245 means that the default behavior of the MaxMinPicker
>   is now truly random. If you would like to reproduce the previous behavior,
>   provide a seed value of 42.
> - The uncharging method in the MolStandardizer now attempts to generate
>   canonical results for a given molecule. This may result in different output
>   for some molecules.
>
> ## Highlights:
> - There's now a Japanese translation of large parts of the RDKit documentation
> - SGroup data can now be read from and written to Mol/SDF files
> - The enhanced stereo handling has been improved: the information is now
>   accessible from Python, EnumerateStereoisomers takes advantage of it, and it
>   can be read from and written to CXSmiles
>
> ## Acknowledgements:
> Michael Banck, Francois Berenger, Thomas Blaschke, Brian Cole, Andrew Dalke,
> Bakary N'tji Diallo, Guillaume Godin, Jan Holst Jensen, Sunhwan Jo, Brian
> Kelley, Petr Kubat, Karl Leswing, Susan Leung, John Mayfield, Adam Moyer, Dan
> Nealschneider, Noel O'Boyle, Stephen Roughley, Takayuki Serizawa, Gianluca
> Sforna, Ricardo Rodriguez Schmidt, Matt Swain, Paolo Tosco, Ricardo Vianello,
> 'John-Videogames', 'magattaca', 'msteijaert', 'paconius', 'sirbiscuit'
>
> ## Bug Fixes:
>   - PgSQL: fix boolean definitions for Postgresql 11
>  (github pull #2129 from pkubatrh)
>   - update fingerprint tutorial notebook
>  (github pull #2130 from greglandrum)
>   - Fix typo in RecapHierarchyNode destructor
>  (github pull #2137 from iwatobipen)
>   - SMARTS roundtrip failure
>  (github issue #2142 from mcs07)
>   - Error thrown in rdMolStandardize.ChargeParent
>  (github issue #2144 from paconius)
>   - SMILES parsing inconsistency based on input order
>  (github issue #2148 from coleb)
>   - MolDraw2D: line width not in python wrapper
>  (github issue #2149 from greglandrum)
>   - Missing Python API Documentation
>  (github issue #2158 from greglandrum)
>   - PgSQL: mol_to_svg() changes input molecule.
>  (github issue #2174 from janholstjensen)
>   - Remove Unicode From AcidBasePair Name
>  (github pull #2185 from lilleswing)
>   - Inconsistent treatment of `[as]` in SMILES and SMARTS
>  (github issue #2197 from greglandrum)
>   - RGroupDecomposition fixes, keep userLabels more robust onlyMatchAtRGroups
>  (github pull #2202 from bp-kelley)
>   - Fix TautomerTransform in operator=
>  (github pull #2203 from bp-kelley)
>   - testEnumeration hangs/takes where long on 32bit architectures
>  (github issue #2209 from mbanck)
>   - Silencing some Python 3 warning messages
>  (github pull #2223 from coleb)
>   - removeHs shouldn't remove atom lists
>  (github issue #2224 from rvianello)
>   - failure round-tripping mol block with Q atom
>  (github issue #2225 from rvianello)
>   - problem round-tripping mol files that include bond topology info
>  (github issue #2229 from rvianello)
>   - aromatic main-group atoms written to SMARTS incorrectly
>  (github issue #2237 from gregland

Re: [Rdkit-discuss] chemfp preprint

2019-03-22 Thread Markus Sitzmann
Yes, we all love ref 57.

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 22. Mar 2019, at 20:39, Andrew Dalke  wrote:
> 
> Hi RDKit users,
> 
>  This week I submitted a paper about chemfp for publication. I also submitted 
> a preprint on ChemRxiv, which was just accepted.
> 
> For those interested, it's at 
> https://chemrxiv.org/articles/The_Chemfp_Project/7877846 .
> 
> It's a rather long paper as it covers many aspects about the chemfp project, 
> including the FPS and FPB formats, search algorithms, details about the 
> different ways to compute a popcount, and memory bandwidth and latency 
> bottlenecks. On a non-technical level I also describe some of the 
> difficulties I ran into trying to run chemfp as "commercial free software."
> 
> Let me know of any corrections or improvements, or any other feedback you 
> might have.
> 
> Cheers,
> 
>Andrew
>da...@dalkescientific.com
> 
> 
> 
> 


Re: [Rdkit-discuss] RDKIT Build Problems

2019-03-06 Thread Markus Sitzmann
Thanks Greg. 

I have the problem at CI, too. It was 100% failure rate the last two days. At 
home, occasionally. At least, it isn’t only me :-)

Markus


-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 6. Mar 2019, at 17:25, Greg Landrum  wrote:
> 
> 
> 
>> On Wed, Mar 6, 2019 at 1:15 PM Markus Sitzmann  
>> wrote:
>> 
>> Does anyone else have the same problem, or an explanation of what's going
>> on? The Avalon tools seem to have been unchanged for quite a while.
> 
> I've seen it a couple of times in the CI builds for the RDKit over the past 
> couple of days. It's caused by downloads failing since sourceforge is 
> unreliable.
> 
> -greg
> 
> 


[Rdkit-discuss] RDKIT Build Problems

2019-03-06 Thread Markus Sitzmann
Hi everybody,

Currently (the last couple of days) the build of one of my Docker images
stopped working:

https://github.com/chembience/chembience/blob/develop/context/build/rdkit-postgres-compile/Dockerfile

This has happened sporadically over the last few years and usually
disappeared after a rebuild. Currently it is annoying enough to write an
email. The build ends with:

"""
== Using strict rotor definition
Downloading
http://sourceforge.net/projects/avalontoolkit/files/AvalonToolkit_1.2/AvalonToolkit_1.2.0.source.tar.
..
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

100   178  100   178    0     0   1181      0 --:--:-- --:--:-- --:--:--  1186
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (35) Unknown SSL protocol error in connection to sourceforge.net:443
*CMake Error at Code/cmake/Modules/RDKitUtils.cmake:215 (MESSAGE):*
*  The md5 checksum for*
*  /opt/rdkit/External/AvalonTools/AvalonToolkit_1.2.0.source.tar is*
*  incorrect; expected: 092a94f421873f038aa67d4a6cc8cb54, found:*
*  d41d8cd98f00b204e9800998ecf8427e*
*Call Stack (most recent call first):*
*  External/AvalonTools/CMakeLists.txt:29 (downloadAndCheckMD5)*


-- Configuring incomplete, errors occurred!
See also "/opt/rdkit/build/CMakeFiles/CMakeOutput.log".
See also "/opt/rdkit/build/CMakeFiles/CMakeError.log".
ERROR: Service 'rdkit-postgres-compile' failed to build: The command
'/bin/sh -c wget --quiet -O -
https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -  && echo
'deb http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main' >
/etc/apt/sources.list.d/pgdg.list  && apt-get update && apt-get install -y
--no-install-recommends  postgresql-server-dev-all
 postgresql-client postgresql-plpython-${PG_VERSION}
 postgresql-plpython3-${PG_VERSION} python-numpy python-dev
 sqlite3 libsqlite3-dev libboost-dev libboost-system-dev
 libboost-thread-dev libboost-serialization-dev
 libboost-python-dev libboost-regex-dev libeigen3-dev && git
clone -b $RDKIT_BRANCH --single-branch https://github.com/rdkit/rdkit.git
&& mkdir $RDBASE/build && cd $RDBASE/build && cmake
 -DRDK_BUILD_INCHI_SUPPORT=ON   -DRDK_BUILD_PGSQL=ON
 -DRDK_BUILD_AVALON_SUPPORT=ON
 -DPostgreSQL_TYPE_INCLUDE_DIR="/usr/include/postgresql/${PG_VERSION}/server"
 -DPostgreSQL_ROOT="/usr/lib/postgresql/${PG_VERSION}" .. && make
-j `nproc` && make install' returned a non-zero code: 1
Exited with code 1
"""

Does anyone else have the same problem, or an explanation of what's going on?
The Avalon tools seem to have been unchanged for quite a while.

Thanks,
Markus
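
A note on the checksum in the log above: the "found" value,
d41d8cd98f00b204e9800998ecf8427e, is the MD5 digest of zero bytes, so the
file that CMake checked was empty. That fits a download that failed rather
than a changed Avalon archive. The sketch below is only an illustration in
Python; the helper names are made up and it is not part of the RDKit build.

```python
import hashlib

# MD5 digest of zero bytes; a file with this digest is empty.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def diagnose_download(found_md5, expected_md5):
    """Classify a checksum result as 'ok', 'empty' or 'corrupt'."""
    if found_md5 == expected_md5:
        return "ok"
    if found_md5 == EMPTY_MD5:
        return "empty"
    return "corrupt"


# The two digests reported in the CMake error message.
print(diagnose_download("d41d8cd98f00b204e9800998ecf8427e",
                        "092a94f421873f038aa67d4a6cc8cb54"))  # empty
```

An "empty" result suggests retrying the download; a "corrupt" result would
point to a truncated or changed file instead.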


Re: [Rdkit-discuss] RDKit Release 2018.09.2 available

2019-03-06 Thread Markus Sitzmann
Hi Greg,

in the meantime, I found a solution: I added a "conda update" as the final
step when I build my base python/conda container. The RDKit container is
built on top of that, and it now finds RDKit 2018.09.2 without my telling
conda the version number explicitly. Why this worked before without this
step, and why it is necessary even though I build the containers basically
from scratch and from the newest version, I still don't know.

Best,
Markus
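
The version that ends up in the container can also be checked at build or
start-up time. A minimal sketch, assuming the version string is purely
numeric like 2018.09.2 (a beta tag would need extra handling);
parse_release and require_at_least are hypothetical helper names, not RDKit
or conda functions:

```python
def parse_release(version):
    """Turn a version string such as '2018.09.2' into the tuple (2018, 9, 2)."""
    return tuple(int(part) for part in version.split(".")[:3])


def require_at_least(installed, minimum):
    """Return `installed` if it is not older than `minimum`, else raise."""
    if parse_release(installed) < parse_release(minimum):
        raise RuntimeError(
            "found version %s, expected at least %s" % (installed, minimum)
        )
    return installed


# The two versions discussed in this thread.
print(parse_release("2018.09.1") < parse_release("2018.09.2"))  # True
```

With RDKit installed, rdkit.__version__ could be passed as the first
argument of require_at_least, so that a build which silently picked up
2018.09.1 fails early.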

On Sat, Feb 23, 2019 at 12:29 AM Dimitri Maziuk via Rdkit-discuss <
rdkit-discuss@lists.sourceforge.net> wrote:

> On 2/22/19 5:01 PM, Markus Sitzmann wrote:
>
> > It is odd, but one thing I learned from using conda is, sometimes it helps
> > to ignore problems and wait for a bit and they might go away ... well, I
> > have similar experiences with maven :-) ... but most likely I do something
> > stupid which I don't see right now :-)
>
> Simple test is to make a clean one and install only rdkit and nothing
> else and see what happens. It's pretty common for packagers to do
> something-that-may-or-may-not-be-stupid and have a dependency on an
> specific version of some other package that depends on a specific
> version of another package that depends on... turtles all the way down.
>
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>


Re: [Rdkit-discuss] RDKit Release 2018.09.2 available

2019-02-22 Thread Markus Sitzmann
Hi Greg,

unfortunately, the problem persists. It isn't a big one, since conda does
install 2018.09.2 if I ask for that version explicitly. Only when I just
ask it to install rdkit does it still deliver version 2018.09.1 (and when I
then ask the very same conda instance to search for all available rdkit
packages, it even finds 2018.09.2).

It is odd, but one thing I learned from using conda is, sometimes it helps
to ignore problems and wait for a bit and they might go away ... well, I
have similar experiences with maven :-) ... but most likely I do something
stupid which I don't see right now :-)

Anyway, thanks for your reply.
Markus


On Fri, Feb 22, 2019 at 9:05 AM Greg Landrum  wrote:

> Hi Markus,
>
> I can't reproduce that. Here's what I get when I create a new environment:
>
> (tmp) glandrum@otter:~/Code/rdkit_containers/docker$ conda install
> conda-forge::rdkit
> Collecting package metadata: done
> Solving environment: done
>
> ## Package Plan ##
>
>   environment location: /other_linux/home/glandrum/anaconda3/envs/tmp
>
>   added / updated specs:
> - conda-forge::rdkit
>
>
> The following packages will be downloaded:
>
>     package                    |            build
>     ---------------------------|-----------------
>     cairo-1.16.0               |    ha4e643d_1000         1.5 MB  conda-forge
>
>     rdkit-2018.09.2            |   py37h270f4b7_0        20.0 MB  conda-forge
>
>                                            Total:        32.6 MB
>
> The following NEW packages will be INSTALLED:
>
>   blas   pkgs/main/linux-64::blas-1.0-mkl
>
>   boost  conda-forge/linux-64::boost-1.68.0-py37h8619c78_1001
>
>   boost-cpp  conda-forge/linux-64::boost-cpp-1.68.0-h11c811c_1000
> 
>   rdkit  conda-forge/linux-64::rdkit-2018.09.2-py37h270f4b7_0
>
> 
>
> Maybe it was connected to the new version just having appeared? Do you
> still have the same problem?
>
> -greg
>
>
>
>
>
> On Fri, Feb 22, 2019 at 12:43 AM Markus Sitzmann <
> markus.sitzm...@gmail.com> wrote:
>
>> Hi Greg,
>>
>> I just saw it is available in the conda-forge channel (with a time stamp
>> of 2 hours and a few minutes ago). However, if I install it from there (in
>> a fresh container) I receive 2018_09_1; only when I explicitly force version
>> 2018_09_2 do I receive it (and at a quick glance it runs).
>>
>> But why do I have to request version 2018_09_2 explicitly (at least right
>> now)? This is one of the few things I will never understand about conda.
>>
>> Markus
>>
>>
>> On Thu, Feb 21, 2019 at 5:32 PM Greg Landrum 
>> wrote:
>>
>>> Dear all,
>>>
>>> I normally don't announce the patch releases, but there are a couple of
>>> changes with the conda builds, so I figured I should probably mention it.
>>> :-)
>>>
>>> This time I did builds for:
>>> Python 3.7: Mac, Linux, Windows
>>> Python 3.6: Mac, Linux, Windows
>>> Python 2.7: Mac, Linux
>>>
>>> The boost and numpy dependencies have also been changed.
>>>
>>> The conda-forge channel should be updated in the near future as well.
>>>
>>> The release notes and source download are here:
>>> https://github.com/rdkit/rdkit/releases/tag/Release_2018_09_2
>>>
>>> Hopefully this all works smoothly, but I'm not 100% optimistic about
>>> that; please let me know if you encounter any problems with the new builds!
>>> -greg
>>>
>>


Re: [Rdkit-discuss] RDKit Release 2018.09.2 available

2019-02-21 Thread Markus Sitzmann
Hi Greg,

I just saw it is available in the conda-forge channel (with a time stamp of
2 hours and a few minutes ago). However, if I install it from there (in a
fresh container) I receive 2018_09_1; only when I explicitly force version
2018_09_2 do I receive it (and at a quick glance it runs).

But why do I have to request version 2018_09_2 explicitly (at least right
now)? This is one of the few things I will never understand about conda.

Markus


On Thu, Feb 21, 2019 at 5:32 PM Greg Landrum  wrote:

> Dear all,
>
> I normally don't announce the patch releases, but there are a couple of
> changes with the conda builds, so I figured I should probably mention it.
> :-)
>
> This time I did builds for:
> Python 3.7: Mac, Linux, Windows
> Python 3.6: Mac, Linux, Windows
> Python 2.7: Mac, Linux
>
> The boost and numpy dependencies have also been changed.
>
> The conda-forge channel should be updated in the near future as well.
>
> The release notes and source download are here:
> https://github.com/rdkit/rdkit/releases/tag/Release_2018_09_2
>
> Hopefully this all works smoothly, but I'm not 100% optimistic about that;
> please let me know if you encounter any problems with the new builds!
> -greg
>


Re: [Rdkit-discuss] Warning as error

2019-01-21 Thread Markus Sitzmann
Maybe this helps (at least, it is from Greg):

https://github.com/rdkit/rdkit/issues/642

Markus
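
For the question of which molecules cause problems: SDMolSupplier yields
None for a record it cannot parse at all, so those positions can be
collected directly. A minimal sketch, with a plain list standing in for the
supplier so that it runs without RDKit; failed_record_indices is a made-up
name. Records that load but only trigger a warning are not caught this way.

```python
def failed_record_indices(supplier):
    """Return the zero-based positions of records that came back as None."""
    return [index for index, mol in enumerate(supplier) if mol is None]


# With RDKit installed, the supplier from the original question could be
# passed in directly:
#     from rdkit import Chem
#     bad = failed_record_indices(Chem.SDMolSupplier('my_file.sdf'))

# Stand-in data: two readable records and two that failed to parse.
demo_supplier = ["mol0", None, "mol2", None]
print(failed_record_indices(demo_supplier))  # [1, 3]
```

The returned positions can then be used to run a specific action on, or to
inspect, exactly those entries of the SD file.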

On Mon, Jan 21, 2019 at 2:25 PM Jean-Marc Nuzillard <
jm.nuzill...@univ-reims.fr> wrote:

> I am more interested in knowing which molecules cause problems
> than in suppressing the warning messages in the console window.
> I am looking for an option that would turn warnings into errors, if there is one.
>
> Jean-Marc
>
>
>
> On 21/01/2019 at 13:44, Stephen O'hagan wrote:
> > I've had similar problems; none of the claimed methods to switch off
> > RDKit logging of warnings has worked for me.
> >
> > I ended up just re-directing stderr when running the script like this:
> >
> > python myfile.py  2> myErrorLog.txt
> >
> > 
> > Dr. Steve O'Hagan,
> >
> >
> > -Original Message-
> > From: Jean-Marc Nuzillard [mailto:jm.nuzill...@univ-reims.fr]
> > Sent: 21 January 2019 12:33
> > To: RDKit Discuss 
> > Subject: [Rdkit-discuss] Warning as error
> >
> > Dear all,
> >
> > The minimalist python code:
> >   from rdkit import Chem
> >
> >   reader = Chem.SDMolSupplier('my_file.sdf')
> >   for mol in reader:
> >       pass
> >
> > gives me warning messages when run on a particular SD file.
> > How can I simply run a specific action for the molecules that cause
> > problems, possibly using try/except statements?
> > Best,
> >
> > Jean-Marc
> >
> >
> > --
> > Jean-Marc Nuzillard
> > Directeur de Recherches au CNRS
> >
> > Institut de Chimie Moléculaire de Reims
> > CNRS UMR 7312
> > Moulin de la Housse
> > CPCBAI, Bâtiment 18
> > BP 1039
> > 51687 REIMS Cedex 2
> > France
> >
> > Tel : 03 26 91 82 10
> > Fax : 03 26 91 31 66
> > http://www.univ-reims.fr/ICMR
> > http://eos.univ-reims.fr/LSD/CSNteam.html
> >
> > http://www.univ-reims.fr/LSD/
> > http://www.univ-reims.fr/LSD/JmnSoft/
> >
> >
>
>
> --
> Jean-Marc Nuzillard
> Directeur de Recherches au CNRS
>
> Institut de Chimie Moléculaire de Reims
> CNRS UMR 7312
> Moulin de la Housse
> CPCBAI, Bâtiment 18
> BP 1039
> 51687 REIMS Cedex 2
> France
>
> Tel : 03 26 91 82 10
> Fax : 03 26 91 31 66
> http://www.univ-reims.fr/ICMR
> http://eos.univ-reims.fr/LSD/CSNteam.html
>
> http://www.univ-reims.fr/LSD/
> http://www.univ-reims.fr/LSD/JmnSoft/
>
>
>


Re: [Rdkit-discuss] Dividing inputstream over threads

2019-01-21 Thread Markus Sitzmann
> SQLalchemy creates a fairly specific ecosystem that you have to buy
> into for it to make sense. When you don't have objects, only a table
> of properties, OR mapper is just bloat.

There is no need for objects with SQLAlchemy. SQLAlchemy Core and its
expression language are pretty excellent without objects ...

> With parallel processing your bottleneck is going to be database
> inserts. One option is write out CSV file(s) from each thread/job,
> concatenate them in the final node, and then bulk-import into the
> database: typically CSV (or other such format) bulk import is orders
> of magnitude faster than inserting one SQL statement at a time.

... and it handles bulk inserts of Python data types into the database.

Markus
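
As an illustration of the bulk-insert point, here is a minimal sketch. It
uses the standard-library sqlite3 module as a stand-in, not SQLAlchemy Core
itself, and the table, columns and rows are invented for the example:

```python
import sqlite3


def bulk_insert(connection, rows):
    """Insert all (smiles, mol_weight) tuples in one executemany call."""
    with connection:  # commits on success, rolls back on error
        connection.executemany(
            "INSERT INTO compound (smiles, mol_weight) VALUES (?, ?)", rows
        )


# Illustrative rows, e.g. results collected from several worker jobs.
rows = [("CCO", 46.07), ("c1ccccc1", 78.11), ("CC(=O)O", 60.05)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE compound (smiles TEXT, mol_weight REAL)")
bulk_insert(conn, rows)

count = conn.execute("SELECT COUNT(*) FROM compound").fetchone()[0]
print(count)  # 3
```

The point is the same as in the thread: hand the database a whole batch of
plain Python values at once instead of issuing one statement per record.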

On Sun, Jan 20, 2019 at 9:17 PM Dmitri Maziuk via Rdkit-discuss <
rdkit-discuss@lists.sourceforge.net> wrote:

> On Sun, 20 Jan 2019 12:03:50 +0100
> Shojiro Shibayama  wrote:
>
> > ... I guess SQLalchemy
> > in python might be good, but I'm not sure. Hope that you'll find out
> > a good library of SQL OR mapper for python.
>
> SQLalchemy creates a fairly specific ecosystem that you have to buy
> into for it to make sense. When you don't have objects, only a table
> of properties, OR mapper is just bloat.
>
> With parallel processing your bottleneck is going to be database
> inserts. One option is write out CSV file(s) from each thread/job,
> concatenate them in the final node, and then bulk-import into the
> database: typically CSV (or other such format) bulk import is orders
> of magnitude faster than inserting one SQL statement at a time.
>
> --
> Dmitri Maziuk 
>
>


Re: [Rdkit-discuss] InChI to Mol to InChi

2018-12-18 Thread Markus Sitzmann
I vaguely remember that InChI gives precedence to 3D coordinates, if 
present, over anything else when determining stereochemistry. I think that 
is what happens here: the AllChem embedding adds 3D coordinates, which are 
not present in the original molecule created straight from the InChI. The 
minimization of the structure during the embedding is probably “turning 
around” the stereochemistry (you could have a long discussion about whether 
this is a bug or a feature).

Markus
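
The difference Christos points out can be made visible by splitting the two
InChI strings from this thread into layers. A minimal sketch, assuming a
standard InChI in which each layer prefix occurs only once; inchi_layers
and differing_layers are made-up helper names:

```python
def inchi_layers(inchi):
    """Split an InChI string into a dict keyed by layer prefix."""
    parts = inchi.split("/")
    layers = {"version": parts[0], "formula": parts[1]}
    for part in parts[2:]:
        layers[part[0]] = part[1:]  # e.g. 'b13-6-,14-7+' -> {'b': '13-6-,14-7+'}
    return layers


def differing_layers(inchi_a, inchi_b):
    """Return {layer: (value_a, value_b)} for every layer that differs."""
    a, b = inchi_layers(inchi_a), inchi_layers(inchi_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# InChI of m1 (from the input) and of m2 (after embedding), as posted.
inchi_m1 = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
inchi_m2 = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7-/t17-,19-/m1/s1"

print(differing_layers(inchi_m1, inchi_m2))
# {'b': ('13-6-,14-7+', '13-6-,14-7-')}
```

Only the double-bond stereo layer (/b) differs, and within it only the
14-7 bond.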

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 18. Dec 2018, at 19:43, Jason Biggs  wrote:
> 
> see https://github.com/rdkit/rdkit/issues/1852, and 
> https://sourceforge.net/p/rdkit/mailman/message/36309813/
> 
> You can see it in the smiles if you remove stereo after embedding, then 
> re-detect stereo from the conformation.
> 
> from rdkit import Chem
> from rdkit.Chem import AllChem
>
> inchi1 = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
> m1 = Chem.MolFromInchi(inchi1)
> m1 = Chem.AddHs(m1)
> m2 = Chem.Mol(m1)
> AllChem.EmbedMolecule(m2)
> m3 = Chem.Mol(m2)
> # drop the stereo flags, then re-perceive them from the embedded conformer
> Chem.rdmolops.RemoveStereochemistry(m3)
> Chem.rdmolops.AssignStereochemistryFrom3D(m3)
> sm1 = Chem.MolToSmiles(m1)
> sm2 = Chem.MolToSmiles(m2)
> sm3 = Chem.MolToSmiles(m3)
> print(sm1 == sm2)  # returns true
> print(sm2 == sm3)  # returns false
> 
> The difference between sm2 and sm3 is just swapping a \ for a /, confirming 
> what Christos was able to read from the InChI.
> 
> Why does the inchi reflect the 3D bond stereo but the smiles doesn't until 
> you remove and re-detect the stereo?  Does the InChI code go to the 3D 
> structure when present and ignore stereo information in the mol object?
> 
> Jason Biggs
> 
> 
>> On Tue, Dec 18, 2018 at 12:14 PM Christos Kannas  
>> wrote:
>> Hi Jean-Marc,
>> 
>> The difference is due to bond orientation (if my InChI analysis skills are 
>> correct).
>> See the bond stereo layer (/b) below (14-7+ vs 14-7-).
>> 
>> m1 -> 
>> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1
>> 
>> m2 -> 
>> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7-/t17-,19-/m1/s1
>> 
>> Not sure why it happens, but I've seen it multiple times...
>> 
>> Best,
>> Christos
>> 
>> Christos Kannas
>> 
>> Chem[o]informatics Researcher & Software Developer
>> 
>> 
>> 
>> 
>> 
>>> On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD 
>>>  wrote:
>>> Thank you for your answer but alatis might not be adapted to my current 
>>> problem.
>>> 
>>> Attempting to understand what was changed by the embedding step I wrote:
>>> 
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>>
>>> inchi1 = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
>>> m1 = Chem.MolFromInchi(inchi1)
>>> m1 = Chem.AddHs(m1)
>>> m2 = Chem.Mol(m1)
>>> AllChem.EmbedMolecule(m2)
>>> sm1 = Chem.MolToSmiles(m1)
>>> sm2 = Chem.MolToSmiles(m2)
>>> print(sm1)
>>> print(sm2)
>>> print(sm1 == sm2)
>>> inc1 = Chem.MolToInchi(m1)
>>> inc2 = Chem.MolToInchi(m2)
>>> print(inc1)
>>> print(inc2)
>>> print(inc1 == inc2)
>>> 
>>> Molecules m1 and m2 have identical SMILES representations
>>> but different InChI representations, which I find odd.
>>> 
>>> All the best,
>>> 
>>> Jean-Marc
>>> 
>>> 
>>> 
>>> 
>>> On 18/12/2018 00:40, Dimitri Maziuk via Rdkit-discuss wrote:
>>> > On 12/17/18 4:50 PM, JEAN-MARC NUZILLARD wrote:
>>> >> Is there any more deterministic procedure than the one of trying until
>>> >> success is obtained?
>>> >> 
>>> >> How do I determine the InChI string of a conformer obtained after
>>> >> multiple embedding?
>>> > 
>>> > This representation keeps 3D config: http://alatis.nmrfam.wisc.edu/
>>> > 
>>> > Generally speaking the problem with InChI is that the only *required*
>>> > layer is the formula. Therefore *an* InChI string cannot be used to
>>> > differentiate conformers, you need the InChI string with all the
>>> > relevant layers and all the proton

Re: [Rdkit-discuss] Chembience

2018-10-31 Thread Markus Sitzmann
Hello,

I have released Chembience 0.2.6. It switches Python from 3.6 to 3.7 and
updates RDKit to 2018.09.1. Just to mention it: the Docker images of all
previous releases are still available from Docker Hub.

https://github.com/chembience/chembience/releases/tag/v0.2.6

https://twitter.com/markussitzmann/status/105216581521409

Markus







On Tue, Apr 24, 2018 at 10:44 AM Markus Sitzmann 
wrote:

> Hello,
>
> since it includes RDKit as one of its major components I am happy to
> announce the first release of my new open-source project Chembience:
>
> A Docker-based, cloudable platform for the development of
> chemoinformatics-centric web applications and microservices.
>
> https://github.com/chembience/chembience
>
> (unfortunately it is still on RDKit 2017.09_3, I failed releasing it
> before 2018.03 :-) ).
>
> Best,
> Markus
>


Re: [Rdkit-discuss] Compilation Errors on RHEL7

2018-10-24 Thread Markus Sitzmann


Re: [Rdkit-discuss] Chembience

2018-10-24 Thread Markus Sitzmann
Feedback so far has been kind words and Twitter likes :-). Looking at my GitHub 
stats, I also see some clones. 

I am happy with it so far. I know it is still a bit heady and I have 
to improve the documentation a lot. I also want to build some easily 
distributable open chemoinformatics projects on top of it, which I hope will 
create more interest. From my Chemical Identifier Resolver days I know you have 
to be patient.

Markus

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 24. Oct 2018, at 17:56, Greg Landrum  wrote:
> 
> Glad that update went ok.
> 
> Have you gotten any feedback about this yet?
> 
>> On Wed, 24 Oct 2018 at 14:10, Markus Sitzmann  
>> wrote:
>> 
>> Hello,
>> 
>> I have released Chembience 0.2.5:  it updates the Docker images of the 
>> Django and the Jupyter notebook app in Chembience + the Postgres extension 
>> of the Chembience database image to RDKit 2018.09 (and it went really smooth 
>> :-) )  
>> 
>> https://twitter.com/markussitzmann/status/1055047319660490753
>> 
>> https://github.com/chembience/chembience/releases  
>> 
>> https://www.chembience.com
>> 
>> Best,
>> Markus
>> 
>>> On Fri, Aug 31, 2018 at 11:20 PM Markus Sitzmann 
>>>  wrote:
>>> Hello,
>>> 
>>> I have put together another Chembience release (0.2.3): update of RDKit to 
>>> version 2018.03.4, Postgres to version 10.5, and Django to 2.1
>>> 
>>> https://github.com/chembience/chembience
>>> 
>>> https://twitter.com/markussitzmann/status/1035629283736264704
>>> 
>>> Best,
>>> Markus
>>> 
>>>> On Sun, Jun 10, 2018 at 4:41 PM Markus Sitzmann 
>>>>  wrote:
>>>> Hello,
>>>> 
>>>> I have just released Chembience 0.2.1: it updates RDKit to version 
>>>> 2018.03.2 and switches Postgres from the 9.x series to version 10.4 
>>>> 
>>>> https://github.com/chembience/chembience
>>>> 
>>>> Best,
>>>> Markus
>>>> 
>>>> 
>>>>> On Mon, May 14, 2018 at 1:49 AM Markus Sitzmann 
>>>>>  wrote:
>>>>> Hello,
>>>>> 
>>>>> I have released Chembience 0.2.0: it includes an update to RDKit 2018.03 
>>>>> and also provides Jupyter as new base App container type.
>>>>> 
>>>>> https://github.com/chembience/chembience
>>>>> 
>>>>> (so, assuming you have Docker and docker-compose installed on your 
>>>>> computer, you are a few, easy commands away from your personal Jupyter 
>>>>> notebook server with all RDKit 2018.03 goodness readily available).
>>>>> 
>>>>> Best,
>>>>> Markus
>>>>> 
>>>>> 
>>>>>> On Tue, Apr 24, 2018 at 10:44 AM Markus Sitzmann 
>>>>>>  wrote:
>>>>>> Hello,
>>>>>> 
>>>>>> since it includes RDKit as one of its major components I am happy to 
>>>>>> announce the first release of my new open-source project Chembience:
>>>>>> 
>>>>>> A Docker-based, cloudable platform for the development of 
>>>>>> chemoinformatics-centric web applications and microservices. 
>>>>>> 
>>>>>> https://github.com/chembience/chembience
>>>>>> 
>>>>>> (unfortunately it is still on RDKit 2017.09_3, I failed releasing it 
>>>>>> before 2018.03 :-) ).
>>>>>> 
>>>>>> Best,
>>>>>> Markus


Re: [Rdkit-discuss] Chembience

2018-10-24 Thread Markus Sitzmann
Hello,

I have released *Chembience 0.2.5*: it updates the Docker images of the
Django and the Jupyter notebook app in Chembience + the Postgres extension
of the Chembience database image to *RDKit 2018.09* (and it went really
smooth :-) )

https://twitter.com/markussitzmann/status/1055047319660490753

https://github.com/chembience/chembience/releases

https://www.chembience.com

Best,
Markus

On Fri, Aug 31, 2018 at 11:20 PM Markus Sitzmann 
wrote:

> Hello,
>
> I have put together another Chembience release (0.2.3): update of RDKit to
> version 2018.03.4, Postgres to version 10.5, and Django to 2.1
>
> https://github.com/chembience/chembience
>
> https://twitter.com/markussitzmann/status/1035629283736264704
>
> Best,
> Markus
>
> On Sun, Jun 10, 2018 at 4:41 PM Markus Sitzmann 
> wrote:
>
>> Hello,
>>
>> I have just released Chembience 0.2.1: it updates RDKit to version
>> 2018.03.2 and switches Postgres from the 9.x series to version 10.4
>>
>> https://github.com/chembience/chembience
>>
>> Best,
>> Markus
>>
>>
>> On Mon, May 14, 2018 at 1:49 AM Markus Sitzmann <
>> markus.sitzm...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I have released Chembience 0.2.0: it includes an update to RDKit 2018.03
>>> and also provides Jupyter as new base App container type.
>>>
>>> https://github.com/chembience/chembience
>>>
>>> (so, assuming you have Docker and docker-compose installed on your
>>> computer, you are a few, easy commands away from your personal Jupyter
>>> notebook server with all RDKit 2018.03 goodness readily available).
>>>
>>> Best,
>>> Markus
>>>
>>>
>>> On Tue, Apr 24, 2018 at 10:44 AM Markus Sitzmann <
>>> markus.sitzm...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> since it includes RDKit as one of its major components I am happy to
>>>> announce the first release of my new open-source project Chembience:
>>>>
>>>> A Docker-based, cloudable platform for the development of
>>>> chemoinformatics-centric web applications and microservices.
>>>>
>>>> https://github.com/chembience/chembience
>>>>
>>>> (unfortunately it is still on RDKit 2017.09_3, I failed releasing it
>>>> before 2018.03 :-) ).
>>>>
>>>> Best,
>>>> Markus
>>>>
>>>


Re: [Rdkit-discuss] [Question] Ok to switch to conda-forge for RDKit builds?

2018-10-18 Thread Markus Sitzmann
I am happy with conda forge :-). And thanks for the great work.

Markus

On Thu, Oct 18, 2018 at 5:46 PM Greg Landrum  wrote:

> Um, guys, there are some interesting side conversations starting here, but
> can we please keep this thread on the "is it ok for me to stop doing builds
> on the RDKit channel" question?
> This is important to me (and possibly the community), so I'd like to keep
> the discussion as simple and uncluttered as possible.
>
> Thanks,
> -greg
>
>
> On Thu, Oct 18, 2018 at 5:26 PM Markus Sitzmann 
> wrote:
>
>> Hmm, isn't that a problem with any build/dependency automation tool, and
>> hard to fix in a generic way? If you really depend on a specific version of
>> a piece of software, you have to be very careful with the environment it
>> sits in, and do "carefree" updates only in a carefree environment :-)
>> (environment management has become a lot easier in recent years).
>>
>> Markus
>>
>> On Thu, Oct 18, 2018 at 4:43 PM Greg Landrum 
>> wrote:
>>
>>>
>>>
>>> On Thu, Oct 18, 2018 at 2:21 PM Eric Jonas  wrote:
>>>
>>>> Greg, I'm all for anything that makes the release process on developers
>>>> easier; my main question is : With conda-forge, how hard is it to install
>>>> just _one_ package without having everything else (say numpy, pandas, etc)
>>>> upgraded to the latest conda-forge version? I've had situations in the past
>>>> where i'm like "oh I'd just like the latest ___" and suddenly everything in
>>>> my conda env has been upgraded to the bleeding edge.
>>>>
>>>
>>> That's a great question, and it's one I don't really know the answer to.
>>>
>>> On my PC (I'm on the train, and this is what I have with me), here's
>>> what I did:
>>> - create a new conda environment that includes an rdkit-channel RDKit
>>> install
>>> - uninstall the RDKit from that
>>> - install the RDKit from the conda-forge channel
>>>
>>> Here's what ends up getting changed:
>>>
>>> ## Package Plan ##
>>>
>>>   environment location: C:\Users\glandrum\Anaconda3\envs\py36_tmp
>>>
>>>   added / updated specs:
>>> - rdkit
>>>
>>>
>>> The following NEW packages will be INSTALLED:
>>>
>>>     boost:     1.67.0-py36_vc14_0          conda-forge [vc14]
>>>     boost-cpp: 1.67.0-vc14_0               conda-forge [vc14]
>>>     pycairo:   1.16.3-py36_vc14_0          conda-forge [vc14]
>>>     rdkit:     2018.03.4-py36h857267b_1000 conda-forge
>>>
>>> The following packages will be UPDATED:
>>>
>>>     certifi:   2018.10.15-py36_0    --> 2018.10.15-py36_1000 conda-forge
>>>     jpeg:      9b-hb83a4c4_2        --> 9b-vc14_2            conda-forge [vc14]
>>>     tk:        8.6.8-hfa6e2cd_0     --> 8.6.8-vc14_0         conda-forge [vc14]
>>>
>>> The following packages will be DOWNGRADED:
>>>
>>>     icu:       58.2-ha66f8fd_1      --> 58.2-vc14_0          conda-forge [vc14]
>>>     libpng:    1.6.35-h2a8f88b_0    --> 1.6.34-vc14_0        conda-forge [vc14]
>>>     libtiff:   4.0.9-h36446d0_2     --> 4.0.9-vc14_0         conda-forge [vc14]
>>>     pillow:    5.3.0-py36hdc69c19_0 --> 5.2.0-py36h08d_0
>>>     pixman:    0.34.0-hcef7cb0_3    --> 0.34.0-vc14_2        conda-forge [vc14]
>>>     vc:        14.1-h0510ff6_4      --> 14-0                 conda-forge
>>>     zlib:      1.2.11-h8395fce_2    --> 1.2.11-vc14_0        conda-forge [vc14]
>>>
>>> That's a fair amount of change, but is less than what I thought might
>>> happen (I was worried about numpy+pandas+... being updated).
>>> So that's one data point. What's your take?
>>>
>>>
>>> I will try the same thing on my Mac and Linux boxes tomorrow if no one
>>> else has done it by then.
>>>
>>> -greg
>>>
>>>
>>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] [Question] Ok to switch to conda-forge for RDKit builds?

2018-10-18 Thread Markus Sitzmann
Hmm, isn't that a problem with any build/dependency automation tool, and
hard to fix in a generic way? If you really depend on a specific version of
a piece of software, you have to be very careful with the environment it
sits in, and do "carefree" updates only in a carefree environment :-)
(environment management has become a lot easier in recent years).

Markus

On Thu, Oct 18, 2018 at 4:43 PM Greg Landrum  wrote:

>
>
> On Thu, Oct 18, 2018 at 2:21 PM Eric Jonas  wrote:
>
>> Greg, I'm all for anything that makes the release process on developers
>> easier; my main question is : With conda-forge, how hard is it to install
>> just _one_ package without having everything else (say numpy, pandas, etc)
>> upgraded to the latest conda-forge version? I've had situations in the past
>> where i'm like "oh I'd just like the latest ___" and suddenly everything in
>> my conda env has been upgraded to the bleeding edge.
>>
>
> That's a great question, and it's one I don't really know the answer to.
>
> On my PC (I'm on the train, and this is what I have with me), here's what
> I did:
> - create a new conda environment that includes an rdkit-channel RDKit
> install
> - uninstall the RDKit from that
> - install the RDKit from the conda-forge channel
>
> Here's what ends up getting changed:
>
> ## Package Plan ##
>
>   environment location: C:\Users\glandrum\Anaconda3\envs\py36_tmp
>
>   added / updated specs:
> - rdkit
>
>
> The following NEW packages will be INSTALLED:
>
>     boost:     1.67.0-py36_vc14_0          conda-forge [vc14]
>     boost-cpp: 1.67.0-vc14_0               conda-forge [vc14]
>     pycairo:   1.16.3-py36_vc14_0          conda-forge [vc14]
>     rdkit:     2018.03.4-py36h857267b_1000 conda-forge
>
> The following packages will be UPDATED:
>
>     certifi:   2018.10.15-py36_0    --> 2018.10.15-py36_1000 conda-forge
>     jpeg:      9b-hb83a4c4_2        --> 9b-vc14_2            conda-forge [vc14]
>     tk:        8.6.8-hfa6e2cd_0     --> 8.6.8-vc14_0         conda-forge [vc14]
>
> The following packages will be DOWNGRADED:
>
>     icu:       58.2-ha66f8fd_1      --> 58.2-vc14_0          conda-forge [vc14]
>     libpng:    1.6.35-h2a8f88b_0    --> 1.6.34-vc14_0        conda-forge [vc14]
>     libtiff:   4.0.9-h36446d0_2     --> 4.0.9-vc14_0         conda-forge [vc14]
>     pillow:    5.3.0-py36hdc69c19_0 --> 5.2.0-py36h08d_0
>     pixman:    0.34.0-hcef7cb0_3    --> 0.34.0-vc14_2        conda-forge [vc14]
>     vc:        14.1-h0510ff6_4      --> 14-0                 conda-forge
>     zlib:      1.2.11-h8395fce_2    --> 1.2.11-vc14_0        conda-forge [vc14]
>
>
> That's a fair amount of change, but is less than what I thought might
> happen (I was worried about numpy+pandas+... being updated).
> So that's one data point. What's your take?
>
>
> I will try the same thing on my Mac and Linux boxes tomorrow if no one
> else has done it by then.
>
> -greg
>
>


Re: [Rdkit-discuss] Chembience

2018-08-31 Thread Markus Sitzmann
Hello,

I have put together another Chembience release (0.2.3): update of RDKit to
version 2018.03.4, Postgres to version 10.5, and Django to 2.1

https://github.com/chembience/chembience

https://twitter.com/markussitzmann/status/1035629283736264704

Best,
Markus

On Sun, Jun 10, 2018 at 4:41 PM Markus Sitzmann 
wrote:

> Hello,
>
> I have just released Chembience 0.2.1: it updates RDKit to version
> 2018.03.2 and switches Postgres from the 9.x series to version 10.4
>
> https://github.com/chembience/chembience
>
> Best,
> Markus
>
>
> On Mon, May 14, 2018 at 1:49 AM Markus Sitzmann 
> wrote:
>
>> Hello,
>>
>> I have released Chembience 0.2.0: it includes an update to RDKit 2018.03
>> and also provides Jupyter as new base App container type.
>>
>> https://github.com/chembience/chembience
>>
>> (so, assuming you have Docker and docker-compose installed on your
>> computer, you are a few, easy commands away from your personal Jupyter
>> notebook server with all RDKit 2018.03 goodness readily available).
>>
>> Best,
>> Markus
>>
>>
>> On Tue, Apr 24, 2018 at 10:44 AM Markus Sitzmann <
>> markus.sitzm...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> since it includes RDKit as one of its major components I am happy to
>>> announce the first release of my new open-source project Chembience:
>>>
>>> A Docker-based, cloudable platform for the development of
>>> chemoinformatics-centric web applications and microservices.
>>>
>>> https://github.com/chembience/chembience
>>>
>>> (unfortunately it is still on RDKit 2017.09_3, I failed releasing it
>>> before 2018.03 :-) ).
>>>
>>> Best,
>>> Markus
>>>
>>


Re: [Rdkit-discuss] enumeration of smiles question

2018-08-06 Thread Markus Sitzmann
O tempora, o mores. Didn't we try for ages to make our SMILES canonical, and
now, all of a sudden, the opposite is hip :-)
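
For anyone who does want the non-canonical forms: a fixed number of draws (50 in the quoted code) can miss rare outputs, so one option is to keep drawing until nothing new has appeared for a while. This is only a minimal sketch, assuming any zero-argument generator; `collect_unique`, `patience`, and `max_draws` are names made up for illustration, and the stand-in generator simply picks from the six strings in Greg's Out[21] so that the block runs without RDKit. In practice you would pass something like `lambda: allsmiles('C1CC1CC')`.

```python
import random


def collect_unique(generate, patience=200, max_draws=100_000):
    """Draw from generate() until `patience` consecutive draws add nothing new."""
    seen = set()
    misses = 0
    draws = 0
    while misses < patience and draws < max_draws:
        item = generate()
        draws += 1
        if item in seen:
            misses += 1
        else:
            seen.add(item)
            misses = 0  # reset the counter whenever something new shows up
    return seen


# Stand-in for the RDKit-based allsmiles(): the six strings from Out[21].
POOL = ['C(C)C1CC1', 'C(C1CC1)C', 'C1(CC)CC1', 'C1C(CC)C1', 'C1CC1CC', 'CCC1CC1']

rng = random.Random(0)  # seeded so the demonstration is repeatable
found = collect_unique(lambda: rng.choice(POOL))
print(sorted(found))
```

Saturation is only a heuristic: it tells you that nothing new turned up recently, not that the enumeration is complete.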

On Mon, Aug 6, 2018 at 1:38 PM Chris Earnshaw  wrote:

> Hi
>
> The question 'what do you mean by ALL?' springs to mind. None of the
> discussion includes dot-disconnected SMILES, which are also perfectly valid
> representations. For example C(C1C2)C.C12 is yet another SMILES (of many
> possible) for the example structure.
>
> I've no idea whether this is of any relevance to you, but you should
> probably consider these representations and decide whether they are
> important or not.
>
> Best regards,
> Chris
>
> On 6 August 2018 at 11:27, Jan Halborg Jensen  wrote:
>
>> This blogpost links to two other ones that may have done that (haven’t
>> read them carefully):
>> https://baoilleach.blogspot.com/2018/06/cheminformatics-for-deep-learners.html
>>
>> Best regards, Jan
>>
>> On 06 Aug 2018, at 11:57, Guillaume GODIN 
>> wrote:
>>
>> Dear Greg,
>>
>> Fantastic, thank you for giving both an explanation and a solution to this
>> “simple question”. I know it is not so simple, and it is fundamental for
>> data augmentation in deep learning.
>>
>> If I may, I have another related question: do you know if someone has
>> worked on a generator of all unique SMILES independently of RDKit?
>>
>> Thanks again,
>>
>> Guillaume
>>
>> From: Greg Landrum
>> Date: Monday, 6 August 2018 at 11:40
>> To: Guillaume GODIN
>> Cc: RDKit Discuss
>> Subject: Re: [Rdkit-discuss] enumeration of smiles question
>>
>>
>> On Thu, Aug 2, 2018 at 8:59 AM Guillaume GODIN <
>> guillaume.go...@firmenich.com> wrote:
>>
>>
>> I have a simple question about generating all possible smiles of a given
>> molecule:
>>
>>
>> It's a simple question, but the answer is somewhat complicated. :-)
>>
>>
>>
>> RDKit provides only 4 different SMILES for my molecule “CCC1CC1“:
>> C1C(CC)C1
>> CCC1CC1
>> C1(CC)CC1
>> C(C)C1CC1
>>
>> While by hand we can write those 7 smiles:
>> CCC1CC1
>> C(C)C1CC1
>> C(C1CC1)C
>> C1CC(CC)1
>> C1C(CC)C1
>> C1CC1CC
>> C(CC)1CC1
>>
>> I use this function for the enumeration:
>>
>> import numpy as np
>> from rdkit import Chem
>>
>> def allsmiles(smil):
>>     # Construct a molecule from a SMILES string.
>>     m = Chem.MolFromSmiles(smil)
>>     if m is None:
>>         return smil
>>     N = m.GetNumAtoms()
>>     if N == 0:
>>         return smil
>>     try:
>>         # Pick a random start atom and write a non-canonical SMILES from it.
>>         n = int(np.random.randint(0, high=N))
>>         t = Chem.MolToSmiles(m, rootedAtAtom=n, canonical=False)
>>     except Exception:
>>         return smil
>>     return t
>>
>> n = 50
>> SMILES = ["CCC1CC1"]
>> SMILES_mult = [allsmiles(S) for S in SMILES for i in range(n)]
>>
>> Why can't we generate all 7 SMILES?
>>
>>
>> The RDKit has rules that it uses to decide which atom to branch to when
>> generating a SMILES. These are used regardless of whether you are
>> generating canonical SMILES or not.
>> The upshot of this is that it will never generate a SMILES where there's
>> a branch before a ring closure.
>> The other important factor here is that atom rank is determined by the
>> index of the atom in the molecule when you aren't using canonicalization.
>> So changing the atom order on input can help:
>>
>> In [12]: set(allsmiles('CCC1CC1') for i in range(50))
>> Out[12]: {'C(C)C1CC1', 'C1(CC)CC1', 'C1C(CC)C1', 'CCC1CC1'}
>>
>> In [13]: set(allsmiles('C1CC1CC') for i in range(50))
>> Out[13]: {'C(C1CC1)C', 'C1(CC)CC1', 'C1CC1CC', 'CCC1CC1'}
>>
>> You can do this all at once as follows:
>>
>> ```
>> In [20]: def allsmiles(smil):
>>     ...:     # Construct a molecule from a SMILES string.
>>     ...:     m = Chem.MolFromSmiles(smil)
>>     ...:     if m is None:
>>     ...:         return smil
>>     ...:     N = m.GetNumAtoms()
>>     ...:     if N==0:
>>     ...:         return smil
>>     ...:     aids = list(range(N))
>>     ...:     random.shuffle(aids)
>>     ...:     m = Chem.RenumberAtoms(m,aids)
>>     ...:     try:
>>     ...:         n = random.randint(0,N-1)
>>     ...:         t = Chem.MolToSmiles(m, rootedAtAtom=n, canonical=False)
>>     ...:     except :
>>     ...:         return smil
>>     ...:     return t
>>     ...:
>>
>> In [21]: set(allsmiles('C1CC1CC') for i in range(50))
>> Out[21]: {'C(C)C1CC1', 'C(C1CC1)C', 'C1(CC)CC1', 'C1C(CC)C1', 'C1CC1CC',
>> 'CCC1CC1'}
>> ```
>> Note that I switched to using python's built in random module instead of
>> using the one in numpy.
>>
>> -greg
>>
>>
>>
>>
>>
>> Thanks guys,
>>
>> Best regards,
>>
>> Guillaume
>>

Re: [Rdkit-discuss] MolFromInchi with Amides

2018-06-14 Thread Markus Sitzmann
Hi Jeff,

That is because InChI is a structure identifier, not a structure
representation. The difference is that a structure identifier normalizes the
structure to a form it regards as the standard representation of the molecule,
so that the molecule can be identified (and the same identifier calculated)
regardless of the state in which it arrives from an input resource.

For Standard InChI, the decision was made to make it insensitive to tautomers
(within the limitations of the InChI algorithm). Somewhat unluckily, this
normalizes most amides to a form that chemists regard as the incorrect one. The
second unlucky thing is that you can convert the InChI back to a structure
representation, which is then of course the normalized or standardized form of
the molecule.

So if you want to be sure to keep the original representation of a molecule,
don’t use InChI as your representation format (calculate InChI as an identifier
field next to it). If your input resource only provides InChI or Standard InChI,
then you are of course out of luck.
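
To make the "identifier field next to it" idea concrete, here is a minimal sketch, not a recommendation of any particular schema. `StructureRecord` and `index_by_identifier` are names invented for illustration, and the InChI string is simply copied from the question rather than computed; with RDKit you would typically fill that field with `Chem.MolToInchi`.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StructureRecord:
    original: str    # representation exactly as it was supplied (here: SMILES)
    identifier: str  # Standard InChI, computed elsewhere and stored alongside


def index_by_identifier(records):
    """Group records by identifier without touching the original representations."""
    index = defaultdict(list)
    for record in records:
        index[record.identifier].append(record.original)
    return dict(index)


# Both tautomers from the question share one Standard InChI (precomputed string).
ACETAMIDE_INCHI = 'InChI=1S/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)'
records = [
    StructureRecord('CC(=O)N', ACETAMIDE_INCHI),  # amide, as originally drawn
    StructureRecord('CC(=N)O', ACETAMIDE_INCHI),  # iminol, as read back from InChI
]
index = index_by_identifier(records)
print(index[ACETAMIDE_INCHI])
```

The identifier answers "is this the same compound?", while the stored original answers "how was it drawn?"; keeping both means neither question has to be answered by round-tripping through InChI.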

Best,
Markus

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 14. Jun 2018, at 23:33, Jeff van Santen  wrote:
> 
> Hi all,
> 
> 
> I have some questions about how RDKit handles amides. For context, I am
> working with a large set of molecules, many of which contain peptides. I have
> been running into a problem with RDKit: when I try to load a molecule from
> its InChI, the wrong tautomer is loaded. As a simple example, consider
> acetamide:
> 
> 
> """
> 
> from rdkit import Chem
> from rdkit.Chem import rdMolDescriptors
>
> FromInchi = Chem.MolFromInchi('InChI=1S/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)')
> print(rdMolDescriptors.CalcNumAmideBonds(FromInchi))
> > 0
> print(Chem.MolToSmiles(FromInchi))
> > CC(=N)O
>
> FromSmiles = Chem.MolFromSmiles('CC(=O)N')
> print(rdMolDescriptors.CalcNumAmideBonds(FromSmiles))
> > 1
> print(Chem.MolToSmiles(FromSmiles))
> > CC(N)=O
> 
> """
> 
> 
> I realize that Standard InChI does not have a mechanism for distinguishing
> between the two tautomers, so I am wondering why RDKit considers the iminol
> to be a better representation? Also, is there any way to get the amide
> instead? (Without using MolVS)
> 
> 
> Thanks,
> 
> Jeff
> 
> 


Re: [Rdkit-discuss] Chembience

2018-06-10 Thread Markus Sitzmann
Hello,

I have just released Chembience 0.2.1: it updates RDKit to version
2018.03.2 and switches Postgres from the 9.x series to version 10.4

https://github.com/chembience/chembience

Best,
Markus


On Mon, May 14, 2018 at 1:49 AM Markus Sitzmann 
wrote:

> Hello,
>
> I have released Chembience 0.2.0: it includes an update to RDKit 2018.03
> and also provides Jupyter as new base App container type.
>
> https://github.com/chembience/chembience
>
> (so, assuming you have Docker and docker-compose installed on your
> computer, you are a few, easy commands away from your personal Jupyter
> notebook server with all RDKit 2018.03 goodness readily available).
>
> Best,
> Markus
>
>
> On Tue, Apr 24, 2018 at 10:44 AM Markus Sitzmann <
> markus.sitzm...@gmail.com> wrote:
>
>> Hello,
>>
>> since it includes RDKit as one of its major components I am happy to
>> announce the first release of my new open-source project Chembience:
>>
>> A Docker-based, cloudable platform for the development of
>> chemoinformatics-centric web applications and microservices.
>>
>> https://github.com/chembience/chembience
>>
>> (unfortunately it is still on RDKit 2017.09_3, I failed releasing it
>> before 2018.03 :-) ).
>>
>> Best,
>> Markus
>>
>


Re: [Rdkit-discuss] RDKit postgres cartridge building

2018-05-24 Thread Markus Sitzmann
Hi Alfredo,

My first guess would be that you have another, older Postgres version on your
computer and you have built against that version. Take a look at the
/usr/share/postgresql directory and check whether there is another directory
other than 10/
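
If you prefer to script that check, here is a rough sketch. It assumes the layout shown in the error message (`<root>/<version>/extension/rdkit.control`); `find_rdkit_control` is a made-up helper name, and the `9.5` directory in the demonstration is purely hypothetical. The demonstration runs on a throwaway tree so that it works on any machine; on the real system you would call the function with its default root.

```python
import tempfile
from pathlib import Path


def find_rdkit_control(root="/usr/share/postgresql"):
    """Map each version directory under root to whether it holds rdkit.control."""
    root = Path(root)
    if not root.is_dir():
        return {}
    return {
        version_dir.name: (version_dir / "extension" / "rdkit.control").is_file()
        for version_dir in sorted(root.iterdir())
        if version_dir.is_dir()
    }


# Demonstration on a throwaway tree that mimics the suspected situation:
# the cartridge landed in 9.5/ while the running server looks in 10/.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "9.5" / "extension").mkdir(parents=True)
    (Path(tmp) / "9.5" / "extension" / "rdkit.control").touch()
    (Path(tmp) / "10" / "extension").mkdir(parents=True)
    report = find_rdkit_control(tmp)

print(report)
```

A version directory that maps to True while 10 maps to False would support the guess that the build picked up the other installation.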

Markus

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 24. May 2018, at 18:24, Alfredo Quevedo  wrote:
> 
> Good morning,
> 
> I am trying to build RDKit from source, and succeeded in doing so by
> following the instructions provided in the documentation. However, I am also
> trying to use the postgres cartridge, which as far as I understand is built
> during the main build process.
> 
> but after trying to create the extension for a database with:
> 
> psql -c  'create extension rdkit'  molecules
> 
> I am getting the following error
> 
> ERROR:  could not open extension control file 
> "/usr/share/postgresql/10/extension/rdkit.control": No such file or directory
> 
> It seems that the cartridge build is not being applied to my local
> postgres installation?
>
> Any hint is highly appreciated,
> 
> thanks in advance
> 
> 
> 


Re: [Rdkit-discuss] convert a smiles file to a xyz file

2018-05-23 Thread Markus Sitzmann
In reminiscence of old times, you can do this with the Chemical Identifier
Resolver, for instance with the SMILES string for ethanol, CCO:

https://cactus.nci.nih.gov/chemical/structure/CCO/file?format=xyz
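
For more than a handful of structures the URL can be assembled in a few lines. This is a sketch only: `resolver_url` and `fetch_structure` are made-up helper names, and I am assuming the service accepts percent-encoded identifiers, which matters for SMILES containing characters such as `#` or `/`. Nothing is sent over the network unless you call `fetch_structure` yourself.

```python
from urllib.parse import quote
from urllib.request import urlopen

RESOLVER_BASE = "https://cactus.nci.nih.gov/chemical/structure"


def resolver_url(identifier, fmt="xyz"):
    """Build a resolver URL; percent-encode characters such as '#' or '/'."""
    return f"{RESOLVER_BASE}/{quote(identifier, safe='')}/file?format={fmt}"


def fetch_structure(identifier, fmt="xyz", timeout=30):
    """Download the converted structure as text (needs network access)."""
    with urlopen(resolver_url(identifier, fmt), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Only the URLs are built here; nothing is sent over the network.
ethanol_url = resolver_url("CCO")
nitrile_url = resolver_url("CC#N")
print(ethanol_url)
print(nitrile_url)
```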

On Wed, May 23, 2018 at 5:24 PM Chenyang Shi  wrote:

> Hi Everyone,
>
> I am seeking help with converting a SMILES file to a series of
> coordinates for the molecule, in xyz format.
> I saw some online service that can do the job (e.g.
> http://www.cheminfo.org/Chemistry/Cheminformatics/FormatConverter/index.html),
> but it is not convenient to use.
>
> I am wondering how we can do this by writing RDKit code. A separate
> question: is the molecular structure converted from SMILES the same
> as the one taken from a crystal structure?
>
> Many thanks!
> Chenyang
>


Re: [Rdkit-discuss] Chembience

2018-05-13 Thread Markus Sitzmann
Hello,

I have released Chembience 0.2.0: it includes an update to RDKit 2018.03
and also provides Jupyter as new base App container type.

https://github.com/chembience/chembience

(so, assuming you have Docker and docker-compose installed on your
computer, you are a few, easy commands away from your personal Jupyter
notebook server with all RDKit 2018.03 goodness readily available).

Best,
Markus


On Tue, Apr 24, 2018 at 10:44 AM Markus Sitzmann 
wrote:

> Hello,
>
> since it includes RDKit as one of its major components I am happy to
> announce the first release of my new open-source project Chembience:
>
> A Docker-based, cloudable platform for the development of
> chemoinformatics-centric web applications and microservices.
>
> https://github.com/chembience/chembience
>
> (unfortunately it is still on RDKit 2017.09_3, I failed releasing it
> before 2018.03 :-) ).
>
> Best,
> Markus
>


Re: [Rdkit-discuss] RDKit-MolVS intergration: Google Summer of Code Project

2018-04-25 Thread Markus Sitzmann
Yes, great news. Matt has really started some very nice work there. I hope it
can be turned into something like a well-documented, open standard for molecule
standardization.

Markus

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 26. Apr 2018, at 00:29, Paul Czodrowski  
> wrote:
> 
> Susan, great news, looking forward to this project, enjoy GSoC!  Paul
>  
> From: Susan Leung [mailto:susan.le...@st-hildas.ox.ac.uk]
> Sent: Wednesday, 25 April 2018 23:35
> To: rdkit-discuss@lists.sourceforge.net
> Subject: [Rdkit-discuss] RDKit-MolVS intergration: Google Summer of Code
> Project
>  
> Hi all, 
> 
> I am really excited and happy to let you know that I will be working with 
> Greg on a RDKit-MolVS integration project as part of the Open Chemistry 
> Google Summer of Code. 
> 
> I have followed and used the RDKit mailing list since the start of my PhD and 
> have used both RDKit and MolVS in my workflow so I'm very excited to have the 
> opportunity to contribute to the code base.
> 
> In this project we aim to expand the current capabilities of MolVS and 
> integrate it into RDKit so hopefully by the end of it, you will see 
> improvements in the molecular standardisation tools available in RDKit. 
> 
> Best wishes,
> 
> Susan
>  


[Rdkit-discuss] Chembience

2018-04-24 Thread Markus Sitzmann
Hello,

since it includes RDKit as one of its major components I am happy to
announce the first release of my new open-source project Chembience:

A Docker-based, cloudable platform for the development of
chemoinformatics-centric web applications and microservices.

https://github.com/chembience/chembience

(unfortunately it is still on RDKit 2017.09_3, I failed releasing it before
2018.03 :-) ).

Best,
Markus


Re: [Rdkit-discuss] Some larger-scale RDKit C++ code changes

2018-04-05 Thread Markus Sitzmann
Yes, looks good :-). And the good thing with git is that, if you are very
uncertain about the outcome, you can always make a test run by copying the
whole directory, testing everything with the copy, and if it goes horribly
wrong, just deleting the copy.

Markus



On Thu, Apr 5, 2018 at 8:46 AM, Greg Landrum  wrote:

> Thanks for raising this Markus. It had been on my list of things to look
> into for a while and I had been kind of dreading it.[1]
>
> I did a bit of googling and experimentation and it looks like this
> approach works well:
> https://stackoverflow.com/questions/5956300/merging-two-very-divergent-branches-using-git
> Given that it also (at least to me) makes sense, I think that this is how
> I'll proceed.
>
> -greg
> [1] this is where I usually point to this xkcd: https://xkcd.com/1597/
> and make a joke about no longer being able to just walk over and ask Nadine
> how to solve the problem. :-)
>
> On Wed, Apr 4, 2018 at 1:20 PM, Markus Sitzmann wrote:
>
>> Have you tried a merge (after branching master to something like
>> master-test-merge and then merging modern_cxx)? How horrible does it look?
>> It might be quite okay. Or do you really have a lot of changes in the
>> current master that you don't have (or don't want to have) in modern_cxx
>> and the future master? And well, my concern was just that avoiding "early"
>> horrors might cause bigger horrors later :-). Renaming the master in a Git
>> repository is something I wouldn't do lightly - I would regard it more as
>> a very, very last resort, because if master is renamed (or replaced by
>> another branch), any branch in any remote repository by anybody who ever
>> branched from master (including the RDKit GitHub repository) potentially
>> (very likely) becomes invalid. Only if this is a small concern would I do
>> it (and I doubt it is in the case of RDKit).
>>
>> Markus
>>
>> On Wed, Apr 4, 2018 at 11:56 AM, Greg Landrum 
>> wrote:
>>
>>>
>>>
>>> On Wed, Apr 4, 2018 at 11:27 AM, Markus Sitzmann <
>>> markus.sitzm...@gmail.com> wrote:
>>>
>>>> Hi Greg,
>>>>
>>>> >  Concretely what this means in github is that the current master
>>>> branch will be renamed to legacy and the modern_cxx branch will be renamed
>>>> to master.
>>>>
>>>> I hope you are not actually just renaming it - although I am not
>>>> affected personally, that might be a call for trouble because it
>>>> invalidates any remote repository of rdkit.
>>>>
>>>
>>> If you have suggestions for how to do a large-delta change like that in
>>> a non-horrible manner, I would love to hear them :-)
>>>
>>> -greg
>>>
>>>
>>
>>
>


Re: [Rdkit-discuss] Some larger-scale RDKit C++ code changes

2018-04-04 Thread Markus Sitzmann
Have you tried a merge (after branching master to something like
master-test-merge and then merging modern_cxx)? How horrible does it look?
It might be quite okay. Or do you really have a lot of changes in the
current master that you don't have (or don't want to have) in modern_cxx
and the future master? And well, my concern was just that avoiding "early"
horrors might cause bigger horrors later :-). Renaming the master in a Git
repository is something I wouldn't do lightly - I would regard it more as
a very, very last resort, because if master is renamed (or replaced by
another branch), any branch in any remote repository by anybody who ever
branched from master (including the RDKit GitHub repository) potentially
(very likely) becomes invalid. Only if this is a small concern would I do
it (and I doubt it is in the case of RDKit).
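
The trial merge I have in mind could be scripted roughly as follows. Treat it as a sketch under assumptions: the branch names are the ones from this thread, `trial_merge_commands` and `run_commands` are invented helper names, and by default the commands are only printed, not executed.

```python
import subprocess


def trial_merge_commands(base="master", incoming="modern_cxx",
                         scratch="master-test-merge"):
    """Return the git commands for a trial merge on a scratch branch."""
    return [
        ["git", "checkout", "-b", scratch, base],
        # --no-commit leaves the result staged for inspection; --no-ff forces
        # a real merge even if a fast-forward would be possible.
        ["git", "merge", "--no-commit", "--no-ff", incoming],
    ]


def run_commands(commands, repo_dir, dry_run=True):
    """Print the commands, or run them in repo_dir and return their exit codes."""
    exit_codes = []
    for command in commands:
        if dry_run:
            print(" ".join(command))
            continue
        exit_codes.append(subprocess.run(command, cwd=repo_dir).returncode)
        if exit_codes[-1] != 0:
            break  # do not merge if the scratch branch could not be created
    return exit_codes


commands = trial_merge_commands()
run_commands(commands, repo_dir=".", dry_run=True)
```

If the result looks bad, `git merge --abort` followed by deleting the scratch branch puts everything back where it was; master itself is never touched.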

Markus

On Wed, Apr 4, 2018 at 11:56 AM, Greg Landrum 
wrote:

>
>
> On Wed, Apr 4, 2018 at 11:27 AM, Markus Sitzmann <
> markus.sitzm...@gmail.com> wrote:
>
>> Hi Greg,
>>
>> >  Concretely what this means in github is that the current master
>> branch will be renamed to legacy and the modern_cxx branch will be renamed
>> to master.
>>
>> I hope you are not actually just renaming it - although I am not affected
>> personally, that might be a call for trouble because it invalidates any
>> remote repository of rdkit.
>>
>
> If you have suggestions for how to do a large-delta change like that in a
> non-horrible manner, I would love to hear them :-)
>
> -greg
>
>


Re: [Rdkit-discuss] Some larger-scale RDKit C++ code changes

2018-04-04 Thread Markus Sitzmann
Hi Greg,

>  Concretely what this means in github is that the current master branch
will be renamed to legacy and the modern_cxx branch will be renamed to
master.

I hope you are not actually just renaming it - although I am not affected
personally, that might be asking for trouble because it invalidates any
remote repository of rdkit.

Markus



On Wed, Apr 4, 2018 at 5:23 AM, Greg Landrum  wrote:

>
> NOTE: If you don't work with the RDKit at the C++ level or build the code
> yourself from source, you probably don't need to read this email.
>
> TL;DR: When we do the beta for the 2018.03.1 release we're going to switch
> the C++ backend to use modern C++ (=C++11). For people who can't switch to
> use that code, we will continue to provide bug fixes for the 2017.09
> release for at least another 6 months.
>
> --
> # What's happening?
>
> As part of the upcoming 2018.03 release, we will start using modern C++
> for the RDKit - this means C++11 at the moment, the goal is that you should
> be able to build the code with g++ v4.8. I've been talking about this for a
> while, blogged about it
> (https://medium.com/@greg.landrum_t5/the-rdkit-and-modern-c-48206b966218),
> and posted to the rdkit-devel list
> (https://sourceforge.net/p/rdkit/mailman/message/35811216/), now it's
> finally happening.
>
> Concretely what this means in github is that the current master branch
> will be renamed to legacy and the modern_cxx branch will be renamed to
> master.
>
> # Who does this affect?
>
> This should only affect people who need to build the RDKit C++ code
> themselves. If you use a binary version of the RDKit like the ones
> available inside of Anaconda Python or KNIME, this change should have no
> impact upon you.
>
> # What about people who can't use up-to-date compilers?
>
> We realize that some people on older operating systems will not be able to
> switch to start using a compiler that supports C++11. In order to continue
> to support this subset of developers, we will continue to apply bug fixes
> to the current Release_2017_09 branch and do occasional patch releases.
> Since this is intended for people who need to build the code themselves
> anyway, we won't do builds of these releases any more.
>
> We will keep doing these patch releases at least until the 2018.09 release.
> Whether or not we continue past that date will depend on demand, so if you
> are using these releases please let us know.
>
> # Why are you doing this?
>
> There's a long, rambling answer to this, but I'm not going to give it
> here. :-)
> The simplest explanation is that we think that the core of the RDKit
> should be using a modern and (reasonably) up-to-date version of the
> language that it's written in. The developer experience is better and,
> happily, the code ends up being faster.
>
>
>
>
> 


Re: [Rdkit-discuss] conda build instructions for OSX?

2018-01-02 Thread Markus Sitzmann
I am not 100% sure what your motivation is for building a certain rev or the
master branch, but let me guess: you want to be independent of further
changes in the development branch.

Well, my suggestion for this: fork the conda-rdkit repo on GitHub and use
your fork for future builds, i.e. the repo is stable until you decide to
merge future changes from the original repo.
Having your own fork would also allow you to merge the development branch
into the master branch of your fork if this is a requirement (although I
don't see any difference between using the development or the master branch
for builds).

On Tue, Jan 2, 2018 at 4:18 PM, Brian Cole  wrote:

> Figured out by sleuthing around the conda-rdkit repo that the 'master'
> branch is really old. Looks like the 'development' branch is the branch
> that works. If you switch over to the 'development' branch then the 'conda
> build boost && conda build rdkit' works.
>
> Now the next trick I'm still stuck on is how to build RDKit's master
> branch using conda. Changing `git_rev` in rdkit/meta.yaml didn't have the
> desired effect.
>
> -Brian
>
> On Wed, Dec 27, 2017 at 5:08 PM, Brian Cole  wrote:
>
>> Trying to 'conda build rdkit' as described in the
>> https://github.com/rdkit/conda-rdkit README to no success. Are there any
>> OSX 'conda build' instructions tucked away somewhere?
>>
>> It's currently failing on the cairo dependency:
>>
>> -- Checking for one of the modules 'cairo'
>> CMake Error at /Users/coleb/anaconda2/conda-b
>> ld/rdkit_1514412408124/_h_env_placehold_placehold_placehold_
>> placehold_placehold_placehold_placehold_placehold_placehold_
>> placehold_placehold_placehold_placehold_placehold_placehold_
>> placehold_placehold_placehold_placehold_place/share/cmake-3.
>> 9/Modules/FindPkgConfig.cmake:640 (message):
>>   None of the required 'cairo' found
>> Call Stack (most recent call first):
>>   Code/cmake/Modules/FindCairo.cmake:23 (PKG_SEARCH_MODULE)
>>   Code/GraphMol/MolDraw2D/CMakeLists.txt:31 (find_package)
>>
>>
>> CMake Error at Code/cmake/Modules/FindCairo.cmake:38 (MESSAGE):
>>   Could not find Cairo
>> Call Stack (most recent call first):
>>   Code/GraphMol/MolDraw2D/CMakeLists.txt:31 (find_package)
>>
>>
>> -- Boost version: 1.56.0
>> -- Found the following Boost libraries:
>> --   regex
>> CMake Error: The following variables are used in this project, but they
>> are set to NOTFOUND.
>> Please set them or make sure they are set and tested correctly in the
>> CMake files:
>> CAIRO_INCLUDE_DIRS (ADVANCED)
>>used as include directory in directory /Users/coleb/anaconda2/conda-bld/rdkit_1514412408124/work/Code/GraphMol/MolDraw2D
>>used as include directory in directory /Users/coleb/anaconda2/conda-bld/rdkit_1514412408124/work/Code/GraphMol/MolDraw2D/Wrap
>>used as include directory in directory /Users/coleb/anaconda2/conda-b
>> ld

Re: [Rdkit-discuss] Issue with the latest RDKit DB build

2017-12-29 Thread Markus Sitzmann

I have the problem, too, on Debian stretch


-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 29. Dec 2017, at 20:01, Drew Gibson via Rdkit-discuss 
>  wrote:
> 
> Hello, and compliments of the season to you, RDKitters :)
> 
> I'm having trouble getting the conda build of the DB package 
> (rdkit-postgresql95) working.
> 
> The issue I'm having occurs when trying to initialise the rdkit DB extension 
> on a newly created DB, eg...
> 
> createdb emolecules 
> psql -c 'create extension rdkit' emolecules which will give me the 
> error...
> 
> psql: error while loading shared libraries: libncursesw.so.6: cannot open 
> shared object file: No such file or directory
> 
> I am getting the same error on both Ubuntu 16.04 LTS and in CentOS7 (latest, 
> running in VirtualBox for now).
> 
> I have successfully installed and used previous versions of the DB 
> (rdkit-postgresql), and currently have this working on Ubuntu.
> 
> Any suggestions to getting the newest version working ? 
> 
> Cheers,  Drew
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Docker with (latest) rdkit+jupyter

2017-11-21 Thread Markus Sitzmann
Hi JP,

From the Docker log you posted it is obvious that the build starts from the
latest miniconda version, which then uses Python 3.6 by default; however, one
of the Python packages (cairocffi, according to the error) still relies on
Python 3.5.

One thing you can try is to tell the conda install command in the Docker script
to go back to Python 3.5, or to create a Python 3.5 based environment.
Unfortunately I don't remember off the top of my head which option you have to
use for this, but you will find it in the conda documentation.

As much as I like the idea of conda, it is unfortunately one of the biggest
troublemakers in my personal projects.

Another point: if you look for one of Greg's recent posts here on the list,
there is another problem with the latest conda version you might run into.


Markus


-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 21. Nov 2017, at 16:53, Tim Dudgeon  wrote:
> 
> I've got some dockerfiles that might be worth a look.
> https://github.com/InformaticsMatters/docker_jupyter
> 
> Not sure if they will help.
> 
> Tim
> 
> 
>> On 21/11/2017 15:25, JP wrote:
>> Yo RDKitters,
>> 
>> I am running a CADD workshop for a group of MSc students and would like to 
>> show them some RDKit awesomeness.
>> 
>> I thought the best way to do this is to use an rdkit enabled docker image + 
>> jupyter notebooks (they are comfortable with python).
>> 
>> In preparation, I tried building the docker image from the docker file at 
>> https://github.com/rdkit/rdkit_containers/tree/master/docker/run_conda3 but 
>> this fails on Ubuntu 16.04.3 LTS with the following error:
>> 
>> $ docker build -t run_rdkit_conda 
>> https://raw.githubusercontent.com/rdkit/rdkit_containers/master/docker/run_conda3/Dockerfile
>> Downloading build context from remote url: 
>> https://raw.githubusercontent.com/rdkit/rdkit_containers/master/docker/run_conda3/Dockerfile
>>  357B
>> Sending build context to Docker daemon  2.048kB
>> Step 1/7 : FROM continuumio/miniconda3
>> latest: Pulling from continuumio/miniconda3
>> 85b1f47fba49: Pull complete 
>> 6b3cb0c49789: Pull complete 
>> fecb432dacf0: Pull complete 
>> f461f7e3890d: Pull complete 
>> Digest: 
>> sha256:604cda0c0be5d40cc26db31912d8b1b7276840a56544b846abef441b32d987fc
>> Status: Downloaded newer image for continuumio/miniconda3:latest
>>  ---> f700f7f570c7
>> Step 2/7 : MAINTAINER Greg Landrum 
>>  ---> Running in ad6a648c18ba
>>  ---> 18e6d6093d5b
>> Removing intermediate container ad6a648c18ba
>> Step 3/7 : ENV PATH /opt/conda/bin:$PATH
>>  ---> Running in e21cf8e5332f
>>  ---> ddef65292068
>> Removing intermediate container e21cf8e5332f
>> Step 4/7 : ENV LANG C
>>  ---> Running in efa12ef17f37
>>  ---> 137d7e20350d
>> Removing intermediate container efa12ef17f37
>> Step 5/7 : RUN conda config --add channels  https://conda.anaconda.org/rdkit
>>  ---> Running in 79566bf4b6e9
>>  ---> 032965875391
>> Removing intermediate container 79566bf4b6e9
>> Step 6/7 : RUN conda install -y nomkl rdkit pandas cairo cairocffi jupyter
>>  ---> Running in c5aa6417a63a
>> Fetching package metadata .
>> Solving package specifications: .
>> 
>> UnsatisfiableError: The following specifications were found to be in 
>> conflict:
>>   - cairocffi -> python 3.5* -> xz 5.0.5
>>   - python 3.6*
>> Use "conda info " to see the dependencies for each package.
>> 
>> The command '/bin/sh -c conda install -y nomkl rdkit pandas cairo cairocffi 
>> jupyter' returned a non-zero code: 1
>> 
>> Any ideas?
>> JP
>> 
>> 


Re: [Rdkit-discuss] ImportError: No module named rdkit

2017-09-15 Thread Markus Sitzmann
Well, if you have Python 2.7 and 3.5 already running, you can use
(mini)conda for the RDKit installation (conda is Anaconda, but instead of
one huge package you install just the packages you want, including RDKit)

On Fri, Sep 15, 2017 at 9:12 AM, Loris Bennett 
wrote:

> Hi Greg,
>
> Greg Landrum  writes:
>
> > I'll provide a more detailed answer in a bit, but since you aren't
> > using the system python anyway, is there any chance that you could
> > switch to anaconda python on your machines? Anaconda is a great python
> > distribution for scientific applications and it makes many things
> > (including system administration) a ton easier.
>
> Anaconda might be a possibility.  On the other hand we already have 3
> versions of Python in use: 2.6 from the OS, and 2.7 and 3.5 from the
> Software Collections.  In addition, the current cluster is nearing its
> end-of-life, probably before the end of the year and so I am somewhat
> loathe to install yet another one and add to my can of worms (or pit of
> snakes).
>
> However, now I have a slight handle on the problem and know that there
> is a responsive and helpful mailing list to back me up, I'm happy to
> invest a little more time in trying another source build.
>
> Cheers,
>
> Loris
>
> --
> Dr. Loris Bennett (Mr.)
> ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de
>
> 


Re: [Rdkit-discuss] ImportError: No module named rdkit

2017-09-15 Thread Markus Sitzmann
BTW, Python 3.6 has been out since last Christmas ;-)   (and has made it to
sub-release .2)

On Fri, Sep 15, 2017 at 8:36 AM, Greg Landrum 
wrote:

> I'll provide a more detailed answer in a bit, but since you aren't using
> the system python anyway, is there any chance that you could switch to
> anaconda python on your machines? Anaconda is a great python distribution
> for scientific applications and it makes many things (including system
> administration) a ton easier.
>
> -greg
>
>
> On Fri, Sep 15, 2017 at 8:19 AM, Loris Bennett wrote:
>
>> Hi Greg,
>>
>> Greg Landrum  writes:
>>
>> > Hi Loris,
>> >
>> > On Thu, Sep 14, 2017 at 2:25 PM, Loris Bennett <
>> loris.benn...@fu-berlin.de> wrote:
>> >
>> >  I am trying to install RDKit on a university cluster running Linux from
>> >  source. The build seemed to go OK and 'make install' copied the
>> >  directories
>> >
>> >  lib
>> >  rdkit
>> >
>> >  to the NFS share where the software should reside. I then do
>> >
>> >  export RDBASE=/cm/shared/apps/rdkit/rdkit_2017_03_3
>> >  export PYTHONPATH=$PYTHONPATH:$RDBASE
>> >  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$RDBASE/lib
>> >
>> >  However when I then run Python (2.6.6) and try
>> >
>> > Just to do some expectation management: python 2.6 is pretty ancient
>> > and there's no guarantee that all of the RDKit code will work with
>> > it. Python 2.7 is the minimum version that we "officially"
>> > support. It's a very good idea to update.
>>
>> OK.  I didn't notice that 2.6 was deprecated - maybe this could be
>> explicitly mentioned in the install instructions.  I'm running the
>> RedHat clone Scientific Linux 6, so everything in this thread on
>> RH/Python applies.  So I can use either Python 2.7 or Python 3.5.  I can
>> ask the users what they prefer - although, as you seem to know my users
>> here in Berlin, maybe you know too ;-)
>>
>> >  import rdkit
>> >
>> >  I get
>> >
>> >  ImportError: No module named rdkit
>> >
>> >  I am not a Python person and my naive expectation was that there should
>> >  be a file called
>> >
>> >  rdkit.py
>> >
>> > Based on the info provided so far, there should be a directory called
>> > rdkit in the directory: /cm/shared/apps/rdkit/rdkit_2017_03_3
>>
>> This directory exists.
>>
>> > That directory should contain a number of sub dirs, other files, and a
>> > file called __init__.py (this is the one that tells Python that it can
>> > import the directory as a package).  What do you see there?
>>
>> The directory just contains
>>
>>   lib
>>   rdkit
>>
>> and nothing else, in particular, no __init__.py.  I have plenty of
>> __init__.pys in the build directory, so I assume I must have done
>> something wrong when running cmake and/or make install.
>>
>> I must admit that I found the installation instructions somewhat unclear
>> on that point.  I would find it clearer if things were couched in terms
>> of 'source' and 'destination'.  For me, as a make-guy rather than a
>> cmake-guy, it would also be helpful if it were made clearer at which
>> point the destination directory should be specified.  I ended up with
>> RDKit being installed under a very long path which included both my
>> intended path and the original build path, so I had to move things
>> around and may have goofed up at that point.
>>
>> >  which has to be on my PYTHONPATH. However, since the unpacked sources
>> >  together with the build don't seem to contain such a file, either
>> >  something is broken or the rdkit module should be found by some other
>> >  mechanism.
>> >
>> > Again, based on the info above, I would expect that you want "make
>> > install" to copy the "rdkit" and "lib" directories (as well as a
>> > couple others) to /cm/shared/apps/rdkit/rdkit_2017_03_3. Once we
>> > figure out what actually happened I can maybe help you figure out how
>> > to fix it.
>>
>> This is what I did:
>>
>>   module add boost # this just sets the boost stuff up
>>
>>   export VERSION=2017_03_3
>>   export RDBASE=/home/BUILD/rdkit/rdkit-rdkit-Release_${VERSION}
>>   export LD_LIBRARY_PATH=${RDBASE}:${LD_LIBRARY_PATH}
>>   export DESTDIR=/cm/shared/apps/rdkit/${VERSION}
>>
>> and then probably
>>
>>   cmake -DCMAKE_INSTALL_PREFIX=/cm/shared/apps/rdkit/${VERSION}
>>
>> so I may have over-egged my install-path-cake.  I started all the
>> fiddling with DESTDIR and CMAKE_INSTALL_PREFIX, because my initial
>> attempt resulted in the destination directory being the same as the
>> build directory, which didn't work so well.
>>
>> Thanks for the help - I'll have another go with Python 3.5 and try to keep my
>> eye on __init__.py.
>>
>> Cheers,
>>
>> Loris
>>
>> --
>> Dr. Loris Bennett (Mr.)
>> ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de
>>
>
>
> 

Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-14 Thread Markus Sitzmann
>>>>> Type "help", "copyright", "credits" or "license" for more information.
>>>>> Anaconda is brought to you by Continuum Analytics.
>>>>> Please check out: http://continuum.io/thanks and https://anaconda.org
>>>>> >>> import rdkit
>>>>> >>> from rdkit import Chem
>>>>> Traceback (most recent call last):
>>>>>   File "", line 1, in 
>>>>>   File "/opt/rdkit-Release_2016_03_1/rdkit/Chem/__init__.py", line
>>>>> 18, in 
>>>>> from rdkit import rdBase
>>>>> ImportError: cannot import name rdBase
>>>>>
>>>>>
>>>>> --
>>>>> Wandré Nunes de Pinho Veloso
>>>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
>>>>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais -
>>>>> UFMG
>>>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
>>>>> Inteligência Computacional - UNIFEI
>>>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
>>>>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG
>>>>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>>>>>
>>>>> 2017-09-14 9:17 GMT-03:00 Malitha Kabir :
>>>>>
>>>>>> Hi Wandré,
>>>>>>
>>>>>> Good day! It's malitha.
>>>>>>
>>>>>> Considering your first question, I would say the path variable is NOT
>>>>>> set correctly. To avoid doing gymnastics with the Linux system, you may
>>>>>> consider the following steps:
>>>>>>
>>>>>>   1. Install miniconda or anaconda from
>>>>>>      https://conda.io/miniconda.html
>>>>>>      and answer yes (y) when it asks to add the python shipped with
>>>>>>      conda to the path variable. I mean python within conda would be
>>>>>>      your default python. After installing it, when you run the command
>>>>>>      <<<<>>>>> from shell you will see something like <<>> at the screen
>>>>>>   2. Install rdkit from https://anaconda.org/rdkit/rdkit on top of
>>>>>>      conda
>>>>>>
>>>>>>
>>>>>> For question regarding energy minimization, you may find the
>>>>>> following link helpful.
>>>>>> https://sourceforge.net/p/rdkit/mailman/message/28298074/
>>>>>>
>>>>>> I hope, it helps!
>>>>>>
>>>>>> - malitha
>>>>>>
>>>>>> On Thu, Sep 14, 2017 at 4:22 PM, Wandré 
>>>>>> wrote:
>>>>>>
>>>>>>> So,
>>>>>>> 1) I ran all the commands in the tutorial on installing RDKit in
>>>>>>> Conda (https://github.com/rdkit/conda-rdkit), but when I run
>>>>>>> python and try to import Chem ("from rdkit import Chem"), an error
>>>>>>> message appears:
>>>>>>> Traceback (most recent call last):
>>>>>>>   File "", line 1, in 
>>>>>>>   File "/opt/rdkit-Release_2016_03_1/rdkit/Chem/__init__.py", line
>>>>>>> 18, in 
>>>>>>> from rdkit import rdBase
>>>>>>> ImportError: cannot import name rdBase
>>>>>>>
>>>>>>> 2) Thanks for all the references
>>>>>>>
>>>>>>> 3) Which function generates this "energy minimized molecule"?
>>>>>>>
>>>>>>> --
>>>>>>> Wandré Nunes de Pinho Veloso
>>>>>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
>>>>>>> Doutorando em Bioinformática - Universidade Federal de Minas
>>>>>>> Gerais - UFMG
>>>>>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
>>>>>>> Inteligência Computacional - UNIFEI
>>>>>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
>>>>>>> Membro do Grupo de Pesquisa Bioinformática Estrutu

Re: [Rdkit-discuss] ImportError: No module named rdkit

2017-09-14 Thread Markus Sitzmann
Not on Centos 6 - Docker requires Centos 7 for the host system.

On Thu, Sep 14, 2017 at 10:01 PM, Dimitri Maziuk 
wrote:

> On 09/14/2017 02:58 PM, Andrew Dalke wrote:
>
> > If only Greg got as much money for long term RDKit support as Red Hat
> > gets for long term RHEL support. :)
>
> Yep. But an rdkit docker container might be feasible.
>
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>
>
> 


Re: [Rdkit-discuss] bad inchi or parsing problem?

2017-09-14 Thread Markus Sitzmann
On Thu, Sep 14, 2017 at 8:09 PM, Jason Biggs  wrote:

> Okay, all three of these smiles strings resolve to the same inchi,
>
> "O=[N+](C1=NC2=CC=CC=C2N=C1)[N-](=O)C1=NC2=CC=CC=C2N=C1"
> "C1=CC=C2C(=C1)N=CC(=N2)N(=N(=O)C3=NC4=CC=CC=C4N=C3)=O"
> "[O-][N+](c1cnc2c2n1)=[N+]([O-])c3cnc4c4n3"
>
> even though to me they seem like different structures due to the specified
> charges.  Is this a limitation of inchi, or do I need to rethink my ideas
> of what makes two chemical structures the same?
>
>
Well, at least the first two I would regard as erroneous or unlikely (not
stable) creatures - and that is exactly what John meant by "InChI is an
identifier, not a representation". InChI's main purpose (particularly that of
Standard InChI) is to identify them as the same (corrected, normalized)
molecule, not as three separate species (that would be the purpose of a
representation). Of course, in many cases there might be a discussion about
where sensible correction/normalization should end and separation of
structures should start, but that is a long topic.


Re: [Rdkit-discuss] bad inchi or parsing problem?

2017-09-14 Thread Markus Sitzmann
On Thu, Sep 14, 2017 at 7:38 PM, John Mayfield 
wrote:

> InChI is an identifier and not a representation, you should not read
> InChIs... but we are beyond hope there so...
>

Wonderfully said - unfortunately one day they decided to make InChIs
"readable" ...


> The InChI string is correct and is the same if you roundtrip your
> preferred one with charge separated bonds and the 5 valent one.
>
> All toolkits will use the InChI library to read/write InChIs and it
> generates the representation with 5v nitrogens, cactus is either applying
> normalisation after reading or in this case (since it's the name resolved)
> doing a identifier lookup from an original SMILES used to generate this
> InChI:
>

No, my "good old" cactus service doesn't do a lookup in this case; it is
read from the string, which is of course in opposition to what I just
said :-). We did quite a bit regarding normalization: first, the CACTVS
toolkit behind the service is quite good in this regard, and I added a few
things for the web service, too.


 Markus


Re: [Rdkit-discuss] Non-redundant database of molecules (Wandré)

2017-09-13 Thread Markus Sitzmann
Unless you deliberately do something else, SMILES *calculated* by RDKit from
any input are canonical per se (BUT that is only true if you compare them to
other SMILES also calculated by RDKit; you cannot compare SMILES between
software packages, even if they are canonical within the domain of each
software package).
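
A minimal sketch of how this could be used for the de-duplication asked about in this thread: key every structure by a canonical string and keep only the first occurrence. The helper names `deduplicate` and `rdkit_canonical_smiles` are made up for illustration; only `Chem.MolFromSmiles` and `Chem.MolToSmiles` are RDKit calls. The RDKit import is deferred so the generic part runs even where RDKit is not installed.

```python
def deduplicate(smiles_list, canonicalize):
    """Return {canonical key: first input SMILES seen with that key}."""
    unique = {}
    for smi in smiles_list:
        key = canonicalize(smi)
        if key is None:
            # Input could not be parsed; skip it rather than guess.
            continue
        # setdefault keeps the first SMILES stored under each key.
        unique.setdefault(key, smi)
    return unique


def rdkit_canonical_smiles(smi):
    """Canonical isomeric SMILES from RDKit, or None if parsing fails."""
    # Deferred import: deduplicate() itself does not depend on RDKit.
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        return None
    return Chem.MolToSmiles(mol, isomericSmiles=True)
```

With RDKit installed, `deduplicate(["OCC", "CCO"], rdkit_canonical_smiles)` should return a single entry, because both inputs describe ethanol. The keys are only comparable when they all come from the same toolkit, so a database built this way should not mix canonical SMILES from different software.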

On Wed, Sep 13, 2017 at 9:16 PM, Wandré  wrote:

> Why not use the InChI function in RDKit?
> Canonical SMILES cannot be generated by RDKit, correct?
>
> --
> Wandré Nunes de Pinho Veloso
> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - UFMG
> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
> Inteligência Computacional - UNIFEI
> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG
> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>
> 2017-09-13 15:57 GMT-03:00 Chris Swain :
>
>> Hi,
>>
>> I’d use a text-based version of the structure (InChIKey or canonical
>> SMILES); it then becomes an easy task to do the comparison in Python
>>
>> I wrote a script to do this in Vortex but it should be easy to modify.
>> https://www.macinchem.org/reviews/vortex/tut28/scripting_vortex28.php
>>
>>
>> Cheers
>>
>> Chris
>>
>>
>>
>> Today's Topics:
>>
>>   1. Non-redundant database of molecules (Wandré)
>>
>>
>> --
>>
>> Message: 1
>> Date: Wed, 13 Sep 2017 07:13:56 -0300
>> From: Wandré
>> To: rdkit-discuss@lists.sourceforge.net
>> Subject: [Rdkit-discuss] Non-redundant database of molecules
>> Message-ID:
>> 
>> Content-Type: text/plain; charset="utf-8"
>>
>> Hi,
>>
>> My name is Wandré and I'm from Brazil.
>> I'm trying to build a big database of molecules, but I want to eliminate all
>> the redundant molecules before inserting them into the database.
>> I want to know what the best method is to identify a molecule in RDKit.
>> Is it SMILES ("Chem.MolToSmiles(mol,isomericSmiles=True)"), or will I need to
>> compare all molecules, one by one, before inserting them into the database
>> (using Tanimoto)?
>> This can be hard to do because my database will have many millions of
>> molecules, so is comparing one by one before inserting the only answer?
>> Checking whether a SMILES was already inserted is easy (text comparison),
>> but comparing molecule fingerprints...
>>
>> If I really need to compare molecule fingerprints, how do I store this
>> data in PostgreSQL without using the cartridge? I would generate the
>> fingerprint (Atompair, for example), store it in the database, and compare
>> all the fingerprints, one by one, before inserting a new molecule. This
>> fingerprint (Atompair) has lots of features, so storing it in a relational
>> database is expensive.
>> Is it possible?
>>
>> Thanks!
>>
>> --
>> Wandré Nunes de Pinho Veloso
>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - UFMG
>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
>> Inteligência Computacional - UNIFEI
>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG
>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>> End of Rdkit-discuss Digest, Vol 119, Issue 20

Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread Markus Sitzmann
PS. The conda version has InChI support
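
A quick way to check this from Python, as a rough sketch: the `Chem.inchi.INCHI_AVAILABLE` flag is the one mentioned further down in this thread, while the function names `inchi_support` and `inchi_key_or_none` are only illustrative.

```python
def inchi_support():
    """Report whether the RDKit in the current environment can write InChI."""
    try:
        from rdkit.Chem import inchi
    except ImportError:
        return "rdkit is not importable from this Python"
    if inchi.INCHI_AVAILABLE:
        return "InChI support available"
    return "RDKit was built without InChI support"


def inchi_key_or_none(smiles):
    """InChIKey for a SMILES string, or None if it cannot be produced."""
    from rdkit import Chem
    from rdkit.Chem import inchi

    if not inchi.INCHI_AVAILABLE:
        return None
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return inchi.MolToInchiKey(mol)


if __name__ == "__main__":
    print(inchi_support())
```

If the first function reports a build without InChI, switching to a build that includes it is the simpler fix, rather than trying to patch the existing installation.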

On Wed, Sep 13, 2017 at 10:04 PM, Markus Sitzmann  wrote:

> Strong recommendation: use the conda version:
>
> http://www.rdkit.org/docs/Install.html
>
> On Wed, Sep 13, 2017 at 9:58 PM, Wandré  wrote:
>
>> I just ran sudo apt-get install python-rdkit librdkit1 rdkit-data 😁
>> I'm trying to solve this with this link:
>> http://www.blopig.com/blog/2013/02/how-to-install-rdkit-on-ubuntu-12-04/
>>
>> --
>> Wandré Nunes de Pinho Veloso
>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - UFMG
>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
>> Inteligência Computacional - UNIFEI
>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG
>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>>
>> 2017-09-13 16:55 GMT-03:00 Markus Sitzmann :
>>
>>> How did you install rdkit so far? And where? Is it the conda/anaconda
>>> version?
>>>
>>> On Wed, Sep 13, 2017 at 9:39 PM, Wandré  wrote:
>>>
>>>> How to install RDKit with InChI?
>>>> When I run Chem.inchi.INCHI_AVAILABLE, the result is False
>>>>
>>>> --
>>>> Wandré Nunes de Pinho Veloso
>>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
>>>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais -
>>>> UFMG
>>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
>>>> Inteligência Computacional - UNIFEI
>>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
>>>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG
>>>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>>>>
>>>> 2017-09-13 16:30 GMT-03:00 Wandré :
>>>>
>>>>> Thanks Malitha.
>>>>> I chose these descriptors because I will store them in my database,
>>>>> so it will be fast to compare a molecule before inserting it into the
>>>>> database. My worry now is whether RDKit will generate different SMILES or
>>>>> InChI for the same SDF molecule, or identical ones for different
>>>>> molecules (molecules from RCSB PDB, PubChem, ChEMBL, for example).
>>>>>
>>>>> --
>>>>> Wandré Nunes de Pinho Veloso
>>>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
>>>>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais -
>>>>> UFMG
>>>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
>>>>> Inteligência Computacional - UNIFEI
>>>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
>>>>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG
>>>>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>>>>>
>>>>> 2017-09-13 16:22 GMT-03:00 Malitha Kabir :
>>>>>
>>>>>> Hi Wandré,
>>>>>>
>>>>>> It seems you have already done intense research on this. Kindly accept my
>>>>>> comments as an addition to your idea (not the answer you are trying to
>>>>>> find). In my view, categorizing molecules by their descriptors should
>>>>>> reduce computation time. RDKit currently offers calculation of about 200
>>>>>> descriptors! So a careful look at those makes a lot of sense to me.
>>>>>> Conceptually, descriptor matching should follow a sequence (I don't know
>>>>>> what sequence would be ideal) - for example, MolWt should match first (H
>>>>>> contribution and ions should be taken into consideration here), followed
>>>>>> by subsequent matching of other descriptors (this might differ when
>>>>>> writing programs). There is some reading material on molecular
>>>>>> fingerprints and database schemas. You may have a look at it.
>>>>>>
>>>>>> The links are from Daylight. I am neither involved with the company
>>>>>> nor their product.
>>>>>> http://www.daylight.com/dayhtml/doc/theory/theory.finger.html
>>>>>> http://www.daylight.com/dayhtml/doc/theory/theory.thor.html
>>>>>>
>>>>>> Best regards,
>>>>>> - malitha
>>>>>>

Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread Markus Sitzmann
Strong recommendation: use the conda version:

http://www.rdkit.org/docs/Install.html

On Wed, Sep 13, 2017 at 9:58 PM, Wandré  wrote:

> I just ran sudo apt-get install python-rdkit librdkit1 rdkit-data 😁
> I'm trying to solve this with this link:
> http://www.blopig.com/blog/2013/02/how-to-install-rdkit-on-ubuntu-12-04/
>
> --
> Wandré Nunes de Pinho Veloso
> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - UFMG
> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
> Inteligência Computacional - UNIFEI
> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG
> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>
> 2017-09-13 16:55 GMT-03:00 Markus Sitzmann :
>
>> How did you install rdkit so far? And where? Is it the conda/anaconda
>> version?
>>
>> On Wed, Sep 13, 2017 at 9:39 PM, Wandré  wrote:
>>
>>> How to install RDKit with InChI?
>>> When I run Chem.inchi.INCHI_AVAILABLE, the result is False
>>>
>>> --
>>> Wandré Nunes de Pinho Veloso
>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
>>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais -
>>> UFMG
>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
>>> Inteligência Computacional - UNIFEI
>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
>>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG
>>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>>>
>>> 2017-09-13 16:30 GMT-03:00 Wandré :
>>>
>>>> Thanks Malitha.
>>>> I chose these descriptors because I will store them in my database, so
>>>> it will be fast to compare a molecule before inserting it into the
>>>> database. My worry now is whether RDKit will generate different SMILES or
>>>> InChI for the same SDF molecule, or identical ones for different
>>>> molecules (molecules from RCSB PDB, PubChem, ChEMBL, for example).
>>>>
>>>> --
>>>> Wandré Nunes de Pinho Veloso
>>>> Assistant Professor - Unifei - Itabira Advanced Campus, MG
>>>> PhD candidate in Bioinformatics - Universidade Federal de Minas Gerais -
>>>> UFMG
>>>> Researcher at INSILICO - Interdisciplinary Group on Simulation and
>>>> Computational Intelligence - UNIFEI
>>>> Member of the Biological Signatures Research Group, FIOCRUZ
>>>> Member of the Structural Bioinformatics Research Group, UFMG
>>>> Bioinformatics and Systems Laboratory - LBS, DCC, UFMG
>>>>
>>>> 2017-09-13 16:22 GMT-03:00 Malitha Kabir :
>>>>
>>>>> Hi Wandré,
>>>>>
>>>>> It seems you have already done intensive research on this. Kindly accept
>>>>> my comments as an addition to your idea (not the answer you are trying to
>>>>> find). In my view, categorizing molecules using their descriptors should
>>>>> reduce computation time. RDKit currently offers calculation of about 200
>>>>> descriptors! So a careful look at those makes a lot of sense to me.
>>>>> Conceptually, descriptor matching should follow a sequence (I don't know
>>>>> what sequence would be ideal): for example, MolWt should match first (H
>>>>> contribution and ions should be taken into consideration here), followed
>>>>> by matching of the other descriptors (the order might differ when writing
>>>>> programs). There is some reading material on molecular fingerprints and
>>>>> database schemas. You may want to have a look at it.
>>>>>
>>>>> The links are from Daylight. I am neither involved with the company
>>>>> nor their product.
>>>>> http://www.daylight.com/dayhtml/doc/theory/theory.finger.html
>>>>> http://www.daylight.com/dayhtml/doc/theory/theory.thor.html
>>>>>
>>>>> Best regards,
>>>>> - malitha
>>>>>
>>>>>
>>>>> On Thu, Sep 14, 2017 at 12:43 AM, Wandré 
>>>>> wrote:
>>>>>
>>>>>> Thanks for all the answers.
>>>>>>
>>>>>> Reading all answers, I think in something different... If the SMILES
>>>>>> (Chem.MolToSmiles(mol,isomericSmiles=True)) and Inchi
>>>>>> (Chem.MolToInchi(mol)) can generate the same value in different 
>>>>

Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread Markus Sitzmann
How did you install rdkit so far? And where? Is it the conda/anaconda
version?
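
To see what a given installation reports, here is a minimal sketch. It assumes only that RDKit's Python wrappers may or may not be importable in the interpreter being tested; the helper name inchi_status is made up for illustration. The flag it reads is the same Chem.inchi.INCHI_AVAILABLE mentioned in the quoted message.

```python
def inchi_status():
    """Return True/False for InChI support, or None if RDKit cannot be imported."""
    try:
        # rdkit.Chem.inchi sets INCHI_AVAILABLE depending on whether the
        # InChI wrapper was compiled into this build.
        from rdkit.Chem import inchi
    except ImportError:
        return None
    return bool(inchi.INCHI_AVAILABLE)


status = inchi_status()
if status is None:
    print("RDKit is not importable from this interpreter")
elif status:
    print("This RDKit build has InChI support")
else:
    print("This RDKit build was made without InChI support")
```

Running this with the same interpreter that gives the False result shows whether the problem is the build itself or simply a different Python environment being picked up.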

On Wed, Sep 13, 2017 at 9:39 PM, Wandré  wrote:

> How to install RDKit with InChI?
> When I run Chem.inchi.INCHI_AVAILABLE, the result is False
>
> --
> Wandré Nunes de Pinho Veloso
> Assistant Professor - Unifei - Itabira Advanced Campus, MG
> PhD candidate in Bioinformatics - Universidade Federal de Minas Gerais - UFMG
> Researcher at INSILICO - Interdisciplinary Group on Simulation and
> Computational Intelligence - UNIFEI
> Member of the Biological Signatures Research Group, FIOCRUZ
> Member of the Structural Bioinformatics Research Group, UFMG
> Bioinformatics and Systems Laboratory - LBS, DCC, UFMG
>
> 2017-09-13 16:30 GMT-03:00 Wandré :
>
>> Thanks Malitha.
>> I chose these descriptors because I will store them in my database, so
>> it will be fast to compare a molecule before inserting it.
>> My worry now is whether RDKit will generate different SMILES or InChI for
>> the same SDF molecule, or identical ones for different molecules
>> (molecules from RCSB PDB, PubChem, ChEMBL, for example).
>>
>> --
>> Wandré Nunes de Pinho Veloso
>> Assistant Professor - Unifei - Itabira Advanced Campus, MG
>> PhD candidate in Bioinformatics - Universidade Federal de Minas Gerais - UFMG
>> Researcher at INSILICO - Interdisciplinary Group on Simulation and
>> Computational Intelligence - UNIFEI
>> Member of the Biological Signatures Research Group, FIOCRUZ
>> Member of the Structural Bioinformatics Research Group, UFMG
>> Bioinformatics and Systems Laboratory - LBS, DCC, UFMG
>>
>> 2017-09-13 16:22 GMT-03:00 Malitha Kabir :
>>
>>> Hi Wandré,
>>>
>>> It seems you have already done intensive research on this. Kindly accept
>>> my comments as an addition to your idea (not the answer you are trying to
>>> find). In my view, categorizing molecules using their descriptors should
>>> reduce computation time. RDKit currently offers calculation of about 200
>>> descriptors! So a careful look at those makes a lot of sense to me.
>>> Conceptually, descriptor matching should follow a sequence (I don't know
>>> what sequence would be ideal): for example, MolWt should match first (H
>>> contribution and ions should be taken into consideration here), followed
>>> by matching of the other descriptors (the order might differ when writing
>>> programs). There is some reading material on molecular fingerprints and
>>> database schemas. You may want to have a look at it.
>>>
>>> The links are from Daylight. I am neither involved with the company nor
>>> their product.
>>> http://www.daylight.com/dayhtml/doc/theory/theory.finger.html
>>> http://www.daylight.com/dayhtml/doc/theory/theory.thor.html
>>>
>>> Best regards,
>>> - malitha
>>>
>>>
>>> On Thu, Sep 14, 2017 at 12:43 AM, Wandré  wrote:
>>>
>>>> Thanks for all the answers.
>>>>
>>>> Reading all the answers, I thought of something different... If the SMILES
>>>> (Chem.MolToSmiles(mol,isomericSmiles=True)) and InChI
>>>> (Chem.MolToInchi(mol)) can have the same value for different molecules,
>>>> I will generate other descriptors (NumHDonors, NumHAcceptors,
>>>> RingCount, GetNumAtoms, TPSA, pyLabuteASA, MolWt, CalcNumRotatableBonds
>>>> and MolLogP) to compare all the molecules whose SMILES and InChI are the
>>>> same.
>>>> If all these data are the same, I will generate the fingerprint
>>>> (Atompair, for example) and use the Tanimoto coefficient; if this value
>>>> is 1 when I compare two molecules, the molecules are the same.
>>>>
>>>> Where is my mistake (I think there are one or more mistakes)?
>>>>
>>>> Thanks!
>>>>
>>>> --
>>>> Wandré Nunes de Pinho Veloso
>>>> Assistant Professor - Unifei - Itabira Advanced Campus, MG
>>>> PhD candidate in Bioinformatics - Universidade Federal de Minas Gerais -
>>>> UFMG
>>>> Researcher at INSILICO - Interdisciplinary Group on Simulation and
>>>> Computational Intelligence - UNIFEI
>>>> Member of the Biological Signatures Research Group, FIOCRUZ
>>>> Member of the Structural Bioinformatics Research Group, UFMG
>>>> Bioinformatics and Systems Laboratory - LBS, DCC, UFMG
>>>>
>>>> 2017-09-13 14:19 GMT-03:00 Dimitri Maziuk :
>>>>
>>>>> On 09/13/2017 11:46 AM, Markus Sitzmann wrote:
>>>>> > The

Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread Markus Sitzmann
Hi Wandré,

your problem is the opposite. It is quite unlikely, actually impossible,
that different molecules produce the same InChI or SMILES. Your bigger
problem is that what you regard as the same chemical may be regarded as
different ones by SMILES or InChI. The danger of this is quite big for
SMILES; it gets better with canonical SMILES (but in my opinion, not by
much). Your best friend is InChI or Standard InChI.

Also, if two different molecules did produce the same InChI or SMILES,
in all likelihood all your descriptors would be very similar, too, because
SMILES, InChI etc. are just connection table representations, and the
descriptor algorithms work only on the connection table (so the molecules
also look the same to any of these algorithms).

Calculating a Tanimoto coefficient doesn't help with this problem either,
and a Tanimoto coefficient of 1 doesn't mean two molecules are identical
(they are very similar, but not necessarily identical).
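
To illustrate the last point, here is a toy sketch in plain Python. It does not use RDKit: the "fingerprint" is simply the set of substrings (up to length 3) of a SMILES string, an assumption made purely for illustration, and the function names are made up. Like any presence/absence fingerprint, it forgets how often a feature occurs, so two different alkanes end up with a Tanimoto coefficient of 1.

```python
def toy_fingerprint(smiles, max_len=3):
    """Set of all substrings of length 1..max_len (a toy, not a real fingerprint)."""
    return {
        smiles[i:i + n]
        for n in range(1, max_len + 1)
        for i in range(len(smiles) - n + 1)
    }


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two feature sets: |A & B| / |A | B|."""
    union = fp_a | fp_b
    return len(fp_a & fp_b) / len(union) if union else 1.0


hexane, heptane = "CCCCCC", "CCCCCCC"
score = tanimoto(toy_fingerprint(hexane), toy_fingerprint(heptane))

# Both strings yield the feature set {"C", "CC", "CCC"}, so the score is 1.0
# even though the molecules differ.
print(hexane != heptane, score)
```

Real fingerprints encode far more than this toy does, but the principle is the same: a score of 1 says the feature sets match, not that the structures do.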

Markus

On Wed, Sep 13, 2017 at 8:43 PM, Wandré  wrote:

> Thanks for all the answers.
>
> Reading all the answers, I thought of something different... If the SMILES
> (Chem.MolToSmiles(mol,isomericSmiles=True)) and InChI
> (Chem.MolToInchi(mol)) can have the same value for different molecules,
> I will generate other descriptors (NumHDonors, NumHAcceptors,
> RingCount, GetNumAtoms, TPSA, pyLabuteASA, MolWt, CalcNumRotatableBonds
> and MolLogP) to compare all the molecules whose SMILES and InChI are the
> same.
> If all these data are the same, I will generate the fingerprint (Atompair,
> for example) and use the Tanimoto coefficient; if this value is 1 when I
> compare two molecules, the molecules are the same.
>
> Where is my mistake (I think there are one or more mistakes)?
>
> Thanks!
>
> --
> Wandré Nunes de Pinho Veloso
> Assistant Professor - Unifei - Itabira Advanced Campus, MG
> PhD candidate in Bioinformatics - Universidade Federal de Minas Gerais - UFMG
> Researcher at INSILICO - Interdisciplinary Group on Simulation and
> Computational Intelligence - UNIFEI
> Member of the Biological Signatures Research Group, FIOCRUZ
> Member of the Structural Bioinformatics Research Group, UFMG
> Bioinformatics and Systems Laboratory - LBS, DCC, UFMG
>
> 2017-09-13 14:19 GMT-03:00 Dimitri Maziuk :
>
>> On 09/13/2017 11:46 AM, Markus Sitzmann wrote:
>> > It is rare to have 3D information available for a molecule dataset,
>> and trustworthy 3D information is rarer still. And what is the point of
>> generating the configuration of a molecule first if you cannot trust
>> that either?
>>
>> Veering further off topic, do you even care in the first place? E.g. if
>> your molecule always exists as a mixture of isomers, except in some
>> megabuck-per-microgram painstakingly created reference samples, a
>> 3D-based system will represent it as two distinct molecules. Whereas you
>> want it represented as one.
>>
>> Last I looked PDB Ligand Expo had two different benzenes. Their software
>> doesn't (didn't?) do the circle version so they don't have the third one.
>>
>> --
>> Dimitri Maziuk
>> Programmer/sysadmin
>> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>>
>>


Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread Markus Sitzmann
It is rare to have 3D information available for a molecule dataset, and
trustworthy 3D information is rarer still. And what is the point of
generating the configuration of a molecule first if you cannot trust
that either?

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 13. Sep 2017, at 17:58, Dimitri Maziuk  wrote:
> 
>> On 2017-09-13 10:17, Markus Sitzmann wrote:
>> Canonical SMILES are only a very rough approximation of a "unique molecule",
>> as they usually don't work well for tautomeric forms of a compound.
>> InChI or Standard InChI is much better although also not perfect.
> 
> ALATIS I linked to above does impose a stable consistent ordering for 
> everything including hydrogens. The downside is it's garbage in - garbage 
> out: you need to start with a 3D structure, otherwise it has an option to 
> addHs and gen3D but no guarantee it'll generate the one you want.
> 
> Dima
> 


Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread Markus Sitzmann
Canonical SMILES are only a very rough approximation of a "unique molecule",
as they usually don't work well for tautomeric forms of a compound.
InChI or Standard InChI is much better, although also not perfect.

The "perfect solution" depends also on how uniqueness or redundancy of
molecules is regarded for the purpose of the database.
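
The unique-column idea from TJ's quoted message below can be tried without a server. The following sketch uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example is self-contained); the table and function names are made up. In PostgreSQL 9.5 and later, the counterpart of SQLite's INSERT OR IGNORE is INSERT ... ON CONFLICT DO NOTHING, as described at the link in the quoted message.

```python
import sqlite3


def create_table(conn):
    # The UNIQUE constraint makes the database itself reject duplicates.
    conn.execute(
        "CREATE TABLE molecules ("
        "id INTEGER PRIMARY KEY, "
        "inchi TEXT NOT NULL UNIQUE)"
    )


def insert_unique(conn, inchi):
    """Insert an identifier; return True if it was new, False if already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO molecules (inchi) VALUES (?)", (inchi,)
    )
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
create_table(conn)

ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
benzene = "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H"

for identifier in (ethanol, benzene, ethanol):
    print(identifier, insert_unique(conn, identifier))

print(conn.execute("SELECT COUNT(*) FROM molecules").fetchone()[0])
```

Which identifier goes into the unique column (canonical SMILES, InChI, Standard InChI) is exactly the "how is uniqueness regarded" decision; the database only enforces whatever definition that column encodes.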


On Wed, Sep 13, 2017 at 4:56 PM, TJ O'Donnell  wrote:

> Let the database do the work for you.  Create a canonical SMILES column
> and/or InChI column and declare them to be unique.  As you insert new
> rows, postgres will let  you know if there is already a row with the same
> SMILES or InChI.
> Here's some help on how to handle that.
> https://www.postgresql.org/docs/9.5/static/sql-insert.html#SQL-ON-CONFLICT
>
> TJ O'Donnell
>
> On Wed, Sep 13, 2017 at 3:13 AM, Wandré  wrote:
>
>> Hi,
>>
>> My name is Wandré and I'm from Brazil.
>> I'm trying to build a big database of molecules, but I want to eliminate
>> all the redundant molecules before inserting them into the database.
>> I want to know what the best method is to identify a molecule in RDKit.
>> Is it SMILES ("Chem.MolToSmiles(mol,isomericSmiles=True)"), or will I need
>> to compare all molecules, one by one, before inserting them into the
>> database (using Tanimoto)?
>> This can be hard to do because my database will have many millions of
>> molecules, so is comparing one by one before each insert the only answer?
>> Checking whether a SMILES was already inserted is easy (text comparison),
>> but comparing molecule fingerprints...
>>
>> If I really need to compare molecule fingerprints, how can I store
>> this data in PostgreSQL without using the cartridge? I would generate the
>> fingerprint (Atompair, for example), store it in the database, and
>> compare all the fingerprints, one by one, before inserting a new molecule.
>> This fingerprint (Atompair) has a lot of features, so storing it in a
>> relational database is expensive.
>> Is it possible?
>>
>> Thanks!
>>
>> --
>> Wandré Nunes de Pinho Veloso
>> Assistant Professor - Unifei - Itabira Advanced Campus, MG
>> PhD candidate in Bioinformatics - Universidade Federal de Minas Gerais - UFMG
>> Researcher at INSILICO - Interdisciplinary Group on Simulation and
>> Computational Intelligence - UNIFEI
>> Member of the Biological Signatures Research Group, FIOCRUZ
>> Member of the Structural Bioinformatics Research Group, UFMG
>> Bioinformatics and Systems Laboratory - LBS, DCC, UFMG
>>


Re: [Rdkit-discuss] ETKDG conformation generation algorithm and fullerene-like structures.

2017-09-07 Thread Markus Sitzmann
Mhh, your choice of test molecules sounds like going from the poster child
to the archenemy of conformation generation algorithms :-)

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 7. Sep 2017, at 18:59, Jason Biggs  wrote:
> 
> I've never had success using the ETKDG or KDG methods for fullerenes, when 
> trying on C60 it goes for a long time and returns -1.  The ETDG method works 
> on C60, but fails on your C60H60.
> 
> One thing you could try is to embed the hydrogen-suppressed structure, then 
> add the hydrogens
> 
> RDKit::DGeomHelpers::EmbedParameters params(RDKit::DGeomHelpers::ETDG);
> 
> RDKit::DGeomHelpers::EmbedMolecule(*mol, params);
> 
> bool explicitOnly = false;
> 
> bool addCoords = true;
> 
> RDKit::MolOps::addHs(*mol, explicitOnly, addCoords);
> 
> seems to work.
> 
> 
> 
> 
> Jason Biggs
> 
> 
>> On Thu, Sep 7, 2017 at 10:49 AM, Dmitry Redkin  wrote:
>> Hello all!
>> I've just started to use RDKit, and now I'm trying to generate some 3D
>> conformation for a molecule. ETKDG successfully optimized cyclohexane, so
>> I've tried some more complex example.
>> It was this fullerene-like structure (with all the single bonds and every C
>> atom having H atom attached). I'm attaching it to this email.
>> 
>> But whatever I've tried to do with the embedding parameters, RDKit either
>> stalls for several minutes trying to complete the operation or just exits
>> with all-zero coordinates.
>> 
>> Is there any way to generate conformations for this structure? Maybe I did
>> something wrong or there is some flag that can be set to get some result
>> (any result, not necessarily the best one) in a reasonable time?
>> 
>> My code is pretty simple, you can see it below.
>> 
>> 
>> RWMol *mol = MolFileToMol("d:\\temp\\exe32\\full.mol", true, false, false);
>> 
>> MolOps::addHs(*mol);
>> DGeomHelpers::EmbedParameters p(DGeomHelpers::ETKDG);
>> // if I left maxIterations at -1, I could not wait long enough for
>> // EmbedMolecule to exit.
>> p.maxIterations = 100;
>> p.useRandomCoords = true;
>> int confid = DGeomHelpers::EmbedMolecule(*((ROMol*)mol), p);
>> MolToMolFile(*((ROMol*)mol), "d:\\temp\\exe32\\full1.mol", true, confid);
>> delete mol;  // MolFileToMol allocates with new, so delete rather than free()
>> 
>> 
>> 
>> Dmitry Redkin, ACD Inc.
>> red...@acdlabs.ru 


Re: [Rdkit-discuss] Using RDKit in PyCharm and Anaconda on Windows

2017-06-01 Thread Markus Sitzmann
I definitely have it working on Linux, too, but it may be that I only
tried it with PyCharm 2017.1.3 in the first place. Before that, I did what
Greg suggested: starting PyCharm from the activated environment.
Unfortunately, I have no experience with Windows in this regard either.
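
One plausible reason why launching PyCharm from the activated environment helps on Windows is that activation puts the environment's directories (including Library\bin, where conda packages keep their DLLs) on PATH. The sketch below checks which of these directories are missing from a given PATH value. The function names and the subset of directories checked are my assumptions, and the example PATH string is made up.

```python
import ntpath  # Windows path rules, usable on any platform


def conda_env_path_entries(env_root):
    """A subset of the directories that activating a conda env adds to PATH."""
    return [
        env_root,
        ntpath.join(env_root, "Library", "bin"),
        ntpath.join(env_root, "Scripts"),
    ]


def missing_from_path(env_root, path_value):
    """Return the expected env directories that do not appear in path_value."""
    def norm(p):
        return ntpath.normcase(ntpath.normpath(p))

    on_path = {norm(p) for p in path_value.split(";") if p}
    return [d for d in conda_env_path_entries(env_root) if norm(d) not in on_path]


# Example with a made-up PATH in which Library\bin is absent.
example_path = r"C:\Windows\system32;C:\Anaconda2\envs\eg1;C:\Anaconda2\envs\eg1\Scripts"
print(missing_from_path(r"C:\Anaconda2\envs\eg1", example_path))
```

On a real Windows machine, one would pass os.environ["PATH"] as seen from inside the PyCharm Python console instead of the example string.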

On Thu, Jun 1, 2017 at 9:57 AM, Pavel Polishchuk 
wrote:

> I had some issues to run rdkit from Python console in PyCharm (4.5.5) on
> Linux. After recent installation of PyCharm 2017.1.3 it started to work.
> Maybe updating PyCharm will help on Win as well.
>
> Pavel.
>
>
>
> On 05/30/2017 10:10 PM, West, Richard wrote:
>
>> We're having trouble getting RDKit to work in a PyCharm project using an
>> Anaconda interpreter (Python 2.7), on Windows 8.1.
>> Has anyone had success with this and can guide us?
>> The trouble is we get an
>>
>>ImportError: DLL load failed: The specified module could not be found.
>>
>> when trying to import rdkit (or rdBase).
>>
>> We have tried many variations of the following, but here is a basic
>> recipe of what does/doesn't work:
>> 1. Make a new conda environment (called 'eg1'), install rdkit ('conda
>> install -c rdkit rdkit')
>> 2. From a cmd.exe prompt, use this environment ('activate eg1') load
>> python ('python') and import rdkit ('import rdkit') it works fine.
>> 3. From PyCharm, create a Project Interpreter (pointing to
>> 'C:\Anaconda2\envs\eg1\python.exe'), and use this to run a script or
>> create a new Python Console in which you 'import rdkit', leading to the
>> "DLL load failed" message.
>> 4. We have tried manually adding a bunch of things to the "Interpreter
>> Paths" in PyCharm, but without success (perhaps we just didn't add the
>> right thing).
>>
>>
>> 
>>
>> Update: just before I hit "send" on this request for help, we stumbled
>> across this posting of the same problem, and solution, from Christian
>> Ribeaud:
>> https://intellij-support.jetbrains.com/hc/en-us/community/posts/115000244450-DLL-load-failed
>>
>> It seems that if we open cmd.exe, activate the environment, and then
>> launch PyCharm exe from there, it works.
>> I'm sharing this here because it took us a while to find the other post,
>> but also to ask: is there a "better" way?
>>
>> Cheers,
>> Richard
>>
>>
>> --
>> Richard H. West, Ph.D.
>> Assistant Professor, Department of Chemical Engineering,
>> Northeastern University, 360 Huntington Ave, Boston, MA 02115
>> http://northeastern.edu/comocheng    Phone: 617-373-5163
>>
>>


Re: [Rdkit-discuss] Trouble with conda build in Docker

2017-04-25 Thread Markus Sitzmann
Hi Riccardo,

just to tell you: my problem went away (I haven't touched it since my last
email) for whatever reason (did you do something?)

Markus

On Thu, Apr 6, 2017 at 12:51 PM, Markus Sitzmann 
wrote:

> Thanks Riccardo for your reply
>
> I tried both master and development, both with your Dockerfile (which
> starts from centos) and mine (which starts from Debian:jessie). Same result
> everywhere. I haven't built it in a while either, but updating to
> Docker CE, version 17.03, triggered this rebuild.
>
> I hope it isn't my setup (well, that is actually what I wanted to find out
> :-), i.e. whether somebody else has problems). It isn't urgent, either :-)
>
>
> Markus
>
> On Thu, Apr 6, 2017 at 8:50 AM, Riccardo Vianello <
> riccardo.viane...@gmail.com> wrote:
>
>> Hi Markus,
>>
>> On Thu, Apr 6, 2017 at 12:03 AM, Markus Sitzmann <
>> markus.sitzm...@gmail.com> wrote:
>>
>>> Hi (Riccardo).
>>>
>>> I have trouble with the conda build in Docker (I just updated to the
>>> most recent version which triggered the new build) - below is the error
>>> trace. I took the original Docker file and just edited out all non-Python35
>>> builds - so it does only the Python35 builds and ends somewhere when
>>> rdkit-postgres95 is built. Does somebody have the same problem?
>>>
>>
>> I couldn't work on this during the last few months so I didn't test any
>> recent builds. I might be able to have a closer look and run some tests
>> next week. What branch of the conda-rdkit repository are you using (master
>> or development)?
>>
>> Best,
>> Riccardo
>>
>>
>>>
>>> make[3]: Entering directory `/home/rdkit/bld/postgresql95_
>>> 1491429385957/work/postgresql-9.5.2/src/port'
>>> make -C ../backend submake-errcodes
>>> make[3]: Entering directory `/home/rdkit/bld/postgresql95_
>>> 1491429385957/work/postgresql-9.5.2/src/backend/catalog'
>>> cd ../../../src/include/catalog && /bin/sh ../../../config/missing perl
>>> ./duplicate_oids
>>> make -C utils probes.h
>>> ***
>>> ERROR: Perl is missing on your system. It is needed unless you are
>>> building
>>> from an unmodified official distribution of PostgreSQL.
>>> ***
>>> make[3]: Leaving directory `/home/rdkit/bld/postgresql95_
>>> 1491429385957/work/postgresql-9.5.2/src/backend/catalog'
>>>
>>>
>


Re: [Rdkit-discuss] Install rdkit with anaconda3

2017-04-12 Thread Markus Sitzmann
I think if you install conda freshly now, it automatically uses Python 3.6.
If you don't specifically need 3.6, you have to do this:

conda install python=3.5

Then you should be able to install rdkit as described.

On Wed, Apr 12, 2017 at 12:29 PM, Greg Landrum 
wrote:

>
>
> On Wed, Apr 12, 2017 at 4:59 AM, Maciek Wójcikowski wrote:
>
>>
>> There are no Python 3.6 packages of rdkit right now.
>>
>> I guess we can ask Greg or Riccardo to build them with the next release
>> of RDKit.
>>
>
> That is the plan. When we do the next release (in about a week), we'll do
> python3.6 builds too.
>
> -greg
>
>


Re: [Rdkit-discuss] Trouble with conda build in Docker

2017-04-06 Thread Markus Sitzmann
Thanks Riccardo for your reply

I tried both master and development, both with your Dockerfile (which
starts from centos) and mine (which starts from Debian:jessie). Same result
everywhere. I haven't built it in a while either, but updating to
Docker CE, version 17.03, triggered this rebuild.

I hope it isn't my setup (well, that is actually what I wanted to find out
:-), i.e. whether somebody else has problems). It isn't urgent, either :-)
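
Since the error trace complains that Perl is missing, a quick check inside the build container before starting a long build might save some time. This is a minimal sketch; the function name and the default list of tools are assumptions and not part of the conda-rdkit recipes.

```python
import shutil


def missing_build_tools(tools=("perl", "make")):
    """Return the names of the given executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_build_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All checked build tools are available")
```

An empty result only shows that the executables are on PATH in the environment where the check runs; the conda build may still use a different, isolated PATH.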


Markus

On Thu, Apr 6, 2017 at 8:50 AM, Riccardo Vianello <
riccardo.viane...@gmail.com> wrote:

> Hi Markus,
>
> On Thu, Apr 6, 2017 at 12:03 AM, Markus Sitzmann <
> markus.sitzm...@gmail.com> wrote:
>
>> Hi (Riccardo).
>>
>> I have trouble with the conda build in Docker (I just updated to the most
>> recent version which triggered the new build) - below is the error trace. I
>> took the original Docker file and just edited out all non-Python35 builds -
>> so it does only the Python35 builds and ends somewhere when
>> rdkit-postgres95 is built. Does somebody have the same problem?
>>
>
> I couldn't work on this during the last few months so I didn't test any
> recent builds. I might be able to have a closer look and run some tests
> next week. What branch of the conda-rdkit repository are you using (master
> or development)?
>
> Best,
> Riccardo
>
>
>>
>> make[3]: Entering directory `/home/rdkit/bld/postgresql95_
>> 1491429385957/work/postgresql-9.5.2/src/port'
>> make -C ../backend submake-errcodes
>> make[3]: Entering directory `/home/rdkit/bld/postgresql95_
>> 1491429385957/work/postgresql-9.5.2/src/backend/catalog'
>> cd ../../../src/include/catalog && /bin/sh ../../../config/missing perl
>> ./duplicate_oids
>> make -C utils probes.h
>> ***
>> ERROR: Perl is missing on your system. It is needed unless you are
>> building
>> from an unmodified official distribution of PostgreSQL.
>> ***
>> make[3]: Leaving directory `/home/rdkit/bld/postgresql95_
>> 1491429385957/work/postgresql-9.5.2/src/backend/catalog'
>>
>>


[Rdkit-discuss] Trouble with conda build in Docker

2017-04-05 Thread Markus Sitzmann
Hi (Riccardo).

I have trouble with the conda build in Docker (I just updated to the most
recent version, which triggered the new build); the error trace is below. I
took the original Dockerfile and edited out all non-Python35 builds, so it
does only the Python35 builds, and it fails somewhere while
rdkit-postgres95 is being built. Does anybody have the same problem?

Thanks,
Markus

make[3]: Entering directory
`/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/port'
make -C ../backend submake-errcodes
make[3]: Entering directory
`/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend/catalog'
cd ../../../src/include/catalog && /bin/sh ../../../config/missing perl
./duplicate_oids
make -C utils probes.h
***
ERROR: Perl is missing on your system. It is needed unless you are building
from an unmodified official distribution of PostgreSQL.
***
make[3]: Leaving directory
`/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend/catalog'
make[3]: *** [postgres.bki] Error 1
make[2]: *** [submake-schemapg] Error 2
make[2]: *** Waiting for unfinished jobs
make[3]: Entering directory
`/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend/utils'
sed -f ./Gen_dummy_probes.sed probes.d >probes.h
make[3]: Leaving directory
`/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend/utils'
make[4]: Entering directory
`/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend'
make[4]: Nothing to be done for `submake-errcodes'.
make[4]: Leaving directory
`/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend'
make[3]: Leaving directory
`/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/port'
make -C ../../src/common all
make[3]: Entering directory
`/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/common'
make -C ../backend submake-errcodes
make[4]: Entering directory
`/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend'
make[4]: Nothing to be done for `submake-errcodes'.
make[4]: Leaving directory
`/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend'
make[3]: Leaving directory
`/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/common'
make[2]: Leaving directory
`/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src/backend'
make[1]: *** [all-backend-recurse] Error 2
make[1]: Leaving directory
`/home/rdkit/bld/postgresql95_1491429385957/work/postgresql-9.5.2/src'
make: *** [all-src-recurse] Error 2
Traceback (most recent call last):
  File "/home/rdkit/miniconda/bin/conda-build", line 6, in 
 path in binary file share/terminfo/w/wsvt25
Detected hard-coded path in binary file share/terminfo/w/wsvt25m
Detected hard-coded path in binary file share/terminfo/x/x68k
Detected hard-coded path in binary file share/terminfo/x/x68k-ite
Detected hard-coded path in binary file share/terminfo/z/z29a
Detected hard-coded path in binary file share/terminfo/z/z29a-kc-bc
Detected hard-coded path in binary file share/terminfo/z/z29a-kc-uc
Detected hard-coded path in binary file share/terminfo/z/z29a-nkc-bc
Detected hard-coded path in binary file share/terminfo/z/z29a-nkc-uc
Detected hard-coded path in binary file share/terminfo/z/z340
Detected hard-coded path in binary file share/terminfo/z/z340-nam
Detected hard-coded path in text file bin/ncurses6-config
Detected hard-coded path in text file share/man/man1/captoinfo.1m
Detected hard-coded path in text file share/man/man1/infocmp.1m
Detected hard-coded path in text file share/man/man1/infotocap.1m
Detected hard-coded path in text file share/man/man1/ncurses6-config.1
Detected hard-coded path in text file share/man/man1/tic.1m
Detected hard-coded path in text file share/man/man1/toe.1m
Detected hard-coded path in text file share/man/man1/tput.1
Detected hard-coded path in text file share/man/man1/tset.1
Detected hard-coded path in text file share/man/man3/ncurses.3x
Detected hard-coded path in text file share/man/man3/panel.3x
Detected hard-coded path in text file share/man/man5/term.5
Detected hard-coded path in text file share/man/man5/terminfo.5
Detected hard-coded path in text file share/man/man7/term.7
/home/rdkit/bld/linux-64/ncurses-6.0-0.tar.bz2
Nothing to test for: /home/rdkit/bld/linux-64/ncurses-6.0-0.tar.bz2
BUILD START: postgresql95-9.5.2-py35_0

The following NEW packages will be INSTALLED:

libiconv:   1.14-0
libxml2:    2.9.4-0
libxslt:    1.1.29-0
ncurses:    6.0-0 local
openssl:    1.0.2k-1
pip:        9.0.1-py35_1
python:     3.5.3-1
readline:   6.2-2
setuptools: 27.2.0-py35_0
sqlite:     3.13.0-0
tk:         8.5.18-0
wheel:      0.29.0-py35_0
xz:         5.2.2-1
zlib:       1.2.8-3

Source cache directory is: /home/rdkit/bld/src_cache
Downloading source to cache: postgresql-9.5.2.tar.bz2
Downloading
https://ftp.postgresql.org/pub/source/v9.5.2/pos

Re: [Rdkit-discuss] connecting to postgres in rdkit environment

2017-02-25 Thread Markus Sitzmann
Maybe this one here helps, too, although it is basically the same as what TJ
said:

https://devops.profitbricks.com/tutorials/install-postgresql-on-centos-7/

Markus
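
A quick way to narrow down a "connection refused" like the one described
below is to test, from the client machine, whether anything accepts TCP
connections on the PostgreSQL port at all. The following is only a minimal
Python sketch and not part of the original advice; the helper name
port_is_open and the host name in the usage note are made up for
illustration, and 5432 is simply the default port mentioned in the thread.

```python
import socket


def port_is_open(host, port=5432, timeout=3.0):
    """Return True if a TCP connection to host:port can be established.

    This only checks reachability at the TCP level; it does not
    authenticate against PostgreSQL or look at pg_hba.conf rules.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and name-resolution failures
        return False
```

Calling port_is_open("my-centos-box") from the client and
port_is_open("localhost") on the server gives two data points. If the local
check succeeds and the remote one fails, one likely cause is that the server
only listens on the loopback interface (the listen_addresses setting quoted
below); a firewall rule is another.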

On Sat, Feb 25, 2017 at 11:29 PM, TJ O'Donnell  wrote:

> The server itself must be told to allow remote connections.
> You might check these two things.
> 1.  You can edit the postgresql.conf file (not sure where that is on your
> system).
>  https://www.postgresql.org/docs/9.2/static/runtime-config-connection.html
>  Uncomment or add the line listen_addresses='*'. You can
>  tailor that to be more specific, but try this first.
>
> 2.  The file pg_hba.conf also controls access.  Look at this:
>   https://www.postgresql.org/docs/9.3/static/auth-pg-hba-conf.html
>
> Be sure to restart the server after you make changes to these files.
>
> Hope this helps,
> TJ O'Donnell
>
>
> On Sat, Feb 25, 2017 at 12:34 PM,  wrote:
>
>> Hi,
>> I've installed rdkit on a CentOS machine using anaconda python and set up
>> a postgresql compound database in the rdkit environment. It works great on
>> the machine's console.
>> I now want to access it remotely and I'm trying to set up a jdbc postgres
>> driver to access it from a windows client but this is not working. If I
>> test the driver on the server it tells me that the connection is refused
>> and I should check that the machine is accepting TCP requests.
>>
>> I have opened the standard port that postgres uses
>> -A INPUT -m state --state NEW -m tcp -p tcp --dport 5432 -j ACCEPT
>>
>> iptables -L returns
>> ACCEPT     tcp  --  anywhere             anywhere             state NEW tcp dpt:postgres
>>
>> this is where I don't know what to check next. A few things that might be
>> relevant. If I "ps -eaf | grep post" I see four postgres processes running
>> under my username (not postgres), so I think there is a server working.
>> There is also a "system" postgresql (version 9.2) which I have connected to
>> previously a long time ago. This connection no longer works either and I
>> don't really care about that but could be an interfering factor.
>>
>> If anyone has suggestions about what to check next or solve this I'd be
>> grateful
>>
>> thanks,
>> Neil
>>
>>
>>
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] If someone has build problems using conda currently ...

2017-01-16 Thread Markus Sitzmann
... I just suffered this:

https://github.com/conda/conda/issues/4309

Going back to a previous conda version (4.2.12) helps.

Other than that:
Happy New Year (a late one :-)


Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-02 Thread Markus Sitzmann
Hi Alexis,

you may also find some "novel" compounds by this approach :-).

Whether your tuple solution improves performance strongly depends on the
content of your text documents and how often they repeat the same words,
but my guess would be that it will help. Probably the best way is even
to look at the distribution of words before you feed them to RDKit. You
should also "memorize" the ones that successfully generated a structure;
it doesn't make sense to parse them again.

Markus
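
The two suggestions in this thread (a cheap character filter first, and
remembering the outcome for every word already seen) can be combined roughly
as follows. This is a sketch under assumptions, not code from the thread:
the names SMILES_CHARS and extract_smiles are invented here, the regular
expression is a deliberately loose guess at the characters a SMILES string
may contain, and the parser is passed in as an argument so the caching logic
does not depend on a particular toolkit.

```python
import re

# Loose, assumed whitelist of characters that can occur in a SMILES string.
# Words containing anything else (commas, quotes, ...) are rejected cheaply.
SMILES_CHARS = re.compile(r"^[A-Za-z0-9@+\-\[\]\(\)\\/%=#$.:*]+$")


def extract_smiles(words, parse):
    """Return the distinct words that parse() accepts, in first-seen order.

    parse is any callable returning None for an invalid string, for
    example Chem.MolFromSmiles. Each distinct word is parsed at most once.
    """
    seen = {}   # word -> True/False, remembers successes and failures
    hits = []
    for word in words:
        if word in seen:
            continue  # outcome already known, skip the expensive parse
        ok = bool(SMILES_CHARS.match(word)) and parse(word) is not None
        seen[word] = ok
        if ok:
            hits.append(word)
    return hits
```

With RDKit installed this would be called as
extract_smiles(text.split(), Chem.MolFromSmiles) after
"from rdkit import Chem". Looking a word up in the dict is a hash lookup, so
the cache stays cheap even when it grows large. Short false positives such as
"CC" still pass both checks and need separate handling.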

On Fri, Dec 2, 2016 at 10:21 AM, Maciek Wójcikowski 
wrote:

> Hi Alexis,
>
> You may want to pre-filter with a regex that rejects strings containing
> characters that are not valid in SMILES (i.e. there is only a small subset of
> atoms that may appear without brackets). See the "Atoms" section:
> http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html
>
> The set might grow pretty quick and may be inefficient, so I'd parse all
> strings passing above filter. Although there will be some false positives
> like "CC" which may occur in text (emails especially).
>
> 
> Pozdrawiam,  |  Best regards,
> Maciek Wójcikowski
> mac...@wojcikowski.pl
>
> 2016-12-02 10:11 GMT+01:00 Alexis Parenty :
>
>> Dear all,
>>
>>
>> I am looking for a way to extract SMILES scattered in many text documents
>> (thousands documents of several pages each).
>>
>> At the moment, I am thinking to scan each words from the text and try to
>> make a mol object from them using Chem.MolFromSmiles() then store the words
>> if they return a mol object that is not None.
>>
>> Can anyone think of a better/quicker way?
>>
>>
>> Would it be worth storing in a tuple any word that do not return a mol
>> object from Chem.MolFromSmiles() and exclude them from subsequent search?
>>
>>
>> Something along those lines
>>
>>
>> from rdkit import Chem
>>
>> excluded_set = set()
>> smiles_list = []
>> for each_word in text:  # text: an iterable of words
>>     if each_word not in excluded_set:
>>         each_word_mol = Chem.MolFromSmiles(each_word)
>>         if each_word_mol is not None:
>>             smiles_list.append(each_word)
>>         else:
>>             # store the rejected word itself, not the (None) mol object
>>             excluded_set.add(each_word)
>>
>>
>> Would not searching into that growing tuple take actually more time than
>> trying to blindly make a mol object for every word?
>>
>>
>>
>> Any suggestion?
>>
>>
>> Many thanks and regards,
>>
>>
>> Alexis
>>


Re: [Rdkit-discuss] comparing two or more tables of molecules

2016-12-01 Thread Markus Sitzmann
Well, since George mentioned a talk by me: I wish we had implemented
our tool back then using an open-source toolkit like RDKit (which wasn't very
well known back then), and had also been smart enough to use SMARTS for
the transformation rules (partially they are implemented as SMARTS, but big
parts are other CACTVS script functionalities).

There is still an intention by me to continue/advance (whatever) on this
and make it openly available, but I must admit it is a quite vague
intention currently.

Markus


Re: [Rdkit-discuss] smarts vs smiles database queries and explicit hydrogens

2016-11-23 Thread Markus Sitzmann
If I understood Greg correctly, it will be in 2016.09, which isn't in conda just
yet; they are currently working on putting it there.

Markus

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 23 Nov 2016, at 15:29, Alexander Klenner-Bajaja  wrote:
> 
> Dear Greg,
>  
> Thank you very much, looking at the results that function was exactly what I 
> was looking for – only I can’t find it in my updated anaconda installation.
>  
> “conda update rdkit” tells me I have the latest version 2016.03.4 and 
> postgres tells me I have the 3.4 version of the RDKit extension
>  
> If I understand your blog post correctly it should be in 2016.03 version? 
> What am I missing?
>  
>  
> Best,
>  
> Alex
>  
>  
>  
> From: Greg Landrum [mailto:greg.land...@gmail.com] 
> Sent: Wednesday, November 23, 2016 11:42 AM
> To: Alexander Klenner-Bajaja
> Cc: rdkit-discuss@lists.sourceforge.net
> Subject: Re: [Rdkit-discuss] smarts vs smiles database queries and explicit 
> hydrogens
>  
> Hi Alex,
>  
> The new version of the cartridge has some capabilities that, I think, address 
> this.
>  
> There's a blog post about this: 
> http://rdkit.blogspot.com/2016/07/tuning-substructure-queries-ii.html
> but the short version is that you can do the kind of queries it seems like 
> you want to do quite simply:
>  
> chembl_21=# select * from rdk.mols where m@>mol_adjust_query_properties('*c1ncccn1') limit 3;
>  molregno | m
> ----------+--------------------------------------------------------------------------------------
>    601707 | CCCc1nc(-c2ccc(F)cc2)oc1C(=O)NC(CC)CN1CCN(c2ncccn2)CC1
>    289103 | CC1C(=N)/C(=N/Nc2ccc(S(=O)(=O)Nc3ncccn3)cc2)C(=O)C(C)C1=O
>    607646 | CCNC(=O)[C@@H]1OC(n2cnc3c(NC(=O)Nc4ccc(S(=O)(=O)Nc5ncccn5)cc4)ncnc32)[C@@H](O)[C@H]1O
> (3 rows)
>  
> chembl_21=# select * from rdk.mols where m@>mol_adjust_query_properties('*c1nc(*)ccn1') limit 3;
>  molregno | m
> ----------+-------------------------------------------------------
>    158659 | CCNc1nccc(-c2c(-c3ccc(F)cc3)ncn2C2CCN(C)CC2)n1
>    158743 | Nc1nccc(-c2c(-c3ccc(F)cc3)ncn2C2CCN(Cc3c3)CC2)n1
>    158843 | CC1(C)CC(n2cnc(-c3ccc(F)cc3)c2-c2ccnc(N)n2)CC(C)(C)N1
> (3 rows)
>  
> chembl_21=# select * from rdk.mols where m@>mol_adjust_query_properties('*c1nc(*)cc(*)n1') limit 3;
>  molregno | m
> ----------+------------------------------------------------------------------
>    726443 | CN=C(S)NNc1nc(C)cc(C)n1
>    561136 | C[C@H](Nc1cc(NC2CC2)nc(C(F)(F)F)n1)[C@@H](Cc1ccc(Cl)cc1)c1(Br)c1
>    205784 | CCN(CC)C(=O)CSc1nc(N)cc(Cl)n1
> (3 rows)
>  
> There's more detail in the blog post, but the default behavior is to convert 
> dummies into generic query atoms and to constrain the substitution at any 
> other *ring* position.
>  
> Best Regards,
> -greg
>  
>  
> On Wed, Nov 23, 2016 at 9:20 AM, Alexander Klenner-Bajaja  
> wrote:
> Hi all,
>  
> I am currently exploring the possibilities of the RDKit database cartridge 
> for substructure search- I installed everything following the  tutorial from 
> http://www.rdkit.org/docs/Install.html
>  
> Very nice tutorial  - worked perfectly fine.
>  
> Since we are exploring solutions for browser based gui searches I created a 
> test page using Ketcher (http://lifescience.opensource.epam.com/ketcher/) 
> which communicates with the database through PHP.
>  
> Ketcher returns a SMILES representation from the drawn molecule. The raw data 
> of the molecules in the database are canonical SMILES created from RDKIT 
> canonical SMILES from the rdkit KNIME node (they are text-mined from patents).
>  
> When doing substructure searches, as long as we query for well-defined 
> compounds the results make sense – however looking at R1,…-groups things get 
> a little odd.
>  
> I found a very old discussion on the mailing list from 2009 where this has 
> been discussed and I understood from that dialog that when looking at SMILES 
> with a “*” representation this is interpreted as a dummy atom and the same 
> dummy atom is expected in the search space to produce a hit. While a SMARTS 
> representation of the same string actually leads to the behaviour that “any 
> atom” is matched at that position.
>  
> I ended up with the very cumbersome query, I am sure there are more elegant 
> ways of doing this using ::qmol notation, but as I said I am currently 
> explori

Re: [Rdkit-discuss] reading multiple conformers from file

2016-10-31 Thread Markus Sitzmann
+1 for a json format ... hmm, how about a general json-based molecular
structure format ... let us call it "cson" (that is an homage to Google
gson and Chemical Markup Language CML :-)

Markus

On Mon, Oct 31, 2016 at 11:18 AM, Brian Cole  wrote:

> I would 2nd the suggestion of continuing to push a JSON format forward
> that natively supports multiple conformers.
>
> I've never seen automatic recombination of an SDF work 100% of the time,
> it's fraught with corner cases. It's also abysmally slow and takes a huge
> amount of disk space.
>
> -Bruce
>
> On Oct 30, 2016, at 5:21 PM, Brian Kelley  wrote:
>
> Rdkit already has a way to serialize conformers, the binary pickle format!
>
> Perhaps we should make a file extension for multiple molecules.  Say
> ".rdk" and call it a day.   Like inchi the source code is the reference  :)
>
> 
> Brian Kelley
>
> On Oct 27, 2016, at 2:05 AM, Greg Landrum  wrote:
>
> The RDKit has support for the TPL format, an old BioCad/MSI/Accelrys
> format.
> It's easy to imagine something better, but this is at least already there
> and there could be other software that speaks it:
> https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/FileParsers/test_data/cmpd2.tpl
>
> I'd still like to do a decent JSON format and adding multi-confs to that
> would be logical
>
> On Thu, Oct 27, 2016 at 6:58 AM, David Cosgrove <
> davidacosgrov...@gmail.com> wrote:
>
>> I've been wondering if, now that you can get decent conformations from
>> RDKit, it would be worth devising a multi-conformation file format to make
>> reading multi-conf molecules faster for vs purposes. In my experience,
>> pulling all the conformers out of an ascii file such as an sdf can become
>> the RDS for pharmacophore searching. Something to think about at the
>> hackathon maybe and certainly something that deserves a new email
>> thread.
>>
>> Dave
>>
>>
>> On Thursday, 27 October 2016, Greg Landrum 
>> wrote:
>>
>>> Hi Thomas,
>>>
>>> You're right, reading multiple conformations out of an SDF does seem
>>> like one of those common operations. Unfortunately the RDKit does not
>>> currently support it in an easy way.
>>>
>>> A python implementation of this would be a good topic for Friday's UGM
>>> hackathon, we can see if anyone finds it interesting enough to work on.
>>>
>>> -greg
>>>
>>>
>>> On Tue, Oct 25, 2016 at 2:16 AM, Thomas Evangelidis 
>>> wrote:
>>>
>>>> Hello everyone,
>>>>
>>>> I am a new user of RDkit and I was looking in the documentation for an
>>>> easy way to load multiple conformers from a structure file like .sdf. The
>>>> code must 1) distinguish between different protonation states of the same
>>>> molecule,  2) create a new Mol() object for each protonation state and load
>>>> into it the respective conformers.
>>>>
>>>> Apparently I can work out a solution for 1)
>>>> using mol.GetProp('_Name'), mol.GetNumAtoms, mol.GetNumBonds and other
>>>> properties, but I was wondering if there is any more straight forward way
>>>> to do it.
>>>> For 2) I guess I must iterate over all molecules in the input file,
>>>> create new Mol() objects (one for each protonation state of each ligand)
>>>> and add conformers to these new Mol() objects. Again this sounds easily
>>>> programmable, but sounds like a very common operation, thus I was wondering
>>>> if it has been implemented in a function.
>>>>
>>>> thanks in advance
>>>> Thomas
>>>>
>>>> --
>>>>
>>>> Thomas Evangelidis
>>>>
>>>> Research Specialist
>>>> CEITEC - Central European Institute of Technology
>>>> Masaryk University
>>>> Kamenice 5/A35/1S081,
>>>> 62500 Brno, Czech Republic
>>>>
>>>> email: tev...@pharm.uoa.gr
>>>>        teva...@gmail.com
>>>>
>>>> website: https://sites.google.com/site/thomasevangelidishomepage/
>>>>
>>>

Re: [Rdkit-discuss] The RDKit and modern C++

2016-09-28 Thread Markus Sitzmann
I get the feeling that RH/CentOS 6 is becoming the next XP kind of story: too
much legacy that makes the update impossible or very hard. Also Docker, a great
technology that could mitigate this problem, is very painful under RH/CentOS 6.

---
Markus Sitzmann


> On 29 Sep 2016, at 07:31, Greg Landrum  wrote:
> 
> 
>> On Thu, Sep 29, 2016 at 7:06 AM, Peter S. Shenkin  wrote:
>> 
>> Thanks... so it sounds like the main effort (aside from what you delicately 
>> called "professional development" ;-) ) will be to introduce features that 
>> improve robustness or performance when writing new code and possibly when 
>> maintaining (fixing, extending) existing code.
> 
> Yes, I think that's about right with the one refinement that we'll be using 
> some automated tools to convert the existing code to use some of those new 
> features.
> 
> -greg
>  


Re: [Rdkit-discuss] Conda installation of RDKit on W8

2016-09-26 Thread Markus Sitzmann
Hi Gonzalo,

after you activated my-rdkit-env, try to install rdkit with

  conda install -c https://conda.anaconda.org/rdkit rdkit

Alternatively, if you go a step back, you can also start with

  conda create -c https://conda.anaconda.org/rdkit -n give-your-environment-whatever-name-you-want rdkit

and then activate "give-your-environment-whatever-name-you-want".

Your error message above just says that you are trying to create an
environment with a name that already exists.

Markus

On Mon, Sep 26, 2016 at 2:11 PM, Gonzalo Colmenarejo <
colmenarejo.gonz...@gmail.com> wrote:

> Thanks a lot, Marta. Still, after activating the environment, I get in
> jupyter the "ImportError: No module named rdkit".
>
> This is confusing...
>
>
> On Mon, Sep 26, 2016 at 1:56 PM, Marta Stępniewska-Dziubińska <
> mart...@ibb.waw.pl> wrote:
>
>> Hi Gonzalo,
>> You need to activate your environment:
>> activate my-rdkit-env
>>
>> See: http://conda.pydata.org/docs/using/envs.html#change-environments-activate-deactivate
>>
>> Best,
>> Marta
>>
>>
>> 2016-09-26 13:45 GMT+02:00 Gonzalo Colmenarejo <
>> colmenarejo.gonz...@gmail.com>:
>> > rdkit is not shown within the package list. However, if I run
>> > conda create -c https://conda.anaconda.org/rdkit -n my-rdkit-env rdkit
>> > I get this message:
>> >
>> > Error: prefix already exists: C:\Users\Dell\Anaconda\envs\my-rdkit-env
>> >
>> > Any idea on how this could be fixed?
>> >
>> > Thanks
>> >
>> > On Fri, Sep 23, 2016 at 9:06 PM, Greg Landrum 
>> > wrote:
>> >>
>> >> I think anaconda is fine, but it looks like either the RDKit isn't
>> >> installed correctly or you aren't running the anaconda Python.
>> >>
>> >> Please check that the python you are running is the one from anaconda and
>> >> that the RDKit is installed (that last one is "conda list")
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Fri, Sep 23, 2016 at 8:31 PM +0200, "Gonzalo Colmenarejo"
>> >>  wrote:
>> >>
>> >>> Hi Greg,
>> >>>
>> >>> It shows:
>> >>>
>> >>> ImportError: No module named rdkit
>> >>>
>> >>> Should I reinstall anaconda?
>> >>>
>> >>> Thanks
>> >>>
>> >>> Gonzalo
>> >>>
>> >>> On Fri, Sep 23, 2016 at 2:54 PM, Greg Landrum
>> >>> wrote:
>> >>>>
>> >>>> Hi Gonzalo,
>> >>>>
>> >>>> Are you sure that the jupyter you are running is the same one that came
>> >>>> with your conda installation?
>> >>>> Can you do, from the command line:
>> >>>> python -c "from rdkit import Chem"
>> >>>>
>> >>>> On Fri, Sep 23, 2016 at 10:49 AM, Gonzalo Colmenarejo
>> >>>> wrote:
>> >>>>>
>> >>>>> Hi,
>> >>>>> I had a previous release of RDKit (2015_03_1) in my Windows 8 PC
>> >>>>> installed in the old fashioned mode and it worked OK. I renamed the
>> >>>>> corresponding folder and installed the latest version of RDKit through
>> >>>>> conda. Now I get the following error message when trying to run my previous
>> >>>>> code in Jupyter: ImportError: No module named rdkit
>> >>>>>
>> >>>>> Any advice on how to fix this would be appreciated.
>> >>>>>
>> >>>>> Thanks a lot
>> >>>>>
>> >>>>> Gonzalo


Re: [Rdkit-discuss] cannot import matplotlib in my-rdkit-env

2016-08-21 Thread Markus Sitzmann
Hi Chris,

You have to explicitly install it in your my-rdkit-env, too, like you did in 
the environment where matplotlib is already available.

After you activated my-rdkit-env, you probably just have to run

conda install matplotlib 

(You have to do this for any other package, too)

Markus
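
To see which interpreter is actually running and which packages it can
import, a few lines of standard-library Python are enough. This is just an
illustrative sketch; the function names has_module and describe_environment
are made up here, and the package names in the usage note are the ones from
this thread.

```python
import importlib.util
import sys


def has_module(name):
    """Return True if the top-level package `name` is importable
    by the interpreter that is running this code."""
    return importlib.util.find_spec(name) is not None


def describe_environment(names):
    """Return report lines: the interpreter path, then one line per package."""
    lines = ["interpreter: " + sys.executable]
    for name in names:
        status = "found" if has_module(name) else "MISSING"
        lines.append(name + ": " + status)
    return lines
```

Running print("\n".join(describe_environment(["rdkit", "matplotlib"])))
once inside the activated my-rdkit-env and once outside it shows that each
conda environment has its own interpreter and its own set of installed
packages, which is why matplotlib has to be installed in both.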

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 21.08.2016, at 16:24, chris dalton  wrote:
> 
> Hi,
> I have installed Rdkit on a windows laptop with conda and I can activate the 
> rdkit environment OK and if I start IDLE up, rdkit works. However, I can no 
> longer import some other packages, such as matplotlib from that IDLE 
> interpreter. It tells me the package isn't there. 
> 
> If I just start up python without activating the rdkit envronment, I can 
> import matplotlib so it is there; there is something about the rdkit 
> environment that is not looking in the right place. Looking in environment 
> variables, I cannot see anything rdkit-specific.
> 
> How can I use matplotlib within my-rdkit-env?
> 
> thanks,
> 
> Chris.


Re: [Rdkit-discuss] writing c++ programs using rdkit

2016-08-08 Thread Markus Sitzmann
Hi Hitesh,

there is nothing particularly special about the RDKit installation
received/built by conda. The command you used in your email created a conda
environment. If you go to the directory where you initially *installed*
conda, there is an envs directory; inside this directory there should be a
"my-rdkit-env" directory which contains all components of RDKit. If you
activate your conda environment ("source activate my-rdkit-env" on Linux),
conda doesn't do much more than add the needed paths from
"{CONDA INSTALLATION DIR}/envs/my-rdkit-env" to your shell environment. So
just delve into the conda envs/ directory and look at the environment
changes conda makes to your shell, and you should get an idea of what to do
and how to link to your project.

Best,
Markus
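
As a rough illustration of "delving into the envs/ directory", the snippet
below derives candidate compiler and linker search paths from an environment
prefix. It is a sketch under assumptions: that the headers live in
include/rdkit and the shared libraries in lib below the environment
directory should be verified by looking at the actual directory, and the
function names are invented for this example. Library names are left out on
purpose because they differ between builds.

```python
from pathlib import Path
import sys


def conda_env_paths(prefix=sys.prefix):
    """Return the assumed header and library directories of an environment."""
    root = Path(prefix)
    return {
        "include": root / "include" / "rdkit",  # assumed header location
        "lib": root / "lib",                    # assumed library location
    }


def compiler_flags(prefix=sys.prefix):
    """Build -I/-L/rpath flags pointing into the environment."""
    paths = conda_env_paths(prefix)
    return [
        "-I" + str(paths["include"]),
        "-L" + str(paths["lib"]),
        "-Wl,-rpath," + str(paths["lib"]),  # lets the binary find the libs at run time
    ]
```

When run with the environment's own python, sys.prefix already points at
"{CONDA INSTALLATION DIR}/envs/my-rdkit-env", so
print(" ".join(compiler_flags())) gives a starting point for a g++ command
line or a Makefile.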

On Fri, Aug 5, 2016 at 11:40 PM, Hitesh Patel 
wrote:

> Hi all,
> I have used rdkit from python. But, now I would like to write c++ programs
> using rdkit.
> I have scientific linux 6.8. Which has python 2.6. I installed python 2.7
> but that doesn't work with sudo privileges. So, I could not install numpy
> for python 2.7. Numpy was installed for python 2.6 instead. I tried a lot
> in that. But, doesn't look convenient.
> Then, I installed anaconda and installed rdkit using
>
> $ conda create -c https://conda.anaconda.org/rdkit -n my-rdkit-env rdkit
>
> Is it possible to use this installation to write c++ code?
>
> I have also installed rdkit in Macbook Pro, OS X 10.10.5 using homebrew. I
> haven't tried anything in that to write c++ code. If I get some
> instructions for that too, It will be helpful in exploring.
>
> Thanks
> Hitesh Patel
>
>
>
>
> --
>
> Regards,
>
> Dr. Hitesh Patel
> Post-Doctoral Fellow,
> CADD Group,
> National Cancer Institute,
> National Institute of Health,
> 21702, Frederick, MD
> USA
> Building 376, Room: 205A
> Work: +1 301 846 5993
> Mob.: +1 240 367 5208
> Website: http://www.hiteshpatel379.com/
> Email: hitesh.pa...@nih.gov
>


Re: [Rdkit-discuss] Some feedback from the Sheffield Cheminformatics Conference

2016-07-07 Thread Markus Sitzmann
Well, first thing I saw on the lock screen of my alarm clock-ringing iPad the 
morning after a long night at the Sheffield conference dinner was a reply by 
Greg on this list sent at 6:48am (it even contained some code).

Thanks a lot for your dedication and for building RDKit and its community, Greg.

Cheers,
Markus

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 07.07.2016, at 08:20, Greg Landrum  wrote:
> 
> Dear all,
> 
> I was at the Sheffield Cheminformatics conference earlier this week (along 
> with several people from this list) and I was really struck by the number of 
> talks and posters that are using the RDKit. By my rough count the RDKit was 
> used for about 1/3 of the talks and a similar fraction of the posters. 
> 
> This of course, makes me smile rather broadly (Christian, Nadine, and Sereina 
> had to suffer through this while we were waiting at the airport ;-) ) but a 
> big part of the reason for this success is the engagement and activity of the 
> RDKit community. So I figured I'd share so that those of you who weren't in 
> Sheffield also get the chance to grin about it.
> 
> We're having an impact... that's really cool. Thanks! and congrats! :-)
> 
> -greg
> 


Re: [Rdkit-discuss] conda build of Release_2016_03_2 failed on Ubuntu 16.04.

2016-07-02 Thread Markus Sitzmann
Hi Riccardo,

Yes, it builds again - thanks a lot for your efforts.

Best,
Markus

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 30.06.2016, at 20:55, Riccardo Vianello  
> wrote:
> 
> Hi Markus,
> 
> I think the problem should be fixed now. The recipes were building the 
> cartridge using the earlier release tag, and therefore executing tests that 
> were not fully up-to-date. Please try again and let me know in case the 
> problem persisted.
> 
> Best,
> Riccardo
> 
> 


Re: [Rdkit-discuss] conda build of Release_2016_03_2 failed on Ubuntu 16.04.

2016-06-30 Thread Markus Sitzmann
Hi Riccardo,

Thanks for your efforts and sorry that I didn't reply earlier. I am not sure
about all the conditions required for this error to occur, but I am glad you
can reproduce it.

Markus

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 30.06.2016, at 08:30, Riccardo Vianello  
> wrote:
> 
>> On Tue, Jun 28, 2016 at 11:40 PM, Markus Sitzmann 
>>  wrote:
>> unfortunately I have another problem - rdkit-postgres isn't building for me 
>> since the change to Release_2016_03_2. Is that a known problem?
> 
> I tested a couple of full builds and the master branch looks ok, but I could 
> reproduce this error with the tagged release. I will try to identify the 
> exact cause.
> 
> Best,
> Riccardo
> 


Re: [Rdkit-discuss] conda build of Release_2016_03_2 failed on Ubuntu 16.04.

2016-06-28 Thread Markus Sitzmann
Hi,

unfortunately I have another problem - rdkit-postgres isn't building for me
since the change to Release_2016_03_2. Is that a known problem?

Below is the end of the build log. I only built the py35 part
(+ncurses) of the Docker script.


Thanks & Best,
Markus

BUILD START: rdkit-postgresql-__conda_version__-py35_1
Fetching package metadata .
Solving package specifications: ..
+ source activate /home/rdkit/miniconda/envs/_build
++ [[ -n 4.1.2(1)-release ]]
++ _SCRIPT_LOCATION=/home/rdkit/miniconda/envs/_build/bin/activate
++ SHELL=bash
+++ dirname /home/rdkit/miniconda/envs/_build/bin/activate
++ _CONDA_DIR=/home/rdkit/miniconda/envs/_build/bin
++ '[' 1 -gt 1 ']'
++ case "$(uname -s)" in
+++ uname -s
++ EXT=
++ [[ -n 4.1.2(1)-release ]]
+++ basename /home/rdkit/miniconda/conda-bld/work/conda_build.sh
++ [[ conda_build.sh == \a\c\t\i\v\a\t\e ]]
++ '[' 1 -eq 0 ']'
++ args=/home/rdkit/miniconda/envs/_build
++ /home/rdkit/miniconda/envs/_build/bin/conda ..checkenv bash
/home/rdkit/miniconda/envs/_build
++ ((  0 != 0  ))
++ source /home/rdkit/miniconda/envs/_build/bin/deactivate
+++ [[ -n 4.1.2(1)-release ]]
+++ _SCRIPT_LOCATION=/home/rdkit/miniconda/envs/_build/bin/deactivate
+++ SHELL=bash
++++ dirname /home/rdkit/miniconda/envs/_build/bin/deactivate
+++ _CONDA_DIR=/home/rdkit/miniconda/envs/_build/bin
+++ case "$(uname -s)" in
++++ uname -s
+++ EXT=
+++ [[ 1 > 0 ]]
+++ key=/home/rdkit/miniconda/envs/_build
+++ case $key in
+++ shift
+++ [[ 0 > 0 ]]
+++ [[ -n 4.1.2(1)-release ]]
++++ basename /home/rdkit/miniconda/conda-bld/work/conda_build.sh
+++ [[ conda_build.sh == \d\e\a\c\t\i\v\a\t\e ]]
+++ [[ -z '' ]]
+++ [[ -n 4.1.2(1)-release ]]
++++ basename /home/rdkit/miniconda/conda-bld/work/conda_build.sh
+++ [[ conda_build.sh == \d\e\a\c\t\i\v\a\t\e ]]
+++ return 0
+++ /home/rdkit/miniconda/envs/_build/bin/conda ..activate bash
/home/rdkit/miniconda/envs/_build
prepending /home/rdkit/miniconda/envs/_build/bin to PATH
++ _NEW_PART=/home/rdkit/miniconda/envs/_build/bin
++ ((  0 == 0  ))
++ export
CONDA_PATH_BACKUP=/home/rdkit/miniconda/envs/_build/bin:/home/rdkit/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
++
CONDA_PATH_BACKUP=/home/rdkit/miniconda/envs/_build/bin:/home/rdkit/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
++ export CONDA_PS1_BACKUP=
++ CONDA_PS1_BACKUP=
++ export
PATH=/home/rdkit/miniconda/envs/_build/bin:/home/rdkit/miniconda/envs/_build/bin:/home/rdkit/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
++
PATH=/home/rdkit/miniconda/envs/_build/bin:/home/rdkit/miniconda/envs/_build/bin:/home/rdkit/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
++ [[ '' == */* ]]
++ export CONDA_DEFAULT_ENV=/home/rdkit/miniconda/envs/_build
++ CONDA_DEFAULT_ENV=/home/rdkit/miniconda/envs/_build
++ firstpath=/home/rdkit/miniconda/envs/_build/bin
+++ echo /home/rdkit/miniconda/envs/_build/bin
+++ sed 's|/bin$||'
++ export CONDA_PREFIX=/home/rdkit/miniconda/envs/_build
+++ /home/rdkit/miniconda/envs/_build/bin/conda ..changeps1
++ '[' 1 = 1 ']'
+++ grep -q CONDA_DEFAULT_ENV
++ export 'PS1=(/home/rdkit/miniconda/envs/_build) '
++ PS1='(/home/rdkit/miniconda/envs/_build) '
++ _CONDA_D=/home/rdkit/miniconda/envs/_build/etc/conda/activate.d
++ [[ -d /home/rdkit/miniconda/envs/_build/etc/conda/activate.d ]]
++ unset CONDA_PATH
++ [[ -n 4.1.2(1)-release ]]
++ hash -r
+ /home/rdkit/miniconda/envs/_build/bin/python
/home/rdkit/conda-rdkit/rdkit-postgresql/pkg_version.py
+ cd /home/rdkit/miniconda/conda-bld/work/Code/PgSQL/rdkit
+ make
gcc -I/home/rdkit/miniconda/envs/_build/include
-I/home/rdkit/miniconda/envs/_build/include/rdkit -DRDKITVER='"007300"'
-DBUILD_AVALON_SUPPORT -DBUILD_INCHI_SUPPORT -mpopcnt -I. -I./
-I/home/rdkit/miniconda/envs/_build/include/postgresql/server
-I/home/rdkit/miniconda/envs/_build/include/postgresql/internal
-D_GNU_SOURCE -I/home/rdkit/miniconda/envs/_build/include/libxml2
 -I/home/rdkit/miniconda/envs/_build/include -fPIC -c -o rdkit_io.o
rdkit_io.c
gcc -I/home/rdkit/miniconda/envs/_build/include
-I/home/rdkit/miniconda/envs/_build/include/rdkit -DRDKITVER='"007300"'
-DBUILD_AVALON_SUPPORT -DBUILD_INCHI_SUPPORT -mpopcnt -I. -I./
-I/home/rdkit/miniconda/envs/_build/include/postgresql/server
-I/home/rdkit/miniconda/envs/_build/include/postgresql/internal
-D_GNU_SOURCE -I/home/rdkit/miniconda/envs/_build/include/libxml2
 -I/home/rdkit/miniconda/envs/_build/include -fPIC -c -o mol_op.o mol_op.c
mol_op.c: In function 'fmcs_mol2s_transition':
mol_op.c:334: warning: initialization makes pointer from integer without a
cast
mol_op.c:363: warning: initialization makes pointer from integer without a
cast
mol_op.c: In function 'fmcs_mol_transition':
mol_op.c:432: warning: initialization makes pointer from integer without a
cast
mol_op.c:439: warning: cast from pointer to integer of different size
mol_op.c:443: warning: initialization makes pointer from integer without a
cast

Re: [Rdkit-discuss] Struggling with apache + rdkit + django

2016-06-21 Thread Markus Sitzmann
Hi Stephane,

Add some Python code to your uwsgi.py file that prints out the environment the
Python interpreter sees (maybe comment out everything else) when it is called
by Apache. It is very likely that Apache calls a different Python interpreter
than the one you expect. What Paolo writes is probably the solution to your
problem.
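
A minimal sketch of such a diagnostic, assuming the server loads a standard WSGI callable named "application" from that file (the callable name and the choice of variables to report are my assumptions, not something taken from Stephane's setup). It only uses the standard library and reports where rdkit is imported from, if it can be imported at all:

```python
import os
import sys


def application(environ, start_response):
    """Report which interpreter and module search path the web server uses."""
    lines = [
        "executable: %s" % sys.executable,
        "prefix: %s" % sys.prefix,
        "version: %s" % sys.version.replace("\n", " "),
        "PYTHONPATH: %s" % os.environ.get("PYTHONPATH", "<unset>"),
        "LD_LIBRARY_PATH: %s" % os.environ.get("LD_LIBRARY_PATH", "<unset>"),
        "sys.path:",
    ]
    lines.extend("  %s" % entry for entry in sys.path)

    # Try the import that fails in production and report the outcome.
    try:
        import rdkit
        lines.append("rdkit imported from: %s" % rdkit.__file__)
    except ImportError as exc:
        lines.append("rdkit import failed: %s" % exc)

    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Requesting the URL once through Apache and comparing the output with what the same interpreter prints on the command line should show quickly whether the two environments differ.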

Markus

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 21.06.2016, at 21:24, Michał Nowotka  wrote:
> 
> Hi Stéphane,
> 
> Just to let you know about two things:
> 
> 1. ChEMBL web services are a Django application written using RDKit.
> We deploy it using gunicorn and Apache through Reverse Proxy and put
> on a Virtual Machine named myChEMBL that you can download. Here are
> some example configuration files:
> https://github.com/chembl/mychembl/tree/master/webservices/conf but
> I'm happy to explain more if you want.
> 
> 2. There is a project called Beaker that exposes most of RDKit methods
> as RESTful API. The source code is here:
> https://github.com/chembl/chembl_beaker and a live instance here:
> https://www.ebi.ac.uk/chembl/api/utils/docs
> 
> Kind regards,
> 
> Michał Nowotka
> 
> On Tue, Jun 21, 2016 at 7:46 PM, Téletchéa Stéphane
>  wrote:
>> Le 21/06/2016 20:18, TJ O'Donnell a écrit :
>>> I would suggest setting PYTHONPATH in
>>> config or ini files for
>>> Apache or Django or uwsgi
>>> Not sure which is required.
>> 
>> Dear all,
>> 
>> This is already indicated using a WSGIprocessGroup :
>> 
>> WSGIDaemonProcess manageLibrary
>> python-path=/path/to/project/projets/manageLibrary:/path/to/project/projets/manageLibrary/tools/django1.8/lib/python2.7/site-packages:/path/to/project/projets/manageLibrary/tools/rdkit/lib:/path/to/project/projets/manageLibrary/tools/rdkit/lib/python2.7/site-packages
>> display-name=manageLibrary
>> WSGIProcessGroup manageLibrary
>> WSGIScriptAlias /tools/manageLibrary
>> '/path/to/project/projets/manageLibrary/manageLibrary/wsgi.py'
>> 
>> 
>> See more in detail here:
>> https://www.digitalocean.com/community/tutorials/how-to-serve-django-applications-with-apache-and-mod_wsgi-on-ubuntu-14-04
>> 
>> I have also checked permissions and files with no luck (and no output in
>> logs ...).
>> 
>> I may start from scratch with a simple django project to find out if it
>> already works there ...
>> 
>> Many Thanks, if you have any direction I'll be happy to test,
>> 
>> Stéphane
>> 
>> --
>> Assistant Professor in BioInformatics, UFIP, UMR 6286 CNRS, Team Protein 
>> Design In Silico
>> UFR Sciences et Techniques, 2, rue de la Houssinière, Bât. 25, 44322 Nantes 
>> cedex 03, France
>> Tél : +33 251 125 636 / Fax : +33 251 125 632
>> http://www.ufip.univ-nantes.fr/ - http://www.steletch.org
>> 
>> 


Re: [Rdkit-discuss] Conda and Rdkit 2016-03 Pains

2016-05-03 Thread Markus Sitzmann
Hi Riccardo,

thanks for your reply and all your work. I actually tried over the
course of the last few days and again just before I wrote my first
email. I do all these builds in a virtual machine (VMware) with just an
(almost) up-to-date Docker installation - so it should work (at least
I hope so, otherwise it would defeat the purpose of Docker :-) ).

Okay, I will stay patient and keep you posted.

Best,
Markus

On Tue, May 3, 2016 at 9:02 AM, Riccardo Vianello
 wrote:
> Hi Markus,
>
> On Tue, May 3, 2016 at 1:41 AM, Markus Sitzmann 
> wrote:
>>
>> thanks for your great software - unfortunately, I have some building
>> pains. I recently decided to go from RDKit 2015-03 to 2015-09 (yes, I
>> was late), everything still on python 2.7.
>>
>> As part of this migration I decided to give Conda a try and it worked
>> nicely in my Docker container (which is very similar to the
>> official Conda RDKit container at
>>
>> https://github.com/rdkit/conda-rdkit
>>
>> but starts from Debian Jessie instead of Centos6 - however it still
>> clones from this repository).
>>
>>
>> Unfortunately, since you switched to RDKit 2016-03 my troubles began.
>
>
> A set of changes have been recently merged into the conda-rdkit development
> branch in order to re-sync it with the rdkit master branch. If your tests
> with the development branch are earlier than just a few days, then you might
> want to try that again (and I would be actually interested to know in case
> the problems persisted). Please note that the current tip of the rdkit
> master branch already includes a few additions/changes compared to the
> latest release.
>
> I am also preparing a PR that will fully update the conda-rdkit master
> branch to the current 2016.03.1 release, I am about to run some final tests
> but I think it should be hopefully ready between today and tomorrow.
>
> Best,
> Riccardo
>



[Rdkit-discuss] Conda and Rdkit 2016-03 Pains

2016-05-02 Thread Markus Sitzmann
Hi Greg and everybody involved,

thanks for your great software - unfortunately, I have some building
pains. I recently decided to go from RDKit 2015-03 to 2015-09 (yes, I
was late), everything still on python 2.7.

As part of this migration I decided to give Conda a try and it worked
nicely in my Docker container (which is very similar to the
official Conda RDKit container at

https://github.com/rdkit/conda-rdkit

but starts from Debian Jessie instead of Centos6 - however it still
clones from this repository).


Unfortunately, since you switched to RDKit 2016-03 my troubles began.
I know you are still working on this, but as soon as RDKit starts to
build in the container, the build process breaks. If I go back to
revision 56c3a779f873c4e6f6dbbdc87d67d106f04c140d (the last one before
RDKit 2016-03 occurs) it at least builds the python 2.7 part again but
breaks later for python 3.4 and 3.5.

Just in order to maybe get a clue what's wrong, I started playing
around with the original Docker build on Centos 6 (i.e. the original
Dockerfile), but I observe the same behavior - the build breaks
somewhere. And even there, when I go back to revision
56c3a779f873c4e6f6dbbdc87d67d106f04c140d (i.e. replace the word
"development" by this revision number in line 26 of the Dockerfile and
uncomment the line - otherwise it is unchanged) , the build breaks
after the python 2.7 part is finished (I attach the end of the build
log below).

Is that something you are aware of? Or is this a problem only I
observe? I can also give more documentation if this is needed,
however, I just wanted to get a first opinion.

I also already tried builds with the development branch (besides the
master branch, of course); unfortunately, they break too.

Thanks a lot,
Markus


BUILD END: rdkit-postgresql-2015.09.2-py27_1
Nothing to test for: rdkit-postgresql-2015.09.2-py27_1
# If you want to upload this package to anaconda.org later, type:
#
# $ anaconda upload
/home/rdkit/miniconda/conda-bld/linux-64/rdkit-postgresql-2015.09.2-py27_1.tar.bz2
#
# To have conda build upload to anaconda.org automatically, use
# $ conda config --set anaconda_upload yes

 ---> 5898063548d9
Removing intermediate container 8af783478ecc
Step 24 : RUN CONDA_PY=34 conda build boost --quiet --no-anaconda-upload
 ---> Running in 01043f80baa4
Using Anaconda Cloud api site https://api.anaconda.org
Removing old build environment
Removing old work directory
BUILD START: boost-1.56.0-py34_3
Fetching package metadata: ..
Solving package specifications: 
The following specifications were found to be in conflict:
  - rdkit (target=rdkit-2015.09.2-np110py27_0.tar.bz2) -> boost ==1.56.0
  - rdkit (target=rdkit-2015.09.2-np110py27_0.tar.bz2) -> python 2.7*
  - zlib
Use "conda info <package>" to see the dependencies for each package.
Missing dependency boost, but found recipe directory, so building boost first
Error: The following specifications were found to be in conflict:
  - rdkit (target=rdkit-2015.09.2-np110py27_0.tar.bz2) -> boost ==1.56.0
  - rdkit (target=rdkit-2015.09.2-np110py27_0.tar.bz2) -> python 2.7*
  - zlib
Use "conda info <package>" to see the dependencies for each package.
The command '/bin/sh -c CONDA_PY=34 conda build boost --quiet
--no-anaconda-upload' returned a non-zero code: 1



Re: [Rdkit-discuss] Stereochemistry - Differences between RDKit & Indigo

2015-08-21 Thread Markus Sitzmann
Hi James,

I know that my opinion might sound extreme but I had this discussion
many times (mostly regarding tautomerism which is, however, similar in
some way). The problem is, you can look at a chemical structure in
many different ways - two scenarios are:

1. What can I perceive from a chemical structure if all I have is the
pure connection table and nothing else (and maybe millions of them)?
2. What can I find out about a particular structure if I can run
full-fledged quantum-mechanical calculations, do an extensive literature
search, and/or have carefully measured experimental data and
conditions (rarely in the millions :-))?

So, if I deal with something like implementing RDKit, things are
probably always quite close to scenario 1, hence my suggestion to
disregard stereochemistry on this type of N atom (you need a lot of
information from scenario 2 to even decide whether there is
stereochemistry or not). The ideal solution, of course, would be to
offer three different modes for stereo perception: "disregard",
"keep", and "perceive" from 3D (I am not sure if Greg likes that :-)). If
these three modes were available, I would still suggest setting the
default to "disregard" for 3-coordinated N because the other two modes
require that you know what you are doing and/or have full trust in
your data - otherwise you probably do more harm than good.
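
To make the three modes concrete, here is a rough sketch of how they could look on top of the existing RDKit Python API. The enum NStereoMode and the helper apply_n_stereo_mode are hypothetical names of mine, not RDKit functions; only the Chem.* calls are real. Note that "perceive" simply delegates to AssignAtomChiralTagsFromStructure, which acts on all atoms, and whether that puts a tag on a nitrogen is up to the RDKit version in use.

```python
from enum import Enum

from rdkit import Chem


class NStereoMode(Enum):
    """Hypothetical policy switch for stereo on three-coordinate N."""
    DISREGARD = "disregard"  # drop the stereo information (suggested default)
    KEEP = "keep"            # trust whatever the input encodes
    PERCEIVE = "perceive"    # re-derive chiral tags from 3D coordinates


def apply_n_stereo_mode(mol, mode=NStereoMode.DISREGARD):
    """Return a copy of mol with the chosen policy applied (illustrative only)."""
    mol = Chem.Mol(mol)  # work on a copy, leave the caller's molecule alone
    if mode is NStereoMode.KEEP:
        return mol
    if mode is NStereoMode.PERCEIVE:
        if mol.GetNumConformers() == 0:
            raise ValueError("'perceive' needs a conformer with 3D coordinates")
        Chem.AssignAtomChiralTagsFromStructure(mol)
        return mol
    for atom in mol.GetAtoms():
        # Three-coordinate nitrogen: exactly three neighbours, Hs included.
        if atom.GetAtomicNum() == 7 and atom.GetTotalDegree() == 3:
            atom.SetChiralTag(Chem.ChiralType.CHI_UNSPECIFIED)
    # Recompute the stereo annotations after clearing the tags.
    Chem.AssignStereochemistry(mol, cleanIt=True, force=True)
    return mol


mol = Chem.MolFromSmiles("C[N@@]1CO[C@@H](C)CC1")
stripped = Chem.MolToSmiles(apply_n_stereo_mode(mol), isomericSmiles=True)
print(stripped)
```

With the default mode the carbon centre keeps its tag while the ring nitrogen loses it, whatever the parser decided to do with the nitrogen on input.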

Best,
Markus

On Fri, Aug 21, 2015 at 3:10 PM, James Davidson  wrote:
> Hi Greg (and Markus, Peter, et al.),
>
>
>
> Personal opinion – my vote would be to always keep the chiral information at
> 3-valent nitrogen centres…
>
> As Peter pointed-out, there are bridgehead examples (most of which, I guess,
> will have additional carbon chiral centres – and offer diastereomeric
> considerations).
>
> There are also, I believe, some nice oxaziridine examples where the
> oxaziridine N is the only chiral centre present (interpreted from abstract
> here: http://dx.doi.org/10.1039/C3985998):
>
>
>
> 3,3-dimethyl (2S)-2-tert-butyloxaziridine-3,3-dicarboxylate
>
> COC(=O)C1(O[N@]1C(C)(C)C)C(=O)OC
>
>
>
> and many other examples of diastereomeric oxaziridines – where the N is a
> chiral centre – eg see http://dx.doi.org/10.1016/j.tetasy.2008.09.016
>
>
>
>
>
> Kind regards
>
>
>
> James
>
>


Re: [Rdkit-discuss] Stereochemistry - Differences between RDKit & Indigo

2015-08-20 Thread Markus Sitzmann
Hmm, well - probably not; you mention the ever-present exception in
chemistry, Peter (sulfoxides have a similar situation, stereochemistry
from lone pairs). But generally I still think it is more dangerous to
keep or even perceive (from 3D) stereochemistry on three-coordinated N
- you will do more harm with this than good.



On Thu, Aug 20, 2015 at 6:40 PM, Peter Shenkin  wrote:
> "My initial answer, and I would love input on this, is that three-coordinate
> N should always have stereochemistry removed."
>
> Umm... even if it's a bridgehead?
>
> -P.
>
> On Thu, Aug 20, 2015 at 10:30 AM, Greg Landrum 
> wrote:
>>
>> This isn't a simple one, so it may take a bit to get to an answer that's
>> comprehensible.
>>
>> There are two things going on here in the RDKit:
>> 1) Ring stereochemistry
>> 2) stereochemistry about nitrogen centers
>>
>> Let's start with the second, because it's easier: RDKit does not generally
>> "believe in" stereochemistry around three coordinate nitrogens. Here's a
>> very simple example:
>> In [45]: m3 = Chem.MolFromSmiles('Br[N@](F)Cl')
>>
>> In [46]: Chem.MolToSmiles(m3,isomericSmiles=True)
>> Out[46]: 'FN(Cl)Br'
>>
>>
>> The 3D equivalent of that:
>> In [41]: m = Chem.MolFromSmiles('BrN(F)Cl')
>>
>> In [42]: AllChem.EmbedMolecule(m)
>> Out[42]: 0
>>
>> In [43]: Chem.AssignAtomChiralTagsFromStructure(m)
>>
>> In [44]: Chem.MolToSmiles(m,isomericSmiles=True)
>> Out[44]: 'FN(Cl)Br'
>>
>> Contrast this with what you get for a carbon:
>>
>> In [34]: m2 = Chem.MolFromSmiles('FC(Br)(Cl)I')
>>
>> In [35]: AllChem.EmbedMolecule(m2)
>> Out[35]: 0
>>
>> In [36]: Chem.AssignAtomChiralTagsFromStructure(m2)
>>
>> In [37]: Chem.MolToSmiles(m2,isomericSmiles=True)
>> Out[37]: 'F[C@](Cl)(Br)I'
>>
>>
>> Back to the first: ring stereochemistry. By this I mean things like
>> C[C@H]1CC[C@@H](C)CC1 - molecules where the stereochemistry information is
>> really about whether the substituents of the ring are cis or trans relative
>> to the ring plane.
>>
>> The way the RDKit handles this is something of a hack: it doesn't identify
>> those atoms as chiral centers, but it does preserve the chiral tags when
>> generating a canonical SMILES:
>>
>> In [47]: m = Chem.MolFromSmiles('C[C@H]1CC[C@@H](C)CC1')
>>
>> In [48]: Chem.FindMolChiralCenters(m)
>> Out[48]: []
>>
>> In [49]: Chem.MolToSmiles(m,isomericSmiles=True)
>> Out[49]: 'C[C@H]1CC[C@@H](C)CC1'
>>
>> Curiously, to me at least, it does the same thing with nitrogens;
>>
>> In [52]: m2 = Chem.MolFromSmiles('C[N@@]1CC[C@@H](C)CC1')
>>
>> In [53]: Chem.MolToSmiles(m2,isomericSmiles=True)
>> Out[53]: 'C[C@H]1CC[N@](C)CC1'
>>
>> Lest anyone think that this might make sense because being a ring makes
>> inversion more difficult, that's not what is going on here. If I make the
>> ring truly chiral, then the stereochemistry of the N is removed:
>>
>> In [54]: m3 = Chem.MolFromSmiles('C[N@@]1CO[C@@H](C)CC1')
>>
>> In [55]: Chem.MolToSmiles(m3,isomericSmiles=True)
>> Out[55]: 'C[C@H]1CCN(C)CO1'
>>
>> I believe that this inconsistent behavior is a bug: either N should always
>> have the input stereochemistry preserved (and that should be perceived from
>> the 3D coordinates) or it should never have the input stereochemistry
>> preserved. My initial answer, and I would love input on this, is that
>> three-coordinate N should always have stereochemistry removed.
>>
>> -greg
>>
>>
>>
>> On Thu, Aug 20, 2015 at 2:22 PM, Rob Smith  wrote:
>>>
>>> Hi Greg,
>>>
>>> I've attached the SDF that Corina generates. I'm not convinced it is a
>>> problem, more an observation that I'm trying to understand.
>>>
>>> Looking at the results again today - it seems that from the Corina output
>>> Indigo is interpreting the conformer (including whether the ethyl
>>> substituent on the piperidine nitrogen is equatorial or axial) - and
>>> outputting a canonical smiles string that has the conformer "encoded" in it
>>> (using the chiral flags). Whereas RDKit is reading in the Corina output,
>>> "discounting" whether the nitrogen is axial or equatorial (which due to
>>> inversion I can understand) and interpreting it as having only two chiral
>>> centers (which is correct).
>>>
>>> What is confusing me, is that when I supply RDKit with the canonical
>>> smiles string from Indigo (which has the conformer "encoded" in it), and
>>> then ask for the isomeric canonical smiles, it supplies the canonical smiles
>>> with the conformer still "encoded" within it.
>>>
>>> For example, I read in the following canonical smiles string into RDKit:
>>> CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1 (which was generated by reading
>>> in one of the mols in the SD File into RDKit and output the isomeric
>>> canonical smiles), running the FindMolChiralCenters on this molecule,
>>> correctly reports the number of chiral centres to be 2 (6S, 9R), and then
>>> asking it to output the canonical smiles string (with isomericSmiles=True)
>>> gives CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1 (1).
>>>
>>> If I take the same

Re: [Rdkit-discuss] Stereochemistry - Differences between RDKit & Indigo

2015-08-20 Thread Markus Sitzmann
I agree with removing it - the chance that you destroy actual information
by this is low - or in other words, I would expect the chance that
stereoinformation on three-coordinate N is spurious to be high.

Markus

On Thu, Aug 20, 2015 at 4:30 PM, Greg Landrum  wrote:
> This isn't a simple one, so it may take a bit to get to an answer that's
> comprehensible.
>
> There are two things going on here in the RDKit:
> 1) Ring stereochemistry
> 2) stereochemistry about nitrogen centers
>
> Let's start with the second, because it's easier: RDKit does not generally
> "believe in" stereochemistry around three coordinate nitrogens. Here's a
> very simple example:
> In [45]: m3 = Chem.MolFromSmiles('Br[N@](F)Cl')
>
> In [46]: Chem.MolToSmiles(m3,isomericSmiles=True)
> Out[46]: 'FN(Cl)Br'
>
>
> The 3D equivalent of that:
> In [41]: m = Chem.MolFromSmiles('BrN(F)Cl')
>
> In [42]: AllChem.EmbedMolecule(m)
> Out[42]: 0
>
> In [43]: Chem.AssignAtomChiralTagsFromStructure(m)
>
> In [44]: Chem.MolToSmiles(m,isomericSmiles=True)
> Out[44]: 'FN(Cl)Br'
>
> Contrast this with what you get for a carbon:
>
> In [34]: m2 = Chem.MolFromSmiles('FC(Br)(Cl)I')
>
> In [35]: AllChem.EmbedMolecule(m2)
> Out[35]: 0
>
> In [36]: Chem.AssignAtomChiralTagsFromStructure(m2)
>
> In [37]: Chem.MolToSmiles(m2,isomericSmiles=True)
> Out[37]: 'F[C@](Cl)(Br)I'
>
>
> Back to the first: ring stereochemistry. By this I mean things like
> C[C@H]1CC[C@@H](C)CC1 - molecules where the stereochemistry information is
> really about whether the substituents of the ring are cis or trans relative
> to the ring plane.
>
> The way the RDKit handles this is something of a hack: it doesn't identify
> those atoms as chiral centers, but it does preserve the chiral tags when
> generating a canonical SMILES:
>
> In [47]: m = Chem.MolFromSmiles('C[C@H]1CC[C@@H](C)CC1')
>
> In [48]: Chem.FindMolChiralCenters(m)
> Out[48]: []
>
> In [49]: Chem.MolToSmiles(m,isomericSmiles=True)
> Out[49]: 'C[C@H]1CC[C@@H](C)CC1'
>
> Curiously, to me at least, it does the same thing with nitrogens;
>
> In [52]: m2 = Chem.MolFromSmiles('C[N@@]1CC[C@@H](C)CC1')
>
> In [53]: Chem.MolToSmiles(m2,isomericSmiles=True)
> Out[53]: 'C[C@H]1CC[N@](C)CC1'
>
> Lest anyone think that this might make sense because being a ring makes
> inversion more difficult, that's not what is going on here. If I make the
> ring truly chiral, then the stereochemistry of the N is removed:
>
> In [54]: m3 = Chem.MolFromSmiles('C[N@@]1CO[C@@H](C)CC1')
>
> In [55]: Chem.MolToSmiles(m3,isomericSmiles=True)
> Out[55]: 'C[C@H]1CCN(C)CO1'
>
> I believe that this inconsistent behavior is a bug: either N should always
> have the input stereochemistry preserved (and that should be perceived from
> the 3D coordinates) or it should never have the input stereochemistry
> preserved. My initial answer, and I would love input on this, is that
> three-coordinate N should always have stereochemistry removed.
>
> -greg
>
>
>
> On Thu, Aug 20, 2015 at 2:22 PM, Rob Smith  wrote:
>>
>> Hi Greg,
>>
>> I've attached the SDF that Corina generates. I'm not convinced it is a
>> problem, more an observation that I'm trying to understand.
>>
>> Looking at the results again today - it seems that from the Corina output
>> Indigo is interpreting the conformer (including whether the ethyl
>> substituent on the piperidine nitrogen is equatorial or axial) - and
>> outputting a canonical smiles string that has the conformer "encoded" in it
>> (using the chiral flags). Whereas RDKit is reading in the Corina output,
>> "discounting" whether the nitrogen is axial or equatorial (which due to
>> inversion I can understand) and interpreting it as having only two chiral
>> centers (which is correct).
>>
>> What is confusing me, is that when I supply RDKit with the canonical
>> smiles string from Indigo (which has the conformer "encoded" in it), and
>> then ask for the isomeric canonical smiles, it supplies the canonical smiles
>> with the conformer still "encoded" within it.
>>
>> For example, I read in the following canonical smiles string into RDKit:
>> CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1 (which was generated by reading
>> in one of the mols in the SD File into RDKit and output the isomeric
>> canonical smiles), running the FindMolChiralCenters on this molecule,
>> correctly reports the number of chiral centres to be 2 (6S, 9R), and then
>> asking it to output the canonical smiles string (with isomericSmiles=True)
>> gives CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1 (1).
>>
>> If I take the same mol file, read it into Indigo, and ask it to output the
>> canonical smiles string, I get: CC(C)[C@H]1CC[N@H+]1[C@@H]1CC[N@@](CC1)CC,
>> if I read this smiles string into RDKit and run FindMolChiralCenters on it, I get
>> (3R, 6S) - which is fine, if I then output the canonical smiles (again with
>> isomericSmiles=True) I get CC[N@]1CC[C@@H]([N@@H+]2CC[C@@H]2C(C)C)CC1. I
>> expected this isomeric canonical smiles to be the same as (1), however RDKit
>> appears to conserve the conformer 

Re: [Rdkit-discuss] Stereochemistry - Differences between RDKit & Indigo

2015-08-20 Thread Markus Sitzmann
Hehe, that is why I keep my computers always really cold when I run RDKit ... 

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 20.08.2015, at 04:33, Peter Shenkin  wrote:
> 
> Maybe when you have a toolkit as blazingly fast as RDKit it captures the 
> chirality of N center before it has time to interconvert
> 
> -P.
> 
>> On Wed, Aug 19, 2015 at 10:17 PM, John M  wrote:
>> More odd is the carbon stereocentre with two methyls...
>> 
>> Generally trivalent nitrogens are not considered chiral due to inversion of 
>> the lone-pair. The two usual exceptions are when they are a bridgehead or in 
>> a tight ring (e.g. aziridine). This is the same in most toolkits; the InChI 
>> technical documentation provides useful examples.
>> 
>> InChI actually only sees one stereo centre since it strips the proton off:
>> InChI=1S/C13H26N2/c1-4-14-8-5-12(6-9-14)15-10-7-13(15)11(2)3/h11-13H,4-10H2,1-3H3/p+1/t13-/m1/s1
>> 
>> It may well be chiral in this case but since it's not you should also 
>> strictly remove the other stereocentre in the para position to the nitrogen
>> 
>> For the record just tested and ChemAxon/CDK/OpenBabel do the same.
>> 
>> John
>> 
>> Regards,
>> John W May 
>> john.wilkinson...@gmail.com
>> 
>>> On 19 August 2015 at 09:00, Rob Smith  wrote:
>>> Dear RDKit community,
>>> 
>>> I'm trying to use RDKit to read in Corina generated stereoisomers (from a 
>>> Mol file), assign chiral tags and stereochemistry to the structure and 
>>> output the canonical smiles string for each isomer of a given molecule (in 
>>> Python), when I do this, half the canonical smiles strings are not unique.
>>> 
>>> When I read in the output from Corina into an Indigo instance, then use the 
>>> canonical smiles from Indigo to create an RDKit molecule, canonical smiles 
>>> strings generated from the molecule objects are all unique.
>>> 
>>> I may be missing an option to enable RDKit to 'visualise' the chiral centre 
>>> adjacent to the protonated nitrogen, so if someone can spot where I've made 
>>> a mistake, I'd really appreciate it. I've included the output and Python 
>>> script below. If you require any further information, please let me know.
>>> 
>>> Many thanks,
>>> Rob
>>> 
>>> Output:
>>> 
>>> RDKit Read in of Molecule
>>> RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1
>>> RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1
>>> RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@@H]2[C@H](C)C)CC1
>>> RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@@H]2[C@H](C)C)CC1
>>> RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@H]2[C@H](C)C)CC1
>>> RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@H]2[C@H](C)C)CC1
>>> RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@H]2[C@H](C)C)CC1
>>> RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@H]2[C@H](C)C)CC1
>>> 
>>> INDIGO Read in of Molecule
>>> RDKit Output -  CC[N@]1CC[C@@H]([N@@H+]2CC[C@@H]2C(C)C)CC1
>>> RDKit Output -  CC[N@]1CC[C@H]([N@@H+]2CC[C@@H]2C(C)C)CC1
>>> RDKit Output -  CC[N@]1CC[C@@H]([N@H+]2CC[C@@H]2C(C)C)CC1
>>> RDKit Output -  CC[N@]1CC[C@H]([N@H+]2CC[C@@H]2C(C)C)CC1
>>> RDKit Output -  CC[N@]1CC[C@@H]([N@@H+]2CC[C@H]2C(C)C)CC1
>>> RDKit Output -  CC[N@]1CC[C@H]([N@@H+]2CC[C@H]2C(C)C)CC1
>>> RDKit Output -  CC[N@]1CC[C@@H]([N@H+]2CC[C@H]2C(C)C)CC1
>>> RDKit Output -  CC[N@]1CC[C@H]([N@H+]2CC[C@H]2C(C)C)CC1
>>> 
>>> Python script :
>>> 
>>> from rdkit import Chem
>>> import subprocess # Used to run Corina
>>> from indigo import *
>>> 
>>> def runCorinaTest(inputMol):
>>>     indigo = Indigo()
>>> 
>>>     molFile = Chem.MolToMolBlock(inputMol)
>>> 
>>>     corinaCommand = "echo \'" + molFile + "\' | "
>>>     # Then Corina - generate stereoisomers...
>>>     corinaCommand = corinaCommand + "/apps/corina/corina -t n -d canon,stergen,preserve,names,wh,flapn,msc=7,msi=128 -i t=sdf"
>>>     # Gives the stereoisomer species as an SDF string
>>>     corinaResult = subprocess.check_output([corinaCommand], shell=True)
>>> 
>>>     allMoleculeObjects = []
>>>     # Separate Corina output into individual molecules
>>>     allMolecules = corinaResult.split("\n")
>>>     allMolecules = allMolecules[0:len(allMolecules)-1]
>>> 
>>

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-17 Thread Markus Sitzmann
We could consider some quantum-mechanical calculations ... well, I always
hated this discussion: for my web service with millions of structures, I
kept hearing that I should consider quantum-mechanical calculations as part
of the structure normalization/canonicalization ;-)

On Wed, Jun 17, 2015 at 8:22 AM, Peter Shenkin  wrote:

> Hi, Greg,
>
> Within the SMILES framework, it seems to me that if you allow the atoms to
> be aromatic, then these are two Kekule structures of the same aromatic
> system, and however you do the canonicalization, they ought to canonicalize
> to the same structure, which the two examples did not do. I don't think you
> addressed this.
>
> I think now that there is no issue with having a double bond between two
> aromatic atoms beyond our preconceptions. If that is a problem, you could
> Kekulize it per your first picture, (though perhaps that is inconvenient in
> the context of the implementation).
>
> I actually didn't realize why aromaticity (particularly the double bond)
> made sense when I originally wrote, so the above is with the benefit of
> hindsight, and your comments.
>
> I think the molecule is entertaining in several ways. In the cubane
> geometry, the molecule cannot be conventionally aromatic. Might it actually
> be antiaromatic? Could there be two forms?
>
> Dunno
> -P.
>
>
> On Wed, Jun 17, 2015 at 1:25 AM, Greg Landrum 
> wrote:
>>
>>
>> The problematic part of your two molecules can be reduced to:
>> [image: Inline image 3]
>> and
>> [image: Inline image 4]
>> That second one shows the kekulized form that the RDKit ends up using.
>>
>> These produce the following canonical SMILES:
>>
>> In [31]: Chem.CanonSmiles('C1=CC2=CC=C12')
>> Out[31]: 'c1cc2ccc1-2'
>>
>> In [32]: Chem.CanonSmiles('C1=CC2=C1C=C2')
>> Out[32]: 'c1cc2ccc1=2'
>>
>>
>
> --
>
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>
--
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Test #91 failing for RDkit 2015_03_1

2015-05-26 Thread Markus Sitzmann
The same happened to me with the previous version of RDKit when I compiled it in
a Docker container. I had to install Pillow first, too - probably they are
trying to keep those Ubuntu or CentOS versions more minimal when they are
used in VMs or Docker.
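
A quick pre-flight check before running the test suite can surface this kind of missing dependency early. The following is only a minimal sketch using the Python standard library; the helper name `missing_modules` is made up for illustration, and `PIL` is the import name that the Pillow package provides.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "PIL" is the import name provided by Pillow.
    absent = missing_modules(["PIL"])
    if absent:
        print("Missing Python modules:", ", ".join(absent))
    else:
        print("All required modules found")
```

Only top-level names are checked here; dotted names would need their parent package to be importable first.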

Markus

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 26.05.2015, at 18:08, Michał Nowotka  wrote:
> 
> Sorry, for the hassle, this has now been fixed. After running  'ctest
> -R pythonTestDirChem -V' I've noticed that Pillow/PIL is missing.
> 
>> On Tue, May 26, 2015 at 4:51 PM, Michał Nowotka  wrote:
>> Hi,
>> 
>> We are trying to compile latest (2015_03_1) RDKit version on myChEMBL VMs.
>> Unfortunately when running tests, the last one fails:
>> 
>> -
>> 
>> 91/91 Test #91: pythonTestDirChem ***Failed   36.43 sec
>> 
>> 99% tests passed, 1 tests failed out of 91
>> 
>> Total Test time (real) = 119.76 sec
>> 
>> The following tests FAILED:
>> 91 - pythonTestDirChem (Failed)
>> Errors while running CTest
>> 
>> -
>> 
>> This happens on Ubuntu 14.04 LTS and CentOS 7.
>> Is it something serious, can this be fixed?
>> 
>> Kind regards,
>> 
>> Michał Nowotka
> 


Re: [Rdkit-discuss] SDF properties in case of error

2015-05-03 Thread Markus Sitzmann
If you (ab)use ErrorMolecule to keep or add garbage into your future
blockbuster drug molecule set, it is your own problem. And if you rely
on the correctness of the SD file reader of any software as part of the
quality assurance in your drug pipeline process, I am quite positive
you are doing something wrong.

On Sun, May 3, 2015 at 7:04 PM, Dimitri Maziuk  wrote:
> On 2015-05-03 03:56, Markus Sitzmann wrote:
>> No, "cutting out a chunk of lines from a file" might be simple, but
>> can become an expensive operation if you want to deal with thousands
>> of files and millions of records.
>
> *If you have the line numbers* it's something like "head | tail" or a
> 2-line for loop w/ line counter.
>
> If it's not a one-off and your upstream keeps generating junk, the
> proper solution is to "have a talk" with them.
>
> The worst possible solution is to happily generate a garbage molecule
> that will blow up user's entire downstream pipeline. *If they're lucky*
> -- most likely it'll be garbage in - garbage out and crap happily flows
> on to the next stage. If ErrorMolecule "is a" Molecule that will happen.
>
> I most emphatically do not want to take any drug developed using that
> kind of software quality assurance and error control procedures. Or have
> any new material developed like that anywhere near my bike, car, or
> diving gear. And so on.
>
> Dimitri
>
>



Re: [Rdkit-discuss] SDF properties in case of error

2015-05-03 Thread Markus Sitzmann
No, "cutting out a chunk of lines from a file" might be simple, but it
can become an expensive operation if you want to deal with thousands
of files and millions of records. That is one of the reasons why I
(unfortunately) couldn't consider RDKit any further for one of my
projects a few years ago. So, I support Michael's idea :-)
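
One way to keep this cheap is to scan each SD file once and record, for every entry, its line range and its data fields, whether or not the mol block later turns out to be parseable. The sketch below is plain Python and is not RDKit's reader; the names `iter_sdf_records` and `_parse_props` are hypothetical, and it assumes records end with a `$$$$` line and data items start with a `>  <NAME>` header line.

```python
import io


def _parse_props(lines):
    """Collect data items (">  <NAME>" header plus value lines) from one record."""
    props = {}
    name = None
    for line in lines:
        text = line.rstrip("\n")
        if text.startswith(">") and "<" in text and ">" in text[1:]:
            # Header line of a data item, e.g. ">  <ID>"
            name = text[text.index("<") + 1:text.rindex(">")]
            props[name] = []
        elif name is not None:
            if text.strip() in ("", "$$$$"):
                name = None  # a blank line ends the current data item
            else:
                props[name].append(text)
    return {key: "\n".join(value) for key, value in props.items()}


def iter_sdf_records(handle):
    """Yield (first_line, last_line, props, raw_text) per record, 1-based lines.

    The mol block itself is never interpreted, so the data fields stay
    available even for records that a toolkit would reject.
    """
    start = 1
    lines = []
    for lineno, line in enumerate(handle, start=1):
        lines.append(line)
        if line.strip() == "$$$$":  # record terminator
            yield start, lineno, _parse_props(lines), "".join(lines)
            start = lineno + 1
            lines = []
    if any(item.strip() for item in lines):  # trailing record without "$$$$"
        yield start, start + len(lines) - 1, _parse_props(lines), "".join(lines)


if __name__ == "__main__":
    # Two tiny made-up records, used only to exercise the function.
    sample = (
        "mol1\n  test\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n"
        ">  <ID>\nA-1\n\n$$$$\n"
        "mol2\n  test\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n"
        ">  <ID>\nB-2\n\n$$$$\n"
    )
    for first, last, props, _raw in iter_sdf_records(io.StringIO(sample)):
        print("lines %d-%d, properties: %r" % (first, last, props))
```

The raw text of each record could then be handed to the toolkit of choice, and a failure can be reported together with the line range and the identifier fields instead of a bare None.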

On Sat, May 2, 2015 at 12:17 AM, Dimitri Maziuk  wrote:
> On 04/30/2015 05:01 PM, Michael Reutlinger wrote:
>
>> However, in some cases this does not help. E.g. when an unknown atom (most
>> of the time this is X) is found in the MolBlock the import fails with an
>> Post-condition Violation and None is yielded. This is fine to detect the
>> problem BUT it is impossible to get any information about the molecule
>> which failed.
>
> I'd say the best you can do is skip over to the next molecule and report
> "molecule in lines X to Y is corrupt". Cutting out a chunk of lines from
> a file is trivial, and if you're reading from a stream rather than a
> file then, well, don't. Without a valid mol block you don't have a
> molecule and you shouldn't be making one up. As in "conservative in what
> you produce".
>
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>
>



Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-25 Thread Markus Sitzmann
Sorry, looks like my baby is getting old ... :-)

Markus

On Wed, Feb 25, 2015 at 7:26 AM, Greg Landrum  wrote:
> To close the loop here: after an email exchange with Marc Nicklaus and Wolf
> Ihlenfeldt, it looks like the problem is that the NCI website is using an
> older version of the CACTVS toolkit to do the SMILES->InChI conversion. That
> older version contains a bug that has since been fixed. Marc is now aware of
> the problem.
>
> The RDKit was, at least in this case, not responsible for the bad InChIs.
>  :-)
>
> Best,
> -greg
>
>
>
>
> On Tue, Feb 24, 2015 at 8:27 AM, Greg Landrum 
> wrote:
>>
>>
>> The InChIs have me confused.
>>
>> I'm going to simplify the below by just showing the input SMILES, the
>> current (=master) RDKit InChI and the PubChem InChI
>>
>> On Mon, Feb 23, 2015 at 10:54 AM, JP  wrote:
>>>
>>>
>>> Here is the list (first inchi is the 2014_09_2, second one is the
>>> 2015.03.1pre generated one, third inchi is the cactus.nci.nih.gov):
>>>
>>> O=C(/N=c1/[nH]ncs1)[C@H]1CC[C@H](Cn2cnc3ccccc3c2=O)CC1
>>>
>>> InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)/t12-,13-
>>> # RDKit 2015.03.1pre
>>>
>>> InChI=1S/C18H29N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h12-15,19-20H,1-11H2,(H,21,22,24)/t12-,13-,14?,15?
>>> # cactus.nci.nih.gov
>>>
>>> O=C(/N=c1\[nH]c(-c2ccccn2)cs1)[C@H]1CC[C@H](Cn2cnc3ccccc3c2=O)CC1
>>> InChI=1S/C24H23N5O2S/c30-22(28-24-27-21(14-32-24)20-7-3-4-12-25-20)17-10-8-16(9-11-17)13-29-15-26-19-6-2-1-5-18(19)23(29)31/h1-7,12,14-17H,8-11,13H2,(H,27,28,30)/t16-,17-
>>>
>>> InChI=1S/C24H39N5O2S/c30-22(28-24-27-21(14-32-24)20-7-3-4-12-25-20)17-10-8-16(9-11-17)13-29-15-26-19-6-2-1-5-18(19)23(29)31/h16-21,25-26H,1-15H2,(H,27,28,30)/t16-,17-,18?,19?,20?,21?
>>>
>>> CCOC(=O)Cc1cs/c(=N/C(=O)[C@H]2CC[C@H](Cn3cnc4ccccc4c3=O)CC2)[nH]1
>>> InChI=1S/C23H26N4O4S/c1-2-31-20(28)11-17-13-32-23(25-17)26-21(29)16-9-7-15(8-10-16)12-27-14-24-19-6-4-3-5-18(19)22(27)30/h3-6,13-16H,2,7-12H2,1H3,(H,25,26,29)/t15-,16-
>>>
>>> InChI=1S/C23H36N4O4S/c1-2-31-20(28)11-17-13-32-23(25-17)26-21(29)16-9-7-15(8-10-16)12-27-14-24-19-6-4-3-5-18(19)22(27)30/h15-19,24H,2-14H2,1H3,(H,25,26,29)/t15-,16-,17?,18?,19?
>>>
>>> COCc1n[nH]/c(=N/C(=O)[C@H]2CC[C@H](Cn3cnc4ccccc4c3=O)CC2)s1
>>> InChI=1S/C20H23N5O3S/c1-28-11-17-23-24-20(29-17)22-18(26)14-8-6-13(7-9-14)10-25-12-21-16-5-3-2-4-15(16)19(25)27/h2-5,12-14H,6-11H2,1H3,(H,22,24,26)/t13-,14-
>>>
>>> InChI=1S/C20H33N5O3S/c1-28-11-17-23-24-20(29-17)22-18(26)14-8-6-13(7-9-14)10-25-12-21-16-5-3-2-4-15(16)19(25)27/h13-17,21,23H,2-12H2,1H3,(H,22,24,26)/t13-,14-,15?,16?,17?
>>>
>>> COC(=O)c1[nH]/c(=N\C(=O)[C@H]2CC[C@H](Cn3cnc4ccccc4c3=O)CC2)sc1C(C)C
>>> InChI=1S/C24H28N4O4S/c1-14(2)20-19(23(31)32-3)26-24(33-20)27-21(29)16-10-8-15(9-11-16)12-28-13-25-18-7-5-4-6-17(18)22(28)30/h4-7,13-16H,8-12H2,1-3H3,(H,26,27,29)/t15-,16-
>>>
>>> InChI=1S/C24H38N4O4S/c1-14(2)20-19(23(31)32-3)26-24(33-20)27-21(29)16-10-8-15(9-11-16)12-28-13-25-18-7-5-4-6-17(18)22(28)30/h14-20,25H,4-13H2,1-3H3,(H,26,27,29)/t15-,16-,17?,18?,19?,20?
>>>
>>> CC(C)[C@H]1CC[C@H](C(=O)N[C@H](Cc2ccccc2)C(=O)/N=c2\[nH]ncs2)CC1
>>> InChI=1S/C21H28N4O2S/c1-14(2)16-8-10-17(11-9-16)19(26)23-18(12-15-6-4-3-5-7-15)20(27)24-21-25-22-13-28-21/h3-7,13-14,16-18H,8-12H2,1-2H3,(H,23,26)(H,24,25,27)/t16-,17-,18-/m1/s1
>>>
>>> InChI=1S/C21H36N4O2S/c1-14(2)16-8-10-17(11-9-16)19(26)23-18(12-15-6-4-3-5-7-15)20(27)24-21-25-22-13-28-21/h14-18,22H,3-13H2,1-2H3,(H,23,26)(H,24,25,27)/t16-,17-,18-/m1/s1
>>
>>
>> If you look in the formula layer for the InChIs from PubChem, you will see
>> that they all have *way* too many H atoms. I think there's something about
>> the structures that is confusing the pubchem/cactvs conversion code.
>>
>> Compare these two outputs.
>> Aromatic form:
>>
>> http://cactus.nci.nih.gov/chemical/structure/O=C(N=c1[nH]ncs1)C1CCC(Cn2cnc3ccccc3c2=O)CC1/stdinchi
>> produces:
>>
>> InChI=1S/C18H29N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h12-15,19-20H,1-11H2,(H,21,22,24)
>>
>> Kekule form:
>>
>> http://cactus.nci.nih.gov/chemical/structure/O=C(/N=C1/[NH]N=CS1)[C@H]1CC[C@H](CN2C=NC3=CC=CC=C3C2=O)CC1/stdinchi
>> produces:
>>
>> InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)/t12-,13-
>>
>> In fact, converting the 5 membered ring to kekule form is enough:
>>
>> http://cactus.nci.nih.gov/chemical/structure/O=C(N=C1[NH]N=CS1)C1CCC(Cn2cnc3ccccc3c2=O)CC1/stdinchi
>> produces:
>>
>> InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)
>>
>> This can't be true.
>>
>> We can further simplify things to track down the problem:
>>
>> http://cactus.nci.nih.gov/chemical/structure/N=c1[nH]ncs1/stdinchi
>> InChI=1S/C2H5N3S/c3-2-5-4-1-6-2/h4H,1H2,(H2,3,5)
>>
>> vs
>>
>> http://cactus.nci.nih.gov/ch

Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-25 Thread Markus Sitzmann
You can report it to Marc Nicklaus ... who will probably send it to me
... I will take a look. Whether I can fix any misbehavior is another
question.
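
The hydrogen mismatch Greg points out in the quoted message below can be checked mechanically by reading the formula layer of each InChI. This is a minimal sketch in plain Python, not part of RDKit or CACTVS; the helper name `formula_counts` is made up for illustration, and it only handles single-component formulas without a leading multiplier. The two InChIs are the RDKit 2015.03.1pre and cactus.nci.nih.gov results quoted for the first molecule in the list.

```python
import re


def formula_counts(inchi):
    """Return element counts from the formula layer of an InChI string."""
    parts = inchi.split("/")
    if len(parts) < 2 or not parts[0].startswith("InChI="):
        raise ValueError("not an InChI string: %r" % inchi)
    counts = {}
    # parts[1] is the formula layer, e.g. "C18H19N5O2S"
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", parts[1]):
        counts[symbol] = counts.get(symbol, 0) + int(number or 1)
    return counts


RDKIT_INCHI = (
    "InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)"
    "9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)"
    "/t12-,13-"
)
CACTUS_INCHI = (
    "InChI=1S/C18H29N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)"
    "9-23-10-19-15-4-2-1-3-14(15)17(23)25/h12-15,19-20H,1-11H2,(H,21,22,24)"
    "/t12-,13-,14?,15?"
)

if __name__ == "__main__":
    extra_h = formula_counts(CACTUS_INCHI)["H"] - formula_counts(RDKIT_INCHI)["H"]
    print("Extra hydrogens in the cactus result:", extra_h)
```

For this pair the heavy-atom counts agree and only the hydrogen count differs (29 versus 19), which is consistent with the aromatic rings being read as saturated.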

On Tue, Feb 24, 2015 at 8:27 AM, Greg Landrum  wrote:
>
> The InChIs have me confused.
>
> I'm going to simplify the below by just showing the input SMILES, the
> current (=master) RDKit InChI and the PubChem InChI
>
> On Mon, Feb 23, 2015 at 10:54 AM, JP  wrote:
>>
>>
>> Here is the list (first inchi is the 2014_09_2, second one is the
>> 2015.03.1pre generated one, third inchi is the cactus.nci.nih.gov):
>>
>> O=C(/N=c1/[nH]ncs1)[C@H]1CC[C@H](Cn2cnc3ccccc3c2=O)CC1
>>
>> InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)/t12-,13-
>> # RDKit 2015.03.1pre
>>
>> InChI=1S/C18H29N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h12-15,19-20H,1-11H2,(H,21,22,24)/t12-,13-,14?,15?
>> # cactus.nci.nih.gov
>>
>> O=C(/N=c1\[nH]c(-c2ccccn2)cs1)[C@H]1CC[C@H](Cn2cnc3ccccc3c2=O)CC1
>> InChI=1S/C24H23N5O2S/c30-22(28-24-27-21(14-32-24)20-7-3-4-12-25-20)17-10-8-16(9-11-17)13-29-15-26-19-6-2-1-5-18(19)23(29)31/h1-7,12,14-17H,8-11,13H2,(H,27,28,30)/t16-,17-
>>
>> InChI=1S/C24H39N5O2S/c30-22(28-24-27-21(14-32-24)20-7-3-4-12-25-20)17-10-8-16(9-11-17)13-29-15-26-19-6-2-1-5-18(19)23(29)31/h16-21,25-26H,1-15H2,(H,27,28,30)/t16-,17-,18?,19?,20?,21?
>>
>> CCOC(=O)Cc1cs/c(=N/C(=O)[C@H]2CC[C@H](Cn3cnc4ccccc4c3=O)CC2)[nH]1
>> InChI=1S/C23H26N4O4S/c1-2-31-20(28)11-17-13-32-23(25-17)26-21(29)16-9-7-15(8-10-16)12-27-14-24-19-6-4-3-5-18(19)22(27)30/h3-6,13-16H,2,7-12H2,1H3,(H,25,26,29)/t15-,16-
>>
>> InChI=1S/C23H36N4O4S/c1-2-31-20(28)11-17-13-32-23(25-17)26-21(29)16-9-7-15(8-10-16)12-27-14-24-19-6-4-3-5-18(19)22(27)30/h15-19,24H,2-14H2,1H3,(H,25,26,29)/t15-,16-,17?,18?,19?
>>
>> COCc1n[nH]/c(=N/C(=O)[C@H]2CC[C@H](Cn3cnc4ccccc4c3=O)CC2)s1
>> InChI=1S/C20H23N5O3S/c1-28-11-17-23-24-20(29-17)22-18(26)14-8-6-13(7-9-14)10-25-12-21-16-5-3-2-4-15(16)19(25)27/h2-5,12-14H,6-11H2,1H3,(H,22,24,26)/t13-,14-
>>
>> InChI=1S/C20H33N5O3S/c1-28-11-17-23-24-20(29-17)22-18(26)14-8-6-13(7-9-14)10-25-12-21-16-5-3-2-4-15(16)19(25)27/h13-17,21,23H,2-12H2,1H3,(H,22,24,26)/t13-,14-,15?,16?,17?
>>
>> COC(=O)c1[nH]/c(=N\C(=O)[C@H]2CC[C@H](Cn3cnc4ccccc4c3=O)CC2)sc1C(C)C
>> InChI=1S/C24H28N4O4S/c1-14(2)20-19(23(31)32-3)26-24(33-20)27-21(29)16-10-8-15(9-11-16)12-28-13-25-18-7-5-4-6-17(18)22(28)30/h4-7,13-16H,8-12H2,1-3H3,(H,26,27,29)/t15-,16-
>>
>> InChI=1S/C24H38N4O4S/c1-14(2)20-19(23(31)32-3)26-24(33-20)27-21(29)16-10-8-15(9-11-16)12-28-13-25-18-7-5-4-6-17(18)22(28)30/h14-20,25H,4-13H2,1-3H3,(H,26,27,29)/t15-,16-,17?,18?,19?,20?
>>
>> CC(C)[C@H]1CC[C@H](C(=O)N[C@H](Cc2ccccc2)C(=O)/N=c2\[nH]ncs2)CC1
>> InChI=1S/C21H28N4O2S/c1-14(2)16-8-10-17(11-9-16)19(26)23-18(12-15-6-4-3-5-7-15)20(27)24-21-25-22-13-28-21/h3-7,13-14,16-18H,8-12H2,1-2H3,(H,23,26)(H,24,25,27)/t16-,17-,18-/m1/s1
>>
>> InChI=1S/C21H36N4O2S/c1-14(2)16-8-10-17(11-9-16)19(26)23-18(12-15-6-4-3-5-7-15)20(27)24-21-25-22-13-28-21/h14-18,22H,3-13H2,1-2H3,(H,23,26)(H,24,25,27)/t16-,17-,18-/m1/s1
>
>
> If you look in the formula layer for the InChIs from PubChem, you will see
> that they all have *way* too many H atoms. I think there's something about
> the structures that is confusing the pubchem/cactvs conversion code.
>
> Compare these two outputs.
> Aromatic form:
> http://cactus.nci.nih.gov/chemical/structure/O=C(N=c1[nH]ncs1)C1CCC(Cn2cnc3ccccc3c2=O)CC1/stdinchi
> produces:
> InChI=1S/C18H29N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h12-15,19-20H,1-11H2,(H,21,22,24)
>
> Kekule form:
> http://cactus.nci.nih.gov/chemical/structure/O=C(/N=C1/[NH]N=CS1)[C@H]1CC[C@H](CN2C=NC3=CC=CC=C3C2=O)CC1/stdinchi
> produces:
> InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)/t12-,13-
>
> In fact, converting the 5 membered ring to kekule form is enough:
> http://cactus.nci.nih.gov/chemical/structure/O=C(N=C1[NH]N=CS1)C1CCC(Cn2cnc3ccccc3c2=O)CC1/stdinchi
> produces:
> InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)
>
> This can't be true.
>
> We can further simplify things to track down the problem:
>
> http://cactus.nci.nih.gov/chemical/structure/N=c1[nH]ncs1/stdinchi
> InChI=1S/C2H5N3S/c3-2-5-4-1-6-2/h4H,1H2,(H2,3,5)
>
> vs
>
> http://cactus.nci.nih.gov/chemical/structure/O=c1[nH]ncs1/stdinchi
> InChI=1S/C2H2N2OS/c5-2-4-3-1-6-2/h1H,(H,4,5)
>
> It seems to be the exocyclic bond to an atom with Hs. This is ok:
> http://cactus.nci.nih.gov/chemical/structure/O=c1occo1/stdinchi
> InChI=1S/C3H2O3/c4-3-5-1-2-6-3/h1-2H
>
> but both of these are wrong:
> http://cactus.nci.nih.gov/chemical/structure/N=c1occo1/stdinchi
> InChI=1S/C3H5NO2/c4-3-5-1-2-6-3/h4H,1-2H2
>
> http://cactus.nci.nih.gov/chemical/structure/C=c1occo1/stdinchi
> InChI=1S/C4H6O2/c1-4-5-2-3

Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-23 Thread Markus Sitzmann
Well, the http://cactus.nci.nih.gov/chemical/structure/ site is my
baby, which I had to leave behind 1 1/2 years ago (I am not with NIH
anymore). Igor, who replied in this thread, was also involved in some
parts of it. Traffic on this cactus service is between 5 and 10 million
requests per month - so I think the service survived your attack ;-)

And I am not saying it is perfect, it just provides another
implementation to double-check things in question. It has the CACTVS
chemoinformatic toolkit as chemistry backend which I think is
well-tested.
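
For anyone who wants to script such a double-check, the request URL follows the pattern used throughout this thread: the structure string goes between the base path and the `/stdinchi` suffix. The sketch below only builds the URL and does not send a request; the function name `stdinchi_url` is made up, and percent-encoding the SMILES is an assumption about what the service accepts (the URLs quoted in this thread use the raw characters). Encoding does matter for characters such as `#`, which a URL parser would otherwise treat as the start of a fragment.

```python
from urllib.parse import quote

BASE_URL = "http://cactus.nci.nih.gov/chemical/structure"


def stdinchi_url(smiles):
    """Build the request URL for the Standard InChI of a SMILES string."""
    # safe="" also encodes "/" so that stereo bonds do not look like path separators
    return "%s/%s/stdinchi" % (BASE_URL, quote(smiles, safe=""))


if __name__ == "__main__":
    print(stdinchi_url("N=c1[nH]ncs1"))
```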

Markus

On Mon, Feb 23, 2015 at 10:54 AM, JP  wrote:
> Ok so I got out my test set of 6,940,083 molecules.  First, I generated the
> inchi using 2014_09_2.  I then checked out (and built) the master (with
> Greg's latest commits) from github and regenerated the inchis for all these
> molecules.
>
> 3,257 molecules (of 6,940,083) gave me different InChIs between the
> current production version and the development (github) one.
>
> For these 3,257 molecules I hammered the
> http://cactus.nci.nih.gov/chemical/structure/%s/stdinchi site and assumed
> this to be the 'correct' inchi (those great guys will have an interesting
> spike in their web traffic last Fri evening).  In 6 (out of 3,257) cases we
> get different Inchis from cactus.nci.nih.gov vs RDKit github development
> version (2015.03.1pre).
>
> Here is the list (first inchi is the 2014_09_2, second one is the
> 2015.03.1pre generated one, third inchi is the cactus.nci.nih.gov):
>
> O=C(/N=c1/[nH]ncs1)[C@H]1CC[C@H](Cn2cnc3ccccc3c2=O)CC1
> MPQBIWRBISQCLJ-BETUJISGSA-N MPQBIWRBISQCLJ-JOCQHMNTSA-N
> InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)/t12-,13+
> # RDKit 2014_09_2
> InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)/t12-,13-
> # RDKit 2015.03.1pre
> InChI=1S/C18H29N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)9-23-10-19-15-4-2-1-3-14(15)17(23)25/h12-15,19-20H,1-11H2,(H,21,22,24)/t12-,13-,14?,15?
> # cactus.nci.nih.gov
>
> O=C(/N=c1\[nH]c(-c2ccccn2)cs1)[C@H]1CC[C@H](Cn2cnc3ccccc3c2=O)CC1
> CZKXHWCYFFXKGH-CALCHBBNSA-N CZKXHWCYFFXKGH-QAQDUYKDSA-N
> InChI=1S/C24H23N5O2S/c30-22(28-24-27-21(14-32-24)20-7-3-4-12-25-20)17-10-8-16(9-11-17)13-29-15-26-19-6-2-1-5-18(19)23(29)31/h1-7,12,14-17H,8-11,13H2,(H,27,28,30)/t16-,17+
> InChI=1S/C24H23N5O2S/c30-22(28-24-27-21(14-32-24)20-7-3-4-12-25-20)17-10-8-16(9-11-17)13-29-15-26-19-6-2-1-5-18(19)23(29)31/h1-7,12,14-17H,8-11,13H2,(H,27,28,30)/t16-,17-
> InChI=1S/C24H39N5O2S/c30-22(28-24-27-21(14-32-24)20-7-3-4-12-25-20)17-10-8-16(9-11-17)13-29-15-26-19-6-2-1-5-18(19)23(29)31/h16-21,25-26H,1-15H2,(H,27,28,30)/t16-,17-,18?,19?,20?,21?
>
> CCOC(=O)Cc1cs/c(=N/C(=O)[C@H]2CC[C@H](Cn3cnc4ccccc4c3=O)CC2)[nH]1
> GAXCPQSXDNGSQV-IYBDPMFKSA-N GAXCPQSXDNGSQV-WKILWMFISA-N
> InChI=1S/C23H26N4O4S/c1-2-31-20(28)11-17-13-32-23(25-17)26-21(29)16-9-7-15(8-10-16)12-27-14-24-19-6-4-3-5-18(19)22(27)30/h3-6,13-16H,2,7-12H2,1H3,(H,25,26,29)/t15-,16+
> InChI=1S/C23H26N4O4S/c1-2-31-20(28)11-17-13-32-23(25-17)26-21(29)16-9-7-15(8-10-16)12-27-14-24-19-6-4-3-5-18(19)22(27)30/h3-6,13-16H,2,7-12H2,1H3,(H,25,26,29)/t15-,16-
> InChI=1S/C23H36N4O4S/c1-2-31-20(28)11-17-13-32-23(25-17)26-21(29)16-9-7-15(8-10-16)12-27-14-24-19-6-4-3-5-18(19)22(27)30/h15-19,24H,2-14H2,1H3,(H,25,26,29)/t15-,16-,17?,18?,19?
>
> COCc1n[nH]/c(=N/C(=O)[C@H]2CC[C@H](Cn3cnc4ccccc4c3=O)CC2)s1
> YVZJPKUMKXPZTK-OKILXGFUSA-N YVZJPKUMKXPZTK-HDJSIYSDSA-N
> InChI=1S/C20H23N5O3S/c1-28-11-17-23-24-20(29-17)22-18(26)14-8-6-13(7-9-14)10-25-12-21-16-5-3-2-4-15(16)19(25)27/h2-5,12-14H,6-11H2,1H3,(H,22,24,26)/t13-,14+
> InChI=1S/C20H23N5O3S/c1-28-11-17-23-24-20(29-17)22-18(26)14-8-6-13(7-9-14)10-25-12-21-16-5-3-2-4-15(16)19(25)27/h2-5,12-14H,6-11H2,1H3,(H,22,24,26)/t13-,14-
> InChI=1S/C20H33N5O3S/c1-28-11-17-23-24-20(29-17)22-18(26)14-8-6-13(7-9-14)10-25-12-21-16-5-3-2-4-15(16)19(25)27/h13-17,21,23H,2-12H2,1H3,(H,22,24,26)/t13-,14-,15?,16?,17?
>
> COC(=O)c1[nH]/c(=N\C(=O)[C@H]2CC[C@H](Cn3cnc4ccccc4c3=O)CC2)sc1C(C)C
> KNDSLDLCZNAXPK-IYBDPMFKSA-N KNDSLDLCZNAXPK-WKILWMFISA-N
> InChI=1S/C24H28N4O4S/c1-14(2)20-19(23(31)32-3)26-24(33-20)27-21(29)16-10-8-15(9-11-16)12-28-13-25-18-7-5-4-6-17(18)22(28)30/h4-7,13-16H,8-12H2,1-3H3,(H,26,27,29)/t15-,16+
> InChI=1S/C24H28N4O4S/c1-14(2)20-19(23(31)32-3)26-24(33-20)27-21(29)16-10-8-15(9-11-16)12-28-13-25-18-7-5-4-6-17(18)22(28)30/h4-7,13-16H,8-12H2,1-3H3,(H,26,27,29)/t15-,16-
> InChI=1S/C24H38N4O4S/c1-14(2)20-19(23(31)32-3)26-24(33-20)27-21(29)16-10-8-15(9-11-16)12-28-13-25-18-7-5-4-6-17(18)22(28)30/h14-20,25H,4-13H2,1-3H3,(H,26,27,29)/t15-,16-,17?,18?,19?,20?
>
> CC(C)[C@H]1CC[C@H](C(=O)N[C@H](Cc2ccccc2)C(=O)/N=c2\[nH]ncs2)CC1
> OKTRHZCAACPPLC-FGTMMUONSA-N OKTRHZCAACPPLC-KZNAEPCWSA-N
> InChI=1S/C21H28N4O2S/c1-14(2)16-8-10-17(11-9-16)19(26)23-18(12-15-6-4-3-5-7-15)20(27)24-21-25-22-13-28-21/h3-7,13-14,16-18H,8-12H2,1-2H3,(H,23,26)(H,24,25,27)/

Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-19 Thread Markus Sitzmann
A database can have several definitions of unique for anything - a
structure database can have this, too. If you have a chemical compound
which can form 10 different tautomers, you can represent the compound
by 10 chemical structures (it is still the same compound, though). So,
if you define uniqueness on the basis of the chemical compound, you have
one db entry and this one entry has a single (tautomer-sensitive) InChI
covering 10 chemical structures; if you define uniqueness on the basis of
tautomers/chemical structures (because all are relevant, for instance,
in NMR spectroscopy) you have (and want) 10 database entries, each with
a single (tautomer-sensitive) InChI. Two definitions of unique.

So my sentence still stands: a chemical structure must calculate a
unique InChI, but an InChI might cover more than one chemical
structure.

On Thu, Feb 19, 2015 at 3:37 PM, Dimitri Maziuk  wrote:
> On 2015-02-19 07:27, Markus Sitzmann wrote:
>>
>> No, a chemical structure must calculate a unique InChI, but an InChI
>> might cover more than one chemical structure
>
>
> Heh. I could swear last time I read the description it specifically
> mentioned databases. In the database context 'unique' has a specific
> well-defined meaning and that is *not* 'more than one'. Now I don't see it
> in the official blurbs, only pikiwedia mentions databases.
>
>> ... there is no precise, universally valid
>> definition for "unique molecule".
>
>
> "On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the
> machine wrong figures, will the right answers come out?' I am not able
> rightly to apprehend the kind of confusion of ideas that could provoke such
> a question."
>
> Works for 'undefined figures', too.
>
> Dimitri
>
>



Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-19 Thread Markus Sitzmann
No, a chemical structure must calculate a unique InChI, but an InChI
might cover more than one chemical structure (because there are
molecules that can be described by more than one chemical structure).
And a chemical formula might be the most accurate (unique) description
you have for a molecule (admittedly, unlikely today); that is
why the InChI is layered. By adding and removing layers, InChI lets
you choose how precisely you want to define uniqueness - that is important
with molecules because there is no precise, universally valid
definition for "unique molecule".
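
To make the layer idea concrete, here is a rough sketch in plain Python that reduces an InChI to a chosen subset of layers and uses the result as a grouping key. The function name `keep_layers` is made up; the filter is deliberately simplistic (it ignores the special nesting of isotopic and fixed-H sublayers), and the truncated string should be treated as a lookup key only, not as a valid Standard InChI. The two inputs are the 2014_09_2 and 2015.03.1pre results for the first molecule in JP's list, which differ only in the /t layer.

```python
def keep_layers(inchi, prefixes="ch"):
    """Reduce `inchi` to version, formula and the layers whose one-letter
    prefix appears in `prefixes` (for example "ch" or "chbtms")."""
    parts = inchi.split("/")
    head, layers = parts[:2], parts[2:]
    kept = [layer for layer in layers if layer and layer[0] in prefixes]
    return "/".join(head + kept)


COMMON = (
    "InChI=1S/C18H19N5O2S/c24-16(21-18-22-20-11-26-18)13-7-5-12(6-8-13)"
    "9-23-10-19-15-4-2-1-3-14(15)17(23)25/h1-4,10-13H,5-9H2,(H,21,22,24)"
)
INCHI_A = COMMON + "/t12-,13+"
INCHI_B = COMMON + "/t12-,13-"

if __name__ == "__main__":
    print(INCHI_A == INCHI_B)                            # full identifiers differ
    print(keep_layers(INCHI_A) == keep_layers(INCHI_B))  # same without stereo
    print(keep_layers(INCHI_A, prefixes=""))             # formula-only key
```

Which set of layers to keep is exactly the choice of uniqueness definition: connectivity only, connectivity plus stereo, or just the formula.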

On Thu, Feb 19, 2015 at 2:06 PM, Dimitri Maziuk  wrote:
> On 2015-02-19 05:58, Greg Landrum wrote:
>>
>> On Thu, Feb 19, 2015 at 10:11 AM, Markus Sitzmann wrote:
>>
>> Well, at least you said something important: "conversion of InChI to
>> molecules is something that's not in general guaranteed to work
>> perfectly" - and this is by design like this because InChI is an
>> identifier, not a molecule representation. Unfortunately, many people
>> seemed to forget about this :-)
>>
>>
>> Yes, yes they do.
>
> Well unfortunately inchi states they're a 'unique identifier' which
> means there must be 1 inchi for 1 molecule and it *should* work
> perfectly. And then they say the only required 'layer' is the formula
> which means a) it's not unique and b) how is "InChi=formula" better than
> just "formula"? D'uh.
>
> Dimitri
>
>



Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-19 Thread Markus Sitzmann
Well, at least you said something important: "conversion of InChI to
molecules is something that's not in general guaranteed to work
perfectly" - and this is by design, because InChI is an
identifier, not a molecule representation. Unfortunately, many people
seem to forget about this :-)

On Thu, Feb 19, 2015 at 6:59 AM, Greg Landrum  wrote:
>
> On Wed, Feb 18, 2015 at 7:01 PM, Igor Filippov 
> wrote:
>>
>> > update the bug report and work on tracking down the wrong problem
>>
>> That's how I sometimes do it too... ;)
>
>
> I'll leave it as an exercise to the reader to decide if that was
> intentional, the fault of auto-correct, or just because it had been a long
> day. ;-)
>
> -greg
>
>



Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-18 Thread Markus Sitzmann
I agree with John, the InChI for mol1 and mol2 should be

http://cactus.nci.nih.gov/chemical/structure/O=C(NCCc1ccccc1)[C@H]1CC[C@H](Cn2c(O)nc3ccccc3c2=O)CC1/stdinchi

InChI=1S/C24H27N3O3/c28-22(25-15-14-17-6-2-1-3-7-17)19-12-10-18(11-13-19)16-27-23(29)20-8-4-5-9-21(20)26-24(27)30/h1-9,18-19H,10-16H2,(H,25,28)(H,26,30)/t18-,19-

So the + at the end should be a -
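
A small helper makes this kind of comparison less error-prone than reading the strings by eye. The sketch below is plain Python; the names `tetrahedral_parities` and `parity_differences` are made up, and it assumes a single-component InChI and reads only the first /t layer. The second test string is not copied from the notebook: it is the correct InChI above with the last sign flipped to "+", reconstructed from the remark that the "+" at the end should be a "-".

```python
def tetrahedral_parities(inchi):
    """Map atom number -> parity sign ('-', '+' or '?') from the first /t layer."""
    for layer in inchi.split("/")[2:]:
        if layer.startswith("t"):
            return {int(item[:-1]): item[-1] for item in layer[1:].split(",")}
    return {}


def parity_differences(inchi_a, inchi_b):
    """Return {atom: (sign_in_a, sign_in_b)} for stereocentres that differ."""
    a = tetrahedral_parities(inchi_a)
    b = tetrahedral_parities(inchi_b)
    return {
        atom: (a.get(atom), b.get(atom))
        for atom in sorted(set(a) | set(b))
        if a.get(atom) != b.get(atom)
    }


BASE = (
    "InChI=1S/C24H27N3O3/c28-22(25-15-14-17-6-2-1-3-7-17)19-12-10-18(11-13-19)"
    "16-27-23(29)20-8-4-5-9-21(20)26-24(27)30/h1-9,18-19H,10-16H2,"
    "(H,25,28)(H,26,30)"
)
EXPECTED = BASE + "/t18-,19-"
# Assumed form of the problematic InChI: same string with the final sign flipped.
SUSPECT = BASE + "/t18-,19+"

if __name__ == "__main__":
    print(parity_differences(EXPECTED, SUSPECT))
```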

Markus

On Wed, Feb 18, 2015 at 2:53 PM, John M  wrote:
> Hi Greg,
>
> I believe it's an RDKitMol -> InChI issue rather than InChI -> RDKitMol. The
> correct InChI (below) is different from that in the iPython listing.
>
> InChI=1S/C24H27N3O3/c28-22(25-15-14-17-6-2-1-3-7-17)19-12-10-18(11-13-19)16-27-23(29)20-8-4-5-9-21(20)26-24(27)30/h1-9,18-19H,10-16H2,(H,25,28)(H,26,30)/t18-,19-
>
> J
>
>
> Regards,
> John W May
> john.wilkinson...@gmail.com
>
> On 18 February 2015 at 04:57, Greg Landrum  wrote:
>>
>> JP,
>>
>> Looks like that's a bug in the way ring stereochemistry is handled while
>> translating the InChI back into a molecule.
>>
>> It's reproducible with a small example:
>> In [1]: from rdkit import Chem
>>
>> In [2]: mol1 = Chem.MolFromSmiles("C[C@H]1CC[C@H](O)CC1")
>>
>> In [3]: Chem.MolToSmiles(mol1,True)
>> Out[3]: 'C[C@H]1CC[C@H](O)CC1'
>>
>> In [4]: inchi = Chem.MolToInchi(mol1)
>>
>> In [5]: mol2 = Chem.MolFromInchi(inchi)
>>
>> In [6]: Chem.MolToSmiles(mol2,True)
>> Out[6]: 'C[C@H]1CC[C@@H](O)CC1'
>>
>> Conversion of InChI to molecules is something that's not in general
>> guaranteed to work perfectly, but I will go ahead and create a bug report.
>>
>> -greg
>>
>>
>>
>> On Tue, Feb 17, 2015 at 2:50 PM, JP  wrote:
>>>
>>> Hi there,
>>>
>>> I have a question for the 3D enabled of you (I wish the world looked like
>>> GTA2 !)
>>>
>>> I am seeing a case of an RDKit mol -> Inchi -> RDKit mol, that I think is
>>> changing the  stereochemistry of the molecule.  I have 12 example-pairs
>>> where this happens (but all very structurally similar).  I don't care much
>>> that the last rdkit molecule is a different tautomer than the starting one -
>>> but if this is the case the stereochemistry should still be conserved, no?
>>>
>>> I did an ipython notebook (most useful tool of the decade after RDKit?)
>>> gist here:
>>>
>>>
>>> http://nbviewer.ipython.org/urls/gist.githubusercontent.com/anonymous/7c158926a0f3bf9a4978/raw/d91cc808ac91eccc8bf0e45d9eacd2af382e5105/gistfile1.txt
>>>
>>> I appreciate if anyone could shed some light.  I'd just like to
>>> understand.
>>>
>>> Thank you for your time!
>>>
>>> -
>>> JP
>>>
>>>
>>
>>
>>
>
>

Re: [Rdkit-discuss] New RDKit drawing code

2015-02-14 Thread Markus Sitzmann
Hi Greg and all the others involved,

That looks really nice! And don't give any code to Noel anymore, it all ends up 
in JavaScript :-) (who would have thought 10 years ago that would make any 
sense).

Best,
Markus

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 14.02.2015, at 08:01, Greg Landrum  wrote:
> 
> Dear all,
> 
> Noel's great blog post on using the RDKit from emscripten 
> (http://baoilleach.blogspot.ch/2015/02/cheminformaticsjs-rdkit.html) made me 
> realize that I should post something here about the new RDKit drawing code 
> that's currently available in github.
> 
> Rather than do a long email message, I did a quick blog post that 
> demonstrates some of the functionality:
> http://rdkit.blogspot.com/2015/02/new-drawing-code.html
> 
> I'm still actively working on this, but I think what's there is already worth 
> showing off a bit. :-)
> 
> Many thanks are due to Dave Cosgrove, who did the initial work that makes 
> this all possible.
> 
> -greg
> 


Re: [Rdkit-discuss] MaxMin Picker and Python

2014-07-16 Thread Markus Sitzmann
Hi Matt,

maybe squeeze these two lines

zims = [x for x in Chem.ForwardSDMolSupplier(gzip.open('a.sdf.gz')) if x is not None]

zims_fps=[AllChem.GetMorganFingerprintAsBitVect(x,2) for x in zims]

into one:

zims_fps=[AllChem.GetMorganFingerprintAsBitVect(x,2) for x in Chem.ForwardSDMolSupplier(gzip.open('a.sdf.gz')) if x is not None]

because zims keeps the whole file in memory for no good reason  :-)
(is that sdf.gz big?)

Markus
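
To put rough numbers on the memory use Matt reports: the script builds a condensed distance matrix holding one value per pair of molecules. The estimate below is back-of-the-envelope only; the per-value sizes (8 bytes per element of a NumPy float64 array, roughly 32 bytes per entry of a Python list of floats on 64-bit CPython) are assumptions, and condensed_size is a hypothetical helper name.

```python
def condensed_size(n):
    """Number of pairwise distances among n items (one triangle of the matrix)."""
    return n * (n - 1) // 2


n_mols = 26000
n_pairs = condensed_size(n_mols)

# Assumed per-entry costs on 64-bit CPython
BYTES_FLOAT64 = 8        # one element of a NumPy float64 array
BYTES_LIST_FLOAT = 32    # 8-byte list slot + 24-byte float object

gb_as_array = round(n_pairs * BYTES_FLOAT64 / 1e9, 1)
gb_as_list = round(n_pairs * BYTES_LIST_FLOAT / 1e9, 1)

print(n_pairs)       # 337987000
print(gb_as_array)   # 2.7
print(gb_as_list)    # 10.8
```

Under these assumptions the Python list of distances is already expensive before it is converted to an array, on top of the molecules and fingerprints held in memory.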

On Thu, Jul 17, 2014 at 12:43 AM, Matthew Lardy  wrote:
> Hi Igor,
>
> Thanks!  Maybe I am a throwback, but I prefer the command line to a GUI.
> Still I'll give it a whirl!  :)
>
> If you are handling millions of molecules without issue; then my Python
> skills are really, really, rusty.  Or, I shouldn't be using Python to handle
> this much data.  :)
>
> Thanks for the info!
> Matt
>
>
> On Wed, Jul 16, 2014 at 3:31 PM, Igor Filippov 
> wrote:
>>
>> Matthew,
>>
>> Two lines of shameless self-promotion:
>> This is exactly the kind of problem for Diversity Genie -
>> http://www.diversitygenie.com/
>> It is using RDKit library underneath, but wraps it in a simple, easy to
>> use GUI front-end.
>>
>> Best regards,
>> Igor
>>
>>
>> On Wed, Jul 16, 2014 at 6:18 PM, Matthew Lardy  wrote:
>>>
>>> Hi all,
>>>
>>> I have been playing with the diversity selection in RDKit.  I am running
>>> through a set of ~26,000 molecules to pick a set of 200 diverse molecules.
>>> I saw some examples of how to do this in Python (my variant of their script
>>> below), but the memory consumption is massive.  I burned through ~15GB of
>>> memory before I killed it off.  Is this about what others have seen, or
>>> should I move to doing this in C++ or Java (assuming that others have seen a
>>> significantly lower level of memory consumption)?
>>>
>>> Here is the script:
>>>
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> from rdkit import DataStructs
>>> import gzip
>>> from numpy import array
>>> from rdkit.Chem import Draw
>>> from rdkit.SimDivFilters import rdSimDivPickers
>>>
>>> zims = [x for x in Chem.ForwardSDMolSupplier(gzip.open('a.sdf.gz')) if x is not None]
>>>
>>> zims_fps=[AllChem.GetMorganFingerprintAsBitVect(x,2) for x in zims]
>>>
>>> dm=[]
>>> for i,fp in enumerate(zims_fps[:26000]): # only 1000 in the demo (in the interest of time)
>>>     dm.extend(DataStructs.BulkTanimotoSimilarity(fp,zims_fps[i+1:26000],returnDistance=True))
>>> dm = array(dm)
>>> picker = rdSimDivPickers.MaxMinPicker()
>>> ids = picker.Pick(dm,26000,200)
>>> list(ids[:200])
>>>
>>>
>>> Thanks in advance!
>>> Matt
>>>
>>>
>>
>
>


[Rdkit-discuss] Fwd: Tautomeric InChIs

2014-05-08 Thread Markus Sitzmann
-- Forwarded message --
From: Markus Sitzmann 
Date: Thu, May 8, 2014 at 3:27 PM
Subject: Re: [Rdkit-discuss] Tautomeric InChIs
To: Edward Pyzer-Knapp 


Hi Edward,

since your InChI is a Standard InChI ("1S/"): tautomeric forms are
purposely *not* preserved by Standard InChI - that's why we created
Standard InChI (with non-standard InChIs it is another story, those
you can make tautomer-sensitive or insensitive). And actually many
people complain that Standard InChI falls short in some cases
regarding tautomer normalization :-).

Best,
Markus
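
Both points can be read off the InChI string itself: the "1S" version prefix marks a Standard InChI, and a parenthesised "(H,...)" group in the /h layer lists atoms that share a mobile hydrogen, which is how tautomeric positions get merged. A minimal sketch in plain Python (no RDKit; the function names are hypothetical, and the regular expression only covers the simple "(H,a,b)" and "(H2,a,b,c)" style groups):

```python
import re

# Matches groups such as (H,18,20) or (H2,1,2,3) inside the /h layer
MOBILE_H = re.compile(r"\(H(\d*)-?((?:,\d+)+)\)")


def is_standard_inchi(inchi):
    """Standard InChI strings start with the version prefix 'InChI=1S/'."""
    return inchi.startswith("InChI=1S/")


def mobile_h_groups(inchi):
    """Return [(n_hydrogens, [atom numbers]), ...] found in the /h layer."""
    # Skip the version and formula segments, then look for the h layer.
    for segment in inchi.split("/")[2:]:
        if segment.startswith("h"):
            return [
                (int(count or 1), [int(a) for a in atoms.strip(",").split(",")])
                for count, atoms in MOBILE_H.findall(segment)
            ]
    return []


# InChI quoted by Edward below
inchi = ("InChI=1S/C12H2Cl2F2N2O2S2/c13-3-6-8(21-10(3)15)2(12(20)18-6)"
         "4-7(19)1-5(17-4)11(16)22-9(1)14/h17H,(H,18,20)")

print(is_standard_inchi(inchi))
print(mobile_h_groups(inchi))
```

For this string the sketch reports one hydrogen shared between atoms 18 and 20, which appears to correspond to the C=O / C-N-H change Edward describes.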

On Thu, May 8, 2014 at 3:16 PM, Edward Pyzer-Knapp
 wrote:
> Hi all,
>
> I have been playing around with RDKIT for a while now - great work guys!
>
> I have recently hit an issue when using InChIs:
>
> When generating both inchi and smiles from a rdkit Mol, I get two different
> structures, even if I use the smiles as an input for the inchi generation.
>
> An example:
>
> smiles = "[H]N1C(=O)C(=C2C(=O)c3c(Cl)sc(F)c3N2[H])c2sc(F)c(Cl)c21" (I should
> add this smiles was generated by RDKIT, from a Mol file)
>
> mol = MolFromSmiles(smiles)
> inchi = MolToInchi(mol)
>
> print inchi
> InChI=1S/C12H2Cl2F2N2O2S2/c13-3-6-8(21-10(3)15)2(12(20)18-6)4-7(19)1-5(17-4)11(16)22-9(1)14/h17H,(H,18,20)
>
> when comparing the smiles and the inchi, the C=O has changed to an OH and a
> C-N-H  has changed to a C=N.  I realise that these are tautomers of each
> other, but surely the tautomeric form should be preserved when interchanging
> smiles to inchi? Since at the moment, going Smiles->Inchi->Smiles does NOT
> result in the original smiles...
>
> There is a layer in the INCHI standard which would allow description of
> this, is there a way to turn that on?
>
> Many Thanks,
>
> Ed Pyzer-Knapp
>


[Rdkit-discuss] Fwd: RDKit cartridge similarity search speeds(?)

2014-05-08 Thread Markus Sitzmann
-- Forwarded message --
From: Markus Sitzmann 
Date: Thu, May 8, 2014 at 3:14 PM
Subject: Re: [Rdkit-discuss] RDKit cartridge similarity search speeds(?)
To: James Davidson 


Hi James,

I would guess, in your second query, "morganbv_fp('c1nnccc1'::mol, 2)"
has to be calculated for each row you are scanning because from the
database's perspective the result is unpredictable (although it is
not), so it can not be optimized so easily. All of this is avoided in
your first query, the calculation is done once before the table scan
and then the actual index/table scan is a rather simple one.

Markus

On Thu, May 8, 2014 at 2:35 PM, James Davidson  wrote:
> Dear All,
>
>
>
> I have recently been spending a bit more time with the RDKit cartridge, and
> have what is probably a very naïve question…
>
> Having built some RDKit fingerprints for ChEMBL_18, I see the following
> behaviour (for clarification – ‘ecfp4_bv’ is the column in my rdk.fps table
> that has been generated using morganbv_fp(mol, 2)):
>
>
>
>
>
> chembl_18=# \timing on
> Timing is on.
>
> chembl_18=# set rdkit.tanimoto_threshold=0.5;
> SET
> Time: 0.167 ms
>
> chembl_18=# select chembl_id from rdk.fps where ecfp4_bv % morganbv_fp('c1nnccc1'::mol,2);
>   chembl_id
> -------------
> CHEMBL15719
> (1 row)
>
> Time: 2033.348 ms
>
> chembl_18=# select chembl_id from rdk.fps where tanimoto_sml(ecfp4_bv, morganbv_fp('c1nnccc1'::mol, 2)) > 0.5;
>   chembl_id
> -------------
> CHEMBL15719
> (1 row)
>
> Time: 6843.605 ms
>
>
> I can see that the query plans are different in the two cases, but I don’t
> fully understand why – see below:
>
>
>
> QUERY 1 (with explain analyze)
>
> chembl_18=# explain analyze select chembl_id from rdk.fps where ecfp4_bv % morganbv_fp('c1nnccc1'::mol,2);
>
> QUERY PLAN
> ----------
> Bitmap Heap Scan on fps  (cost=106.91..5298.31 rows=1352 width=13) (actual time=1774.986..1774.987 rows=1 loops=1)
>    Recheck Cond: (ecfp4_bv % '\x0100084200048204'::bfp)
>    ->  Bitmap Index Scan on fps_ecfp4bv_idx  (cost=0.00..106.57 rows=1352 width=0) (actual time=1774.969..1774.969 rows=1 loops=1)
>          Index Cond: (ecfp4_bv % '\x0100084200048204'::bfp)
> Total runtime: 1775.035 ms
> (5 rows)
>
> Time: 1776.133 ms
>
> QUERY 2 (with explain analyze)
>
> chembl_18=# explain analyze select chembl_id from rdk.fps where tanimoto_sml(ecfp4_bv, morganbv_fp('c1nnccc1'::mol, 2)) > 0.5;
>
> QUERY PLAN
> ----------
> Seq Scan on fps  (cost=0.00..388808.17 rows=450793 width=13) (actual time=1278.115..6953.977 rows=1 loops=1)
>    Filter: (tanimoto_sml(ecfp4_bv, '\x0100084200048204'::bfp) > 0.5::double precision)
>    Rows Removed by Filter: 1352377
> Total runtime: 6954.010 ms
> (4 rows)
>
> Time: 6955.103 ms
>
>
>
>
>
> It seems conceptually ‘easier’ to add the similarity value as part of the
> query, rather than setting it as a variable ahead of the query; but clearly
> I should be doing it the latter way for performance reasons.  So even if I
> don’t fully understand why at the moment, am I correct in thinking that
> queries of this sort should always be run with the similarity operators (%,
> #)?  And if so, is the rdkit.tanimoto_threshold variable set at the level of
> the session, the user, or the database?
>
>
>
> Kind regards
>
>
>
> James
>
>

Re: [Rdkit-discuss] implementation of tautomer enumeration/canonicalization

2014-04-03 Thread Markus Sitzmann
Hello,

I was about to ask the same (I am one of the authors of the mentioned
paper) - I had seen this post (gosh, a year ago) but had no time back
then to answer (job search and a move from the US to Europe).

I was digging into this a bit last week; however, I cannot say much
yet - very initial work. If something comes out of it, I would
contribute it to RDKit. Well, if somebody has already done it, I am
happy, too. Or we join forces (however, for me it is only some private
hacking with not so much time).

Markus

On Wed, Apr 2, 2014 at 11:32 AM, Dave W  wrote:
> Hi Markus and all,
>
> Did you or anyone else end up coding this up? I am looking into doing it
> myself, but if it's already been done...
>
> Many thanks,
> Dave
>
>
>


Re: [Rdkit-discuss] Three tests failing on CentOS 5.3 - important ?

2014-03-19 Thread Markus Sitzmann
I think the syntax "except Exception as e:" didn't exist before Python
2.6 ... are you running this on an older version? :-)

Cheers,
Markus
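
For reference, the old comma form ("except Exception, e:") is the only spelling Python 2.5 and earlier accept; the "as" form was added in 2.6 and is the only one valid in Python 3. The sketch below runs on any interpreter that understands "as"; supports_except_as and parse_int are hypothetical helpers for illustration.

```python
import sys


def supports_except_as(version_info=sys.version_info):
    """The 'except X as e' syntax needs Python 2.6 or newer."""
    return tuple(version_info[:2]) >= (2, 6)


def parse_int(text):
    """Convert text to int, reporting the exception type on failure."""
    try:
        return int(text)
    except ValueError as err:   # Python <= 2.5 would need: except ValueError, err:
        return "failed: %s" % type(err).__name__


print(supports_except_as((2, 4)))   # False
print(supports_except_as((2, 6)))   # True
print(parse_int("x"))               # failed: ValueError
```

The "in ?" frames in the quoted tracebacks are how Python 2.4 and earlier labelled module-level code, which suggests the failing tests were run by an older interpreter than python26.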

On Wed, Mar 19, 2014 at 7:54 AM, Jan Holst Jensen  
wrote:
> On 2014-03-19 05:54, Greg Landrum wrote:
>
>
> On Tue, Mar 18, 2014 at 4:59 PM, Jan Holst Jensen 
> wrote:
>>
>> Hi RDKitters,
>>
>> I managed to get RDKit 2013_09_2 built on CentOS 5.3. Will post a short
>> recipe later.
>>
>
> Wow; that's an ancient version.
>
>
> Yup. Approaching archaeology-region here.
>
>
>
>>
>> Right now, I am still left with three tests that fail, but I think (hope)
>> that I can live with that ? Failing tests are:
>>
>> 72:pythonTestDbCLI
>> 73:pythonTestDirML
>> 78:pythonTestDirChem
>>
>> The test log shows that test 72 won't run because of missing SQLite
>> support.
>>
>> 72/78 Testing: pythonTestDbCLI
>> 72/78 Test: pythonTestDbCLI
>> Command: "/usr/bin/python26"
>> "/u01/software/RDKit_2013_09_2/Projects/test_list.py" "--testDir"
>> "/u01/software/RDKit_2013_09_2/Projects"
>> Directory: /u01/software/RDKit_2013_09_2/build/Projects
>> "pythonTestDbCLI" start time: Mar 05 11:07 CET
>> Output:
>> --
>> Traceback (most recent call last):
>>   File "TestDbCLI.py", line 9, in ?
>> from rdkit.Dbase.DbConnection import DbConnect
>>   File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbConnection.py", line
>> 21, in ?
>> from rdkit.Dbase import DbUtils,DbInfo
>>   File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbUtils.py", line 17, in
>> ?
>> from rdkit.Dbase.DbResultSet import
>> DbResultSet,RandomAccessDbResultSet
>>   File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbResultSet.py", line
>> 12, in ?
>> from rdkit.Dbase import DbInfo
>>   File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbInfo.py", line 12, in
>> ?
>> import DbModule
>>   File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbModule.py", line 61,
>> in ?
>> raise ImportError,"Neither sqlite nor PgSQL support found."
>> ImportError: Neither sqlite nor PgSQL support found.
>>
>>
>> A bit puzzling, since I can "import sqlite3" just fine from Python when
>> run interactively. As far as I can understand RDConfig.py a successful
>> import of sqlite3 should make it report that SQLite support is available ?
>> For my purposes, failing this test is probably fine - I don't expect I need
>> sqlite support on this machine.
>
>
> It's probably not important unless you are planning on using the DbCLI code.
> If you want to try and track it down: can you do "from rdkit.Dbase import
> DbModule"?
>
>
> Goes just fine interactively.
>
> [oracle@localhost ~]$ python26
> Python 2.6.8 (unknown, Nov  7 2012, 14:47:45)
> [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from rdkit.Dbase import DbModule
> >>> quit()
> [oracle@localhost ~]$
>
> Well, let's leave it at that. Since I am not planning to use the DbCLI code
> at the moment I am OK with it.
>
>
>
>
>>
>> Test 73 has this in the log:
>>
>> Traceback (most recent call last):
>>   File "UnitTestBuildComposite.py", line 16, in ?
>> from rdkit.ML import BuildComposite
>>   File "/u01/software/RDKit_2013_09_2/rdkit/ML/BuildComposite.py", line
>> 203, in ?
>> from rdkit.ML.Composite import Composite,BayesComposite
>>   File "/u01/software/RDKit_2013_09_2/rdkit/ML/Composite/Composite.py",
>> line 25, in ?
>> from rdkit.ML.Data import DataUtils
>>   File "/u01/software/RDKit_2013_09_2/rdkit/ML/Data/DataUtils.py", line
>> 57, in ?
>> from rdkit.ML.Data import MLData
>>   File "/u01/software/RDKit_2013_09_2/rdkit/ML/Data/MLData.py", line 8, in
>> ?
>> import numpy
>> ImportError: No module named numpy
>> ...
>>
>> and test 78:
>>
>> Output:
>> --
>>   File "UnitTestInchi.py", line 187
>> except InchiReadWriteError as inst:
>> ^
>> SyntaxError: invalid syntax
>>   File "PandasTools.py", line 100
>> except Exception as e:
>>   ^
>> SyntaxError: invalid syntax
>> Traceback (most recent call last):
>>   File "UnitTestEState.py", line 17, in ?
>> import numpy
>> ImportError: No module named numpy
>> Traceback (most recent call last):
>>   File "UnitTestFingerprints.py", line 17, in ?
>> import numpy
>> ImportError: No module named numpy
>> ...
>>
>>
>> The numpy module loads fine when run interactively. So maybe it is
>> something else that is wrong - just that the error reported from Python is a
>> bit misleading (?).
>
>
> That's a strange one, but if you can import numpy and rdkit.Chem, then I
> wouldn't be concerned about it.
> Again, if you're interested in trying to track it down, there are some
> experiments we can do.
>
>>
>>
>> I haven't run into stuff that doesn't work yet because of these test
>> failures, so I think that I can get by without them passing. But it would be
>> nice to know

Re: [Rdkit-discuss] docker.io - container for fully fledged rdkit installation on linux?

2013-11-27 Thread Markus Sitzmann
It is basically a VM that can be scripted from the host system. The VM
client can be preconfigured with anything your software depends on
(including databases etc.) and can be based on an arbitrary Linux
distribution, independent of the Linux distribution of the host.

On Wed, Nov 27, 2013 at 4:20 PM, Igor Filippov
 wrote:
> Not to criticize or anything, but I've seen this issue quite a few times -
> perhaps the problem
> is actually with me and everybody else is "in the know"?
>
> I've spent last few minutes clicking around Docker website, I still cannot
> figure out what it is and what it does?
> I found that it runs on all Linux builds, that the latest release is a work
> of 130 people, that there are Trusted Builds and Docker Hack Days.
> But I still cannot puzzle out what it does!!!
>
> Would it kill the project maintainers to put a few words somewhere on the
> top of the website what the software is actually all about?
>
> Igor
>
> P.S. I finally found some clues under "Learn More" link. I guess the point
> is only those who already know or the really persistent ones or the ones
> with
> time to spare need to bother.
>
>
>
>
>
>
> On Wed, Nov 27, 2013 at 8:09 AM, Samo Turk  wrote:
>>
>> Hi rdkitters,
>>
>> New release of Docker is available and it brings one very important
>> improvement - it runs on any linux distribution (as long as the kernel is
>> 3.8 or later). I updated "RDKit Dockerfile" so it builds everything on top
>> of Ubuntu 13.10 base image. To build the container do:
>> "git clone https://gist.github.com/6669650.git ."
>> "mv Dockerfile-rdkit Dockerfile"
>> "sudo docker build -t rdkit ."
>>
>> Run it with:
>> "sudo docker run -p 127.0.0.1:8889: rdkit"
>> and IPython notebook will be available on http://127.0.0.1:8889/
>>
>> Regards,
>> Samo
>>
>>
>> On Tue, Sep 24, 2013 at 9:08 AM,  wrote:
>>>
>>>
>>> I also highly appreciate your efforts!
>>>
>>>
>>> Cheers,
>>> Paul
>>>
>>>
>>> > Stuff like this that makes it easier for people to access/use the
>>> > RDKit is great!
>>> >
>>> > The more options we have the better.
>>> >
>>> > Many thanks to you guys for looking into this stuff. :-)
>>> >
>>> > -greg
>>> >
>>> >
>>>
>>> > Interesting stuff, looks promising!
>>> > Got pulled in so I created a Dockerfile that builds an image with
>>> > rdkit, ipython and matplotlib. Once the image is built it runs
>>> > ipython notebook server. You can find the source here:
>>> > https://gist.github.com/samoturk/6669650
>>> > Just follow instructions in the first few lines of the Dockerfile to
>>> > build and run it..
>>> >
>>> > Regards,
>>> > Samo
>>> >
>>>
>>>
>>>
