Re: [Rdkit-discuss] computing molecular descriptors for a small molecule

2012-10-15 Thread Francois Berenger
On 10/15/2012 04:01 PM, paul.czodrow...@merckgroup.com wrote: this one should work: http://code.google.com/p/rdkit/wiki/descriptor_calculation Thanks! I found this one in the wiki also: http://code.google.com/p/rdkit/wiki/DescriptorsInTheRDKit I'll try to start using them. Regards, F.

Re: [Rdkit-discuss] aligning maximum common substructure of 2 molecules

2017-02-20 Thread Francois BERENGER
At least for the MCS calculation, there is an R package for chemistry: https://bioconductor.org/packages/release/bioc/vignettes/fmcsR/inst/doc/fmcsR.html On 02/19/2017 07:33 PM, Thomas Evangelidis wrote: > Dear all, > > I want to align 250 compounds that bind to the same pocket to one of >

Re: [Rdkit-discuss] numpy array to bit vector

2017-03-16 Thread Francois BERENGER
Hi, Here is a Python script that was created with the help of some rdkit wizards: https://github.com/UnixJunkie/mol2ecfp4 It works with unfolded ECFP4 fingerprints, so not exactly what you are looking for. There would be more modifications needed in order to fold the fingerprint to the desired

Re: [Rdkit-discuss] numpy array to bit vector

2017-03-16 Thread Francois BERENGER
I'll send a Python script. It works for .smi files. If anyone can adapt it to work on sdf files, that would be wonderful. Just give me 5 minutes to put it on github. On 03/16/2017 09:28 AM, Thomas Evangelidis wrote: > Hello, > > I created a numpy array from a molecule using the following function: > >

[Rdkit-discuss] official Tripos MOL2 file format PDF document

2017-04-11 Thread Francois BERENGER
Hello, Not directly related to rdkit, but if someone who has the original PDF of this file format could place it online permanently, that would be wonderful. The official URL at tripos.com has apparently been dead for quite some time. And that's bad because it's quite a popular file format and its

Re: [Rdkit-discuss] can't kekulize molecule

2017-08-16 Thread Francois BERENGER
On 08/16/2017 03:36 PM, Greg Landrum wrote: Hi Shuai, The RDKit Mol2 parser is really only validated for the atom types generated by corina. I'm not surprised that the output from open babel would not be understood. This is documented:

[Rdkit-discuss] Is it possible to compute pi and sigma partial charges with rdkit?

2017-07-25 Thread Francois BERENGER
Hello, Is it possible to decompose partial charges with rdkit? I am afraid that Gasteiger-Marsili (PEOE) is mostly about sigma bonds, but I might be wrong. Regards, F.

Re: [Rdkit-discuss] position restraints on all atoms

2017-07-25 Thread Francois BERENGER
On 07/25/2017 05:45 AM, Katrina Lexa wrote: Hi All, I'm relatively new to RDKit, so I apologize for what may be a silly question. I'd like to generate a set of local minimum conformations around my input conformation, using a set of defined flat bottom potentials (0.2, 0.6, 1.0, and 1.4), in

Re: [Rdkit-discuss] can't kekulize molecule

2017-08-16 Thread Francois BERENGER
On 08/16/2017 06:14 PM, Greg Landrum wrote: On Wed, Aug 16, 2017 at 3:55 AM, Francois BERENGER <beren...@bioreg.kyushu-u.ac.jp> wrote: On 08/16/2017 03:36 PM, Greg Landrum wrote: The RDKit Mol2 parser is really only validated

Re: [Rdkit-discuss] troubles going from 2D to 3D

2017-08-16 Thread Francois BERENGER
On 08/17/2017 03:19 AM, Bennion, Brian wrote: Hello All, I am parsing a set of 2D sd files in rdkit in order to generate a 3D structure. The code is below and is based on what I could find on the list for errors in generating 3D coordinates. Temp.mol is the downloaded molfile from chembl

[Rdkit-discuss] Is there a Ubuntu ppa or some repository with the latest rdkit release as .deb ?

2017-06-21 Thread Francois BERENGER
Hello, I'd like to install rdkit system-wide. However, I'd like the install to use a regular system package since rdkit is available in my distro. I don't like system-wide installs from sources, because they tend to install things in different places than the binary package does, and

Re: [Rdkit-discuss] RMSD value between two non-covalent molecular complexes

2017-06-22 Thread Francois BERENGER
On 06/22/2017 11:20 PM, gosia olejniczak wrote: Hi again, i found where the problem was (it seems): as i was reading in molecules from sdf file through "SDMolSupplier" by doing: suppl = Chem.SDMolSupplier(filename) the hydrogen atoms were removed (what was not obvious since e.g.

[Rdkit-discuss] AllChem.GetConformerRMSD: this is not RMSD between two conformers but an upper bound of it

2017-06-14 Thread Francois BERENGER
Hello, I am afraid that in AllChem.GetConformerRMSD: one doesn't get the RMSD between the two conformers but an upper bound of it. I understand from the doc that if they are aligned, they are aligned to the first conformer of the molecule. To get the real RMSD between two conformers, they must

Re: [Rdkit-discuss] a 2D to 3D (smi to sdf) conformer generator python script using rdkit

2017-06-15 Thread Francois BERENGER
coordinate generation if the initial embedding fails. Thanks for the comment. I might update this part if I see it fail. Regards, F. Best, -greg On Wed, Jun 14, 2017 at 9:27 AM, Francois BERENGER <beren...@bioreg.kyushu-u.ac.jp> wrote:

[Rdkit-discuss] atom indexes and order of atoms in the input file

2017-06-15 Thread Francois BERENGER
Hello, If I read a molecule from a .sdf file, will the atom indexes be conserved/preserved? 1st atom in the file will have index 0, 2nd index 1, etc. And, will this always hold in the future? Is this an invariant of rdkit? Thanks, F.

[Rdkit-discuss] how to append conformer number to molecule name in SDF output file?

2017-06-13 Thread Francois BERENGER
Hello, I am generating conformers. When I write them out, I'd like them to be named like this: molName_001 molName_002 ... So that, down the line, I know which conformer of which molecule I am working with. So: "parent" molecule name followed by one '_' then the conformer Id in a
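
The naming scheme asked for above can be sketched in plain Python. This is only a minimal sketch: `conformer_name` is a hypothetical helper (not an RDKit function), and the three-digit zero padding is an assumption read off the molName_001 example.

```python
def conformer_name(mol_name, conf_id, width=3):
    """Build '<parent name>_<zero-padded conformer id>'.

    Hypothetical helper for illustration; the default width of 3
    is an assumption taken from the 'molName_001' example.
    """
    return f"{mol_name}_{conf_id:0{width}d}"


# Names for the first two conformers of a molecule called 'molName'
names = [conformer_name("molName", i) for i in (1, 2)]
print(names)
```

With RDKit, the resulting string would presumably be stored in the molecule's `_Name` property before each conformer is written out; that step is not shown here.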

[Rdkit-discuss] difference in VdW radii between Open Babel 2.3.2 and rdkit 201503

2017-06-18 Thread Francois BERENGER
Hello, Sometimes, as a computer scientist, I am quite worried by chemical software libraries: $ cat data/ethanol.pqr COMPNDethanol AUTHORGENERATED BY OPEN BABEL 2.3.2 HETATM1 C LIG 1 -0.017 -0.601 0.000 0.04138432 1.700 C HETATM2 C LIG 1 1.247

[Rdkit-discuss] If we want a molecular surfacing implementation in rdkit ...

2017-10-16 Thread Francois BERENGER
Hello, I found this open source implementation recently: Website: https://zhanglab.ccmb.med.umich.edu/EDTSurf/ C++ Code: https://zhanglab.ccmb.med.umich.edu/EDTSurf/EDTSurf.zip Paper: "Generating Triangulated Macromolecular Surfaces by Euclidean Distance Transform"

[Rdkit-discuss] recent packages for Ubuntu

2017-09-05 Thread Francois BERENGER
Hello, If the update of the binary packages for Ubuntu/Debian is documented somewhere, it would help people who want to make available binary packages of rdkit as soon as there is a new rdkit release. I think we should have a ppa for people who want to use the bleeding edge version of rdkit.

Re: [Rdkit-discuss] ErG: 2D Pharmacophore Similarity Searches

2017-09-03 Thread Francois BERENGER
On 09/02/2017 11:05 PM, Konrad Koehler wrote: > Hi everyone, > > As a followup to my previous post, in reading the Stiefl paper (Chem. > Inf. Model. 2006, 46, 208-220) closer, I see my question about > converting ErG numpy array into a bit vector was a little naive. It > turns out that ErG

Re: [Rdkit-discuss] RPM distros

2017-11-26 Thread Francois BERENGER
it/build/RDKit-2018.03.1.dev1-Linux-Extras.deb generated. >>>>> CPack: - package: >>>>> /rdkit/build/RDKit-2018.03.1.dev1-Linux-Python.deb generated. >>>>> CPack: - package: >>>>> /rdkit/build/RDKit-2018.03.1.dev1-Linux-Runtime.deb generated

Re: [Rdkit-discuss] RPM distros

2017-11-27 Thread Francois BERENGER
chine before installing the new packages (so that we test what we intend to test). On a Debian-like: sudo apt-get remove $(dpkg -l | grep rdkit | awk '{print $2}') > cpack -G DEB > cpack -G RPM > > > On 27/11/2017 00:05, Francois BERENGER wrote: >> Hello, >> >> What

Re: [Rdkit-discuss] RPM distros

2017-11-27 Thread Francois BERENGER
8.0, libboost-system1.58.0, libboost-thread1.58.0, libc6 (>= 2.14), libgcc1 (>= 1:4.1.1), libpython2.7 (>= 2.7), libstdc++6 (>= 5.2), python (<< 2.8), python (>= 2.7~) You should install the ones you are missing and test again. > On 27/11/2017 09:11, Francois BERENGER wrote: >

Re: [Rdkit-discuss] Sanitizing SD file

2017-12-13 Thread Francois BERENGER
On 12/14/2017 02:10 PM, Greg Landrum wrote: > > On Thu, Dec 14, 2017 at 4:22 AM, Francois BERENGER > <beren...@bioreg.kyushu-u.ac.jp> > wrote: > > On 12/14/2017 05:15 AM, Sundar wrote: > > Hi RDkit users,

Re: [Rdkit-discuss] Sanitizing SD file

2017-12-13 Thread Francois BERENGER
On 12/14/2017 05:15 AM, Sundar wrote: > Hi RDkit users, > > I encounter following sanitize issue while I was trying to load an SD > file using > Chem.SDMolSupplier('lig.sdf') > > Explicit valence for atom # 16 N, 4, is greater than permitted > ERROR: Could not sanitize molecule ending on line

Re: [Rdkit-discuss] read sdf file without removing hydrogen atoms.

2017-12-12 Thread Francois BERENGER
On 12/13/2017 11:03 AM, Chicago Ji wrote: > Dear RDkit Users, > > Rdkit would delete all hydrogen atoms when read in a sdf file. > Since I want to use charge information of all atoms in sdf file, I want > to keep all hydrogen atoms when readin. > Is there something like  Chem.SmilesParserParams()
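
For the question above, one option I am aware of is the `removeHs` argument of `Chem.SDMolSupplier`. The following is a minimal sketch, assuming a reasonably recent RDKit; the methanol test file and the temporary path are made up for illustration only.

```python
import os
import tempfile

from rdkit import Chem

# Write a small test SDF with explicit hydrogens (methanol: 2 heavy atoms + 4 H).
mol = Chem.AddHs(Chem.MolFromSmiles("CO"))
sdf_path = os.path.join(tempfile.mkdtemp(), "methanol.sdf")  # illustrative path
writer = Chem.SDWriter(sdf_path)
writer.write(mol)
writer.close()

# Default behaviour: hydrogens are stripped while reading.
default_mol = Chem.SDMolSupplier(sdf_path)[0]
# removeHs=False keeps the hydrogens stored in the file.
with_h_mol = Chem.SDMolSupplier(sdf_path, removeHs=False)[0]

print(default_mol.GetNumAtoms(), with_h_mol.GetNumAtoms())
```

Keeping the hydrogens this way means per-atom data stored in the file stays aligned with the atom indexes of the molecule that was read.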

Re: [Rdkit-discuss] RPM distros

2017-11-09 Thread Francois BERENGER
On 11/08/2017 08:47 PM, Tim Dudgeon wrote: There is mention of RPM distributions of RDKit (https://copr.fedorainfracloud.org/coprs/giallu/rdkit/). But on trying these: 1. the distro is based on the 2017_03_1 release 2. it fails due to missing libinchi.so.1 dependency. In the bugtracker,

[Rdkit-discuss] it would be nice to have a working 'brew install rdkit' on Mac OS X

2017-12-11 Thread Francois BERENGER
Hello, Apparently, the homebrew recipe for rdkit is broken on Mac. That's not very cool, since brew is the reference tool to install software from source (automatically) on the Mac. Regards, F.

Re: [Rdkit-discuss] Question regarding 3D pharmacophores

2017-10-22 Thread Francois BERENGER
On 10/21/2017 01:58 AM, Andy Jennings wrote: Hi, I'm curious if anyone has figured out a way to compare two molecules based upon their pharmacophore similarities. Specifically, I want to compare 2 molecules in their _absolute_ locations, and not simply see if they have 2 pharmacophores that

Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

2018-01-15 Thread Francois BERENGER
On 01/16/2018 06:43 AM, Dimitri Maziuk wrote: > On 01/15/2018 02:43 PM, Tim Dudgeon wrote: > >> Could there be something in a more general project to bridge the >> compound (mol/smiles), sequence (protein/nucleotide seq + alignments) >> and structure (pdb/mmcif/mmtf) worlds? > > FWIW PDB builds

Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

2018-01-15 Thread Francois BERENGER
On 01/16/2018 05:51 AM, Tim Dudgeon wrote: > Incorporating and "industrialising" Matt's MolVS tautomer and > standardizer code? > http://molvs.readthedocs.io/en/latest/index.html If we can vote, I would vote for this one. > On 15/01/18 07:09, Greg Landrum wrote: >> Dear all, >> >> We've been

Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

2018-01-15 Thread Francois BERENGER
Supporting mol2 files as input would be nice. There is already some code out there, people have worked on it and several people would like to have the feature... On 01/15/2018 04:09 PM, Greg Landrum wrote: > Dear all, > > We've been invited again to participate in the OpenChemistry application

Re: [Rdkit-discuss] How to encode atomic contributions to logP (hydrophobicity) in MOL2 formatted charge slot?

2018-11-13 Thread Francois Berenger
On 14/11/2018 02:42, James T. Metz via Rdkit-discuss wrote: RDkit Discussion Group, Given a set of small molecules as a SDF file, I would like to generate a MOL2 file where the atomic contributions to logP (hydrophobicity) from each atom including hydrogens have been calculated and are now

Re: [Rdkit-discuss] Plotting values next to atoms

2018-11-05 Thread Francois Berenger
On 03/11/2018 04:27, Greg Landrum wrote: Hi Eric, On Fri, Nov 2, 2018 at 2:00 PM Eric Jonas wrote: Hello! I'm trying to figure out if there's any known or sane way to automatically plot numerical values adjacent to atoms using the rdkit drawing machinery. Ideally I'd like to annotate certain

Re: [Rdkit-discuss] Stable format for long-term storage

2018-10-08 Thread Francois Berenger
On 06/10/2018 02:27, Eric Jonas wrote: Hello! Is there a recommended stable format for long-term storage of RDKit molecules? Will ToBinary() give me what I need? (the documentation / purpose seems to be a bit... spartan) I'd like to save topology, conformers, and properties (at the atom, bond,

Re: [Rdkit-discuss] Butina clustering with additional output

2018-09-26 Thread Francois Berenger
On 21/09/2018 16:53, Chris Earnshaw wrote: Hi I'm afraid I can't help with an RDkit solution to your question, but there are a couple of issues which should be borne in mind: 1) The centroid of a cluster is a vector mean of the fingerprints of all the members of the cluster and probably will not

Re: [Rdkit-discuss] Dividing inputstream over threads

2019-01-14 Thread Francois Berenger
On 15/01/2019 09:53, Andreas Luttens wrote: Hi! I have developed a small script that calculates molecules properties for molecules that are stored in a SMILES file. The properties should be stored in an SQL database, which works fine, but I would like to speed up the process a bit. I was
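
One generic way to divide an input stream over workers is to cut it into chunks and map a worker function over them. The sketch below is not taken from the thread: `compute_properties` is a hypothetical placeholder for the real per-molecule calculation, and the chunk size and worker count are arbitrary assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunks(lines, size):
    """Yield successive lists of at most `size` items from an iterable."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def compute_properties(smiles):
    # Hypothetical placeholder: a real script would parse the SMILES
    # and compute molecular properties here.
    return (smiles, len(smiles))


def process_chunk(chunk):
    return [compute_properties(s) for s in chunk]


def process_stream(lines, chunk_size=1000, workers=4):
    """Process all lines chunk by chunk; results keep the input order."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for chunk_result in pool.map(process_chunk, chunks(lines, chunk_size)):
            results.extend(chunk_result)
    return results


print(process_stream(["CCO", "C", "CC"], chunk_size=2, workers=2))
```

Threads only help if the per-molecule work releases the Python interpreter lock; for pure-Python work, swapping in `ProcessPoolExecutor` is the usual alternative. Writing to the SQL database from a single place, after the results come back, avoids concurrent writes.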

Re: [Rdkit-discuss] Open-source business models and the RDKit

2019-03-27 Thread Francois Berenger
On 27/03/2019 16:24, Francois Berenger wrote: On 27/03/2019 01:46, Greg Landrum wrote: And now that I've included two other messages, here's (part of) my take on this. The viability of open-source business models is something I'm deeply interested in (I pay rent these days thanks to income

Re: [Rdkit-discuss] Open-source business models and the RDKit

2019-03-27 Thread Francois Berenger
On 27/03/2019 01:46, Greg Landrum wrote: And now that I've included two other messages, here's (part of) my take on this. The viability of open-source business models is something I'm deeply interested in (I pay rent these days thanks to income from two open-source companies) and, like Andrew,

Re: [Rdkit-discuss] chemfp preprint

2019-03-24 Thread Francois Berenger
On 23/03/2019 04:39, Andrew Dalke wrote: Hi RDKit users, This week I submitted a paper about chemfp for publication. I also submitted a preprint on ChemRxiv, which was just accepted. For those interested, it's at https://chemrxiv.org/articles/The_Chemfp_Project/7877846 . It's a rather long

[Rdkit-discuss] Are there Ubuntu packages for rdkit for python-3.6 somewhere?

2019-03-13 Thread Francois Berenger
Hello, I know where to find packages for python-2.7. No idea for python 3 though. Thanks, F.

Re: [Rdkit-discuss] AM1-BCC charges for small molecules

2019-03-11 Thread Francois Berenger
On 12/03/2019 03:55, James T. Metz via Rdkit-discuss wrote: RDkit Discussion Group, I am interested in generating and assigning AM1-BCC charges to small molecules, You can do it with Chimera. Cf. http://www.cgl.ucsf.edu/chimera/current/docs/ContributedSoftware/addcharge/addcharge.html

Re: [Rdkit-discuss] Scaffold Tree implementation

2019-02-14 Thread Francois Berenger
On 14/02/2019 19:44, Colin Bournez wrote: Dear all, I would like to know if there is the possibility to use the Scaffold Tree algorithm in RDKit? In the documentation, there is an existing page : http://rdkit.org/docs/source/rdkit.Chem.Scaffolds.ScaffoldTree.html [1] But it is empty... Maybe

Re: [Rdkit-discuss] Pharmacophore atom typing for torsion or atom pair FP

2019-01-31 Thread Francois Berenger
Hi, I have a related question: how to output the type of an atom in a molecule, if possible in a human-readable format; i.e. a human readable/understandable string rather than some (obscure) integer. I am interested in looking at the atom types used by the ECFP and the FCFP fingerprints. Thanks a

Re: [Rdkit-discuss] any paper on fingerprint pre-selection/feature selection?

2019-06-18 Thread Francois Berenger
On 18/06/2019 05:41, Mario Lovrić wrote: Dear all, Is there any paper discussing some sort of pre-selection/feature selection with fingerprints? I know of this one at least: --- Bender, A., Mussa, H. Y., Glen, R. C., & Reiling, S. (2004). Molecular similarity searching using atom

[Rdkit-discuss] How to turn off labels and bonds coloring when calling Draw.SimilarityMaps.GetSimilarityMapFromWeights(mol, weights)?

2019-07-11 Thread Francois Berenger
Dear rdkiters, I am playing with rdkit.Chem.Draw: --- sim_map = Draw.SimilarityMaps.GetSimilarityMapFromWeights(mol, weights) --- I don't like that in the created figure, the map colors overlap with atom and bond colors. It makes the map less readable. I would prefer all

Re: [Rdkit-discuss] aromatic bonds and graph edit distance

2019-08-21 Thread Francois Berenger
On 21/08/2019 17:34, Andrew Dalke wrote: On Aug 21, 2019, at 03:42, Francois Berenger wrote: Unless rdkit has something, I think graph edit distance is the kind of thing for which you have to rely on a good graph library. Do you know of any (non-chemical) graph library which can handle

Re: [Rdkit-discuss] aromatic bonds and graph edit distance

2019-08-20 Thread Francois Berenger
On 21/08/2019 05:06, Andrew Dalke wrote: Hi all, Someone asked me recently about finding the graph edit distance of two small (<= 14 atom) fragments. I figured this was something that could be brute forced. Following SmallWorld's example at https://cisrg.shef.ac.uk/shef2016/talks/oral13.pdf

[Rdkit-discuss] I updated the rdkit brew install recipe

2019-09-30 Thread Francois Berenger
Dear rdkit users, Recently, I updated the brew install recipe for rdkit on Mac. The biggest change is that boost and boost-python's versions were pinned down, so that the brew install recipe should be much more reproducible than before. Here is a fail-safe way to install rdkit with it (with

Re: [Rdkit-discuss] 2019.09.1 RDKit Release

2019-10-30 Thread Francois Berenger
ledgements: Patricia Bento, Francois Berenger, Jason Biggs, David Cosgrove, Andrew Dalke, Thomas Duigou, Eloy Felix, Guillaume Godin, Lester Hedges, Anne Hersey, Christoph Hillisch, Christopher Ing, Jan Holst Jensen, Gareth Jones, Eisuke Kawashima, Brian Kelley, Alan Kerstjens, Karl Leswing, Pat L

Re: [Rdkit-discuss] Need help with setting up the RDkit in Ubuntu

2019-12-08 Thread Francois Berenger
On 09/12/2019 12:24, ITS RDC wrote: Hi Greg and all RDkit users, This is my first time posting in this mailing list because it's my first time to use RDkit. This is also the first time I work with Ubuntu OS (I have always been a Windows person) and I already installed all relevant packages

Re: [Rdkit-discuss] Folding count vectors

2019-11-18 Thread Francois Berenger
On 19/11/2019 03:34, Benjamin Datko wrote: Hello all, I am curious on how to fold a count vector fingerprint. I understand when folding bit vectors the most common way is to split the vector in half, and apply a bitwise OR operation. I think this is how the function
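
The halving scheme described above can be sketched in plain Python. This is only an illustration: `fold_bits` and `fold_counts` are hypothetical helpers, and summing the counts that collide on the same folded index is an assumption, not necessarily what RDKit does internally.

```python
def fold_bits(bits, factor=2):
    """Fold a bit vector: bits landing on the same index are OR-ed together."""
    if len(bits) % factor != 0:
        raise ValueError("vector length must be a multiple of the fold factor")
    new_len = len(bits) // factor
    folded = [0] * new_len
    for i, b in enumerate(bits):
        folded[i % new_len] |= b
    return folded


def fold_counts(counts, factor=2):
    """Fold a count vector: counts landing on the same index are summed.

    Summation is an assumption made for this sketch.
    """
    if len(counts) % factor != 0:
        raise ValueError("vector length must be a multiple of the fold factor")
    new_len = len(counts) // factor
    folded = [0] * new_len
    for i, c in enumerate(counts):
        folded[i % new_len] += c
    return folded


print(fold_bits([1, 0, 0, 0, 0, 0, 1, 1]))
print(fold_counts([2, 0, 1, 0, 3, 0, 0, 4]))
```

Summing preserves the total number of counted features, whereas taking the maximum of colliding counts would not; which one is preferable depends on the downstream similarity measure.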

Re: [Rdkit-discuss] Folding count vectors

2019-11-20 Thread Francois Berenger
cppapi/structRDKit_1_1ReactionFingerprintParams.html 3. https://www.rdkit.org/docs/GettingStartedInPython.html#morgan-fingerprints-circular-fingerprints 4. https://sourceforge.net/p/rdkit/mailman/message/35240736/ v/r, Ben On Mon, Nov 18, 2019 at 10:13 PM Francois Berenger wrote: On 19/11/2019 03:34, Benjamin

Re: [Rdkit-discuss] Markush Enumeration.

2020-02-24 Thread Francois Berenger
On 21/02/2020 22:45, Paolo Tosco wrote: Hi Jitender, you could do that quite easily using reaction SMARTS; see for example this thread: https://sourceforge.net/p/rdkit/mailman/message/35730514/ You could selectively replace a specific R attachment point by isotopically labeling it. Cheers,

Re: [Rdkit-discuss] Exhaustive fragmentation of molecules

2020-01-08 Thread Francois Berenger
On 08/01/2020 20:47, Paolo Tosco wrote: Dear Puck, You may break a bond by creating a Chem.RWMol out of your Chem.Mol, and then calling the RemoveBond() method on your Chem.RWMol, or you may use dedicated functions in the rdmolops module. Individual fragments can then be obtained by calling
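
The RWMol route described above can be sketched as follows. The molecule and the bond being broken are arbitrary choices for illustration, and the use of `Chem.GetMolFrags` to collect the fragments is my assumption of one way to do it (the quoted message is cut off before naming the function).

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CCOCC")  # diethyl ether, atom indexes 0..4
rwmol = Chem.RWMol(mol)            # editable copy of the molecule
rwmol.RemoveBond(1, 2)             # break the C-O bond between atoms 1 and 2

# Each disconnected piece is returned as its own molecule.
frags = Chem.GetMolFrags(rwmol, asMols=True)
frag_sizes = sorted(f.GetNumAtoms() for f in frags)
print([Chem.MolToSmiles(f) for f in frags])
```

Repeating this over every bond of interest (for example, every non-ring single bond) gives a simple exhaustive fragmentation loop.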

[Rdkit-discuss] if you are a RDKit user in Japan and you are free between Mar. 19-20 2020

2020-01-14 Thread Francois Berenger
Dear RDKit users, You might consider joining "The 8th French-Japanese Workshop on Computational Methods in Chemistry" (FJCMC2020). Date: Mar. 19-20, 2020 Venue: 100th Anniversary Hall of Engineering Faculty, Kurokami South Campus, Kumamoto University. Website:

Re: [Rdkit-discuss] Compiling rdkit-Release_2019_09_3 with python 3.7.5 and gnu gcc, g++ on MacOS 10.15

2020-01-20 Thread Francois Berenger
On 21/01/2020 01:29, Zoltan Takacs wrote: Hi, Thanks, I repeated the compilation procedure on an Ubuntu machine with boost 1.62.0 and everything went smashingly. This indeed seems to be some cmake boost mismatch on my mac. I will use an older version of boost instead. This should do the

Re: [Rdkit-discuss] passing options to javac when building from source

2020-01-05 Thread Francois Berenger
Hi Tim, How do you compile rdkit for Java? Last time I tried on a Mac, it did not work: https://github.com/rdkit/homebrew-rdkit/issues/38 Thanks a lot, F. On 26/12/2019 23:22, Tim Dudgeon wrote: When building the Java wrappers from source (the -DRDK_BUILD_SWIG_WRAPPERS=ON option) is possible

Re: [Rdkit-discuss] RDkit/Anaconda: Fingerprints for a database

2020-03-12 Thread Francois Berenger
I have some Python code that might help in there: https://github.com/UnixJunkie/consent/blob/master/bin/lbvs_consent_ecfp4.py Regards, F. On 12/03/2020 21:21, Francesco Coppola wrote: Hello everyone, Before exposing my new problem, I wanted to thank everyone who helped me in the previous

Re: [Rdkit-discuss] Compilation problems on Linux

2020-04-16 Thread Francois Berenger
Hi Max, Not sure if it will help, but on Debian and Ubuntu you need the following system packages to be installed in order to compile rdkit: curl wget libboost-all-dev cmake git g++ libeigen3-dev python3 libpython3-all-dev python3-numpy python3-pip python3-pil python3-six python3-pandas What

Re: [Rdkit-discuss] AdditionalOutput from FingerprintGenerator

2020-03-17 Thread Francois Berenger
On 17/03/2020 17:14, Chris Earnshaw wrote: A quick comment on the cosine metric. Unlike Tanimoto it obeys the triangle inequality, so in cases where it's used essentially as a distance metric (e.g. some clustering applications) the results are probably more mathematically correct. The Tanimoto

Re: [Rdkit-discuss] The RDKit and GSoC 2020

2020-03-22 Thread Francois Berenger
On 20/03/2020 08:43, JW Feng wrote: iwatobipen blog was where I found instructions for installing RDKit on Colab. It works but I found waiting for miniconda to install to be too annoying. A one line apt-get command to install RDKit is easier and faster (~10 seconds) but it only works with

Re: [Rdkit-discuss] Smallest possible size of 100*1e6 morgan fingerprints for storage and memory

2020-09-08 Thread Francois Berenger
On 09/09/2020 09:35, Lewis Martin wrote: Hi RDKit, Looking for advice on an rdkit-adjacent problem please. Ultimately I'd like to fit an approximate-nearest neighbors index on a dataset of 100 million ligands, featurized by morgan fingerprint. The text file of the smiles is ~6gb but this blows
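
A back-of-the-envelope estimate helps to see why the fingerprints take far more room than the SMILES file. The fingerprint lengths below (1024 and 2048 bits) are assumptions, since the message does not state which length was used.

```python
def dense_fp_bytes(n_mols, n_bits):
    """Bytes needed for n_mols fingerprints packed at 8 bits per byte."""
    return n_mols * n_bits // 8


n_mols = 100_000_000  # 100 million ligands
for n_bits in (1024, 2048):  # assumed fingerprint lengths
    gb = dense_fp_bytes(n_mols, n_bits) / 1e9
    print(f"{n_bits} bits: {gb:.1f} GB")
```

These figures are for tightly packed bits; an array that spends one byte per bit needs eight times as much, which is where sparse or compressed representations start to pay off.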

Re: [Rdkit-discuss] h-bond geometry

2020-09-08 Thread Francois Berenger
On 09/09/2020 01:33, Tim Dudgeon wrote: Hi All, thanks for the suggestions. Greg, that's part of what's needed but there's also some more complex logic needed. For instance, if the atom the H is attached to is rotatable (e.g. an OH group) then it is more complex than if it is fixed (e.g a N in a

Re: [Rdkit-discuss] unique chemical representation

2020-09-13 Thread Francois Berenger
On 12/09/2020 00:27, Mike Mazanetz wrote: Dear Forum, I'm curious as to how the community standardizes molecules to generate unique chemical representations. Please let me know what are people's preferred means to treat: * Tautomers * Protomers * Resonance structures

Re: [Rdkit-discuss] RDkit: While converting sdf file to fingerprint, facing several error

2020-08-06 Thread Francois Berenger
On 07/08/2020 03:15, dmaziuk via Rdkit-discuss wrote: On 8/6/2020 7:14 AM, Pitanti Chalowa wrote: ... DTXCID601285170 Mrv1805 05101813452D Does it have to have a blank line after '' ? No: having the molecule's name/identifier in there is quite standard. Dima

Re: [Rdkit-discuss] RDKit installation problem

2020-08-02 Thread Francois Berenger
Dear Sebastian, Since last week, you should also be able to install rdkit on Linux via linuxbrew:
---
sudo apt install linuxbrew-wrapper
brew tap rdkit/rdkit
brew update
brew install rdkit
# to test it
/home/linuxbrew/.linuxbrew/bin/python3
import rdkit
---
Thanks to Nuri Jung on github

[Rdkit-discuss] Is there someone who manages to compile rdkit Release_2020_03_4 from sources on a Mac?

2020-07-07 Thread Francois Berenger
Dear rdkiters, I am trying to repair the brew formula. Currently, I get this when I try to compile Release_2020_03_4: $ cd build $ cmake -DPYTHON_EXECUTABLE=/usr/local/bin/python-3.7 -DRDK_INSTALL_INTREE=OFF -DRDK_BUILD_INCHI_SUPPORT=ON -DRDK_BUILD_AVALON_SUPPORT=ON

Re: [Rdkit-discuss] How to add a caption/legend to a 2D SVG depiction of a molecule?

2020-06-22 Thread Francois Berenger
On 22/06/2020 14:47, Francois Berenger wrote: Dear RDKiters, My current code looks like this: --- AllChem.Compute2DCoords(mol) # generate 2D conformer mol.SetProp("_Name", name) d = rdMolDraw2D.MolDraw2DSVG(200, 200) d.DrawMolecule(mol) Answer

[Rdkit-discuss] How to add a caption/legend to a 2D SVG depiction of a molecule?

2020-06-21 Thread Francois Berenger
Dear RDKiters, My current code looks like this: --- AllChem.Compute2DCoords(mol) # generate 2D conformer mol.SetProp("_Name", name) d = rdMolDraw2D.MolDraw2DSVG(200, 200) d.DrawMolecule(mol) d.FinishDrawing() out_fn = '%d.svg' % i with

Re: [Rdkit-discuss] Removing solvent and ions from dataset

2020-06-08 Thread Francois Berenger
On 06/06/2020 17:33, Max Pinheiro Jr wrote: Hi RDkit team, I am working on a chemically diverse dataset of smiles strings and I need to do some preprocessing to clean a bit the data before starting the modeling part. So I was looking for some tools or built-in functions in RDkit to make such

Re: [Rdkit-discuss] How to calculate Tanimoto similarity score between reactions

2020-06-10 Thread Francois Berenger
On 10/06/2020 13:11, 丁邵珍 wrote: Hi, I want to calculate Tanimoto similarity score of two reactions ('CCCO>>CCC=O', 'CC(O)C>>CC(=O)C'), I found all methods of Tanimoto similarity score calculation are for compounds. Could you please tell me how to calculate the Tanimoto similarity score of

Re: [Rdkit-discuss] How many bonds of a Type in a molecule

2020-12-08 Thread Francois Berenger
On 08/12/2020 23:01, José Emilio Sánchez Aparicio wrote: Dear all, I need to find how many bonds of a certain type are in a molecule. For example, for DOUBLE bonds, I would do: bond_number = 0 for bond in mol.GetBonds(): if bond.GetType() == Chem.BondType.DOUBLE: bond_number
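
A compact version of the counting loop above might look like this. Note that, as far as I know, the RDKit accessor is `bond.GetBondType()`; `count_bonds` is a hypothetical helper and the example molecules are arbitrary.

```python
from rdkit import Chem


def count_bonds(mol, bond_type):
    """Count the bonds of `mol` whose type equals `bond_type`."""
    return sum(1 for bond in mol.GetBonds() if bond.GetBondType() == bond_type)


acrolein = Chem.MolFromSmiles("C=CC=O")
n_double = count_bonds(acrolein, Chem.BondType.DOUBLE)

# Aromatic rings are reported as AROMATIC bonds, not alternating single/double.
benzene = Chem.MolFromSmiles("c1ccccc1")
n_aromatic = count_bonds(benzene, Chem.BondType.AROMATIC)

print(n_double, n_aromatic)
```

If double bonds inside aromatic rings should be counted as well, the molecule would need to be kekulized first.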

Re: [Rdkit-discuss] GPU Implementation of shape-based 3D overlap on rdkit?

2020-11-08 Thread Francois Berenger
On 04/11/2020 04:26, Lewis Martin wrote: I've had an initial go at something like this using JAX. I chose JAX since it has a shallow learning curve, essentially being numpy on a GPU. This is great for vectorized calculations, but less so for applications that involve a lot of control flow (ie

[Rdkit-discuss] SMARTS pattern replacement inside a ring; without breaking the ring open...

2021-01-07 Thread Francois Berenger
Dear list, I have been trying to replace this SMARTS pattern in a ring: 'c(=O)[nH]' By this SMILES fragment: 'c(O)n' My trials using a single SMARTS pattern search then replace break open the ring, which is not what I want. My non-working trial code: --- mol =

[Rdkit-discuss] How to construct a simple molecule with a Z stereo double bond using RWMol?

2021-01-14 Thread Francois Berenger
Hello, Please tell me if you understand why the code below is not working and if you know how to change it so that it does. Thanks a lot! :) F. --- #!/usr/bin/env python3 # try to construct a molecule with a Z stereo double bond using RWMol import rdkit from rdkit import Chem wanted_smi =

Re: [Rdkit-discuss] How to construct a simple molecule with a Z stereo double bond using RWMol?

2021-01-14 Thread Francois Berenger
was being a good kid, I thought that someone must always sanitize a RWMol prior to extracting the final resulting molecule (in the end I want a SMILES). Regards, F. On Thu, Jan 14, 2021 at 9:46 AM Francois Berenger wrote: Hello, Please tell me if you understand why the code below is not work

Re: [Rdkit-discuss] How to construct a simple molecule with a Z stereo double bond using RWMol?

2021-01-14 Thread Francois Berenger
= rwmol.GetBondWithIdx(1) In [8]: db.SetStereoAtoms(0,3) In [9]: db.SetStereo(Chem.BondStereo.STEREOCIS) In [10]: Chem.MolToSmiles(rwmol) Out[10]: 'CN=CS' In [11]: Chem.AssignStereochemistry(rwmol) In [12]: Chem.MolToSmiles(rwmol) Out[12]: 'C/N=C\\S' On Thu, Jan 14, 2021 at 9:46 AM Francois Berenger wrote

Re: [Rdkit-discuss] SMARTS pattern replacement inside a ring; without breaking the ring open...

2021-01-12 Thread Francois Berenger
helps! Best, Fio On Thu, Jan 7, 2021 at 10:33 PM Francois Berenger wrote: Dear list, I have been trying to replace this SMARTS pattern in a ring: 'c(=O)[nH]' By this SMILES fragment: 'c(O)n' My trials using a single SMARTS pattern search then replace break open the ring, which is not what

Re: [Rdkit-discuss] non-element elements

2021-02-03 Thread Francois Berenger
On 04/02/2021 00:35, Brian Peterson wrote: Hello RDKit people, Is it possible to modify the properties of elements in the periodic table or to create new ones? Use case: Suppose one had some molecules defined in terms of functional groups or united atoms or some other entities that are not

Re: [Rdkit-discuss] RDKit molecule standardization/normalization protocol

2021-06-22 Thread Francois Berenger
Dear JP, To confuse you even more, you can also have a look at the ChEMBL open-source molecular standardizer: https://github.com/chembl/ChEMBL_Structure_Pipeline/blob/master/chembl_structure_pipeline/standardizer.py No need to thank me. :D On 18/06/2021 03:12, JP Ebejer wrote: Dear all, I

Re: [Rdkit-discuss] install on macosx with Python 3.8

2021-06-24 Thread Francois Berenger
On 25/06/2021 02:57, Michal Krompiec wrote: Hello, Is it possible to install RDKit on MacOSX in a Python 3.8 environment? There is no conda binary for 3.8, so I tried homebrew. But the following gives me an error message (brew doesn't like the --with-python3 argument): brew install rdkit

[Rdkit-discuss] How to prevent a SMILES from starting with a specific atom?

2021-05-11 Thread Francois Berenger
Hello, I have some molecules with unspecified atoms ('*' in SMILES notation). I would like that when such a molecule is written out, the resulting SMILES never starts with one of those atoms (since the molecule also has plenty of "normal" atoms). Is it possible to do that with rdkit? Or, more

[Rdkit-discuss] Are the path-based fingerprints formally described in the scientific literature?

2021-05-19 Thread Francois Berenger
Dear list, The other day, I was looking for a paper describing them but the only thing I found was a reference to some Daylight product. I know there is a paper (maybe several in fact) for ECFP for example. Weren't the path-based FPs formally described somewhere? Thanks a lot, F.

Re: [Rdkit-discuss] Shape Tanimoto distance question

2021-06-29 Thread Francois Berenger
On 29/06/2021 12:26, Greg Landrum wrote: Hi Leon, You can convert the Tanimoto distance to similarity, but the formula is: Similarity = 1 - Distance. In other words: Tanimoto_distance = 1.0 - Tanimoto_score. Worth noting: the Tanimoto distance is a metric; hence it is pretty useful in
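
The relation quoted above can be sketched in a few lines. This is a minimal illustration, not the shape-based code discussed in the thread: it assumes fingerprints represented as plain Python sets of "on" bit indices, and the names tanimoto_similarity, tanimoto_distance, fp_a, fp_b and fp_c are hypothetical.

```python
def tanimoto_similarity(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    a, b = set(a), set(b)
    if not a and not b:
        # Assumption: two empty fingerprints are treated as identical.
        return 1.0
    return len(a & b) / len(a | b)


def tanimoto_distance(a, b):
    """Distance = 1 - similarity, as stated in the message above."""
    return 1.0 - tanimoto_similarity(a, b)


# Hypothetical toy fingerprints (sets of on-bit indices).
fp_a = {1, 2, 3, 4}
fp_b = {3, 4, 5, 6}
fp_c = {5, 6, 7, 8}

sim_ab = tanimoto_similarity(fp_a, fp_b)   # 2 shared bits out of 6 distinct bits
dist_ab = tanimoto_distance(fp_a, fp_b)
```

Because the distance is a metric on such bit sets, it satisfies the triangle inequality, which is what makes it usable in metric-space indexing and clustering.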

Re: [Rdkit-discuss] Are Partial Charge Calculations Dependent on Conformers?

2021-07-06 Thread Francois Berenger
On 01/07/2021 13:15, Hao wrote: Thanks Greg! That helps a lot, it was purely out of curiosity and understanding. I'm working with some legacy code that requires conformer generation before calculating partial charges. Now that I know it's unnecessary, I can speed up this process by quite a bit.
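
A small sketch of the point made above, under the assumption that the partial charges in question are Gasteiger charges (the thread snippet does not say which charge model the legacy code uses). The molecule is a hypothetical example; the charges are computed from the molecular graph alone, with no conformer generated.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Hypothetical example molecule: acetic acid, built from SMILES (no 3D coordinates).
mol = Chem.MolFromSmiles("CC(=O)O")

# Gasteiger charges are assigned from topology; no conformer is embedded first.
AllChem.ComputeGasteigerCharges(mol)

# One charge per heavy atom, stored as an atom property.
charges = [atom.GetDoubleProp("_GasteigerCharge") for atom in mol.GetAtoms()]
n_conformers = mol.GetNumConformers()
```

Other charge models may behave differently, so this only illustrates the topology-based case.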

[Rdkit-discuss] Do we have an exact implementation of Bemis-Murcko scaffolds in rdkit?

2021-04-26 Thread Francois Berenger
Hello, I am trying MurckoScaffold.MakeScaffoldGeneric(mol), but this keeps the side chains, while my understanding of BM scaffolds is that only rings and ring linkers should be kept. The fact that the rdkit implementation keeps the side chains makes Murcko scaffolds a much less powerful filter

Re: [Rdkit-discuss] Do we have an exact implementation of Bemis-Murcko scaffolds in rdkit?

2021-04-26 Thread Francois Berenger
On 27/04/2021 10:12, Francois Berenger wrote: On 26/04/2021 23:35, Greg Landrum wrote: Hi Francois, The implementation which is there does, I believe, the right thing. However... first you need to find the Murcko Scaffold, then you can convert that scaffold to the generic form: In [5]: m
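
The two-step procedure described above (first extract the Murcko scaffold, then make it generic) might look like the sketch below. The input molecule is a hypothetical example chosen to have one side chain and one ring linker; it is not the molecule from the original session, which is truncated here.

```python
from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold

# Hypothetical example: an ethyl side chain, a benzene ring,
# a two-carbon linker and a pyrrolidine ring.
mol = Chem.MolFromSmiles("CCc1ccc(CCN2CCCC2)cc1")

# Step 1: keep only ring systems and the linkers between them.
scaffold = MurckoScaffold.GetScaffoldForMol(mol)

# Step 2: turn every atom into carbon and every bond into a single bond.
generic = MurckoScaffold.MakeScaffoldGeneric(scaffold)

scaffold_smiles = Chem.MolToSmiles(scaffold)
generic_smiles = Chem.MolToSmiles(generic)
```

Calling MakeScaffoldGeneric directly on the full molecule skips step 1, which is why the side chains survive in that case.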

Re: [Rdkit-discuss] rejoining pairs of fragments after fragmenting a molecule

2021-04-04 Thread Francois Berenger
On 03/04/2021 00:03, Andrew Dalke wrote: Hi Ling, On Apr 2, 2021, at 16:23, Ling Chan wrote: Thank you Francois, I took a look at your code and borrowed parts of it to rejoin two molecules. It seems like my problem is solved. I eventually arrived at something like example 4 in

Re: [Rdkit-discuss] rejoining pairs of fragments after fragmenting a molecule

2021-03-31 Thread Francois Berenger
On 01/04/2021 04:55, Ling Chan wrote: Dear Colleagues, I am trying to do something that I think is quite simple, but I have not figured out a simple way. Don't know if I am missing something. I am sure that ultimately I can figure it out, but I wonder if there is a good way. I fragmented a

Re: [Rdkit-discuss] The latest RDKit (2020.09.5) is now available on homebrew/linuxbrew

2021-03-31 Thread Francois Berenger
/linuxbrewed-rdkit may conflict with conda-rdkit, please decide which one to use at your own risk. I thank Francois Berenger, Eddie Cao, and the contributors who maintained the rdkit formula. Sincerely, Yoshitaka Moriwaki

Re: [Rdkit-discuss] Parsing a PDB file with atoms that are too close, causing bad bond

2021-09-27 Thread Francois Berenger
Hi Lewis, Just an idea: you might try to load your PDB in UCSF Chimera, then save it as a mol2 or sdf file. Then, try to read this sdf file from rdkit. Another idea: try to get your pdb file through the pdbredo service. https://pdb-redo.eu/ They might have fixed a few things; maybe this PDB

Re: [Rdkit-discuss] Parsing a PDB file with atoms that are too close, causing bad bond

2021-09-27 Thread Francois Berenger
Regards, F. Cheers Lewis On Mon, Sep 27, 2021 at 5:55 PM Francois Berenger wrote: Hi Lewis, Just an idea: you might try to load your PDB in UCSF Chimera, then save it as a mol2 or sdf file. Then, try to read this sdf file from rdkit. Another idea: try to get your pdb file through the pdbredo

Re: [Rdkit-discuss] State of the art for shape alignment

2021-11-11 Thread Francois Berenger
On 12/11/2021 01:58, Paolo Tosco wrote: Hi Tim, Open3DAlign is not shape-based, it is atom-based. The score is proportional to the # of matched atoms, weighted by similarity. It will work well for homologous series of compounds with reasonable scaffold similarity, and will in general perform

[Rdkit-discuss] How to get the electronegativity of an atom?

2022-03-08 Thread Francois Berenger
Dear rdkit experts, I am looking to access the electronegativity value of a given atom in a molecule. Funnily, I don't know _at_ _all_ how to do this. I guess that there should be a way using the atomic number to get this value from a table inside of rdkit but my code searches on github where

Re: [Rdkit-discuss] How to get the electronegativity of an atom?

2022-03-08 Thread Francois Berenger
On 08/03/2022 17:23, Francois Berenger wrote: Dear rdkit experts, I am looking to access the electronegativity value of a given atom in a molecule. Funnily, I don't know _at_ _all_ how to do this. I guess that there should be a way using the atomic number to get this value from a table inside

Re: [Rdkit-discuss] pharmacophore

2022-03-29 Thread Francois Berenger
On 30/03/2022 03:49, Patrick Walters wrote: One way to compare interactions (pharmacophores) in a binding site is to use interaction fingerprints. I've had a good experience with ProLIF. https://github.com/chemosim-lab/ProLIF Additionally, I know about all those open-source ones: -

Re: [Rdkit-discuss] Dask + Rdkit Use Cases

2022-01-25 Thread Francois Berenger
On 25/01/2022 01:57, Oren Herschander wrote: Hi Everyone, I'm working on a research project about how Dask and other Python tools for distributed/parallel computing are used in Life Sciences. I'm on the lookout for use cases, stories, and overall thoughts that combine rdkit or other similar

[Rdkit-discuss] What is the recommended 3D-sensitive file format to use with RDKit?

2022-06-16 Thread Francois Berenger
Hi all, I assume it's ".sdf". But, do we have good support for ".xyz" also? In addition, what about RDKit's support of ".mol2" these days? Regards, F.

Re: [Rdkit-discuss] GIL Lock in BulkTanimotoSimilarity

2022-10-25 Thread Francois Berenger
On 24/10/2022 19:47, David Cosgrove wrote: For the record, I have attempted this, but got only a marginal speed-up (130% of CPU used, with any number of threads above 2). The procedure I used was to extract the fingerprint pointers into a std::vector, create a std::vector for the results,
