Re: [Rdkit-discuss] Hello questions about the Synthetic Accessibility score

2020-11-15 Thread Peter Gedeck
The paper is pretty vague on implementation details. However, note that the code is copyright Novartis Institutes for BioMedical Research Inc. It was released in the public domain and at that point (2013) it was the implementation that was used internally at Novartis. You can therefore use the

Re: [Rdkit-discuss] Angstroms Hydrogen bonding

2016-09-14 Thread Peter Gedeck
Hello, Here are a few suggestions you can try that may speed up your code. Instead of GetSubstructMatches, you can use the list of neighbours of each atom. Here is something that may work: it creates an iterator that returns all the atoms for bond angles. I did not test it, however it may give
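
The neighbour-list idea above could be sketched roughly as follows. This is not the code from the original post (which is cut off here); the function name `iter_angle_triples` and the plain adjacency dictionary are assumptions chosen for illustration. With an RDKit molecule, the dictionary could be built from `atom.GetNeighbors()` and `atom.GetIdx()`.

```python
from itertools import combinations


def iter_angle_triples(neighbours):
    """Yield (i, centre, k) atom-index triples, one per bond angle.

    `neighbours` maps an atom index to the indices of its bonded atoms.
    With an RDKit molecule it could be built like this (assumption, not
    taken from the post):
        {a.GetIdx(): [n.GetIdx() for n in a.GetNeighbors()]
         for a in mol.GetAtoms()}
    """
    for centre, bonded in neighbours.items():
        # Every unordered pair of neighbours defines one angle at `centre`.
        for i, k in combinations(sorted(bonded), 2):
            yield i, centre, k


# Hypothetical heavy-atom graph of CC(O)C: atom 1 is bonded to 0, 2 and 3.
example = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
angles = list(iter_angle_triples(example))
```

Terminal atoms have a single neighbour and therefore contribute no angle, so only the central atom produces triples in this example.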

Re: [Rdkit-discuss] Trouble compiling and installing on Ubuntu 14.04

2016-10-03 Thread Peter Gedeck
Hello, Python failures are usually an indication of problems with the Boost library. You might be picking up libraries for the wrong Python version. Best Peter On Mon, Oct 3, 2016 at 11:06 AM Philip Adler wrote: > Dear All, > > I am trying to compile rdkit to run with Python3.4 on Ubuntu 14.04 as per

Re: [Rdkit-discuss] Trouble compiling and installing on Ubuntu 14.04

2016-10-03 Thread Peter Gedeck
You can also check the CMakeCache.txt file in the build directory. When I last compiled for 3.5 on the Mac, I had to correct the PYTHON_INCLUDE_DIR. Greg, PYTHON_INCLUDE_DIR was incorrectly set after "cmake ..". Executable and library correctly found. //Path to a program. PYTHON_EXECUTAB
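
One way to inspect the Python-related entries of CMakeCache.txt is a small parser like the sketch below. It relies only on the `KEY:TYPE=VALUE` layout of cache entries; the helper name `read_cmake_cache` and the sample paths are made up for illustration and are not from the original message.

```python
def read_cmake_cache(text, prefix="PYTHON_"):
    """Return {key: value} for cache entries whose key starts with `prefix`."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '//' and '#' comment lines CMake writes.
        if not line or line.startswith(("//", "#")):
            continue
        key_and_type, sep, value = line.partition("=")
        if not sep:
            continue
        key = key_and_type.split(":", 1)[0]
        if key.startswith(prefix):
            entries[key] = value
    return entries


# Hypothetical cache content showing a mismatched include directory.
sample = """\
//Path to a program.
PYTHON_EXECUTABLE:FILEPATH=/opt/py35/bin/python3.5
//Path to a file.
PYTHON_INCLUDE_DIR:PATH=/usr/include/python2.7
CMAKE_BUILD_TYPE:STRING=Release
"""
python_entries = read_cmake_cache(sample)
```

For a real build, the text would come from reading CMakeCache.txt in the build directory; an include directory that belongs to a different Python version than the executable is the kind of mismatch described above.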

Re: [Rdkit-discuss] Trouble compiling and installing on Ubuntu 14.04

2016-10-04 Thread Peter Gedeck
One of the tests says: > ImportError: libInfoTheory.so.1: cannot open shared object file: No such file or directory Did you "make install" and does LD_LIBRARY_PATH contain $RDBASE/lib? Best, Peter On Tue, Oct 4, 2016 at 11:18 AM Philip Adler wrote: > Unfortunately David Hall's suggestion h
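
The LD_LIBRARY_PATH question can also be checked from Python. The sketch below assumes the RDBASE and LD_LIBRARY_PATH environment variables mentioned above; the function name `rdbase_lib_on_path` and the example paths are hypothetical.

```python
import os


def rdbase_lib_on_path(env):
    """Return True if $RDBASE/lib is one of the LD_LIBRARY_PATH entries."""
    rdbase = env.get("RDBASE")
    if not rdbase:
        return False
    wanted = os.path.normpath(os.path.join(rdbase, "lib"))
    entries = env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    # Normalise each entry so a trailing slash does not hide a match.
    return any(e and os.path.normpath(e) == wanted for e in entries)


# Hypothetical environment; pass os.environ to check the real one.
rdbase = os.path.join(os.sep, "opt", "rdkit")
good_env = {
    "RDBASE": rdbase,
    "LD_LIBRARY_PATH": os.pathsep.join(
        [os.path.join(os.sep, "usr", "local", "lib"), os.path.join(rdbase, "lib")]
    ),
}
bad_env = {"RDBASE": rdbase, "LD_LIBRARY_PATH": os.path.join(os.sep, "usr", "lib")}
found = rdbase_lib_on_path(good_env)
missing = rdbase_lib_on_path(bad_env)
```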

Re: [Rdkit-discuss] Atom Environments

2016-11-21 Thread Peter Gedeck
Hello, In the past, I've had very good experience with the rooted fingerprints. They were introduced by Vulpetti et al. as a description of local environment of fluorine (LEF) atoms. Later on, we used it to compare ionization sites in a Moka retraining study (Gedeck et al.). The LEF code in the R

Re: [Rdkit-discuss] Pandas

2016-11-23 Thread Peter Gedeck
Is it possible to use the bulk similarity searching functionality for better performance instead of the list comprehension? Best, Peter On Wed, Nov 23, 2016 at 9:11 AM Greg Landrum wrote: No worries. This, and Anna's question about similarity searching and clustering illustrate a great opport

Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-02 Thread Peter Gedeck
Hello Alexis, Depending on the size of your document, you could consider limiting the storage of already tested strings by word length and only memoizing shorter words. SMILES tend to be longer, so everything above a given number of characters has a higher probability of being a SMILES. Large words probab
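
A minimal sketch of the length-limited memoization described above. The function name, the cut-off of 8 characters and the toy validator are assumptions for illustration; in practice the validator would probably be something like `Chem.MolFromSmiles(word) is not None`.

```python
def find_candidates(words, is_valid, max_cached_len=8):
    """Return (hits, cache); only words up to `max_cached_len` are memoized."""
    cache = {}
    hits = []
    for word in words:
        if len(word) <= max_cached_len:
            # Short words repeat often in text, so remember their result.
            if word not in cache:
                cache[word] = is_valid(word)
            valid = cache[word]
        else:
            # Long words are rarer and more likely SMILES: test, do not store.
            valid = is_valid(word)
        if valid:
            hits.append(word)
    return hits, cache


calls = []


def toy_validator(word):
    """Stand-in for a real SMILES parser; records each call it receives."""
    calls.append(word)
    return set(word) <= set("CNOcno()=#123456")


words = ["the", "CCO", "the", "c1ccccc1O", "CCO", "c1ccccc1O"]
hits, cache = find_candidates(words, toy_validator)
```

Here the repeated short words are validated once each, while the nine-character word is validated on every occurrence and never enters the cache, which keeps memory use bounded.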

Re: [Rdkit-discuss] SetAtomAlias

2016-12-16 Thread Peter Gedeck
Hello, SetAtomAlias is available in Python as a function and not as an Atom method:

    from rdkit import Chem
    import sys

    m = Chem.MolFromSmiles('CCC')
    for i, atom in enumerate(m.GetAtoms()):
        Chem.SetAtomAlias(atom, 'C' + str(i + 1))
    w = Chem.SDWriter(sys.stdout)
    w.write(m)
    w.close()

Best, Pete

Re: [Rdkit-discuss] SDwriter

2016-12-16 Thread Peter Gedeck
Hello, In cases like this I know that Greg implemented it validly according to the standard. If you check the CTfile definition ( http://c4.cabrillo.edu/404/ctfile.pdf#page41) you will see that the data header is pretty flexible. The only requirement is that it starts with a >. Usually we fin

Re: [Rdkit-discuss] SetAtomAlias

2016-12-17 Thread Peter Gedeck
ill...@univ-reims.fr> wrote: > Dear Peter, > > I got: > > AttributeError: 'module' object has no attribute 'SetAtomAlias' > > with your example code, below. > > Best regards, > > Jean-Marc > > > Le 17/12/2016 à 00:44, Peter Gedeck a écr

Re: [Rdkit-discuss] multiline legend in MolsToGridImage

2016-12-20 Thread Peter Gedeck
Hello, I thought we had removed all of these by now. I'll open an issue and fix the code. Best, Peter On Tue, Dec 20, 2016 at 6:01 AM David Hall wrote: > The replace and split methods were removed from the string module in > python3. You can replace the code as follows: > > s = s.replace

Re: [Rdkit-discuss] PMI API

2017-01-15 Thread Peter Gedeck
According to this: https://en.wikipedia.org/wiki/List_of_moments_of_inertia The moments of inertia of a disk (something like benzene) are: Iz = mr^2/2 Ix = Iy = mr^2/4 None of them is zero. The smallest moment of inertia of a rod-like molecule (e.g. C#C) is zero. Best, Peter On Sun, Jan 15,
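
The argument above can be checked numerically with point masses. This is only a rough sketch: it uses a ring of six equal masses rather than a uniform disk (so the values are M r^2 and M r^2/2 instead of the disk formulas, with the same 2:1 ratio), ignores hydrogens, and the masses and distances are approximate illustrative values.

```python
import math


def axis_moments(points):
    """Return (Ix, Iy, Iz) about the coordinate axes for (mass, x, y, z) points."""
    ix = sum(m * (y * y + z * z) for m, x, y, z in points)
    iy = sum(m * (x * x + z * z) for m, x, y, z in points)
    iz = sum(m * (x * x + y * y) for m, x, y, z in points)
    return ix, iy, iz


# Benzene-like ring: six carbons in the xy plane, centred on the origin.
radius = 1.4
ring = [
    (12.0, radius * math.cos(k * math.pi / 3), radius * math.sin(k * math.pi / 3), 0.0)
    for k in range(6)
]

# Rod-like C#C: two carbons on the z axis, centred on the origin.
rod = [(12.0, 0.0, 0.0, -0.6), (12.0, 0.0, 0.0, 0.6)]

ring_ix, ring_iy, ring_iz = axis_moments(ring)
rod_ix, rod_iy, rod_iz = axis_moments(rod)
```

For the ring all three moments are positive, with the out-of-plane moment twice the in-plane ones; for the rod the moment about the molecular axis is exactly zero.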

Re: [Rdkit-discuss] adding custom number of explicit H to specified non-hydrogen atoms

2017-01-21 Thread Peter Gedeck
Looks like you have a very old version of RDKit. The additional option was included in RDKit 2016.03.1. Check: import rdkit print(rdkit.__version__) Best, Peter On Sat, Jan 21, 2017 at 3:39 PM Janusz Petkowski wrote: > Czesc again, > > Many thanks for the code snippet. I thought that I use i

Re: [Rdkit-discuss] Drawing structure with generic labels

2017-02-16 Thread Peter Gedeck
Hello Alexis, I had a look at the Python and the C++ code for drawing molecules. Neither supports your requirement. It would be useful to implement it. I can have a look at it in more detail and see if I could implement a quick fix, e.g. drawing a custom label based on an atom property. Best

Re: [Rdkit-discuss] FindAtomEnvironmentOfRadiusN

2017-03-27 Thread Peter Gedeck
Hello, The atom numbers start with 0. From the middle atom, there are no environments with radius 2. You will get a result if you use the first (=0) or the last (=2) atom. Try this: m = Chem.MolFromSmiles("NCO") i = Chem.FindAtomEnvironmentOfRadiusN(m, 1, 0) Chem.MolToSmiles(Chem.PathToSubmol(m,

Re: [Rdkit-discuss] FindAtomEnvironmentOfRadiusN

2017-03-27 Thread Peter Gedeck
r? > So, if a radius is greater than the number of available bonds in all > directions from the rooted atom the function will return empty list as it > considers that such environment does not exist. Is this a correct > expectation? > > > Pavel. > > > On 03/27/2017 03:5

Re: [Rdkit-discuss] FindAtomEnvironmentOfRadiusN

2017-03-27 Thread Peter Gedeck
ee from test examples for the 0 atom in CC1CC1 an environment > with radius 3 exists. Thus, radius should be interpreted as a number of > bonds from the rooted atom, not a number of atoms. I did not expect this as > well. > > > Pavel. > > On 03/27/2017 04:22 PM, Peter Gedeck wro

Re: [Rdkit-discuss] mmpdb installation on windows using mingw

2017-09-22 Thread Peter Gedeck
Here is a relevant Stack Overflow question. https://stackoverflow.com/questions/1948862/is-the-python-3-x-signal-library-for-windows-incomplete What happens if you comment out that code when running on Windows? Best Peter On Fri, Sep 22, 2017 at 7:25 AM Markus Metz wrote: > Hello Christian: > > I

Re: [Rdkit-discuss] Programatic access to the mol sanitation process results

2018-03-09 Thread Peter Gedeck
Hello Lukas, The file rdkit/TestRunner.py contains a class/context manager called OutputRedirectC. If I remember correctly, this allowed capturing these messages. It's not used anywhere in the RDKit code base, so it may not work anymore. Anyway, give it a try and if it works, you can modify it to r

[Rdkit-discuss] One flavour of mcss

2012-12-11 Thread Peter Gedeck
Hello > given a data set of let's say 2000 compounds, > how do I extract the most > common substructures rather than the > maximum common substructures? > In addition, I would like to output the > frequency of the found One approach would be to take a BRICS decomposition where you keep the full

[Rdkit-discuss] Brics question

2012-12-16 Thread Peter Gedeck
Hello Paul, If you look at your fragments you see that in your decomposition you have overlapping fragments: set(['[3*]OCC(O[3*])C1CC1', '[4*]CC([4*])C1CC1', '[3*]Oc1ncncn1', '[14*]c1ncncn1', '[3*]OCC([4*])C1CC1', '[3*]OC(C[4*])C1CC1', '[3*]OC1CNC1', '[15*]C1CNC1']) This indicates that the fragme

Re: [Rdkit-discuss] compound mass calculation

2013-12-18 Thread Peter Gedeck
Hello Yingfeng, The molecular formula in the InChI string is not the same as the one from the molecule that is encoded. It ignores charges. You can see this when you convert the mol to a SMILES: >>> sInchi = "InChI=1S/C5H9NO4/c6-3(5(9)10)1-2-4(7)8/h3H,1-2,6H2,(H,7,8)(H,9,10)/p-1/t3-/m0/s1" >>> curre

[Rdkit-discuss] Fwd: compound mass calculation

2013-12-18 Thread Peter Gedeck
-- Forwarded message -- From: Peter Gedeck Date: 19 December 2013 11:43 Subject: Re: [Rdkit-discuss] compound mass calculation To: Yingfeng Wang Hello, Here is RDKit code that will neutralize a compound: http://code.google.com/p/rdkit/wiki/NeutralisingCompounds If you are

Re: [Rdkit-discuss] Problem reading a specific smiles with the cartridge

2014-03-22 Thread Peter Gedeck
Hello, Can you construct similar SMILES like C12(C1)C2 N12CCC(C1)C2 C1(CCC2)CCC12 Are other SMILES in your dataset bicyclic systems? Does it work with the rewritten SMILES C1C2CC1C2? Best Peter On Sat, Mar 22, 2014 at 7:42 pm, Gerebtzoff, Gregori mailto

Re: [Rdkit-discuss] Problem reading a specific smiles with the cartridge

2014-03-24 Thread Peter Gedeck
Hi If we had known that the helpdesk advice would work here, ... ;-) Best, Peter On 24 March 2014 19:05, Gerebtzoff, Gregori wrote: > Hi guys, > > Many thanks for your help and suggestions! > Don't ask me why but restarting PostgreSQL did the trick, now my "C12CC(C1)C2" > smiles can be rea

Re: [Rdkit-discuss] MMFFGetMoleculeProperties()

2014-05-05 Thread Peter Gedeck
Hello, I searched through the source code for MMFFGetMoleculeProperties and found a few test files. The method MMFFGetMoleculeProperties is part of the ChemicalForceFields module: from rdkit.Chem import ChemicalForceFields def testMMFFAngleConstraints(self) : m = Chem.MolFromMolBlock(s

Re: [Rdkit-discuss] Errors while running CTest

2014-09-24 Thread Peter Gedeck
Hello, Running the tests creates the directory Testing/Temporary which contains a file LastTest.log. This file is the actual output from the tests and may help you identify the reason why your tests failed. Best, Peter On 24 September 2014 17:22, Shantheya Balasupramaniam < s.balasupraman...@
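
A quick way to pull the interesting lines out of LastTest.log is a keyword scan like the sketch below. The helper name, the keywords and the sample log text are assumptions for illustration; the real log layout may differ, so the keywords may need adjusting.

```python
def find_suspect_lines(log_text, needles=("Error", "Failed")):
    """Return (line_number, line) pairs for lines containing any keyword."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(needle in line for needle in needles):
            hits.append((number, line.strip()))
    return hits


# Hypothetical excerpt; with a real build, read Testing/Temporary/LastTest.log.
sample_log = """\
Output:
ImportError: libInfoTheory.so.1: cannot open shared object file
Test time = 0.10 sec
"""
suspects = find_suspect_lines(sample_log)
```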

Re: [Rdkit-discuss] trouble with SMARTs interpretation of 'not hydrogen'

2015-09-16 Thread Peter Gedeck
Hello, This may be just an example that you picked out of many; however, why don't you just make this atom an 'any atom'? It's in a ring, and hydrogens normally don't occur in rings. Best Peter On Thu, 17 Sep 2015 at 6:24 am, Andrew Dalke wrote: > On Sep 16, 2015, at 9:57 PM, Bodle, Christopher

Re: [Rdkit-discuss] SetProp behavior

2016-01-17 Thread Peter Gedeck
Hello To change properties of a molecule, it is not necessary to convert to a RWMol. This is required only if you want to modify the structure. molsin2[0].GetProp('PUBCHEM_ATOM_DEF_STEREO_COUNT') molsin2[0].SetProp('PUBCHEM_ATOM_DEF_STEREO_COUNT', str(5)) molsin2[0].GetProp('PUBCHEM_ATOM_DEF_STER

Re: [Rdkit-discuss] Stereochemistry Perception

2016-05-27 Thread Peter Gedeck
Hello Rob, The compound is not chiral. There is a mirror plane that contains the 5-ring and the C-NH3 bond. There are cis/trans stereoisomers here, as in 1,4-dichlorocyclohexane. That cannot be defined using the @ symbols. However I cannot tell you how to do this for cases like this in SMILES.

Re: [Rdkit-discuss] OCN = NCO, and I don't want that.

2016-06-05 Thread Peter Gedeck
Hello This is the expected behaviour. The path of length 2 creates one fragment OCN. That fragment is the same if you start from oxygen or from the nitrogen. You will get a differentiation of the O and the N if you include paths of different length. It could also be that substitution can modify t

Re: [Rdkit-discuss] Querying when using CTabs

2016-06-06 Thread Peter Gedeck
My solution for the problem was the following:

    qmol = Chem.MolFromMolBlock(molblock)
    for atom in qmol.GetAtoms():
        if atom.HasQuery():
            continue
        atom.SetNumExplicitHs(atom.GetTotalNumHs())

This gives a SMARTS like this: [#7]1(-[#6](-[#6H2]-[#6,#8]-[#6H](-[#6H2]-1)-[*])=[#8])-[*] This may b