Re: [Rdkit-discuss] ModuleNotFoundError: No module named 'rdkit'
Hi Andres,

Maybe Spyder runs on the base conda environment. Do you run Spyder from your activated environment?

Kind regards,
Christos

On Wed, Apr 14, 2021, 17:52 Andrés Sánchez Ruiz <andressanchezrui...@gmail.com> wrote:

> Dear Greg,
>
> It works! It seems I can call functions of RDKit from this console,
> however, when I start spyder and try to run them I still get the
> error. Could it be something related to the spyder interpreter?
>
> Best regards,
>
> Andrés
>
> On Wed, 14 Apr 2021 at 17:38, Greg Landrum wrote:
> >
> > That looks good so far.
> > So what happens in that exact same shell if you then start ipython
> > and do "import rdkit"?
> >
> > -greg
> >
> > On Wed, Apr 14, 2021 at 5:33 PM Andrés Sánchez Ruiz <andressanchezrui...@gmail.com> wrote:
> >>
> >> Dear Greg,
> >>
> >> After activating my environment (foodpains) I wrote the command
> >> "ipython -c 'import IPython;import rdkit;print(IPython.__file__,rdkit.__file__)'".
> >> Right after getting the output I wrote "where ipython". This is what I get:
> >>
> >> (foodpains) C:\Users\Andres Sanchez>ipython -c "import IPython;import rdkit;print(IPython.__file__,rdkit.__file__)"
> >> C:\Anaconda\envs\foodpains\lib\site-packages\IPython\__init__.py
> >> C:\Anaconda\envs\foodpains\lib\site-packages\rdkit\__init__.py
> >>
> >> (foodpains) C:\Users\Andres Sanchez>where ipython
> >> C:\Anaconda\envs\foodpains\Scripts\ipython.exe
> >> C:\Anaconda\Scripts\ipython.exe
> >>
> >> Best regards,
> >>
> >> Andrés
> >>
> >> On Wed, 14 Apr 2021 at 17:06, Greg Landrum wrote:
> >> >
> >> > That looks good. Please send the output of:
> >> > ipython -c 'import IPython;import rdkit;print(IPython.__file__,rdkit.__file__)'
> >> >
> >> > and we also need to figure out exactly which version of ipython you are running.
> >> >
> >> > If you are running these commands in the command shell, that's
> >> > where ipython
> >> >
> >> > in powershell:
> >> > gcm ipython
> >> >
> >> > if you're using a bash shell:
> >> > which ipython
> >> >
> >> > Please run the ipython -c and which/where/gcm command directly after
> >> > each other and paste in both the command you executed and its output.
> >> >
> >> > -greg
> >> >
> >> > On Wed, Apr 14, 2021 at 4:46 PM Andrés Sánchez Ruiz <andressanchezrui...@gmail.com> wrote:
> >> >>
> >> >> Dear Greg,
> >> >>
> >> >> This is what I see after activating my environment (foodpains) and
> >> >> entering your command:
> >> >>
> >> >> C:\Anaconda\envs\foodpains\lib\site-packages\IPython\__init__.py
> >> >> C:\Anaconda\envs\foodpains\lib\site-packages\rdkit\__init__.py
> >> >>
> >> >> Best regards,
> >> >>
> >> >> Andrés
> >> >>
> >> >> On Wed, 14 Apr 2021 at 15:42, Greg Landrum wrote:
> >> >> >
> >> >> > What do you see when you execute this quick test to ensure that
> >> >> > ipython and the rdkit are both really installed?
> >> >> >
> >> >> > python -c 'import IPython;import rdkit;print(IPython.__file__,rdkit.__file__)'
> >> >> >
> >> >> > -greg
> >> >> >
> >> >> > On Wed, Apr 14, 2021 at 2:58 PM Andrés Sánchez Ruiz <andressanchezrui...@gmail.com> wrote:
> >> >> >>
> >> >> >> Hello,
> >> >> >>
> >> >> >> I have not been able to solve the issue yet after installing ipython
> >> >> >> in the same environment in which I have RDKit.
> >> >> >>
> >> >> >> ipython             7.22.0     py39hd4e2768_0
> >> >> >> ipython_genutils    0.2.0      pyhd3eb1b0_1
> >> >> >> .
> >> >> >> .
> >> >> >> .
> >> >> >> rdkit               2021.03.1  py39hfadf033_0    conda-forge
> >> >> >>
> >> >> >> From this environment I can import pandas (for example) but not RDKit.
> >> >> >> What is still not working?
> >> >> >>
> >> >> >> Best regards,
> >> >> >>
> >> >> >> Andrés

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
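One way to test the suggestion that Spyder uses a different interpreter is to run the same few lines once in the console where the import works and once inside Spyder, then compare the two interpreter paths. The following is only a minimal sketch using the standard library; the helper name describe_interpreter is made up for illustration and is not part of RDKit or Spyder.

```python
import importlib.util
import sys


def describe_interpreter(module_name="rdkit"):
    """Return (path of the running interpreter, location of module_name or None)."""
    # find_spec returns None when a top-level module cannot be found on sys.path.
    spec = importlib.util.find_spec(module_name)
    location = spec.origin if spec is not None else None
    return sys.executable, location


if __name__ == "__main__":
    exe, location = describe_interpreter("rdkit")
    print("interpreter:", exe)
    print("rdkit found at:", location if location else "not importable from this interpreter")
```

If the interpreter path printed inside Spyder points to the base installation rather than to the activated environment, the import error is explained by the environment and not by RDKit itself.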
Re: [Rdkit-discuss] HasSubstructMatch & GetSubstructMatches hang when useChirality is True
Thanks a lot Paolo.

Christos

Christos Kannas
Research Software Engineer (Cheminformatics)

On Fri, 26 Mar 2021 at 16:01, Paolo Tosco wrote:

> Hi Christos,
>
> this is a possible workaround that will address your current problem:
>
> https://gist.github.com/ptosco/863cb55ace485c6664c21c244b2ca10a
>
> A better solution would be to implement in the C++ layer a callback or
> timeout similarly to MCS and other similar, potentially time consuming
> operations.
>
> Cheers,
> p.
>
> On Fri, Mar 26, 2021 at 1:15 PM Christos Kannas wrote:
>
>> Hi,
>>
>> Long story short, I'm using rdChiral to extract reaction templates from
>> Reaction SMILES.
>>
>> I've found an issue with substructure matching when using a large
>> molecule, a large pattern, having lots of chiral centers and
>> the HasSubstructMatch & GetSubstructMatches have useChirality set to True,
>> the process hangs.
>>
>> Here is a notebook showing the issue
>> https://nbviewer.jupyter.org/gist/CKannas/d54bb5ab0fa3c964086c75f18250ddac
>>
>> Is there any workaround for this?
>> Looking for a solution to stop the computation in a graceful manner.
>>
>> Thanks,
>>
>> Christos
>>
>> Christos Kannas
>>
>> Research Software Engineer (Cheminformatics)
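For readers who cannot open the gist: one generic way to stop a hanging match from Python is to run it in a separate interpreter and give up after a time limit. The sketch below is not taken from Paolo's gist and is not an RDKit feature; the function names run_python_with_timeout and has_match_with_timeout are hypothetical, and the only RDKit calls assumed are MolFromSmiles, MolFromSmarts and HasSubstructMatch with useChirality=True.

```python
import subprocess
import sys


def run_python_with_timeout(code, timeout_s):
    """Run code in a fresh interpreter; return its stdout, or None on timeout."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child process is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return None
    return proc.stdout.strip()


def has_match_with_timeout(smiles, smarts, timeout_s=10.0):
    """Return True/False for the match, or None if it timed out or failed."""
    code = (
        "from rdkit import Chem\n"
        f"mol = Chem.MolFromSmiles({smiles!r})\n"
        f"patt = Chem.MolFromSmarts({smarts!r})\n"
        "print(mol.HasSubstructMatch(patt, useChirality=True))\n"
    )
    out = run_python_with_timeout(code, timeout_s)
    # Anything other than a clean "True"/"False" (timeout, parse error) maps to None.
    return {"True": True, "False": False}.get(out)
```

Starting a new interpreter per match is expensive, so this only makes sense for the few pairs that are known to be slow; a timeout inside the C++ layer, as suggested above, would be the cleaner fix.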
[Rdkit-discuss] HasSubstructMatch & GetSubstructMatches hang when useChirality is True
Hi,

Long story short, I'm using rdChiral to extract reaction templates from Reaction SMILES.

I've found an issue with substructure matching when using a large molecule, a large pattern, having lots of chiral centers and the HasSubstructMatch & GetSubstructMatches have useChirality set to True, the process hangs.

Here is a notebook showing the issue
https://nbviewer.jupyter.org/gist/CKannas/d54bb5ab0fa3c964086c75f18250ddac

Is there any workaround for this?
Looking for a solution to stop the computation in a graceful manner.

Thanks,

Christos

Christos Kannas
Research Software Engineer (Cheminformatics)
Re: [Rdkit-discuss] conda install rdkit
Hi Ling,

Maybe you should switch to the conda-forge channel. Replace "-c rdkit" with "-c conda-forge". At least that's what I'm using personally and I have had no problems so far. The latest version of rdkit there is 2020.03.6.

Best,
Christos

Christos Kannas
Scientific Software Developer (Cheminformatics)

On Fri, 16 Oct 2020 at 21:00, Ling Chan wrote:

> Thank you Drew for your suggestion. I tried it, but it did not help.
>
> I also did a "conda clean -a" on top of that. Still, when I do
> conda create -c rdkit -n rdkenv rdkit
> it stubbornly points to
> rdkit rdkit/linux-64::rdkit-2018.03.2.0-py36h6bb024c_1
>
> and as I wrote before, I don't have a ".condarc" file in my home
> directory. Moreover, there is no "pinned" file found in any of my
> conda-meta directories. And even "conda update rdkit --no-pin" does not
> update my rdkit.
>
> I have no idea where it picks up that 2018 rdkit version and sticks to it.
>
> Ling
>
> On Mon, 12 Oct 2020 at 04:35, Drew Gibson wrote:
>
>> Hi,
>>
>> I had a similar issue in the past. Try updating conda...
>>
>> conda update conda
>>
>> then try creating your RDKit environment again.
>>
>> Drew
>>
>> On Sat, 10 Oct 2020 at 23:47, Ling Chan wrote:
>>
>>> Dear colleagues,
>>>
>>> I am trying to install RDKit using conda. According to the manual at
>>> https://www.rdkit.org/docs/Install.html#how-to-install-rdkit-with-conda
>>>
>>> it is very simple. I just need to do
>>>
>>> conda create -c rdkit -n my-rdkit-env rdkit
>>>
>>> It used to work. Somehow when I try this again, things are not working.
>>> When I investigated, it turns out that somehow the 2018.03.2.0 version of
>>> rdkit was installed instead of the most current one. It seems to me that I
>>> have screwed up my conda setup. I just wonder what I have screwed up. How can
>>> I repair it?
>>>
>>> One hint could be found in the message when I did conda create. The line
>>> for rdkit looks different from the other lines, as indicated below.
>>> Unfortunately I still could not figure it out.
>>>
>>> Thank you for your insight.
>>>
>>> Ling
>>>
>>> =
>>>
>>> > conda create -c rdkit -n tempenv rdkit
>>>
>>> The following NEW packages will be INSTALLED:
>>>
>>>   _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main
>>>   blas               pkgs/main/linux-64::blas-1.0-mkl
>>>   bzip2              pkgs/main/linux-64::bzip2-1.0.8-h7b6447c_0
>>>
>>>   python             pkgs/main/linux-64::python-3.6.12-hcff3b4d_2
>>>   python-dateutil    pkgs/main/noarch::python-dateutil-2.8.1-py_0
>>>   pytz               pkgs/main/noarch::pytz-2020.1-py_0
>>>   rdkit              rdkit/linux-64::rdkit-2018.03.2.0-py36h6bb024c_1
>>>   readline           pkgs/main/linux-64::readline-8.0-h7b6447c_0
>>>   setuptools         pkgs/main/linux-64::setuptools-50.3.0-py36hb0f4dca_1
>>>   six                pkgs/main/noarch::six-1.15.0-py_0
[Rdkit-discuss] Different InChI: RDKit Knime Vs RDKit Python
Dear RDKiters,

I'm having the following problem. I have a workflow that standardises compounds and as part of the process it generates standard InChI and InChIKey for the compound. The output is stored in an SDF.

If I parse the SDF to a dataframe in a Jupyter notebook, then use the mol object to generate standard InChI, for a small number of compounds the new standard InChI is slightly different than the one generated in the Knime environment.

Environment details:
- RDKit Knime Nodes: 3.8.0v201906261723
- RDKit Python (conda): 2018.09.3, 2019.03.4, 2019.09.1

See image: https://imgur.com/a/EnYoHWG

Best,
Christos

Christos Kannas
Scientific Software Developer (Cheminformatics)
Re: [Rdkit-discuss] Adding molecules to pandas dataframe
Hi Gianluca,

Yes, you can do that. You create a list of molecule objects from the mol2 files and then you assign this list to a new column in your dataframe. For example (Python sketch, assuming mol2_files is a list of file paths in the same order as the rows of df):

    from rdkit import Chem

    mols = []
    for mol2 in mol2_files:
        # MolFromMol2File returns None if the file cannot be parsed
        mol = Chem.MolFromMol2File(mol2)
        mols.append(mol)
    df["Molecule"] = mols

Best,
Christos

Christos Kannas
Scientific Software Developer (Cheminformatics)

On Thu, 25 Jul 2019 at 15:45, Gianluca Sforna wrote:

> Hi all,
> is it possible to manually add molecules to a pandas dataframe? I am
> reading a bunch of mol2 files, adding some properties (including some
> atom highlighting), then I'd like to add the resulting molecule to the
> dataframe in order to show its depiction along with the data.
> However, API docs and examples I found around always assume you have a
> SMILES string to start with.
>
> Any pointers?
>
> --
> Gianluca Sforna
Re: [Rdkit-discuss] Using RdKit in Parallel
Hi Stamatia,

Yes, SDMolSupplier is not thread safe. My guess is that this is due to the nature of the SDF format, where a molecule record spans multiple lines and you do not know a priori the number of lines per molecule, so the file cannot simply be split between threads/processes.

Given that, your proposed approach is the preferred one: process each SDF file in a separate process and return the matched molecules.

I would advise using the concurrent.futures package (https://docs.python.org/3/library/concurrent.futures.html) instead of multiprocessing, as it provides an abstraction layer on top of multiprocessing. See the example on ProcessPoolExecutor.

One important thing to remember when returning the list of matched molecules: make sure you preserve the molecule objects (https://rdkit.org/docs/GettingStartedInPython.html#preserving-molecules), as transferring data between processes in Python requires the data to be picklable.

Best,
Christos

Christos Kannas
Cheminformatics Researcher & Software Developer

On Wed, 20 Feb 2019 at 11:28, Stamatia Zavitsanou <stamatia.zavitsa...@oriel.ox.ac.uk> wrote:

> Hello everyone,
>
> We have been writing a script that searches through a large number of
> molecules within different files for a common substructure. To speed this
> up we have been attempting to run this script in parallel (see scripts
> below). However, the online tutorial notes make reference to problems with
> using the SDMolSupplier in parallel; we were wondering what the issue is
> and how we could circumvent it to speed up some of our calculations.
>
> Non-parallel
>
> from __future__ import print_function
> from rdkit import Chem
> import os
> from progressbar import ProgressBar
> pbar=ProgressBar()
> matches = []
> directory = 'Q:\Data2'
> patt = Chem.MolFromSmarts('NC(NNC=O)=O')
> for file in pbar(os.listdir(directory)):
>     filename = os.fsdecode(file)
>     if filename.endswith(".sdf"):
>         f = os.path.join(directory,filename)
>         suppl= Chem.SDMolSupplier(f)
>         for mol in suppl:
>             if mol is None: continue
>             if mol.HasSubstructMatch(patt):
>                 matches.append(mol)
> w = Chem.SDWriter(r'C:\Users\tom.watts\Desktop\datasmarts4c.sdf')
> for m in matches: w.write(m)
> print(filename)
>
> Parallel
>
> pbar=ProgressBar()
> matches = []
> directory = 'E:\Data'
> patt = Chem.MolFromSmarts('NC(NNC=O)=O')
> w = Chem.SDWriter(r'C:\Users\tom.watts\Desktop\SearchDataNonly.sdf')
> l=[]
> for file in pbar(os.listdir(directory)):
>     filename = os.fsdecode(file)
>     if filename.endswith(".sdf"):
>         f = os.path.join(directory,filename)
>         l.append(f)
> num_cores = multiprocessing.cpu_count()
> print(num_cores)
> lock = multiprocessing.Lock()
> def Search(i):
>     suppl= Chem.SDMolSupplier(i)
>     for mol in suppl:
>         if mol is None: continue
>         if mol.HasSubstructMatch(patt):
>             matches.append(mol)
>     return matches
> results = Parallel(n_jobs=20)(delayed(Search)(i) for i in l)
>
> We also wish to use a second script that opens one SDF file and then
> runs a loop over each molecule in the file. This is currently done
> serially and we were wondering if it could be made parallel.
>
> suppl = Chem.SDMolSupplier('Red3.sdf')
> for mol in suppl:
>     patt = Chem.MolFromSmarts('NC(N)=O')
>     num=mol.GetSubstructMatches(patt)
>     logger.debug(Chem.MolToSmiles(mol))
>     h=len(num)
>     m3=Chem.AddHs(mol)
>     cids =AllChem.EmbedMultipleConfs(m3, numConfs)
>
> Any comments can be useful.
>
> Thanks a lot,
> Stamatia Zavitsanou
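The one-file-per-process approach recommended in the reply can be sketched with concurrent.futures roughly as follows. This is an illustration under assumptions, not the posters' code: the function names are made up, the SMARTS pattern is copied from the scripts above, and hits are returned as mol blocks (plain strings) simply because strings always pickle cleanly; pickled molecule objects as described in the linked "preserving molecules" section would be the alternative. Note that mol blocks do not carry the SD data fields.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def find_sdf_files(directory):
    """Return the sorted paths of all .sdf files directly inside directory."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.lower().endswith(".sdf")
    )


def search_file(path, smarts="NC(NNC=O)=O"):
    """Worker: return the matching molecules of one SDF file as mol blocks."""
    # Imported inside the worker so that each process loads RDKit itself.
    from rdkit import Chem

    patt = Chem.MolFromSmarts(smarts)
    hits = []
    for mol in Chem.SDMolSupplier(path):
        if mol is None:  # unparsable record
            continue
        if mol.HasSubstructMatch(patt):
            hits.append(Chem.MolToMolBlock(mol))
    return hits


def search_directory(directory, max_workers=None):
    """Search every SDF file in directory, one file per task."""
    paths = find_sdf_files(directory)
    all_hits = []
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # Each supplier lives entirely inside one worker; only strings come back.
        for hits in pool.map(search_file, paths):
            all_hits.extend(hits)
    return all_hits
```

On Windows the call to search_directory has to sit under an `if __name__ == "__main__":` guard, because worker processes re-import the main module when they start.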
Re: [Rdkit-discuss] InChI to Mol to InChi
I would call it a "feature"...

I guess running conformer optimization (e.g. ETKDG, UFF, MMFF94) after the embedding would be good practice...

> I think I do vaguely remember that InChI gives precedence to 3D coordinates if present over anything else for the determination of stereochemistry.

I guess that's why there are inconsistencies [sometimes] when the molecule has been generated from a SMILES instead of from a MOL block with 2D or 3D coordinates...

Christos

Christos Kannas
Chem[o]informatics Researcher & Software Developer

On Wed, 19 Dec 2018 at 01:12, Markus Sitzmann wrote:

> I think I do vaguely remember that InChI gives precedence to 3D
> coordinates if present over anything else for the determination of
> stereochemistry. And I think that is what happens here: the AllChem
> embedding of the molecule adds 3D coordinates which are not present for the
> original molecule created straight from InChI. Probably the minimization of
> the structure during the embedding is "turning around" the stereochemistry
> (probably you could have a long discussion whether this is a bug or a
> feature),
>
> Markus
>
> -
> | Markus Sitzmann
> | markus.sitzm...@gmail.com
>
> On 18. Dec 2018, at 19:43, Jason Biggs wrote:
>
> see https://github.com/rdkit/rdkit/issues/1852, and
> https://sourceforge.net/p/rdkit/mailman/message/36309813/
>
> You can see it in the smiles if you remove stereo after embedding, then
> re-detect stereo from the conformation.
>
> inchi1 = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
> m1 = Chem.MolFromInchi(inchi1)
> m1 = Chem.AddHs(m1)
> m2 = Chem.Mol(m1)
> AllChem.EmbedMolecule(m2)
> m3 = Chem.Mol(m2)
> Chem.rdmolops.RemoveStereochemistry(m3)
> Chem.rdmolops.AssignStereochemistryFrom3D(m3)
> sm1 = Chem.MolToSmiles(m1)
> sm2 = Chem.MolToSmiles(m2)
> sm3 = Chem.MolToSmiles(m3)
> print(sm1 == sm2) # returns true
> print(sm2 == sm3) # returns false
>
> The difference between sm2 and sm3 is just swapping a \ for a /,
> confirming what Christos was able to read from the InChI.
>
> Why does the inchi reflect the 3D bond stereo but the smiles doesn't until
> you remove and re-detect the stereo? Does the InChI code go to the 3D
> structure when present and ignore stereo information in the mol object?
>
> Jason Biggs
>
> On Tue, Dec 18, 2018 at 12:14 PM Christos Kannas wrote:
>
>> Hi Jean-Marc,
>>
>> The difference is due to bond orientation (if my inchi analysis skills
>> are correct).
>> See the bold bond layer below (14-7+ vs 14-7-).
>>
>> m1 ->
>> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7+*/t17-,19-/m1/s1
>>
>> m2 ->
>> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7-*/t17-,19-/m1/s1
>>
>> Not sure why it happens, but I've seen it multiple times...
>>
>> Best,
>>
>> Christos
>>
>> Christos Kannas
>>
>> Chem[o]informatics Researcher & Software Developer
>>
>> On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD <jm.nuzill...@univ-reims.fr> wrote:
>>
>>> Thank you for your answer but alatis might not be adapted to my current
>>> problem.
>>>
>>> Attempting to understand what was changed by the embedding step I wrote:
>>>
>>> inchi1 = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
>>> m1 = Chem.MolFromInchi(inchi1)
>>> m1 = Chem.AddHs(m1)
>>> m2 = Chem.Mol(m1)
>>> AllChem.EmbedMolecule(m2)
>>> sm1 = Chem.MolToSmiles(m1)
>>> sm2 = Chem.MolToSmiles(m2)
>>> print(sm1)
>>> print(sm2)
>>> print(sm1 == sm2)
>>> inc1 = Chem.MolToInchi(m1)
>>> inc2 = Chem.MolToInchi(m2)
>>> print(inc1)
>>> print(inc2)
>>> print(inc1 == inc2)
>>>
>>> Molecules m1 and m2 have identical SMILES representations
>>> but different InChI representations, which I find odd.
>>>
>>> All the best,
>>>
>>> Jean-Mar
Re: [Rdkit-discuss] InChI to Mol to InChi
Hi Jean-Marc,

The difference is due to bond orientation (if my inchi analysis skills are correct).
See the bold bond layer below (14-7+ vs 14-7-).

m1 ->
InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7+*/t17-,19-/m1/s1

m2 ->
InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7-*/t17-,19-/m1/s1

Not sure why it happens, but I've seen it multiple times...

Best,

Christos

Christos Kannas
Chem[o]informatics Researcher & Software Developer

On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD <jm.nuzill...@univ-reims.fr> wrote:

> Thank you for your answer but alatis might not be adapted to my current
> problem.
>
> Attempting to understand what was changed by the embedding step I wrote:
>
> inchi1 = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
> m1 = Chem.MolFromInchi(inchi1)
> m1 = Chem.AddHs(m1)
> m2 = Chem.Mol(m1)
> AllChem.EmbedMolecule(m2)
> sm1 = Chem.MolToSmiles(m1)
> sm2 = Chem.MolToSmiles(m2)
> print(sm1)
> print(sm2)
> print(sm1 == sm2)
> inc1 = Chem.MolToInchi(m1)
> inc2 = Chem.MolToInchi(m2)
> print(inc1)
> print(inc2)
> print(inc1 == inc2)
>
> Molecules m1 and m2 have identical SMILES representations
> but different InChI representations, which I find odd.
>
> All the best,
>
> Jean-Marc
>
> On 18/12/2018 00:40, Dimitri Maziuk via Rdkit-discuss wrote:
> > On 12/17/18 4:50 PM, JEAN-MARC NUZILLARD wrote:
> >> Is there any more deterministic procedure than the one of trying until
> >> success is obtained?
> >>
> >> How do I determine the InChI string of a conformer obtained after
> >> multiple embedding?
> >
> > This representation keeps 3D config: http://alatis.nmrfam.wisc.edu/
> >
> > Generally speaking the problem with InChI is that the only *required*
> > layer is the formula. Therefore *an* InChI string cannot be used to
> > differentiate conformers, you need the InChI string with all the
> > relevant layers and all the protons.
> >
> > https://www.nature.com/articles/sdata201773
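Spotting which layer differs between two long InChI strings by eye is error prone, so the comparison above can also be done with a few lines of plain Python. This is only a sketch: the helper names are invented, no RDKit is needed, and it assumes each layer prefix occurs once per string, which holds for the two strings in this thread but not for every InChI (isotopic or fixed-H sections can repeat stereo layers).

```python
def inchi_layers(inchi):
    """Split an InChI string into a dict {layer prefix: layer text}."""
    parts = inchi.split("/")
    # The first two fields have no prefix letter: version and formula.
    layers = {"version": parts[0], "formula": parts[1]}
    for part in parts[2:]:
        layers[part[0]] = part[1:]
    return layers


def differing_layers(inchi_a, inchi_b):
    """Return {prefix: (text in a, text in b)} for every layer that differs."""
    a, b = inchi_layers(inchi_a), inchi_layers(inchi_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# The two strings quoted in the message (without the * emphasis markers).
COMMON = (
    "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)"
    "9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3"
)
INCHI_M1 = COMMON + "/b13-6-,14-7+/t17-,19-/m1/s1"
INCHI_M2 = COMMON + "/b13-6-,14-7-/t17-,19-/m1/s1"

print(differing_layers(INCHI_M1, INCHI_M2))
# prints {'b': ('13-6-,14-7+', '13-6-,14-7-')}
```

Only the /b (double-bond stereo) layer differs, which matches the reading given above.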
Re: [Rdkit-discuss] Matching Generalized Compounds
Hi Kovas,

You have two fuzzy compounds that you are trying to match against each other, because intuition says that the any-atom notation [*:1] in m1 should match the fluorine [F:11] in m2 and the any-atom [*:14] in m2 should match the carbon [CH3:4] in m1.

The issue here is that you create two query compounds from m1 and m2, each of which will match its own specific substructures. Query to query matching is not trivial.

In order to do what you want, you need a query compound that combines their characteristics, which is what Paolo showed. Using MCS and modifying atom properties, Paolo created that query compound:
'[*:1]-[CH2:2]-[C:3](-[*:4])=[CH2:5]' or '[*:1]-[CH2X4:2]-[CX3:3](-[*:4])=[CH2X3:5]'

Also bear in mind that Paolo's approach changed the starting compounds, as they now resemble the generic query compound that combines their fuzzy atoms.

https://gist.github.com/CKannas/ac1a4791dec909552d7c8899cfaff030

Best,
Christos

Christos Kannas
Chem[o]informatics Researcher & Software Developer

On Thu, 23 Aug 2018 at 12:36, Paolo Tosco wrote:

> Dear Kovas,
>
> It looks like GetSubstructMatch() only finds a match if the dummy atom is
> in the query, not if it is in the molecule that you are matching the query
> against.
>
> This notebook presents a possible solution off the top of my head:
>
> https://gist.github.com/ptosco/a35ac28a14103b47096f6d6af1aec831
>
> which does not involve changes to the C++ layer, even though it is
> computationally more expensive and will fail with disconnected fragments as
> it uses FindMCS(). There may be better solutions - this is what I came up
> with yesterday night in the little time I had available.
>
> Cheers,
> P.
>
> On 08/22/18 19:34, Kovas Palunas wrote:
>
> Hi All,
>
> I'm interested in having GetSubstructMatches return non-"null" results in
> the following example. The results should lead to a match where atom 1
> maps to atom 11, 2 to 12, etc.
>
> m1 = Chem.MolFromSmiles('[*:1][CH2:2][C:3]([CH3:4])=[CH2:5]')
> m2 = Chem.MolFromSmiles('[F:11][CH2:12][C:13]([*:14])=[CH2:15]')
>
> ### do something here so that the mols will match ###
> qp = Chem.AdjustQueryParameters()
> qp.makeDummiesQueries = True
> m1 = Chem.AdjustQueryProperties(m1, qp)
> m2 = Chem.AdjustQueryProperties(m2, qp)
>
> # I'd like both of the following to return results
> m1.GetSubstructMatches(m2)
> m2.GetSubstructMatches(m1)
>
> My understanding of why these mols currently do not match is as follows:
> because only the dummy atoms are made queries (based on my query parameter
> adjustment), when one mol is matched to another dummy 1 may match to F:11,
> but dummy 14 will then not match to methyl:14. This is because (as I
> understand), normal atoms can only be matched by queries, and cannot match
> them themselves.
>
> Potential ideas to make this work as I'd like:
>
>    1. Override atom.Match in the python code - not sure that this would
>       work since the C++ version of this function is what would be called
>       during GetSubstructMatches
>    2. Override atom.Match in the C++ code - not quite sure how to do
>       this, or what side effects it might have. Ideally the changes I make
>       would only affect this example (and other similar ones)
>    3. Make all atoms in both molecules QueryAtoms, but otherwise leave
>       them unchanged. I'm not quite sure how to do this!
>
> Does anyone have any ideas for what the best approach here would be, or
> knows if there is already built in functionality for something like this?
> I'd prefer to not use SMARTS to construct my molecules if possible, since I
> don't really think of them as queries, just as other molecules in the
> system that happen to not be fully specified.
>
> - Kovas
Re: [Rdkit-discuss] RDKit PostgreSQL
Hi Paolo,

Thanks, I forgot to mention that I got it working on a Linux box that I use as dev machine. Eventually I plan to deploy it on a Linux server.

When I have to build the RDKit cartridge, will I have to build just the cartridge, or will I have to rebuild both RDKit and the cartridge?

Best,
Christos

Christos Kannas
Chem[o]informatics Researcher & Software Developer
Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

On Thu, 3 May 2018 at 09:47, Paolo Tosco <paolo.tosco.m...@gmail.com> wrote:

> Hi Christos,
>
> you may use an existing PostgreSQL installation, but then you will need to
> build the cartridge from source. You may rebuild the conda RDKit cartridge
> against your system PostgreSQL - I have done it on Linux and I think it
> should be doable on macOS too. I'll give it a go tonight and get back to
> you.
>
> Cheers,
> Paolo
>
> On 05/03/18 08:16, Christos Kannas wrote:
>
> Hi Paolo,
>
> Will do, thanks!
>
> I was playing with it yesterday and I managed to have it in a working
> manner.
> I'm going through the ChEMBL example.
>
> Questions:
>
>    1. The conda recipe for rdkit-postgresql95 works only with PostgreSQL
>       9.5 and rdkit_2017.09.03, given that those are in the conda
>       environment. If I need to connect it to an existing PostgreSQL
>       installation I will have to build RDKit and the cartridge from
>       source, won't I?
>    2. Will the cartridge work in the latest RDKit 2018 version, if I
>       build it from the source?
>
> Best,
>
> Christos
>
> Christos Kannas
>
> Chem[o]informatics Researcher & Software Developer
>
> Mob (UK): +44 (0) 7447700937
> Mob (Cyprus): +357 99530608
>
> On Thu, 3 May 2018 at 08:05, Paolo Tosco <paolo.tosco.m...@gmail.com> wrote:
>
>> Hi Christos,
>>
>> It was definitely possible last time I tried, but it was some time ago.
>> Give it a try and get back to me directly (i.e., off-list) if you have
>> problems.
>>
>> Cheers,
>> p.
>>
>> On 1 May 2018, at 16:31, Christos Kannas <chriskan...@gmail.com> wrote:
>>
>> Hi RDKiters,
>>
>> Is it possible to install RDKit PostgreSQL cartridge via Anaconda on
>> MacOS?
>> Or would be better to try with a Linux VM?
>>
>> Best,
>>
>> Christos
>>
>> Christos Kannas
>>
>> Chem[o]informatics Researcher & Software Developer
>>
>> Mob (UK): +44 (0) 7447700937
>> Mob (Cyprus): +357 99530608
Re: [Rdkit-discuss] RDKit PostgreSQL
Hi Paolo,

Will do, thanks!

I was playing with it yesterday and I managed to have it in a working manner.
I'm going through the ChEMBL example.

Questions:

   1. The conda recipe for rdkit-postgresql95 works only with PostgreSQL 9.5 and rdkit_2017.09.03, given that those are in the conda environment. If I need to connect it to an existing PostgreSQL installation I will have to build RDKit and the cartridge from source, won't I?
   2. Will the cartridge work in the latest RDKit 2018 version, if I build it from the source?

Best,
Christos

Christos Kannas
Chem[o]informatics Researcher & Software Developer
Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

On Thu, 3 May 2018 at 08:05, Paolo Tosco <paolo.tosco.m...@gmail.com> wrote:

> Hi Christos,
>
> It was definitely possible last time I tried, but it was some time ago.
> Give it a try and get back to me directly (i.e., off-list) if you have
> problems.
>
> Cheers,
> p.
>
> On 1 May 2018, at 16:31, Christos Kannas <chriskan...@gmail.com> wrote:
>
> Hi RDKiters,
>
> Is it possible to install RDKit PostgreSQL cartridge via Anaconda on MacOS?
> Or would be better to try with a Linux VM?
>
> Best,
>
> Christos
>
> Christos Kannas
>
> Chem[o]informatics Researcher & Software Developer
>
> Mob (UK): +44 (0) 7447700937
> Mob (Cyprus): +357 99530608
[Rdkit-discuss] RDKit PostgreSQL
Hi RDKiters,

Is it possible to install the RDKit PostgreSQL cartridge via Anaconda on MacOS? Or would it be better to try with a Linux VM?

Best,

Christos

Christos Kannas
Chem[o]informatics Researcher & Software Developer
Re: [Rdkit-discuss] seg fault when importing Chem on OS-X 10.12
Hi Patrick,

I had a similar problem with RDKit 2017.09.03 on MacOS, using the rdkit channel in Anaconda. Using the conda-forge channel with Python 3.5.5 and IPython 6.2 works fine. I can post my env tomorrow from work.

Best,

Christos

Christos Kannas
Chem[o]informatics Researcher & Software Developer

On 16 April 2018 at 18:57, Brian Cole <col...@gmail.com> wrote:
> Pat, the beta for 2018.03 seems to work fine:
>
> conda install -c rdkit/label/beta rdkit
>
> My current guess is that there is a boost python dependency problem with
> RDKit 2017 for Python 3. I've twiddled between a few different boost
> versions in the conda environment, but to no success in getting 2017
> working.
>
> -Brian
>
> On Mon, Apr 16, 2018 at 1:20 PM, Brian Cole <col...@gmail.com> wrote:
>
>> I can reproduce the problem, and the issue does appear to be different
>> than the previous issue. Reproducible with the following on OSX:
>>
>> $ conda create -c rdkit -n rdkit_2017 rdkit python=3.5
>> $ source activate rdkit_2017
>> $ python -c 'import rdkit.rdBase'
>> Segmentation fault: 11
>>
>> $ lldb -- python -c 'import rdkit.rdBase'
>> (lldb) target create "python"
>> Current executable set to 'python' (x86_64).
>> (lldb) settings set -- target.run-args "-c" "import rdkit.rdBase"
>> (lldb) run
>> Process 14929 launched: '/Users/coleb/anaconda2/envs/rdkit_2017/bin/python' (x86_64)
>> Process 14929 stopped
>> * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xa9)
>>     frame #0: 0x0001001c301d python`visit_decref + 13
>> python`visit_decref:
>> ->  0x1001c301d <+13>: testb  $0x40, 0xa9(%rax)
>>     0x1001c3024 <+20>: jne    0x1001c302f ; <+31>
>>     0x1001c3026 <+22>: xorl   %eax, %eax
>>     0x1001c3028 <+24>: addq   $0x8, %rsp
>> Target 0: (python) stopped.
>> (lldb) bt
>> * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xa9)
>>   * frame #0: 0x0001001c301d python`visit_decref + 13
>>     frame #1: 0x0001000a07a7 python`tupletraverse + 55
>>     frame #2: 0x0001001c1eb3 python`collect + 291
>>     frame #3: 0x000100196972 python`Py_Finalize + 226
>>     frame #4: 0x0001001bfbc8 python`Py_Main + 3096
>>     frame #5: 0x00011301 python`main + 497
>>     frame #6: 0x7fff5fe23015 libdyld.dylib`start + 1
>>     frame #7: 0x7fff5fe23015 libdyld.dylib`start + 1
>> (lldb) info threads
>>
>> On Mon, Apr 16, 2018 at 1:11 PM, Brian Cole <col...@gmail.com> wrote:
>>
>>> An issue like this was fixed in the past:
>>> https://github.com/rdkit/rdkit/commit/009dd580527caa662de8bac5ad0c60f1e9bc90cd
>>>
>>> Will see if I can reproduce this.
>>>
>>> -Brian
>>>
>>> On Mon, Apr 16, 2018 at 12:09 PM, Patrick Walters <wpwalt...@gmail.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I installed the latest RDKit using conda
>>>>
>>>> conda create -c rdkit -n rdkit_2017 rdkit
>>>>
>>>> When I import Chem I get a seg fault
>>>>
>>>> ➜ ~ source activate rdkit_2017
>>>> (rdkit_2017) ➜ ~ python
>>>> Python 3.5.5 |Anaconda, Inc.| (default, Mar 12 2018, 16:25:05)
>>>> [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
>>>> Type "help", "copyright", "credits" or "license" for more information.
>>>> >>> from rdkit import Chem
>>>> [1]85097 segmentation fault python
>>>>
>>>> Has anyone else encountered this?
>>>>
>>>> Thanks,
>>>>
>>>> Pat
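Brian's one-line reproduction can also be scripted, which is handy when comparing several conda environments. The following is only a sketch: the helper name import_exit_code is made up for illustration, and the interpretation of a negative return code as "killed by a signal" applies to POSIX systems such as macOS and Linux.

```python
import subprocess
import sys


def import_exit_code(module_name, python=sys.executable):
    """Try `import module_name` in a fresh interpreter and return its exit code.

    Running the import in a child process means a crash (including one that
    only happens at interpreter shutdown) cannot take down the calling session.
    """
    result = subprocess.run(
        [python, "-c", "import " + module_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    code = import_exit_code("rdkit.rdBase")
    if code == 0:
        print("import succeeded")
    elif code < 0:
        # On POSIX a negative value means the child was killed by that signal
        # number (11 is SIGSEGV).
        print("interpreter killed by signal", -code)
    else:
        print("import failed with exit code", code)
```

Passing the path of another environment's python as the second argument checks that environment instead of the current one.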
Re: [Rdkit-discuss] Exhaustive Library Enumeration
Hi Andy,

A better option is to sanitize the products of a reaction enumeration before using them as reactants. Look at this example from the RDKit "Getting Started" documentation:

Note that the molecules that are produced by the chemical reaction processing code are not sanitized, as this artificial reaction demonstrates:

>>> rxn = AllChem.ReactionFromSmarts('[C:1]=[C:2][C:3]=[C:4].[C:5]=[C:6]>>[C:1]1=[C:2][C:3]=[C:4][C:5]=[C:6]1')
>>> ps = rxn.RunReactants((Chem.MolFromSmiles('C=CC=C'), Chem.MolFromSmiles('C=C')))
>>> Chem.MolToSmiles(ps[0][0])
'C1=CC=CC=C1'
>>> p0 = ps[0][0]
>>> Chem.SanitizeMol(p0)
rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>>> Chem.MolToSmiles(p0)
'c1ccccc1'

PS: I had forgotten that the results of a reaction enumeration are not sanitised, until I saw the error on the command line.

Best,

Christos

Christos Kannas
Chem[o]informatics Researcher & Software Developer

On 18 January 2018 at 00:07, Andy Jennings <andy.j.jenni...@gmail.com> wrote:
> Hi Christos,
>
> Many thanks for the reply. I hadn't appreciated that the presence of a
> single invalid reagent would bring the entire thing crashing down, rather
> than issuing a warning/error and moving onto other molecules in the set.
> Good to know, and I'll have to be less lazy in my code ;-)
>
> Best,
> Andy
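The advice to sanitize reaction products before reusing them can be wrapped in a small helper. This is a sketch rather than anything shipped with RDKit: the function name sanitized_products is hypothetical, and catching a broad Exception is a deliberate simplification because the exact exception class raised by SanitizeMol differs between RDKit versions.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def sanitized_products(rxn, reactants):
    """Run rxn on reactants; return {canonical SMILES: mol} for products that sanitize."""
    seen = {}
    for product_set in rxn.RunReactants(reactants):
        for mol in product_set:
            try:
                Chem.SanitizeMol(mol)  # modifies mol in place
            except Exception:
                # Product is chemically invalid (e.g. bad valence): skip it.
                continue
            seen[Chem.MolToSmiles(mol)] = mol
    return seen


# The artificial ring-forming reaction from the Getting Started documentation.
rxn = AllChem.ReactionFromSmarts(
    '[C:1]=[C:2][C:3]=[C:4].[C:5]=[C:6]>>[C:1]1=[C:2][C:3]=[C:4][C:5]=[C:6]1'
)
products = sanitized_products(
    rxn, (Chem.MolFromSmiles('C=CC=C'), Chem.MolFromSmiles('C=C'))
)
print(sorted(products))
```

Keying the dictionary on canonical SMILES also removes the duplicates that arise when the reaction template matches the same reactants in more than one way.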
Re: [Rdkit-discuss] Exhaustive Library Enumeration
Hi Andy,

The reason that your code breaks is that the second product of the third iteration ('NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1') is not a valid molecule: it contains a neutral nitrogen with four substituents. Calling Chem.MolFromSmiles('NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1') therefore returns None.
So you have to filter out the molecules that are not valid.

See this Jupyter Notebook <https://gist.github.com/CKannas/11bb9bcaa9435dd18a0bb969501219b2>, cell 5, the first line in the while loop.

Best,

Christos

Christos Kannas
Chem[o]informatics Researcher & Software Developer

On 17 January 2018 at 18:16, Andy Jennings <andy.j.jenni...@gmail.com> wrote:
> Hi RDKitters,
>
> I have a question and an observation on the topic of library enumeration.
>
> First, the question: is there a call within RDKit to trigger the
> exhaustive reaction of reagents? For example, if I have two reagents - a
> primary amine and an alkyl chloride - can I tell RDKit to enumerate the
> reaction as though there were an excess of each reagent? In my case here
> the reaction would continue until the alkylation can no longer occur
> because there are no more valences available on the amine and I would
> either be tri-alkylated for a neutral product or quat-alkylated for a
> positively charged product
> e.g. CCN + RCl -> CCN(R)(R)R or CC[N+](R)(R)(R)R
>
> This brings me to my observation. When I try to attempt exactly this by
> repeatedly exposing the product to the reagent again I am able to drive it
> to exhaustion *in some cases*.
>
> For example, in the example above where RCl is benzyl chloride and my
> smirks is:
> [#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]'
> I do drive the final product to be exclusively the tri-alkylated amine.
> Success.
>
> However, when I attempt the same thing with an amine with more than one
> reactive nitrogen (e.g. NN) I don't get a single product with 6
> alkylations, I get two unique products each with three alkylations. One
> product has two alkylations on the first nitrogen and one on the second,
> the other product has three alkylations on the first nitrogen and none on
> the second. Attempting to drive the reaction once again leads to a
> 'reaction called with None reactants' ValueError. My dreadful code is below
> and the output is
> Reaction 1: ['NNCc1ccccc1']
> Reaction 2: ['NN(Cc1ccccc1)Cc1ccccc1', 'c1ccc(CNNCc2ccccc2)cc1']
> Reaction 3: ['c1ccc(CNN(Cc2ccccc2)Cc2ccccc2)cc1',
> 'NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1']
> Reaction 4: ValueError
>
> Any pointers would be great, as would any pre-existing library enumeration
> code. The examples I've found shipped with RDKit don't appear to allow me
> to name the products using a combination of the reagent names (useful for
> tracking library content).
>
> Best,
> Andy
>
> Code snippet
>
> amine = Chem.MolFromSmiles('NN')
> acyl = Chem.MolFromSmiles('c1ccccc1CCl')
> rxn = AllChem.ReactionFromSmarts('[#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]')
>
> # First reaction
> reactantListMols = [amine,acyl]
> prods = AllChem.EnumerateLibraryFromReaction(rxn,[reactantListMols,reactantListMols])
> prods = list(prods)
> smis = list(set([Chem.MolToSmiles(x[0],isomericSmiles=True) for x in prods]))
> print smis
> # ['NNCc1ccccc1']
>
> # Now repeat until doom
> for i in range(0,10):
>     oldproducts = [Chem.MolFromSmiles(x) for x in smis]
>     reactantListMols = oldproducts + [acyl]
>     prods = AllChem.EnumerateLibraryFromReaction(rxn,[reactantListMols,reactantListMols])
>     prods = list(prods)
>     smis = list(set([Chem.MolToSmiles(x[0],isomericSmiles=True) for x in prods]))
>     print smis
>
> End Code
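The filtering step described in the reply above can be sketched in a few lines. The helper name parse_valid is made up for illustration, and the two SMILES strings are assumed to be the benzylated hydrazines discussed in the thread: the first is a normal molecule, the second carries a neutral nitrogen with four substituents and so fails to parse.

```python
from rdkit import Chem, RDLogger

# Silence the parse errors RDKit logs for the invalid SMILES below.
RDLogger.DisableLog('rdApp.*')


def parse_valid(smiles_list):
    """Parse SMILES strings and keep only those RDKit accepts (MolFromSmiles != None)."""
    mols = (Chem.MolFromSmiles(smi) for smi in smiles_list)
    return [mol for mol in mols if mol is not None]


candidates = [
    'NNCc1ccccc1',                        # mono-benzylated hydrazine: valid
    'NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1',  # neutral N with four substituents: invalid
]
valid = parse_valid(candidates)
print(len(candidates), 'candidates,', len(valid), 'valid')
```

Inside an iterative enumeration loop, the same filter would be applied to the previous round's products before they are passed back in as reactants.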
Re: [Rdkit-discuss] adding fragment to existing molecule
Hi Per,

I can think of two approaches to solve this.

The first is to have fragments of molecules that have an explicit connection point, e.g. [*]O and C[*], and use RDKit's functionality for combining fragments.

The second is to define a reaction for this using SMIRKS or Reaction SMILES, e.g. [OH-].C>>CO, and use RDKit's reaction functionality to perform the reaction on your molecules.

I hope this is of some help.

Regards,

Christos

Christos Kannas
Chem[o]informatics Researcher & Software Developer

On 7 August 2017 at 18:46, Per Jr. Greisen <pgrei...@gmail.com> wrote:
> Hi Nikolaus and Ling,
>
> Thanks for your help (the atom number shouldn't be 43, but it still gives the
> error; I will clarify). Yes Nikolaus, you are right, it is a sanitization
> issue, and in this case I am trying to use it as a molecular editor to build
> a model molecule (a transition state model to be exact). I would normally
> do this calling some other script, but it would be very nice to do all of it
> in the framework of RDKit. Can this be done? Thanks
>
> On Mon, Aug 7, 2017 at 12:05 PM, Stiefl, Nikolaus <nikolaus.sti...@novartis.com> wrote:
>
>> Hi Per
>>
>> Just by looking at your code I would assume you have a sanitization
>> issue. You create your pentane molecule and then add H’s. This will
>> saturate each single carbon. When you then add a bond between the two
>> fragments your atom 3 will have a valence of 5 and this causes issues.
>>
>> Maybe do the fragment combination first and then add the H’s? Or do an
>> explicit handling of the correct carbon you link to upfront.
>>
>> Hope this helps
>>
>> Nik
>>
>> From: "Per Jr. Greisen" <pgrei...@gmail.com>
>> Date: Sunday 6 August 2017 at 19:55
>> To: RDKit <rdkit-discuss@lists.sourceforge.net>
>> Subject: [Rdkit-discuss] adding fragment to existing molecule
>>
>> Hi all,
>>
>> I am trying to add a fragment to an existing molecule using RDkit - I
>> start by generating the desired molecules I would like to combine:
>>
>> oh = '[OH-]'
>> ohh = Chem.MolFromSmiles(oh)
>> oh = Chem.AddHs(ohh)
>> oh.SetProp("_Name","OH-")
>> AllChem.EmbedMolecule(oh, AllChem.ETKDG())
>>
>> smiles_ = 'CCCCC'
>> m = Chem.MolFromSmiles(smiles_)
>> m_h = Chem.AddHs(vxm)
>> m_h.SetProp("_Name","XP")
>> AllChem.EmbedMolecule(m_h, AllChem.ETKDG())
>>
>> I combine them which works fine:
>>
>> combo = Chem.CombineMols(m_h,oh)
>>
>> and I can add the bond between the desired atoms:
>>
>> edcombo = Chem.EditableMol(combo)
>> edcombo.AddBond(3,1,order=Chem.rdchem.BondType.SINGLE)
>> back = edcombo.GetMol()
>>
>> The problem arises when I want to edit the geometry between the two:
>>
>> from rdkit.Chem import rdMolTransforms as rdmt
>> conf = back.GetConformer(0)
>> rdmt.SetBondLength(conf,3,43,10)
>> writer3 = Chem.SDWriter('out_long.sdf')
>> writer3.write(back,confId=0)
>>
>> RuntimeError                              Traceback (most recent call last)
>> in ()
>>       2 conf = back.GetConformer(0)
>>       3
>> ----> 4 rdmt.SetBondLength(conf,3,43,10)
>>       5
>>       6 writer3 = Chem.SDWriter('out_long.sdf')
>>
>> RuntimeError: Pre-condition Violation
>> RingInfo not initialized
>> Violation occurred on line 66 in file Code/GraphMol/RingInfo.cpp
>> Failed Expression: df_init
>> RDKIT: 2017.03.3
>> BOOST: 1_56
>>
>> So I am not sure how to fix this - thanks in advance
>>
>> --
>> With kind regards
>>
>> Per
>
> --
> With kind regards
>
> Per
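One way to follow the advice in this thread (combine first, sanitize, then add hydrogens) is sketched below. It is an illustration under stated assumptions, not Per's actual workflow: the atom indices, the choice to neutralise the oxygen so that the result is a plain alcohol, the random seed, and the 2.0 Å target length are all picked for the example. The key point is that Chem.SanitizeMol initialises the ring information that SetBondLength requires.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolTransforms

pentane = Chem.MolFromSmiles('CCCCC')    # heavy atoms 0-4
hydroxide = Chem.MolFromSmiles('[OH-]')  # becomes atom 5 after combining

# Combine the fragments while they still carry implicit hydrogens,
# then bond the oxygen to the middle carbon.
rw = Chem.RWMol(Chem.CombineMols(pentane, hydroxide))
rw.AddBond(2, 5, Chem.BondType.SINGLE)
rw.GetAtomWithIdx(5).SetFormalCharge(0)  # assumption: model a neutral C-O-H group
mol = rw.GetMol()

# Sanitization recomputes valences and initialises ring information.
Chem.SanitizeMol(mol)

# Only now add explicit hydrogens and generate 3D coordinates.
mol_h = Chem.AddHs(mol)
status = AllChem.EmbedMolecule(mol_h, randomSeed=42)  # 0 means success

# Stretch the new C-O bond; the atoms on the oxygen side move with it.
conf = mol_h.GetConformer()
rdMolTransforms.SetBondLength(conf, 2, 5, 2.0)
length = rdMolTransforms.GetBondLength(conf, 2, 5)
print(Chem.MolToSmiles(mol), round(length, 3))
```

For a genuine transition-state model with unusual valences, full sanitization may reject the structure; in that case a partial sanitization or an explicit ring-finding call would be needed instead, which this sketch does not cover.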
Re: [Rdkit-discuss] Using RDKit in PyCharm and Anaconda on Windows
Hi Richard,

I use RDKit with Miniconda and PyCharm as you do, which makes things easier as far as autocomplete in the IDE is concerned. But I do not use PyCharm's Python console, for the same reason you describe; instead I keep a cmd window open with my Python environment activated to run tests.

I think Greg's proposed script is the best approach for having different Python environments activated along with additional instances of PyCharm.

Regards,

Christos

Christos Kannas
Researcher
Ph.D Student

On 31 May 2017 at 09:56, Greg Landrum <greg.land...@gmail.com> wrote:
> Hi Richard,
>
> I'm glad you found something that works. I'm not enough of a Windows
> expert to be able to provide an example of how to fix it, but here's what
> you need to do:
>
> When you activate a conda environment a bunch of stuff happens behind the
> scenes, it's not generally sufficient to just call the corresponding
> interpreter directly (as you've seen). So you really need to activate the
> environment before invoking PyCharm. I would guess that the easiest way to
> do this is to create a batch file that first does the environment activation
> and then invokes PyCharm. That way you only have one command that you need
> to execute.
>
> A more "fixing the problem with a big hammer" solution, and probably not
> something that would be recommended, would be to install the rdkit into the
> root conda environment (i.e. do `conda install -c rdkit rdkit` without
> activating an environment first). This should allow things to work directly.
>
> -greg
>
> On Tue, May 30, 2017 at 10:10 PM, West, Richard <r.w...@northeastern.edu> wrote:
>
>> We're having trouble getting RDKit to work in a PyCharm project using an
>> Anaconda interpreter (Python 2.7), on Windows 8.1.
>> Has anyone had success with this and can guide us?
>> The trouble is we get an
>>
>> ImportError: DLL load failed: The specified module could not be found.
>>
>> when trying to import rdkit (or rdBase).
>>
>> We have tried many variations of the following, but here is a basic
>> recipe of what does/doesn't work:
>> 1. Make a new conda environment (called 'eg1'), install rdkit ('conda
>>    install -c rdkit rdkit')
>> 2. From a cmd.exe prompt, use this environment ('activate eg1'), load
>>    python ('python') and import rdkit ('import rdkit'): it works fine.
>> 3. From PyCharm, create a Project Interpreter (pointing to
>>    'C:\Anaconda2\envs\eg1\python.exe'), and use this to run a script or
>>    create a new Python Console in which you 'import rdkit', leading to the
>>    "DLL load failed" message.
>> 4. We have tried manually adding a bunch of things to the "Interpreter
>>    Paths" in PyCharm, but without success (perhaps we just didn't add the
>>    right thing).
>>
>> Update: just before I hit "send" on this request for help, we stumbled
>> across this posting of the same problem, and solution, from Christian
>> Ribeaud:
>> https://intellij-support.jetbrains.com/hc/en-us/community/posts/115000244450-DLL-load-failed
>>
>> It seems that if we open cmd.exe, activate the environment, and then
>> launch PyCharm exe from there, it works.
>> I'm sharing this here because it took us a while to find the other post,
>> but also to ask: is there a "better" way?
>>
>> Cheers,
>> Richard
>>
>> --
>> Richard H. West, Ph.D.
>> Assistant Professor, Department of Chemical Engineering,
>> Northeastern University, 360 Huntington Ave, Boston, MA 02115
>> http://northeastern.edu/comocheng
>> Phone: 617-373-5163
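To see whether an interpreter started by the IDE is missing what activation would normally set up, a quick check of PATH can help. The sketch below is a diagnostic aid only. The list of subdirectories is an assumption about what conda's activation typically prepends to PATH on Windows (the environment root, Library\bin and Scripts), and the function name missing_env_dirs is made up for the example.

```python
import os
import sys

# Assumption: directories that activating a conda environment typically
# puts on PATH on Windows. Adjust for your installation.
ASSUMED_SUBDIRS = ("", os.path.join("Library", "bin"), "Scripts")


def missing_env_dirs(prefix, path_value, subdirs=ASSUMED_SUBDIRS):
    """Return the expected directories under prefix that are absent from path_value."""
    on_path = {
        os.path.normcase(os.path.normpath(entry))
        for entry in path_value.split(os.pathsep)
        if entry
    }
    wanted = [os.path.normpath(os.path.join(prefix, sub)) for sub in subdirs]
    return [d for d in wanted if os.path.normcase(d) not in on_path]


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    for directory in missing_env_dirs(sys.prefix, os.environ.get("PATH", "")):
        print("not on PATH:", directory)
```

Running this once from the activated cmd.exe prompt and once from the PyCharm console should show whether the two sessions see different PATH entries, which would be consistent with the "DLL load failed" error.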
Re: [Rdkit-discuss] How to transform SMARTS of aromatic structures so that their aromatic atoms could be any?
Hi Alexis,

In SMARTS you can define an aromatic atom with "a". So I'm thinking that something like the following might produce more correct generalised SMARTS patterns:

https://gist.github.com/CKannas/7a9e2768461260461155257fd30c2152

*Note: Please check if the chemistry is correct.*

Best,

Christos

Christos Kannas
Researcher
Ph.D Student

On 19 May 2017 at 12:52, Alexis Parenty <alexis.parenty.h...@gmail.com> wrote:
> Hi everyone,
>
> I need a function that could generalize any aromatic rings from a SMARTS:
>
> [image: Inline images 1]
>
> I have noticed that it is possible to rearrange most SMARTS strings
> into general aromatic SMARTS strings by following these simple rules:
>
> 1. Exchange any lower-case letter of a SMARTS string with “:[*]”
> 2. Catch the two cycle junctions of the SMARTS:
>    a. Where a number (1-9) appears a first time in the string: insert a
>       colon after the digit (for example “[*]1” to “[*]1:”)
>    b. Where the same number appears a second time, move the colon
>       before the digit (for example “[*]1:” to “[*]:1”)
>
> I have written a function (see below) that works fine with any SMARTS
> containing a single aromatic ring. But it does get buggy when I have a
> SMARTS with more than one aromatic ring:
>
> [image: Inline images 2]
>
> def get_aromatic_generalised_smarts(smarts):
>     for arom_atom in ("c", "o", "n", "s"):
>         smarts = smarts.replace(arom_atom, "x")
>     smarts = smarts.replace("[xH]", "x")  # to take care of explicit hydrogen atoms
>
>     for char in smarts:
>         if char == 'x':
>             smarts = smarts.replace(char, ":[*]")
>
>     for char in smarts:
>         if char.isdigit():
>             if ("[*]"+char) in smarts:
>                 for cycle_junction in ("[*]1", "[*]2", "[*]3", "[*]4", "[*]5",
>                                        "[*]6", "[*]7", "[*]8", "[*]9"):
>                     # makes the second cycle junction OK but introduces an error in
>                     # the first cycle junction, which is corrected on the next line
>                     smarts = smarts.replace(cycle_junction, "[*]:" + cycle_junction[-1])
>                 smarts = smarts.replace(":[*]:"+char, "[*]"+char, 1)  # to correct the first cycle junction
>                 break
>     return smarts
>
> print(get_aromatic_generalised_smarts("[*]c1coc(Cl)n1"))
> print(get_aromatic_generalised_smarts("[*]c1coc(Cl)c1"))
> print(get_aromatic_generalised_smarts("[*]c1coc(Cl)c1Cc2ccccc2"))
>
> Am I heading in the right direction? I can't get my head around SMARTS
> with more than one aromatic ring...
>
> Maybe regular expressions would be more appropriate? Maybe there is an
> RDKit function that does the trick from a mol object?
>
> Thanks,
>
> Alexis
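The effect of the "a" primitive mentioned in the reply can be checked directly. The sketch below does not reproduce the linked gist or automate the string rewriting; it only compares one of the specific patterns from the question with a hand-written generalised form, where every ring atom is written as "a" (any aromatic atom). The three test molecules are chosen for illustration.

```python
from rdkit import Chem

specific = Chem.MolFromSmarts('[*]c1coc(Cl)n1')  # pattern from the question
general = Chem.MolFromSmarts('[*]a1aaa(Cl)a1')   # every ring atom: any aromatic atom

examples = {
    'oxazole': 'Cc1coc(Cl)n1',
    'thiophene': 'Cc1csc(Cl)c1',
    'saturated ring': 'CC1COC(Cl)N1',
}

results = {}
for name, smiles in examples.items():
    mol = Chem.MolFromSmiles(smiles)
    results[name] = (mol.HasSubstructMatch(specific), mol.HasSubstructMatch(general))
    print(name, results[name])
```

The generalised pattern matches both aromatic five-membered rings but not the saturated analogue, while the specific pattern matches only the oxazole.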
Re: [Rdkit-discuss] comparing two or more tables of molecules
Hi Steve,

I think it would be better to use a similarity metric based on fingerprints.

Regards,

Christos

Christos Kannas
Researcher
Ph.D Student

On 28 November 2016 at 18:25, Stephen O'hagan <soha...@manchester.ac.uk> wrote:
> Has anyone come up with a fool-proof way of matching structurally equivalent
> molecules?
>
> Unique Smiles or InChI String comparisons don’t appear to work, presumably
> because there are different but equivalent structures, e.g. explicit vs
> non-explicit H’s, Kekule vs Aromatic, isomeric forms vs non-isomeric form,
> tautomers etc.
>
> I also expect that comparing InChI strings might need something more than
> just a simple string comparison, such as masking off stereo information
> when you don’t care about stereo isomers.
>
> I assume there are suitable tools within RDKit that can do this?
>
> N.B. I need to collate tables from several sources that have a mix of
> smiles / InChI / sdf molecular representations.
>
> I usually use RDKit via Python and/or Knime.
>
> Cheers,
>
> Steve.
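As a rough illustration of both ideas in this exchange, the sketch below builds a lookup key from a canonical, non-isomeric SMILES (which absorbs explicit hydrogens, Kekule versus aromatic input, and stereo labels) and then computes a fingerprint similarity for one matched pair. The table contents and the helper name structure_key are invented for the example, and tautomers are not handled: two tautomers would still receive different keys.

```python
from rdkit import Chem, DataStructs


def structure_key(mol):
    """Canonical SMILES with hydrogens removed and stereo ignored, used as a lookup key."""
    return Chem.MolToSmiles(Chem.RemoveHs(mol), isomericSmiles=False)


# Hypothetical tables holding the same compounds written differently.
table_a = {'phenol (Kekule)': 'C1=CC=CC=C1O', 'L-alanine': 'C[C@H](N)C(=O)O'}
table_b = {'phenol (explicit H)': '[H]Oc1ccccc1', 'alanine (no stereo)': 'CC(N)C(=O)O'}

# Index table_b by structure key, then look up each entry of table_a.
index_b = {structure_key(Chem.MolFromSmiles(s)): name for name, s in table_b.items()}
matches = {
    name: index_b.get(structure_key(Chem.MolFromSmiles(s)))
    for name, s in table_a.items()
}
print(matches)

# Fingerprint similarity as a softer comparison between two records.
fp_a = Chem.RDKFingerprint(Chem.MolFromSmiles(table_a['phenol (Kekule)']))
fp_b = Chem.RDKFingerprint(Chem.MolFromSmiles(table_b['phenol (explicit H)']))
similarity = DataStructs.TanimotoSimilarity(fp_a, fp_b)
print(similarity)
```

Records arriving as SDF or InChI would first be read into RDKit molecules with the corresponding readers and then passed through the same key function.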
Re: [Rdkit-discuss] Count carbon atoms
Hi Joos,

Yes, there is an easier way: use a substructure search, i.e. search for [C] and then get the number of matches.

I hope this example is readable and answers your question.

In [1]: from rdkit import rdBase
        print rdBase.rdkitVersion
        from rdkit import Chem
        from rdkit.Chem import AllChem
        from rdkit.Chem import Draw
        from rdkit.Chem.Draw import IPythonConsole
2014.09.2

In [2]: m = Chem.MolFromSmiles("c1c1")
        m
Out[2]: [image: Inline images 1]

In [3]: patt = Chem.MolFromSmarts("[C]")
        print Chem.MolToSmarts(patt)
C

In [4]: pm = m.GetSubstructMatches(patt)
        print len(pm)
8

Regards,

Christos

Christos Kannas
Researcher
Ph.D Student

On 7 October 2015 at 10:12, Joos Kiener <joos.kie...@gmail.com> wrote:
> Hi all,
>
> is there an easy way I'm missing to get the number of C-Atoms in a
> molecule?
>
> Currently I iterate all atoms and check if its symbol is C. Doesn't seem
> very efficient.
>
> Best Regards,
>
> Joos Kiener
>
> --
> Full-scale, agent-less Infrastructure Monitoring from a single dashboard
> Integrate with 40+ ManageEngine ITSM Solutions for complete visibility
> Physical-Virtual-Cloud Infrastructure monitoring from one console
> Real user monitoring with APM Insights and performance trend reports
> Learn More
> http://pubads.g.doubleclick.net/gampad/clk?id=247754911=/4140
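One caveat when counting by substructure search: in SMARTS, [C] matches aliphatic carbon only, whereas [#6] matches any carbon, aromatic or not. The sketch below compares the two patterns with a plain atom loop; ethylbenzene is chosen here purely as an illustration.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('CCc1ccccc1')  # ethylbenzene: 2 aliphatic + 6 aromatic carbons

# [C] is aliphatic carbon only; [#6] is carbon by atomic number.
aliphatic_carbons = len(mol.GetSubstructMatches(Chem.MolFromSmarts('[C]')))
all_carbons = len(mol.GetSubstructMatches(Chem.MolFromSmarts('[#6]')))

# The straightforward loop gives the same total as the [#6] search.
by_atomic_number = sum(1 for atom in mol.GetAtoms() if atom.GetAtomicNum() == 6)

print(aliphatic_carbons, all_carbons, by_atomic_number)
```

GetSubstructMatches caps the number of matches it returns (maxMatches, 1000 by default), so for very large molecules the atom loop is the safer count.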
Re: [Rdkit-discuss] install rdkit cartridge
Hi Tim,

I'm not sure about this, but I think that in order to use the cartridge you have to build RDKit and the cartridge from source.

Regards,

Christos Kannas

Sent from my Galaxy Note 4.

On 24 Apr 2015 13:47, Tim Dudgeon tdudgeon...@gmail.com wrote:
> Is it possible when using the packages?
> I'm trying to get a reproducible build process so prefer not to have to
> build from sources.
>
> Tim
>
> On 24/04/2015 13:29, Axel Pahl wrote:
>> The instructions below apply to the RDKit installation from source and
>> are probably not applicable to the Debian package, sorry for generating
>> confusion.
>>
>> Kind regards,
>> Axel
>>
>> On 04/24/2015 02:24 PM, Axel Pahl wrote:
>>> Dear Tim,
>>>
>>> please take a look at this README in your installation:
>>> <your RDKit folder>/Code/PgSQL/rdkit/README
>>>
>>> It essentially boils down to this:
>>>
>>> cd $RDBASE/Code/PgSQL/rdkit
>>> $ make
>>> $ sudo make install
>>> $ make installcheck
>>>
>>> (only the second step has to be performed as root)
>>>
>>> Kind regards,
>>> Axel
>>>
>>> On 04/24/2015 01:33 PM, Tim Dudgeon wrote:
>>>> How to install RDKit cartridge?
>>>> The instructions here show how to use it, but not how to install it.
>>>> http://www.rdkit.org/docs/Cartridge.html
>>>>
>>>> I have RDKit and PostgreSQL installed from the corresponding Debian
>>>> packages, but alas no cartridge :-(
>>>>
>>>> Tim
>>>>
>>>> --
>>>> One dashboard for servers and applications across Physical-Virtual-Cloud
>>>> Widest out-of-the-box monitoring support with 50+ applications
>>>> Performance metrics, stats and reports that give you Actionable Insights
>>>> Deep dive visibility with transaction tracing using APM Insight.
>>>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
Re: [Rdkit-discuss] Rdkit-discuss Digest, Vol 88, Issue 13
Hi Samuel,

The problem is that x is a tuple in your case, not an RDKit molecule object. Check your code to see what the mols list actually contains. I'm guessing that x is a tuple containing a molecule object plus some other info.

Best,
Christos

Christos Kannas
Researcher
Ph.D Student
Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608
http://cy.linkedin.com/in/christoskannas

On 13 February 2015 at 12:14, segie...@sanbi.ac.za wrote:
> Kindly look through this aspect of my script below. I keep getting the
> error:
>
>     AttributeError: 'tuple' object has no attribute 'HasSubstructMatch'
>
>     #!/usr/bin/python
>     from rdkit import Chem
>     from rdkit.Chem import Draw
>     from rdkit.Chem import AllChem
>
>     pains = []  # Contains my PAINS query molecules
>
>     # Loop through data with PAINS query
>     for p in pains:
>         match = [x for x in mols if x.HasSubstructMatch(p)]
>         a = len(match)
>         print(a)
>
> Thank you
> Samuel Ayodele Egieyeh
> South African National Bioinformatics Institute
> University of the Western Cape
>
> Today's Topics:
>
>    1. Re: Chem.Draw darker colors (Soren Wacker)
>    2. Re: Chem.Draw darker colors (David Hall)
>    3. Re: Inchi installation in postgresql database driving me mad
>       (Jan Holst Jensen)
>    4. Re: Docker images (Greg Landrum)
>    5. Re: Inchi installation in postgresql database driving me mad (JP)
>
> --
> Message: 1
> Date: Thu, 12 Feb 2015 17:58:42 +
> From: Soren Wacker swac...@ucalgary.ca
> Subject: Re: [Rdkit-discuss] Chem.Draw darker colors
> To: Greg Landrum greg.land...@gmail.com
> Cc: RDKit Discuss rdkit-discuss@lists.sourceforge.net
> Message-ID: cf4e4cc78f22f44bb773c76c066a3dd109255...@itcimexch03.uc.ucalgary.ca
> Content-Type: text/plain; charset=us-ascii
>
> but how?
>
> Soren
>
> From: Greg Landrum [greg.land...@gmail.com]
> Sent: Wednesday, February 11, 2015 10:27 PM
> To: Soren Wacker
> Cc: RDKit Discuss
> Subject: Re: [Rdkit-discuss] Chem.Draw darker colors
>
>> My other answer about using the DrawingOptions object applies here too.
>> Instead of setting elemDict to be a defaultDict, you would just change
>> the colors for S and F to whatever you prefer.
>>
>> -greg
>>
>> On Thu, Feb 12, 2015 at 12:13 AM, Soren Wacker swac...@ucalgary.ca wrote:
>>> Hi,
>>>
>>> I printed some molecules with the Draw module of rdkit and generated
>>> some useful figures. I noticed that for some elements the contrast to
>>> white is very low. Therefore, I suggest changing the default colors to
>>> the darker versions, e.g. darkyellow instead of yellow for sulfur and
>>> darkcyan instead of cyan for fluorine.
>>>
>>> kind regards
>>> Soren
>
> --
> Message: 2
> Date: Thu, 12 Feb 2015 13:14:35 -0500
> From: David Hall li...@cowsandmilk.net
> Subject: Re: [Rdkit-discuss] Chem.Draw darker colors
> To: Soren Wacker swac...@ucalgary.ca
> Cc: RDKit Discuss rdkit-discuss@lists.sourceforge.net, Greg Landrum greg.land...@gmail.com
> Message-ID: 1a6e481e-8a31-46cc-b719-6fd1a4e2c...@cowsandmilk.net
> Content-Type: text/plain; charset=us-ascii
>
>     In [6]: opt.elemDict
>     Out[6]:
>     {0: (0.5, 0.5, 0.5),
>      1: (0.55, 0.55, 0.55),
>      7: (0, 0, 1),
>      8: (1, 0, 0),
>      9: (0.2, 0.8, 0.8),
>      15: (1, 0.5, 0),
>      16: (0.8, 0.8, 0),
>      17: (0, 0.8, 0),
>      35: (0.5, 0.3, 0.1)}
>
> Presumably, you set fluorine and sulfur by changing the values of 9 and 16.
>
> -David
>
> On Feb 12, 2015, at 12:58 PM, Soren Wacker swac...@ucalgary.ca wrote:
>> but how?
>>
>> Soren
>>
>> From: Greg Landrum [greg.land...@gmail.com]
>> Sent
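The AttributeError in this thread can be reproduced and avoided with a few lines. This is only a minimal sketch: the thread never shows what the mols list really holds, so the (molecule, name) tuple layout, the example compounds, and the benzene stand-in for a PAINS query are all assumptions made for illustration.

```python
from rdkit import Chem

# Hypothetical data layout: each entry is (molecule, name), not a bare Mol.
mols = [
    (Chem.MolFromSmiles("c1ccccc1O"), "phenol"),
    (Chem.MolFromSmiles("CCO"), "ethanol"),
]

# Stand-in query (a benzene ring); real PAINS queries would go here.
pains = [Chem.MolFromSmarts("c1ccccc1")]

for p in pains:
    # Unpack each tuple so HasSubstructMatch is called on the Mol itself.
    match = [name for mol, name in mols if mol.HasSubstructMatch(p)]
    print(len(match), match)
```

If the tuples have a different shape, the unpacking in the list comprehension has to change accordingly; printing type(mols[0]) and mols[0] first is a quick way to find out.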
Re: [Rdkit-discuss] Modified Mol objects with concurrent.futures
Hi Michael,

The problem occurs because child processes return their results using pickle, and an ordinary RDKit molecule object loses information when it is pickled. A solution that I use is to convert the molecule objects to PropertyMol objects, which retain their properties.

Best,
Christos

Christos Kannas
Researcher
Ph.D Student

On 2 February 2015 at 09:03, Reutlinger, Michael michael.reutlin...@roche.com wrote:
> Hi all,
>
> I am currently trying to parallelize part of a script using RDKit and
> concurrent.futures. The function that is executed in parallel returns
> processed molecules as RDKit Mol objects. Without parallelization
> everything is fine and the Mol objects keep all the properties that they
> had before the processing. When using concurrent.futures, the returned
> molecules lose all properties and seem to be created from scratch, maybe
> with unknown side effects.
>
> I am wondering if anyone has experienced the same issue and knows how to
> circumvent it. I attached an IPython notebook with a small script
> demonstrating the issue.
>
> Best,
> Michael
>
> Example code:
>
>     from concurrent import futures
>     from rdkit import Chem
>     from rdkit.Chem import AllChem
>     from rdkit.Chem.Draw import IPythonConsole
>
>     def process(mol):
>         if "Name" not in mol.GetPropNames():
>             print("Processing: Name missing")
>         mol.SetProp("Processed", "True")
>         return mol
>
>     mol = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
>     mol.SetProp("Name", "Alanine")
>
>     with futures.ProcessPoolExecutor(max_workers=1) as pool:
>         future = pool.submit(process, mol)
>         molOut = future.result()
>
>     if "Name" not in molOut.GetPropNames():
>         print("Result: Name missing")
>     if "Processed" not in molOut.GetPropNames():
>         print("Result: Processed missing")
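The PropertyMol suggestion from this thread can be tried without starting any worker processes. The sketch below uses a plain pickle round trip as a stand-in for what ProcessPoolExecutor does when it ships arguments and results between processes; that substitution is an assumption made to keep the example small.

```python
import pickle

from rdkit import Chem
from rdkit.Chem.PropertyMol import PropertyMol

mol = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
mol.SetProp("Name", "Alanine")

# Wrap the molecule so that its properties travel with the pickle.
wrapped = PropertyMol(mol)

# A pickle round trip stands in for sending the molecule to or from a
# worker process.
restored = pickle.loads(pickle.dumps(wrapped))

print(restored.GetProp("Name"))
print(Chem.MolToSmiles(restored))
```

In the script from the original post this would presumably mean submitting PropertyMol(mol) to the pool and returning PropertyMol(mol) from process(), so that properties survive in both directions.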
Re: [Rdkit-discuss] Can't kelulize
Sergio,

You have to use GetSubstructMatches. Look at my sample here:
http://nbviewer.ipython.org/gist/CKannas/5a762b97c52e389d492e

Best,
Christos

Christos Kannas
Researcher
Ph.D Student

On 9 December 2014 at 18:48, Wong, Sergio E. wong...@llnl.gov wrote:
> Dear Ling,
>
> Thank you for pointing out the issue with the lactam ring. I manually
> changed the bond types in the mol2 file and now the error is gone. The
> MolFromMol2File function can sanitize the molecule. However, I still have
> a problem with the output. Again, my code is:
>
>     mol = Chem.MolFromMol2File("%s.lig.%d.mol2" % (name, i), sanitize=True, removeHs=False)
>     aromatic_6 = "[c,n]1[c,n][c,n][c,n][c,n][c,n]1"
>     aromatic_5 = "[c,n]1[c,n][c,n][c,n][c,n]1"
>     pattern6 = Chem.MolFromSmarts(aromatic_6)
>     pattern5 = Chem.MolFromSmarts(aromatic_5)
>     print("Pattern 6")
>     lar = mol.GetSubstructMatch(pattern6)
>     print(lar)
>     print("Pattern 5")
>     lar = mol.GetSubstructMatch(pattern5)
>     print(lar)
>
> The output is:
>
>     Pattern 6
>     (0, 1, 3, 5, 7, 9)
>     Pattern 5
>     (30, 31, 32, 41, 42)
>
> So for some reason, the pattern match for an aromatic six-membered ring
> returns the conjugated lactam ring, but fails to recognize the other two
> (all-carbon) aromatic rings in the system. Interestingly, it correctly
> recognizes the five-membered ring system.
>
> Do you have any ideas on how to address the issue? I am attaching the
> hand-edited mol2 file.
>
> Thanks!
> -Sergio
>
> From: S.L. Chan [slch...@yahoo.com]
> Sent: Monday, December 08, 2014 8:26 PM
> To: Wong, Sergio E.; rdkit-discuss@lists.sourceforge.net
> Subject: Re: [Rdkit-discuss] Can't kelulize
>
>> Dear Sergio,
>>
>> The lactam ring (atoms 1 2 4 6 8 10) is not really aromatic. The bonds
>> 4-6, 6-8, 8-10 should all be single rather than aromatic in the mol2
>> file. The remaining three bonds in the ring should be double or single
>> rather than aromatic.
>>
>> Ling
>>
>> From: Wong, Sergio E. wong...@llnl.gov
>> To: rdkit-discuss@lists.sourceforge.net
>> Sent: Monday, December 8, 2014 3:27 PM
>> Subject: Re: [Rdkit-discuss] Can't kelulize
>>
>>> Dear RDKit users:
>>>
>>> I tried reading a mol2 file using the function MolFromMol2File(). The
>>> goal of my script is to read the molecule and find 5- or 6-membered
>>> aromatic rings. First I got the following error:
>>>
>>>     Can't Kekulize mol
>>>
>>> The code I used is as follows:
>>>
>>>     mol = Chem.MolFromMol2File("%s.lig.%d.mol2" % (name, i), sanitize=True, removeHs=False)
>>>
>>> As a work-around I tried removing the sanitize flag and did the
>>> following:
>>>
>>>     mol = Chem.MolFromMol2File("%s.lig.%d.mol2" % (name, i), sanitize=False, removeHs=False)
>>>     aromatic_6 = "[c,n]1[c,n][c,n][c,n][c,n][c,n]1"
>>>     aromatic_5 = "[c,n]1[c,n][c,n][c,n][c,n]1"
>>>     pattern6 = Chem.MolFromSmarts(aromatic_6)
>>>     pattern5 = Chem.MolFromSmarts(aromatic_5)
>>>     print("Pattern 6")
>>>     lar = mol.GetSubstructMatch(pattern6)
>>>     print(lar)
>>>     print("Pattern 5")
>>>     lar = mol.GetSubstructMatch(pattern5)
>>>     print(lar)
>>>
>>> The output should show 3 aromatic six-membered rings and 1 aromatic
>>> five-membered ring. Instead I get only the first six-membered ring and
>>> no listing of the five-membered ring:
>>>
>>>     Pattern 6
>>>     (0, 1, 3, 5, 7, 9)
>>>     Pattern 5
>>>     ()
>>>
>>> So basically, I cannot get around the kekulize function. I looked at
>>> the mol2 file (attached) and it correctly lists the bond types as
>>> aromatic for all of the rings. Is there a way to use the bond
>>> information from the mol2 file to assign aromaticity?
>>>
>>> Thanks!!
>>> -Sergio
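The difference between GetSubstructMatch and GetSubstructMatches, which is the point of the reply in this thread, shows up on any molecule with more than one matching ring. The mol2 file from the thread is not available, so the sketch below uses biphenyl as a stand-in; only the SMARTS pattern is taken from the original post.

```python
from rdkit import Chem

# Biphenyl: a stand-in molecule with two aromatic six-membered rings.
mol = Chem.MolFromSmiles("c1ccccc1-c1ccccc1")
pattern6 = Chem.MolFromSmarts("[c,n]1[c,n][c,n][c,n][c,n][c,n]1")

first = mol.GetSubstructMatch(pattern6)        # a single match only
all_rings = mol.GetSubstructMatches(pattern6)  # every unique match

print(first)
for ring in all_rings:
    print(ring)
```

GetSubstructMatch stops at the first hit, which is why the original script listed only one six-membered ring; GetSubstructMatches returns a tuple of tuples, one per ring.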
[Rdkit-discuss] UGM Update
Hi RDKiters,

How is the UGM going? Is there a Twitter feed to follow?

Hope you are having a nice and interesting time!

Best,
Christos

Christos Kannas
Researcher
Ph.D Student
Re: [Rdkit-discuss] Installation of RDKit 2014 on Centos 5.10 (Final)
Hi Enrico,

The latest version of RDKit does not require flex or bison, thankfully.

In the attached file I list the commands that I used to build CMake (2.8.12.2), the Boost libraries (1.55) and RDKit (2014_03_1) on a CentOS 5.10 VM. In my case I was using an Anaconda Python environment, so I didn't have to install NumPy, since it is bundled with it.

Best,
Christos

Christos Kannas
Researcher
Ph.D Student

On 24 July 2014 10:05, Enrico Perspicace e.perspic...@mx.uni-saarland.de wrote:
> Dear all,
>
> I would like to install RDKit 2014 on CentOS 5.10 (Final) but I did not
> succeed! I followed the installation instructions on the RDKit website,
> but I got an error when I ran the cmake command line: cmake is not able
> to find the boost_python library.
>
> Before performing the RDKit installation I installed: Python 2.7, atlas,
> lapack, blas, fftw3, numpy 1.8 via Canopy 1.4.1 (and "import numpy" works
> in Python), boost 1.55, flex 2.5.35 and bison 3.0.2.
>
> I followed the procedure described here:
> https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01376.html
>
> Please find attached the related files, which describe my problem.
>
> Thanks a lot for your help.
>
> Best regards,
> Enrico Perspicace

RDKit Installation on CentOS 5.10
=================================

First create some directories...

    mkdir devel
    cd devel/
    mkdir boost
    mkdir cmake
    mkdir RDKit

Install CMake
-------------

    cd cmake
    wget http://www.cmake.org/files/v2.8/cmake-2.8.12.2.tar.gz
    tar -xzvf cmake-2.8.12.2.tar.gz
    cd cmake-2.8.12.2
    ./bootstrap
    sudo make
    sudo make install
    ctest

Install Boost
-------------

    cd ~/devel/boost
    wget -O boost_1_55_0.tar.bz2 http://sourceforge.net/projects/boost/files/boost/1.55.0/boost_1_55_0.tar.bz2/download
    tar -xjvf boost_1_55_0.tar.bz2
    cd boost_1_55_0
    ./bootstrap.sh --with-libraries=python,regex

Note: 64-bit OS

    ./b2 address-model=64 cflags=-fPIC cxxflags=-fPIC

Install RDKit
-------------

    cd ~/devel/RDKit/
    wget http://downloads.sourceforge.net/project/rdkit/rdkit/Q1_2014/RDKit_2014_03_1.tgz
    tar -xzvf RDKit_2014_03_1.tgz
    cd RDKit_2014_03_1
    mkdir build
    cd build/

Note: I was using an Anaconda Python environment

    export PATH=~/anaconda/bin:$PATH
    export RDBASE=~/devel/RDKit/RDKit_2014_03_1
    export LD_LIBRARY_PATH=/home/christos/devel/RDKit/RDKit_2014_03_1/lib:~/devel/boost/boost_1_55_0/stage/lib:$LD_LIBRARY_PATH
    export PYTHONPATH=$RDBASE:$PYTHONPATH

    cmake -D PYTHON_LIBRARY=~/anaconda/lib/python2.7/config/libpython2.7.a \
          -D PYTHON_INCLUDE_DIR=~/anaconda/include/python2.7/ \
          -D PYTHON_EXECUTABLE=~/anaconda/bin/python \
          -D BOOST_ROOT=~/devel/boost/boost_1_55_0 ..
    make
    make install
    ctest
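Once a build like the one above has finished, a quick check from Python confirms that the interpreter picks up the freshly built package (this depends on PYTHONPATH and LD_LIBRARY_PATH being set as in the notes). This is a small sanity check added here for illustration, not part of the original attachment.

```python
from rdkit import Chem, rdBase

# Report which RDKit build this Python interpreter actually picked up.
print("RDKit version:", rdBase.rdkitVersion)

# Round-trip a small molecule to confirm that the compiled wrappers load.
mol = Chem.MolFromSmiles("CCO")
print(Chem.MolToSmiles(mol), mol.GetNumAtoms())
```

An ImportError at the first line usually points to PYTHONPATH, while a complaint about a missing shared library points to LD_LIBRARY_PATH.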
[Rdkit-discuss] Aldehyde functional group does not identify Formaldehyde as an aldehyde
Hi Greg and RDKiters,

I'm doing some functional group (FG) substructure matching on some molecules, and I noticed that the SMARTS pattern used to define the aldehyde FG will identify all aldehydes larger than acetaldehyde but fails to identify formaldehyde as part of the family.

Is this the expected behaviour?

Attached is an IPython notebook showing this behaviour.

Best,
Christos

Christos Kannas
Researcher
Ph.D Student

Attachment: Aldehyde FG.ipynb
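The behaviour described above can be sketched as follows. The notebook is not included in the archive, so two things are assumptions: the "strict" pattern is the aldehyde SMARTS quoted in the later SMARTS/SMARTS thread on this list, and the "relaxed" pattern is a hypothetical variant written for this example, not an RDKit definition.

```python
from rdkit import Chem

# Assumed to be the aldehyde definition in question: exactly one H and
# two explicit neighbours on the carbonyl carbon.
strict = Chem.MolFromSmarts("[CH;D2;!$(C-[!#6;!#1])]=O")
# Hypothetical relaxed variant: three connections in total, one or two H.
relaxed = Chem.MolFromSmarts("[CX3;H1,H2;!$(C-[!#6;!#1])]=O")

examples = [
    ("formaldehyde", "C=O"),
    ("acetaldehyde", "CC=O"),
    ("acetic acid", "CC(=O)O"),
]

results = {}
for name, smiles in examples:
    mol = Chem.MolFromSmiles(smiles)
    results[name] = (mol.HasSubstructMatch(strict), mol.HasSubstructMatch(relaxed))
    print(name, results[name])
```

Formaldehyde fails the strict pattern on both counts: its carbon carries two hydrogens rather than one, and it has only one heavy-atom neighbour, so D2 cannot be satisfied.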
Re: [Rdkit-discuss] autodock vina pdbqt file to mol2
Hi Jan,

AutoDock has a set of tools (MGLTools) that includes converters from pdb to pdbqt and vice versa. If I recall correctly, it can also convert pdbqt to mol2.

See this discussion:
http://autodock.1369657.n2.nabble.com/ADL-pdbqt-to-mol2-td6755769.html

Best,
Christos

Christos Kannas
Researcher
Ph.D Student

On 9 May 2014 20:17, Jan Domanski jan...@gmail.com wrote:
> Hi guys,
>
> I'm really stuck here: I have some output from AutoDock Vina in a rather
> obscure pdbqt format. It's a little bit like pdb, but not quite. I'm
> trying to get back a mol2 file. The AutoDock pdbqt file has only the
> polar hydrogens in it, so part of the trick is to re-add the hydrogens.
>
> Example AutoDock Vina output is attached (it's a conformer of the ACE
> native ligand from DUDE).
>
> First of all, I convert that to a PDB file by doing a simple sed:
>
>     sed -e '/ROOT/d' -e '/BRANCH/d'
>
> Then I reorder the atoms to match those of the original
> crystal_ligand.mol2 (because AutoDock re-orders the atoms, duh).
>
> Finally, I save a mol2 file out (attached), ordered as the original
> crystal_ligand and with polar hydrogens (for each pose of a conformer).
>
> Let's go to RDKit and try to add hydrogens:
>
>     mol = Chem.MolFromMol2File(output, removeHs=False)
>     mol2 = AllChem.AddHs(mol, addCoords=True)
>     print(mol.GetNumAtoms(), mol2.GetNumAtoms())
>     44 44
>
> So, only the implicit hydrogens are present. Calling AddHs doesn't raise
> an error, and it doesn't really change the number of hydrogens...
>
> Now this may not be the best way of doing things: what I care about is
> getting a mol2 from AutoDock Vina that I can compare to the original mol2
> from DUD (same atom order, same number of atoms). Maybe there are other
> ways to achieve this: one idea would be to inject the docked pose
> coordinates into the original mol2 atoms (heavy and polar hydrogens) and
> somehow adjust the non-polar hydrogens.
>
> Thanks,
> - Jan
[Rdkit-discuss] Sanitization Errors
Hi all,

I have a dozen compounds, some of which have a charged atom (see the attached SMILES file). When I parse the file I get sanitization errors on the compounds with the charged atoms, but when I view them with MarvinView 6.2.0 all goes fine.

I'm using an RDKit build from GitHub, version 2014.03.1pre.

In order to see what sanitization error occurs in each case I did the following:

1. Parse all compounds without sanitization

    suppl = Chem.SmilesMolSupplier('data/SurfactantTestCompounds.smi', titleLine=True, sanitize=False)
    molsList = [x for x in suppl if x is not None]
    print(len(molsList))

2. Sanitize the compounds and catch specific errors

    for m in molsList:
        error = Chem.SanitizeMol(m, catchErrors=True)
        if error:
            print(m.GetProp("_Name"), Chem.MolToSmiles(m), error)

2.1 The output is as follows

    NaLAS ()c1ccc(S(=O)(=O)O[Na+])cc1 SANITIZE_PROPERTIES
    NaOLAS ()C1=CC=CC=C1S(=O)(=O)O[Na+] SANITIZE_PROPERTIES
    SLES3EO OCCOCCOCCOS(=O)(=O)O[Na+] SANITIZE_PROPERTIES
    SLES2EO OCCOCCOS(=O)(=O)O[Na+] SANITIZE_PROPERTIES
    SLES1EO OCCOS(=O)(=O)O[Na+] SANITIZE_PROPERTIES
    SDS OS(=O)(=O)O[Na+] SANITIZE_PROPERTIES
    DTAC [N+](C)(C)(C)[Cl-] SANITIZE_PROPERTIES
    Sdoc (=O)O[Na+] SANITIZE_PROPERTIES

3. Visualize compounds

    Draw.MolsToGridImage(molsList, molsPerRow=5, legends=[x.GetProp('_Name') for x in molsList], kekulize=True)

For the visualized output check
http://nbviewer.ipython.org/gist/anonymous/11248962/Sanitization_Errors.ipynb

Is this expected behaviour? Is there something I can do as a fix?

Regards,
Christos

Christos Kannas
Researcher
Ph.D Student

Attachment: SurfactantTestCompounds.smi
Re: [Rdkit-discuss] Sanitization Errors
Hi Patrick,

Thanks.

So the correct representation is that sodium should not have an explicit bond to the oxygen. Instead of

    O=S(c1ccc(C(CCC))cc1)(O-[Na+])=O

I should have

    O=S(c1ccc(C(CCC))cc1)([O-])=O.[Na+]

and similarly for the rest of my compounds.

Regarding the nitrogen, it already has 4 bonds to carbons, so the chloride should be disconnected:

    [N+]([Cl-])(C)(C)C  ->  [N+](C)(C)C.[Cl-]

Regards,
Christos

Christos Kannas
Researcher
Ph.D Student

On 24 April 2014 11:37, Patrick Walters wpwalt...@gmail.com wrote:
> It looks like the problem here is a covalent bond to the counter ion.
>
> Pat
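The manual fix discussed in this thread (cutting the covalent bond to the counter ion and placing the negative charge on the former bonding partner) can also be done programmatically. The helper below is a hypothetical sketch written for this example: disconnect_metal is not an RDKit function, the input SMILES is an illustrative sulfate rather than one of the original compounds, and the code assumes the metal already carries its positive charge, as in the inputs shown above.

```python
from rdkit import Chem


def disconnect_metal(smiles, metal="Na"):
    """Break bonds to `metal` and move a negative charge to each partner atom."""
    # Parse without sanitization so the over-valent counter ion is accepted.
    mol = Chem.RWMol(Chem.MolFromSmiles(smiles, sanitize=False))

    # Collect the bonds first; removing bonds while iterating is unsafe.
    to_break = []
    for bond in mol.GetBonds():
        begin, end = bond.GetBeginAtom(), bond.GetEndAtom()
        if begin.GetSymbol() == metal:
            to_break.append((begin.GetIdx(), end.GetIdx()))
        elif end.GetSymbol() == metal:
            to_break.append((end.GetIdx(), begin.GetIdx()))

    for metal_idx, partner_idx in to_break:
        mol.RemoveBond(metal_idx, partner_idx)
        partner = mol.GetAtomWithIdx(partner_idx)
        partner.SetFormalCharge(partner.GetFormalCharge() - 1)

    Chem.SanitizeMol(mol)  # raises if the structure is still invalid
    return Chem.MolToSmiles(mol)


# Illustrative sulfate with a covalently drawn sodium counter ion.
fixed = disconnect_metal("CCOS(=O)(=O)O[Na+]")
print(fixed)
```

The same idea would apply to the chloride case by passing metal="Cl", although there the charge bookkeeping differs (the partner nitrogen is already positively charged), so the helper would need adjusting.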
Re: [Rdkit-discuss] Opposite of GetSubstructureMatches()
Hi JP,

Well, I've noticed the same thing in my tests. The only reason I can think of is that it adds either the preceding or the following atom to the tuple of atom indices.

I hope Greg or someone else can shed some light on it.

Best,
Christos

Christos Kannas
Researcher
Ph.D Student

On 17 April 2014 09:32, JP jeanpaul.ebe...@inhibox.com wrote:
> On 16 April 2014 19:13, Christos Kannas chriskan...@gmail.com wrote:
>> Chem.PathToSubmol(mol, path)
>
> Hi there Christos,
>
> Many thanks for your reply (and the idea of using nbviewer).
>
> There is still something strange happening which I cannot figure out: my
> atom index is a tuple with six elements, and in the resulting submol I
> get seven atoms. Also, the ring is opened into a chain (so some of the
> properties are changing).
>
> A simple example here:
> http://nbviewer.ipython.org/gist/anonymous/10964449
>
> Any ideas?
>
> -
> Jean-Paul Ebejer
> Early Stage Researcher
Re: [Rdkit-discuss] Opposite of GetSubstructureMatches()
Hi Greg,

That's why I had that strange bug with my hydrophobic/hydrophilic fragmentation: I was using PathToSubmol with a list of atom indices too.

The solution I've actually found is, as Greg said, to iterate through the atoms of the molecule and find the bonds that connect my query pattern (the hydrophobic atoms) to atoms that are not part of it (i.e. the hydrophilic ones). Then I used Chem.FragmentOnBonds to break the molecule on that list of bonds.

Here is the IPython Notebook that shows what I'm doing:
http://nbviewer.ipython.org/gist/CKannas/10975497

I do not break rings if some ring atoms are mapped as hydrophobic, and I also keep terminal carbons (CH3) connected to the adjacent hydrophobic group, if any. I hope these assumptions are chemically correct.

Best,
Christos

Christos Kannas
Researcher
Ph.D Student

On 17 April 2014 10:18, Greg Landrum greg.land...@gmail.com wrote:
> On Thu, Apr 17, 2014 at 10:32 AM, JP jeanpaul.ebe...@inhibox.com wrote:
>> On 16 April 2014 19:13, Christos Kannas chriskan...@gmail.com wrote:
>>> Chem.PathToSubmol(mol, path)
>>
>> Hi there Christos,
>>
>> Many thanks for your reply (and the idea of using nbviewer).
>>
>> There is still something strange happening which I cannot figure out:
>> my atom index is a tuple with six elements, and in the resulting submol
>> I get seven atoms. Also, the ring is opened into a chain (so some of the
>> properties are changing).
>>
>> A simple example here:
>> http://nbviewer.ipython.org/gist/anonymous/10964449
>>
>> Any ideas?
>
> PathToSubmol is underdocumented. It's expecting a list/tuple of bond
> indices, not atom indices.
>
> What you need to do is loop over the atoms in your match and find all the
> bonds that they are involved in that go to other atoms in the match. Pass
> that tuple/list to PathToSubmol and you should get what you want.
>
> If you're OK having dummies marking attachment points (which I suspect
> you aren't), you could use Chem.ReplaceSidechains(), but otherwise I
> don't think there's an easier way to do this.
>
> -greg
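The recipe Greg describes, turning a set of atom indices into the bond indices that PathToSubmol expects, can be sketched as below. The helper name submol_from_atoms and the toluene example are assumptions made for illustration; only Chem.PathToSubmol and the bond accessors come from RDKit.

```python
from rdkit import Chem


def submol_from_atoms(mol, atom_ids):
    """Extract the substructure spanned by atom_ids using PathToSubmol."""
    wanted = set(atom_ids)
    # PathToSubmol takes bond indices, so keep every bond whose two ends
    # are both inside the matched atom set.
    bond_ids = [
        bond.GetIdx()
        for bond in mol.GetBonds()
        if bond.GetBeginAtomIdx() in wanted and bond.GetEndAtomIdx() in wanted
    ]
    return Chem.PathToSubmol(mol, bond_ids)


mol = Chem.MolFromSmiles("Cc1ccccc1")  # toluene, as a stand-in
ring_atoms = mol.GetSubstructMatch(Chem.MolFromSmarts("c1ccccc1"))
sub = submol_from_atoms(mol, ring_atoms)
print(sub.GetNumAtoms(), sub.GetNumBonds())
```

Because only bonds inside the match are passed on, the ring stays closed and no extra atom is pulled in, which addresses both symptoms JP reported. A match consisting of a single atom has no internal bonds and would need separate handling.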
Re: [Rdkit-discuss] Opposite of GetSubstructureMatches()
Hi JP,

I think

    smol = Chem.PathToSubmol(mol, path)

where path is a list of indices, will do what you want.

Best,
Christos

Christos Kannas
Researcher
Ph.D Student

On 16 April 2014 16:44, JP jeanpaul.ebe...@inhibox.com wrote:
> Hi there RDKitters,
>
> This is probably an easy one, but I cannot find anything in the docs or
> the mailing list.
>
> I have a tuple of atom IDs (e.g. 21,22,24,26,27) and a mol, and I would
> like to extract the substructure (molecule) which matches those indices.
> Note that in my case this will be a connected subgraph of the molecule
> (no fragmentation).
>
> This is pretty much the opposite of the GetSubstruct family of methods,
> which give Mol -> Indices. I want Indices -> Mol.
>
> Is there a convenience method to do this?
>
> -
> Jean-Paul Ebejer
> Early Stage Researcher
[Rdkit-discuss] Fragmentation Bug?
Hi Greg,

I'm writing a script that identifies the hydrophobic parts of compounds using the pharmacophore features and then fragments the compounds around the hydrophobic regions.

While I was doing this I noticed that some compounds were raising ValueError exceptions during the fragmentation process.

I've created a sample IPython Notebook that shows this:
http://nbviewer.ipython.org/gist/CKannas/10255625/Fragmentation%20Bug%20Maybe.ipynb

Check cell 6; you will notice the ValueError printout.

For example, for the first compound:

    COc1cc(C=CC(=O)OC2OC[C@@H](O)[C@H](O)[C@H]2O)ccc1O

the hydrophobe region is:

    CC=Cc

but when the compound is broken up into side chains and core (the hydrophobe region), the SMILES are:

    SideChains: [1*]:cc(OC)c(O)cc:[4*].[2*]=O.[3*]OC1OC[C@@H](O)[C@H](O)[C@H]1O
    Hydrophobe: [1*]c([2*])C=CC(=[3*])[4*]

which I think has some errors in the attachment points of the hydrophobe region core.

Is this the normal behaviour, or a bug?

Also, if I merge the two families of Hydrophobe features, then I do not get the error.

Regards,
Christos

Christos Kannas
Researcher
Ph.D Student
Re: [Rdkit-discuss] SMARTS/SMARTS and SMILES/SMARTS substructure matching
Hi Greg,

Thanks a lot for the explanation. It makes things clearer now.

The reason I'm doing a SMARTS-SMARTS match is that I would like to match functional groups with the reactants in reactions.

Regards,
Christos

Christos Kannas
Researcher, Ph.D Student

On 5 March 2014 04:44, Greg Landrum greg.land...@gmail.com wrote:

> Hi Christos,
>
> On Tue, Mar 4, 2014 at 3:46 PM, Christos Kannas chriskan...@gmail.com wrote:
>
>> Hi all,
>>
>> Why does the following happen?
>>
>> In [1]: from rdkit import Chem
>> In [2]: from rdkit.Chem import AllChem
>> In [3]: from rdkit.Chem import Draw
>> In [4]: patt = Chem.MolFromSmarts('[CH;D2;!$(C-[!#6;!#1])]=O')
>> In [5]: z2 = Chem.MolFromSmarts('[*]-C-C([H])(=O)', 1)
>> In [6]: print(Chem.MolToSmiles(z2))
>> [*]CC=O
>> In [7]: print(Chem.MolToSmarts(z2))
>> *-C-[C!H0]=O
>> In [9]: z2.HasSubstructMatch(patt)
>> Out[9]: False
>> In [10]: z3 = Chem.MolFromSmiles(Chem.MolToSmiles(z2))
>> In [11]: print(Chem.MolToSmiles(z3))
>> [*]CC=O
>> In [12]: print(Chem.MolToSmarts(z3))
>> [*]-[#6]-[#6]=[#8]
>> In [13]: z3.HasSubstructMatch(patt)
>> Out[13]: True
>>
>> Shouldn't z2 and z3 carry the same information?
>
> The way SMARTS/SMARTS matches are handled is different from the way SMARTS/SMILES matches work.
>
> The short answer is that when doing a SMARTS/SMARTS match, the RDKit compares the queries to each other; when doing a SMARTS/SMILES match, on the other hand, it checks to see if the atoms in the SMILES molecule match the queries in the SMARTS molecule.
>
> A bit longer answer: Molecules built using MolFromSmiles contain Atoms; molecules built using MolFromSmarts contain QueryAtoms. Both Atoms and QueryAtoms have a Match() method that takes another Atom or QueryAtom as an argument and returns whether or not the two match. The substructure matching code makes heavy use of this Match() method.
>
> QueryAtom.Match(Atom) checks to see if the Atom satisfies the query.
> QueryAtom.Match(QueryAtom) checks to see if the queries on the atoms are the same. This uses a crude approach that is easy to fool, but I assume that a SMARTS-SMARTS match is not a frequent thing someone wants to do. Query-query matching is also not a particularly easy problem to solve in a general way.
>
> -greg
Re: [Rdkit-discuss] sanitization removes Hs - is this expected?
Hi Michal,

It doesn't actually remove them; to be more precise, it hides them. You can explicitly define a hydrogen as [H], but if you omit it, it still exists.

You can use Chem.AddHs(...) to add the hydrogens to a molecule and Chem.RemoveHs(...) to hide them.

Best,
Christos

Christos Kannas
Researcher, Ph.D Student

On 24 February 2014 15:48, Michal Krompiec michal.kromp...@gmail.com wrote:

> Hello,
>
> I have just noticed this:
>
> >>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1c([H])sc([H])c1[H]'))
> 'c1ccsc1'
> >>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1c([H])sc([H])c1[H]',sanitize=False))
> '[H]c1sc([H])c([H])c1[H]'
> >>> Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles('[H]c1c([H])sc([H])c1[H]',sanitize=False)))
> 'c1ccsc1'
> >>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1cscc1[H]'))
> 'c1ccsc1'
> >>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1cscc1[H]',sanitize=False))
> '[H]c1cscc1[H]'
>
> Is it the expected behaviour? Why does sanitization remove hydrogens? Is it controlled by any of the SanitizeFlags?
>
> Best wishes,
> Michal
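The behaviour discussed above can be checked by counting atoms in the molecular graph. This is a minimal sketch using Michal's thiophene SMILES; the SmilesParserParams route is an addition that does not appear in the thread, offered as one way to keep sanitization while leaving explicit hydrogens in place.

```python
from rdkit import Chem

smi = "[H]c1c([H])sc([H])c1[H]"  # thiophene written with explicit hydrogens

# Default parsing: the [H] atoms are folded into implicit H counts.
mol = Chem.MolFromSmiles(smi)
print(mol.GetNumAtoms())

# sanitize=False also skips the hydrogen-removal step.
raw = Chem.MolFromSmiles(smi, sanitize=False)
print(raw.GetNumAtoms())

# Sanitize, but keep the hydrogens as real atoms in the graph.
params = Chem.SmilesParserParams()
params.removeHs = False
kept = Chem.MolFromSmiles(smi, params)
print(kept.GetNumAtoms())

# Round trip suggested in the reply: add hydrogens, then hide them again.
with_hs = Chem.AddHs(mol)
without_hs = Chem.RemoveHs(with_hs)
print(with_hs.GetNumAtoms(), without_hs.GetNumAtoms())
```

In this sketch the hydrogens are never lost as chemical information: they move between explicit atoms in the graph and implicit hydrogen counts on the heavy atoms.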
[Rdkit-discuss] SMARTS Substructure matching
Hi all,

In my current project I'm working on reaction-based multiobjective de novo design, and I have a set of reactions that I have converted into SMIRKS and reaction SMARTS.

The problem is this: when a reactant pattern in SMARTS (as required by SMIRKS) contains explicit mapped hydrogens that play a role in the reaction, a substructure search against a compound that does contain the substructure in question finds no match. When I change the pattern so that it has no explicit mapped hydrogens, the substructure search succeeds.

To help you understand, I've created this small IPython Notebook:
http://nbviewer.ipython.org/gist/CKannas/9089271

Can you give me the reasons why this happens?

Best,
Christos

--
Christos Kannas
Researcher, Ph.D Student
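One common cause of this symptom, offered here as an assumption since the notebook's patterns are not reproduced in the message: a hydrogen written as its own atom in a SMARTS pattern can only match a hydrogen that is present as an atom in the target, while molecules parsed from SMILES normally carry their hydrogens implicitly. The sketch below uses acetaldehyde and two illustrative aldehyde patterns that are not taken from the notebook.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CC=O")  # acetaldehyde, hydrogens implicit

# Hydrogen as a separate, mapped atom in the pattern.
explicit_h = Chem.MolFromSmarts("[C:1](=[O:2])[#1:3]")
# Hydrogen expressed as a count on the carbon instead.
h_count = Chem.MolFromSmarts("[CH1:1]=[O:2]")

print(mol.HasSubstructMatch(explicit_h))              # no H atoms in the graph
print(Chem.AddHs(mol).HasSubstructMatch(explicit_h))  # H atoms now present
print(mol.HasSubstructMatch(h_count))                 # H count needs no H atoms
```

If the hydrogen has to stay mapped for the reaction, calling Chem.AddHs on the reactant before matching is one way to make the explicit-hydrogen pattern applicable.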
Re: [Rdkit-discuss] Best Practice: Git model for code submissions to the RDkit
Hi JP,

It is best to use branches for committing changes. Once the changes are in a state to share, merge the branch with master and open a pull request. Also try to use short, meaningful names for branches.

Regards,
Christos

On Thu, Oct 24, 2013 at 1:06 PM, JP jeanpaul.ebe...@inhibox.com wrote:

> Hi RDkit-Devs,
>
> ** Disclaimer: Just used git as svn till now **
>
> What are the best practices for submitting code changes to the RDKit codebase via git?
>
> Right now I do the following:
>
> 0. Fork the rdkit repository (upstream)
> 1. Make my changes on the master
> 2. Send a pull request to the original RDKit repo
>
> I have local commits I do not want to send in the pull request (e.g. a .gitignore file which ignores all build files). I also have some erroneous commits in my forked repo which I would not like to send over.
>
> The solution probably lies in using branches, but what is the best practice to do this? Should all commits which I want to send be in the branch, and the commits I want to keep private be on the master (or on another branch)? How do you do it?
>
> Perhaps I am thinking too much in terms of SVN.
>
> Cheers
> JP
>
> [small note: By mistake, I sent this email from another address to the mailing list and I got the "Waiting for moderator approval" message ... just pointing this out, perhaps there are other messages stuck in that queue]

--
Christos Kannas
Researcher, Ph.D Student
e-Health Laboratory http://www.medinfo.cs.ucy.ac.cy/
[Rdkit-discuss] [RDKit-Discuss]: Aromatic Heavy Atoms
Dear RDKiters,

I'm creating a descriptor for estimating water solubility (clogSw) based on the following article by Delaney (doi:10.1021/ci034243x):

J. S. Delaney, “ESOL: Estimating Aqueous Solubility Directly from Molecular Structure,” *Journal of Chemical Information and Modeling*, vol. 44, no. 3, pp. 1000–1005, May 2004.

In this paper he proposes an equation to estimate the water solubility of molecules based on physico-chemical descriptors. One of the descriptors used is Aromatic Proportion, that is, the proportion of heavy atoms of the molecule that are in aromatic rings.

To find the aromatic heavy atoms I use GetSubstructMatches(...) with the query SMARTS '[a]'. Is that the correct way to find all the aromatic atoms of a molecule? If not, what is the correct SMARTS to use?

@Greg: When I complete this, can we look into adding it as a new descriptor, clogSw (like clogP), within the RDKit distribution?

Kind Regards,
Christos

--
Christos Kannas
Researcher, Ph.D Student
e-Health Laboratory http://www.medinfo.cs.ucy.ac.cy/
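The Aromatic Proportion described above can be sketched in a few lines. This is a minimal illustration, assuming the definition given in the message (aromatic heavy atoms divided by all heavy atoms); the function name aromatic_proportion is hypothetical, and the atom-flag count is shown only as a cross-check of the '[a]' query, not as the method used in the paper.

```python
from rdkit import Chem

AROMATIC_ATOM = Chem.MolFromSmarts("[a]")


def aromatic_proportion(mol):
    """Fraction of heavy atoms flagged as aromatic (hypothetical helper)."""
    n_heavy = mol.GetNumHeavyAtoms()
    if n_heavy == 0:
        return 0.0
    # Each match of the single-atom query '[a]' is one aromatic atom.
    n_aromatic = len(mol.GetSubstructMatches(AROMATIC_ATOM))
    return n_aromatic / n_heavy


mol = Chem.MolFromSmiles("Cc1ccccc1")  # toluene: 6 aromatic of 7 heavy atoms
print(aromatic_proportion(mol))

# Cross-check: count the per-atom aromaticity flags directly.
n_flagged = sum(1 for atom in mol.GetAtoms() if atom.GetIsAromatic())
print(n_flagged)
```

Both routes rely on the aromaticity perceived during sanitization, so the value depends on the aromaticity model RDKit applies to the input.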
Re: [Rdkit-discuss] Domain of applicability
Hi Paul,

We are using RDKit along with R instead of scikit-learn, though the general idea is the same.

Using a good dataset of compounds with a known biological property, you can train a model with an algorithm such as Random Forests. This model can later be used on a new dataset to predict that biological property.

What you are actually doing is finding correlations between the chemical properties and a known biological property of the compounds. This can be applied to any problem where a biological property needs to be predicted.

Hope this helps a bit.

Kind regards,
Christos

--
Christos Kannas

On Mar 19, 2013 3:42 PM, paul.czodrow...@merckgroup.com wrote:

> Dear RDKitters,
>
> has anyone worked with RDKit (data processing and descriptor calculation) and scikit-learn (training Random Forests) and could share some experiences with setting up/defining a domain of applicability?
>
> Cheers and thanks so far,
> Paul
>
> P.S.: Just resent this mail, since the last mail contained typos which might make future searches for keywords in the mailing list quite challenging... ;)
Re: [Rdkit-discuss] how to install rdkit systemwide on Ubuntu 12.04
Hi Michal,

I don't think it will work, at least with Python: Python needs the RDKit directory in PYTHONPATH, and RDKit/lib has to be in LD_LIBRARY_PATH so that Python can find the library files created when building RDKit.

Regards,
Christos Kannas

On Nov 8, 2012 7:21 PM, Michał Nowotka mmm...@gmail.com wrote:

> Hello,
>
> I would like to install rdkit in such a way that I don't have to append anything to LD_LIBRARY_PATH.
>
> If I do:
>
>     export RDBASE=/usr/lib
>
> and run
>
>     cmake
>     make
>     make install
>
> would that do the trick?
>
> Kind regards,
> Michal Nowotka