[Rdkit-discuss] RDKit training workshop: link not working

2018-06-04 Thread Paul Czodrowski
eers, Paul ____ Paul Czodrowski, PhD Global Research & Development | Discovery Technologies Merck Merck KGaA | Frankfurter Str. 250 | Postcode: A019/001 | 64293 Darmstadt | Germany Phone: +49 6151 72 3218 E-mail: paul.czodrow...@merckgroup.com<ma

Re: [Rdkit-discuss] RDKit-MolVS intergration: Google Summer of Code Project

2018-04-25 Thread Paul Czodrowski
Susan, great news, looking forward to this project, enjoy GSoC! Paul From: Susan Leung [mailto:susan.le...@st-hildas.ox.ac.uk] Sent: Wednesday, 25 April 2018 23:35 To: rdkit-discuss@lists.sourceforge.net Subject: [Rdkit-discuss] RDKit-MolVS intergration: Google Summer of Code Project Hi all,

Re: [Rdkit-discuss] [Rdkit-devel] [Announcement] 7th RDKit UGM in Cambridge UK

2018-04-10 Thread Paul Czodrowski
Great idea about the optional training on the day before. However, I could not see any link (on the Eventbrite page) regarding this event. There was an option for the dinner & reception, but no training. Or is the “main UGM registration” a different place than the Eventbrite page? Paul From:

Re: [Rdkit-discuss] News on 3D molecule visualization in RDKit (project: RDKit - 3Dmol.js integration)

2017-08-31 Thread Paul Czodrowski
lecule visualization in RDKit (project: RDKit - 3Dmol.js integration) Dear all, I am about to share news on 3D molecule visualization in RDKit. This summer I have worked as a Google Summer of Code (GSoC) participant under the supervision of Paul Czodrowski and Greg Landrum. The code was reviewed s

Re: [Rdkit-discuss] rdkit

2017-07-05 Thread Paul Czodrowski
zlib 1.2.8 vc9_3 [vc9] Best regards, Paul From: Greg Landrum [mailto:greg.land...@gmail.com] Sent: Wednesday, 5 July 2017 06:33 To: Paul Czodrowski <paul.czodrow...@merckgroup.com> Cc: rdkit <rdkit-discuss@lists.sourceforge.net>

Re: [Rdkit-discuss] rdkit

2017-07-04 Thread Paul Czodrowski
g Landrum [mailto:greg.land...@gmail.com] Sent: Wednesday, 5 July 2017 05:44 To: Paul Czodrowski <paul.czodrow...@merckgroup.com> Cc: rdkit <rdkit-discuss@lists.sourceforge.net> Subject: Re: [Rdkit-discuss] rdkit Hi Paul, The small bit of sample code below and the error message make

[Rdkit-discuss] rdkit

2017-07-04 Thread Paul Czodrowski
S than Windows") Cheers, Paul ________ Paul Czodrowski, PhD Global Research & Development | Discovery Technologies Merck Merck KGaA | Frankfurter Str. 250 | Postcode: A019/001 | 64293 Darmstadt | Germany Phone: +49 6151 72 3218 E-mail: paul.cz

Re: [Rdkit-discuss] ipywidgets & py3Dmol

2017-06-20 Thread Paul Czodrowski
but no structure… This is presumably off-topic for this mailing list. However, if anyone can point me in the right direction, that would be highly appreciated. Cheers, Paul Paul Czodrowski, PhD Global Research & Development | Disco

Re: [Rdkit-discuss] ipywidgets & py3Dmol

2017-06-20 Thread Paul Czodrowski
ched screenshot). This is the configuration I used (in Anaconda): python 3.5.3 rdkit 2017.03.2 notebook 5.0.0 ipywidgets 6.0.0 py3Dmol 0.6.3 Kind regards, Axel On 20.06.2017 07:48, Paul Czodrowski wrote:

[Rdkit-discuss] ipywidgets & py3Dmol

2017-06-20 Thread Paul Czodrowski
Dear RDKitters, When trying to re-run Greg's wonderful blog entry about the py3Dmol integration (https://rdkit.blogspot.de/2016/07/using-ipywidgets-and-py3dmol-to-browse.html ), I'm getting a different behavior (check the attachment with a screenshot and the jupyter notebook). Any help would

[Rdkit-discuss] RDKit-Py3DMol integration

2017-05-09 Thread Paul Czodrowski
Dear RDKitters, This is to inform this exciting community about Malitha Kabir, who will be working as a Google Summer of Code (GSoC) student over the next couple of weeks on the RDKit-Py3DMol integration. Let's give Malitha a warm welcome (and comprehensive replies during his GSoC project)! On

[Rdkit-discuss] OCEAN: Our Target Prediction Paper (including Source Code)

2016-09-26 Thread Paul Czodrowski
Dear RDKitters, Our target prediction method - fully based on RDKit - is now online: OCEAN: Optimized Cross rEActivity estimatioN http://pubs.acs.org/doi/abs/10.1021/acs.jcim.6b00067 The source code can be found here: https://github.com/rdkit/OCEAN We will give a talk as well as a hands-on

[Rdkit-discuss] PostDoc position available

2016-05-10 Thread Paul Czodrowski
Dear RDKitters, this is slightly off topic, but nonetheless of interest to some of you. We are looking for a PostDoc heavily using RDKit for our PredictiveModels machinery at Merck in Darmstadt. Please find below the announcement and contact me via email if you are interested. Cheers, Paul "

Re: [Rdkit-discuss] Pandas dataframe manipulation

2016-03-11 Thread Paul Czodrowski
axis=0) Paul From: Maciek Wójcikowski [mailto:mac...@wojcikowski.pl] Sent: Friday, 11 March 2016 12:29 To: Paul Czodrowski <paul.czodrow...@merckgroup.com> Cc: rdkit <rdkit-discuss@lists.sourceforge.net> Subject: Re: [Rdkit-discuss] Pandas dataframe manipulation Hi Paul

[Rdkit-discuss] Pandas dataframe manipulation

2016-03-11 Thread Paul Czodrowski
Dear RDKitter & Pandas-Dataframes heavy users, please find below a question concerning the conversion of pandas dataframes: df = pd.DataFrame({"item": ["a", "b", "c", "d", "e"], "row1": [1,2,3,">2",5], "row2":[0.1,0.2,0.3,0.4,0.5],"row3":["ab","cd","ed","gh","ij"]}) df_new =

[Rdkit-discuss] Weird iPython PandasTools notebook lookfeel

2014-09-26 Thread Paul . Czodrowski
Dear RDKitters, on my new Windows laptop, I tried one of the PandasTools examples, which should give this output: sdfFile = os.path.join(RDConfig.RDDataDir,'NCI/first_200.props.sdf') frame = PandasTools.LoadSDF(sdfFile,smilesName='SMILES',molColName='Molecule',includeFingerprints=True)

Re: [Rdkit-discuss] Weird iPython PandasTools notebook lookfeel

2014-09-26 Thread Paul . Czodrowski
Dear Riccardo, oops, that did the job! Thanks! But I have another: import os from rdkit import RDConfig sdfFile = os.path.join(RDConfig.RDDataDir,'NCI/first_200.props.sdf') frame = PandasTools.LoadSDF(sdfFile,smilesName='SMILES',molColName='Molecule',includeFingerprints=True)

[Rdkit-discuss] PandasTools on Windows iPython notebook

2014-09-25 Thread Paul . Czodrowski
Dear RDKitter, on my new Windows laptop, I run into this issue: import pandas as pd import rdkit.Chem as Chem from rdkit.Chem import PandasTools = ValueError Traceback (most recent call last) <ipython-input-6-52de9e808c94> in <module>() 1 import pandas as pd

Re: [Rdkit-discuss] PandasTools on Windows iPython notebook

2014-09-25 Thread Paul . Czodrowski
height has been deprecated. but, it works! thanks, paul!!! paul From: Paul Emsley pems...@mrc-lmb.cam.ac.uk To: rdkit-discuss@lists.sourceforge.net, Date: 25.09.2014 19:32 Subject:Re: [Rdkit-discuss] PandasTools on Windows iPython notebook On 25/09/14 18:06,

Re: [Rdkit-discuss] question on MatchedPairs

2014-08-28 Thread Paul . Czodrowski
If what you are looking for is information about what type of atom [*:1] is in the original molecule, I think you can probably figure it out based on the additional info that's present in the output from Jameed's code. For example, here's one of the output lines from an indexing.py run

[Rdkit-discuss] question on MatchedPairs

2014-08-27 Thread Paul . Czodrowski
Dear RDKitters, I'm using Jameed's wonderful code for a matched pair analysis. Given such a transformation string [*:1]C[*:1][H] = How do I check if [*:1] is an aromatic or an aliphatic atom? I fear that this can only be done by going back into the original data/output, or am I wrong ?

Re: [Rdkit-discuss] Chem.PandasTools

2014-05-09 Thread Paul . Czodrowski
Dear Grégori, when storing the image into a new data frame: MMP_reaction = Chem.rdChemReactions.ReactionFromSmarts('[*:1][H]>>[*:1]C') newnew_df = pd.DataFrame(columns=['fig'],index=[1] ) newnew_df['fig'].ix[1] = Draw.ReactionToImage(MMP_reaction) apparently, the image can be stored in a data

[Rdkit-discuss] Chem.PandasTools

2014-05-08 Thread Paul . Czodrowski
Dear RDKitters, I started to play around with the great Chem.PandasTool contribution provided by Nicholas and Samo. Given such a data frame: Transformation npairs 1 [*:1][H][*:1]C5 how do I depict the molecular transformation in the dataframe? I guess that I somehow

Re: [Rdkit-discuss] Chem.PandasTools

2014-05-08 Thread Paul . Czodrowski
Dear Gregori & Samo, thanks for your hints. I just tried running Draw.ReactionToImage('[*:1][H]>>[*:1]C') => AttributeError: 'str' object has no attribute 'GetNumReactantTemplates' BTW, how would I finally add a picture to a Pandas data frame? Cheers, Paul Hi Paul, The Draw modules
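The AttributeError in this snippet is what one would expect when a plain string is handed to the drawing code instead of a reaction object. A minimal sketch of the parsing step follows, assuming the transformation originally contained a `>>` separator that the archive stripped; the variable names are illustrative only.

```python
from rdkit.Chem import AllChem

# Assumption: the archived transformation lost its ">>" separator.
transformation = "[*:1][H]>>[*:1]C"

# Parse the string into a reaction object first.
rxn = AllChem.ReactionFromSmarts(transformation)

n_reactants = rxn.GetNumReactantTemplates()
n_products = rxn.GetNumProductTemplates()
print(n_reactants, n_products)

# The reaction object (not the string) is what the drawing code expects, e.g.:
# from rdkit.Chem import Draw
# image = Draw.ReactionToImage(rxn)
```

The drawing call is left commented out because it needs an imaging backend; the parsing step alone already shows why the string form fails.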

Re: [Rdkit-discuss] Possible rotatable bonds replacement

2014-01-30 Thread Paul . Czodrowski
I could add the new descriptor as Toby provided it. People are then free to pick between NumRotatableBonds() and NumStrictRotatableBonds (). This has the advantage of maintaining strict backwards compatibility, but I could imagine it being confusing/irritating to people using the code to

[Rdkit-discuss] aggdraw freetype2

2014-01-19 Thread Paul . Czodrowski
Dear RDKitters, on my Win7/32 bit /RDKit system, I just installed aggdraw, but apparently without great success: --- IOError Traceback (most recent call last)

[Rdkit-discuss] docker.io - container for fully fledged rdkit installation on linux?

2013-09-22 Thread Paul . Czodrowski
Dear RDKitters, a few days ago, I came across a technology named docker (http://www.docker.io/): it provides a possibility to ship a complete Linux distribution in one container. It attracted my interest, since I remembered one situation from the Gordon Research Conference on CADD, where I

Re: [Rdkit-discuss] pandas / sd-tags

2013-07-02 Thread Paul . Czodrowski
Dear Niko, I was exactly looking for this functionality, great work! A few follow-up questions: * frame.set_index('_Name') did not work, but there is a name set in the SD file. * Is there a way to load in only a specified list of SD tags? (I didn't find a names parameter for LoadSDF) *

[Rdkit-discuss] PyMOL-Integration

2013-06-30 Thread Paul . Czodrowski
Dear RDKitters, when testing the PyMOL integration (given the iPython notebook from the tutorial): from rdkit import Chem from rdkit.Chem import AllChem from rdkit import RDConfig import os from rdkit.Chem.Draw import IPythonConsole from rdkit.Chem import Draw cdk2mols = [x for x in

[Rdkit-discuss] MMP analysis - active vs. inactive compounds

2013-05-03 Thread Paul . Czodrowski
Dear RDKitters, has anyone applied Jameed's great code to the following scenario: - Perform a MMP analysis with respect to a particular property (e.g. activity) Given the current code, I do not see any chance to consider any property besides the compound ID. It is also not possible to provide

Re: [Rdkit-discuss] any experience boost::bad_any_cast

2013-04-24 Thread Paul . Czodrowski
Dear Greg, thanks a lot for the build! Unfortunately, the same error appears. Strangely enough, the SD export works fine with our Linux installation. Cheers & Thanks, Paul https://rdkit.googlecode.com/files/RDKit_2013_03_1beta2.win32.py27.zip On Tue, Apr 23, 2013 at 10:44 AM, Greg Landrum

Re: [Rdkit-discuss] any experience boost::bad_any_cast

2013-04-24 Thread Paul . Czodrowski
import sys,gzip,os from rdkit import Chem from rdkit.Chem import Descriptors from rdkit.ML.Descriptors import MoleculeDescriptors inF = sys.argv[1] if inF.endswith('.sdf.gz') or inF.endswith('.sd.gz'): cpds = [x for x in Chem.ForwardSDMolSupplier(gzip.open(sys.argv[1])) if x is not None] else:

Re: [Rdkit-discuss] any experience boost::bad_any_cast

2013-04-22 Thread Paul . Czodrowski
Dear Greg, Hi Paul, On Sun, Apr 21, 2013 at 2:01 PM, paul.czodrow...@merckgroup.com wrote: Dear RDKitters, has anyone run into this error message when writing an SDF: RuntimeError: boost::bad_any_cast: failed conversion using boost::any_cast One of the bug fixes in the

[Rdkit-discuss] any experience boost::bad_any_cast

2013-04-21 Thread Paul . Czodrowski
Dear RDKitters, has anyone run into this error message when writing an SDF: RuntimeError: boost::bad_any_cast: failed conversion using boost::any_cast Cheers & Thanks, Paul This message and any attachment are confidential and may be privileged or otherwise protected from disclosure. If you

[Rdkit-discuss] Domain of appicability

2013-03-19 Thread Paul . Czodrowski
Dear RDKitters, has anyone worked with RDKit (data processing & descriptor calculation) and scikit-learn (train Random Forests) and could share some experiences with setting up a domain of application? Cheers & Thanks so far, Paul

[Rdkit-discuss] Domain of applicability

2013-03-19 Thread Paul . Czodrowski
Dear RDKitters, has anyone worked with RDKit (data processing & descriptor calculation) and scikit-learn (train Random Forests) and could share some experiences with setting up/defining a domain of applicability? Cheers & Thanks so far, Paul P.S.: Just resent this mail, since the last mail contained

Re: [Rdkit-discuss] windows binary installation

2013-02-22 Thread Paul . Czodrowski
Dear RDKitters, thanks to all of you for your quick help! Sorry for my typo in the setting of the environment variables - if I meet any of you in the near future, I will offer you a drink, promised! Cheers & Thanks again! Paul Subject: Re: [Rdkit-discuss] windows binary installation

[Rdkit-discuss] RDKit Windows Installation - missing DLL - add to wiki?

2013-02-22 Thread Paul . Czodrowski
Dear RDKitters, just to let you know - I added one % in the WindowsInstallation Wiki entry :) In addition, I added two links to install missing DLLs on Win7 systems - based on George's findings early 2012: http://code.google.com/p/rdkit/wiki/InstallingOnWindows Cheers, Paul

[Rdkit-discuss] descriptor calculation: MOE vs. RDKit implementation

2013-02-11 Thread Paul . Czodrowski
Dear RDKitters, how do the MOE and RDKit implementations of VSA descriptors correlate? I was looking into the documentation and only found a fingerprint correlation plot. Not sure if there is a correlation plot for the descriptors as well - or maybe it was part of my dreams. Given that

Re: [Rdkit-discuss] descriptor calculation: MOE vs. RDKit implementation

2013-02-11 Thread Paul . Czodrowski
Dear Greg, Dear RDKitters, how do the MOE and RDKit implementations of VSA descriptors correlate? I've never checked, so I'm afraid I don't know. At the time we implemented the descriptors, we didn't have access to MOE. I could run a comparison between MOE and RDKit and place the

Re: [Rdkit-discuss] writing sdf.gz files

2012-12-31 Thread Paul . Czodrowski
Dear Greg, thanks a lot for your help! Cheers, Paul The solution from the mailing list http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg02127.html file_name = sys.argv[1]+'.onlylargestfrag.sdf.gz' test_output = gzip.open(file_name,'w+') test_cpd_out =
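The archived solution is cut off after `test_cpd_out =`, so the following is not a reconstruction of it. It is a conservative sketch of one way to get a gzip-compressed SD file under Python 3: write the records into an in-memory text buffer with `Chem.SDWriter`, then compress that text. The example molecules and the output path are assumptions made for illustration, and buffering everything in memory is only reasonable for small files.

```python
import gzip
import os
import tempfile
from io import StringIO

from rdkit import Chem

mols = [Chem.MolFromSmiles(smi) for smi in ("CCO", "c1ccccc1")]

# Write the SD records into an in-memory text buffer first.
buffer = StringIO()
writer = Chem.SDWriter(buffer)
for mol in mols:
    writer.write(mol)
writer.flush()
sdf_text = buffer.getvalue()

# Hypothetical output path, used only for this illustration.
out_path = os.path.join(tempfile.mkdtemp(), "example.sdf.gz")
with gzip.open(out_path, "wt") as handle:
    handle.write(sdf_text)

# Read the compressed file back to confirm the round trip.
with gzip.open(out_path, "rb") as handle:
    read_back = [m for m in Chem.ForwardSDMolSupplier(handle) if m is not None]
print(len(read_back))
```

Reading back through `Chem.ForwardSDMolSupplier(gzip.open(...))` mirrors the pattern visible in the other snippets of this listing.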

[Rdkit-discuss] BRICS - discrepancies regarding the output

2012-12-16 Thread Paul . Czodrowski
Dear RDKitters, when trying to reproduce the BRICS example from the mailing list: http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg00479.html I end up with a different number of BRICS fragments (using 2012/9 release): base = Chem.MolFromSmiles('n1cncnc1OCC(C1CC1)OC1CNC1') catalog
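For comparing fragment counts between releases, a small script that lists the fragments explicitly can help. This sketch uses `BRICS.BRICSDecompose` on the SMILES from the post; it is not necessarily the call used in the thread (the snippet is truncated at `catalog`), and the number of fragments may differ between RDKit versions, so no expected count is given.

```python
from rdkit import Chem
from rdkit.Chem import BRICS

# SMILES taken from the thread; the archive dropped the surrounding quotes.
base = Chem.MolFromSmiles("n1cncnc1OCC(C1CC1)OC1CNC1")

# BRICSDecompose returns a set of fragment SMILES with dummy-atom attachment points.
fragments = sorted(BRICS.BRICSDecompose(base))
for smi in fragments:
    print(smi)
print(len(fragments), "fragments")
```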

Re: [Rdkit-discuss] one flavor of MCS

2012-12-13 Thread Paul . Czodrowski
Dear Andrew, thanks a lot for the quick hack and sorry for my late answer! I'm still interested in that issue, but I had no access to coding facilities for almost two days, what a shame! The current hooks in the MCS code don't make that possible. If you tweak the code a bit, I think you can

[Rdkit-discuss] one flavor of MCS

2012-12-11 Thread Paul . Czodrowski
Dear RDKitters, given a data set of let's say 2000 compounds, how do I extract the most common substructures rather than the maximum common substructures? In addition, I would like to output the frequency of the found substructures E.g., the output would look like that N(CCc1c1)C,

Re: [Rdkit-discuss] pilfont error message

2012-11-20 Thread Paul . Czodrowski
I took a couple things out of the thread: 1) those bad exceptions in the sping code need to be fixed. 2) it would be helpful to have the sping canvas support the font 'serif' as well as 'sans'. I checked in fixes for both of those this morning. cool, thanks! paul

Re: [Rdkit-discuss] RDKit pgSQL cartridge on 64-bit Windows, anyone ?

2012-11-14 Thread Paul . Czodrowski
Dear Jan, nope - but this reminds me of one UGM topic: Could anyone provide a 64bit Win7 build? Cheers & Thanks, Paul Hi RDKitters, Before I embark on this journey - has anyone else attempted compiling and running the RDKit pgSQL cartridge on 64-bit Windows ? Gotchas, success stories, and

Re: [Rdkit-discuss] descriptor calculation - fragment counts Co.

2012-11-11 Thread Paul . Czodrowski
I'm wondering about the total number of accessible descriptors in RDKit: This is my code: import sys from rdkit import Chem from rdkit.Chem import Descriptors from rdkit.ML.Descriptors import MoleculeDescriptors file_in = sys.argv[1] file_out = file_in+'.descr.sdf' ms =
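The question about the total number of descriptors can be answered directly from the installed RDKit. The sketch below counts the entries in `Descriptors._descList` and evaluates them for one molecule; the aspirin SMILES is only an example input, and the count depends on the RDKit version, so it is printed rather than stated here.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors
from rdkit.ML.Descriptors import MoleculeDescriptors

# Descriptors._descList holds (name, function) pairs for the registered descriptors.
descriptor_names = [name for name, _ in Descriptors._descList]
calculator = MoleculeDescriptors.MolecularDescriptorCalculator(descriptor_names)

mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")  # aspirin, as an example input
values = calculator.CalcDescriptors(mol)

print(len(descriptor_names), "descriptors available in this RDKit build")
for name, value in list(zip(descriptor_names, values))[:5]:
    print(name, value)
```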

Re: [Rdkit-discuss] Crippen pKa model in RDKit

2012-10-23 Thread Paul . Czodrowski
Dear James, that's a wonderful piece of work! I have attached the script (in .py and .pynb formats – it really is nice working interactively in the iPython notebook!). I have also attached the modified form of the decision tree data that was used. I hope these attachments come through ok,

Re: [Rdkit-discuss] computing molecular descriptors for a small molecule

2012-10-15 Thread Paul . Czodrowski
this one should work: http://code.google.com/p/rdkit/wiki/descriptor_calculation Paul C Hello, Let's say I am in Python and have a molecule in a .sdf file, how do I compute all molecular descriptors for that molecule? By all molecular descriptors, I mean all that rdkit knows about.

Re: [Rdkit-discuss] neutralize charges

2012-10-12 Thread Paul . Czodrowski
Dear Hans, you might also consider uploading a short wiki entry with the SMARTS patterns. Cheers & Thanks, Paul Hi Andrew, We are doing this with a set of Smarts-based replacements. I can send you the Smarts but you will have to wait a couple of days as I cannot access them right now.

[Rdkit-discuss] set default protonation

2012-09-13 Thread Paul . Czodrowski
Dear RDKitters, has anyone worked so far on this topic: http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01706.html Rebalance protonation states by deprotonating strong acids and/or protonating strong bases This would be easy to do given a set of SMARTS patterns defining

Re: [Rdkit-discuss] ML question

2012-08-26 Thread Paul . Czodrowski
Dear Greg, # actual prediction prediction_dictionary = {} for x in cpds_w_descr: pred,conf=cmp.ClassifyExample(x[1:]) NAME=x[0] prediction_dictionary[NAME]=pred,conf i+=1 for mol in cpds: mol_name = mol.GetProp('_Name')

Re: [Rdkit-discuss] 3D SDF with hydrogens set - where have my hydrogens gone

2012-06-14 Thread Paul . Czodrowski
Who hasn't been bitten by this? Try: SDMolSupplier(isdf, removeHs=False) Wonderful, Jean-Paul, thank you very much! paul
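The effect of `removeHs=False` is easy to check on a tiny file. This sketch writes ethanol with explicit hydrogens to a temporary SD file (the path and molecule are assumptions for illustration) and reads it back with and without the flag.

```python
import os
import tempfile

from rdkit import Chem

# Ethanol with explicit hydrogens: 3 heavy atoms + 6 H = 9 atoms.
mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))

# Hypothetical temporary file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "ethanol.sdf")
writer = Chem.SDWriter(path)
writer.write(mol)
writer.close()

# Default behaviour: hydrogens are stripped while reading.
default_mol = Chem.SDMolSupplier(path)[0]
# With removeHs=False the hydrogens from the file are kept.
with_hs_mol = Chem.SDMolSupplier(path, removeHs=False)[0]

print(default_mol.GetNumAtoms(), with_hs_mol.GetNumAtoms())
```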

Re: [Rdkit-discuss] PIL - molecule depiction

2012-05-08 Thread Paul . Czodrowski
Dear Greg, PIL is a separate python package that you need to install: http://www.pythonware.com/products/pil/ I guess one important question that I forgot to ask is: Why are you trying to do this; from rdkit.sping import PIL Forget about that! Here, I was totally on the wrong track...

[Rdkit-discuss] PIL - molecule depiction

2012-05-07 Thread Paul . Czodrowski
Dear RDKitters, when typing in from rdkit.sping import PIL I end up with this error message Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/SW/python/x86_64/2.6/lib/python2.6/site-packages/rdkit/sping/PIL/__init__.py", line 2, in <module> from pidPIL import * File

Re: [Rdkit-discuss] PIL - molecule depiction

2012-05-07 Thread Paul . Czodrowski
Dear Greg, Dear Paul, thanks for your prompt answers! Dear RDKitters, when typing in from rdkit.sping import PIL I end up with this error message Traceback (most recent call last): File "<stdin>", line 1, in <module> File

Re: [Rdkit-discuss] ctest of new build - testGrid

2012-04-26 Thread Paul . Czodrowski
Dear RDKitters, I just wanted to mention that this issue was resolved - RDBASE was not correctly set. A classic, at least from the mailing list point of view... Thanks2Greg, Cheers, Paul when running ctest on a new build, I run into an issue when it comes to testGrid - this step

[Rdkit-discuss] ctest of new build - testGrid

2012-04-24 Thread Paul . Czodrowski
Dear RDKitters, when running ctest on a new build, I run into an issue when it comes to testGrid - this step takes ages and does not stop... Of course, I checked the mail archive and found out some related problems: * --output-on-failure does not give any additional information Simply

Re: [Rdkit-discuss] Gobbi_Pharm2D

2012-01-13 Thread Paul . Czodrowski
I think I found the solution by myself: from rdkit.Chem.Pharm2D import Gobbi_Pharm2D fds=Gobbi_Pharm2D.factory.featFactory.GetFeatureDefs() followed by fds['COO'] = 'C(=O)O' COOPattern = Chem.MolFromSmarts(fds['COO']) and now I can search for carboxylic groups as well. the interested
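The SMARTS from this post can be tried on its own before wiring it into the Gobbi feature definitions. One caveat worth seeing in practice: `C(=O)O` also matches esters. The stricter pattern below is an assumption added for comparison, not something taken from the thread.

```python
from rdkit import Chem

# Pattern from the post; note that it also matches esters, not only free acids.
coo_pattern = Chem.MolFromSmarts("C(=O)O")
# A stricter pattern (an assumption, not from the thread) that requires an O-H.
acid_pattern = Chem.MolFromSmarts("[CX3](=O)[OX2H1]")

results = {}
for smi in ("CC(=O)O", "CC(=O)OC", "CCO"):  # acetic acid, methyl acetate, ethanol
    mol = Chem.MolFromSmiles(smi)
    results[smi] = (mol.HasSubstructMatch(coo_pattern),
                    mol.HasSubstructMatch(acid_pattern))
    print(smi, results[smi])
```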

[Rdkit-discuss] Gobbi_Pharm2D

2012-01-12 Thread Paul . Czodrowski
Dear RDKitters, is there any way of tailoring the Gobbi_Pharm2D fingerprints? Or, to put it another way: is it possible to code one's own definitions which can be used for querying? Cheers & Thanks, Paul

Re: [Rdkit-discuss] Using Gobbi_Pharm2D - ctd

2011-11-09 Thread Paul . Czodrowski
Dear Greg, this is VERY helpful, thanks a lot! Nonetheless, I might come up with a few more questions... Cheers, Paul Dear Paul, On Mon, Nov 7, 2011 at 12:44 PM, paul.czodrow...@merckgroup.com wrote: can anyone give an example of how to use the acid/base pharmacophore

[Rdkit-discuss] Using Gobbi_Pharm2D - ctd

2011-11-07 Thread Paul . Czodrowski
Dear RDKitters, can anyone give an example of how to use the acid/base pharmacophore annotation using the Gobbi_Pharm2D fingerprints? Cheers & Thanks, Paul
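As a starting point for this question, here is a minimal sketch of generating a Gobbi 2D pharmacophore fingerprint and printing what the first few set bits stand for. The example molecule is an assumption chosen because it carries a carboxylic acid; whether a given bit corresponds to an acid or base feature has to be read from the bit descriptions that are printed.

```python
from rdkit import Chem
from rdkit.Chem.Pharm2D import Generate, Gobbi_Pharm2D

mol = Chem.MolFromSmiles("OCC=CC(=O)O")  # example molecule with an acid group
fingerprint = Generate.Gen2DFingerprint(mol, Gobbi_Pharm2D.factory)

on_bits = list(fingerprint.GetOnBits())
print(len(on_bits), "bits set")
for bit in on_bits[:5]:
    # Each description names the feature types and the distance bin of the bit.
    print(bit, Gobbi_Pharm2D.factory.GetBitDescription(bit))
```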

Re: [Rdkit-discuss] Using Gobbi_Pharm2D

2011-11-02 Thread Paul . Czodrowski
Dear Greg, if my query molecule contains any acidic functionality, I would like to get all acids in the database - as defined by the Gobbi pharmacophore fingerprint. Can this be handled by the Gobbi fingerprints? In the GettingStartedTutorial, there is a table (on the very last page) of the

[Rdkit-discuss] Using Gobbi_Pharm2D

2011-10-27 Thread Paul . Czodrowski
Dear RDKitters, within a given list of SMILES codes, I would like to get matches between acidic functionalities and basic functionalities as well when I do a search against a query SMILES code. To be more precise: In the example given below, I would like to get all the entries in the db_smiles list

[Rdkit-discuss] multiprocessing rdkit

2011-10-11 Thread Paul . Czodrowski
Dear RDkitters, I'm trying to use Python's multiprocessing module in conjunction with RDKit. It should be applied in 2 cases: (1) fingerprint calculation (2) Picking Diverse Molecules (1) from multiprocessing import Pool p4 = Pool(processes=4) def fps_calc(m): fps =

Re: [Rdkit-discuss] multiprocessing rdkit

2011-10-11 Thread Paul . Czodrowski
Dear Jean-Paul, (1) from multiprocessing import Pool p4 = Pool(processes=4) ms = [x for x in Chem.SDMolSupplier('cpds.sdf') if x is not None] def fps_calc(m):        fps = [GetMorganFingerprint(x,3) for x in m]        return fps fps =  p4.map(fps_calc,ms) == TypeError:
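`Pool.map` calls the worker once per list element, so a worker that loops over its argument will fail when it receives a single molecule. A sketch of one possible arrangement is below: the worker takes one SMILES string and returns plain Python data, and the pool is created only after the worker is defined. The function names, example SMILES and the choice of a 2048-bit Morgan fingerprint are assumptions for illustration.

```python
from multiprocessing import Pool

from rdkit import Chem
from rdkit.Chem import AllChem


def fingerprint_bits(smiles):
    """Worker: takes ONE SMILES string and returns its Morgan on-bit indices."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, 3, nBits=2048)
    return list(fp.GetOnBits())


def parallel_fingerprints(smiles_list, processes=4):
    # The pool is created after the worker is defined so child processes can find it.
    with Pool(processes=processes) as pool:
        return pool.map(fingerprint_bits, smiles_list)


if __name__ == "__main__":
    results = parallel_fingerprints(["CCO", "c1ccccc1", "CC(=O)O"])
    print([len(bits) for bits in results])
```

Passing strings in and lists of integers out keeps the data exchanged between processes simple; the fingerprints can be rebuilt from the bit indices in the parent process if needed.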

Re: [Rdkit-discuss] multiprocessing rdkit

2011-10-11 Thread Paul . Czodrowski
Dear Greg, Dear Nik, thanks for your very quick and informative answers! I hope you two guys do not follow a battle rightaway :) (2) from multiprocessing import Pool p4 = Pool(processes=4) def distij(i,j,fps=fps):        return 1-DataStructs.DiceSimilarity(fps[i],fps[j])

Re: [Rdkit-discuss] how to come to a good model

2011-10-10 Thread Paul . Czodrowski
Dear Greg, One suggestion would be to try doing a two class model (either combine two of your classes together or use only classes 0 and 2 in the training) and see if that helps. Another would be try using different descriptors. You might be able to get something useful with the FeatMorgan

[Rdkit-discuss] how to come to a good model

2011-10-07 Thread Paul . Czodrowski
Dear RDKitters, I'm in the process of training a 3-class decision tree model. I have roughly 1500 compounds with an almost equal distribution of the 3 classes. This is the Grow command I'm using for the MorganFP model: nPossible = [0]+[2]*2048+[3]
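The `nPossible` list in this post appears to follow a fixed layout: one entry for the leading name column, one entry per fingerprint bit (two possible values each), and a final entry for the number of activity classes. The helper below is a hypothetical convenience that only spells out that reading; how the RDKit ML code interprets the leading 0 is an assumption on my part.

```python
def build_n_possible(n_bits, n_classes):
    """Layout: [name column] + [one entry per fingerprint bit] + [activity column]."""
    return [0] + [2] * n_bits + [n_classes]


n_possible = build_n_possible(2048, 3)
print(len(n_possible), n_possible[0], n_possible[1], n_possible[-1])
```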

Re: [Rdkit-discuss] cmp - store model on disk

2011-08-21 Thread Paul . Czodrowski
Dear Greg, Dear Paul, On Sat, Aug 20, 2011 at 4:35 PM, paul.czodrow...@merckgroup.com wrote: after having trained a model cmp = Composite() cmp.Grow (pts,attrs=attrs,nPossibleVals=nPossible,nTries=1, buildDriver=CrossValidate.CrossValidationDriver,treeBuilder=QuantTreeBoot,  

Re: [Rdkit-discuss] a documentation experiment

2011-08-09 Thread Paul . Czodrowski
Dear Greg, very neat & nifty - I already like it a lot! Paul I'd be very happy for feedback on what people think of the new format. -greg

Re: [Rdkit-discuss] confusion matrix

2011-08-01 Thread Paul . Czodrowski
Dear Greg, I have added the code how I generate my own confusion matrix to the Wiki. In my understanding, my function uses the predictions from the out-of-bag prediction. But I guess that I have overlooked some nasty detail. You call: pred,conf=cmp.ClassifyExample(pts[i]) This uses

[Rdkit-discuss] confusion matrix

2011-07-28 Thread Paul . Czodrowski
Dear RDKitters, when training a solubility model (see http://code.google.com/p/rdkit/wiki/TrainAThreeClassSolubilityModel), I run into the problem that three different confusion matrices are output. I wonder what the origin of these confusion matrices is. Even though x- and y-axis might be
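When several confusion matrices disagree, it can help to build one by hand from explicit lists of true and predicted classes, so that the row/column convention and the set of examples (training data, held-out data, or out-of-bag predictions) are under your control. The sketch below is plain Python and independent of the RDKit ML code; the label lists are made-up illustration data, and rows are taken to be the true class by convention.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Rows are the true class, columns the predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[true][predicted] += 1
    return matrix


# Made-up labels for a 3-class problem, for illustration only.
true_labels = [0, 0, 1, 1, 2, 2]
predicted_labels = [0, 1, 1, 1, 2, 0]

matrix = confusion_matrix(true_labels, predicted_labels, 3)
for row in matrix:
    print(row)
```

Swapping the two arguments transposes the matrix, which is one simple way two otherwise identical evaluations can end up looking different.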

[Rdkit-discuss] more questions regarding psql

2011-07-06 Thread Paul . Czodrowski
Dear all, I manually copied the Code directory after building rdkit into $RDBASE - don't understand why it was not copied over... Running make in $RDBASE/Code/PgSQL/rdkit gives the following: g++ -fomit-frame-pointer -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector

[Rdkit-discuss] questions regarding psql

2011-07-05 Thread Paul . Czodrowski
dear rdkitters, i'm following greg's great emolecules tutorial. psql was installed by our sysadmin on my linux pc. therefore, i'm not 100% familiar with all installation settings. createdb works. but i wonder where to find rdkit.sql? according to the wiki (

Re: [Rdkit-discuss] questions regarding psql

2011-07-05 Thread Paul . Czodrowski
Dear Greg, Dear Adrian, thanks for your hints! I just rebuilt rdkit (the beta Q2 release), but I cannot find $RDBASE/Code/PgSQL/rdkit Is there any flag/setting I overlooked? I did the following: make -D PYTHON_NUMPY_INCLUDE_PATH=/SW/python/lib/python2.6/site-packages/numpy/core/include/

[Rdkit-discuss] apply model - use Composite

2011-06-09 Thread Paul . Czodrowski
Dear all, I'm trying to apply a Composite model on a test set. However, no output is generated. At least there is no error/warning, but I cannot judge if the model gives any predictions. This is the model

Re: [Rdkit-discuss] ML, rdkit, 3class model

2011-06-08 Thread Paul . Czodrowski
Dear all, Dear Paul, On Tue, Jun 7, 2011 at 4:54 PM, paul.czodrow...@merck.de wrote: Dear folks, finally, I updated the Wiki entry for the 3class model: http://code.google.com/p/rdkit/wiki/TrainAThreeClassSolubilityModel Do you have any explanation for the bad statistics? [see

[Rdkit-discuss] mysql,frowns -- rdkit

2011-05-30 Thread Paul . Czodrowski
Dear folks, I have some old code which I would like to convert smoothly to rdkit. I have set-up a MySQL database of molecules given MOL2 and SMILES. The substructure search happens via frowns: db_smiles = db.do('SELECT ligand_id, smiles FROM ligands;') ligand_ids = [] for db_smile in
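For moving a frowns-style loop over to RDKit, the core pattern is to parse each stored SMILES and test it against a query. In this sketch the `db_rows` list stands in for the rows returned by the SQL query, and the benzene query is a hypothetical example; neither is taken from the original code.

```python
from rdkit import Chem

# Stand-in for the rows returned by the SQL query: (ligand_id, smiles) pairs.
db_rows = [(1, "CC(=O)O"), (2, "c1ccccc1O"), (3, "CCN")]

query = Chem.MolFromSmarts("c1ccccc1")  # hypothetical query: a benzene ring

matching_ids = []
for ligand_id, smiles in db_rows:
    mol = Chem.MolFromSmiles(smiles)
    # Skip entries that fail to parse instead of aborting the whole search.
    if mol is not None and mol.HasSubstructMatch(query):
        matching_ids.append(ligand_id)
print(matching_ids)
```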

[Rdkit-discuss] Antwort: Re: Antwort: Re: Antwort: Re: random forest in RDKit - ctd.

2011-05-11 Thread Paul . Czodrowski
Dear Jean-Paul & Greg, the solution is too easy... Thanks! Cheers, Paul try: from rdkit import ML (MY FIRST HELPFUL POST!!) Jean-Paul Ebejer Early Stage Researcher InhibOx Ltd Pembroke House 36-37 Pembroke Street Oxford OX1 1BP UK (+44 / 0) 1865 262 034

[Rdkit-discuss] Antwort: Ways people can help

2011-05-11 Thread Paul . Czodrowski
Dear Greg, - Write todos, tutorials, or sample scripts I would jump here, even though I'm quite new to RDKit. Since I'm especially interested in the ML capabilities, I would be willing to expand the ML tutorial you started. e.g. I could provide a literature data set for solubility - huah, not

[Rdkit-discuss] Antwort: Re: random forest in RDKit - ctd.

2011-05-10 Thread Paul . Czodrowski
Dear Greg, However, I wonder how to build a 3-class model: for i,m in enumerate(ms):  if m.GetProp('ACTIVITY_CLASS')=='active':  act=1  else:  act=0  pts.append([m.GetProp('CompoundName')]+list(descrs[i])+[act]) Naively, I just tried act 0,1 or 2 - but this did not

[Rdkit-discuss] PDB processing

2011-05-09 Thread Paul . Czodrowski
Dear folks, is there a way to add occupancy & B-factors (e.g. 1.00 50.0) to a PDB file? Thanks & Cheers, Paul
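Independent of any toolkit, occupancy and B-factor sit in fixed columns of ATOM/HETATM records (columns 55-60 and 61-66 of the PDB format), so they can be filled in with plain string handling. The function below is a text-level sketch, not an RDKit API; its name, defaults and the example record are assumptions for illustration.

```python
def set_occupancy_and_bfactor(pdb_line, occupancy=1.00, bfactor=50.00):
    """Return the record with occupancy (cols 55-60) and B-factor (cols 61-66) set."""
    if not pdb_line.startswith(("ATOM", "HETATM")):
        return pdb_line  # leave all other record types untouched
    line = pdb_line.rstrip("\n").ljust(80)
    updated = line[:54] + f"{occupancy:6.2f}{bfactor:6.2f}" + line[66:]
    return updated.rstrip()


# Example ATOM record that ends after the coordinates (column 54).
example = "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
    1, "N", "ALA", "A", 1, 11.104, 6.134, -6.504)

patched = set_occupancy_and_bfactor(example)
print(patched)
```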

[Rdkit-discuss] random forest in RDKit - ctd.

2011-05-09 Thread Paul . Czodrowski
Dear Greg, the Wiki is a great place to start right from scratch with the RDKit ML capabilities! However, I wonder how to build a 3-class model: for i,m in enumerate(ms): if m.GetProp('ACTIVITY_CLASS')=='active': act=1 else: act=0
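For three classes, the `if/else` in this snippet can be replaced by a lookup from class name to integer. The class names below are hypothetical (the thread only shows `'active'`), as are the helper name and the example row.

```python
# Hypothetical class names; the thread only shows the 'active' case.
CLASS_TO_INT = {"inactive": 0, "intermediate": 1, "active": 2}


def make_point(name, descriptors, activity_class):
    """Build one training row: [name, descriptor values..., integer class]."""
    return [name] + list(descriptors) + [CLASS_TO_INT[activity_class]]


row = make_point("cpd-1", (0.5, 1.2), "intermediate")
print(row)
```

An unknown class name raises a `KeyError`, which is usually preferable to silently assigning a default class.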

[Rdkit-discuss] rdBase.so

2011-05-04 Thread Paul . Czodrowski
Dear folks, what could be the reason causing the following error: [GCC 4.3.2 20080917 (Red Hat 4.3.2-4)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from rdkit import Chem Traceback (most recent call last): File "<stdin>", line 1, in <module> File

[Rdkit-discuss] PubChem search

2011-01-19 Thread Paul . Czodrowski
Dear RDKit users, has anyone used RDKit for local searches of PubChem? Can approximate performance numbers be given for how long a substructure search takes for, let's say, 50 million compounds? Best regards, Paul

[Rdkit-discuss] Antwort: Installation fails for KNIME nodes

2010-11-16 Thread Paul . Czodrowski
Dear all, I extracted the archive and copied the files to the plugins/ and features/ directories - now it works! Thanks to George Papadatos for his help! Cheers, Paul Dear all, I tried to install the RDKit KNIME nodes, but it fails on my machine. The startup window of the installation

[Rdkit-discuss] Installation fails for KNIME nodes

2010-11-14 Thread Paul . Czodrowski
Dear all, I tried to install the RDKit KNIME nodes, but it fails on my machine. The startup window of the installation states: The software items you selected may not be valid with your current installation. Do you want to open the wizard anyway to review the selections? I'm running KNIME 2.2.2