Re: [Rdkit-discuss] ReactionFromSmarts and RunReactants
Hi Monica,

As Greg said: why do you expect your product not to be fragmented? I suggest you take a careful look at the SMARTS definition (for instance from Daylight: http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html).

In SMARTS patterns, the dot indicates disconnected fragments. Your product template contains dots, so you obtain disconnected fragments in your product.

Best,
Grégori

On 2016-04-22 10:13, 吴玲 wrote:
> Greg,
>
> After changing the pattern into
> "[C:1][C:2](=[O:3])[N+0:4][C:5]=[C:6][C:7]([O-:8])=[O:9].[OH2:10].[OH2:11]>>[C:1][C:2]([O-:10])=[O:3].[O:8]=[C:7]=[O:9].[N+:4].[C:6][C:5]=[O:11]"
> I get the same result. A ring structure, just like you said, is broken into pieces.
>
> pattern: [C:1][C:2](=[O:3])[N+0:4][C:5]=[C:6][C:7]([O-:8])=[O:9].[OH2:10].[OH2:11]>>[C:1][C:2]([O-:10])=[O:3].[O:8]=[C:7]=[O:9].[N+:4].[C:6][C:5]=[O:11]
>
> output: [O-]C(=O)C1=CNC(=O)CC1.O.O>>CCC(=O)[O-].O=C=[O-].[NH4+].CCC=O
>
> The input reactant ([O-]C(=O)C1=CNC(=O)CC1), if mapped with the pattern's reactants, should be
> "[O-:8][C:7](=[O:9])[C:6]1=[C:5][N:4][C:2](=[O:3])[C:1][C:*]1".
>
> What I mean is that the input reactant has a ring in which CARBON6 is connected to CARBON1 through CARBON*. Why are CARBON6 and CARBON1 in the product not connected by CARBON*?
>
> Thanks a lot,
>
> -Monica
>
> At 2016-04-20 12:43:36, "Greg Landrum" wrote:
>
>> Monica,
>>
>> Why do you think you should get a single chain from the ring structure?
>> If you look at the atom mapping in your input reaction:
>>
>> [C:1][C:2](=[O:3])[N+0:4][C:5]=[C:6][C:7]([O-:8])=[O:9].[OH2:10].[OH2:11]>>[C:1][C:2]([O-:10])=[O:3].[O:7]=[C:8]=[O:9].[N+:4].[C:6][C:5]=[O:11]
>>
>> You've told it to put:
>> - carbons 1 and 2 from the reactant into product 1
>> - carbon 8 into product 2, carbon 7 into product 2 but converted into an oxygen first.
>> - carbon 5 and 6 into product 4.
>> In other words: you requested that the input molecule be broken into pieces.
>>
>> The atom mapping tells you which atoms in the reactants correspond to which atoms in the products.
>>
>> I would suggest that you look carefully at your reaction definitions, in particular the atom mapping numbers, in Marvin and make sure that you think that what you're asking for makes sense.
>>
>> -greg
>>
>> On Wed, Apr 20, 2016 at 6:04 AM, 吴玲 wrote:
>>
>>> Hi Greg,
>>>
>>> Thanks a lot for the previous help! I have another question about RunReactants.
>>>
>>> When I input the pattern
>>> "[C:1][C:2](=[O:3])[N+0:4][C:5]=[C:6][C:7]([O-:8])=[O:9].[OH2:10].[OH2:11]>>[C:1][C:2]([O-:10])=[O:3].[O:7]=[C:8]=[O:9].[N+:4].[C:6][C:5]=[O:11]"
>>> and give the reactants
>>> rs1 = ["[O-]C(=O)C1=CNC(=O)CC1","CC(=O)NC=C(CC([O-])=O)C([O-])=O","CC(=O)NC=C(C(CO)C([O-])=O)C([O-])=O"],
>>> rs2 = ['O'], rs3 = ['O'],
>>> and then check the output products, I find that the first reactant, which has a ring structure, is divided into four pieces instead of three: an NH4+, a CO2, and a long chain composed of the remaining two fragments. In other words, I think that the product should be like this:
>>>
>>> What is wrong with this pattern? Why is the prediction for this reactant wrong while the others are correct? (Output attached below.)
>>>
>>> pattern:
>>>
>>> 1 [O-]C(=O)C1=CNC(=O)CC1.O.O>>CCC(=O)[O-].O=C=O.[NH4+].CCC=O
>>>
>>> 2 CC(=O)NC=C(CC([O-])=O)C([O-])=O.O.O>>CC(=O)[O-].O=C=O.[NH4+].O=(=O)[O-]
>>>
>>> 3 CC(=O)NC=C(C(CO)C([O-])=O)C([O-])=O.O.O>>CC(=O)[O-].O=C=O.[NH4+].O=CCC(CO)C(=O)[O-]
>>>
>>> best wishes,
>>>
>>> monica

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
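To make Greg's point about the atom mapping concrete, here is a minimal sketch that lists which atom-map numbers land in which product template of the revised pattern from this thread. It uses plain string handling rather than RDKit, and the helper name product_atom_maps is hypothetical. It assumes that every dot on the product side separates two templates, which holds for this pattern but is not a general SMARTS parser.

```python
import re

def product_atom_maps(reaction_smarts):
    """Return, for each dot-separated product template, its atom-map numbers.

    Hypothetical helper for illustration only: it splits the product side on
    '.' and collects the numbers that follow ':' inside atom brackets.
    """
    _, products = reaction_smarts.split(">>")
    return [[int(n) for n in re.findall(r":(\d+)\]", template)]
            for template in products.split(".")]

# Revised pattern quoted in the thread.
rxn = ("[C:1][C:2](=[O:3])[N+0:4][C:5]=[C:6][C:7]([O-:8])=[O:9].[OH2:10].[OH2:11]"
       ">>[C:1][C:2]([O-:10])=[O:3].[O:8]=[C:7]=[O:9].[N+:4].[C:6][C:5]=[O:11]")

maps = product_atom_maps(rxn)
# maps == [[1, 2, 10, 3], [8, 7, 9], [4], [6, 5, 11]]

# Which product template does each mapped atom end up in?
product_index = {n: i for i, template in enumerate(maps) for n in template}
```

Atoms 1 and 6 sit in different product templates (index 0 and index 3), so the pattern itself asks for them to end up in separate fragments, regardless of the unmapped ring carbon that joined them in the reactant.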
Re: [Rdkit-discuss] Similarity maps using machine learning
Hi,

Getting the similarity maps for multiple structures should be straightforward: just add the function call inside your loop.

To export the generated images to a web service, I did the following (assuming the figure returned by SimilarityMaps.GetSimilarityMapForModel() is called "fig"):

    import io
    from matplotlib import pyplot as plt

    buf = io.BytesIO()
    DPI = fig.get_dpi()
    plt.savefig(buf, format='png', bbox_inches='tight', pad_inches=0, dpi=DPI/2)
    plt.close(fig)
    img = buf.getvalue()
    buf.close()

I remember having to play around with the picture size, coordScale and dpi to get an image of the proper size (as requested by the user, in pixels) with an adequate font size:

    fig = Draw.MolToMPL(mol, coordScale=1.5, size=(int(width/1.3), int(height/1.3)))

I can't remember the exact details, but these parameters are worth looking at if the image you get is not OK.

I use Flask for the web server; I send the picture through the server this way:

    flask.send_file(io.BytesIO(img), mimetype='image/png')

Best,
Grégori

On Wednesday, December 13, 2017 06:39 CET, Greg Landrum wrote:
> I know that Michal has done some work with this as part of the beta for the new ChEMBL interface.
> @Michal: do you have a bit of time to explain what you did in order to get images that you could serve via the web?
>
> On Tue, Dec 12, 2017 at 1:50 PM, Bruno Neves wrote:
>> Dear colleagues,
>>
>> I want to develop mechanistically interpretable machine learning models (i.e., using similarity maps) and implement them in web services. I've already managed to generate a map from a SMILES. However, I cannot generate maps for multiple molecules in a data set (CSV or SDF). I'm also having some difficulty saving the new similarity map images. The scripts available in the RDKit tutorial do not provide enough detail to solve this problem. Do you have any idea how I can solve this?
>>
>>     # Use the random forest to predict a new molecule (SMILES)
>>     m = Chem.MolFromSmiles('FC(F)(F)C1=CC=C(OC(CCNC2=CC=CC=C2)C2=CC=CC=C2)C=C1')
>>     fp = np.zeros((1,))
>>     DataStructs.ConvertToNumpyArray(AllChem.GetMorganFingerprintAsBitVect(m, radius, nBits, useFeatures), fp)
>>     print(rf.predict((fp,)))
>>     print(rf.predict_proba((fp,)))
>>
>>     # Get predicted probability map
>>     def getProba(fp, predictionFunction):
>>         return predictionFunction((fp,))[0][1]
>>     fig, maxweight = SimilarityMaps.GetSimilarityMapForModel(m, SimilarityMaps.GetMorganFingerprint, lambda x: getProba(x, rf.predict_proba))
>>
>>     # Open CSV file with multiple molecules
>>     m = pd.read_csv('C:\\Users\\bruno\\Desktop\\maps\\data\\logBB_S.csv', delimiter=',')
>>     mols = []
>>     y = []
>>     for mol in Chem.SDMolSupplier(fname):
>>         if mol is not None:
>>             mols.append(mol)
>>     fps = [AllChem.GetMorganFingerprintAsBitVect(m, radius, nBits, useFeatures) for m in mols]
>>     def rdkit_np_convert(fp):
>>         output = []
>>         for f in fp:
>>             arr = np.zeros((1,))
>>             DataStructs.ConvertToNumpyArray(f, arr)
>>             output.append(arr)
>>         return np.asarray(output)
>>     x = rdkit_np_convert(fps)
>>     x.shape
>>     print(fp)
>>     print(rf.predict(x))
>>     print(rf.predict_proba((x)))
>>
>>     # Get predicted probability maps for multiple structures??
>>
>> Best regards,
>>
>> Prof. Dr. Bruno Junior Neves
>> Laboratório de Quimioinformática
>> Centro Universitário de Anápolis - UniEVANGÉLICA
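The "add the function call in your loop" advice can be sketched as follows. This is a hedged outline, not the code used in the thread: the helper names figure_to_png_bytes and maps_for_molecules and the dpi_scale parameter are hypothetical, and the figure is only assumed to offer get_dpi() and savefig(), which matplotlib Figure objects do. Saving through the figure object rather than through pyplot avoids depending on which figure happens to be current.

```python
import io

def figure_to_png_bytes(fig, dpi_scale=0.5):
    """Serialize a matplotlib-style figure to PNG bytes.

    dpi_scale is a hypothetical knob that mirrors the DPI/2 used in the thread.
    """
    buf = io.BytesIO()
    fig.savefig(buf, format="png", bbox_inches="tight", pad_inches=0,
                dpi=fig.get_dpi() * dpi_scale)
    return buf.getvalue()

def maps_for_molecules(mols, make_figure, close_figure=None):
    """Call make_figure(mol) once per molecule and collect the PNG bytes.

    make_figure is supplied by the caller; close_figure (for example
    matplotlib.pyplot.close) releases each figure after it has been saved.
    """
    images = []
    for mol in mols:
        fig = make_figure(mol)
        images.append(figure_to_png_bytes(fig))
        if close_figure is not None:
            close_figure(fig)
    return images
```

With the names from the thread, make_figure could be a small function that returns the first element of SimilarityMaps.GetSimilarityMapForModel(mol, SimilarityMaps.GetMorganFingerprint, lambda x: getProba(x, rf.predict_proba)), since that call returns the figure together with the maximum weight. Each element of the returned list can then be passed to flask.send_file via io.BytesIO as shown above.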
Re: [Rdkit-discuss] Doing substructure search as quickly as possible...
Hi Alexis,

Knowing what you want to achieve, I would approach the problem the other way around. Instead of matching your many fragments against your input structure, I would apply to the input structure the same transformation(s) you apply to your fragments.

I know that you replace all non-hydrogen atoms by "any" atoms, and all single/double/triple bonds by "any" bonds. You could instead store a list of fragments in which all non-hydrogen atoms are replaced by carbons and all bonds by single bonds, then calculate and store the fingerprints of these fragments. Finally, you apply the same transformation to your input structure, calculate its fingerprint, and do your substructure search.

Best,
Grégori

On Monday, February 10, 2020 16:08 CET, Alexis Parenty wrote:
> Dear Rdkiters,
>
> I am interested in doing substructure searches between many thousands of structures and many thousands of fragments, as quickly as possible, with reasonable accuracy (> 0.95)...
>
> I did read Greg's excellent post on that subject: http://rdkit.blogspot.com/2019/07/a-couple-of-substructure-search-topics.html
>
> I was using the RDKit pattern fingerprint approach to filter out any fragments that have no chance of matching the bigger structure through the slower and more accurate molecular graph approach, saving a lot of time.
>
> However, I realized that this pattern fingerprint approach only works well if we compare SMILES with SMILES:
>
>     def frag_is_a_substructure_of_structure_via_pfp(frag, smiles):
>         pfp_frag = Chem.PatternFingerprint(Chem.MolFromSmiles(frag))
>         pfp_structure = Chem.PatternFingerprint(Chem.MolFromSmiles(smiles))
>         frag_bits = set(pfp_frag.GetOnBits())
>         structure_bits = set(pfp_structure.GetOnBits())
>         if frag_bits.issubset(structure_bits):
>             return True
>         else:
>             return False
>
> Unfortunately, some of my fragments are SMARTS that are not valid SMILES. Using Chem.MolFromSmarts(smarts) gives really poor results (many false positives, leading to poor specificity). Interestingly, there are no false negatives, leading to a sensitivity of 1!
>
>     def frag_is_a_substructure_of_structure_via_pfp(frag, smiles):
>         pfp_frag = Chem.PatternFingerprint(Chem.MolFromSmarts(frag))
>         pfp_structure = Chem.PatternFingerprint(Chem.MolFromSmiles(smiles))
>         frag_bits = set(pfp_frag.GetOnBits())
>         structure_bits = set(pfp_structure.GetOnBits())
>         if frag_bits.issubset(structure_bits):
>             return True
>         else:
>             return False
>
> Is there a way to use the pattern fingerprint (or another method) for substructure searches independently of the SMILES/SMARTS format of the fragments? If not, is mol_struct.HasSubstructMatch(mol_frag) the only way I am left with?
>
> Many thanks,
> Alexis
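Since the fingerprint screen in this thread produces false positives but no false negatives, one common arrangement is to use it only as a pre-filter and confirm the survivors with an exact graph match. The sketch below shows that two-stage logic on plain Python sets; the function names, the toy bit sets and the exact_match callback are all hypothetical stand-ins (in practice the bits would come from GetOnBits() and the callback would wrap HasSubstructMatch).

```python
def passes_screen(frag_bits, structure_bits):
    """Cheap test: every on-bit of the fragment must be on in the structure."""
    return set(frag_bits).issubset(structure_bits)

def substructure_hits(frag_fps, structure_bits, exact_match):
    """Screen with fingerprints first, then confirm survivors exactly.

    frag_fps maps a fragment name to its set of on-bits; exact_match(name) is
    the slow, accurate check and is only called for fragments that pass the
    screen.
    """
    hits = []
    for name, bits in frag_fps.items():
        if passes_screen(bits, structure_bits) and exact_match(name):
            hits.append(name)
    return hits

# Toy data: "B" is rejected by the screen, "C" passes the screen but is a
# false positive that the exact check rejects.
frag_fps = {"A": {1, 4}, "B": {2, 9}, "C": {1, 4, 7}}
structure_bits = {1, 2, 4, 7}
hits = substructure_hits(frag_fps, structure_bits,
                         exact_match=lambda name: name != "C")
# hits == ["A"]
```

The expensive check runs only on fragments the screen could not exclude, so the overall result is as accurate as the exact match while most of the fragment list is never compared atom by atom.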
Re: [Rdkit-discuss] Getting the list of descriptors
Hi Noel,

You might also try the following:

    dir(Descriptors)

The dir() function returns an alphabetized list of names. You can also have a look at the inspect module, especially the getdoc and getargspec functions; these will give you more information on each function. The inspect.isroutine or inspect.isfunction functions might help you find out which items of the list are real descriptors.

Cheers,
Grégori

On Thu, 31 Mar 2011 18:28:23 +0100, Noel O'Boyle wrote:
> Hi Greg,
>
> With the deprecation of the AvailDescriptors module, it seems that the only way to get the list of descriptors is:
>
>     len(Descriptors._descList)
>
> I don't like accessing hidden attributes; is there some better way to do this?
>
> - Noel
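A minimal sketch of the dir()/inspect approach is given below. To keep it runnable without RDKit it inspects the standard library's statistics module as a stand-in; passing rdkit.Chem.Descriptors instead is the assumed real use. The helper name public_functions is hypothetical. Note that inspect.getargspec has been removed from recent Python versions (3.11 and later), so the sketch uses inspect.signature.

```python
import inspect
import statistics  # stand-in for rdkit.Chem.Descriptors in this sketch

def public_functions(module):
    """Names of the Python functions in a module's namespace, minus private ones.

    Functions imported into the module from elsewhere are listed too, so the
    result may still need filtering to keep only the items of interest.
    """
    return sorted(name
                  for name, obj in inspect.getmembers(module, inspect.isfunction)
                  if not name.startswith("_"))

names = public_functions(statistics)

# Per-function details: the call signature and the docstring.
sig = inspect.signature(statistics.mean)
doc = inspect.getdoc(statistics.mean)
```

inspect.isfunction only recognizes functions written in Python; for callables implemented in C or C++, inspect.isroutine is the broader test mentioned above.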
[Rdkit-discuss] Capturing warnings
Hi List,

Is there a clean way to capture the warnings generated by RDKit, for instance "Warning: ring stereochemistry detected. The output SMILES is not canonical"?

I tried several approaches (the Python warnings module, redirecting stderr to os.devnull, and try/except), but none of them allowed me to capture these warnings.

Many thanks for your advice!

Best,
Grégori
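One possible reason the approaches listed above fail is that such messages are written by compiled code directly to the process's stderr file descriptor, which the warnings module and a reassignment of sys.stderr never see. Under that assumption (it is not confirmed anywhere in this thread), a redirect at the file-descriptor level is a generic fallback. The sketch below uses only the standard library; the context-manager name capture_fd_stderr is hypothetical.

```python
import os
import sys
import tempfile
from contextlib import contextmanager

@contextmanager
def capture_fd_stderr():
    """Temporarily point file descriptor 2 at a temp file and collect its text.

    Yields a list; after the with-block ends it holds one string containing
    everything written to stderr, including writes made by C/C++ extensions.
    """
    captured = []
    sys.stderr.flush()
    saved_fd = os.dup(2)              # remember the real stderr
    with tempfile.TemporaryFile() as tmp:
        os.dup2(tmp.fileno(), 2)      # fd 2 now writes into the temp file
        try:
            yield captured
        finally:
            sys.stderr.flush()
            os.dup2(saved_fd, 2)      # restore the real stderr
            os.close(saved_fd)
            tmp.seek(0)
            captured.append(tmp.read().decode(errors="replace"))
```

Usage would be to wrap the call that triggers the message, for example the SMILES generation, in "with capture_fd_stderr() as out:" and then inspect out[0] afterwards. Output buffered inside the extension may only appear once that library flushes it.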
[Rdkit-discuss] RDKit cartridge speed issue
Hi RDKitters,

I'm facing a performance issue using the RDKit cartridge. The database contains roughly 170k small molecules, I use cartridge version 0.20.0 on PostgreSQL 8.4.7, and the tanimoto_threshold is set to 0.5.

A simple similarity search takes at least 30 seconds to complete. The database has been recently vacuumed.

Any hints are most welcome!

Cheers,
Grégori

Table public.test_db

 Column | Type    | Modifiers                                            | Storage  | Description
--------+---------+------------------------------------------------------+----------+-------------
 rid    | integer | not null default nextval('test_db_id_seq'::regclass) | plain    |
 smi    | mol     |                                                      | extended |

Indexes:
    test_db_pkey PRIMARY KEY, btree (rid)
    ididx btree (rid)
    molidx gist (smi)
Referenced by:
    TABLE test_db_fingerprints CONSTRAINT test_db_fingerprints_rid_fkey FOREIGN KEY (rid) REFERENCES test_db(rid)
Has OIDs: no

Table public.test_db_fingerprints

 Column    | Type    | Modifiers | Storage  | Description
-----------+---------+-----------+----------+-------------
 rid       | integer |           | plain    |
 pairbv    | bfp     |           | extended |
 torsionbv | bfp     |           | extended |
 morganbv2 | bfp     |           | extended |

Indexes:
    apbvidx gist (pairbv)
    morganbvidx gist (morganbv2)
    rididx btree (rid)
    torsbvidx gist (torsionbv)
Foreign-key constraints:
    test_db_fingerprints_rid_fkey FOREIGN KEY (rid) REFERENCES test_db(rid)
Has OIDs: no

explain analyze
select test_db.rid, test_db.smi,
       tanimoto_sml(atompairbv_fp('CN1C=NC2=C1C(=O)N(C(=O)N2C)C'), pairbv) sml
from test_db_fingerprints
right join test_db on test_db.rid = test_db_fingerprints.rid
where atompairbv_fp('CN1C=NC2=C1C(=O)N(C(=O)N2C)C') % pairbv
order by sml desc
limit 20;

QUERY PLAN
----------
 Limit  (cost=2037.62..2037.67 rows=20 width=837) (actual time=37990.369..37990.406 rows=11 loops=1)
   ->  Sort  (cost=2037.62..2038.05 rows=172 width=837) (actual time=37990.365..37990.379 rows=11 loops=1)
         Sort Key: (tanimoto_sml('\\340\\377\\377\\377\\000\\010\\000\\0002\\000\\000\\000\\010\\204D\\022\\004*\\014\\004\\020\\024\\002\\020,\\016\\000\\020\\030\\036\\000\\020\\272\\004\\336B\\034\\036\\200h\\272\\245\\000BP8\\000\\022\\354\\204\\000:@Bq\\002\\004\\012.\\000\\245\\002'::bfp, test_db_fingerprints.pairbv))
         Sort Method:  quicksort  Memory: 22kB
         ->  Nested Loop  (cost=98.53..2033.05 rows=172 width=837) (actual time=37726.008..37990.284 rows=11 loops=1)
               ->  Bitmap Heap Scan on test_db_fingerprints  (cost=98.53..713.44 rows=172 width=222) (actual time=37686.483..37806.422 rows=11 loops=1)
                     Recheck Cond: ('\\340\\377\\377\\377\\000\\010\\000\\0002\\000\\000\\000\\010\\204D\\022\\004*\\014\\004\\020\\024\\002\\020,\\016\\000\\020\\030\\036\\000\\020\\272\\004\\336B\\034\\036\\200h\\272\\245\\000BP8\\000\\022\\354\\204\\000:@Bq\\002\\004\\012.\\000\\245\\002'::bfp % pairbv)
                     ->  Bitmap Index Scan on apbvidx  (cost=0.00..98.49 rows=172 width=0) (actual time=37661.723..37661.723 rows=11 loops=1)
                           Index Cond: ('\\340\\377\\377\\377\\000\\010\\000\\0002\\000\\000\\000\\010\\204D\\022\\004*\\014\\004\\020\\024\\002\\020,\\016\\000\\020\\030\\036\\000\\020\\272\\004\\336B\\034\\036\\200h\\272\\245\\000BP8\\000\\022\\354\\204\\000:@Bq\\002\\004\\012.\\000\\245\\002'::bfp % pairbv)
               ->  Index Scan using test_db_pkey on test_db  (cost=0.00..7.63 rows=1 width=623) (actual time=16.634..16.639 rows=1 loops=11)
                     Index Cond: (test_db.rid = test_db_fingerprints.rid)
 Total runtime: 37990.523 ms
(12 rows)
Re: [Rdkit-discuss] Size of drawn molecules
Thanks Michal for the hint; I'm currently using the 2013.06.1 version. I'll give the github version a try.

Gregori

On 6 September 2013 10:25, Michał Nowotka wrote:
> Did you check the latest (github) version of RDKit? This problem (respecting dotsPerAngstrom) should be solved there.
>
> The only problem I see after changing dotsPerAngstrom is that the font size stays the same, so again you need to do the math and scale it on your own (and set atomLabelFontSize accordingly).
>
> Regards,
> Michal Nowotka
>
> On Fri, Sep 6, 2013 at 8:15 AM, Gerebtzoff, Gregori wrote:
>> Hi guys,
>>
>> Is there an easy way to increase the maximal size of a molecule on the canvas? I realized that at some point increasing the canvas size won't increase the size of the molecule anymore.
>>
>> Looking at the code of MolDrawing.py, the function scaleAndCenter seems to deal with that aspect. I tried to change, for instance, the value of dotsPerAngstrom, but it didn't help.
>>
>> Thanks,
>> Grégori
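The "do the math" step Michal mentions amounts to keeping the label font proportional to the drawing scale. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the baseline values of 30 dots per Ångström and a 12-point label font are assumptions made for illustration, so they should be checked against the defaults of the RDKit version in use.

```python
def scaled_font_size(dots_per_angstrom, base_font_size=12,
                     base_dots_per_angstrom=30):
    """Scale the atom-label font size linearly with the drawing scale.

    The two base_* defaults are assumed baseline values, not confirmed ones.
    """
    ratio = dots_per_angstrom / base_dots_per_angstrom
    return max(1, round(base_font_size * ratio))

# Doubling the drawing scale doubles the label size under these assumptions.
big_font = scaled_font_size(60)      # 24
medium_font = scaled_font_size(45)   # 18
```

The result would then be assigned to atomLabelFontSize on the same options object whose dotsPerAngstrom was changed, which are the two settings named in the thread.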
[Rdkit-discuss] Transparency and enhanced molecular depiction
Hello RDKitters,

I'm happy to announce the improved molecular depiction, which has just been merged into the main RDKit repository. I want to thank Paul Emsley for the preliminary work during the hackathon at the RDKit UGM, and Greg, who helped me finalize the changes (especially for canvases other than Cairo) and test the code.

What's new:

- transparency: by default the molecules will be depicted on a white background; for a transparent background use this code:

      from rdkit.Chem import Draw
      o = Draw.DrawingOptions()
      o.bgColor = None
      Draw.MolToImage(m, size=(600,600), options=o)

- improved depiction: the labels are better positioned (for instance with NH3, the bond will point to the N and no longer to the middle of the label), and the bonds are also better drawn.

Some work still has to be done, especially for the matplotlib canvas.

Cheers,
Grégori
Re: [Rdkit-discuss] Compound Neutralization
Hi Yingfeng,

Let me remind you of some chemistry basics: a chlorine atom has 17 electrons. It has 7 electrons in its outer shell, so it needs 1 more electron to complete its octet; hence its valency is 1. Thus it's not a surprise that your SMILES generates an error.

To check the charge state of a compound, you should loop through every atom of the molecule and check its charge; you will find all the useful functions here: http://www.rdkit.org/Python_Docs/rdkit.Chem.rdchem.Atom-class.html (for instance GetFormalCharge).

Some reading for you: http://www.rdkit.org/docs/GettingStartedInPython.html

Best,
Gregori
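The atom loop described above could look like the sketch below. The helper name formal_charges is hypothetical; it relies only on the GetAtoms(), GetIdx() and GetFormalCharge() methods that RDKit molecule and atom objects provide, and it assumes the molecule was parsed successfully (that is, it is not None).

```python
def formal_charges(mol):
    """Return (total_charge, charged_atoms) for a molecule.

    charged_atoms is a list of (atom_index, formal_charge) pairs for every
    atom whose formal charge is not zero.
    """
    charged_atoms = [(atom.GetIdx(), atom.GetFormalCharge())
                     for atom in mol.GetAtoms()
                     if atom.GetFormalCharge() != 0]
    total_charge = sum(charge for _, charge in charged_atoms)
    return total_charge, charged_atoms
```

Called on a molecule built with Chem.MolFromSmiles, this would be expected to report which atoms carry a charge, which is the information a neutralization step needs before deciding what to change. A zwitterion shows why the per-atom list matters: its total is zero although two atoms are charged.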
Re: [Rdkit-discuss] Possible rotatable bonds replacement
Hi,

I would also go for the second option (i.e. replace the current SMARTS): I also see it as a bug fix.

What you could do is highlight, in the release notes or somewhere else, the call one would have to make to mimic the behavior of previous releases:

    Lipinski.RotatableBondSmarts = Chem.MolFromSmarts('[!$(*#*)!D1]-!@[!$(*#*)!D1]')
    Lipinski.NumRotatableBonds(m)

Grégori

On Fri, 31 Jan 2014 05:05:56 +0100, Greg Landrum wrote:
> Dear all,
>
> A question for the community: Toby Wright submitted a pull request this week that introduces a new, stricter, rotatable bond definition: https://github.com/rdkit/rdkit/pull/211/files
>
> The new SMARTS, re-formatted to be somewhat more readable, is:
>
>     [!$(*#*)\
>     !D1\
>     !$(C(F)(F)F)\
>     !$(C(Cl)(Cl)Cl)\
>     !$(C(Br)(Br)Br)\
>     !$(C([CH3])([CH3])[CH3])\
>     !$([CD3](=[N,O,S])-!@[#7,O,S!D1])\
>     !$([#7,O,S!D1]-!@[CD3]=[N,O,S])\
>     !$([CD3](=[N+])-!@[#7!D1])\
>     !$([#7!D1]-!@[CD3]=[N+])]\
>     -!@\
>     [!$(*#*)\
>     !D1\
>     !$(C(F)(F)F)\
>     !$(C(Cl)(Cl)Cl)\
>     !$(C(Br)(Br)Br)\
>     !$(C([CH3])([CH3])[CH3])]
>
> Toby was quite careful and added a new descriptor - NumStrictRotatableBonds() - that uses this SMARTS. I see a few options to deal with this:
>
> I could add the new descriptor as Toby provided it. People are then free to pick between NumRotatableBonds() and NumStrictRotatableBonds(). This has the advantage of maintaining strict backwards compatibility, but I could imagine it being confusing/irritating to people using the code to have to choose between them (or, worse, using both).
>
> Another option is to just replace the current NumRotatableBonds() SMARTS with the new one. This loses backwards compatibility, but replaces NumRotatableBonds() with something more correct.
>
> Finally, I could take a hybrid approach: replace the default NumRotatableBonds() with the new one, but add an extra argument that allows the old one to be used.
>
> I'm leaning towards the second option. I'd normally go with the third, but I almost view this as a bug fix for the rotatable bonds definition.
>
> Comments? Suggestions? Other options?
>
> -greg
[Rdkit-discuss] Problem reading a specific smiles with the cartridge
Hi guys,

I've been having a problem reading this particular SMILES string with the PostgreSQL cartridge: C12CC(C1)C2

I don't know if I'm running the latest version of the cartridge though...

Thanks for your help!
Grégori

    >>> cursor.execute("select rdkit_version()")
    >>> cursor.fetchone()
    ['0.70.0']
    >>> cursor.execute("select mol_from_smiles('C12CC(C1)C2')")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/apps64/python/lib/python2.7/site-packages/psycopg2/extras.py", line 122, in execute
        return _cursor.execute(self, query, vars)

    >>> import rdkit
    >>> from rdkit import Chem, rdBase
    >>> rdBase.rdkitVersion
    '2013.09.2'
    >>> mol = Chem.MolFromSmiles('C12CC(C1)C2')
    <rdkit.Chem.rdchem.Mol object at 0x1ebd7a60>
    >>> Chem.MolToSmiles(mol)
    'C1C2CC1C2'
Re: [Rdkit-discuss] Problem reading a specific smiles with the cartridge
Hi Greg,

It's just that particular SMILES; I don't have any problem reading thousands of other SMILES and loading them into the cartridge. Which version of the cartridge do you use?

Gregori

On Saturday, March 22, 2014, Greg Landrum wrote:
> Hi Grégori,
>
> It doesn't seem to be a problem with the cartridge itself:
>
>     chembl_16=# select mol_from_smiles('C12CC(C1)C2');
>      mol_from_smiles
>     -----------------
>      C1C2CC1C2
>     (1 row)
>
> I can also use it from psycopg2 without problems. Can you read other SMILES or is it just that one that's problematic?
>
> -greg
>
> On Fri, Mar 21, 2014 at 6:30 PM, Gerebtzoff, Gregori wrote:
>> Hi guys,
>>
>> I've been having a problem reading this particular SMILES string with the PostgreSQL cartridge: C12CC(C1)C2
>> [...]
Re: [Rdkit-discuss] Problem reading a specific smiles with the cartridge
Hi guys,

Many thanks for your help and suggestions! Don't ask me why, but restarting PostgreSQL did the trick; now my C12CC(C1)C2 SMILES can be read correctly.

    => select mol_from_smiles('C12CC(C1)C2');
     mol_from_smiles
    -----------------
     C1C2CC1C2
    (1 row)

Maybe the DB was somehow corrupted, since I subsequently got warnings like "null argument to internal routine".

Sorry to have bothered you with that!

Grégori

On 23 March 2014 10:30, Greg Landrum wrote:
> On Saturday, March 22, 2014, Gerebtzoff, Gregori wrote:
>> Hi Greg,
>>
>> It's just that particular SMILES; I don't have any problem reading thousands of other SMILES and loading them into the cartridge. Which version of the cartridge do you use?
>
> I was just testing against the svn version. I don't recall having made any modifications in the SMILES parser that would lead to this behavior, but obviously something is going on.
>
> Peter's suggestion to try another form of the same SMILES (to check if it's the molecule and not the SMILES) is a very good one.
>
> -greg
Re: [Rdkit-discuss] Chem.PandasTools
Hi Paul,

The Draw module also contains a ReactionToImage function; your MMP can be read as a reaction.

Hope this helps further!

Grégori

On Thu, 8 May 2014 16:31:32 +0200, paul.czodrow...@merckgroup.com wrote:
> Dear RDKitters,
>
> I started to play around with the great Chem.PandasTools contribution provided by Nicholas and Samo.
>
> Given such a data frame:
>
>        Transformation     npairs
>     1  [*:1][H]>>[*:1]C   5
>
> how do I depict the molecular transformation in the dataframe?
>
> I guess that I somehow have to integrate this function
>
>     def showLine_MMP(in_string):
>         f = in_string.split("\t")
>         LHS = Chem.MolFromSmiles(f[0].split()[0])
>         RHS = Chem.MolFromSmiles(f[0].split()[1])
>         mols.append(LHS)
>         mols.append(RHS)
>         return Draw.MolsToGridImage(mols, molsPerRow=2)
>
> but I'm not sure how to accomplish this.
>
> Cheers & Thanks,
> Paul
Re: [Rdkit-discuss] Chem.PandasTools
Hi Paul,

You first have to read the MMP into a reaction object (AllChem.ReactionFromSmarts, from rdkit.Chem.AllChem); ReactionToImage expects that object, not a string.

Greg

On Friday, May 9, 2014, paul.czodrow...@merckgroup.com wrote:
> Dear Gregori and Samo,
>
> thanks for your hints.
>
> I just tried running
>
>     Draw.ReactionToImage('[*:1][H]>>[*:1]C')
>     => AttributeError: 'str' object has no attribute 'GetNumReactantTemplates'
>
> BTW, how would I finally add a picture to a Pandas data frame?
>
> Cheers,
> Paul
>
>> Hi Paul,
>>
>> The Draw module also contains a ReactionToImage function; your MMP can be read as a reaction.
>>
>> Hope this helps further!
>>
>> Grégori