Re: [Rdkit-discuss] ReactionFromSmarts and RunReactants

2016-04-22 Thread gregori
Hi Monica,

As Greg stated, why do you expect your product not to be fragmented?
I suggest you have a careful look at the SMARTS definition (for 
instance from Daylight: 
http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html).
In SMARTS patterns, the dot indicates disconnected fragments. Thus you 
obtain in your product... disconnected fragments.
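
To see which mapped atoms end up in which product fragment, one can simply split the reaction string on '>>' and '.', as in the rough sketch below. It uses only the Python standard library; the helper name map_numbers_per_fragment is made up for illustration, and the naive split assumes that no '.' appears inside brackets or recursive SMARTS, which holds for the pattern quoted below.

```python
import re

# Reaction SMARTS quoted in the message below (second version of the pattern).
rxn = ("[C:1][C:2](=[O:3])[N+0:4][C:5]=[C:6][C:7]([O-:8])=[O:9].[OH2:10].[OH2:11]"
       ">>[C:1][C:2]([O-:10])=[O:3].[O:8]=[C:7]=[O:9].[N+:4].[C:6][C:5]=[O:11]")


def map_numbers_per_fragment(side):
    """Split one side of a reaction on '.' and list the atom-map numbers in each piece.

    Naive: assumes no '.' occurs inside brackets or recursive SMARTS.
    """
    return [[int(n) for n in re.findall(r":(\d+)\]", frag)]
            for frag in side.split(".")]


reactants, products = rxn.split(">>")
product_maps = map_numbers_per_fragment(products)
for i, maps in enumerate(product_maps, start=1):
    print(f"product {i}: mapped atoms {maps}")
```

Each inner list corresponds to one disconnected product template, so four lists mean that four separate product molecules were requested.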

Best,

Grégori


On 2016-04-22 10:13, 吴玲 wrote:
> Greg,
> 
> After changing the pattern into
> "[C:1][C:2](=[O:3])[N+0:4][C:5]=[C:6][C:7]([O-:8])=[O:9].[OH2:10].[OH2:11]>>[C:1][C:2]([O-:10])=[O:3].[O:8]=[C:7]=[O:9].[N+:4].[C:6][C:5]=[O:11]",
> I get the same result. A ring structure, just as you said, is broken
> into pieces.
> pattern: [C:1][C:2](=[O:3])[N+0:4][C:5]=[C:6][C:7]([O-:8])=[O:9].[OH2:10].[OH2:11]>>[C:1][C:2]([O-:10])=[O:3].[O:8]=[C:7]=[O:9].[N+:4].[C:6][C:5]=[O:11]
> 
> output: [O-]C(=O)C1=CNC(=O)CC1.O.O>>CCC(=O)[O-].O=C=[O-].[NH4+].CCC=O
> 
> If the input reactant ([O-]C(=O)C1=CNC(=O)CC1) is mapped onto the
> pattern's reactants, it should be
> "[O-:8][C:7](=[O:9])[C:6]1=[C:5][N:4][C:2](=[O:3])[C:1][C:*]1".
> 
> What I mean is that the input reactant has a ring in which carbon 6 is
> connected to carbon 1 through carbon *. Why are carbon 6 and carbon 1
> not connected by carbon * in the product?
> 
> Thanks a lot,
> 
> -Monica
> 
> At 2016-04-20 12:43:36, "Greg Landrum"  wrote:
> 
>> Monica,
>> 
>> Why do you think you should get a single chain from the ring
>> structure?
>> If you look at the atom mapping in your input reaction:
>> 
>> 
>> [C:1][C:2](=[O:3])[N+0:4][C:5]=[C:6][C:7]([O-:8])=[O:9].[OH2:10].[OH2:11]>>[C:1][C:2]([O-:10])=[O:3].[O:7]=[C:8]=[O:9].[N+:4].[C:6][C:5]=[O:11]
>> 
>> You've told it to put:
>> - carbons 1 and 2 from the reactant into product 1
>> - carbon 8 into product 2, carbon 7 into product 2 but converted
>> into an oxygen first.
>> - carbon 5 and 6 into product 4. In other words: you requested that
>> the input molecule be broken into pieces.
>> 
>> The atom mapping tells you which atoms in the reactants correspond
>> to which atoms in the products.
>> 
>> I would suggest that you look carefully at your reaction
>> definitions, in particular the atom mapping numbers, in Marvin and
>> make sure that you think that what you're asking for makes sense.
>> 
>> -greg
>> 
>> On Wed, Apr 20, 2016 at 6:04 AM, 吴玲 
>> wrote:
>> 
>>> Hi Greg,
>>> 
>>> Thanks a lot for previous help! There is another question in
>>> RunReactants I want to ask for some help.
>>> 
>>> When I input the pattern
>>> "[C:1][C:2](=[O:3])[N+0:4][C:5]=[C:6][C:7]([O-:8])=[O:9].[OH2:10].[OH2:11]>>[C:1][C:2]([O-:10])=[O:3].[O:7]=[C:8]=[O:9].[N+:4].[C:6][C:5]=[O:11]"
>>> and give the reactants
>>> rs1 = ["[O-]C(=O)C1=CNC(=O)CC1","CC(=O)NC=C(CC([O-])=O)C([O-])=O","CC(=O)NC=C(C(CO)C([O-])=O)C([O-])=O"],
>>> rs2 = ['O'], rs3 = ['O'],
>>> and then check the output products, I find that the first reactant,
>>> which has a ring structure, is divided into four fragments instead of
>>> three: an NH4+, a CO2, and a long chain composed of the remaining two
>>> fragments. In other words, I think that the product should be like
>>> this:
>>> 
>>> What is the matter with this pattern? Why is this predicted reaction
>>> wrong while the others are correct? (Output attached below.)
>>> 
>>> pattern:
>>> 
>>> 1 [O-]C(=O)C1=CNC(=O)CC1.O.O>>CCC(=O)[O-].O=C=O.[NH4+].CCC=O
>>> 
>>> 2 CC(=O)NC=C(CC([O-])=O)C([O-])=O.O.O>>CC(=O)[O-].O=C=O.[NH4+].O=(=O)[O-]
>>> 
>>> 3 CC(=O)NC=C(C(CO)C([O-])=O)C([O-])=O.O.O>>CC(=O)[O-].O=C=O.[NH4+].O=CCC(CO)C(=O)[O-]
>>> 
>>> best wishes,
>>> 
>>> monica
>>> 
>>> 
>> 
>>> ___
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Similarity maps using machine learning

2017-12-13 Thread gregori

Hi,

Getting the similarity maps for multiple structures should be straightforward, 
just add the function call in your loop.
Concerning the export of the generated images to a web service, I did the 
following:
(assuming the figure returned by the SimilarityMaps.GetSimilarityMapForModel() 
function is called "fig")
import io
from matplotlib import pyplot as plt
buf = io.BytesIO()
DPI = fig.get_dpi()
plt.savefig(buf, format='png', bbox_inches='tight', pad_inches=0, dpi=DPI/2)
plt.close(fig)
img = buf.getvalue()
buf.close()

I remember having to play around with the picture size, the coordScale and dpi 
in order to get an image of a proper size (as requested by the user, in pixels) 
and adequate font size:
fig = Draw.MolToMPL(mol, coordScale=1.5, size=(int(width/1.3),int(height/1.3)))
I can't exactly remember the details, but worth looking at these parameters if 
the image you get is not ok.
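
For the pixel size part, the arithmetic is at least simple to write down: matplotlib sizes figures in inches, and a figure saved without bbox_inches='tight' comes out at (width in inches * dpi) by (height in inches * dpi) pixels. A small sketch follows; figsize_for_pixels is a hypothetical helper name, not part of matplotlib or RDKit. With bbox_inches='tight' (as used above) the final size will deviate from this, which may be why some trial and error was needed.

```python
def figsize_for_pixels(width_px, height_px, dpi):
    """Return the (width, height) in inches that gives the requested pixel size at this dpi."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return (width_px / dpi, height_px / dpi)


# Example: a 600 x 400 pixel image at 100 dots per inch.
size_in = figsize_for_pixels(600, 400, dpi=100)
print(size_in)
```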

I use Flask for the web server;
I send the picture through the server this way:
flask.send_file(io.BytesIO(img), mimetype='image/png')

Best,

Grégori

On Wednesday, December 13, 2017 06:39 CET, Greg Landrum wrote:

 I know that Michal has done some work with this as part of the beta for the
 new ChEMBL interface.
 @Michal: do you have a bit of time to explain what you did in order to get
 images that you could serve via the web?

 On Tue, Dec 12, 2017 at 1:50 PM, Bruno Neves wrote:

 Dear colleagues,

 I want to develop mechanistically interpretable machine learning models
 (i.e., using similarity maps) and implement them in web services. I've
 already managed to generate a map from a SMILES. However, I cannot generate
 maps for multiple molecules in a data set (CSV or SDF). I'm also having some
 difficulty trying to save the new similarity map images. The scripts
 available in the RDKit tutorial do not provide detailed information to solve
 this problem. Do you have any idea how I can solve this?

 # Use the random forest to predict a new molecule (SMILES)
 >>> m = Chem.MolFromSmiles('FC(F)(F)C1=CC=C(OC(CCNC2=CC=CC=C2)C2=CC=CC=C2)C=C1')
 >>> fp = np.zeros((1,))
 >>> DataStructs.ConvertToNumpyArray(AllChem.GetMorganFingerprintAsBitVect(m, radius, nBits, useFeatures), fp)
 >>> print(rf.predict((fp,)))
 >>> print(rf.predict_proba((fp,)))

 # Get predicted probability map
 >>> def getProba(fp, predictionFunction):
 >>>     return predictionFunction((fp,))[0][1]
 >>> fig, maxweight = SimilarityMaps.GetSimilarityMapForModel(m, SimilarityMaps.GetMorganFingerprint, lambda x: getProba(x, rf.predict_proba))

 # Open CSV file with multiple molecules
 >>> m = pd.read_csv('C:\\Users\\bruno\\Desktop\\maps\\data\\logBB_S.csv', delimiter=',')
 >>> mols = []
 >>> y = []
 >>> for mol in Chem.SDMolSupplier(fname):
 >>>     if mol is not None:
 >>>         mols.append(mol)
 >>> fps = [AllChem.GetMorganFingerprintAsBitVect(m, radius, nBits,useFeatures) for m in mols]
 >>> def rdkit_np_convert(fp):
 >>>     output = []
 >>>     for f in fp:
 >>>         arr = np.zeros((1,))
 >>>         DataStructs.ConvertToNumpyArray(f, arr)
 >>>         output.append(arr)
 >>>     return np.asarray(output)
 >>> x = rdkit_np_convert(fps)
 >>> x.shape
 >>> print(fp)
 >>> print(rf.predict(x))
 >>> print(rf.predict_proba((x)))

 # Get predicted probability maps for multiple structures??

 Best regards,
 Prof. Dr. Bruno Junior Neves
 Laboratório de Quimioinformática
 Centro Universitário de Anápolis - UniEVANGÉLICA


Re: [Rdkit-discuss] Doing substructure search as quickly as possible...

2020-02-10 Thread gregori

Hi Alexis,

Knowing what you want to achieve, I would take the problem the other way 
around. Instead of matching your many fragments to your input structure, I 
would rather apply the same transformation(s) you apply to your fragments to 
your input structure.
I know that you replace all non-hydrogen atoms by "any" atoms, and all 
single/double/triple bonds by "any" bonds; you could store a list of fragments 
where all non-hydrogen atoms are replaced by carbons, and all bonds by single 
bonds; you calculate and store the fingerprints of these fragments. Finally you 
apply the same transformation to your input structure, calculate the 
fingerprint, and do your substructure search.
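
The overall workflow can be sketched without any RDKit calls. In the toy example below, plain Python sets of bit indices stand in for the fingerprints of the generalized (all-carbon, single-bond) fragments and structure, and the verify callable stands in for the slower exact match (for instance HasSubstructMatch, mentioned in the quoted message). All names and bit values are invented for illustration.

```python
def screen_then_verify(frag_fps, struct_fp, verify):
    """Return the fragment names that pass the fingerprint screen and the exact check.

    frag_fps  : dict mapping fragment name -> set of on-bit indices
    struct_fp : set of on-bit indices for the input structure
    verify    : callable(name) -> bool, the slow exact substructure test
    """
    hits = []
    for name, bits in frag_fps.items():
        # Screen: a fragment can only match if all its bits are set in the structure.
        if not bits <= struct_fp:
            continue
        # The screen can give false positives, so confirm the survivors.
        if verify(name):
            hits.append(name)
    return hits


# Toy data: "B" fails the screen, "C" passes it but fails the exact check.
frag_fps = {"A": {1, 4}, "B": {2, 4}, "C": {7, 9}}
struct_fp = {1, 4, 7, 9}
hits = screen_then_verify(frag_fps, struct_fp, verify=lambda name: name != "C")
print(hits)
```

The fragment fingerprints only need to be computed once and can be stored; per input structure, only one fingerprint and the cheap subset tests are needed before the exact check.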

Best,

Grégori


On Monday, February 10, 2020 16:08 CET, Alexis Parenty 
 wrote:
 
Dear Rdkiters,
I am interested in doing substructure searches between many thousands of 
structures and many thousands of fragments, as quickly as possible, with 
reasonable accuracy (> 0.95)...
I did read Greg's excellent post on that subject:
http://rdkit.blogspot.com/2019/07/a-couple-of-substructure-search-topics.html
I was using the rdkit pattern fingerprint approach to filter out any fragments 
that have no chance of matching the bigger structure through the slow and more 
accurate molecular graph approach, saving a lot of time.
However, I realized that this RDKit pattern fingerprint approach only works 
well if we compare SMILES with SMILES:
 
from rdkit import Chem

def frag_is_a_substructure_of_structure_via_pfp(frag, smiles):
    pfp_frag = Chem.PatternFingerprint(Chem.MolFromSmiles(frag))
    pfp_structure = Chem.PatternFingerprint(Chem.MolFromSmiles(smiles))

    frag_bits = set(pfp_frag.GetOnBits())
    structure_bits = set(pfp_structure.GetOnBits())

    # The fragment can only match if all of its bits are set in the structure.
    if frag_bits.issubset(structure_bits):
        return True
    else:
        return False
 
Unfortunately, some of my fragments are SMARTS that are not valid SMILES. Using 
Chem.MolFromSmarts(smarts) gives really poor results (many false positives, 
leading to poor specificity). Interestingly, there are no false negatives, 
leading to a sensitivity of 1!
 
def frag_is_a_substructure_of_structure_via_pfp(frag, smiles):
    # Same as above, but the fragment is parsed as SMARTS.
    pfp_frag = Chem.PatternFingerprint(Chem.MolFromSmarts(frag))
    pfp_structure = Chem.PatternFingerprint(Chem.MolFromSmiles(smiles))

    frag_bits = set(pfp_frag.GetOnBits())
    structure_bits = set(pfp_structure.GetOnBits())

    if frag_bits.issubset(structure_bits):
        return True
    else:
        return False
 
Is there a way to use pattern fingerprint (or other method) for substructure 
searches independently of the Smiles/Smarts format of the fragments? If not, is 
mol_struct.HasSubstructMatch(mol_frag) the only way I am left with?
Many thanks,

Alexis
 


Re: [Rdkit-discuss] Getting the list of descriptors

2011-04-01 Thread Gerebtzoff, Gregori
Hi Noel,

You might also try the following:
dir(Descriptors)
The dir() function returns an alphabetized list of names.

You can also have a look at the inspect module, especially the getdoc and 
getargspec functions; these will give you more information on each function. 
The inspect.isroutine or inspect.isfunction functions might be helpful to find 
out which items of the list are real descriptors.
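
As a rough illustration of the inspect approach, here is a sketch that lists the public functions of a module together with their signature and the first line of their docstring. The standard library module textwrap is used as a stand-in so that the snippet runs anywhere; passing the descriptor module instead is the assumed use and is not shown. inspect.signature is used in place of getargspec, which has been removed from recent Python versions, and public_functions is a made-up helper name.

```python
import inspect
import textwrap  # stand-in module; the real target would be the descriptor module


def public_functions(module):
    """Return (name, function) pairs for the public plain-Python functions of a module."""
    return [(name, obj)
            for name, obj in inspect.getmembers(module, inspect.isfunction)
            if not name.startswith("_")]


functions = public_functions(textwrap)
for name, fn in functions:
    doc = inspect.getdoc(fn) or ""
    first_line = doc.splitlines()[0] if doc else ""
    print(f"{name}{inspect.signature(fn)}: {first_line}")
```

inspect.isfunction only accepts functions written in Python; inspect.isroutine can be swapped in if some of the entries are implemented in compiled code.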

Cheers,

Grégori


--

Date: Thu, 31 Mar 2011 18:28:23 +0100
From: Noel O'Boyle baoille...@gmail.com
Subject: [Rdkit-discuss] Getting the list of descriptors
To: RDKit Discuss rdkit-discuss@lists.sourceforge.net
Message-ID: BANLkTikL68+nPc5hr4wqKRkmA-=pqb+...@mail.gmail.com
Content-Type: text/plain; charset=ISO-8859-1

Hi Greg,

With the deprecation of the AvailDescriptors module, it seems that the only way 
to get the list of descriptors is:
   len(Descriptors._descList)

I don't like accessing hidden attributes; is there some better way to do this?

- Noel




[Rdkit-discuss] Capturing warnings

2012-10-29 Thread Gerebtzoff, Gregori
Hi List,

Is there a clean way to capture the warnings generated by RDKit, for instance 
"Warning: ring stereochemistry detected. The output SMILES is not canonical"?
I tried several approaches (the warnings python module, by redirecting the 
stderr with os.devnull, or try/except) but none of them allowed me to capture 
these warnings.
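
One thing that might explain why these attempts fail, assuming the messages are written by the C++ layer directly to the process's standard error rather than through Python's sys.stderr, is that Python-level redirection never sees them. In that case a redirection at the file-descriptor level can work. The sketch below uses only the standard library; capture_fd is a made-up name, and the os.write call merely simulates a message emitted below the Python layer.

```python
import os
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def capture_fd(fd=2):
    """Temporarily redirect a file descriptor (2 = stderr) into a temp file.

    Yields a list; after the block exits, its single element is the captured text.
    """
    captured = []
    sys.stderr.flush()             # do not mix buffered Python output into the capture
    saved_fd = os.dup(fd)          # keep a copy of the original descriptor
    with tempfile.TemporaryFile(mode="w+b") as tmp:
        os.dup2(tmp.fileno(), fd)  # from now on, writes to fd land in the temp file
        try:
            yield captured
        finally:
            os.dup2(saved_fd, fd)  # restore the original descriptor
            os.close(saved_fd)
            tmp.seek(0)
            captured.append(tmp.read().decode(errors="replace"))


with capture_fd(2) as messages:
    # Stand-in for a warning written by compiled code straight to fd 2.
    os.write(2, b"Warning: ring stereochemistry detected.\n")

print(repr(messages[0]))
```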

Many thanks for your advice!

Best,

Grégori


[Rdkit-discuss] RDKit cartridge speed issue

2013-04-23 Thread Gerebtzoff, Gregori
Hi RDKitters,

I'm facing a performance issue using the RDKit cartridge.
The database contains roughly 170k small molecules, I use cartridge
version 0.20.0 on PostgreSQL 8.4.7, and the tanimoto_threshold is set to 0.5.
A simple similarity search takes at least 30 seconds to complete.
The database has been recently vacuumed.
Any hints are most welcome!

Cheers,

Grégori


                               Table public.test_db
 Column |  Type   |                      Modifiers                       | Storage  | Description
--------+---------+------------------------------------------------------+----------+-------------
 rid    | integer | not null default nextval('test_db_id_seq'::regclass) | plain    |
 smi    | mol     |                                                      | extended |
Indexes:
    test_db_pkey PRIMARY KEY, btree (rid)
    ididx btree (rid)
    molidx gist (smi)
Referenced by:
    TABLE test_db_fingerprints CONSTRAINT test_db_fingerprints_rid_fkey FOREIGN KEY (rid) REFERENCES test_db(rid)
Has OIDs: no

           Table public.test_db_fingerprints
  Column   |  Type   | Modifiers | Storage  | Description
-----------+---------+-----------+----------+-------------
 rid       | integer |           | plain    |
 pairbv    | bfp     |           | extended |
 torsionbv | bfp     |           | extended |
 morganbv2 | bfp     |           | extended |
Indexes:
    apbvidx gist (pairbv)
    morganbvidx gist (morganbv2)
    rididx btree (rid)
    torsbvidx gist (torsionbv)
Foreign-key constraints:
    test_db_fingerprints_rid_fkey FOREIGN KEY (rid) REFERENCES test_db(rid)
Has OIDs: no


explain analyze select test_db.rid, test_db.smi,
tanimoto_sml(atompairbv_fp('CN1C=NC2=C1C(=O)N(C(=O)N2C)C'), pairbv) sml
from test_db_fingerprints right join test_db on test_db.rid =
test_db_fingerprints.rid  where
atompairbv_fp('CN1C=NC2=C1C(=O)N(C(=O)N2C)C') % pairbv order by sml desc
limit 20;
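
For readers less familiar with the cartridge: the % operator in the query keeps the rows whose Tanimoto similarity to the query fingerprint meets the configured tanimoto_threshold, and tanimoto_sml returns that similarity. The calculation itself is sketched below in plain Python on sets of on-bit indices. The bit values and names are invented, and the real cartridge of course works on bfp values inside PostgreSQL.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(bits_a | bits_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / union


query = {1, 5, 9, 12}
candidates = {"mol1": {1, 5, 9, 40}, "mol2": {2, 3, 9}, "mol3": {1, 5, 9, 12}}
threshold = 0.5

# Keep candidates at or above the threshold, most similar first. This mimics the
# role of "%" plus "order by sml desc" in the SQL above; whether the boundary
# value itself is included by the cartridge is not checked here.
hits = sorted(
    ((name, tanimoto(query, bits)) for name, bits in candidates.items()
     if tanimoto(query, bits) >= threshold),
    key=lambda pair: pair[1], reverse=True)
print(hits)
```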


                                   QUERY PLAN
--------------------------------------------------------------------------------
 Limit  (cost=2037.62..2037.67 rows=20 width=837) (actual time=37990.369..37990.406 rows=11 loops=1)
   ->  Sort  (cost=2037.62..2038.05 rows=172 width=837) (actual time=37990.365..37990.379 rows=11 loops=1)
         Sort Key: (tanimoto_sml('\\340\\377\\377\\377\\000\\010\\000\\0002\\000\\000\\000\\010\\204D\\022\\004*\\014\\004\\020\\024\\002\\020,\\016\\000\\020\\030\\036\\000\\020\\272\\004\\336B\\034\\036\\200h\\272\\245\\000BP8\\000\\022\\354\\204\\000:@Bq\\002\\004\\012.\\000\\245\\002'::bfp, test_db_fingerprints.pairbv))
         Sort Method:  quicksort  Memory: 22kB
         ->  Nested Loop  (cost=98.53..2033.05 rows=172 width=837) (actual time=37726.008..37990.284 rows=11 loops=1)
               ->  Bitmap Heap Scan on test_db_fingerprints  (cost=98.53..713.44 rows=172 width=222) (actual time=37686.483..37806.422 rows=11 loops=1)
                     Recheck Cond: ('\\340\\377\\377\\377\\000\\010\\000\\0002\\000\\000\\000\\010\\204D\\022\\004*\\014\\004\\020\\024\\002\\020,\\016\\000\\020\\030\\036\\000\\020\\272\\004\\336B\\034\\036\\200h\\272\\245\\000BP8\\000\\022\\354\\204\\000:@Bq\\002\\004\\012.\\000\\245\\002'::bfp % pairbv)
                     ->  Bitmap Index Scan on apbvidx  (cost=0.00..98.49 rows=172 width=0) (actual time=37661.723..37661.723 rows=11 loops=1)
                           Index Cond: ('\\340\\377\\377\\377\\000\\010\\000\\0002\\000\\000\\000\\010\\204D\\022\\004*\\014\\004\\020\\024\\002\\020,\\016\\000\\020\\030\\036\\000\\020\\272\\004\\336B\\034\\036\\200h\\272\\245\\000BP8\\000\\022\\354\\204\\000:@Bq\\002\\004\\012.\\000\\245\\002'::bfp % pairbv)
               ->  Index Scan using test_db_pkey on test_db  (cost=0.00..7.63 rows=1 width=623) (actual time=16.634..16.639 rows=1 loops=11)
                     Index Cond: (test_db.rid = test_db_fingerprints.rid)
 Total runtime: 37990.523 ms
(12 rows)


Re: [Rdkit-discuss] Size of drawn molecules

2013-09-06 Thread Gerebtzoff, Gregori
Thanks, Michal, for the hint.
I'm currently using the 2013.06.1 version; I'll give the GitHub version a
try.

Gregori


On 6 September 2013 10:25, Michał Nowotka mmm...@gmail.com wrote:

 Did you check latest (github) version of RDKit? This problem (respecting
 dotsPerAngstrom) should be solved there.
 The only problem I see after changing dotsPerAngstrom is that font size
 stays the same so again you need to do the math and scale it on your own
 (and set atomLabelFontSize accordingly).
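
 The "math" mentioned above could be as simple as scaling the font size by the same factor as dotsPerAngstrom. A minimal sketch, assuming base values of 30 dots per Angstrom and a font size of 12 (check the values in your own DrawingOptions before relying on them); scaled_font_size is a hypothetical helper, not an RDKit function.

```python
def scaled_font_size(new_dots_per_angstrom, base_dots_per_angstrom=30, base_font_size=12):
    """Scale the atom label font size by the same factor as dotsPerAngstrom.

    The base values are assumptions; pass the ones your drawing options actually use.
    """
    factor = new_dots_per_angstrom / base_dots_per_angstrom
    return max(1, round(base_font_size * factor))


font_size = scaled_font_size(60)  # doubling the resolution doubles the font size
print(font_size)
```

 The result would then be assigned to atomLabelFontSize alongside the new dotsPerAngstrom value.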

 Regards,
 Michal Nowotka




 On Fri, Sep 6, 2013 at 8:15 AM, Gerebtzoff, Gregori 
 gregori.gerebtz...@roche.com wrote:

 Hi guys,

 Is there an easy way to increase the maximal size of a molecule on the
 canvas?

 I realized that at some point increasing the canvas size won't increase
 the size of the molecule anymore.
 Looking at the code of MolDrawing.py the function scaleAndCenter seems
 to deal with that aspect, I tried to change for instance the value of
 dotsPerAngstrom but didn't help.

 Thanks,

 Grégori




[Rdkit-discuss] Transparency and enhanced molecular depiction

2013-11-14 Thread Gerebtzoff, Gregori
Hello RDKitters,

I'm happy to announce the improved molecular depiction which has just been
merged into the main RDKit repository.
I want to thank Paul Emsley for the preliminary work during the hackathon
at the RDKit UGM, and Greg, who helped me finalize the changes (especially
for canvases other than Cairo) and test the code.

What's new:
 - transparency: by default the molecules will be depicted on a white
background; for transparent background use this code:
from rdkit.Chem import Draw
o = Draw.DrawingOptions()
o.bgColor=None
Draw.MolToImage(m,size=(600,600),options=o)
 - improved depiction: the labels are better positioned (for instance, for NH3
the bond now points to the N and no longer to the middle of the label),
and the bonds are also drawn better.

Some work still has to be done, especially for the matplotlib canvas.

Cheers,

Grégori


Re: [Rdkit-discuss] Compound Neutralization

2013-12-24 Thread Gerebtzoff, Gregori
Hi Yingfeng,

Let me remind you of some chemistry basics:
A chlorine atom has 17 electrons. Its outermost shell holds 7 electrons, so it
requires 1 more electron to complete its octet. Hence its valency is 1.
Thus it's not a surprise that your SMILES generates an error.
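
The electron counting above can be written out as a small calculation. This is only a sketch of the simple shell model (capacities 2, 8, 8), which is valid for main-group elements up to argon; the function names are made up for illustration.

```python
def shell_occupancy(atomic_number):
    """Fill shells with capacities 2, 8, 8 (simple model, valid up to Z = 18)."""
    if not 1 <= atomic_number <= 18:
        raise ValueError("simple shell model only covers Z = 1..18")
    shells, remaining = [], atomic_number
    for capacity in (2, 8, 8):
        if remaining == 0:
            break
        filled = min(capacity, remaining)
        shells.append(filled)
        remaining -= filled
    return shells


def usual_valency(atomic_number):
    """Electrons gained or lost to reach a full outer shell, whichever is fewer."""
    shells = shell_occupancy(atomic_number)
    outer = shells[-1]
    capacity = 2 if len(shells) == 1 else 8
    return min(outer, capacity - outer)


# Chlorine: Z = 17
print(shell_occupancy(17), usual_valency(17))
```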

In order to get the charge state of a compound, you should loop
through every atom of the molecule and check its charge; you will find all
the useful functions here:
http://www.rdkit.org/Python_Docs/rdkit.Chem.rdchem.Atom-class.html
(for instance GetFormalCharge).
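
With RDKit one would call GetFormalCharge() on every atom, as described above. To show the idea of the loop without requiring the toolkit, the sketch below sums the charges written in the bracket atoms of a SMILES string with a regular expression. It is a stand-in, not the RDKit method; net_charge_from_smiles is an invented name, and it only handles plain SMILES (not SMARTS).

```python
import re


def net_charge_from_smiles(smiles):
    """Sum the formal charges written inside the bracket atoms of a SMILES string."""
    total = 0
    for atom in re.findall(r"\[([^\]]+)\]", smiles):
        # Charges look like "+", "-", "++", "+2" or "-3" inside the brackets.
        match = re.search(r"(\++|-+)(\d*)", atom)
        if match is None:
            continue  # neutral bracket atom, e.g. [C@@H] or [13C]
        signs, digits = match.groups()
        sign = 1 if signs[0] == "+" else -1
        magnitude = int(digits) if digits else len(signs)
        total += sign * magnitude
    return total


for smi in ("CC(=O)[O-]", "[NH4+]", "CCO"):
    print(smi, net_charge_from_smiles(smi))
```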

Some readings for you:
http://www.rdkit.org/docs/GettingStartedInPython.html

Best,

Gregori


Re: [Rdkit-discuss] Possible rotatable bonds replacement

2014-01-31 Thread Gerebtzoff, Gregori
Hi,

I would also go for the second option (i.e. replace the current SMARTS): I
also see it as a bug fix. What you could do is highlight, in the release
notes or somewhere else, the call one would have to make to mimic the
behavior of previous releases:
Lipinski.RotatableBondSmarts = Chem.MolFromSmarts('[!$(*#*)!D1]-!@[!$(*#*)!D1]')
Lipinski.NumRotatableBonds(m)

Grégori

Date: Fri, 31 Jan 2014 05:05:56 +0100
 From: Greg Landrum greg.land...@gmail.com
 Subject: [Rdkit-discuss] Possible rotatable bonds replacement
 To: RDKit Discuss rdkit-discuss@lists.sourceforge.net,Toby
 Wright
 toby.wri...@inhibox.com
 Message-ID:
 CAD4fdRTpTwHbq9iC3VYKfTKOZeGEtsjFG=
 xpfjmyue6x0_p...@mail.gmail.com
 Content-Type: text/plain; charset=iso-8859-1

 Dear all,

 A question for the community:

 Toby Wright submitted a pull request this week that introduces a new,
 stricter, rotatable bond definition:
 https://github.com/rdkit/rdkit/pull/211/files
 The new SMARTS, re-formatted to be somewhat more readable, is:
 [!$(*#*)\
  !D1\
  !$(C(F)(F)F)\
  !$(C(Cl)(Cl)Cl)\
  !$(C(Br)(Br)Br)\
  !$(C([CH3])([CH3])[CH3])\
  !$([CD3](=[N,O,S])-!@[#7,O,S!D1])\
  !$([#7,O,S!D1]-!@[CD3]=[N,O,S])\
  !$([CD3](=[N+])-!@[#7!D1])\
  !$([#7!D1]-!@[CD3]=[N+])]\
 -!@\
 [!$(*#*)\
  !D1\
  !$(C(F)(F)F)\
  !$(C(Cl)(Cl)Cl)\
  !$(C(Br)(Br)Br)\
  !$(C([CH3])([CH3])[CH3])]

 Toby was quite careful and added a new descriptor -
 NumStrictRotatableBonds() - that uses this SMARTS.

 I see a few options to deal with this:

 I could add the new descriptor as Toby provided it. People are then free to
 pick between NumRotatableBonds() and NumStrictRotatableBonds(). This has
 the advantage of maintaining strict backwards compatibility, but I could
 imagine it being confusing/irritating to people using the code to have to
 choose between them (or, worse, using both).

 Another option is to just replace the current NumRotatableBonds() SMARTS
 with the new one.
 This loses backwards compatibility, but replaces NumRotatableBonds() with
 something more correct.

 Finally, I could take a hybrid approach: replace the default
 NumRotatableBonds() with the new one, but add an extra argument that allows
 the old one to be used.

 I'm leaning towards the second option. I'd normally go with the third, but
 I almost view this as a bug fix for the rotatable bonds definition.

 Comments? suggestions? Other options?
 -greg



[Rdkit-discuss] Problem reading a specific smiles with the cartridge

2014-03-21 Thread Gerebtzoff, Gregori
Hi guys,

I've been having a problem reading this particular SMILES string with the
PostgreSQL cartridge: C12CC(C1)C2
I don't know if I'm running the latest version of the cartridge though...

Thanks for your help!

Grégori


>>> cursor.execute("select rdkit_version()")
>>> cursor.fetchone()
['0.70.0']
>>> cursor.execute("select mol_from_smiles('C12CC(C1)C2')")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/apps64/python/lib/python2.7/site-packages/psycopg2/extras.py", line 122, in execute
    return _cursor.execute(self, query, vars)

>>> import rdkit
>>> from rdkit import Chem, rdBase
>>> rdBase.rdkitVersion
'2013.09.2'
>>> mol = Chem.MolFromSmiles('C12CC(C1)C2')
<rdkit.Chem.rdchem.Mol object at 0x1ebd7a60>
>>> Chem.MolToSmiles(mol)
'C1C2CC1C2'


Re: [Rdkit-discuss] Problem reading a specific smiles with the cartridge

2014-03-22 Thread Gerebtzoff, Gregori
Hi Greg,

It's just that particular smiles, I don't have any problem reading
thousands of other smiles and loading them in the cartridge.
Which version of the cartridge do you use?

Gregori

On Saturday, March 22, 2014, Greg Landrum greg.land...@gmail.com wrote:

 Hi Grégori,

 It doesn't seem to be a problem with the cartridge itself:

 chembl_16=# select mol_from_smiles('C12CC(C1)C2');
  mol_from_smiles
 -
  C1C2CC1C2
 (1 row)

 I can also use it from psycopg2 without problems.

 Can you read other SMILES or is it just that one that's problematic?

 -greg




Re: [Rdkit-discuss] Problem reading a specific smiles with the cartridge

2014-03-24 Thread Gerebtzoff, Gregori
Hi guys,

Many thanks for your help and suggestions!
Don't ask me why but restarting PostgreSQL did the trick, now my C12CC(C1)C2
smiles can be read correctly.
= select mol_from_smiles('C12CC(C1)C2');
 mol_from_smiles
-
 C1C2CC1C2
(1 row)

Maybe the DB was somehow corrupted, since I subsequently got warnings like
"null argument to internal routine".
Sorry to have bothered you with that!

Grégori



On 23 March 2014 10:30, Greg Landrum greg.land...@gmail.com wrote:



 On Saturday, March 22, 2014, Gerebtzoff, Gregori 
 gregori.gerebtz...@roche.com wrote:

 Hi Greg,

 It's just that particular smiles, I don't have any problem reading
 thousands of other smiles and loading them in the cartridge.
 Which version of the cartridge do you use?


 I was just testing against the svn version.
 I don't recall having made any modifications in the SMILES parser that
 would lead to this behavior, but obviously something is going on.

 Peter's suggestion to try another form of the same SMILES (to check if
 it's the molecule and not the SMILES) is a very good one.

 -greg




Re: [Rdkit-discuss] Chem.PandasTools

2014-05-08 Thread Gerebtzoff, Gregori
Hi Paul,

The Draw module also contains a ReactionToImage function;
Your MMP can be read as a reaction.
Hope this helps further!

Grégori

Date: Thu, 8 May 2014 16:31:32 +0200
 From: paul.czodrow...@merckgroup.com
 Subject: [Rdkit-discuss] Chem.PandasTools
 To: rdkit-discuss@lists.sourceforge.net
 Message-ID:
 
 ofc0c168e1.8dc7f4cf-onc1257cd2.004f2cec-c1257cd2.004fc...@merck.de
 Content-Type: text/plain; charset=US-ASCII

 Dear RDKitters,

 I started to play around with the great Chem.PandasTool contribution
 provided by Nicholas and Samo.

 Given such a data frame:

     Transformation      npairs
 1   [*:1][H]>>[*:1]C    5

 how do I depict the molecular transformation in the dataframe?


 I guess that I somehow have to integrate this function

 def showLine_MMP(in_string):
     f = in_string.split("\t")
     LHS = Chem.MolFromSmiles(f[0].split(">>")[0])
     RHS = Chem.MolFromSmiles(f[0].split(">>")[1])
     mols.append(LHS)
     mols.append(RHS)
     return Draw.MolsToGridImage(mols,molsPerRow=2)
 

 but I'm not sure how to accomplish this.


 Cheers & Thanks,
 Paul




Re: [Rdkit-discuss] Chem.PandasTools

2014-05-08 Thread Gerebtzoff, Gregori
Hi Paul,

You first have to read the MMP into a reaction object
(Chem.ReactionFromSmarts).

Greg

On Friday, May 9, 2014, paul.czodrow...@merckgroup.com wrote:

 Dear Gregori & Samo,

 thanks for your hints.

 I just tried running

 Draw.ReactionToImage('[*:1][H]>>[*:1]C')

 =>

 AttributeError: 'str' object has no attribute 'GetNumReactantTemplates'



 BTW, how would I finally add a picture to a Pandas data frame?


 Cheers,
 Paul


 
  Hi Paul,
 
  The Draw modules also contains a ReactionToImage function;
  Your MMP can be read as a reaction.
  Hope this helps further!
 
  Grégori

