from:"Peter St. John"

Re: [Rdkit-discuss] install on macosx with Python 3.8

2021-06-24 Thread Peter St. John

There should be a conda binary for mac / python 3.8 with conda-forge, just
follow the first command listed here:
https://www.rdkit.org/docs/Install.html#how-to-install-rdkit-with-conda

On Thu, Jun 24, 2021 at 11:59 AM Michal Krompiec 
wrote:

> Hello,
> Is it possible to install RDKit on MacOSX in a Python 3.8 environment?
> There is no conda binary for 3.8, so I tried homebrew. But the following
> gives me an error message (brew doesn't like the --with-python3 argument):
>
> brew install rdkit --with-python3 --without-numpy
>
> So I did just "brew install rdkit", but then rdkit is unimportable in
> Python ("No module named 'rdkit'"). What am I doing wrong?
>
> I'm using brew 3.2.0 on MacOS 11.4
>
>
> Thanks in advance,
>
>
> Michal Krompiec
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>

Re: [Rdkit-discuss] ModuleNotFoundError: No module named 'rdkit'

2021-04-09 Thread Peter St. John

What happens when you type
$ which python
or
$ conda env list
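
The same question can be asked from inside the interpreter that raises the error. A minimal sketch using only the standard library; the helper name `describe_environment` is made up for illustration:

```python
import importlib.util
import sys


def describe_environment(module_name="rdkit"):
    """Report which interpreter is running and whether it can see a module."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,  # path of the running Python binary
        "prefix": sys.prefix,          # root of the active environment
        "module_found": spec is not None,
        "module_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `executable` points at the base Anaconda install rather than the environment listed by `conda env list`, the editor or IPython is probably being started from the wrong environment.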

On Fri, Apr 9, 2021 at 6:07 AM Andrés Sánchez Ruiz <
andressanchezrui...@gmail.com> wrote:

> Dear Greg,
>
> I didn't know I had to send my answer to the discuss inbox... my bad.
> First of all, thank you very much for your quick answer. But yes, I also
> have ipython installed in my environment and still get the same error:
>
> ipython           7.22.0     py39hd4e2768_0
> ipython_genutils  0.2.0      pyhd3eb1b0_1
> .
> .
> .
> rdkit             2021.03.1  py39hfadf033_0  conda-forge
>
> From this environment I can call pandas (for example) but not rdkit. What
> is still not working?
>
> Best regards,
>
> Andrés
>
> On Wed, Apr 7, 2021 at 5:09 PM, Greg Landrum wrote:
>
>> Hi Andrés,
>>
>> The typical reason for this problem is that you created a separate
>> environment for the RDKit and installed the package there, but forgot to
>> install ipython. When this happens ipython is run from the base environment
>> and can't find the rdkit. Can you please confirm that you have ipython
>> installed in the same environment that you installed the RDKit itself in?
>>
>> -greg
>>
>> On Wed, Apr 7, 2021 at 4:38 PM Andrés Sánchez Ruiz <
>> andressanchezrui...@gmail.com> wrote:
>>
>>> To whom it may concern,
>>>
>>> I am having some trouble with the rdkit installation, the error I get is
>>> the following:
>>> > import rdkit
>>> Traceback (most recent call last):
>>>
>>>   File "", line 1, in 
>>> import rdkit
>>>
>>> ModuleNotFoundError: No module named 'rdkit'
>>>
>>> However, when I check in my environment I can see the module installed:
>>> rdkit  2021.03.1  py39hfadf033_0  conda-forge
>>>
>>> I followed both the guide offered in your page:
>>> https://www.rdkit.org/docs/Install.html and two other videos on youtube
>>> that describe the procedure: https://www.youtube.com/watch?v=3JywpzUKon8
>>> and https://www.youtube.com/watch?v=UmW9Cr8uF5g which are slightly
>>> different.
>>> I have Windows 10 installed on this computer and Python version 3.8.5
>>> (the one that comes with Anaconda). If you need any further information
>>> that I am missing, please let me know.
>>>
>>> Thanks in advance,
>>>
>>> Best regards,
>>>
>>> Andrés
>>>
>>> P.S. I first installed Anaconda with the PATH option, then uninstalled it
>>> to see whether that was the source of the error, but I still got it.

Re: [Rdkit-discuss] [External] Re: Using the RDKit with Dask

2021-03-22 Thread Peter St. John

Do you still get the error if you move the import into the function body?

def calc_bcut(smi):
    from rdkit.Chem.rdMolDescriptors import BCUT2D
    mol = Chem.MolFromSmiles(smi)
    return BCUT2D(mol)
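
A slightly fuller sketch of the same idea, with both RDKit imports deferred into the function. The presumed reason this helps is that the function shipped to the worker processes then carries no module-level reference to the Boost.Python callable. The `None` check and the conversion to a list are additions for illustration, not part of the original script:

```python
def calc_bcut(smi):
    """Compute BCUT2D descriptors for one SMILES string."""
    # Deferred imports: nothing from RDKit is a global of this function,
    # so each worker resolves the compiled function itself.
    from rdkit import Chem
    from rdkit.Chem.rdMolDescriptors import BCUT2D

    mol = Chem.MolFromSmiles(smi)
    if mol is None:  # assumption: skip SMILES that RDKit cannot parse
        return None
    return list(BCUT2D(mol))


def bcut_df(df):
    """Apply calc_bcut to the SMILES column of one partition."""
    return df.SMILES.apply(calc_bcut)
```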


On Mon, Mar 22, 2021 at 7:29 AM Patrick Walters  wrote:

> 2020.09.5
>
> On Mon, Mar 22, 2021 at 9:24 AM Guillaume GODIN <
> guillaume.go...@firmenich.com> wrote:
>
>> Hi Pat,
>>
>>
>>
>> Hmm, I get the same error as you.
>>
>> By the way, I had to change the code to use this
>>
>> from rdkit.Chem.rdMolDescriptors import CalcExactMolWt
>>
>> to avoid another error.
>>
>> Which version of rdkit do you use?
>>
>>
>>
>> BR
>>
>>
>>
>> Guillaume
>>
>>
>>
>>
>>
>> *From: *Patrick Walters
>> *Date: *Monday, 22 March 2021 at 14:20
>> *To: *Guillaume GODIN
>> *Cc: *rdkit-discuss
>> *Subject: *Re: [*External*] Re: [Rdkit-discuss] Using the RDKit with Dask
>>
>>
>>
>> The input is just SMILES and molecule name separated by a space.   I've
>> attached an example.
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Pat
>>
>>
>>
>>
>>
>> On Mon, Mar 22, 2021 at 9:13 AM Guillaume GODIN <
>> guillaume.go...@firmenich.com> wrote:
>>
>> Hi Pat,
>>
>>
>>
>> Do you have a small example file to work with, or can I use esol.csv, for
>> example?
>>
>>
>>
>> Thanks
>>
>>
>>
>> Guillaume
>>
>>
>>
>> *From: *Patrick Walters
>> *Date: *Monday, 22 March 2021 at 13:51
>> *To: *rdkit-discuss
>> *Subject: *[*External*] Re: [Rdkit-discuss] Using the RDKit with Dask
>>
>> Apologies, there was a bug in the code I sent in my previous message.
>> The problem is the same.  Here is the corrected code in a gist.
>>
>>
>>
>> https://gist.github.com/PatWalters/ca41289a6990ebf7af1e5c44e188fccd
>>
>>
>>
>>
>>
>>
>>
>> On Mon, Mar 22, 2021 at 8:16 AM Patrick Walters 
>> wrote:
>>
>> Hi All,
>>
>>
>>
>> I've been trying to calculate BCUT2D descriptors in parallel with Dask
>> and get this error with the code below.
>>
>> TypeError: cannot pickle 'Boost.Python.function' object
>>
>>
>>
>> Everything works if I call mw_df, which calculates molecular weight, but
>> I get the error above if I call bcut_df.  Does anyone have a workaround?
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Pat
>>
>>
>>
>> #!/usr/bin/env python
>>
>> import sys
>> import dask.dataframe as dd
>> import pandas as pd
>> from rdkit import Chem
>> from rdkit.Chem.Descriptors import MolWt
>> from rdkit.Chem.rdMolDescriptors import BCUT2D
>> import time
>>
>> # --  molecular weight functions
>> def calc_mw(smi):
>>     mol = Chem.MolFromSmiles(smi)
>>     return MolWt(mol)
>>
>> def mw_df(df):
>>     return df.SMILES.apply(calc_mw)
>>
>> # -- bcut functions
>> def bcut_df(df):
>>     return df.apply(calc_bcut)
>>
>> def calc_bcut(smi):
>>     mol = Chem.MolFromSmiles(smi)
>>     return BCUT2D(mol)
>>
>> def main():
>>     start = time.time()
>>     df = pd.read_csv(sys.argv[1],sep=" ",names=["SMILES","Name"])
>>     ddf = dd.from_pandas(df,npartitions=16)
>>     ddf['MW'] = ddf.map_partitions(mw_df,meta='float').compute(scheduler='processes')
>>     ddf['BCUT'] = ddf.map_partitions(bcut_df,meta='float').compute(scheduler='processes')
>>     print(time.time()-start)
>>     print(ddf.head())
>>
>>
>> if __name__ == "__main__":
>>     main()
>>
>>

[Rdkit-discuss] molecule sizing in MolsToGridImage

2021-02-25 Thread Peter St. John

Hi RDKit users,

With rdkit 2020.03.3, MolsToGridImage had more reasonable molecule sizes:
[image: Screen Shot 2021-02-25 at 1.39.46 PM.png]

With rdkit 2020.09.4, the default drawing parameters seem to shrink the
molecules by default in MolsToGridImage:
[image: Screen Shot 2021-02-25 at 1.37.38 PM.png]

Is there a way to make the individual molecules larger in MolsToGridImage?
subImgSize just seems to increase the padding between molecules.

Thanks!
-- Peter

Re: [Rdkit-discuss] hybridization of nitrogen in beta-lactam

2021-02-13 Thread Peter St. John

Ok great, thanks for the clarification! I was trying to follow 'S3' in
table 2 of https://doi.org/10.1021/ci300415d, where their criterion was "at
most one sp2 -center in 4-membered rings", which apparently allowed
beta-lactam, but it seems there's more to that characterization than just
total degree.
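
To see the contrast side by side, the hybridization RDKit assigns to nitrogen can be listed for the lactam and for the plain ring amine from the original question. A small sketch; `nitrogen_hybridization` is an illustrative helper name, not an RDKit function:

```python
def nitrogen_hybridization(smiles):
    """Return the hybridization label of each nitrogen in a SMILES string."""
    from rdkit import Chem  # imported here to keep the helper self-contained

    mol = Chem.MolFromSmiles(smiles)
    return [str(atom.GetHybridization())
            for atom in mol.GetAtoms() if atom.GetSymbol() == "N"]


if __name__ == "__main__":
    # The two SMILES strings are the ones quoted in this thread.
    for name, smi in [("beta-lactam", "O=C1CCN1"), ("ring amine", "C1NC1")]:
        print(name, nitrogen_hybridization(smi))
```

The amide nitrogen has three neighbours (counting its hydrogen) in both cases, so degree alone cannot separate them; the difference comes from the adjacent carbonyl.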

-- Peter

On Sat, Feb 13, 2021 at 12:15 PM Peter S. Shenkin  wrote:

> Amide Ns are usually viewed as sp2 because of the resonance RC(=O)-NR2 <->
> RC([O-])=[N+]R2, where R can be H.
>
> Unlike sp3 Ns (amines), amides are not strong H-bond acceptors, though
> both amides and amines are strong donors. This observation is consistent
> with sp2 character.
>
> -P.
>
> On Sat, Feb 13, 2021 at 1:18 PM Peter St. John 
> wrote:
>
>> Is there any reason why RDKit says the nitrogen in beta-lactam is
>> SP2-hybridized? I would have assumed it should be SP3. It doesn't seem to
>> be the ring structure, 'C1NC1' lists all the atoms as being SP3.
>>
>>
>>
>> >>> [(atom.GetSymbol(), atom.GetHybridization()) for atom in
>>  rdkit.Chem.MolFromSmiles('O=C1CCN1').GetAtoms()]
>>
>> [('O', rdkit.Chem.rdchem.HybridizationType.SP2),
>>  ('C', rdkit.Chem.rdchem.HybridizationType.SP2),
>>  ('C', rdkit.Chem.rdchem.HybridizationType.SP3),
>>  ('C', rdkit.Chem.rdchem.HybridizationType.SP3),
>>  ('N', rdkit.Chem.rdchem.HybridizationType.SP2)]
>>
>>
>> Thanks!
>> -- Peter

[Rdkit-discuss] hybridization of nitrogen in beta-lactam

2021-02-13 Thread Peter St. John

Is there any reason why RDKit says the nitrogen in beta-lactam is
SP2-hybridized? I would have assumed it should be SP3. It doesn't seem to
be the ring structure, 'C1NC1' lists all the atoms as being SP3.



>>> [(atom.GetSymbol(), atom.GetHybridization()) for atom in
 rdkit.Chem.MolFromSmiles('O=C1CCN1').GetAtoms()]

[('O', rdkit.Chem.rdchem.HybridizationType.SP2),
 ('C', rdkit.Chem.rdchem.HybridizationType.SP2),
 ('C', rdkit.Chem.rdchem.HybridizationType.SP3),
 ('C', rdkit.Chem.rdchem.HybridizationType.SP3),
 ('N', rdkit.Chem.rdchem.HybridizationType.SP2)]


Thanks!
-- Peter

Re: [Rdkit-discuss] Exhaustive fragmentation of molecules

2020-01-09 Thread Peter St. John

I wrote some code to do something like this that you might be able to start
from:
https://github.com/pstjohn/bde/blob/master/bde/fragment.py
Mine was mainly concerned with going to/from SMILES strings of the parent
mol and the resulting radicals.
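
For orientation, here is a rough sketch of the general idea, written from scratch for this note rather than copied from the linked file. It starts from a SMILES string (not an .xyz file), skips ring bonds, and only breaks bonds between the atoms present in the SMILES graph; the function name and those restrictions are assumptions made to keep the example short:

```python
def break_each_bond(smiles):
    """Break every acyclic bond once and return the radical fragment SMILES.

    Ring bonds are skipped because breaking one of them does not split the
    molecule into two pieces.
    """
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    results = []
    for bond in mol.GetBonds():
        if bond.IsInRing():
            continue
        rw = Chem.RWMol(mol)
        i, j = bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()
        order = int(bond.GetBondTypeAsDouble())
        for idx in (i, j):
            atom = rw.GetAtomWithIdx(idx)
            # Freeze the hydrogen count so no H is added in place of the
            # broken bond, then leave one unpaired electron per bond order.
            atom.SetNumExplicitHs(mol.GetAtomWithIdx(idx).GetTotalNumHs())
            atom.SetNoImplicit(True)
            atom.SetNumRadicalElectrons(atom.GetNumRadicalElectrons() + order)
        rw.RemoveBond(i, j)
        Chem.SanitizeMol(rw)
        frags = Chem.GetMolFrags(rw, asMols=True)
        results.append(tuple(sorted(Chem.MolToSmiles(f) for f in frags)))
    return results
```

Calling `break_each_bond("CCO")` returns one tuple of fragment SMILES per broken bond.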

On Wed, Jan 8, 2020 at 3:40 AM Puck van Gerwen 
wrote:

> Dear rdkit community,
>
> I am looking to start from a mol object (loaded from an .xyz file) and
> return all possible fragments generated from breaking one bond (any bond
> order). I don't want any pre-encoded rules about which bonds to break as in 
> BRICS.
> I sa
>
> --
> Puck van Gerwen
> Doktorandin
> Gruppe von Anatole von Lilienfeld
> Universität Basel

Re: [Rdkit-discuss] Anaconda installation without hard dependency on Intel MKl (windows)

2019-11-12 Thread Peter St. John

Another option would be to try the conda-forge rdkit. It doesn't appear to
use MKL -- I think the MKL dependency for the rdkit::rdkit package is
coming from the defaults::numpy dependency.

> some tools for example scipy and pandas are only available as openblas
> builds via pypi (pip).

I believe the conda-forge recipes for these are also based on openblas. You
might try installing rdkit from conda-forge in a fresh environment and see
if those numpy / scipy builds work for you.

-- Peter

On Tue, Nov 12, 2019 at 6:48 AM Greg Landrum  wrote:

>
>
> On Tue, Nov 12, 2019 at 2:00 PM Thomas Strunz 
> wrote:
>
>>
>> So for me this is temporary workaround but not really a permanent long
>> term solution (and as far as I can tell mostly an issue of conda and
>> windows and not rdkit)
>>
>
> Yeah, it's clearly not the ideal solution to the problem. And, yes, the
> problem is clearly related to conda and windows. It seems that there used
> to be a blog post explaining this (linked from this github issue:
> https://github.com/ContinuumIO/anaconda-issues/issues/656), but the URL
> no longer works and I can't find it.
> If the performance difference is that dramatic on AMD CPUs, it may be
> worth raising a new issue in the repo above and see if you get any kind of
> response.
>
> -greg
>
>
>

Re: [Rdkit-discuss] warnings when exporting pandas tables with molecules to hdf

2019-02-15 Thread Peter St. John

You might be better off not storing the RDKit molecule objects themselves
in the HDF file, but rather some other representation of each molecule. If
you need 3D atom coordinates, you could call MolToMolBlock() on each of the
RDKit mols, and then MolFromMolBlock() later to regenerate them. If you don't
need the 3D atom coordinates saved, SMILES strings would work well.

PyTables expects each entry to be something like an 'int', 'string',
'float64', etc. The RDKit mol object is a fairly odd data structure for
that library, so it's just warning you that it will have to use Python's
`pickle` module to serialize it.
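
A minimal sketch of that round trip. The two helper names are invented for this example; only `MolToMolBlock` and `MolFromMolBlock` are RDKit functions:

```python
def mols_to_blocks(mols):
    """Serialize RDKit molecules to mol block strings (coordinates are kept)."""
    from rdkit import Chem

    return [Chem.MolToMolBlock(mol) for mol in mols]


def blocks_to_mols(blocks):
    """Rebuild RDKit molecules from stored mol block strings."""
    from rdkit import Chem

    # Note: MolFromMolBlock removes explicit hydrogens by default;
    # pass removeHs=False if they need to be preserved.
    return [Chem.MolFromMolBlock(block) for block in blocks]
```

With a DataFrame, one would presumably store the result of `mols_to_blocks(df["mol"])` in a string column, drop the object column before calling `to_hdf`, and call `blocks_to_mols` after reading the file back.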

On Fri, Feb 15, 2019 at 6:35 AM Jose Manuel Gally <
jose.manuel.ga...@gmail.com> wrote:

> Hi all,
>
> I am working on some molecules in a pandas DataFrame and have to export
> them to a hdf file.
>
> This works just fine but I get a warning about Performance due to mixed
> types. (1)
>
> Why are RDKIT Mol objects causing this warning in the first place? Am I
> doing something wrong?
>
> Please find attached a small notebook with an example.
>
> For now I set the type of hdf to 'table', but I'm unsure this is the
> best work-around.
>
> Also, invoking pytest with --disable-warnings flag removes the message
> but the warning itself remains.
>
> Thanks in advance for any insight!
>
> Cheers,
> Jose Manuel
>
> (1) PerformanceWarning:
> your performance may suffer as PyTables will pickle object types that it
> cannot
> map directly to c-types [inferred_type->mixed,key->values] [items->None]
>
>return pytables.to_hdf(path_or_buf, key, self, **kwargs)
>

Re: [Rdkit-discuss] Bond tags in SVGs

2019-02-05 Thread Peter St. John

Just wanted to chime in that I'm also excited by this functionality!

An alternative to the SVG classes would be to nest atoms and bonds in SVG
groups:






You could also put the atoms and bonds in their own groups if it was easy:


...
...


 
...


That being said, I'm not sure that does much more for someone writing in
d3.js functionality that you couldn't do with the existing code. So if
trying to do that messes up the rest of the drawing code ignore the comment
:)
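
A rough sketch of what nesting atoms and bonds in SVG groups could look like, built with the Python standard library. Every class name, coordinate, and element here is invented for illustration and is not what RDKit's drawing code emits:

```python
import xml.etree.ElementTree as ET


def build_grouped_svg():
    """Build a tiny SVG with one <g> per atom and per bond (made-up data)."""
    svg = ET.Element("svg", {"xmlns": "http://www.w3.org/2000/svg",
                             "width": "100", "height": "50"})
    bonds = ET.SubElement(svg, "g", {"class": "bonds"})
    atoms = ET.SubElement(svg, "g", {"class": "atoms"})

    # One group per bond, holding the path(s) that draw it.
    bond = ET.SubElement(bonds, "g", {"class": "bond-0"})
    ET.SubElement(bond, "path", {"d": "M 20,25 L 80,25", "stroke": "black"})

    # One group per atom, holding its label.
    for idx, (symbol, x) in enumerate([("N", 20), ("O", 80)]):
        group = ET.SubElement(atoms, "g", {"class": f"atom-{idx}"})
        label = ET.SubElement(group, "text", {"x": str(x), "y": "25"})
        label.text = symbol
    return ET.tostring(svg, encoding="unicode")
```

With this layout a d3.js selection such as `g.atom-0` would pick up everything drawn for that atom at once.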


Thanks for the work on this!
-- Peter


On Tue, Feb 5, 2019 at 7:46 AM Greg Landrum  wrote:

> Sure. I don't have my Mac with me, so that'll need to wait until I'm back
> in Basel on the weekend.
>
> -greg
>
>
>
> On Tue, Feb 5, 2019 at 2:39 PM Lukas Pravda  wrote:
>
>> If it is not too much trouble to ask, please build it for mac os
>> (10.14.3) python 3.6.x.
>>
>>
>>
>> Thanks!
>>
>> Lukas
>>
>>
>>
>> *From: *Greg Landrum 
>> *Date: *Tuesday, 5 February 2019 at 13:40
>> *To: *Lukas Pravda 
>> *Cc: *RDKIT mailing list 
>> *Subject: *Re: [Rdkit-discuss] Bond tags in SVGs
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Feb 5, 2019 at 12:23 PM Lukas Pravda  wrote:
>>
>>
>>
>> Thanks for this. It looks excellent!! Is there a way how I can test this?
>> Other than cloning and compiling the repository? So far I have been using
>> rdkit solely from python and its conda builds, so don’t really know how to
>> test it.
>>
>>
>>
>> At the moment you would need to get a copy of the repo and build it. I
>> can do a build so that it's conda-installable though. Which OS are you
>> using?
>>
>>
>>
>> If I understand this correctly, the atom and bond class ids are added
>> only after TagAtoms() is called, or are they added at the ‘DrawMolecule()’
>> stage?
>>
>>
>>
>> Bond classes are added as the bonds are written. Atom classes can only be
>> added at the TagAtoms() stage - there's not an object in the SVG for many
>> atoms without TagAtoms() being called.
>>
>>
>>
>> I can imagine a lot of possible scenarios and use cases with this new
>> functionality. However, in order to make the function TagAtoms()
>> sufficiently general, a bit more control over the javascript used in the
>> events would be needed. As a possible suggestions, I can imagine to pass as
>> the third parameter a lambda selector, which would in turn feed the JS
>> function with parameters to display names/charges/whatever. Also it would
>> be nice to have a mean how to pass dict of key-val properties for both
>> atoms and bonds so that you can incorporate related data into the svg.
>>
>>
>>
>> Having said that, in my opinion if svgs end up as a part of
>> html/javascript application, it is the best to expose this interactivity
>> directly from the client, rather than ‘pre-generating’ the behaviour on the
>> server. So I’m not sure If it is worth investing time into mimicking this
>> functionality in C++/python code, Whoever is in a need of generating
>> interactive svgs, can directly consume the svg string and modify it
>> according to their needs.
>>
>>
>>
>> Yeah, that's more or less what I was thinking. We want to write something
>> that can be reasonably easily modified after the fact to produce something
>> useful.
>>
>>
>>
>>
>>
>> To sum up, I think it should enough just to tag positions and identifiers
>> of atoms/bonds exactly as you do and possibly further extend them with a
>> mean how to pass some extra data to all of it. Then users can modify svgs
>> whichever way they want, but others might think differently.
>>
>>
>>
>> Excellent!
>>
>>
>>
>> -greg
>>
>>
>>
>>
>>
>> Best,
>>
>> Lukas
>>
>> *From: *Greg Landrum 
>> *Date: *Sunday, 3 February 2019 at 17:49
>> *To: *Lukas Pravda 
>> *Cc: *RDKIT mailing list 
>> *Subject: *Re: [Rdkit-discuss] Bond tags in SVGs
>>
>>
>>
>> Hi Lukas,
>>
>>
>>
>> I had a chance to do a bit of work on this recently and I'd be interested
>> to hear your feedback.
>>
>>
>>
>> Bonds are now tagged with their bond IDs (using classes) and the
>> "TagAtoms()" function now adds clickable transparent circles above each
>> atom. These are also tagged with atom IDs using classes. TagAtoms() also
>> lets you add callback functions for events associated with the atom
>> circles. At the moment these are simply called with the atom id, but
>> there's almost certainly a better way to do that. Suggestions are very
>> welcome.
>>
>>
>>
>> Here's a gist showing what's currently on the branch:
>> https://gist.github.com/greglandrum/d23517cb449003252cf09b5bd14d8637
>>
>>
>>
>>
>>
>> On Tue, Dec 4, 2018 at 6:46 PM Lukas Pravda  wrote:
>>
>> Hi Greg,
>>
>>
>>
>> that’s what I have been thinking, unlucky. Essentially, I want to color
>> the molecule in web-browser with various annotations and make it
>> interactive. For that part I’m converting it internally to the d3.js
>> internal representation (https://d3js.org/) and connecting it to its
>> environment. For most of the parts I’m just fine with the position of atoms
>> in svg using the tag property.
>>
>>

Re: [Rdkit-discuss] Dividing inputstream over threads

2019-01-21 Thread Peter St. John

Another option is dask (https://docs.dask.org/en/latest/). I've used
`map_partitions` from dask to bulk convert a column of smiles strings into
various computed properties. You could then output to a CSV or other
database file.

-- Peter

On Mon, Jan 21, 2019 at 1:45 AM Markus Sitzmann 
wrote:

> > SQLalchemy creates a fairly specific ecosystem that you have to buy
> > into for it to make sense. When you don't have objects, only a table
> > of properties, OR mapper is just bloat.
>
> There is no need for objects with SQLAlchemy, SQLAlchemy's Core and its
> expression language is pretty excellent without objects ...
>
> >With parallel processing your bottleneck is going to be database
> >inserts. One option is write out CSV file(s) from each thread/job,
> >concatenate them in the final node, and then bulk-import into the
> >database: typically CSV (or other such format) bulk import is orders
> >of magnitude faster than inserting one SQL statement at a time.
>
> ... and bulk-inserts of Python data types into the database.
>
> Markus
>
> On Sun, Jan 20, 2019 at 9:17 PM Dmitri Maziuk via Rdkit-discuss <
> rdkit-discuss@lists.sourceforge.net> wrote:
>
>> On Sun, 20 Jan 2019 12:03:50 +0100
>> Shojiro Shibayama  wrote:
>>
>> > ... I guess SQLalchemy
>> > in python might be good, but I'm not sure. Hope that you'll find out
>> > a good library of SQL OR mapper for python.
>>
>> SQLalchemy creates a fairly specific ecosystem that you have to buy
>> into for it to make sense. When you don't have objects, only a table
>> of properties, OR mapper is just bloat.
>>
>> With parallel processing your bottleneck is going to be database
>> inserts. One option is write out CSV file(s) from each thread/job,
>> concatenate them in the final node, and then bulk-import into the
>> database: typically CSV (or other such format) bulk import is orders
>> of magnitude faster than inserting one SQL statement at a time.
>>
>> --
>> Dmitri Maziuk 
>>
>>

Re: [Rdkit-discuss] issues with explicit / implicit valence

2018-11-15 Thread Peter St. John

Makes sense, apologies for the lack of details -- it was a bit of a
convoluted process to get to that molecule.
Attached is a python script that hopefully reproduces it.

Essentially I'm taking the result of a Gaussian optimization (for a
radical); constructing an SDF file with OpenBabel (via cclib), and then
trying to read the result in RDKit.
I have the SMILES string of the radical, but the connectivity is lost in
the gaussian output file. So the SDF that gets created by OpenBabel has to
assume bond orders based on distances that it sometimes gets wrong.
I also had to edit the AssignBondOrdersFromTemplate function in AllChem to
handle the radical atoms.

If you had another recommendation on going from a gaussian output file to
an RDKit mol though, I'd certainly like to hear it.

Thanks!
-- Peter

On Wed, Nov 14, 2018 at 10:53 PM Greg Landrum 
wrote:

> Hi Peter,
>
> Without seeing how you're building the molecule this one is a bit tricky
> to help with.
>
> If I start with a standard molecule and just adjust the valence count
> things are fine:
>
> In [22]: m = Chem.MolFromSmiles('CNC(C)C')
>
> In [23]: m.GetAtomWithIdx(0).SetNumRadicalElectrons(1)
>
> In [24]: mh = Chem.AddHs(m)
>
> In [25]: print(Chem.MolToMolBlock(mh))
>
>  RDKit  2D
>
>  16 15  0  0  0  0  0  0  0  0999 V2000
>     0.0000    0.0000    0.0000 C   0  0  0  0  0  4  0  0  0  0  0  0
>     1.5000   -0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
>     2.2500   -1.2990    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     0.9510   -2.0490    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     3.5490   -0.5490    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.5000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     0.0000    1.5000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.0972   -0.7912    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     2.0861    1.3808    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     3.0000   -2.5981    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.3481   -2.7990    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     0.3314   -1.5474    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     1.7010   -3.3481    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     4.8481    0.2010    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     4.2990   -1.8481    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>     2.9630    0.8317    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
>   1  2  1  0
>   2  3  1  0
>   3  4  1  0
>   3  5  1  0
>   1  6  1  0
>   1  7  1  0
>   1  8  1  0
>   2  9  1  0
>   3 10  1  0
>   4 11  1  0
>   4 12  1  0
>   4 13  1  0
>   5 14  1  0
>   5 15  1  0
>   5 16  1  0
> M  RAD  1   1   2
> M  END
>
>
> In [26]: Chem.SanitizeMol(mh)
> Out[26]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
> In [27]: Chem.SanitizeMol(m)
> Out[27]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
>
> How are you constructing the molecule with the radical?
>
> Best,
> -greg
>
>
> On Wed, Nov 14, 2018 at 7:36 PM Peter St. John 
> wrote:
>
>> I have a molecule with radicals for which I'm trying to correct the bond
>> orders.
>> The mol block I have currently is shown below.
>>
>> Ultimately it thinks the first carbon (which is supposed to have 2
>> explicit hydrogens, 1 C-C bond, and 1 radical electron) has a valence of 5.
>> So when I try to call `SanitizeMol`, it errors out with too high a valence.
>>
>> for the problematic atom 'a',
>>
>> >>> a.GetNumImplicitHs()
>>
>> RuntimeError: Pre-condition Violation
>>  getNumImplicitHs() called without preceding call to 
>> calcImplicitValence()
>>
>>
>> >>> a.GetTotalValence()
>>
>> 3 (odd, since this is what I want)
>>
>>
>> >>> a.UpdatePropertyCache()
>>
>> ValueError: Sanitization error: Explicit valence for atom # 0 C, 5, is 
>> greater than permitted
>>
>>
>> And when I print the mol block, it clearly thinks that first carbon has a
>> valence of 5.
>>
>> Any suggestions how to fix this?
>>
>>
>> >>> print(Chem.MolToMolBlock(mol))
>>
>> 9572
>>  RDKit  3D
>>
>>  15 14  0  0  0  0  0  0  0  0999 V2000
>>     2.0411   -0.0455   -0.1061 C   0  0  0  0  0  *5*  0  0  0  0  0  0
>>     0.8127   -0.5644    0.2519 N   0  0  0  0  0  0  0  0  0  0  0  0
>>    -0.3953    0.0049   -0.3294 C   0  0  0  0  0  0  0  0  0  0  0  0
>>    -0.6511    1.4326    0.1487 C   0  0  0  0  0  0  0  0  0  0  0  0
>>    -1.5741   -0.9060   -0.0263 C   0  0  0  0  0  0  0  0  0  0  0  0
>>

Re: [Rdkit-discuss] issues with explicit / implicit valence

2018-11-14 Thread Peter St. John

 a.GetNumRadicalElectrons() does have the right number.

If I set all the atoms to not have implicit hydrogens with
for atom in mol.GetAtoms():
    atom.SetNoImplicit(True)

then I still get a sanitization error with
>>> Chem.SanitizeMol(mol)

ValueError: Sanitization error: Explicit valence for atom # 0 C, 5, is
greater than permitted


But oddly, the MolBlock is now correct:
>>> print(Chem.MolToMolBlock(mol))
...

2.0411   -0.0455   -0.1061 C   0  0  0  0  0  4  0  0  0  0  0  0

...

So I can get the molecule I'm looking for I suppose by calling
>>> mol2 = Chem.MolFromMolBlock(Chem.MolToMolBlock(mol))
¯\_(ツ)_/¯
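
For pinning down which atom trips sanitization before resorting to the round trip, something like the sketch below may help. It assumes an RDKit release recent enough to provide `Chem.DetectChemistryProblems` (newer than the version this thread was written against); `valence_problems` is an illustrative wrapper name:

```python
def valence_problems(mol):
    """List (atom index, message) for every valence error RDKit detects."""
    from rdkit import Chem

    return [(problem.GetAtomIdx(), problem.Message())
            for problem in Chem.DetectChemistryProblems(mol)
            if problem.GetType() == "AtomValenceException"]


if __name__ == "__main__":
    from rdkit import Chem

    # A deliberately over-valent nitrogen, read without sanitization.
    bad = Chem.MolFromSmiles("CN(C)(C)C", sanitize=False)
    for idx, message in valence_problems(bad):
        print(idx, message)
```

The same call should work on a molecule read from an SDF with `sanitize=False`, which is closer to the situation described here.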

Thanks for the help!

On Wed, Nov 14, 2018 at 12:16 PM Paolo Tosco 
wrote:

> Hi Peter,
>
> try a.SetNoImplicit(True)
>
> does a.GetNumRadicalElectrons() report the correct figure?
>
> Cheers,
> p.
>
> On 11/14/18 18:35, Peter St. John wrote:
>
> I have a molecule with radicals for which I'm trying to correct the bond
> orders.
> The mol block I have currently is shown below.
>
> Ultimately it thinks the first carbon (which is supposed to have 2
> explicit hydrogens, 1 C-C bond, and 1 radical electron) has a valence of 5.
> So when I try to call `SanitizeMol`, it errors out with too high a valence.
>
> for the problematic atom 'a',
>
> >>> a.GetNumImplicitHs()
>
> RuntimeError: Pre-condition Violation
>   getNumImplicitHs() called without preceding call to 
> calcImplicitValence()
>
>  >>> a.GetTotalValence()
>
> 3 (odd, since this is what I want)
>
>  >>> a.UpdatePropertyCache()
>
> ValueError: Sanitization error: Explicit valence for atom # 0 C, 5, is 
> greater than permitted
>
>  And when I print the mol block, it clearly thinks that first carbon has a
> valence of 5.
>
> Any suggestions how to fix this?
>
>  >>> print(Chem.MolToMolBlock(mol))
>
> 9572
>  RDKit  3D
>
>  15 14  0  0  0  0  0  0  0  0999 V2000
>     2.0411   -0.0455   -0.1061 C   0  0  0  0  0  *5*  0  0  0  0  0  0
>     0.8127   -0.5644    0.2519 N   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.3953    0.0049   -0.3294 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.6511    1.4326    0.1487 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.5741   -0.9060   -0.0263 C   0  0  0  0  0  0  0  0  0  0  0  0
>     2.1578    0.2387   -1.1430 H   0  0  0  0  0  0  0  0  0  0  0  0
>     2.9032   -0.4021    0.4366 H   0  0  0  0  0  0  0  0  0  0  0  0
>     0.7154   -0.7889    1.2330 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.2282    0.0219   -1.4109 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.8463    1.4378    1.2242 H   0  0  0  0  0  0  0  0  0  0  0  0
>     0.2197    2.0597   -0.0426 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.5161    1.8651   -0.3565 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.7375   -0.9640    1.0535 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.3932   -1.9131   -0.4005 H   0  0  0  0  0  0  0  0  0  0  0  0
>    -2.4874   -0.5194   -0.4787 H   0  0  0  0  0  0  0  0  0  0  0  0
>   1  2  1  0
>   1  7  1  0
>   2  8  1  0
>   3  5  1  0
>   3  4  1  0
>   3  2  1  0
>   4 10  1  0
>   5 13  1  0
>   6  1  1  0
>   9  3  1  0
>  11  4  1  0
>  12  4  1  0
>  14  5  1  0
>  15  5  1  0
> M  RAD  1   1   2
> M  END
>
>  Thanks!
>
> -- Peter St. John
>
>
>
>
>

[Rdkit-discuss] issues with explicit / implicit valence

2018-11-14 Thread Peter St. John

I have a molecule with radicals for which I'm trying to correct the bond
orders.
The mol block I have currently is shown below.

Ultimately it thinks the first carbon (which is supposed to have 2 explicit
hydrogens, 1 C-C bond, and 1 radical electron) has a valence of 5. So when
I try to call `SanitizeMol`, it errors out with too high a valence.

for the problematic atom 'a',

>>> a.GetNumImplicitHs()

RuntimeError: Pre-condition Violation
getNumImplicitHs() called without preceding call to 
calcImplicitValence()


>>> a.GetTotalValence()

3 (odd, since this is what I want)


>>> a.UpdatePropertyCache()

ValueError: Sanitization error: Explicit valence for atom # 0 C, 5, is
greater than permitted


And when I print the mol block, it clearly thinks that the first carbon has
a valence of 5.

Any suggestions how to fix this?


>>> print(Chem.MolToMolBlock(mol))

9572
     RDKit          3D

 15 14  0  0  0  0  0  0  0  0999 V2000
    2.0411   -0.0455   -0.1061 C   0  0  0  0  0  5  0  0  0  0  0  0
    0.8127   -0.5644    0.2519 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3953    0.0049   -0.3294 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.6511    1.4326    0.1487 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5741   -0.9060   -0.0263 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.1578    0.2387   -1.1430 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.9032   -0.4021    0.4366 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.7154   -0.7889    1.2330 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.2282    0.0219   -1.4109 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8463    1.4378    1.2242 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.2197    2.0597   -0.0426 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5161    1.8651   -0.3565 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7375   -0.9640    1.0535 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.3932   -1.9131   -0.4005 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.4874   -0.5194   -0.4787 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
  1  7  1  0
  2  8  1  0
  3  5  1  0
  3  4  1  0
  3  2  1  0
  4 10  1  0
  5 13  1  0
  6  1  1  0
  9  3  1  0
 11  4  1  0
 12  4  1  0
 14  5  1  0
 15  5  1  0
M  RAD  1   1   2
M  END


Thanks!

-- Peter St. John
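
A note for readers of this thread: the mismatch described above can be seen without RDKit by reading the mol block directly. The sketch below is only an illustration, not the fix that was eventually used. It sums the bond orders per atom and reads the sixth integer after the element symbol, which in the V2000 format is the valence field (when non-zero, it is documented as the number of bonds on the atom including bonds to implied hydrogens). The function name `summarize` is made up for this example, and it assumes whitespace-separated columns and bond types 1 to 3 only.

```python
# Pure-Python sketch: compare each atom's bond-order sum with its V2000
# valence field. `summarize` is a hypothetical helper, not an RDKit function.
# Whitespace splitting is a shortcut; the real format is fixed-width.
MOL_BLOCK = """\
9572
     RDKit          3D

 15 14  0  0  0  0  0  0  0  0999 V2000
    2.0411   -0.0455   -0.1061 C   0  0  0  0  0  5  0  0  0  0  0  0
    0.8127   -0.5644    0.2519 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3953    0.0049   -0.3294 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.6511    1.4326    0.1487 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5741   -0.9060   -0.0263 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.1578    0.2387   -1.1430 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.9032   -0.4021    0.4366 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.7154   -0.7889    1.2330 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.2282    0.0219   -1.4109 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8463    1.4378    1.2242 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.2197    2.0597   -0.0426 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5161    1.8651   -0.3565 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.7375   -0.9640    1.0535 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.3932   -1.9131   -0.4005 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.4874   -0.5194   -0.4787 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
  1  7  1  0
  2  8  1  0
  3  5  1  0
  3  4  1  0
  3  2  1  0
  4 10  1  0
  5 13  1  0
  6  1  1  0
  9  3  1  0
 11  4  1  0
 12  4  1  0
 14  5  1  0
 15  5  1  0
M  RAD  1   1   2
M  END
"""


def summarize(mol_block):
    """Return one dict per atom: symbol, valence field, bond-order sum, radical."""
    lines = mol_block.splitlines()
    # Locate the counts line by its V2000 tag rather than a fixed position.
    counts_idx = next(i for i, ln in enumerate(lines)
                      if ln.rstrip().endswith("V2000"))
    n_atoms, n_bonds = (int(tok) for tok in lines[counts_idx].split()[:2])
    first_atom = counts_idx + 1
    first_bond = first_atom + n_atoms

    rows = []
    for ln in lines[first_atom:first_bond]:
        tok = ln.split()
        # tok[0:3] are x, y, z; tok[3] is the symbol; tok[9] is the valence field.
        rows.append({"symbol": tok[3], "valence_field": int(tok[9]),
                     "bond_order_sum": 0, "radical": 0})

    for ln in lines[first_bond:first_bond + n_bonds]:
        a, b, order = (int(tok) for tok in ln.split()[:3])
        rows[a - 1]["bond_order_sum"] += order
        rows[b - 1]["bond_order_sum"] += order

    for ln in lines:
        if ln.startswith("M  RAD"):
            pairs = ln.split()[3:]  # skip "M", "RAD", and the entry count
            for atom_idx, value in zip(pairs[0::2], pairs[1::2]):
                rows[int(atom_idx) - 1]["radical"] = int(value)
    return rows


if __name__ == "__main__":
    for number, row in enumerate(summarize(MOL_BLOCK), start=1):
        if row["valence_field"] or row["radical"]:
            print(number, row)
```

For this block only atom 1 is reported: its three listed bonds sum to 3, while the valence field reads 5 and the `M  RAD` entry marks it as a doublet. That gap between the explicit bonds and the stored valence field is at least consistent with the "Explicit valence ... 5" error quoted above.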

Re: [Rdkit-discuss] Are atom and bond indexes deterministic?

2018-10-02 Thread Peter St. John

Awesome, thanks for the tip!

Connor, that is also a great idea; I didn't know about atom-mapped SMILES
strings. That would definitely be a good method if the indexing algorithm
changes across RDKit versions.

Thanks!
-- Peter

On Tue, Oct 2, 2018 at 2:56 PM Nils Weskamp wrote:

> Hi Peter,
>
> to the best of my knowledge: for a given SMILES string, you should
> always end up with the same molecule object.
>
> On the other hand, generation of (canonical / unique) SMILES often
> reorders atoms and bonds (to ensure that the SMILES is unique for a
> given structure). A conversion Molecule -> SMILES -> Molecule could thus
> lead to a different ordering of atoms and bonds and you will have to
> canonicalize your structure before you generate your index. [Or make
> sure that you use non-canonical SMILES.]
>
> Best,
> Nils
>
> On 02.10.2018 at 22:32, Peter St. John wrote:
> > If I store a molecule as a SMILES string, along with relevant
> > information about different bonds, is it safe to annotate those bond
> > entries by bond index?
> >
> > I.e., if I create a new rdkit Molecule with
> > rdkit.Chem.MolFromSmiles(xxx), will the bond ordering always be the
> > same? If not, does anyone know a robust way of specifying a bond
> > within a molecule as a string-based representation?
> >
> > Thanks for the help!
> > -- Peter
> >
>
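
To make the atom-mapped SMILES suggestion from this thread concrete, here is one possible way to store bond annotations so that they do not depend on bond indices. It is a sketch under assumptions, not a method confirmed by anyone in the thread: the helper names (`bond_key`, `atom_map_numbers`, `check_record`), the record layout, and the example labels are all invented for illustration, and the regular expression only handles map numbers written in the usual `[CH3:1]` bracket form.

```python
import re


def bond_key(map_a, map_b):
    """Order-independent string key for the bond between two atom-map numbers."""
    lo, hi = sorted((map_a, map_b))
    return f"{lo}-{hi}"


def atom_map_numbers(smiles):
    """Collect atom-map numbers from bracket atoms such as [CH3:1]."""
    return {int(num) for num in re.findall(r":(\d+)\]", smiles)}


def check_record(record):
    """Return the bond keys that mention a map number absent from the SMILES."""
    known = atom_map_numbers(record["smiles"])
    bad = []
    for key in record["bonds"]:
        a, b = (int(num) for num in key.split("-"))
        if a not in known or b not in known:
            bad.append(key)
    return bad


# Hypothetical record: ethanol with every heavy atom mapped, and one
# free-form annotation per bond, keyed by map numbers instead of bond index.
record = {
    "smiles": "[CH3:1][CH2:2][OH:3]",
    "bonds": {
        bond_key(2, 1): {"label": "C-C"},
        bond_key(2, 3): {"label": "C-O"},
    },
}

if __name__ == "__main__":
    print(sorted(record["bonds"]))
    print(check_record(record))
```

On the RDKit side, the map number of each atom is available through `atom.GetAtomMapNum()`, and `mol.GetBondBetweenAtoms(i, j)` returns the bond between two atom indices, so a key such as "1-2" can be resolved after parsing regardless of how atoms and bonds end up ordered.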

[Rdkit-discuss] Are atom and bond indexes deterministic?

2018-10-02 Thread Peter St. John

If I store a molecule as a SMILES string, along with relevant information
about different bonds, is it safe to annotate those bond entries by bond
index?

I.e., if I create a new rdkit Molecule with rdkit.Chem.MolFromSmiles(xxx),
will the bond ordering always be the same? If not, does anyone know a
robust way of specifying a bond within a molecule as a string-based
representation?

Thanks for the help!
-- Peter
