Hi,

Python's standard-library multiprocessing module may help you parallelize
your code.

The following short post describes code I wrote that converts SMILES to
hashed Morgan fingerprints in parallel. It took 10 minutes for 1.5 million
compounds when 6 processes were used.
https://loudspeaker.sakura.ne.jp/devblog/2019/01/20/python-multiprocessing-write-strings-single/

multiprocessing.Pool.imap can be used directly in a for loop. The loop runs
in the parent process, so it can safely write each result to a text file or
even to your SQL database. SQLAlchemy might be a good choice in Python, but
I'm not sure. I hope you find a good SQL library or object-relational
mapper (ORM) for Python.
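
To illustrate the idea, here is a minimal sketch, not the code from the
post. It assumes the standard-library sqlite3 module as the database, and
compute_property is a hypothetical placeholder for the real RDKit
calculation; the table name "props" and the helper names are also made up
for this example. Only the parent process touches the database, while the
workers just compute.

```python
import multiprocessing
import sqlite3


def compute_property(smiles):
    """Hypothetical stand-in for the real RDKit calculation (runs in a worker)."""
    # Placeholder property: the length of the SMILES string.
    return smiles, len(smiles)


def store_results(results, conn):
    """Write (smiles, value) pairs to the database; called in the parent only."""
    conn.execute("CREATE TABLE IF NOT EXISTS props (smiles TEXT, value INTEGER)")
    count = 0
    for smiles, value in results:
        conn.execute("INSERT INTO props VALUES (?, ?)", (smiles, value))
        count += 1
    conn.commit()
    return count


def run(smiles_list, db_path, processes=6):
    """Compute properties in worker processes and store them from the parent."""
    conn = sqlite3.connect(db_path)
    try:
        with multiprocessing.Pool(processes) as pool:
            # imap yields results lazily, in input order, as workers finish.
            results = pool.imap(compute_property, smiles_list, chunksize=100)
            return store_results(results, conn)
    finally:
        conn.close()


if __name__ == "__main__":
    n_rows = run(["CCO", "c1ccccc1", "CC(=O)O"], ":memory:", processes=2)
    print(n_rows, "rows written")
```

Because a single process does all the writing, no locking is needed around
the database connection. A larger chunksize reduces inter-process overhead
when each calculation is cheap.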

Sincerely yours,
Shojiro


On Tue, 15 Jan 2019, 01:54 Andreas Luttens <andreas.lutt...@gmail.com wrote:

> Hi!
>
> I have developed a small script that calculates properties for
> molecules that are stored in a SMILES file. The properties should be stored
> in an SQL database, which works fine, but I would like to speed up the
> process a bit. I was thinking of parallelizing the property calculation
> and writing the results through separate connections to my SQL
> database. I have done this before in Python with OpenEye, and it seems to
> do the trick. However, I want my code to be usable by people who do
> not hold a license for OpenEye, which is why I am trying RDKit. I would
> like my code to be in C++ as well.
>
> I was wondering how to tackle this problem. Does the RDKit have
> functionality similar to "oemolithread" for chunking up the incoming stream?
> I didn't find anything like this when I first scrolled through the
> documentation. If it is not implemented, how would I divide the work on
> incoming molecules over N threads?
>
> All help is much appreciated. Thanks in advance.
>
> Best regards,
>
> Andreas Luttens
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>