Re: [Rdkit-discuss] Dividing inputstream over threads

2019-01-21 Thread Dmitri Maziuk via Rdkit-discuss
On Mon, 21 Jan 2019 09:43:48 +0100 Markus Sitzmann wrote:
> There is no need for objects with SQLAlchemy; SQLAlchemy's Core and
> its expression language are pretty excellent without objects ...

I spent weeks last year rewriting code that I myself wrote back when I believed that... When I wrote

Re: [Rdkit-discuss] Dividing inputstream over threads

2019-01-21 Thread Peter St. John
Another option is dask (https://docs.dask.org/en/latest/). I've used `map_partitions` from dask to bulk-convert a column of SMILES strings into various computed properties. You could then write the output to a CSV file or a database.

-- Peter

On Mon, Jan 21, 2019 at 1:45 AM Markus Sitzmann wrote:
>

Re: [Rdkit-discuss] Dividing inputstream over threads

2019-01-21 Thread Markus Sitzmann
> SQLAlchemy creates a fairly specific ecosystem that you have to buy
> into for it to make sense. When you don't have objects, only a table
> of properties, an OR mapper is just bloat.

There is no need for objects with SQLAlchemy; SQLAlchemy's Core and its expression language are pretty excellent
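
To make the "Core without objects" point concrete, here is a small sketch that defines a table of properties and uses only the expression language, with no mapped classes. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database. The table name, column names and the two hand-typed rows are illustrative assumptions, not taken from the original script.

```python
from sqlalchemy import (
    Column, Float, Integer, MetaData, String, Table,
    create_engine, insert, select,
)

metadata = MetaData()

# A plain table description; no ORM classes are involved.
properties = Table(
    "properties",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("smiles", String, nullable=False),
    Column("mol_wt", Float),
)

engine = create_engine("sqlite:///:memory:")
metadata.create_all(engine)

# Illustrative rows (ethanol and benzene), typed in by hand.
rows = [
    {"smiles": "CCO", "mol_wt": 46.07},
    {"smiles": "c1ccccc1", "mol_wt": 78.11},
]

# One bulk insert inside a transaction that commits on exit.
with engine.begin() as conn:
    conn.execute(insert(properties), rows)

# Query with the expression language instead of raw SQL strings.
with engine.connect() as conn:
    query = select(properties.c.smiles).where(properties.c.mol_wt > 50)
    heavy = conn.execute(query).scalars().all()

print(heavy)
```

Passing a list of dictionaries to `execute()` performs the insert as a single batch, which suits a table of computed properties.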

Re: [Rdkit-discuss] Dividing inputstream over threads

2019-01-20 Thread Dmitri Maziuk via Rdkit-discuss
On Sun, 20 Jan 2019 12:03:50 +0100 Shojiro Shibayama wrote:
> ... I guess SQLAlchemy
> in Python might be good, but I'm not sure. Hope that you'll find
> a good SQL OR mapper library for Python.

SQLAlchemy creates a fairly specific ecosystem that you have to buy into for it to make

Re: [Rdkit-discuss] Dividing inputstream over threads

2019-01-20 Thread Shojiro Shibayama
Hi, The multiprocessing module from the Python standard library may help you parallelize your code. I wrote some code that converts SMILES to hashed Morgan fingerprints using parallel computation in the following short post. It took 10 minutes for 1.5 million compounds when 6 processes were used.
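
The code from the post is not reproduced here. A minimal sketch of the general pattern, using `multiprocessing.Pool`, might look like the following. The `toy_hashed_fingerprint` function is a hypothetical stand-in so that the example runs without RDKit: it is NOT a chemical fingerprint, and in real code the RDKit Morgan fingerprint call would go in its place. The values of `N_BITS` and `chunksize` are assumptions as well.

```python
import hashlib
from multiprocessing import Pool

N_BITS = 2048  # assumed fingerprint length


def toy_hashed_fingerprint(smiles):
    """Hypothetical stand-in for an RDKit Morgan fingerprint call.

    Hashes every two-character window of the SMILES string into one of
    N_BITS positions and returns the sorted list of on-bit indices.
    hashlib is used because it gives the same result in every process.
    """
    on_bits = set()
    for i in range(len(smiles) - 1):
        digest = hashlib.sha1(smiles[i:i + 2].encode("utf-8")).digest()
        on_bits.add(int.from_bytes(digest[:4], "big") % N_BITS)
    return sorted(on_bits)


def fingerprint_all(smiles_list, processes=6, chunksize=1000):
    """Fingerprint every SMILES in parallel; results keep the input order."""
    with Pool(processes=processes) as pool:
        # chunksize controls how many SMILES each worker receives at once,
        # which keeps inter-process overhead low for cheap per-item work.
        return pool.map(toy_hashed_fingerprint, smiles_list, chunksize=chunksize)
```

When using this from a script, call `fingerprint_all(...)` from under an `if __name__ == "__main__":` guard, since some platforms start worker processes by re-importing the main module.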

Re: [Rdkit-discuss] Dividing inputstream over threads

2019-01-14 Thread Francois Berenger
On 15/01/2019 09:53, Andreas Luttens wrote:
Hi! I have developed a small script that calculates molecular properties for molecules that are stored in a SMILES file. The properties should be stored in an SQL database, which works fine, but I would like to speed up the process a bit. I was

[Rdkit-discuss] Dividing inputstream over threads

2019-01-14 Thread Andreas Luttens
Hi! I have developed a small script that calculates molecular properties for molecules that are stored in a SMILES file. The properties should be stored in an SQL database, which works fine, but I would like to speed up the process a bit. I was thinking of implementing some parallelization for
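
For reference, one way to structure the workflow described in the question is to read the SMILES input lazily in batches, compute the properties for each batch, and let a single writer insert the rows into the database. The sketch below uses only the standard library (`sqlite3`, `itertools`). The table layout, the batch size and `toy_properties` (a hypothetical stand-in for the real RDKit calculation) are all assumptions, since the original script is not shown.

```python
import sqlite3
from itertools import islice


def read_smiles(lines):
    """Yield the first whitespace-separated token of each non-empty line."""
    for line in lines:
        fields = line.split()
        if fields:
            yield fields[0]


def batches(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def toy_properties(smiles):
    """Hypothetical stand-in for the real RDKit property calculation."""
    return (smiles, len(smiles))


def store_properties(lines, conn, batch_size=1000, map_fn=map):
    """Compute properties batch by batch and insert them from one writer.

    map_fn defaults to the built-in map; a process pool's map method
    could be passed instead to spread each batch over several workers.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS props (smiles TEXT, n_chars INTEGER)"
    )
    for batch in batches(read_smiles(lines), batch_size):
        rows = list(map_fn(toy_properties, batch))
        conn.executemany("INSERT INTO props VALUES (?, ?)", rows)
        conn.commit()


conn = sqlite3.connect(":memory:")
demo_lines = ["CCO ethanol\n", "\n", "c1ccccc1 benzene\n"]
store_properties(demo_lines, conn, batch_size=2)
stored = conn.execute(
    "SELECT smiles, n_chars FROM props ORDER BY rowid"
).fetchall()
print(stored)
```

Keeping all inserts in the parent process avoids several workers writing to the same database at once; only the property calculation would be handed to the workers.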