Re: [Rdkit-discuss] Dividing inputstream over threads

Markus Sitzmann Mon, 21 Jan 2019 00:45:30 -0800

> SQLalchemy creates a fairly specific ecosystem that you have to buy
> into for it to make sense. When you don't have objects, only a table
> of properties, OR mapper is just bloat.


There is no need for objects with SQLAlchemy: SQLAlchemy Core and its
expression language work very well without objects ...

> With parallel processing your bottleneck is going to be database
> inserts. One option is write out CSV file(s) from each thread/job,
> concatenate them in the final node, and then bulk-import into the
> database: typically CSV (or other such format) bulk import is orders
> of magnitude faster than inserting one SQL statement at a time.

... and it supports bulk inserts of Python data types into the database.
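
As a rough illustration of what I mean, here is a minimal sketch that uses
only SQLAlchemy Core (no ORM classes) to insert a list of plain Python dicts
in a single call. The table name, the column names, the in-memory SQLite URL
and the helper bulk_insert() are all made up for this example; adapt them to
your own schema and database.

```python
from sqlalchemy import Column, Float, Integer, MetaData, String, Table, create_engine

# Hypothetical table of per-molecule properties; all names are illustrative.
metadata = MetaData()
properties = Table(
    "properties",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("smiles", String, nullable=False),
    Column("mol_weight", Float),
)


def bulk_insert(engine, rows):
    """Insert a list of dicts in one call and return the table's row count."""
    # engine.begin() commits on success and rolls back if an error is raised.
    with engine.begin() as conn:
        # Passing a list of dicts lets the driver run the insert in
        # executemany style rather than one statement per row.
        conn.execute(properties.insert(), rows)
        return len(conn.execute(properties.select()).fetchall())


# In-memory SQLite is assumed here only to keep the sketch self-contained.
engine = create_engine("sqlite:///:memory:")
metadata.create_all(engine)

rows = [
    {"smiles": "CCO", "mol_weight": 46.07},
    {"smiles": "c1ccccc1", "mol_weight": 78.11},
]
count = bulk_insert(engine, rows)
```

Each worker could collect its results as such a list of dicts and hand them
over in batches. Whether this beats a CSV bulk import will depend on the
database and driver, so it is worth measuring on your own data.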

Markus

On Sun, Jan 20, 2019 at 9:17 PM Dmitri Maziuk via Rdkit-discuss <
rdkit-discuss@lists.sourceforge.net> wrote:

> On Sun, 20 Jan 2019 12:03:50 +0100
> Shojiro Shibayama <notify.p...@gmail.com> wrote:
>
> > ... I guess SQLalchemy
> > in python might be good, but I'm not sure. Hope that you'll find out
> > a good library of SQL OR mapper for python.
>
> SQLalchemy creates a fairly specific ecosystem that you have to buy
> into for it to make sense. When you don't have objects, only a table
> of properties, OR mapper is just bloat.
>
> With parallel processing your bottleneck is going to be database
> inserts. One option is write out CSV file(s) from each thread/job,
> concatenate them in the final node, and then bulk-import into the
> database: typically CSV (or other such format) bulk import is orders
> of magnitude faster than inserting one SQL statement at a time.
>
> --
> Dmitri Maziuk <dmaz...@bmrb.wisc.edu>
>
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
