Another option is Dask (https://docs.dask.org/en/latest/). I've used
`map_partitions` from Dask to bulk-convert a column of SMILES strings into
various computed properties. You could then write the results out to CSV
or another file format for loading into a database.

-- Peter

On Mon, Jan 21, 2019 at 1:45 AM Markus Sitzmann <[email protected]>
wrote:

> > SQLAlchemy creates a fairly specific ecosystem that you have to buy
> > into for it to make sense. When you don't have objects, only a table
> > of properties, an OR mapper is just bloat.
>
> There is no need for objects with SQLAlchemy. SQLAlchemy's Core and its
> expression language are pretty excellent without objects ...
>
> > With parallel processing your bottleneck is going to be database
> > inserts. One option is to write out CSV file(s) from each thread/job,
> > concatenate them in the final node, and then bulk-import into the
> > database: typically CSV (or other such format) bulk import is orders
> > of magnitude faster than inserting one SQL statement at a time.
>
> ... and bulk-inserts of Python data types into the database.
>
> Markus
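
As a rough sketch of what Core without the ORM can look like, assuming
SQLAlchemy 1.4 or later and an in-memory SQLite database. The table and
column names are invented for the example. Passing a list of dictionaries
to `execute()` sends the rows as one batched (executemany-style) call
rather than one statement per row.

```python
from sqlalchemy import Column, Float, Integer, MetaData, String, Table
from sqlalchemy import create_engine, func, select

metadata = MetaData()

# A plain table definition: no mapped classes, no ORM session.
props = Table(
    "props",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("smiles", String, nullable=False),
    Column("mol_wt", Float),
)


def bulk_insert(engine, rows):
    """Insert a list of dicts in one batched call; return the table's row count."""
    with engine.begin() as conn:  # commits on success, rolls back on error
        conn.execute(props.insert(), rows)
        return conn.execute(select(func.count()).select_from(props)).scalar_one()


engine = create_engine("sqlite://")  # in-memory database, for illustration only
metadata.create_all(engine)

rows = [
    {"smiles": "CCO", "mol_wt": 46.07},
    {"smiles": "c1ccccc1", "mol_wt": 78.11},
]
n_rows = bulk_insert(engine, rows)
```
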
>
> On Sun, Jan 20, 2019 at 9:17 PM Dmitri Maziuk via Rdkit-discuss <
> [email protected]> wrote:
>
>> On Sun, 20 Jan 2019 12:03:50 +0100
>> Shojiro Shibayama <[email protected]> wrote:
>>
>> > ... I guess SQLAlchemy
>> > in Python might be good, but I'm not sure. I hope you'll find
>> > a good SQL OR-mapper library for Python.
>>
>> SQLAlchemy creates a fairly specific ecosystem that you have to buy
>> into for it to make sense. When you don't have objects, only a table
>> of properties, an OR mapper is just bloat.
>>
>> With parallel processing your bottleneck is going to be database
>> inserts. One option is to write out CSV file(s) from each thread/job,
>> concatenate them in the final node, and then bulk-import into the
>> database: typically CSV (or other such format) bulk import is orders
>> of magnitude faster than inserting one SQL statement at a time.
>>
>> --
>> Dmitri Maziuk <[email protected]>
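
The CSV route described above can be sketched with the standard library
alone. This is only an illustration: the function names, file names, the
table `props` and the placeholder "property" (string length) are
hypothetical, and SQLite's `executemany` inside a single transaction
stands in for a real bulk loader such as PostgreSQL's `COPY`.

```python
import csv
import shutil
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def write_chunk(job_id: int, smiles_chunk: list[str], out_dir: Path) -> Path:
    """Worker: write this job's results to its own CSV file (no shared state)."""
    path = out_dir / f"part-{job_id}.csv"
    with path.open("w", newline="") as fh:
        writer = csv.writer(fh)
        for smi in smiles_chunk:
            writer.writerow([smi, len(smi)])  # len() is a placeholder property
    return path


def concatenate(parts: list[Path], merged: Path) -> None:
    """Final step: append all per-job files into one CSV, byte for byte."""
    with merged.open("wb") as out:
        for part in parts:
            with part.open("rb") as src:
                shutil.copyfileobj(src, out)


def bulk_import(merged: Path, conn: sqlite3.Connection) -> int:
    """Load the merged CSV in one transaction; return the table's row count."""
    conn.execute("CREATE TABLE IF NOT EXISTS props (smiles TEXT, n_chars INTEGER)")
    with merged.open(newline="") as fh, conn:  # 'with conn' commits once at the end
        conn.executemany("INSERT INTO props VALUES (?, ?)", csv.reader(fh))
    return conn.execute("SELECT COUNT(*) FROM props").fetchone()[0]


def run(smiles: list[str], n_jobs: int = 2) -> int:
    """Fan out to workers, merge their CSV files, then import them in bulk."""
    chunks = [smiles[i::n_jobs] for i in range(n_jobs)]
    conn = sqlite3.connect(":memory:")
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp)
        with ThreadPoolExecutor(max_workers=n_jobs) as pool:
            parts = list(pool.map(write_chunk, range(n_jobs), chunks, [out_dir] * n_jobs))
        merged = out_dir / "all.csv"
        concatenate(parts, merged)
        count = bulk_import(merged, conn)
    conn.close()
    return count
```

Because each worker writes only to its own file, nothing contends for the
database until the single import step at the end.
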
>>
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
