The topic of performance is one into which many years of effort have gone.
For background on this topic, particularly inserts, see:
http://docs.sqlalchemy.org/en/latest/faq/performance.html#i-m-inserting-400-000-rows-with-the-orm-and-it-s-really-slow
For a detailed test suite illustrating the many varieties of
INSERT, see
http://docs.sqlalchemy.org/en/latest/_modules/examples/performance/bulk_inserts.html.
As far as bound-parameter slowness, this is a known condition which can
arise if, for example, you are passing in a lot of unicode strings and the
database driver and/or SQLAlchemy is spending lots of time encoding them
into a byte encoding like utf-8 or similar. There's not really any
other bound-parameter process that is known to take much time; other
than strings, pretty much all bound parameters are straight
pass-throughs to the DBAPI. The difference is often that the straight
DBAPI application doesn't set up unicode encoding whereas the SQLAlchemy
version does (and this can of course be changed).
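To give a rough sense of where that per-parameter time can go, here is a minimal stdlib-only sketch that times just the encoding step a unicode-aware layer performs before handing values to the DBAPI (the row count and string contents are arbitrary assumptions, not taken from the original benchmark):

```python
import timeit

# Simulate string bound parameters for many rows.
rows = [("donn\u00e9es-%d" % i, "\u00fcn\u00efcode-%d" % i) for i in range(10000)]

def encode_all():
    # Encode every string parameter to utf-8, as a unicode-aware
    # layer would do before passing values to the DBAPI.
    return [tuple(v.encode("utf-8") for v in row) for row in rows]

elapsed = timeit.timeit(encode_all, number=10)
print("10 passes over 20k string parameters: %.3fs" % elapsed)
```

A driver configured to pass strings straight through skips this work entirely, which is often the whole difference being measured.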
If you can share a Python profile as documented at
http://docs.sqlalchemy.org/en/latest/faq/performance.html#code-profiling
I can show you how to remove the bound parameter overhead.
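The general shape of such a profile capture, sketched with the stdlib cProfile/pstats modules (the do_inserts function is a placeholder standing in for your actual insert loop):

```python
import cProfile
import io
import pstats

def do_inserts():
    # Placeholder workload; substitute your actual insert code here,
    # e.g. conn.execute(table.insert(), rows)
    sum(i * i for i in range(100000))

profiler = cProfile.Profile()
profiler.enable()
do_inserts()
profiler.disable()

buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf).sort_stats("cumulative")
stats.print_stats(10)  # top ten entries by cumulative time
print(buf.getvalue())
```

That printed output is the kind of thing worth pasting into the thread.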
On 05/25/2017 05:49 AM, Михаил Доронин wrote:
I've tried to benchmark SQLAlchemy's performance when inserting a lot of data.
The results weren't that good for SQLAlchemy; the difference was up to
three times in median values.
First of all, the more elements are inserted, the bigger the difference
between SQLAlchemy and executemany (mysqlclient).
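For reference, the raw-DBAPI baseline being compared against looks roughly like this; sketched here with the stdlib sqlite3 driver standing in for mysqlclient, and with a made-up table and column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")

rows = [(i, "name-%d" % i) for i in range(1000)]
# executemany() runs every parameter set through one statement;
# the parameters go straight to the driver with no per-value
# compilation work on the Python side.
conn.executemany("INSERT INTO t (id, name) VALUES (?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)
```

mysqlclient's executemany() additionally rewrites an INSERT ... VALUES statement into one multi-row statement, which is part of why it pulls ahead as the row count grows.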
I've profiled the code - most of the time is spent in visit_bind_param and
the BindParameter initializer. I've skimmed over the code and no places for
optimization are obvious; however, it seems like the logic is overly
complicated. There are a lot of conditions, etc. Maybe this can be
simplified in some way, or maybe there could be a parameter on the insert
that the user can use to say that they don't want any complex logic - they
are just inserting some data and take the responsibility that the data is
correct.
The next thing is that executemany keeps an eye on the size of the
string to be executed, and if it exceeds the max_allowed_packet limit,
it splits the data into batches (though the limit is hardcoded instead of
being taken from the database at runtime).
Not only is SQLAlchemy not doing that - it doesn't provide a way to know
what the size of the string would be. The only thing the user can do
is catch the exception and use heuristics to split the data.
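One such heuristic might look like the following sketch. The byte budget is a made-up stand-in for max_allowed_packet, and the size estimate is deliberately crude: it sums the utf-8 length of each value's string form and ignores the SQL text and escaping overhead entirely.

```python
def batch_rows(rows, max_bytes):
    """Yield chunks of rows whose rough encoded size stays under max_bytes.

    Crude heuristic: sums the utf-8 length of each value's string form,
    ignoring SQL text and escaping overhead.
    """
    batch, size = [], 0
    for row in rows:
        row_size = sum(len(str(v).encode("utf-8")) for v in row)
        if batch and size + row_size > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(row)
        size += row_size
    if batch:
        yield batch

rows = [(i, "x" * 100) for i in range(50)]
batches = list(batch_rows(rows, max_bytes=1024))
print(len(batches), "batches covering", sum(len(b) for b in batches), "rows")
```

Each batch could then be passed to a separate executemany() call; a single row larger than the budget still gets emitted as its own batch rather than dropped.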
--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
http://www.sqlalchemy.org/
To post example code, please provide an MCVE: Minimal, Complete, and
Verifiable Example. See http://stackoverflow.com/help/mcve for a full
description.
---
You received this message because you are subscribed to the Google
Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to sqlalchemy+unsubscr...@googlegroups.com
<mailto:sqlalchemy+unsubscr...@googlegroups.com>.
To post to this group, send email to sqlalchemy@googlegroups.com
<mailto:sqlalchemy@googlegroups.com>.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.