The topic of performance is one into which many years of effort have gone. For background on this topic, particularly inserts, see:

http://docs.sqlalchemy.org/en/latest/faq/performance.html#i-m-inserting-400-000-rows-with-the-orm-and-it-s-really-slow

For a detailed test suite that illustrates the many varieties of INSERT, see http://docs.sqlalchemy.org/en/latest/_modules/examples/performance/bulk_inserts.html.
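
As a rough sketch of the two fastest paths that suite exercises (the table, row count, and SQLite URL here are illustrative only, chosen just to keep it self-contained):

from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session

Base = declarative_base()

class Customer(Base):
    __tablename__ = 'customer'
    id = Column(Integer, primary_key=True)
    name = Column(String(255))

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
rows = [{'name': 'customer %d' % i} for i in range(10000)]

# ORM bulk path: skips most unit-of-work bookkeeping
session = Session(engine)
session.bulk_insert_mappings(Customer, rows)
session.commit()

# Core path: the statement is compiled once, then handed to the
# DBAPI's executemany() with the full list of parameter sets
engine.execute(Customer.__table__.insert(), rows)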

As far as bound parameter slowness, this is a known condition which can arise if, for example, you are passing in a lot of unicode strings, and the database driver and/or SQLAlchemy is spending lots of time encoding them into a string encoding like utf-8 or similar. There's not really any other bound parameter processing that is known to take much time; other than strings, pretty much all bound parameters are straight pass-throughs to the DBAPI. The difference is often that the straight DBAPI application doesn't set up unicode encoding, whereas the SQLAlchemy version does (and this can of course be changed).
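
To sketch that last point, assuming mysqlclient: the charset can be given right in the URL query string, where it's passed through to the driver's connect(); whether that removes SQLAlchemy's own encode step depends on the dialect and version, so treat this as a starting point rather than a fix:

from sqlalchemy import create_engine

# scott/tiger/localhost are placeholders; the charset argument in the
# query string is handed to MySQLdb.connect(), so the driver itself
# takes part in the unicode handling
engine = create_engine('mysql+mysqldb://scott:tiger@localhost/test?charset=utf8')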


If you can share a Python profile as documented at http://docs.sqlalchemy.org/en/latest/faq/performance.html#code-profiling I can show you how to remove the bound parameter overhead.
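
The recipe there amounts to a context manager around cProfile; a rough Python 3 rendering:

import contextlib
import cProfile
import io
import pstats

@contextlib.contextmanager
def profiled():
    # collect a cProfile run for the enclosed block and print the
    # hottest calls, sorted by cumulative time
    pr = cProfile.Profile()
    pr.enable()
    yield
    pr.disable()
    s = io.StringIO()
    ps = pstats.Stats(pr, stream=s).sort_stats('cumulative')
    ps.print_stats(40)
    print(s.getvalue())

# usage:
# with profiled():
#     engine.execute(table.insert(), rows)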



On 05/25/2017 05:49 AM, Михаил Доронин wrote:
I've tried to benchmark SQLAlchemy performance when inserting a lot of data.
The results weren't that good for SQLAlchemy; the difference was up to three times in median values.

First of all, the more elements inserted, the greater the difference between SQLAlchemy and executemany (mysqlclient). I've profiled the code: most of the time is spent in visit_bind_param and the BindParam initializer. I've skimmed over the code and no obvious places for optimization stand out; however, it seems like the logic is overly complicated, with a lot of conditionals, etc. Maybe this can be simplified in some way, or maybe insert() could take a parameter by which the user says they don't want any complex logic: they are just inserting some data and take responsibility that the data is correct.
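
Roughly, the shape of the comparison was as follows (schema, connection details and row counts here are illustrative, not my actual benchmark code):

import MySQLdb
from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String

rows = [('name %d' % i,) for i in range(100000)]

# raw mysqlclient: a single executemany() call
conn = MySQLdb.connect(host='localhost', user='scott', passwd='tiger', db='test')
cursor = conn.cursor()
cursor.executemany('INSERT INTO customer (name) VALUES (%s)', rows)
conn.commit()

# SQLAlchemy Core: the same executemany() underneath, plus statement
# compilation and per-value bind parameter processing on top
engine = create_engine('mysql+mysqldb://scott:tiger@localhost/test')
customer = Table('customer', MetaData(),
                 Column('id', Integer, primary_key=True),
                 Column('name', String(255)))
engine.execute(customer.insert(), [{'name': name} for (name,) in rows])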

The next thing is that in executemany they keep an eye on the size of the string to be executed, and if it's more than the max_allowed_packet limit, they split it into batches (though they hardcoded this limit instead of reading it from the database at runtime). Not only is SQLAlchemy not doing that, it doesn't provide a way to know what the size of the string would be. The only thing the user can do is to catch the exception and use heuristics to split the data.
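
For illustration, the kind of heuristic splitting I mean; the chunk size below is made up, since there is no API to compute the real encoded statement size:

from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String

engine = create_engine('sqlite://')  # placeholder; the same idea applies on MySQL
meta = MetaData()
customer = Table('customer', meta,
                 Column('id', Integer, primary_key=True),
                 Column('name', String(255)))
meta.create_all(engine)

rows = [{'name': 'customer %d' % i} for i in range(50000)]

def chunked(seq, size=1000):
    # made-up fixed chunk size; a real fix would track the encoded
    # statement size against max_allowed_packet, which isn't exposed
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

for batch in chunked(rows):
    engine.execute(customer.insert(), batch)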
