I've tried to benchmark SQLAlchemy performance when inserting a lot of data.
The results weren't that good for SQLAlchemy: the difference was up to three
times in median values.

First of all, the more elements are inserted, the larger the difference
between SQLAlchemy and executemany (mysqlclient).
I've profiled the code: most of the time is spent in visit_bind_param and
the BindParam initializer. I've skimmed over the code and no places for
optimization are obvious, but the logic seems overly complicated; there are
a lot of conditions, etc. Maybe this can be simplified in some way, or maybe
insert() could take a parameter that lets the user say they don't want any
complex logic: they are just inserting some data and take responsibility
that the data is correct.

The next thing is that in executemany they keep an eye on the size of the
string to be executed, and if it exceeds the max_allowed_packet limit they
split it into batches (though they hardcoded this limit instead of reading
it from the database at runtime).
Not only is SQLAlchemy not doing that, it doesn't provide a way to know
what the size of the string would be. The only thing the user can do is
catch the exception and use heuristics to split the data.
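The batching heuristic described above could be sketched roughly as follows. This is a hypothetical helper, not a SQLAlchemy or mysqlclient API; estimating the row's size via repr() is a crude stand-in for measuring the rendered SQL fragment, and the byte budget stands in for MySQL's max_allowed_packet (which, per the above, is hardcoded rather than read from the server at runtime):

```python
# Split rows into batches so that each batch's estimated byte size stays
# under a budget. A row larger than the budget still forms its own batch
# (the server may reject it, but we cannot split a single row).
def split_into_batches(rows, max_bytes, size_of=lambda row: len(repr(row).encode("utf-8"))):
    batches, current, current_size = [], [], 0
    for row in rows:
        size = size_of(row)
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(row)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Each resulting batch would then be passed to executemany separately. If SQLAlchemy exposed the size of the rendered statement, the size_of estimate could be exact instead of heuristic.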
