Hello! I've spent some time looking at SQLAlchemy's ability to batch 
inserts, and have a few questions about bulk_save_objects (and flushing in 
general).

Two statements that I think are true:

   1. Right now, bulk_save_objects does not fetch primary keys for inserted 
   rows (unless return_defaults is passed in, which mitigates the performance 
   gains by performing one insert statement per model).
   2. When running db.session.flush(), it looks like SQLAlchemy can batch 
   model creation into a single multi-value insert statement *only if all of 
   the primary keys are already defined on the model*. (Verified by looking at 
   sqlalchemy/lib/sqlalchemy/orm/persistence.py:_emit_insert_statements)


Questions:

   1. Would it be possible for bulk_save_objects to fetch primary keys 
   while still batching INSERTs into a single query?
   2. Or even better, would it be possible to batch INSERT statements 
   during a flush even when models don't have their primary keys pre-defined? 
   (This would help with e.g. inserting a bunch of rows into a table with an 
   auto-incrementing primary key).

At least in Postgres (haven't verified for other databases), it looks like 
one can use RETURNING for inserts with multi-value VALUES clauses. At the 
surface, it seems like one could extend the current behavior for a single 
row INSERT:

INSERT INTO table ... VALUES (1) RETURNING id

to multi-value INSERTS:

INSERT INTO table ... VALUES (1), (2), ... RETURNING id

and get all the benefits of the ORM while still batching inserts.

I'm sure this is something that was thought about already, so I'd mostly 
love to hear any thoughts about what makes this hard. We have a lot of 
complex relationships and joined-table inheritance, so working with 
bulk_save_objects has been a bit challenging. From the comments in 
sqlalchemy/lib/sqlalchemy/sql/dml.py:return_defaults(), it seems like 
SQLAlchemy is already aware that RETURNING for insert statements with 
multi-values VALUES clauses is supported, so it's possible there is a 
reason that this is hard or can't be done.

This is also something that, if it just a “missing feature”, I would love 
to learn more and see if I could contribute.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to