Re: [sqlalchemy] Speed up bulk inserts

Michael Bayer Wed, 13 Nov 2013 08:46:44 -0800

On Nov 13, 2013, at 4:57 AM, Achim <[email protected]> wrote:

> 
> On Wednesday, November 6, 2013 at 21:58:53 UTC+1, Michael Bayer wrote:
> I wrote a full post regarding this topic on Stack Overflow at
> http://stackoverflow.com/questions/11769366/why-is-sqlalchemy-insert-with-sqlite-25-times-slower-than-using-sqlite3-directly/11769768#11769768
> If you start with this, I can answer more specific questions.
> 
> The article was very helpful, thanks. I still want to figure out the best 
> balance between convenience and speed for my use case. Does the following make 
> sense, and is it possible?
> 
> I work only with PostgreSQL and I'm sure that all involved objects have a 
> unique id column which is called 'id'.  So before doing a session.commit(), I 
> could check how many objects are in my session. As I'm just bulk inserting, I 
> know that all of them are new and don't have their id set yet. Now I ask the 
> database for that number of new ids, iterate over the objects in my session 
> and set the ids. Internally all ids would come from a single sequence, so I 
> don't have to care about object types and so on. Afterwards SQLAlchemy should 
> be aware that ids have already been set, so no generated ids have to be 
> returned and the session.commit() should be much simpler and faster.
> 
> This sounds like a still fairly simple, but hopefully much faster, solution. Do 
> you agree?



Sure, that should be fine if you can pre-calculate your PKs. It just won’t work 
under any kind of concurrency, as in such a situation there could be 
interleaved INSERTs from different processes.
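
For illustration, the pre-assignment step described in the quoted message might
look roughly like the sketch below. This is only a sketch under assumptions:
the sequence name global_id_seq, the helper assign_ids and the fetch_ids
callable are hypothetical names, not part of SQLAlchemy, and the SQL string is
PostgreSQL-specific.

```python
from typing import Callable, Iterable, List

# Assumption: one shared PostgreSQL sequence named "global_id_seq" (hypothetical).
# generate_series(1, :n) yields n rows, so nextval() is called n times in a
# single round trip.
RESERVE_IDS_SQL = "SELECT nextval('global_id_seq') FROM generate_series(1, :n)"


def assign_ids(pending: Iterable[object],
               fetch_ids: Callable[[int], List[int]]) -> int:
    """Set .id on every pending object that has none; return the count assigned.

    fetch_ids(n) is expected to return exactly n fresh ids, for example by
    running RESERVE_IDS_SQL against the database.
    """
    todo = [obj for obj in pending if getattr(obj, "id", None) is None]
    if not todo:
        return 0  # nothing to reserve, skip the round trip
    ids = fetch_ids(len(todo))
    if len(ids) != len(todo):
        raise ValueError("expected %d ids, got %d" % (len(todo), len(ids)))
    for obj, new_id in zip(todo, ids):
        obj.id = new_id
    return len(todo)
```

With a real session this could be called just before session.commit(), passing
session.new as the pending objects and a fetch_ids callable that executes
RESERVE_IDS_SQL with the parameter n and collects the first column of each row.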
