ok, executemany() it is.

Thank you for your help

On Nov 2, 10:36 am, Michael Bayer <[email protected]> wrote:
> On Nov 1, 2009, at 5:37 PM, BEES INC wrote:
>
> > Ah yes, I'm not sure of the terminology either; I got the 'extended'
> > from the mysqldump usage info (mysqldump -e, on by default, but you can
> > turn it off with mysqldump --extended-insert=false)
>
> > Anyway, from looking at the mysqldump output, extended inserts are
> > covered by this bit from the INSERT documentation:
>
> > "INSERT statements that use VALUES syntax can insert multiple rows. To
> > do this, include multiple lists of column values, each enclosed within
> > parentheses and separated by commas. Example:
>
> > INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);"
>
> > So it's the difference between a single INSERT carrying many rows vs. a
> > separate INSERT for each row
>
> > The executemany() approach is an option, but the relationships are a
> > little complex (26 tables and counting). There is only one object
> > added to the session; everything else is linked from there.
> > Implementing the inserts without the ORM might get a little involved
> > when it comes to ensuring the relations are established correctly, but
> > if that's the way to do it then so be it.
>
> > I am also open to the possibility of rejiggering my schema/session
> > usage to fit better with SA, which is a fairly undefined goal, I know.
> > One thing that is very common in my program is a lot of (datetime,
> > float) observations. They are all mapped to different objects and all
> > go in different tables, as they belong to different objects by the
> > business logic. I'm not sure if it would be better to put them in one
> > table and use a secondary relationship table with two foreign keys
> > mapping objects to their given date/float observations. Indeed, I had
> > it that way originally, but it did not seem to make any difference to
> > performance either way.
>
> the ORM can't bundle INSERT statements together in any way, since it's
> necessary to retrieve information about each INSERT statement
> individually, particularly for MySQL, where it's the only way we can
> get at the "last inserted ID", necessary for assigning identity to
> each ORM-mapped instance.  Additionally, when many objects are being
> INSERTed by the ORM, there's no guarantee that each INSERT statement
> would render in the same way, since missing values on an instance
> translate into that column being omitted from just that statement so
> that a server-side default can fire off.
>
> so for a large bulk insert you can have effortless or
> large-scale-performant; take your pick.
>
> I also would hypothesize that standard executemany() is a better
> choice than the "INSERT" with many values version, since the
> latter places a toll on MySQL's query parser and buffer.
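[For concreteness, here is a sketch of the two bulk-insert styles being compared, at the DB-API level. sqlite3 stands in for MySQLdb — both expose the same `executemany()` interface, paramstyle aside — so take the table name and data as illustrative only.]

```python
# The two bulk-insert styles discussed above, at the DB-API level.
# sqlite3 stands in for the MySQL driver here; both expose executemany().
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (a INTEGER, b INTEGER, c INTEGER)")

rows = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

# Style 1: executemany() -- one statement, many parameter sets.
conn.executemany("INSERT INTO obs (a, b, c) VALUES (?, ?, ?)", rows)

# Style 2: a single multi-row VALUES statement (what mysqldump -e emits);
# the SQL string itself grows with the data, which is the parser/buffer
# toll mentioned above.
placeholders = ", ".join(["(?, ?, ?)"] * len(rows))
flat = [v for row in rows for v in row]
conn.execute("INSERT INTO obs (a, b, c) VALUES %s" % placeholders, flat)

conn.commit()
print(conn.execute("SELECT COUNT(*) FROM obs").fetchone()[0])  # 6
```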
>
> > Thanks for your reply
>
> > On Nov 2, 5:38 am, Michael Bayer <[email protected]> wrote:
> >> On Nov 1, 2009, at 9:52 AM, BEES INC wrote:
>
> >>> Hi
>
> >>> Does SQLAlchemy support MySQL's extended insert syntax?
> >>> I have been chasing down very slow performance. My program
> >>> generates a whole bunch of data, then does a big SA commit() at the
> >>> end, which is proving to be inconveniently slow.
>
> >>> This is a raw import of a mysqldump without extended inserts
>
> >>> dbregress$ time mysql -ubees -pbees tt_test < up50ma.dbdump6
>
> >>> real    0m33.043s
> >>> user    0m1.343s
> >>> sys     0m1.572s
> >>> dbregress$
>
> >>> Which roughly corresponds to the time it is taking SA to do the
> >>> commit.
>
> >>> This is a raw import using extended inserts:
>
> >>> dbregress$ time mysql -ubees -pbees tt_test < up50ma.dbdump6.e
>
> >>> real    0m5.517s
> >>> user    0m0.081s
> >>> sys     0m0.024s
> >>> dbregress$
>
> >>> Which is noticeably faster.
>
> >>> Also, this is for a test dataset; the real dataset is much larger,
> >>> and with it the final commit can take upwards of 25 minutes.
>
> >>> I could not see anything in the documentation regarding this;
> >>> apologies if I missed something. I tried periodic commits vs. a
> >>> single commit at the end, but this did not offer any real difference
> >>> in performance, and also introduced other issues.
>
> >> you'll get much faster performance for bulk inserts if you forgo the
> >> usage of the ORM and use executemany() syntax, which is documented
> >> here:
> >>
> >> http://www.sqlalchemy.org/docs/05/sqlexpression.html#executing-multip...
>
> >> I've googled and searched MySQL's site for "extended insert syntax",
> >> and while there are many mentions of it, there are no definitions or
> >> examples I can find.  For example, the word "extended" does not
> >> appear on this page in any form:
> >> http://dev.mysql.com/doc/refman/5.0/en/insert.html
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.