Gregg Lind wrote: > > Good morning! > > I hope I'm not tipping any sacred cows here, but for simple SqlA > objects, it seems to be a heckuva lot faster to just insert into the > table directly (using SqlA expression language) than to insert the > objects via session.flush(). In the attached tests, I'm observing a > 10x -20x speedup. I'm still new to SqlA, so I was hoping the list > here would be kind enough to verify my observation.
verified > > My questions: > > 1. If so, why? I assume it's because session_flush() does seperate > insert statments (as verified when echo = True is on). session.flush() performs a topological sort of all dirty/pending/deleted objects based on foreign key dependencies between tables as well as between rows, checks all modified attributes and collections (the collections part sometimes requires a load of the collection, unless certain options are set) for a net change in value, issues INSERT/UPDATE/DELETE statements in an exact order based on dependencies, tailors individual INSERT and UPDATE statements based on the values which are present in memory vs. server side defaults (for inserts) or the values which have a net change (for updates). It then issues all of these statements individually (by necessity, since they all have different argument lists and sometimes inter-row dependencies, you also cannot fetch the "last inserted id" from an executemany()) which from a DBAPI point of view is slower in any case, since you are calling execute() many times. Newly inserted rows often require extra statements to fetch newly generated primary keys, which are then distributed to all the foreign-key-holding attributes which require it (which are then potentially inserted or updated in subsequent statements). After all SQL is emitted, it then refreshes the bookkeeping status on all entities which were changed, and expires attributes whose values were generated within the DB but don't need to be fetched until needed. OTOH an executemany() call receives a pre-made list of parameters for any number of bind parameter sets, the DBAPI then prepares a single statement and runs it N times, usually within C code, and you can modify or insert tens of thousands of rows in a few seconds (the trick is that you've generated this huge dict of data beforehand, and that your parameters are all of identical structure). > > 2. In test 3, is this a reasonable away to "convert" from "session" > to direct table insert? Is there a simpler way than the Thing.to_dict > method I hacked together. for simple table mappings, its easy enough to deal with your rows as dicts and use execute() to change things. For more complexity with relations to other tables in various ways, it becomes less trivial. There are always tradeoffs to be navigated according to your specific needs. > > 3. Are these valid tests? I don't want to have all the embarrassment > of some others who have 'slammed' SqlA without a proper grounding. > I'm no expert, and I want to make sure what I have is something > approximating idiomatic SqlA. I tried to be generous about what to > include in the timed section of each test. I do have autoflush off, > and I'm using Sqlite (in memory), which might affect things. I didnt look closely but the general observation of "expressions are faster than ORM" is valid. Your orders of magnitude might be off. > > 4. If there is a faster way to flush out a session, I'm all ears! I > understand the Big Win (tm) of the ORM is programmer simplicity and > power, but if I can get that without major hits to performance, I'd > like to be able to Be Greedy (tm) and have it all. try keeping the size of the session small, and look into options like "passive_deletes" and "passive_updates", which prevent rows from being loaded in order to accomodate cascades that can be established in the database directly. In any case flushing tens of thousands of objects is unlikely to be performant. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [email protected] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~----------~----~----~----~------~----~------~--~---
