Gregg Lind wrote:
>
> Good morning!
>
> I hope I'm not tipping any sacred cows here, but for simple SqlA
> objects, it seems to be a heckuva lot faster to just insert into the
> table directly (using SqlA expression language) than to insert the
> objects via session.flush().  In the attached tests, I'm observing a
> 10x -20x  speedup.  I'm still new to SqlA, so I was hoping the list
> here would be kind enough to verify my observation.


verified

>
> My questions:
>
> 1.  If so, why?  I assume it's because session_flush() does seperate
> insert statments (as verified when echo = True is on).

session.flush() performs a topological sort of all dirty/pending/deleted
objects based on foreign key dependencies between tables as well as
between rows, checks all modified attributes and collections (the
collections part sometimes requires a load of the collection, unless
certain options are set) for a net change in value, issues
INSERT/UPDATE/DELETE statements in an exact order based on dependencies,
tailors individual INSERT and UPDATE statements based on the values which
are present in memory vs. server side defaults (for inserts) or the values
which have a net change (for updates).   It then issues all of these
statements individually  (by necessity, since they all have different
argument lists and sometimes inter-row dependencies, you also cannot fetch
the "last inserted id" from an executemany()) which from a DBAPI point of
view is slower in any case, since you are calling execute() many times. 
Newly inserted rows often require extra statements to fetch newly
generated primary keys, which are then distributed to all the
foreign-key-holding attributes which require it (which are then
potentially inserted or updated in subsequent statements).  After all SQL
is emitted, it then refreshes the bookkeeping status on all entities which
were changed, and expires attributes whose values were generated within
the DB but don't need to be fetched until needed.

OTOH an executemany() call receives a pre-made list of parameters for any
number of bind parameter sets, the DBAPI then prepares a single statement
and runs it N times, usually within C code, and you can modify or insert
tens of thousands of rows in a few seconds (the trick is that you've
generated this huge dict of data beforehand, and that your parameters are
all of identical structure).


>
> 2.  In test 3, is this a reasonable away to "convert" from "session"
> to direct table insert?  Is there a simpler way than the Thing.to_dict
> method I hacked together.

for simple table mappings, its easy enough to deal with your rows as dicts
and use execute() to change things.  For more complexity with relations to
other tables in various ways, it becomes less trivial.   There are always
tradeoffs to be navigated according to your specific needs.

>
> 3.  Are these valid tests?  I don't want to have all the embarrassment
> of some others who have 'slammed' SqlA without a proper grounding.
> I'm no expert, and I want to make sure what I have is something
> approximating idiomatic SqlA.  I tried to be generous about what to
> include in the timed section of each test.  I do have autoflush off,
> and I'm using Sqlite (in memory), which might affect things.

I didnt look closely but the general observation of "expressions are
faster than ORM" is valid.   Your orders of magnitude might be off.

>
> 4.  If there is a faster way to flush out a session, I'm all ears!  I
> understand the Big Win (tm) of the ORM is programmer simplicity and
> power, but if I can get that without major hits to performance, I'd
> like to be able to Be Greedy (tm) and have it all.

try keeping the size of the session small, and look into options like
"passive_deletes" and "passive_updates", which prevent rows from being
loaded in order to accomodate cascades that can be established in the
database directly.  In any case flushing tens of thousands of objects is
unlikely to be performant.



--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to